Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods Irene Moulitsas – George Karypis Department of Computer Science and Engineering Army HPC Research Center University of Minnesota Work supported by: NSF, DOE, AHPCRC

Page 1: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

Irene Moulitsas – George Karypis

Department of Computer Science and Engineering

Army HPC Research Center

University of Minnesota

Work supported by: NSF, DOE, AHPCRC

Page 2: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

2

Outline

Problem Definition - Motivation

Serial Multilevel Coarse Grid Construction

Parallel Formulation

Experimental Results

Summary

Page 3: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

3

Motivation…

Geometric Multigrid Methods

Widely used as they exhibit fast convergence rates, O(n) work for a problem of n unknowns

(Figure: grid hierarchy from the fine grid at Level 1, through Level 2, to the coarse grid at Level 3.)

Page 4: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

4

…Motivation…

Structured vs. Unstructured Grids

For structured grids, there is an optimal way to generate the coarse grid.

Most real-life problems need unstructured grids.

Generating a sequence of coarse unstructured grids is far from obvious!

Page 5: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

5

…Motivation…

Performance of Geometric Multigrid Methods on Unstructured Grids is highly dependent on the quality of the grids.

Agglomeration Techniques

They use the connectivity of the dual graph.

They start from a vertex of this graph and fuse together some of its adjacent vertices into a new control volume.

This is repeated until all vertices have been fused into control volumes.

The quality of the control volumes can be optimized.

Steve, Lallemand, Dervieux (Computers & Fluids, 1992); Venkatakrishnan, Mavriplis (NASA, 1994)

Page 6: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

6

…Motivation

Limitations…

They are serial in nature

Greedy algorithms operate locally

Page 7: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

7

Our contribution…

A multilevel approach for coarse grid construction

We formulate it as an optimization problem that:

Optimizes a particular measure of the overall quality of the coarse grid.

Is subject to the constraint that each control volume contains between Lmin and Lmax elements.

We use the multilevel paradigm to solve this optimization problem

Page 8: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

8

Challenges…

Key Issues…

How to measure the quality of a coarse grid?

How to use the multilevel paradigm to optimize the quality of a coarse grid?

How to parallelize our algorithm?

Design Objectives…

Robust Algorithms

Highly Accurate

Computationally Efficient

Page 9: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

9

How to Measure the Quality of the Coarse Grid

Individual Element Aspect Ratios

2D: $A = l^2 / s$

3D: $A = S^{3/2} / V$

Control Volume Aspect Ratios

$F_1 = \sum_{i=1}^{N_{Coarse}} A_i$

$F_2 = \sum_{i=1}^{N_{Coarse}} w_i A_i$

$F_3 = \max_{i=1,\ldots,N_{Coarse}} A_i$

$F_4$ = combination!
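The quality measures on this slide can be illustrated with a small sketch. It assumes the aspect ratio is the squared perimeter over the area in 2D and the surface area to the power 3/2 over the volume in 3D, and that F1, F2, and F3 are the plain sum, the weighted sum, and the maximum of the control volume aspect ratios. The function names and the sample weights are hypothetical and are not taken from MGridGen.

```python
import math


def aspect_ratio_2d(perimeter, area):
    """Assumed 2D measure: squared perimeter divided by area."""
    return perimeter ** 2 / area


def aspect_ratio_3d(surface, volume):
    """Assumed 3D measure: surface area to the power 3/2 divided by volume."""
    return surface ** 1.5 / volume


def objective_f1(ratios):
    """Sum of the control volume aspect ratios."""
    return sum(ratios)


def objective_f2(ratios, weights):
    """Weighted sum of the control volume aspect ratios."""
    return sum(w * a for w, a in zip(weights, ratios))


def objective_f3(ratios):
    """Worst (largest) control volume aspect ratio."""
    return max(ratios)


# Two 2D control volumes of equal area: a unit square and a 4 x 0.25 strip.
ratios = [aspect_ratio_2d(4.0, 1.0), aspect_ratio_2d(8.5, 1.0)]
f1 = objective_f1(ratios)
f2 = objective_f2(ratios, weights=[1.0, 1.0])  # hypothetical unit weights
f3 = objective_f3(ratios)
```

The elongated strip scores far worse than the square even though both cover the same area, which is the behaviour a shape-quality measure should have. F3 reacts only to the single worst control volume, while F1 and F2 average over the whole coarse grid.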

Page 10: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

10

How to use the Multilevel Paradigm to Optimize the Quality of the Coarse Grid?

The Multilevel Paradigm For Graph Partitioning

Coarsening Phase: A sequence of coarse graphs is constructed.

Initial Partitioning Phase: A partitioning of the coarsest graph is computed quickly.

Refinement Phase: The partitioning is successively projected back to the original graph. At each finer graph, a refinement algorithm is applied.

METIS, CHACO, JOSTLE

Page 11: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

11

Modeling the Grid via a Graph…

Weighted Dual Graph

Vertex weight

Vertex boundary surface

Vertex volume

Edge weight

Page 12: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

12

How to use the Multilevel Paradigm to Optimize the Quality of the Coarse Grid?

Can our problem be solved using the existing graph partitioning algorithms?

Is it an instance of a k-way partitioning problem in which k = N/Lmin?

No! The objectives of the two problems do not match!

Page 13: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

13

Coarsening Phase

Compute a maximal matching of the vertices.

Collapse together matched vertices.

Ensure edge and vertex attributes of the coarsened graph accurately reflect those of the finer graph.

Random Matching

Globular Matching

(Figure: original graph, matching computed, graph coarsened.)
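One coarsening step with random matching might look like the sketch below. It assumes the graph is stored as a dictionary of weighted adjacency dictionaries and that coarse vertex and edge weights are simple sums of the fine ones. The slides do not spell out how the surface and volume attributes are combined, so the summation and all function names here are assumptions.

```python
import random


def random_maximal_matching(adj, seed=0):
    """Visit vertices in random order and pair each with a random free neighbour."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    mate = {}
    for u in order:
        if u in mate:
            continue
        free = [v for v in adj[u] if v not in mate]
        if free:
            v = rng.choice(free)
            mate[u] = v
            mate[v] = u
    return mate


def collapse(adj, vweight, mate):
    """Merge matched pairs into coarse vertices, summing vertex and edge weights."""
    coarse_of = {}
    next_id = 0
    for u in adj:
        if u in coarse_of:
            continue
        coarse_of[u] = next_id
        if u in mate:
            coarse_of[mate[u]] = next_id
        next_id += 1
    cweight = [0] * next_id
    cadj = [dict() for _ in range(next_id)]
    for u in adj:
        cu = coarse_of[u]
        cweight[cu] += vweight[u]
        for v, w in adj[u].items():
            cv = coarse_of[v]
            if cu != cv:  # edges inside a coarse vertex disappear
                cadj[cu][cv] = cadj[cu].get(cv, 0) + w
    return cadj, cweight, coarse_of


# A 4-cycle with unit vertex and edge weights.
adj = {0: {1: 1, 3: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1, 0: 1}}
vweight = {0: 1, 1: 1, 2: 1, 3: 1}
mate = random_maximal_matching(adj, seed=1)
cadj, cweight, coarse_of = collapse(adj, vweight, mate)
```

On the 4-cycle every maximal matching is perfect, so the coarse graph always has two vertices of weight 2 joined by one edge of weight 2. Applying the two functions repeatedly yields the sequence of coarser graphs.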

Page 14: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

14

Refinement phase

Compute the gain of each vertex.

The gain of a vertex is equal to the reduction in the objective function that will result from moving it to a different control volume.

Move the highest gain vertex to the adjacent subdomain subject to control volume size constraints.

Update the gains of each neighboring vertex.
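The refinement loop above can be sketched as follows. Two simplifications are assumed: the objective is passed in as a callable, with a plain edge cut used here only as a stand-in for the aspect-ratio objectives, and all gains are recomputed after every move instead of updating only the neighbours of the moved vertex. The names `refine` and `edge_cut` are hypothetical.

```python
from collections import Counter


def edge_cut(adj, assign):
    """Stand-in objective: total weight of edges between different control volumes."""
    return sum(w for u in adj for v, w in adj[u].items()
               if u < v and assign[u] != assign[v])


def refine(adj, assign, objective, lmin, lmax, max_moves=100):
    """Repeatedly apply the single best positive-gain move that keeps sizes legal."""
    assign = dict(assign)
    for _ in range(max_moves):
        sizes = Counter(assign.values())
        current = objective(assign)
        best = None  # (gain, vertex, target control volume)
        for u in adj:
            src = assign[u]
            if sizes[src] - 1 < lmin:
                continue  # the source control volume would become undersized
            for tgt in {assign[v] for v in adj[u]} - {src}:
                if sizes[tgt] + 1 > lmax:
                    continue  # the target control volume would become oversized
                assign[u] = tgt
                gain = current - objective(assign)  # reduction in the objective
                assign[u] = src
                if best is None or gain > best[0]:
                    best = (gain, u, tgt)
        if best is None or best[0] <= 0:
            break
        assign[best[1]] = best[2]
    return assign


# Two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
adj = {0: {1: 1, 2: 1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1, 3: 1},
       3: {2: 1, 4: 1, 5: 1}, 4: {3: 1, 5: 1}, 5: {3: 1, 4: 1}}
start = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
refined = refine(adj, start, lambda a: edge_cut(adj, a), lmin=2, lmax=4)
```

In this example vertex 2 starts in the wrong control volume. Moving it lowers the stand-in objective from 2 to 1, after which no legal move has positive gain and the loop stops.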

Page 15: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

15

Example: Graph Coarsening and Refining

(Figure: example graph with numeric labels at successive coarsening and refinement steps.)

Page 16: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

16

Enforcing the Lmin and Lmax constraints

The coarsening and refinement techniques do not guarantee that our size constraints are enforced

A “merge” phase follows the last step of refinement. All undersized control volumes are merged

If there still are undersized elements, a “contribute” phase follows
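A minimal sketch of a "merge" pass is given below. The slides do not say which neighbour an undersized control volume is merged with, so this sketch assumes it joins the neighbour with the largest shared edge weight, provided the combined size stays within Lmax. Control volumes that cannot be merged are returned as leftovers for the "contribute" phase, which is not sketched because its details are not given. The function name is hypothetical.

```python
from collections import Counter, defaultdict


def merge_undersized(adj, assign, lmin, lmax):
    """Merge undersized control volumes into a neighbour that can absorb them."""
    assign = dict(assign)
    merged = True
    while merged:
        merged = False
        sizes = Counter(assign.values())
        # Smallest control volumes first; stop at the first one that is large enough.
        for cv in sorted(sizes, key=lambda c: (sizes[c], c)):
            if sizes[cv] >= lmin:
                break
            link = defaultdict(float)  # edge weight shared with each neighbour
            for u, nbrs in adj.items():
                if assign[u] != cv:
                    continue
                for v, w in nbrs.items():
                    if assign[v] != cv:
                        link[assign[v]] += w
            fits = [t for t in link if sizes[t] + sizes[cv] <= lmax]
            if not fits:
                continue  # left for the "contribute" phase
            target = max(fits, key=lambda t: link[t])
            for u in assign:
                if assign[u] == cv:
                    assign[u] = target
            merged = True
            break  # sizes changed, so recompute them
    sizes = Counter(assign.values())
    leftover = sorted(cv for cv, n in sizes.items() if n < lmin)
    return assign, leftover


# Path 0-1-2-3-4-5 with unit edge weights; control volume 0 holds a single vertex.
adj = {i: {j: 1.0 for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
start = {0: 0, 1: 1, 2: 1, 3: 1, 4: 2, 5: 2}
merged_assign, leftover = merge_undersized(adj, start, lmin=2, lmax=4)
blocked_assign, blocked = merge_undersized(adj, start, lmin=2, lmax=3)
```

With Lmax = 4 the single-vertex control volume is absorbed by its neighbour. With Lmax = 3 the merge would create an oversized control volume, so it is reported as a leftover instead.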

Page 17: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

17

How to Parallelize our Algorithm…

Highly parallel formulations for multilevel graph partitioning already exist (e.g., ParMETIS).

Question: Can we use the basic structure of these algorithms to parallelize our coarse grid construction algorithm?

Answer: NO!

The number of control volumes is too large!

Page 18: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

18

…Parallel Formulation…

k-way partitioning algorithms are designed for problems in which k << n (n is the size of the problem)

These algorithms scale poorly when k ~ n

They cannot ensure the Lmin and Lmax size constraints

We need to find a new parallel formulation

Page 19: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

19

Our Parallel Formulation…

Call a graph partitioning algorithm to find a good p-way partitioning of the grid.

The graph is then moved such that each partition becomes local to a single processor.

Each processor finds a good coarse grid for its locally stored elements.
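The three steps above can be imitated on a single machine as in the sketch below. It assumes a partition vector `part` is already available (in practice it would come from a graph partitioner), uses threads in place of distributed processors, and substitutes a trivial greedy agglomeration for the serial multilevel algorithm. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_subgraph(adj, part, rank):
    """Keep the vertices owned by `rank` and only the edges between them."""
    own = {u for u in adj if part[u] == rank}
    return {u: {v: w for v, w in adj[u].items() if v in own} for u in own}


def local_coarse_grid(sub, lmax):
    """Placeholder for the serial algorithm: greedily group up to lmax vertices."""
    cv_of = {}
    groups = []
    for seed in sorted(sub):
        if seed in cv_of:
            continue
        group = [seed]
        cv_of[seed] = len(groups)
        for v in sorted(sub[seed]):
            if len(group) == lmax:
                break
            if v not in cv_of:
                cv_of[v] = len(groups)
                group.append(v)
        groups.append(group)
    return groups


def parallel_coarse_grid(adj, part, nparts, lmax):
    """Each partition is coarsened independently of all the others."""
    subs = [local_subgraph(adj, part, r) for r in range(nparts)]
    with ThreadPoolExecutor(max_workers=nparts) as pool:
        per_rank = list(pool.map(lambda s: local_coarse_grid(s, lmax), subs))
    return [g for groups in per_rank for g in groups]


# Path of 8 elements split into two partitions of 4.
adj = {i: {j: 1.0 for j in (i - 1, i + 1) if 0 <= j < 8} for i in range(8)}
part = [0, 0, 0, 0, 1, 1, 1, 1]
control_volumes = parallel_coarse_grid(adj, part, nparts=2, lmax=2)
```

Because each local subgraph drops the edges that cross a partition boundary, no control volume can span two partitions. That makes the step independent per processor, and it also means elements near the interfaces have fewer candidates to agglomerate with.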

Page 20: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

20

…Parallel Implementation…

Features :

Leads to good control volumes for most of the internal regions of the grid.

Embarrassingly parallel procedure. Very Fast

The control volumes of the regions along and near the partition interfaces are not the best.

Page 21: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

21

… Parallel Implementation…

We readjust the partition boundaries so that interface nodes move well into the interior of a partition

We employ an adaptive repartitioning algorithm from ParMETIS for that

Page 22: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

22

…Parallel Implementation

This is the overview of the parallel formulation method

Page 23: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

23

Experimental Results…

Evaluate the different objective functions

Evaluate the scalability of the parallel formulation

Data Sets

Name   #Elements   Description
M6     94,493      M6 Wing
F22    428,748     F22 Wing
F16    1,124,648   F16 Wing

Page 24: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

24

…Experimental Results…

We tested the quality of the coarse grids by simulating an unsteady flow on moving grids arising in aeroelasticity problems, using an edge-based multigrid solver (from Daimler Chrysler Aerospace Military Aircraft, Germany)

Data Set

Residual Norm

#Coarse Levels Lmin Lmax

M6

F22

F16

10 5

10 10

10 5

4

4

4

6

6

6

1

1

1

Page 25: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

25

Serial Algorithm Evaluation

(Bar charts, one panel each for M6, F22, and F16: iterations to convergence for Trad1, Trad2, ML_F1, ML_F2, ML_F3, and ML_F3_F2.)

We evaluate the performance of our serial algorithm by looking at the number of iterations the multigrid algorithm needs to converge

Trad1 : Traditional Agglomeration Technique

Trad2 : Agglomeration Technique based on Aspect Ratios

Page 26: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

26

CRAY T3E Performance

(Charts, one panel each for M6, F22, and F16: time in seconds against number of processors, up to 512.)

We measured the actual time our multilevel algorithm needs to produce a coarse grid

Page 27: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

27

Linux Cluster Performance

(Charts, one panel each for M6 and F22: time in seconds against number of processors, for 2, 4, 8, and 16 processors.)

We measured the actual time our multilevel algorithm needs to produce a coarse grid

Page 28: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

28

Parallel Algorithm Evaluation

(Bar charts, one panel each for M6, F22, and F16: iterations for ML_F1, ML_F2, ML_F3, and ML_F3_F2 at p=2, p=4, p=8, and p=16.)

We evaluate the performance of our parallel algorithm by looking at the number of iterations the serial multigrid algorithm needs to converge for the set of coarse grids we produced in parallel

Page 29: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

29

Quality Measures

Quality measures on Cray T3E

       M6                F22               F16
#PES   F3     F2         F3     F2         F3     F2
1      24.3   1.82e+06   -      -          -      -
2      22.6   1.82e+06   -      -          -      -
4      22.5   1.82e+06   27.1   8.29e+06   -      -
8      22.7   1.82e+06   29.3   8.28e+06   -      -
16     22.6   1.81e+06   23.1   8.25e+06   22.4   2.02e+07
32     22.6   1.80e+06   23.6   8.23e+06   26.4   2.02e+07
64     22.6   1.80e+06   24.0   8.21e+06   71.1   2.01e+07
128    22.6   1.80e+06   23.1   8.20e+06   22.8   2.01e+07
256    23.0   1.79e+06   168    8.19e+06   28.4   2.01e+07
512    24.3   1.78e+06   45.7   8.18e+06   35.0   2.01e+07

Page 30: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

30

Summary of Work

New multilevel algorithms for generating coarse grids

Coarse grids with well-shaped elements

Parallel multilevel algorithms

Highly scalable algorithm that creates coarse grids of the same quality as the serial algorithm

MGridGen / ParMGridGen http://www.cs.umn.edu/~moulitsa/software.html

[email protected], [email protected]

Thank You !

Page 31: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

31

Serial Algorithm Evaluation

Trad1 : Traditional Agglomeration Technique

Trad2 : Agglomeration Technique based on Aspect Ratios

M6 F22 F16

Technique #Iterations #Iterations #Iterations

Trad1 215 181 399

Trad2 160 153 358

ML_F1 146 155 349

ML_F2 149 159 345

ML_F3 156 157 349

ML_F3_F2 148 160 339

Convergence of serial multigrid algorithm

Page 32: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

32

Parallel Algorithm Evaluation

M6 #Iterations F22 #Iterations F16 #Iterations

P=2 P=4 P=8 P=16 P=2 P=4 P=8 P=16 P=2 P=4 P=8 P=16

F1 146 147 149 150 158 159 160 157 353 352 350 344

F2 147 146 148 149 158 157 158 156 341 342 344 355

F3 148 148 150 152 157 156 156 156 339 348 345 354

F3_F2 146 146 147 152 159 158 161 158 338 341 349 357

Convergence of parallel multigrid algorithm

Page 33: Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods

33

Parallel Algorithm Timings

CRAY T3E BEO

M6 F22 F16 M6 F22

#PES Time Time Time Time Time

1 80.66 - - 32.74 173.63

2 103.17 - - 46.14 246.66

4 50.43 256.13 - 28.21 153.21

8 22.30 125.21 - 14.18 74.71

16 9.95 61.06 163.14 18.64 50.55

32 4.38 29.71 90.08 - -

64 2.24 14.86 40.47 - -

128 1.68 7.11 19.15 - -

256 1.06 4.52 10.72 - -

512 1.06 3.55 7.11 - -

Run Times (in sec)
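The run times above can be turned into relative speedups and parallel efficiencies with a few lines of arithmetic. The sketch below copies the Cray T3E columns for M6 and F16 from the table. The helper names and the choice of baseline processor count are assumptions made for illustration.

```python
# Cray T3E run times in seconds, copied from the table above.
t3e_m6 = {1: 80.66, 2: 103.17, 4: 50.43, 8: 22.30, 16: 9.95,
          32: 4.38, 64: 2.24, 128: 1.68, 256: 1.06, 512: 1.06}
t3e_f16 = {16: 163.14, 32: 90.08, 64: 40.47, 128: 19.15, 256: 10.72, 512: 7.11}


def relative_speedup(times, base):
    """Speedup of each run relative to the run on `base` processors."""
    t0 = times[base]
    return {p: t0 / t for p, t in times.items() if p >= base}


def parallel_efficiency(times, base):
    """Relative speedup divided by the growth in processor count."""
    return {p: s / (p / base) for p, s in relative_speedup(times, base).items()}


m6_speedup = relative_speedup(t3e_m6, base=2)
f16_speedup = relative_speedup(t3e_f16, base=16)
f16_efficiency = parallel_efficiency(t3e_f16, base=16)

for p in sorted(f16_speedup):
    print(f"F16 p={p:3d}  speedup={f16_speedup[p]:6.2f}  "
          f"efficiency={f16_efficiency[p]:.2f}")
```

For M6 the baseline is set to 2 processors because the table shows the 2-processor run taking longer than the 1-processor run, so a 1-processor baseline would mix serial and parallel code paths. For F16 the smallest available run is 16 processors.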