Parallelization of 2D Lid-Driven Cavity Flow
Asif Salahuddin
Ahmad Sharif
Jens Kehne
04 December 2008
OBJECTIVES
Our objectives
• Numerical simulation of fluid dynamics using the Lattice-Boltzmann method
• Parallelize the code using MPI
  – Study speedup and scalability
• Run large problem sizes in reasonable time
  – Or run them at all, for that matter (memory requirements)
CONCEPT
The Lattice-Boltzmann method
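• The Lattice-Boltzmann equation:
  $$ f_i(\vec{x} + \vec{e}_i,\, t + 1) = f_i(\vec{x}, t) - \frac{1}{\tau}\left[ f_i(\vec{x}, t) - f_i^{(0)}(\vec{x}, t) \right] $$
• Velocity directions: (figure not recovered)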
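As a rough illustration of the equation above, the following sketch performs one collide-and-stream update on a small periodic grid. It assumes the common D2Q9 velocity set, the standard second-order equilibrium distribution, and a relaxation time called tau; the slides do not confirm any of these, and the names `E`, `W`, `equilibrium`, and `lbm_step` are hypothetical.

```python
# Assumed D2Q9 lattice: one rest direction, four axis directions, four diagonals.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
# Standard D2Q9 weights for the directions above.
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution f_i^(0) for one node."""
    usq = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(w * rho * (1 + 3 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq


def lbm_step(f, nx, ny, tau):
    """One time step on a periodic nx-by-ny grid; f[x][y][i] is f_i at (x, y)."""
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            cell = f[x][y]
            # Macroscopic density and velocity from the distributions.
            rho = sum(cell)
            ux = sum(fi * ex for fi, (ex, _) in zip(cell, E)) / rho
            uy = sum(fi * ey for fi, (_, ey) in zip(cell, E)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (ex, ey) in enumerate(E):
                # Collision: relax towards equilibrium with rate 1/tau.
                post = cell[i] - (cell[i] - feq[i]) / tau
                # Streaming: move the value to the neighbor in direction e_i.
                new[(x + ex) % nx][(y + ey) % ny][i] = post
    return new
```

Periodic wrap-around is used here only to keep the sketch short; the cavity itself is bounded by walls.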
Top-down vs. bottom-up
• Top-down: partial differential equations (Navier-Stokes) → discretization → difference equations (conserved quantities?)
• Bottom-up: discrete model (LGCA or LBM) → multi-scale analysis → partial differential equations (Navier-Stokes)
Fluid nodes
• The entire problem is represented as a grid of fluid nodes
  – Each fluid node holds velocities towards all of its neighbors
• A new grid state is computed at each discrete time step
Wall bounceback
• The fluid domain is surrounded by walls
• At each time step, the direction of every link that hits a wall is reversed
• Walls may be moving
  – A moving wall changes the momentum of the fluid close to it
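The bounceback rule can be sketched for a single node next to a wall. This is a minimal illustration, not the presenters' code: it assumes a D2Q9 lattice and uses the widely used momentum correction 6 * w_i * rho * (e_i · u_wall) for a moving wall, which the slides do not spell out. The names `OPP` and `bounce_back` are hypothetical.

```python
# Assumed D2Q9 lattice and weights (not named in the slides).
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
# OPP[i] is the index of the direction opposite to E[i].
OPP = [0, 3, 4, 1, 2, 7, 8, 5, 6]


def bounce_back(f_out, rho, wall_u=(0.0, 0.0)):
    """Reverse the links that hit a wall.

    f_out maps a direction index i to the value travelling into the wall.
    Returns a dict mapping the opposite direction to the reflected value.
    For a moving wall, each reflected value is shifted so that the wall
    passes part of its momentum on to the fluid.
    """
    reflected = {}
    for i, fi in f_out.items():
        ex, ey = E[i]
        eu = ex * wall_u[0] + ey * wall_u[1]
        reflected[OPP[i]] = fi - 6.0 * W[i] * rho * eu
    return reflected
```

For a lid at the top of the cavity moving in the +x direction, the links pointing up (indices 2, 5 and 6 in this numbering) are the ones reflected; the diagonal ones come back with a net momentum gain in +x.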
IMPLEMENTATION
Domain decomposition
• Each processor processes part of the grid
• Ghost nodes
  – Represent the border nodes of the neighbors
• Border nodes
  – Updated with data from the neighbors
• Inner nodes
  – Can be updated locally, without communication
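The three node types can be made concrete with a small sketch. It assumes each processor stores its block of nx by ny owned nodes inside a local array with one extra ghost layer on every side, and that the block has neighbors on all four sides; neither detail is stated in the slides. `classify` and `count_nodes` are hypothetical helper names.

```python
def classify(i, j, nx, ny):
    """Classify local index (i, j) in an (nx + 2) x (ny + 2) array.

    Owned nodes occupy 1..nx and 1..ny; index 0 and the last index in
    each dimension form the ghost layer (copies of neighbor border nodes).
    """
    if i == 0 or i == nx + 1 or j == 0 or j == ny + 1:
        return "ghost"
    if i == 1 or i == nx or j == 1 or j == ny:
        return "border"
    return "inner"


def count_nodes(nx, ny):
    """Count ghost, border and inner nodes for one processor's block."""
    counts = {"ghost": 0, "border": 0, "inner": 0}
    for i in range(nx + 2):
        for j in range(ny + 2):
            counts[classify(i, j, nx, ny)] += 1
    return counts
```

For the same 100 owned nodes, a 10 x 10 block has 36 border nodes while a 25 x 4 block has 54, which is why squarer blocks mean less data to exchange.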
Automatic decomposition
• Factorize and merge
  – Factorize the x and y dimensions and #procs
  – Divide x and y by the prime factors of #procs
• Goal: keep each processor's grid as square as possible
  – Best ratio of inner to border nodes
  – Minimizes communication
Automatic decomposition - demo
• #CPUs: 6 = 2 * 3
• X-axis: 30 = 2 * 3 * 5
• Y-axis: 20 = 2 * 2 * 5
• Result: the 30 x 20 grid is split across 6 CPUs into blocks of 10 x 10 nodes
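One way to arrive at such a split is a greedy pass over the prime factors of the processor count. This is a sketch under assumptions, not the presenters' algorithm: it checks divisibility directly instead of factorizing the grid dimensions, and it always cuts whichever side of the block is currently longer. `prime_factors` and `decompose` are hypothetical names.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def decompose(nx, ny, nprocs):
    """Split an nx-by-ny grid into nprocs equal blocks.

    Returns (px, py, bx, by): the number of blocks along x and y and the
    size of each block. Largest factors are applied first, each to the
    longer side of the current block when that side is divisible.
    """
    px, py = 1, 1
    bx, by = nx, ny
    for p in sorted(prime_factors(nprocs), reverse=True):
        can_x = bx % p == 0
        can_y = by % p == 0
        if can_x and (bx >= by or not can_y):
            bx //= p
            px *= p
        elif can_y:
            by //= p
            py *= p
        else:
            raise ValueError("grid not evenly divisible by factor %d" % p)
    return px, py, bx, by
```

Called as `decompose(30, 20, 6)`, the factor 3 is applied to the x-axis and the factor 2 to the y-axis, giving a 3 x 2 arrangement of 10 x 10 blocks.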
Optimizations
• Overlapping wall bounceback and communication
  – About 5% speedup
• Overlapping inner node computation with communication
  – Massive slowdown!
  – Probably due to cache effects
• Making use of the regular communication pattern
  – Slower (we have no idea why!)
EXPERIMENTAL RESULTS
Experimental setup
• Lonestar Linux cluster @ University of Texas
  – Part of the TeraGrid project
• 1300 compute nodes
• 2 Intel Xeon 2.66 GHz dual-core CPUs per node
  – 42.6 GFLOPS/node
• 8 GB RAM/node
• Linux kernel 2.6, 64 bit
• InfiniBand interconnect, fat tree topology
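The per-node peak figure is consistent with the following arithmetic, assuming each core completes 4 floating-point operations per clock cycle (an assumption; the slides do not state it):

```latex
\underbrace{2}_{\text{CPUs}} \times \underbrace{2}_{\text{cores per CPU}} \times 2.66\ \text{GHz} \times \underbrace{4}_{\text{flops per cycle}} = 42.56 \approx 42.6\ \text{GFLOPS per node}
```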
Actual speedup
Relation to expected speedup
QUESTIONS