Performance comparison between MPI and AMPI
Rabin Ranabhat, Loras College
[email protected]
27, 2011
Mentors: Jim Edwards, John Dennis
Overview
• Motivation
• Objective
• Parallel Program
• MPI
• AMPI
– Virtualization
– Migration
Motivation
Objective
• Investigate performance of AMPI and MPI with a parallel program
Parallel Program
• Mandelbrot set used as the parallel program
• 2-dimensional decomposition
[Figure: Mandelbrot set, showing high-computation and low-computation areas]
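The uneven load across the 2-D decomposition comes straight from the escape-time kernel: points inside the set run to the full iteration cap, while points that diverge quickly are cheap. A minimal sketch of that kernel (the function name and iteration cap are illustrative, not from the talk):

```c
/* Escape-time iteration for one point c = (cx, cy) of the complex plane.
 * Returns the iteration at which |z| exceeds 2, or max_iter if the
 * point never escapes (i.e. it is in the Mandelbrot set). */
int escape_count(double cx, double cy, int max_iter) {
    double zx = 0.0, zy = 0.0;
    for (int i = 0; i < max_iter; i++) {
        double x2 = zx * zx - zy * zy + cx;  /* z = z^2 + c, real part */
        zy = 2.0 * zx * zy + cy;             /* imaginary part */
        zx = x2;
        if (zx * zx + zy * zy > 4.0)         /* |z| > 2: diverges */
            return i;
    }
    return max_iter;
}
```

A tile whose points mostly sit inside the set costs up to `max_iter` iterations per pixel, while a tile far outside costs only one or two, which is why static decomposition leaves some processors idle.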
Message Passing Interface (MPI)
• De facto standard for running parallel programs
• First implemented by William Gropp and Ewing Lusk
• Allows processes to communicate by passing messages
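The process-to-process communication the slide refers to can be shown with a minimal point-to-point exchange (a generic sketch, not the Mandelbrot code from the talk; run with at least two ranks):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Send one int to rank 1 with message tag 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive one int from rank 0 */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

In the Mandelbrot program, the same send/receive pattern would carry tile boundaries and results between worker ranks.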
MPI tasks vs. time
[Figure: time taken for execution (seconds) vs. total number of processes]
Adaptive MPI (AMPI)
• An MPI implementation based on Charm++
• Developed at the University of Illinois at Urbana-Champaign
• AMPI was introduced because:
– The processor set available may not be what the application needs
AMPI
• Provides virtualization
– MPI tasks implemented as user-level threads
[Figure: MPI tasks (virtual processors) mapped onto Physical Processor #1 and Physical Processor #2]
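The virtualization ratio is chosen at launch time. A hypothetical launch, assuming a program built with AMPI's `ampicc` wrapper and started through `charmrun` (the executable and source names are illustrative):

```shell
# Compile the unmodified MPI source with AMPI's wrapper compiler
ampicc -o mandelbrot mandelbrot.c

# Run 16 virtual processors (MPI ranks) on 4 physical processors,
# i.e. a VP ratio of 4
./charmrun ./mandelbrot +p4 +vp16
```

Because ranks are user-level threads, the same binary can be rerun with a different `+vp` count to vary the VP ratio without recompiling.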
VP ratio vs. time
[Figure: time taken (seconds) vs. ratio of virtual to physical processors; 400 x 400 grid size, 32 physical processors]
Adaptive MPI: migration
• Over-decomposition lets threads migrate, which enables dynamic load balancing
• Tradeoff between load-balancing overhead and performance gained
• Migration is requested through the subroutine call MPI_Migrate()
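The MPI_Migrate() call mentioned above is typically placed inside the application's main loop so the runtime gets periodic chances to rebalance. A sketch of how that might look (the loop structure, interval, and `compute_step` helper are illustrative assumptions, not code from the talk):

```c
#include <mpi.h>

#define MIGRATE_INTERVAL 50   /* assumed: rebalance every 50 steps */

void compute_step(int step);  /* assumed per-iteration Mandelbrot work */

void run(int total_steps) {
    for (int step = 0; step < total_steps; step++) {
        compute_step(step);

        /* Every MIGRATE_INTERVAL steps, let the AMPI runtime move this
         * user-level thread to a less loaded physical processor.
         * MPI_Migrate() is an AMPI extension, not part of standard MPI. */
        if (step % MIGRATE_INTERVAL == 0)
            MPI_Migrate();
    }
}
```

Calling it more often gives the load balancer finer control but raises overhead; calling it rarely lets imbalance persist, which is the tradeoff the slide describes.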
Migration frequency
[Figure: two migration frequencies compared, Case 1 and Case 2]
MPI and AMPI task with migration
[Figure: time taken (seconds) vs. number of processes, comparing MPI and AMPI]
Conclusion
• AMPI outperforms MPI under certain conditions
• On average, AMPI jobs with load balancing ran 5% faster than MPI
• We were not able to use MPI-IO successfully with AMPI
Questions
Email: [email protected]