A Framework for Elastic Execution of Existing MPI Programs
Aarthi Raveendran, Graduate Student
Department of CSE
Motivation
Emergence of Cloud Computing
• Including for HPC applications
Key Advantages of Cloud Computing
• Elasticity (dynamically acquire resources)
• Pay-as-you-go model
• Can be exploited to meet cost and/or time constraints
Existing HPC Applications
• MPI-based, use a fixed number of nodes
Need to make existing MPI applications elastic
Outline
• Research Objective
• Framework Design
• Runtime Support Modules
• Experimental Platform: Amazon Cloud Services
• Applications and Experimental Evaluation
• Decision Layer Design
• Feedback Model
• Decision Layer Implementation
• Experimental Results for Time and Cost Criteria
• Conclusion
Detailed Research Objective
To make MPI applications elastic
• Exploit a key advantage of cloud computing
• Meet user-defined time and/or cost constraints
• Avoid a new programming model or significant recoding
Design a framework for
• Decision making: when to expand or contract
• Actual support for elasticity: allocation, data redistribution, restart
Framework design – Approach and Assumptions
Target: iterative HPC applications
Assumption: uniform work done at every iteration
Monitoring at the start of every few iterations of the time-step loop
Checkpointing
Resource allocation and redistribution
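Under these assumptions, the monitored time-step loop can be sketched as follows (a simplified illustration: all names such as `run_elastic`, `decision_layer.decide`, and `app.checkpoint` are invented here, not the framework's actual API):

```python
import time

CHECK_EVERY = 10  # monitor once every few iterations

def run_elastic(app, total_iters, decision_layer):
    iter_times = []
    i = 0
    while i < total_iters:
        start = time.time()
        app.do_iteration(i)  # uniform work is assumed per iteration
        iter_times.append(time.time() - start)
        i += 1
        if i % CHECK_EVERY == 0:
            avg = sum(iter_times) / len(iter_times)
            target = decision_layer.decide(avg, i, total_iters)
            if target != app.node_count:
                app.checkpoint()  # collect live data before stopping
                # caller reallocates nodes, redistributes data, restarts
                return ("restart", target, i)
    return ("done", app.node_count, i)
```

The loop only consults the decision layer at monitoring points, which keeps reallocation from happening too frequently.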
Framework design - A Simple Illustration of the Idea
Progress is checked based on the current average iteration time
A decision is made to stop and restart if necessary
Reallocation should not be done too frequently
If restarting is not necessary, the application continues running
Framework design – A Simple illustration of the idea
Other Runtime Steps
Steps taken to perform scaling to a different number of nodes:
• Live variables and arrays are collected at the master node and redistributed
• Read-only data need not be restored; it is simply retrieved
• The application is restarted, with each node reading the local portion of the redistributed data
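The redistribution step relies on knowing each node's local portion under a block distribution. A minimal sketch of that computation (illustrative, not the framework's code; a standard contiguous block split with the remainder spread over the first blocks):

```python
def block_ranges(n, nodes):
    """Split n array elements into `nodes` contiguous blocks; the
    remainder goes one extra element to each of the first blocks."""
    base, extra = divmod(n, nodes)
    ranges, start = [], 0
    for r in range(nodes):
        size = base + (1 if r < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

On a node change, recomputing `block_ranges` with the new node count tells each restarted process which slice of the redistributed data to read.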
Background – Amazon cloud
Amazon Elastic Compute Cloud (EC2)
Small instances : 1.7 GB of memory, 1 EC2 Compute Unit, 160 GB of local instance storage, 32-bit platform
Large instances : 7.5 GB of memory, 4 EC2 Compute Units, 850 GB of local instance storage, 64-bit platform
On-demand, reserved, and spot instances
Amazon Simple Storage Service (S3)
Provides a key-value store
• Data stored in files
• Each file restricted to 5 GB
• Unlimited number of files
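The 5 GB per-file cap means a large array must be stored as multiple keyed chunks. A minimal sketch of that splitting logic (illustrative only; a real implementation would push each chunk through an S3 client rather than a dict):

```python
CHUNK_LIMIT = 5 * 1024 ** 3  # S3's per-file cap described above

def to_chunks(name, data, limit=CHUNK_LIMIT):
    """Split a byte buffer into keyed chunks no larger than `limit`."""
    return {f"{name}.part{i:06d}": data[off:off + limit]
            for i, off in enumerate(range(0, len(data), limit))}

def from_chunks(chunks):
    """Reassemble; zero-padded keys keep lexicographic order correct."""
    return b"".join(chunks[k] for k in sorted(chunks))
```

Storing arrays as many small chunks also lets newly added nodes fetch only the chunks covering their local portion.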
Runtime support modules – Resource allocator
Elastic execution
• Input taken from the decision layer on the number of resources
• Allocating / de-allocating resources in the AWS environment
• MPI configuration for these instances: setting up the MPI cluster, configuring password-less login among nodes
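The allocator's contract might look like the following sketch (all names invented; in practice `cloud` would wrap EC2 launch/terminate calls, and the returned node list would feed the MPI cluster configuration step):

```python
class ResourceAllocator:
    def __init__(self, cloud):
        self.cloud = cloud  # stand-in for an EC2 client
        self.nodes = []

    def resize(self, target):
        """Grow or shrink the pool to `target` instances, as directed
        by the decision layer, then return the nodes to configure."""
        while len(self.nodes) < target:
            self.nodes.append(self.cloud.launch())
        while len(self.nodes) > target:
            self.cloud.terminate(self.nodes.pop())
        return self.nodes
```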
Runtime support modules – Checkpointing and redistribution
Multiple design options feasible with the support available on AWS
• Amazon S3: unmodified arrays; quick access from EC2 instances; arrays stored in small-sized chunks
• Remote file copy: modified arrays (live arrays); file writes and reads
Runtime support modules – Checkpointing and redistribution
Current design
• Knowledge of the division of the original dataset is necessary
• Aggregation and redistribution done centrally on a single node
Future work
• Source-to-source transformation tool
• Decentralized array distribution schemes
Experiments
Framework and approach evaluated using
• Jacobi
• Conjugate Gradient (CG)
MPICH2 used
4, 8, and 16 small instances used for processing the data
Observations made with and without scaling the resources
Overheads of 5-10%, which are negligible
Experiments – Jacobi
No. Nodes | W/O Redist. (sec) | W/ Redist. (sec) | Data Movement (sec) | Overhead (%)
4  | 2810 | 2850 | 71 | 1
8  | 1649 | 1720 | 89 | 4
16 | 1001 | 1087 | 87 | 9
JACOBI APPLICATION WITHOUT SCALING THE RESOURCES

Starting Nodes | Final Nodes | MPI Config. (sec) | Data Movement (sec) | Total (sec) | Overhead (%)
4  | 8  | 81 | 3   | 2301 | 3
4  | 16 | 84 | 3   | 1998 | 5
8  | 4  | 80 | 3   | 2267 | 2
8  | 16 | 95 | 3.8 | 1386 | 4
16 | 4  | 99 | 3.5 | 2004 | 5
16 | 8  | 97 | 3   | 1390 | 5
JACOBI APPLICATION WITH SCALING THE RESOURCES
Experiments – Jacobi
Matrix updated at every iteration
Updated matrix collected and redistributed on a node change
Worst-case total redistribution overhead: less than 2%
Scalable application: performance improves with the number of nodes
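For reference, the live array here is the solution matrix of a Jacobi-style stencil update, sketched serially below (an illustrative textbook version, not the benchmark's actual code):

```python
def jacobi_step(u):
    """One Jacobi sweep: each interior point becomes the average of its
    four neighbours; `u` is the live matrix redistributed on a node change."""
    n = len(u)
    new = [row[:] for row in u]  # boundaries carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = 0.25 * (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1])
    return new
```

Because the whole matrix is updated every iteration, a node change requires gathering and redistributing it in full, which is what the worst-case overhead above measures.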
Experiments - CG
No. Nodes | W/O Redist. (sec) | W/ Redist. (sec) | Data Movement (sec) | Overhead (%)
4  | 834  | 879  | 2.5 | 5
8  | 997  | 980  | 3   | 0
16 | 1030 | 1105 | 2.7 | 7
CG APPLICATION WITHOUT SCALING THE RESOURCES

Starting Nodes | Final Nodes | MPI Config. (sec) | Data Movement (sec) | Total (sec) | Overhead (%)
4  | 8  | 43 | 3 | 930  | 2
4  | 16 | 60 | 3 | 999  | 7
8  | 4  | 40 | 4 | 942  | 3
8  | 16 | 81 | 3 | 1060 | 5
16 | 4  | 58 | 3 | 1003 | 8
16 | 8  | 82 | 3 | 1080 | 7
CG APPLICATION WITH SCALING THE RESOURCES
Experiments - CG
A single vector needs to be redistributed
Communication-intensive application
Not scalable
Overheads are still low
Decision Layer - Design
Main goal: to meet user demands
Constraints (time and cost) are "soft", not "hard"
Measuring iteration time to determine progress
Measuring communication overhead to estimate scalability
Moving to large-type instances if necessary
Feedback Model (I)
Dynamic estimation of node count based on inputs:
• Input time / cost
• Per-iteration time
• Current node count
• Communication time per iteration
• Overhead costs: restart, redistribution, data read
Feedback Model (II)
Move to large instances if communication time is greater than 30% of total time
Time criterion:
• New node count found based on the current progress
• If the time criterion cannot be met even with the maximum number of nodes, shift to the maximum to get the best possible result
Cost criterion:
• Change only at the end of a billing cycle
• If the cost criterion cannot be met even with the minimum number of nodes, shift to the minimum
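The time-criterion logic above might be sketched as follows (a simplification under the assumption that compute time scales roughly linearly with node count; the function name, bounds, and constants are illustrative, not the framework's actual model):

```python
import math

def pick_node_count(time_left, iters_left, iter_time, nodes,
                    min_nodes=4, max_nodes=16):
    """Estimate the node count needed to finish within `time_left`,
    given current per-iteration time on `nodes` nodes."""
    projected = iters_left * iter_time
    if projected <= time_left:
        return nodes  # on track: no change
    # Assume per-iteration time scales ~ nodes / new_nodes.
    needed = math.ceil(nodes * projected / time_left)
    # If even max_nodes cannot meet the deadline, run at max_nodes
    # anyway for the best possible result (constraints are "soft").
    return min(max(needed, min_nodes), max_nodes)
```

A cost-criterion counterpart would run the same estimate in reverse (shrinking toward `min_nodes`) and apply the change only at a billing-cycle boundary.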
Decision layer - Implementation
Input : Monitoring Interval, Criteria, Initial Node Count, Input Time / Cost
Output : Total Process Time, Total Cost
Experiments – Time Criteria (Jacobi)

[Chart: Input Time (t_ip, in sec) vs. Output Time (t_op, in sec)]
Experiments – Time Criteria (Jacobi)

[Chart: Node changes for different input times; node count vs. iteration number, one series per input time (800, 900, 1000, 1100, 1200, 1400 sec)]
Experiments – Time Criteria (CG)

I/p Time (sec) | Node Change (start: 4 small) | Iteration No. | o/p Time (sec) | Node Change (start: 4 large) | Iteration No. | o/p Time (sec)
300 | 4 small to 4 large | 5  | 448 | remains at 4 large | - | 357
600 | 4 small to 4 large | 35 | 592 | remains at 4 large | - | 371
800 | remains at 4 small | -  | 764 | 4 large to 4 small | 5 | 690
Node changes for different input times, for different types of start node
Experiments – Cost Criteria (Jacobi)

[Chart: Input Cost (C_ip, in $) vs. Output Cost (C_op, in $)]
Experiments – Cost Criteria (Jacobi)

[Chart: Node changes for different input costs; node count vs. iteration number, one series per input cost ($3, $4, $5, $6, $7, $10)]
Experiments – Cost Criteria (Jacobi)

[Chart: Input Cost (C_in, in $) vs. Output Time (t_op, in sec)]
Experiments – Cost Criteria (CG)

[Chart: Input Cost (C_ip, in $) vs. Output Cost (C_op, in $)]
Experiments – Cost Criteria (CG)

I/P Cost (in $) | Iteration No. | Node Change
4 | Never | 4 Small (no change)
5 | 45    | 4 Small to 4 Large
6 | 40    | 5 Small to 4 Large
7 | 35    | 6 Small to 4 Large
8 | 25    | 7 Small to 4 Large
9 | 5     | 8 Small to 4 Large
Node changes for different input costs
Experiments – Cost Criteria (CG)

[Chart: Input Cost (C_in, in $) vs. Output Time (t_op, in sec)]
Conclusion
An approach to make MPI applications elastic and adaptable
An automated framework for deciding the number of instances for execution, based on user demands (time / cost)
Framework tested using two MPI applications, showing low overheads during elastic execution and a best effort to meet user constraints