How to simulate job scheduling using SLAs in a high-performance computing environment by extending the Alea Grid Scheduling Simulator.
Outline: Introduction; Job Scheduling - with and without SLAs; Simulating SLA-based scheduling; Conclusions and next steps; Discussion
Simulating the usage of SLAs for job scheduling in an HPC environment
Roland Kübert
Höchstleistungsrechenzentrum Stuttgart
January 31, 2010
Roland Kübert Simulating the usage of SLAs for job scheduling in an HPC environment
1 Introduction
2 Job Scheduling - with and without SLAs
3 Simulating SLA-based scheduling
4 Conclusions and next steps
5 Discussion
Motivation
HPC services are offered only on a best-effort basis
Scheduling parameters are few and only trivial
Work on SLAs has been carried out at HLRS, but at a higher level
Job scheduling
scheduling: “to plan (something) at a certain time”
Scheduling is used in many fields
Job scheduling assigns computational jobs to processing units
Service Level Agreements in one sentence
“The purpose of [a] Service Level Agreement (SLA) is to define the services and responsibilities of the [service provider] and its clients.” (Michigan State University High Performance Computing Center Service Level Agreement)
Classical job scheduling
Objective is mostly to maximize utilization or minimize waiting time
Various algorithms with different advantages
Either schedule-based or queue-based
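As an illustration of the queue-based case and of the two objectives named above, here is a minimal sketch of first-come-first-served scheduling on one cluster. The `Job` fields, the function name `fcfs` and the example numbers are assumptions made for this sketch; none of it is taken from Alea.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    arrival: float  # submission time
    runtime: float  # execution time
    cpus: int       # number of CPUs requested


def fcfs(jobs, total_cpus):
    """Queue-based first-come-first-served: jobs start strictly in arrival order.

    Returns (average waiting time, utilization). Hypothetical helper for illustration.
    """
    ordered = sorted(jobs, key=lambda j: j.arrival)
    running = []            # min-heap of (finish_time, cpus) for started jobs
    free = total_cpus
    clock = 0.0
    waits = []
    busy_cpu_time = 0.0
    makespan = 0.0
    for job in ordered:
        if job.cpus > total_cpus:
            raise ValueError("job requests more CPUs than the cluster has")
        clock = max(clock, job.arrival)
        # The head of the queue blocks everything behind it until enough CPUs are free.
        while free < job.cpus:
            finish, cpus = heapq.heappop(running)
            clock = max(clock, finish)
            free += cpus
        heapq.heappush(running, (clock + job.runtime, job.cpus))
        free -= job.cpus
        waits.append(clock - job.arrival)
        busy_cpu_time += job.runtime * job.cpus
        makespan = max(makespan, clock + job.runtime)
    span = makespan - ordered[0].arrival
    return sum(waits) / len(waits), busy_cpu_time / (total_cpus * span)


jobs = [Job(0, 10, 2), Job(0, 10, 2), Job(1, 5, 2)]
avg_wait, utilization = fcfs(jobs, total_cpus=4)
print(avg_wait, round(utilization, 3))
```

A schedule-based algorithm would instead plan start times for all known jobs in advance; this sketch only covers the queue-based variant.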
Job scheduling - with SLAs
A quite popular field
Two main streams
SLAs per job
Trivial QoS parameters (timing and resource requirements)
Relies on precise specification of job execution times
Simulating SLA-based job scheduling
Just implementing some scheduling won’t work
Production use cannot be done without previous investigations
Therefore, use a simulation tool: Alea
Needs to be extended in order to investigate SLAs
Alea’s features
Supports different workload formats
Various scheduling algorithms already implemented
Visualization features
Free software (LGPL)
Alea’s graphs
Figure: Screenshot of Alea
Alea’s shortcomings
Many hard-coded settings (magic numbers)
No extensibility foreseen
Not really user-friendly
No further developments
Alea’s architecture
Figure: High-level architecture of Alea 2.1
Simulation of service levels
Simulation of three different service levels: gold, silver, bronze
Different service level distributions were generated and simulated against a workload (San Diego Supercomputer Center’s Blue Horizon, 144 nodes x 8 CPUs)
Investigated changes in waiting time with different distributions of service levels
Example (Gold-Silver-Bronze): 0-0-100, 0-5-95, 1-4-95, 2-3-95, etc.
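The slides do not say how the distributions were generated. One plausible reading of the example series is that the bronze share is held fixed while the remainder is split between gold and silver. The sketch below follows that assumption; the function names and the random assignment of levels to jobs are hypothetical.

```python
import random

LEVELS = ["gold", "silver", "bronze"]


def distributions(bronze_share):
    """All gold-silver-bronze percentage splits for a fixed bronze share (assumed scheme)."""
    rest = 100 - bronze_share
    return [(gold, rest - gold, bronze_share) for gold in range(rest + 1)]


def assign_levels(n_jobs, distribution, seed=0):
    """Tag each job of a workload with a service level drawn from the distribution."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return rng.choices(LEVELS, weights=distribution, k=n_jobs)


print(distributions(95))
levels = assign_levels(1000, (1, 4, 95))
print({level: levels.count(level) for level in LEVELS})
```

With a bronze share of 95, `distributions` yields the splits 0-5-95 through 5-0-95, which covers the series given in the example.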
Simulation results
Machine usage did not change
Introducing service levels increases average wait time
Increasing the number of prioritized jobs increases wait time for lower-prioritized classes
Ensuring that not too many high-priority jobs exist enables the service provider to give “soft” guarantees on wait time
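The mechanism behind the per-class wait times can be shown with a toy model. It is a sketch under strong assumptions (every job uses exactly one CPU, scheduling is non-preemptive, gold runs before silver before bronze) and does not reproduce the Alea experiments or their numbers. All names are hypothetical.

```python
import heapq
from collections import defaultdict

PRIORITY = {"gold": 0, "silver": 1, "bronze": 2}  # lower value is served first


def simulate(jobs, cpus):
    """jobs: list of (arrival, runtime, level). Returns the mean wait per level."""
    pending = sorted(jobs)      # ordered by arrival time
    free_at = [0.0] * cpus      # min-heap: time at which each CPU becomes free
    queue = []                  # heap of (priority, arrival, index, runtime, level)
    waits = defaultdict(list)
    clock = 0.0
    i = 0
    while i < len(pending) or queue:
        # The next start can happen no earlier than the earliest free CPU.
        clock = max(clock, heapq.heappop(free_at))
        if not queue and pending[i][0] > clock:
            clock = pending[i][0]   # machine idles until the next arrival
        # Admit every job that has arrived by now.
        while i < len(pending) and pending[i][0] <= clock:
            arrival, runtime, level = pending[i]
            heapq.heappush(queue, (PRIORITY[level], arrival, i, runtime, level))
            i += 1
        _, arrival, _, runtime, level = heapq.heappop(queue)
        waits[level].append(clock - arrival)
        heapq.heappush(free_at, clock + runtime)
    return {level: sum(w) / len(w) for level, w in waits.items()}


with_gold = simulate([(0, 10, "bronze"), (1, 10, "bronze"), (2, 10, "gold")], cpus=1)
all_bronze = simulate([(0, 10, "bronze"), (1, 10, "bronze"), (2, 10, "bronze")], cpus=1)
print(with_gold, all_bronze)
```

In this tiny example the gold job overtakes a bronze job that arrived earlier, so the mean bronze wait rises compared with the run in which all jobs are bronze. Whether the overall average also rises depends on the workload; here all runtimes are equal, so it does not.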
Conclusions
Using SLAs for scheduling is possible (duh)
Can range from trivial to complex
Simulation is a good way to examine different parameters, combinations, workloads, objective functions, ...
Publication has been accepted at PARENG 2011
Next steps
Improvements on Alea
Conceptual implementation
Queue-based against schedule-based algorithms
Additional, more complex service levels
Questions
Figure: The Flammarion wood engraving