Meta Scheduling Sathish Vadhiyar Sources/Credits/Taken from: Papers listed in “References” slide


Evaluation of Job Scheduling Strategies for Grid Computing – Hamscher et al. (2000)

Scheduling structures

Centralized schedulers
    Single-site scheduling – a job does not span sites
    Multi-site scheduling – the opposite: a job may span several sites

Hierarchical structures – a central scheduler (metascheduler) for global scheduling, with local scheduling on individual sites

Decentralized scheduling – distributed schedulers interact, exchange information and submit jobs to remote systems
    Direct communication – a local scheduler directly contacts remote schedulers and transfers some of its jobs
    Communication via central job pool – jobs that cannot be executed immediately are pushed to a central pool; other local schedulers pull jobs out of the pool
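The central job pool variant can be illustrated with a small sketch. This is only one possible reading of the slide: the `Job` and `LocalScheduler` classes, the processor counts and the "fits in free processors" rule are assumptions made for illustration, not details taken from Hamscher et al.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    procs: int  # processors requested (hypothetical job model)


@dataclass
class LocalScheduler:
    site: str
    free_procs: int
    running: list = field(default_factory=list)

    def submit(self, job: Job, pool: deque) -> None:
        """Run the job now if it fits; otherwise push it to the central pool."""
        if job.procs <= self.free_procs:
            self._start(job)
        else:
            pool.append(job)

    def pull(self, pool: deque) -> None:
        """Pull every job that fits on this site out of the pool, oldest first."""
        for job in list(pool):  # iterate over a copy while removing from the pool
            if job.procs <= self.free_procs:
                pool.remove(job)
                self._start(job)

    def _start(self, job: Job) -> None:
        self.free_procs -= job.procs
        self.running.append(job.name)


pool = deque()
site_a = LocalScheduler("A", free_procs=4)
site_b = LocalScheduler("B", free_procs=16)

site_a.submit(Job("j1", procs=2), pool)  # fits on A, starts immediately
site_a.submit(Job("j2", procs=8), pool)  # does not fit on A, goes to the pool
site_b.pull(pool)                        # B has room, so it takes j2
```

In this sketch no scheduler needs to know about any other scheduler; the pool is the only shared state, which is the main contrast with direct communication.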

Various Scheduling Architectures

Multiple simultaneous requests – Subramani et al. (2002)

Job Scheduling Representation

Metascheduler across MPPs

Types

Centralized
    A meta scheduler and local dispatchers
    Jobs are submitted to the meta scheduler

Hierarchical
    Combination of central and local schedulers
    Jobs are submitted to the meta scheduler
    The meta scheduler sends each job to the site with the earliest expected start time
    Local schedulers can follow their own policies

Distributed
    Each site has a metascheduler and a local scheduler
    Jobs are submitted to the local metascheduler
    Jobs can be transferred to the sites with the lowest load
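The two site-selection rules above (earliest expected start time for the hierarchical scheme, lowest load for the distributed scheme) can be sketched as follows. The `Site` record and its two fields are hypothetical; how a real system estimates start time or load is not specified in the slides.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    expected_start: float  # estimated start time for the job (hypothetical field)
    load: float            # summary load figure for the site (hypothetical field)


def pick_hierarchical(sites):
    """Hierarchical scheme: choose the site with the earliest expected start time."""
    return min(sites, key=lambda s: s.expected_start)


def pick_distributed(sites):
    """Distributed scheme: choose the site with the lowest load."""
    return min(sites, key=lambda s: s.load)


sites = [
    Site("A", expected_start=120.0, load=0.9),
    Site("B", expected_start=30.0, load=0.7),
    Site("C", expected_start=45.0, load=0.2),
]

print(pick_hierarchical(sites).name)
print(pick_distributed(sites).name)
```

The example data is chosen so that the two rules disagree: a lightly loaded site is not necessarily the one where a particular job would start first.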

Evaluation of schemes

Centralized

1. Global knowledge of all resources – hence optimized schedules

2. Can act as a bottleneck for a large number of resources and jobs

3. May take time to transfer jobs from the meta scheduler to local dispatchers – the meta scheduler needs a strategic position

Hierarchical

1. Medium level overhead

2. Sub-optimal schedules

3. Still needs a strategic position for the central scheduler

Distributed

1. No bottleneck – workload evenly distributed

2. Needs all-to-all connections between MPPs

Experiments

Experiments to evaluate slowdowns in the 3 schemes
Based on an actual trace from a supercomputer centre – 5000 job set
4 sites were simulated – 2 with the same load as the trace, the other 2 with run times multiplied by 1.7
FCFS with EASY backfilling was used
slowdown = (wait_time + run_time) / run_time
2 more schemes
    Independent – local schedulers act independently, i.e. sites are not connected
    United – resources of all processors are combined to form a single site
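The slowdown metric defined above is simple to compute per job and to average over a trace. The sketch below uses made-up (wait_time, run_time) pairs, not values from the actual 5000 job trace, and the function names are illustrative.

```python
def slowdown(wait_time: float, run_time: float) -> float:
    """slowdown = (wait_time + run_time) / run_time, as defined on the slide."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return (wait_time + run_time) / run_time


def mean_slowdown(jobs) -> float:
    """Average slowdown over an iterable of (wait_time, run_time) pairs."""
    values = [slowdown(wait, run) for wait, run in jobs]
    return sum(values) / len(values)


# Illustrative jobs only: (wait_time, run_time) in seconds.
jobs = [(0.0, 100.0), (300.0, 100.0), (50.0, 50.0)]
print([slowdown(w, r) for w, r in jobs])  # [1.0, 4.0, 2.0]
print(mean_slowdown(jobs))
```

A job that never waits has a slowdown of exactly 1, and short jobs are penalised heavily by even modest waits, which is why the metric is sensitive to the job mix.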

Results

Observations

1. Centralized and hierarchical performed slightly better than united

a. Compared to hierarchical, scheduling decisions in united have to be made for all jobs and all resources – the overhead, and hence the wait time, is high

b. Comparing united and centralized:

i. 4 categories of jobs corresponding to the 4 combinations of 2 parameters – execution time (short, long) and number of resources requested (narrow, wide)

ii. Usually there are more long narrow jobs than short wide jobs

iii. Why are centralized and hierarchical better than united?

2. Distributed performed poorly

a. Decisions based on summary information may not yield good results

b. Backfilling dynamics are complex

Newly Proposed Models

K-distributed model
    Distributed scheme where the local metascheduler distributes each job to the k least loaded sites
    When the job starts on a site, a notification is sent to the local metascheduler, which in turn asks the other k-1 schedulers to dequeue it
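A minimal sketch of the K-distributed protocol described above, assuming each site exposes a single load number and a plain list as its queue. The `Site` class and both function names are hypothetical and are not taken from Subramani et al.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    load: float                               # summary load (hypothetical field)
    queue: list = field(default_factory=list)


def submit_k_distributed(job, sites, k):
    """Enqueue the job at the k least loaded sites and return those sites."""
    targets = sorted(sites, key=lambda s: s.load)[:k]
    for site in targets:
        site.queue.append(job)
    return targets


def notify_started(job, started_at, targets):
    """The job started at one site: remove it there and dequeue the other k-1 copies."""
    started_at.queue.remove(job)
    for site in targets:
        if site is not started_at:
            site.queue.remove(job)


sites = [Site("A", 0.9), Site("B", 0.3), Site("C", 0.5), Site("D", 0.7)]
targets = submit_k_distributed("job-1", sites, k=2)
target_names = [s.name for s in targets]      # the two least loaded sites
notify_started("job-1", started_at=targets[1], targets=targets)
```

In the example the job starts on the second choice rather than the least loaded site, which is the case the multiple-request idea is meant to exploit: the site that looks best from summary load information is not always the one that can start the job first.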

K-Dual queue model
    2 queues are maintained at each site – one for local jobs and the other for remote jobs
    Remote jobs are executed only when they don’t affect the start times of the local jobs
    Local jobs are given priority during backfilling
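The dual-queue rule can be sketched as a single scheduling pass. This is a deliberately simplified model and not the algorithm from the paper: it assumes one reservation time for the first local job that does not fit, and it lets a job backfill only if it is guaranteed to finish before that reservation. The `Job` class, `select_jobs` and the `reservation` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int        # processors requested
    run_time: float   # estimated run time (hypothetical field)


def select_jobs(local_q, remote_q, free_procs, now, reservation=None):
    """Return the names of the jobs to start now, in start order."""
    started = []
    waiting_local = list(local_q)

    # 1. Start local jobs in FCFS order until one does not fit.
    while waiting_local and waiting_local[0].procs <= free_procs:
        job = waiting_local.pop(0)
        started.append(job.name)
        free_procs -= job.procs
    blocked = bool(waiting_local)  # the head local job is waiting for processors

    # 2. Backfill: remaining local jobs are considered before remote jobs.
    candidates = waiting_local[1:] + list(remote_q)
    for job in candidates:
        if job.procs > free_procs:
            continue
        # While a local job is blocked, a candidate may start only if it
        # finishes before that job's reserved start time (simplifying assumption).
        if blocked and (reservation is None or now + job.run_time > reservation):
            continue
        started.append(job.name)
        free_procs -= job.procs
    return started


local_q = [Job("L1", 4, 50.0), Job("L2", 16, 200.0), Job("L3", 2, 80.0)]
remote_q = [Job("R1", 2, 300.0), Job("R2", 2, 60.0)]
result = select_jobs(local_q, remote_q, free_procs=8, now=0.0, reservation=100.0)
print(result)  # ['L1', 'L3', 'R2']
```

Here R1 fits in the free processors but would still be running at time 100, when the blocked local job L2 is due to start, so it is held back; R2 finishes in time and is allowed to run.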

Results – Benefits of new schemes

45% improvement 15% improvement

Results – Usefulness of K-Dual scheme

Grouping jobs submitted at lightly loaded sites and heavily loaded sites

References

T. L. Casavant and J. G. Kuhl. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Transactions on Software Engineering, Volume 14, Issue 2 (February 1988), pages 141-154.

Volker Hamscher, Uwe Schwiegelshohn, Achim Streit and Ramin Yahyapour. Evaluation of Job-Scheduling Strategies for Grid Computing. Proceedings of the First IEEE/ACM International Workshop on Grid Computing, Lecture Notes in Computer Science, pages 191-202, 2000. ISBN 3-540-41403-7.

Vijay Subramani, Rajkumar Kettimuthu, Srividya Srinivasan and P. Sadayappan. "Distributed Job Scheduling on Computational Grids using Multiple Simultaneous Requests". Proceedings of the 11th IEEE Symposium on High Performance Distributed Computing (HPDC 2002), July 2002.

Taxonomy of scheduling for distributed heterogeneous systems – Casavant and Kuhl (1988)

Taxonomy

Local vs Global
    Local – scheduling processes to time slices on a single processor
    Global – deciding which processor a job should go to

Approximate vs heuristic
    Approximate – stop when you find a “good” solution. Uses the same formal computational model as a search for the optimal solution. The ability to succeed depends on:
        Availability of a function to evaluate a solution
        The time required to evaluate a solution
        The ability to judge according to some metric value
        A mechanism to intelligently prune the solution space
    Heuristics
        Work on assumptions about the impact of “important” parameters
        The assumptions and the amount of their impact cannot always be quantified
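The approximate approach can be illustrated with a toy search over job-to-processor assignments that stops at the first solution judged "good". This is only a sketch: using makespan as the evaluation function and a fixed `good_enough` threshold are assumptions made for illustration, and the function name is hypothetical.

```python
import itertools


def approximate_assign(job_costs, n_procs, good_enough):
    """Enumerate assignments and stop at the first with makespan <= good_enough.

    Returns (makespan, assignment) for the best assignment seen so far, where
    assignment[i] is the processor index given to job i.
    """
    best = None
    for assignment in itertools.product(range(n_procs), repeat=len(job_costs)):
        # Evaluation function: the makespan, i.e. the load of the busiest processor.
        loads = [0.0] * n_procs
        for cost, proc in zip(job_costs, assignment):
            loads[proc] += cost
        makespan = max(loads)
        if best is None or makespan < best[0]:
            best = (makespan, assignment)
        if makespan <= good_enough:
            break  # a "good" solution was found, so the search stops here
    return best


result = approximate_assign([4, 3, 3, 2], n_procs=2, good_enough=7)
print(result)  # (7.0, (0, 0, 1, 1))
```

The optimal makespan for these four jobs is 6 (jobs of cost 4 and 2 on one processor, 3 and 3 on the other), but the search accepts 7 and stops after four candidates instead of enumerating all sixteen. This sketch has no pruning step; a practical approximate scheduler would also need the fourth ingredient listed above.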

Also…

Flat characteristics
    Adaptive vs. non-adaptive
    Load balancing
    Bidding – e.g. Condor
    Probabilistic – random searches
    One time assignment vs. dynamic reassignment
