RMS and Scheduling for Future Generation Grids
Ramin Yahyapour
University Dortmund
Leader, CoreGRID Institute on Resource Management and Scheduling
CoreGRID Summer School, Bonn, 24 July 2006
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Introduction
We all know what “the Grid” is…
– one of the many definitions:
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
– however, the actual scope of “the Grid” is still quite controversial
Many people consider High Performance Computing (HPC) as the main Grid application.
– today’s Grids are mostly Computational Grids or Data Grids with HPC resources as building blocks
– thus, Grid resource management is closely related to resource management on HPC resources (our starting point).
– we will return to a broader Grid scope and its implications later
Key Question
“Which services/resources to use for an activity, when, where, how?”
Typically: a particular user, business application, or component application needs one or several services/resources for an activity, under given constraints:
• Trust & Security
• Timing & Economics
• Functionality & Service level
• Application-specifics & Inter-dependencies
• Scheduling and Access Policies
This question has to be answered in an automatic, efficient, and reliable way.
Part of the invisible and smart infrastructure!
Motivation
Resource Management for Future/Next Generation Grids!
But what are Future Generation Grids?
HPC Computing
– Parallel Computing
– Cluster Computing
– Desktop Computing
Enterprise Grids
– Business Services
– Application Server
– Web services
Ambient Intelligence, Ubiquitous Computing
– PDA, Mobile Devices
It depends on who you ask!
Resource Definition
Concluding from the different interpretations of “Grid”: for broad acceptance, a Grid RMS should probably cover the whole scope.
Resources:
Compute
Network
Storage
Data
Software
– components, licenses
Services
– functionality, ability
Management of some resources is less complex,
while other resources require coordination and orchestration to be effective (e.g. HW and SW).
Resource Management Layer
A Grid Resource Management System consists of:
Local resource management system (Resource Layer)
– Basic resource management unit
– Provides a standard interface for using remote resources
– e.g. GRAM, etc.
Global resource management system (Collective Layer)
– Coordinates all local resource management systems within multiple or distributed Virtual Organizations (VOs)
– Provides high-level functionality to efficiently use all resources
• Job Submission
• Resource Discovery and Selection
• Scheduling
• Co-allocation
• Job Monitoring, etc.
– e.g. Meta-scheduler, Resource Broker, etc.
Grid RMS
[Figure: layered Grid RMS architecture. From top to bottom: User/Application; Higher-Level Services (Resource Broker); Grid Middleware (Grid Resource Managers; Core Grid Infrastructure Services: Information, Monitoring, and Security Services); Local Resource Management (PBS, LSF, …); Resources.]
Core Functionalities of a Grid RMS
Resource Discovery
– online, on-demand process
Access to Resource Information
– static and dynamic information
Status Monitoring
– general resource monitoring
– monitoring with respect to a job
Allocation/Scheduling
– coordination is required
SLA Management
– reliable agreements
Execution Management/Provisioning
– start of a job / use of a resource
Accounting and Billing
Case 1: RMS for specialized Applications
Specialized resource management dedicated to a single application domain.
– Goal: high efficiency
– Cost: higher development effort
The RMS is adapted to:
– application and its workflow
– resource configuration
There is need for specific interfaces to the resources.
Highly specialized for the application and therefore easier to handle for the user.
– The know-how has been built into the system.
Only certain types of jobs and resources are considered.
Case 2: RMS as Generic Grid-Middleware
Grid RMS is open for many applications
This may be less efficient than Case 1.
Generic interfaces are required that are adapted to many front- and backends.
This approach requires additional user-/application supplied information:
– job description• workflow, objectives, requirements, constraints
Consideration of security is an integral aspect
– wide variety of security levels
An RMS for Future Generation Grids needs the flexibility to cover all kinds of jobs and resources.
FGG Resource Management
Need for well-defined interfaces to core services
Inherent support for different implementations, while maintaining cooperation between these implementations
Core services:
Resource Discovery
Access to Resource Information
Status Monitoring
Allocation/Scheduling
SLA Management
Execution Management/Provisioning
Accounting and Billing
Requirements
Resource Discovery:
– scalable
• from cluster grids
• and business grids
• to global grids
– centralized or decentralized implementations, P2P
– unified naming scheme
Access to resource information:
– static and historic information
– dynamic (future) information:
• planned, predicted
– may be subject to privacy concerns
• user and owner dependent
Aspects:
flexibility
scalability
efficiency
Problem: Job Submission Descriptions differ
The deliverables of the GGF/OGF Working Group JSDL:
A specification for an abstract standard Job Submission Description Language (JSDL) that is independent of language bindings, including:
– the JSDL feature set and attribute semantics,
– the definition of the relationship between attributes,
– and the range of attribute values.
A normative XML Schema corresponding to the JSDL specification.
A document of translation tables to and from the scheduling languages of a set of popular batch systems for both the job requirements and resource description attributes of those languages, which are relevant to the JSDL.
JSDL Attribute Categories
The job attribute categories include:
– Job Identity Attributes
• ID, owner, group, project, type, etc.
– Job Resource Attributes
• hardware, software, including applications, Web and Grid Services, etc.
– Job Environment Attributes
• environment variables, argument lists, etc.
– Job Data Attributes
• databases, files, data formats, and staging, replication, caching, and disk requirements, etc.
– Job Scheduling Attributes
• start and end times, duration, immediate dependencies, etc.
– Job Security Attributes
• authentication, authorisation, data encryption, etc.
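As an illustration of how these categories might be grouped in a program, here is a minimal Python sketch. It is not the JSDL schema: the class name, the field names, and the example values are all hypothetical and only mirror the six categories listed above.

```python
from dataclasses import dataclass, field


@dataclass
class JobDescription:
    """Hypothetical container mirroring the six attribute categories (not real JSDL)."""
    identity: dict = field(default_factory=dict)     # ID, owner, group, project, type
    resources: dict = field(default_factory=dict)    # hardware, software, services
    environment: dict = field(default_factory=dict)  # environment variables, arguments
    data: dict = field(default_factory=dict)         # files, staging, disk requirements
    scheduling: dict = field(default_factory=dict)   # start/end times, dependencies
    security: dict = field(default_factory=dict)     # authentication, authorisation

    def missing_categories(self):
        """Return the names of the categories that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Example values are invented for illustration only.
job = JobDescription(
    identity={"owner": "alice", "project": "demo"},
    resources={"cpu_count": 48, "memory_gb": 1},
    scheduling={"duration_minutes": 60},
)
print(job.missing_categories())
```

A generic scheduler could use such a check to decide which categories it has to fill with defaults before translating the description into a batch system's own language.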
Requirements
Status monitoring:
– job and resource condition
– SLA status
Autonomic aspects:
– detection of unexpected changes
– allows prediction of system behavior
• related to an individual job
• and to general demand
– trigger of re-scheduling/re-allocation
Aspects:
reliability
scalability
Requirements
Allocation/Scheduling:
– Different application scenarios
• parallel, sequential jobs
• co-allocation and orchestration
• workflows
– Provider policies
• access, cost, security
– User/application policies
• scheduling objectives
• cost/budget management
• deadlines
– Cooperation between RM systems
– Support for different (= individual) algorithms and strategies
Aspects:
flexibility, easy-to-use
support business models
person-centric
efficiency
Different Levels of Scheduling
Resource-level scheduler
– low-level scheduler, local scheduler, local resource manager
– scheduler close to the resource, controlling a supercomputer, cluster, or network of workstations, on the same local area network
– Examples: Open PBS, PBS Pro, LSF, SGE
Enterprise-level scheduler
– Scheduling across multiple local schedulers belonging to the same organization
– Examples: PBS Pro peer scheduling, LSF Multicluster
Grid-level scheduler
– also known as super-scheduler, broker, community scheduler
– Discovers resources that can meet a job’s requirements
– Schedules across lower level schedulers
Grid-Level Scheduler
Discovers & selects the appropriate resource(s) for a job
If selected resources are under the control of several local schedulers, a meta-scheduling action is performed
Architecture:
– Centralized: all lower-level schedulers are under the control of a single Grid scheduler
• not realistic in global Grids
– Distributed: lower-level schedulers are under the control of several Grid scheduler components; a local scheduler may receive jobs from several components of the Grid scheduler
Grid Scheduling
[Figure: a Grid user submits jobs to the Grid scheduler, which dispatches them to Machines 1, 2, and 3; each machine has its own local scheduler, job queue, and schedule over time.]
Activities of a Grid Scheduler
GGF Document: “10 Actions of Super Scheduling (GFD-I.4)”
Phase One: Resource Discovery
1. Authorization Filtering
2. Application Definition
3. Min. Requirement Filtering
Phase Two: System Selection
4. Information Gathering
5. System Selection
Phase Three: Job Execution
6. Advance Reservation
7. Job Submission
8. Preparation Tasks
9. Monitoring Progress
10. Job Completion
11. Clean-up Tasks
Source: Jennifer Schopf
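The three phases can be sketched as a small pipeline. This is only an illustration under stated assumptions: the dictionary layout, the function names, and the "shortest queue" selection rule are hypothetical and are not taken from the GGF document; advance reservation is left out.

```python
def resource_discovery(resources, user, job):
    """Phase one: keep resources the user may use and that meet minimal requirements."""
    authorized = [r for r in resources if user in r["allowed_users"]]
    return [r for r in authorized if r["nodes"] >= job["min_nodes"]]


def system_selection(candidates, get_queue_length):
    """Phase two: gather dynamic information, then pick one system (or None)."""
    if not candidates:
        return None
    return min(candidates, key=lambda r: get_queue_length(r["name"]))


def job_execution(resource, job, log):
    """Phase three: record the execution steps instead of really running anything."""
    for step in ("submit", "prepare", "monitor", "complete", "clean-up"):
        log.append((step, job["name"], resource["name"]))
    return log


# Invented example data.
resources = [
    {"name": "A", "nodes": 64, "allowed_users": {"alice"}},
    {"name": "B", "nodes": 16, "allowed_users": {"alice", "bob"}},
    {"name": "C", "nodes": 128, "allowed_users": {"alice"}},
]
queue_lengths = {"A": 20, "B": 2, "C": 5}
job = {"name": "sim", "min_nodes": 48}

candidates = resource_discovery(resources, "alice", job)
chosen = system_selection(candidates, queue_lengths.get)
log = job_execution(chosen, job, [])
print(chosen["name"], [step for step, _, _ in log])
```

Resource B has the shortest queue but is filtered out in phase one because it is too small, which is why filtering comes before any dynamic information is gathered.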
Select a Resource for Execution
Most systems do not provide advance information about future job execution
– user information is not accurate, as mentioned before
– new jobs arrive that may overtake current queue entries due to higher priority
The Grid scheduler might consider the current queue situation; however, this does not give reliable information about future executions:
– A job may wait a long time in a short queue while it would have been executed earlier on another system.
Available information:
– Grid information service gives the state of the resources and possibly authorization information
– Prediction heuristics: estimate a job’s wait time for a given resource, based on the current state and the job’s requirements.
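A very simple prediction heuristic of this kind might look as follows. This is a sketch under strong assumptions, not a method from the slides: it divides the outstanding node-seconds by the machine size and ignores priorities, backfilling, and inaccurate user estimates. All names and numbers are hypothetical.

```python
def estimate_wait_seconds(total_nodes, running, queued):
    """Rough wait estimate: outstanding node-seconds divided by machine size.

    running and queued are lists of (nodes, remaining_or_requested_seconds).
    """
    outstanding = sum(nodes * seconds for nodes, seconds in running + queued)
    return outstanding / total_nodes


def pick_resource(resources):
    """Return the name of the resource with the smallest estimated wait."""
    return min(resources, key=lambda name: estimate_wait_seconds(*resources[name]))


resources = {
    # name: (total_nodes, running jobs, queued jobs)
    "short-queue": (16, [(16, 7200)], [(16, 3600)]),
    "long-queue": (128, [(32, 600), (32, 600)], [(16, 300), (16, 300), (16, 300)]),
}
for name, args in resources.items():
    print(name, estimate_wait_seconds(*args))
print(pick_resource(resources))
```

In this invented example the machine with only one queued job has the longer estimated wait, which is the situation described above: queue length alone is a poor indicator.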
Requirements (contd)
SLA management:
– reliability
– orchestration of services
– quality of service
– business models
– accountability
Aspects:
persistence
support business models
Co-allocation
It is often requested that several resources are used for a single job.
– that is, a scheduler has to ensure that all resources are available when needed
• in parallel (e.g. visualization and processing)
• with time dependencies (e.g. a workflow)
The task is especially difficult if the resources belong to different administrative domains.
– The actual allocation time must be known for co-allocation
– or the different local resource management systems must synchronize with each other (wait for availability of all resources)
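For the parallel case, if every resource can report its free time windows, co-allocation reduces to finding a common slot. The following is a minimal sketch under that assumption; the window representation, resource names, and function names are hypothetical.

```python
def is_free(free_windows, start, duration):
    """True if [start, start + duration) fits inside one free window."""
    return any(lo <= start and start + duration <= hi for lo, hi in free_windows)


def earliest_common_start(free_by_resource, duration):
    """Earliest start at which every resource is free for the whole duration.

    A common start, if one exists, can always be shifted left to the opening
    time of some window, so only those opening times need to be tried.
    """
    candidates = sorted(lo for windows in free_by_resource.values() for lo, _ in windows)
    for start in candidates:
        if all(is_free(w, start, duration) for w in free_by_resource.values()):
            return start
    return None


# Free windows as (start, end) pairs in arbitrary time units (invented data).
free = {
    "compute": [(0, 4), (6, 12)],
    "visualization": [(2, 5), (7, 15)],
    "network": [(0, 20)],
}
print(earliest_common_start(free, 3))
print(earliest_common_start(free, 10))
```

The second call returns None: no window on the compute resource is long enough, so the job cannot be co-allocated without one of the providers changing its schedule.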
Example Multi-Site Job Execution
A job uses several resources at different sites in parallel. Network communication is an issue.
[Figure: the Grid scheduler places a multi-site job across Machines 1, 2, and 3; each machine has its own local scheduler, job queue, and schedule over time.]
Advance Reservation
Co-allocation and other applications require a priori information about the precise resource availability
With the concept of advance reservation, the resource provider guarantees a specified resource allocation.
– includes a two- or three-phase commit for agreeing on the reservation
Implementations:
– GARA/DUROC/SNAP provide interfaces for Globus to create advance reservations
– implementations for network QoS are available
• setup of a dedicated bandwidth between endpoints
– “WS-Agreement” defines a protocol for agreement management
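The two-phase variant of such a commit can be sketched as follows. This is a toy model, not the GARA, DUROC, SNAP, or WS-Agreement interface: the `Provider` class and all method names are hypothetical, and failures, timeouts, and concurrency are ignored.

```python
class Provider:
    """Hypothetical resource provider with a fixed number of free slots."""

    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots
        self.held = 0
        self.reserved = 0

    def prepare(self, slots):
        """Phase 1: tentatively hold slots; refuse if not enough are free."""
        if slots > self.free_slots:
            return False
        self.free_slots -= slots
        self.held += slots
        return True

    def commit(self, slots):
        """Phase 2: turn a hold into a firm reservation."""
        self.held -= slots
        self.reserved += slots

    def abort(self, slots):
        """Release a hold so the slots become free again."""
        self.held -= slots
        self.free_slots += slots


def reserve_all(requests):
    """Reserve on every provider or on none. requests: list of (provider, slots)."""
    prepared = []
    for provider, slots in requests:
        if provider.prepare(slots):
            prepared.append((provider, slots))
        else:
            for p, s in prepared:
                p.abort(s)
            return False
    for provider, slots in prepared:
        provider.commit(slots)
    return True


cluster = Provider("cluster", 64)
network = Provider("network", 1)
first = reserve_all([(cluster, 48), (network, 1)])
second = reserve_all([(cluster, 16), (network, 1)])
print(first, second, cluster.free_slots, cluster.reserved)
```

The second request fails on the network link, so the tentative hold on the cluster is released again and no partial reservation remains.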
Using Service Level Agreements
The mapping of jobs to resources can be abstracted using the concept of Service Level Agreements (SLAs)
SLA: Contract negotiated between
– resource provider, e.g. local scheduler
– resource consumer, e.g. grid scheduler, application
SLAs provide a uniform approach for the client to
– specify resource and QoS requirements, while
– hiding from the client details about the resources, such as queue names and current workload
GGF/OGF – GRAAP Working Group
Goal: Defining Web-service-based protocols for negotiation and agreement management
WS-Agreement Protocol:
Requirements
Execution Management:
– services, software, data/storage, compute, network
Aspects:
persistence
support business models
GGF/OGF-WG DRMAA
GGF Working Group “Distributed Resource Management Application API”
From the charter:
Develop an API specification for the submission and control of jobs to one or more Distributed Resource Management (DRM) systems.
The scope of this specification is all the high level functionality which is necessary for an application to consign a job to a DRM system including common operations on jobs like termination or suspension.
The objective is to facilitate the direct interfacing of applications to today's DRM systems by application's builders, portal builders, and Independent Software Vendors (ISVs).
Requirements
Accounting and Billing:
– providing economic/financial services
– foundation of business models
Aspects:
persistence
support business models
Scheduling in Future Generation Grids
Outlook on future Grid Resource Management and Scheduling
Limitations of current Grid RMS
The interaction between local scheduling and higher-level Grid scheduling is currently a one-way communication
– current local schedulers are not optimized for Grid use
– limited information is available about future job execution
– a site is usually selected by a Grid scheduler and the job enters the remote queue
The decision about job placement is inefficient.
– The actual job execution time is usually not known
– Co-allocation is a problem as many systems do not provide advance reservation
Example of Grid Scheduling Decision Making
[Figure: a Grid user submits a job to the Grid scheduler, which must choose among Machines 1, 2, and 3, each with its own local scheduler, job queue, and schedule over time. Reported loads: 15 jobs running, 20 jobs queued; 5 jobs running, 2 jobs queued; 40 jobs running, 80 jobs queued.]
Where to put the Grid job?
Available Information from the Local Schedulers
Decision making is difficult for the Grid scheduler
– limited information about local schedulers is available
– available information may not be reliable
Possible information:
– queue length, running jobs
– detailed information about the queued jobs
• execution length, process requirements, …
– tentative schedule of future job executions
This information is often technically not provided by the local scheduler.
In addition, this information may be subject to privacy concerns!
Consequence
Consider a workflow with 3 short steps (e.g. 1 minute each) that depend on each other
Assume available machines with an average queue waiting time of 1 hour.
The Grid scheduler can only submit the subsequent step once the previous job step has finished.
Result:
– The completion time of the workflow may be larger than 3 hours (compared to 3 minutes of execution time)
– Current Grids are suitable for simple jobs, but still quite inefficient in handling more complex applications
Need for better coordination of higher- and lower-level scheduling!
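The arithmetic behind this result can be checked in a few lines. The sketch assumes every step waits exactly the average queue time, which is a simplification; the function name is hypothetical.

```python
def sequential_completion_minutes(steps, queue_wait, run_time):
    """Each step waits in the queue, runs, and only then is the next one submitted."""
    return steps * (queue_wait + run_time)


total = sequential_completion_minutes(steps=3, queue_wait=60, run_time=1)
print(total, "minutes in total, of which", 3 * 1, "are execution")
```

With these numbers the workflow takes 183 minutes, so about 98% of the completion time is spent waiting in queues.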
Example Grid Scenario
Assume a data-intensive simulation that should be visualized and steered during runtime!
[Figure: a remote center reads and generates TB of data; LAN/WAN and WAN transfers connect it with the compute resources and the visualization.]
Resource Request of a Simple Grid Job
A specified architecture with
48 processing nodes,
1 GB of available memory, and
a specified licensed software package
for 1 hour between 8am and 6pm of the following day
• The time must be known in advance.
A specific visualization device during program execution
Minimum bandwidth between the VR device and the main computer during program execution
Input: a specified data set from a data repository
Cost: at most 4 €
Preference for cheaper job execution over earlier execution
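Part of this request can be expressed as hard constraints plus a preference order over offers. The sketch below is an illustration only: the `Offer` fields, the site names, and the prices are invented, and the license, visualization device, bandwidth, and input data requirements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical provider offer; hours are counted from midnight of the target day."""
    provider: str
    nodes: int
    memory_gb: int
    start_hour: int
    price_eur: float


def select_offer(offers, nodes=48, memory_gb=1, duration_h=1,
                 window=(8, 18), budget_eur=4.0):
    """Filter by the hard constraints, then prefer cheaper over earlier."""
    feasible = [
        o for o in offers
        if o.nodes >= nodes
        and o.memory_gb >= memory_gb
        and window[0] <= o.start_hour
        and o.start_hour + duration_h <= window[1]
        and o.price_eur <= budget_eur
    ]
    # Price is the primary key; the start time only breaks ties.
    return min(feasible, key=lambda o: (o.price_eur, o.start_hour), default=None)


offers = [
    Offer("site-a", 64, 2, 8, 3.50),    # earliest, but more expensive
    Offer("site-b", 48, 1, 14, 2.00),
    Offer("site-c", 48, 1, 9, 2.00),    # same price as site-b, earlier start
    Offer("site-d", 128, 4, 18, 1.00),  # cheapest, but ends after 6pm
    Offer("site-e", 48, 1, 10, 5.00),   # over budget
]
best = select_offer(offers)
print(best.provider)
```

Because price is the primary sort key, the 8am offer from site-a loses to the cheaper offers even though it would start first.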
Example: Coordinated Simulation and Visualization
Expected output of a Grid scheduler:
[Figure: Gantt chart of reservations over time, one row per resource. Resources: Data, Network 1, Computer 1, Computer 2, Network 2, Network 3, VR-Cave, Software License, Storage. Activities: data access, data transfer, loading data, parallel computation, communication for computation, providing data, communication for visualization, visualization, storing data, software usage, data storage.]
Reservations are necessary!
Conclusions for Grid Scheduling
Grids ultimately require coordinated scheduling services.
Support for different scheduling instances
– different local management systems
– different scheduling algorithms/strategies
For arbitrary resources
– not only computing resources, also
– data, storage, network, software etc.
Support for co-allocation and reservation
– necessary for coordinated grid usage (see data, network, software, storage)
Different scheduling objectives
– cost, quality, other
Grid Scheduling Scenarios – Example I
Grid Scheduling Scenarios – Example II
Grid Scheduling Scenarios – Example III
Towards Grid Scheduling
Grid Scheduling Methods:
– Support for individual scheduling objectives and policies
– Multi-criteria scheduling models
– Economic scheduling methods for Grids
Architectural requirements:
– Generic job description
– Negotiation interface between higher- and lower-level scheduler
– Economic management services
– Workflow management
– Integration of data and network management
Scheduling Objectives in the Grid
In contrast to local computing, there is no general scheduling objective anymore
– minimizing response time, minimizing cost
– tradeoff between quality, cost, response-time etc.
Cost and different service quality come into play
– the user will introduce individual objectives
– the Grid can be seen as a market where resources are competing alternatives
Similarly, the resource provider has individual scheduling policies
Problem:
– the different policies and objectives must be integrated in the scheduling process
– different objectives require different scheduling strategies
– part of the policies may not be suitable for public exposure (e.g. different pricing or quality for certain user groups)
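One common way to express an individual objective is a weighted sum over normalised criteria. The sketch below assumes this model purely for illustration; the criteria, the weights, and the offer values are hypothetical, and the slides do not prescribe any particular objective function.

```python
def weighted_cost(offer, weights):
    """Weighted sum of normalised criteria; lower is better.

    offer: dict with 'cost' and 'response_time', both already scaled to [0, 1].
    weights: the user's individual trade-off between the criteria.
    """
    return sum(weights[k] * offer[k] for k in weights)


def best_offer(offers, weights):
    """Return the name of the offer with the lowest weighted cost."""
    return min(offers, key=lambda name: weighted_cost(offers[name], weights))


offers = {
    "cheap-slow": {"cost": 0.2, "response_time": 0.9},
    "fast-pricey": {"cost": 0.9, "response_time": 0.1},
}
budget_user = best_offer(offers, {"cost": 0.8, "response_time": 0.2})
deadline_user = best_offer(offers, {"cost": 0.2, "response_time": 0.8})
print(budget_user, deadline_user)
```

The same two offers are ranked differently by the two users, which is why a single global scheduling objective no longer fits.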
Grid Scheduling Algorithms
Due to the requirements mentioned above, it is not to be expected that a single scheduling algorithm or strategy is suitable for all problems in Grids.
Therefore, there is a need for an infrastructure that
– allows the integration of different scheduling algorithms
– allows individual objectives and policies to be included
– leaves resource control with the participating service providers
Transition into a market-oriented Grid scheduling model
Economic Scheduling
Market-oriented approaches are a suitable way to implement the interaction of different scheduling layers
– agents in the Grid market can implement different policies and strategies
– negotiations and agreements link the different strategies together
– participating sites stay autonomous
Need for suitable scheduling algorithms and strategies for creating and selecting offers
– need for creating the Pareto-optimal scheduling solutions
Performance relies highly on the available information
– negotiation can be a hard task if many potential providers are available
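When offers are compared on two criteria, the Pareto-optimal ones are those that no other offer beats on both. The following minimal sketch assumes each offer is a (cost, completion time) pair where lower is better for both; the values are invented.

```python
def dominates(a, b):
    """a dominates b if it is no worse in both criteria and better in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(offers):
    """Keep the offers (cost, completion_time) that no other offer dominates."""
    return [o for o in offers if not any(dominates(other, o) for other in offers)]


offers = [(4.0, 2), (2.0, 5), (3.0, 3), (3.5, 4), (2.0, 6)]
front = pareto_front(offers)
print(front)
```

Only the offers on the front need to be presented to the requester; which of them is finally chosen depends on the individual objective. The pairwise comparison is quadratic in the number of offers, which hints at why negotiation becomes hard with many potential providers.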
Economic Scheduling (2)
Several possibilities for market models:
– auctions of resources/services
– auctions of jobs
Offer-request mechanisms support:
– inclusion of different cost models, price determination
– individual objective/utility functions for optimization goals
Market-oriented algorithms are considered:
– robust
– flexible in case of errors
– simple to adapt
– markets can have unforeseeable dynamics
Conclusions
Key Challenges for FGG RMS
– Cooperation
• interoperability between Grid-RMS implementations and types
• and between Grid-RMS and local RM systems
– Interoperability through well-defined interfaces
• identification and adaptation
– Scalability
• domain-specific implementations may have limited scalability,
• but the general architecture should cover millions of resources
– Fault tolerance
• resources and instances of core services
– Common security model
The RMS should be invisible to the user and provide a pervasive common architecture allowing different implementations while maintaining interoperability.