Grid Scheduler: Plan & Schedule
Adam Arbree
Jang Uk In
Current System
[Diagram: User Request, VDC, Chimera, Abstract Planner, Concrete Planner, RC, TC, DAGMan, Condor-G, Globus (gahp-server), Remote Site]
Proposed System
[Diagram: User Request, VDC, Chimera Abstract Planner, RC, TC, Scheduling Client, Scheduling Server, JDB, PRDB, GDB, Condor-G, Globus (gahp-server), Data Rep. Service, Grid Monitor Interface, Remote Site]
Scheduling Server
[Diagram: DAG Reducer, Message Interface, Prediction Engine, Tracking System, Planner, Data Replication Server, Scheduling Client, JDB, PRDB, RC & TC, Grid Mon.]
Chimera Abstract Planner
• Input
– User virtual data request
• Output
– Abstract production plan
• Queries VDC for full dependency graph
Scheduling Client
• Input
– Parse abstract DAG
– Read run messages from server
• Output
– Send DAG to server
– Build and send jobs for Condor-G
• Maintain local image of DAG progress
• Refresh the scheduler data by request
• Choose scheduling server
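One way to picture the client's "local image of DAG progress" is a small state table keyed by job. The sketch below is only an illustration under assumptions: the class name `DagProgress`, the state strings, and the `parents` mapping are hypothetical and do not come from the slides.

```python
class DagProgress:
    """Hypothetical local image of DAG progress kept by the scheduling client."""

    def __init__(self, parents):
        # parents: {job_id: set of parent job ids}; every job must appear as a key.
        self.parents = {job: set(deps) for job, deps in parents.items()}
        self.state = {job: "waiting" for job in parents}

    def mark(self, job, state):
        """Update one job's state, e.g. after a run message from the server."""
        if job not in self.state:
            raise KeyError(job)
        self.state[job] = state

    def ready(self):
        """Jobs still waiting whose parents have all finished."""
        return sorted(
            job for job, st in self.state.items()
            if st == "waiting"
            and all(self.state[p] == "done" for p in self.parents[job])
        )

    def finished(self):
        """True once every job in the DAG is done."""
        return all(st == "done" for st in self.state.values())
```

For example, with a three-job chain gen, sim, reco, only `gen` is ready at first; marking it done makes `sim` ready.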
Scheduling Databases
• TC: Transformation Catalog
– (LFN, site) → (PFN, env)
• RC: Replica Catalog
– (LFN, site) → (PFN)
– (LFN, site, copy) → (PFN)
• PRDB: Prediction DB
– (job, params, site) → execution time, CPU use, disk use, bandwidth
• JDB: Job DB
– (job) → job state, site, VO, user, params, prediction use, current use
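The catalog entries above read naturally as keyed lookups. As a rough sketch, assuming an in-memory dictionary in place of a real database, the replica catalog could look like this; `ReplicaCatalog` and its methods `register`, `lookup`, and `sites_with` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Hypothetical in-memory stand-in for the RC: (LFN, site) -> list of PFNs."""

    def __init__(self):
        # Each (LFN, site) key may hold several physical copies.
        self._replicas = defaultdict(list)

    def register(self, lfn, site, pfn):
        """Record a physical copy and return its copy index at that site."""
        copies = self._replicas[(lfn, site)]
        copies.append(pfn)
        return len(copies) - 1

    def lookup(self, lfn, site, copy=None):
        """Return all PFNs for (LFN, site), or a single PFN for a copy index."""
        copies = self._replicas.get((lfn, site), [])
        if copy is None:
            return list(copies)
        return copies[copy]

    def sites_with(self, lfn):
        """Return the sites that hold at least one copy of the LFN."""
        return sorted({site for (name, site), pfns in self._replicas.items()
                       if name == lfn and pfns})
```

The same pattern would apply to the TC, with `(PFN, env)` pairs as values instead of bare PFNs.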
Grid Monitor
• Input
– Monitor data
• Output
– Data to Data Rep. Service
– Data to Server
– Data to grid cache
• Monitors
– Cost function
– VO limits table
– CPU load (by job)
– Disk usage (by job)
– Job list
– Bandwidth (by job)
Message Interface
• Input
– A-DAG (from client)
– User status requests (from client)
– Job run requests (from planner)
– Job state request (from rep. server)
– Job state (from tracking)
• Output
– Job run requests (to client)
– Status updates (to client)
– Pruned DAG (to client)
– Job state (to rep. server)
– Job state request (to tracking)
• Manages client connections
• Provides incoming and outgoing message queues
• Checks connectivity of clients
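The incoming and outgoing queues could be sketched roughly as follows. This is an assumption-laden illustration, not the design from the slides: `Message`, `MessageInterface`, and the choice to count undeliverable messages rather than retry them are all hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Message:
    kind: str          # e.g. "run_request" or "status_update" (illustrative values)
    sender: str
    recipient: str
    payload: object = None


class MessageInterface:
    """Hypothetical router with one incoming queue and per-recipient outgoing queues."""

    def __init__(self):
        self.incoming = queue.Queue()
        self.outgoing = {}  # recipient id -> queue.Queue

    def connect(self, client_id):
        """Register a client connection and give it an outgoing queue."""
        self.outgoing.setdefault(client_id, queue.Queue())

    def is_connected(self, client_id):
        """Connectivity check: is there a registered queue for this client?"""
        return client_id in self.outgoing

    def receive(self, message):
        """Accept a message from any component or client."""
        self.incoming.put(message)

    def dispatch(self):
        """Route queued messages to recipients; return how many were undeliverable."""
        dropped = 0
        while not self.incoming.empty():
            msg = self.incoming.get()
            if self.is_connected(msg.recipient):
                self.outgoing[msg.recipient].put(msg)
            else:
                dropped += 1
        return dropped
```

A real interface would sit on network connections; the standard-library `queue.Queue` is used here only because it is thread-safe and keeps the sketch self-contained.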
DAG Reducer
• Input
– Complete abstract DAG (from message int.)
– Replica data (from RC)
• Output
– DAG pruned for file existence (to message int.)
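Pruning for file existence can be sketched as a backward walk from the requested files: a job is kept only if some file that is still needed has no replica. The function below is a minimal sketch under assumptions; the name `reduce_dag` and the `{job_id: (inputs, outputs)}` representation are hypothetical, and the slides do not specify the actual algorithm.

```python
def reduce_dag(jobs, requested, existing):
    """Return the set of job ids that must still run.

    jobs:      {job_id: (input_lfns, output_lfns)}
    requested: LFNs the user asked for
    existing:  LFNs that already have a replica (taken from the RC)
    """
    # Map each output file to the job that produces it.
    producer = {out: job_id
                for job_id, (_, outputs) in jobs.items()
                for out in outputs}
    needed_jobs = set()
    seen = set()
    pending = list(requested)
    while pending:
        lfn = pending.pop()
        if lfn in seen or lfn in existing:
            continue  # already handled, or a replica exists so nothing to run
        seen.add(lfn)
        job_id = producer.get(lfn)
        if job_id is None:
            raise ValueError(f"no replica and no producer for {lfn!r}")
        if job_id not in needed_jobs:
            needed_jobs.add(job_id)
            pending.extend(jobs[job_id][0])  # the job's inputs are now needed too
    return needed_jobs
```

With a chain gen → sim → reco, an existing replica of the intermediate file means only the last job remains; ancestors whose outputs are no longer needed drop out without any special handling.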
Prediction Engine
• Input
– Job description (from planner)
– Updated history information (from tracking system)
– History data (from PRDB)
• Output
– Job prediction (to planner)
– History information (to tracking sys.)
– History data (to PRDB)
• Predict the time for a job on each site in the grid
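The slides do not say how predictions are computed. As one possible illustration, assuming the simplest history-based estimator (the mean of past runtimes per job type and site), a sketch could look like this; `PredictionEngine`, `record`, and `predict` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean


class PredictionEngine:
    """Hypothetical estimator: mean of recorded runtimes per (job type, site)."""

    def __init__(self):
        # (job_type, site) -> list of observed runtimes in seconds
        self._history = defaultdict(list)

    def record(self, job_type, site, runtime):
        """Store one observed runtime, as the tracking system might report it."""
        self._history[(job_type, site)].append(runtime)

    def predict(self, job_type, sites):
        """Return {site: predicted runtime}, with None where no history exists."""
        predictions = {}
        for site in sites:
            runs = self._history.get((job_type, site))
            predictions[site] = mean(runs) if runs else None
        return predictions
```

Returning `None` for sites without history leaves it to the caller to decide whether an untested site should be skipped or tried.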
Tracking System
• Input
– Pruned DAG (from DAG reducer)
– Job status (from planner)
– Prediction information (from pred. engine)
– Status req. (from message interface)
– Job data (from JDB)
• Output
– Job status (to planner)
– New history information (to pred. engine)
– Status information (to message interface)
– Job data (to JDB)
• Periodically access grid monitor and update job status
Planner
• Input
– Job status (from tracking system)
– Job predictions (from pred. engine)
– PFNs (from TC and RC)
– Grid status (from grid mon.)
• Output
– Job status (to tracking system)
– Job run requests (to message interface)
• Scheduling process
– Check grid status
– Determine next job to run and its execution site
– Transfer input files
– Send message to client to run job
– Update tracking
– Transfer files to storage
– Clean up
– Update RC
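The step "determine next job to run and its execution site" could be sketched as a search over ready jobs and candidate sites. The version below assumes a greedy rule (smallest predicted runtime among sites with a free slot), which is an illustration only; the slides do not state the selection policy, and `choose_next`, `predictions`, and `free_slots` are hypothetical names.

```python
def choose_next(ready_jobs, predictions, free_slots):
    """Pick the (job, site) pair with the smallest predicted runtime.

    ready_jobs:  job ids whose inputs are available
    predictions: {job_id: {site: predicted seconds, or None if unknown}}
    free_slots:  {site: number of free execution slots}, from grid status
    Returns None when no ready job can be placed.
    """
    best = None
    for job in ready_jobs:
        for site, seconds in predictions.get(job, {}).items():
            if seconds is None or free_slots.get(site, 0) <= 0:
                continue  # no estimate, or the site is currently full
            candidate = (seconds, job, site)
            if best is None or candidate < best:
                best = candidate
    if best is None:
        return None
    _, job, site = best
    return job, site
```

Ties are broken by job id and then site name through tuple ordering, which keeps the choice deterministic.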
Data Replication Service
• Input
– Grid status
– Job queue
• Output
– Entries to RC
• Monitor grid and determine hot spots
• Select sites to replicate data
• Transfer data to replication sites
• Clean up unneeded data
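How hot spots are determined is not specified in the slides. One simple reading, sketched here under that assumption, is to count how many queued jobs want each file and then look for sites where the file is wanted but has no replica. The function names, the threshold parameter, and the `(job_id, site, input_lfns)` tuple layout are all hypothetical.

```python
from collections import Counter


def find_hot_files(job_queue, threshold):
    """Return LFNs needed by at least `threshold` queued jobs.

    job_queue: iterable of (job_id, site, input_lfns) tuples
    """
    demand = Counter(lfn for _, _, inputs in job_queue for lfn in inputs)
    return sorted(lfn for lfn, count in demand.items() if count >= threshold)


def replication_targets(job_queue, replicas, hot_files):
    """Return {lfn: sites where queued jobs need the file but no replica exists}.

    replicas: {lfn: set of sites that already hold a copy}, as read from the RC
    """
    targets = {}
    for lfn in hot_files:
        wanted_at = {site for _, site, inputs in job_queue if lfn in inputs}
        missing = wanted_at - replicas.get(lfn, set())
        if missing:
            targets[lfn] = sorted(missing)
    return targets
```

Each target produced this way would correspond to one transfer followed by one new RC entry.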
Grid Simulation
• Only two outside interfaces
– Condor-G
– Remote sites
• Condor-G emulator takes real Condor-G submit files and sends fake jobs to remote site emulators
• Remote site emulators sleep for the designated period for each job and send simulated data to the grid monitor
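A remote site emulator of the kind described above could be sketched as follows. This is an illustration under assumptions: `RemoteSiteEmulator`, the report fields, and the use of a plain list as the grid monitor feed are hypothetical. The sleep function is injectable so the emulator can be exercised without real delays.

```python
import time


class RemoteSiteEmulator:
    """Hypothetical emulator: sleeps for a job's duration, then reports to a monitor feed."""

    def __init__(self, name, monitor, sleep=time.sleep):
        self.name = name
        self.monitor = monitor  # any object with append(); stands in for the grid monitor
        self._sleep = sleep     # replaceable so tests need not actually wait

    def run(self, job_id, duration, cpu_load=1.0):
        """Pretend to run a job, then send simulated usage data to the monitor."""
        self._sleep(duration)
        report = {
            "site": self.name,
            "job": job_id,
            "runtime": duration,
            "cpu_load": cpu_load,
            "state": "done",
        }
        self.monitor.append(report)
        return report
```

Passing a recording function as `sleep` shows which durations the emulator would have waited for, which is convenient when replaying many fake jobs.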
Development Schedule
1. Research ~ present to Jan 20th
• Survey existing monitoring systems
• Decide what must be monitored
2. Initial framework ~ Jan 20th to end of Feb
• Build grid monitor interface
• Build grid simulator
• Design scheduler and data replication service
3. Build scheduler ~ March
4. Build data replication service ~ April
5. Grid testing ~ May