Jaime Frey Computer Sciences Department University of Wisconsin-Madison [email protected] http://www.cs.wisc.edu/condor Condor-G: An Update

Page 1: Condor-G: An Update

Jaime Frey
Computer Sciences Department
University of Wisconsin-Madison
[email protected]
http://www.cs.wisc.edu/condor

Condor-G: An Update

Page 2: Condor-G: An Update

www.cs.wisc.edu/condor

Outline

› What is Condor-G

› Past

› Present

› Future

Page 3: Condor-G: An Update

What Is Condor-G

› Use Condor to run jobs on the Grid

› Uses Globus Toolkit
    GRAM (submit a remote job)
    GASS (transfer job’s files)

› Two components
    Globus Universe
    GlideIn

Page 4: Condor-G: An Update

Globus Universe

› Run a job on a Grid resource

› Features
    Job management
    Fault tolerance
    Credential management

› Disadvantages
    No remote syscalls, checkpoint/migration, or dynamic resource selection

Page 5: Condor-G: An Update

How It Works

[Figure, built up over pages 5 to 9: a Condor-G side and a Grid Resource side. Components appear in this order: Schedd (Condor-G) and LSF (Grid Resource); 600 Globus jobs at the Schedd; GridManager (Condor-G); JobManager (Grid Resource); User Job (Grid Resource).]

Page 10: Condor-G: An Update

GlideIn

› Create your own personal Condor pool from temporarily-acquired Grid resources

› Brings the full power of Condor to the Grid

› Run a Condor startd on a Grid resource

› Startd reports back to your machine and runs Vanilla and Standard Universe jobs

Page 11: Condor-G: An Update

How It Works

[Figure, built up over pages 11 to 17: a Condor-G side and a Grid Resource side. Components appear in this order: Schedd and Collector (Condor-G), LSF (Grid Resource), and 600 Condor jobs; glide-ins; GridManager (Condor-G); JobManager (Grid Resource); Startd (Grid Resource); User Job (Grid Resource).]
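
As a rough illustration of the glide-in flow in the figure, the following Python sketch models glide-in startds reporting to a collector and a schedd handing queued jobs to whichever startds have reported in. All class and method names are hypothetical and only mimic the roles named on the slides; the round-robin dispatch is an assumption made for brevity, not Condor's actual matchmaking.

```python
from collections import deque


class Collector:
    """Toy stand-in for the Collector: remembers which startds reported in."""

    def __init__(self):
        self.startds = []

    def advertise(self, startd):
        self.startds.append(startd)


class Startd:
    """Toy stand-in for a glided-in Startd running on a Grid resource."""

    def __init__(self, name, collector):
        self.name = name
        self.completed = []
        collector.advertise(self)  # report back to the user's own pool

    def run(self, job):
        self.completed.append(job)
        return f"{job} ran on {self.name}"


class Schedd:
    """Toy stand-in for the Schedd: holds the user's queue of Condor jobs."""

    def __init__(self, jobs):
        self.queue = deque(jobs)

    def dispatch(self, collector):
        """Hand queued jobs round-robin to the startds the collector knows."""
        log = []
        if not collector.startds:
            return log  # no glide-ins have reported yet; jobs stay queued
        i = 0
        while self.queue:
            startd = collector.startds[i % len(collector.startds)]
            log.append(startd.run(self.queue.popleft()))
            i += 1
        return log


collector = Collector()
schedd = Schedd([f"job{n}" for n in range(6)])
idle_log = schedd.dispatch(collector)  # nothing can run before glide-ins arrive

glideins = [Startd(f"glidein{n}", collector) for n in range(3)]
log = schedd.dispatch(collector)
print("\n".join(log))
```

The point of the model is only that jobs wait in the local queue until a glide-in startd has reported back, after which they run on the acquired resource like ordinary Condor jobs.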

Page 18: Condor-G: An Update

[Figure, built up over pages 18 to 24: Condor-G alongside a Globus Grid containing PBS, LSF, and Condor resources. Later frames add 600 Condor jobs and glide-ins.]

Page 25: Condor-G: An Update

Past

› GridManager daemon
    Runs Grid jobs using GRAM protocol
    Stages executable and standard I/O using GASS protocol

› Globus GRAM 1.5
    We added fault-tolerance to the GRAM protocol
    Changes included in Globus Toolkit 2.0 release

Page 26: Condor-G: An Update

Present

› Updated Condor-G to Globus Toolkit 2.0

› Enhanced GridManager

› GAHP

Page 27: Condor-G: An Update

Enhanced GridManager

› Put problem jobs on hold

› No more stuck jobs

› Increase concurrency with GAHP

› Almost ready

Page 28: Condor-G: An Update

Single-Threaded Execution

[Figure, repeated over pages 28 to 36: a GridManager and four Grid Resources, with Job 1 through Job 4.]

Page 37: Condor-G: An Update

Multi-Threaded Execution

[Figure, repeated over pages 37 to 38: a GridManager and four Grid Resources, with Job 1 through Job 4.]
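
To make the contrast between the two execution figures concrete, here is a minimal Python sketch that counts how many remote calls are in flight at once. The `contact_resource` function is a hypothetical stand-in for a blocking call to a Grid resource, simulated with a short sleep; nothing here is GridManager code.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Tracker:
    """Records the peak number of remote calls in flight at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.in_flight -= 1
        return False


def contact_resource(job, tracker, delay=0.05):
    """Hypothetical blocking call to a Grid resource (simulated with sleep)."""
    with tracker:
        time.sleep(delay)
    return f"{job}: done"


jobs = ["Job 1", "Job 2", "Job 3", "Job 4"]

# Single-threaded: each call blocks until its resource answers.
seq_tracker = Tracker()
seq_results = [contact_resource(j, seq_tracker) for j in jobs]

# Multi-threaded: one worker per job, so the calls can overlap.
par_tracker = Tracker()
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    par_results = list(pool.map(lambda j: contact_resource(j, par_tracker), jobs))

print("peak in flight, single-threaded:", seq_tracker.peak)
print("peak in flight, multi-threaded:", par_tracker.peak)
```

Both runs return the same results; the difference is that one slow resource stalls every other job in the single-threaded case.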

Page 39: Condor-G: An Update

Globus Application Helper Protocol (GAHP)

› Condor is non-threaded

› Want to use multi-threaded libraries
    Increased concurrency

› Put libraries in external helper process

› Simple interface over pipes/sockets
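
The helper-process idea can be sketched in Python as follows. A single-threaded client sends line-based requests over a socket, and a helper runs each request in its own thread and replies with a line tagged by request id. This is an assumption-laden sketch: the helper runs as a thread rather than a separate process so the example stays self-contained, and the `SUBMIT`/`OK` line format is invented for illustration, not the actual GAHP wire format.

```python
import socket
import threading
import time


def helper(conn):
    """Helper stand-in: runs each request in its own thread and writes
    back one result line tagged with the request id."""
    write_lock = threading.Lock()
    workers = []

    def handle(req_id, job):
        time.sleep(0.01)  # pretend to make a slow, blocking library call
        with write_lock:
            conn.sendall(f"{req_id} OK {job}\n".encode())

    with conn, conn.makefile("r") as requests:
        for line in requests:  # ends when the client stops sending
            req_id, command, job = line.split()
            if command == "SUBMIT":  # hypothetical command name
                t = threading.Thread(target=handle, args=(req_id, job))
                t.start()
                workers.append(t)
        for t in workers:
            t.join()  # finish all replies before closing the connection


client, server = socket.socketpair()
threading.Thread(target=helper, args=(server,), daemon=True).start()

jobs = ["job1", "job2", "job3", "job4"]

# The single-threaded client issues every request without waiting ...
for req_id, job in enumerate(jobs, start=1):
    client.sendall(f"{req_id} SUBMIT {job}\n".encode())
client.shutdown(socket.SHUT_WR)  # signal that no more requests will follow

# ... then collects results in whatever order the helper finishes them.
results = {}
with client, client.makefile("r") as replies:
    for line in replies:
        req_id, status, job = line.split()
        results[int(req_id)] = (status, job)

print(results)
```

Tagging each reply with its request id is what lets the client stay non-threaded: it never blocks on one particular job, and it matches results to requests after the fact.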

Page 40: Condor-G: An Update

Multi-Threaded Execution with GAHP

[Figure, built up over pages 40 to 47: a GridManager containing a GAHP Client, a separate GAHP Server, and four Grid Resources, with Job 1 through Job 4.]

Page 48: Condor-G: An Update

Future

› GRAM 1.6

› Condor-G on Windows

› Condor-G Grid service

Page 49: Condor-G: An Update

Globus GRAM 1.6

› Working with Globus team to add features to GRAM protocol
    Credential refresh
    File staging
    Scheduler-specific options

Page 50: Condor-G: An Update

Condor-G for Windows

› Condor Windows implementation available

› GRAM and GASS APIs
    No C implementation for Windows (yet)
    Java implementation (Java CoG)

› Condor-G Windows version possible by writing GAHP server in Java

Page 51: Condor-G: An Update

Condor-G Grid Service

› Reliable job submission service for higher-level applications

› Open Grid Services Architecture (OGSA)

› SOAP, WSDL, WS-Inspection

› Implement Grid service interface for Condor-G (and Condor in general)

Page 52: Condor-G: An Update

Thank You

› Condor-G demo on Wednesday
    3351 CS

› Questions?
    Talk to me
    E-mail [email protected]