Derek Wright Computer Sciences Department University of Wisconsin-Madison [email protected] http://www.cs.wisc.edu/condor MPI Scheduling in Condor: An Update Paradyn/Condor Week Madison, WI 2002


Outline

› Review of Dedicated/MPI Scheduling in Condor
  • Dedicated vs. Opportunistic
  • Backfill

› Supported MPI Implementations

› Supported Platforms

› Future Work

What is MPI?

› MPI is the “Message Passing Interface”

› A library for writing parallel applications
  • Fixed number of nodes
  • Cannot be preempted

› Lots of scientists use it for large problems

› MPI is a standard with many different implementations

Dedicated Scheduling in Condor

› To schedule MPI jobs, Condor must have access to dedicated resources

› More and more Condor pools are being formed from dedicated resources

› Few schedulers handle both dedicated and non-dedicated resources at the same time

Problems with Dedicated Compute Clusters

› Dedicated resources are not really dedicated
  • Most software for controlling clusters relies on dedicated scheduling algorithms
  • Assume constant availability of resources to compute fixed schedules

› Due to hardware and software failure, dedicated resources are not always available over the long term

Look Familiar?

Two common views of a Cluster:

The Condor Solution

› Condor overcomes these difficulties by combining aspects of dedicated and opportunistic scheduling into a single system
  • Opportunistic scheduling involves placing jobs on non-dedicated resources under the assumption that the resources might not be available for the entire duration of the jobs
  • This is what Condor has been doing for years

The Condor Solution (cont’d)

› Condor manages all resources and jobs within a single system
  • Administrators only have to maintain one system, saving time and money
  • Users can submit a wide variety of jobs:
    • Serial or parallel (including PVM + MPI)
    • Spend less time learning different scheduling tools, more time doing science

Claiming Resources for Dedicated Jobs

› When the dedicated scheduler (DS) has idle jobs, it queries the collector to find all dedicated resources

› DS does match-making to decide which resources it wants

› DS sends requests to the opportunistic scheduler to claim those resources

› DS claims resources and has exclusive control (until it releases them)
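The four steps above can be sketched as a toy model. This is not Condor code: the `Machine` and `Job` classes, the `dedicated` flag, and the memory-based match rule are assumptions made purely for illustration, and the request to the opportunistic scheduler is collapsed into a single `claim` call.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    name: str
    memory_mb: int
    dedicated: bool
    claimed_by: Optional[str] = None  # None means unclaimed


@dataclass
class Job:
    name: str
    nodes: int          # MPI jobs need a fixed number of nodes
    min_memory_mb: int  # hypothetical per-node requirement


def query_dedicated(pool: List[Machine]) -> List[Machine]:
    """Step 1: stand-in for asking the collector for dedicated resources."""
    return [m for m in pool if m.dedicated]


def match(job: Job, candidates: List[Machine]) -> List[Machine]:
    """Step 2: pick unclaimed machines that satisfy the job, or none at all."""
    usable = [m for m in candidates
              if m.claimed_by is None and m.memory_mb >= job.min_memory_mb]
    # All-or-nothing: a parallel job cannot start with fewer nodes.
    return usable[:job.nodes] if len(usable) >= job.nodes else []


def claim(machines: List[Machine], scheduler: str = "DS") -> bool:
    """Steps 3-4: request and take exclusive control of the matched machines."""
    if not machines:
        return False
    for m in machines:
        m.claimed_by = scheduler
    return True


def release(machines: List[Machine]) -> None:
    """The scheduler keeps exclusive control until it releases the claim."""
    for m in machines:
        m.claimed_by = None


pool = [
    Machine("node1", 512, dedicated=True),
    Machine("node2", 256, dedicated=True),
    Machine("desk1", 512, dedicated=False),
    Machine("node3", 512, dedicated=True),
]
job = Job("mpi_sim", nodes=2, min_memory_mb=512)
matched = match(job, query_dedicated(pool))
claimed = claim(matched)
claimed_names = [m.name for m in matched]
```

In this example the non-dedicated `desk1` is never considered, and `node2` is skipped because it fails the hypothetical memory requirement.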

Backfilling: The Problem

› All dedicated schedulers leave “holes”

› Traditional solution is to use backfilling
  • Use lower priority parallel jobs
  • Use serial jobs

› However, if you can’t checkpoint the serial jobs, and/or you don’t have any parallel jobs of the right size and duration, you’ve still got holes

Backfilling: The Condor Solution

› In Condor, we already have an infrastructure for managing non-dedicated nodes with opportunistic scheduling, so we use that to fill the holes in the dedicated schedule
  • Our opportunistic jobs can be checkpointed and migrated when the dedicated scheduler needs the resources again
  • Allows dedicated resources to be used for opportunistic jobs as needed
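A minimal sketch of this idea, assuming a schedule represented as a mapping from node name to the dedicated job running there (`None` marks a hole). The function names and data layout are hypothetical, and putting a job back on the queue stands in for Condor's real checkpoint-and-migrate step.

```python
from typing import Dict, List, Optional


def backfill(schedule: Dict[str, Optional[str]],
             queue: List[str]) -> Dict[str, str]:
    """Place queued opportunistic jobs on nodes that have no dedicated job."""
    placed: Dict[str, str] = {}
    for node in sorted(schedule):
        if schedule[node] is None and queue:
            placed[node] = queue.pop(0)
    return placed


def reclaim(node: str, placed: Dict[str, str],
            queue: List[str]) -> Optional[str]:
    """The dedicated scheduler wants the node back: evict and requeue the job."""
    job = placed.pop(node, None)
    if job is not None:
        # Stand-in for checkpointing: the job goes back to the front of
        # the queue so it can resume elsewhere.
        queue.insert(0, job)
    return job


schedule = {"node1": "mpi_a", "node2": None, "node3": "mpi_a", "node4": None}
queue = ["serial_1", "serial_2", "serial_3"]

placed = backfill(schedule, queue)        # fills node2 and node4
evicted = reclaim("node2", placed, queue) # dedicated work needs node2 again
```

The point of the sketch is that a hole never has to stay empty, and filling it costs the dedicated schedule nothing as long as the opportunistic job can be evicted on demand.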

Specific MPI Implementations

› Supported:
  • MPICH

› Planned:
  • MPIPro
  • LAM

› Others?

Condor’s MPICH Support

› MPICH uses rsh to spawn jobs

› Condor provides its own rsh tool
  • Older versions of MPICH need to be built without a hard-coded path to rsh
  • Newer versions of MPICH (1.2.2.3 and later) support an environment variable, P4_RSHCOMMAND, which specifies which program should be used to spawn jobs
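As a rough illustration of the environment-variable approach, the sketch below builds an environment in which P4_RSHCOMMAND points at a replacement rsh program. The helper name and the path `/path/to/condor/rsh` are placeholders, not the location Condor actually uses; Condor sets this up itself for MPI jobs.

```python
import os
from typing import Dict, Optional


def mpich_environment(rsh_path: str,
                      base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with P4_RSHCOMMAND set to rsh_path."""
    env = dict(os.environ if base is None else base)
    env["P4_RSHCOMMAND"] = rsh_path
    return env


# Placeholder path: the real location depends on the installation.
base_env = {"PATH": "/usr/bin"}
env = mpich_environment("/path/to/condor/rsh", base=base_env)
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when launching an MPICH program by hand.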

Condor and MPIPro

› We’ve investigated supporting MPIPro jobs with Condor

› MPIPro has some issues with selecting a port for the head node in your computation, and we’re looking for a good solution

Condor + LAM = “LAMdor”

› LAM's API is better suited for a dynamic environment, where hosts can come and go from your MPI universe

› Has a different mechanism for spawning jobs than MPICH

› Condor is working to support LAM's methods for spawning

LAMdor (Cont’d)

› LAM working to understand, expand, and fully implement the dynamic scheduling calls in their API

› LAM also considering using Condor’s libraries to support checkpointing of MPI computations

Other MPI implementations

› What are people using?

› Do you want to see Condor support any other MPI implementations?

› If so, let us know by sending email to: [email protected]

Supported Platforms

› Condor’s MPI support is now available on all Condor platforms:
  • Unix
    • Linux, Solaris, Digital Unix, IRIX, HPUX
  • Windows (new since last year)
    • NT, 2000

Future work (short-term)

› Implementing more advanced dedicated scheduling algorithms
  • Integrating Condor’s user priority system with its dedicated scheduling
  • Adding support for user-specified job priorities (among their own jobs)

› Condor-MPI support for the Tool Daemon Protocol

Future work (longer term)

› Solving problems with MPI on the Grid
  • "Flocking" MPI jobs to remote pools, or even spanning pools with a single computation
  • Solving issues of resource ownership on the Grid (i.e., how do you handle multiple dedicated schedulers on the Grid wanting to control a given resource?)

More Future work

› Support for other kinds of dedicated jobs:
  • Generic dedicated jobs
    • We gather and schedule the resources, then call your program, give it the list of machines, and let the program spawn itself
  • Linda (parallel programming interface)
    • Gaussian (computational chemistry)
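To make the "generic dedicated job" idea concrete, here is a sketch of what the program side might look like. The slides do not say how the machine list would be delivered, so the format (one hostname per line), the function names, and the `--rank`/`--size` flags of the hypothetical `./worker` program are all assumptions.

```python
from typing import List


def parse_machine_list(text: str) -> List[str]:
    """Read one hostname per line, ignoring blank lines and '#' comments."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def spawn_commands(machines: List[str], worker_cmd: str,
                   remote_shell: str = "rsh") -> List[List[str]]:
    """Build one remote-shell command per machine; nothing is executed here."""
    size = len(machines)
    return [
        [remote_shell, host, f"{worker_cmd} --rank {rank} --size {size}"]
        for rank, host in enumerate(machines)
    ]


machines = parse_machine_list("node1\n\n# spare\nnode2\n")
commands = spawn_commands(machines, "./worker")
```

Each command list could then be handed to `subprocess.Popen` by the program itself, which is the "let the program spawn itself" step.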

More Future work

› Better support for preempting opportunistic jobs to facilitate running high-priority dedicated ones
  • “Checkpointing” vanilla jobs to swap space

› Checkpointing entire MPI computations

› MW using Condor-MPI

How do I start using MPI with Condor?

› MPI support added and tested in the current development series (6.3.X)

› MPI support is a built-in feature of the next stable series of Condor (6.4.X)

› 6.4.0 will be released Any Day Now™

Thanks for Listening!

› Questions?
  • Come to the MPI “BoF”, Wednesday, 3/6/02, 11am-noon, 3385 CS

› For more information:
  • www.cs.wisc.edu/condor
  • [email protected]