
A Statistical Scheduling Technique for a Computational Market Economy

Neal Sample, Stanford University

2 UCSC 2002

Research Interests

Compositional computing (GRID)
- Reliability and quality of service
- Value-based and model-based mediation
- Languages: "programming for the non-programmer expert"

Database research
- Semistructured indexing and storage
- Massive table/stream compression
- Approximate algorithms for streaming data

3 UCSC 2002

Why We’re Here

[Figure: over time (1970, 1990, 2010), development effort shifts from coding to integration/composition]

4 UCSC 2002


GRID: Commodity Computing

On Demand (computer-in-the-loop)

High Throughput (FightAIDSAtHome, Nug30)

Collaborative (data exploration, education)

Distributed Supercomputing (chip design, cryptography)

Data Intensive (Large Hadron Collider)

7 UCSC 2002

Composition of Large Services

"Open Service Model"
- Principles: GRID, CHAIMS
- Protocols: UDDI, IETF SLP
- Runtime: Globus, CPAM

Remote, autonomous services are not free
- Fee ($), execution time
- 2nd-order dependencies

8 UCSC 2002

Grid Life is Tough

Increased complexity throughout
- New tools and applications
- Diverse resources such as computers, storage media, networks, sensors

Programming
- Control flow & data flow separation
- Service mediation

Infrastructure
- Resource discovery, brokering, monitoring
- Security/authorization
- Payment mechanisms

9 UCSC 2002

Our GRID Contributions

- Programming models and tools
- System architecture
- Resource management
- Instrumentation and performance analysis
- Network protocols and infrastructure
- Service mediation

10 UCSC 2002

Other GRID Research Areas

- The nature of applications
- Algorithms and problem-solving methods
- Security, payment/escrow, reputation
- End systems

(vs. our contributions: programming models and tools; system architecture; resource management; instrumentation and performance analysis; network protocols and infrastructure; service mediation)

11 UCSC 2002

Roadmap

- Brief introduction to the CLAM language
- Some related scheduling methods
- Surety-based scheduling: sample program, monitoring, rescheduling
- Results
- A few future directions

12 UCSC 2002

CLAM Composition Language

Decomposition of the CALL statement
- Parallelism by asynchrony in a sequential program
- Reduced complexity of invoke statements
- Control over new GRID requirements (estimation, trading, brokering, etc.)

Abstracts out data flow
- Mediation for data flow control and optimization
- Extraction model mediation

Purely compositional
- No primitives for arithmetic
- No primitives for input/output
- Targets the "non-programmer expert"

13 UCSC 2002

CLAM Primitives

Pre-invocation:
- SETUP: set up the connection to a service
- SETPARAM, GETPARAM: set and get parameters in a service
- ESTIMATE: service cost estimation

Invocation and result gathering:
- INVOKE
- EXAMINE: test progress of an invoked method
- EXTRACT: extract results from an invoked method

Termination:
- TERMINATE: terminate a method invocation/connection to a service
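To make the flow concrete, here is a minimal Python sketch of a client walking through the primitive sequence. The `Service` class and its method names are illustrative stand-ins, not the actual CHAIMS/CPAM API.

```python
# Hypothetical client walking the CLAM primitive sequence:
# SETUP -> SETPARAM/GETPARAM -> ESTIMATE -> INVOKE -> EXAMINE -> EXTRACT
# -> TERMINATE. The Service class is illustrative, not the CHAIMS runtime.

class Service:
    """Stand-in for a connection to one remote, autonomous service."""

    def __init__(self, name):            # SETUP: open the connection
        self.name, self.params, self.started = name, {}, False

    def setparam(self, **params):        # SETPARAM: set parameters
        self.params.update(params)

    def getparam(self, key):             # GETPARAM: read a parameter back
        return self.params[key]

    def estimate(self):                  # ESTIMATE: cost/time bid
        return {"hours": 6, "stdev": 1, "fee": 40}

    def invoke(self):                    # INVOKE: start asynchronously
        self.started = True

    def examine(self):                   # EXAMINE: poll progress (0..1)
        return 1.0 if self.started else 0.0

    def extract(self):                   # EXTRACT: pull the results
        return {"output": "..."}

    def terminate(self):                 # TERMINATE: release the service
        self.started = False

svc = Service("B")
svc.setparam(dataset="d1")
bid = svc.estimate()          # the scheduler compares bids before committing
svc.invoke()                  # control returns immediately (asynchrony)
while svc.examine() < 1.0:    # the composer polls; data flows via mediation
    pass
result = svc.extract()
svc.terminate()
```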

14 UCSC 2002

Resources + Scheduling

Computational model
- Multithreading
- Automatic parallelization

Resource management
- Process creation
- OS signal delivery
- OS scheduling

[Scope: end system]

15 UCSC 2002

Resources + Scheduling

Computational model
- Synchronous communication
- Distributed shared memory

Resource management
- Parallel process creation
- Gang scheduling
- OS-level signal propagation

[Scope: end system + cluster]

16 UCSC 2002

Resources + Scheduling

Computational model
- Client/server
- Loosely synchronous: pipelines, IWIM

Resource management
- Resource discovery
- Signal distribution networks

[Scope: end system + cluster + intranet]

17 UCSC 2002

Resources + Scheduling

Computational model
- Collaborative systems
- Remote control
- Data mining

Resource management
- Brokers
- Trading
- Mobile code negotiation

[Scope: end system + cluster + intranet + Internet]

18 UCSC 2002

Scheduling Difficulties: Adaptation (Repair and Reschedule)

- Schedules made at t0 are only guesses
- Estimates for multiple stages may become invalid
- => Schedules must be revised during runtime

[Timeline: schedule at t0, then work; a hazard forces a reschedule; work resumes until tfinish]

19 UCSC 2002

Scheduling Difficulties: Service Autonomy (No Resource Allocation)

- The scheduler does not handle resource allocation
- Users observe resources without controlling them

This means:
- Competing objectives have orthogonal scheduling techniques
- Changing goals for tasks or users vastly increases scheduling complexity

20 UCSC 2002

Some Related Work

Properties compared:
R = Rescheduling
A = Autonomy of services
M = Monitoring execution
Q = QoS, probabilistic execution

- PERT: Q, A, M
- CPM: M, R, A
- ePERT (AT&T), Condor (Wisconsin): M, R, Q
- Mariposa (UCB): R, Q, A
- SBS (Stanford): R, Q, A, M

26 UCSC 2002

Sample Program

[Figure: DAG of four services: A, B, C, D]

27 UCSC 2002

Budgeting

- Time: maximum allowable execution time
- Expense: funding available to lease services
- Surety: schedule probability of success (the goal, and an assessment technique)

28 UCSC 2002

Program Schedule as a Template

- Instantiated at runtime
- Service provider selection, etc.

[Figure: the template DAG A, B, C, D; each node expands to a pool of candidate providers (several A's, B's, C's, D's) from which one is selected]

32 UCSC 2002

t0 Schedule Selection

- Guided by runtime "bids"
- Constrained by budget

[Figure: the template DAG with sample bids for the selected providers, e.g. 7±2h/$50, 6±1h/$40, 5±2h/$30, 3±1h/$30]

33 UCSC 2002

t0 Schedule Constraints

Budget
- Time: upper bound, e.g. 22h
- Cost: upper bound, e.g. $250
- Surety: lower bound, e.g. 90%
- {Time, Cost, Surety} = {22, 250, 90}

Steered by user preferences/weights: <Time, Cost, Surety> = <10, 1, 5>

Selection (score = preference-weighted slack against the budget; highest wins):
- S1est [20, 150, 90] = (22-20)*10 + (250-150)*1 + (90-90)*5 = 120
- S2est [22, 175, 95] = (22-22)*10 + (250-175)*1 + (95-90)*5 = 100
- S3est [18, 190, 96] = (22-18)*10 + (250-190)*1 + (96-90)*5 = 130
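This selection rule is easy to mechanize. A small Python sketch using the slide's example budget, weights, and candidate plans (the structure and names are mine, not from the talk):

```python
# Sketch of the selection rule above: score each plan by its
# preference-weighted slack against the budget; the highest score wins.

BUDGET = {"time": 22, "cost": 250, "surety": 90}   # upper, upper, lower bound
WEIGHTS = {"time": 10, "cost": 1, "surety": 5}     # user preferences

def score(plan):
    return (WEIGHTS["time"] * (BUDGET["time"] - plan["time"])
            + WEIGHTS["cost"] * (BUDGET["cost"] - plan["cost"])
            + WEIGHTS["surety"] * (plan["surety"] - BUDGET["surety"]))

plans = [
    {"name": "S1", "time": 20, "cost": 150, "surety": 90},   # score 120
    {"name": "S2", "time": 22, "cost": 175, "surety": 95},   # score 100
    {"name": "S3", "time": 18, "cost": 190, "surety": 96},   # score 130
]
feasible = [p for p in plans if p["time"] <= BUDGET["time"]
            and p["cost"] <= BUDGET["cost"]
            and p["surety"] >= BUDGET["surety"]]
best = max(feasible, key=score)
print(best["name"], score(best))   # S3 130
```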

34 UCSC 2002

[Figure: search space of candidate plans plotted by expected program execution time (x-axis) vs. expected program cost (y-axis); the time and cost budgets bound the feasible region, and user preferences select among the Pareto-optimal plans]

35 UCSC 2002

Program Evaluation and Review Technique

Service times: most likely (m), optimistic (a), and pessimistic (b).

(1) expected duration (service): $e_i = \frac{1}{3}\left(2m_i + \frac{a_i + b_i}{2}\right) = \frac{a_i + 4m_i + b_i}{6}$

(2) standard deviation: $\sigma_i = \frac{b_i - a_i}{6}$

(3) expected duration (program): $e_{program} = \sum_i e_i$ and $\sigma^2_{program} = \sum_i \sigma^2_i$

(4) test value: $x = \frac{T - e_{program}}{\sigma_{program}}$, with $x \sim N(0, 1)$

(5) expectation test: $prob(t_{program} \le T) = \Phi(x)$

(6) complementary expectation test: $prob(t_{program} > T) = 1 - \Phi(x)$
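A minimal Python sketch of equations (1)-(5), assuming the services lie on a single sequential path (a schedule DAG would use its critical path); the (a, m, b) hours are illustrative.

```python
# PERT surety: per-service (a, m, b) estimates combined along one path.
# NormalDist().cdf supplies Phi; the example numbers are illustrative.
from statistics import NormalDist

def pert(a, m, b):
    e = (a + 4 * m + b) / 6        # (1) expected duration
    sd = (b - a) / 6               # (2) standard deviation
    return e, sd

def surety(services, deadline):
    """prob(t_program <= deadline) for services on one path."""
    stats = [pert(*s) for s in services]
    e_prog = sum(e for e, _ in stats)                  # (3) expected duration
    sd_prog = sum(sd ** 2 for _, sd in stats) ** 0.5   # (3) combined stdev
    x = (deadline - e_prog) / sd_prog                  # (4) test value
    return NormalDist().cdf(x)                         # (5) expectation test

chain = [(5, 7, 9), (5, 6, 7), (3, 5, 7)]   # (optimistic, likely, pessimistic)
print(f"{surety(chain, deadline=19):.1%}")  # ~84.1%
```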

36 UCSC 2002

t0 Complete Schedule Properties

[Figure: probability density of the probable program completion time (roughly 13-23h); the user-specified surety and the deadline cut off the right tail; Bank = $100]

37 UCSC 2002

Individual Service Properties

[Figure: per-service finish-time probability densities (0-10h scale) for services C, A, and B, with bids 7±2h, 6±1h, 5±2h]

38 UCSC 2002

t0 Combined Service Properties

[Figure: the per-service densities combine into a program completion-time density over a probable finish time of roughly 14-23h; against the 22h deadline, the required surety is 90% and the current surety is 99.6%]

39 UCSC 2002

Tracking Surety

[Figure: surety (%) tracked over execution time, plotted between 80 and 100 against the user-specified surety floor of 90%; the completion-time probability density shifts as services progress]

40 UCSC 2002

Runtime Hazards

- With control over resource allocation, or without runtime hazards, scheduling becomes much easier
- Runtime implies t0 schedule invalidation

Sample hazards:
- Delays and slowdowns
- Stoppages
- Inaccurate estimations
- Communication loss
- Competitive displacement… (OSM)

41 UCSC 2002

Progressive Hazard: Definition + Detection

[Figure: surety (%) over execution time; after serviceA starts and serviceB starts, serviceB runs slow and surety sinks below the 90% minimum, flagging a hazard]
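A sketch of how a monitor might flag the hazard types on this and the next two slides: recompute surety from the remaining work and compare it to the user's floor. The API and numbers here are hypothetical, not the SBS monitor.

```python
# Sketch: track surety during execution and classify shortfalls into the
# three hazard types (progressive, catastrophic, pseudo). NormalDist
# supplies the normal CDF; the monitor interface is hypothetical.
from statistics import NormalDist

MIN_SURETY = 0.90   # user-specified floor

def current_surety(elapsed_h, remaining, deadline_h):
    """remaining: list of (expected_hours_left, stdev) per running service."""
    e_left = sum(e for e, _ in remaining)
    sd = sum(s ** 2 for _, s in remaining) ** 0.5
    if sd == 0:
        return 1.0 if elapsed_h + e_left <= deadline_h else 0.0
    return NormalDist().cdf((deadline_h - elapsed_h - e_left) / sd)

def classify(surety, reachable):
    if surety >= MIN_SURETY:
        return "no hazard"
    if surety > 0.0:
        return "progressive hazard"        # e.g. serviceB running slow
    # surety ~ 0: a real failure, or a partition that merely looks like one
    return "catastrophic hazard" if reachable else "pseudo-hazard (or worse)"

# serviceB is slow: 5h elapsed, 16h of expected work left, 22h deadline.
s = current_surety(5, [(9, 2), (7, 2)], 22)
print(f"{s:.0%} ->", classify(s, reachable=True))   # 64% -> progressive hazard
```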

42 UCSC 2002

Catastrophic Hazard: Definition + Detection

[Figure: same plot; after serviceA and serviceB start, serviceB fails outright and surety drops to 0%]

43 UCSC 2002

Pseudo-Hazard: Definition + Detection

[Figure: same plot; serviceB suffers a communication failure, so observed surety drops to 0% even though the service may still be running]

44 UCSC 2002

Monitoring + Repair

- Observe, not control
- Complete set of repairs: sufficient (not minimal)
- Simple cost model: early termination = linear cost recovery
- Greedy selection of a single repair: O(s*r) (see the sketch below)

[Figure: DAG of services A, B, C, D]
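A sketch of the greedy O(s*r) scan: try each (service, repair) pair the remaining budget allows and keep the best re-estimated surety. The callback functions and costs below are placeholders, not the SBS implementation.

```python
# Greedy single-repair selection over s services and r candidate repairs.

REPAIRS = ["pushdown", "replacement", "duplication"]

def pick_repair(services, surety_after, cost_of, budget_left, surety_now):
    best = ("do nothing", None, surety_now)   # baseline is always an option
    for svc in services:                      # s services ...
        for repair in REPAIRS:                # ... times r repairs
            if cost_of(svc, repair) > budget_left:
                continue                      # cannot afford this one
            s = surety_after(svc, repair)     # re-estimated schedule surety
            if s > best[2]:
                best = (repair, svc, s)
    return best

# Illustrative estimators: duplicating service B helps the most.
choice = pick_repair(
    services=["A", "B", "C", "D"],
    surety_after=lambda svc, r: 0.95 if (svc, r) == ("B", "duplication") else 0.70,
    cost_of=lambda svc, r: {"pushdown": 5, "replacement": 30, "duplication": 60}[r],
    budget_left=100, surety_now=0.60)
print(choice)   # ('duplication', 'B', 0.95)
```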

45 UCSC 2002

Schedule Repair

[Figure: surety (%) over execution time dips below the 90% floor at t_hazard; a repair applied at t_repair restores it; the DAG A, B, C, D highlights the affected service]

46 UCSC 2002

Strategy 0: baseline (no repair)

- pro: no additional $ cost
- pro: ideal solution for partitioning hazards
- con: depends on self-recovery

[Figure: surety recovers (or not) on its own after t_hazard; no action at t_repair]

47 UCSC 2002

Strategy 1: service replacement

- pro: reduces $ lost
- con: lost investment of $ and time
- con: concedes the chance of recovery

[Figure: the failing service B is terminated and replaced by B’ at t_repair]

48 UCSC 2002

Strategy 2: service duplication

- pro: larger surety boost; leverages the chance of recovery
- con: large $ cost

[Figure: a duplicate B’ is started at t_repair while the original B keeps running]

49 UCSC 2002

Strategy 3: pushdown repair

- pro: cheap, no $ lost
- pro: no time lost
- con: cannot handle catastrophic hazards
- con: requires a chance of recovery

[Figure: the repair is pushed down to a later service: C is replaced by a faster C’, absorbing the earlier slowdown]

50 UCSC 2002

Experimental Results

Rescheduling options
- Baseline: no repairs
- Single-strategy repairs (limits flexibility and effectiveness)
- Use all strategies

Setup (see the sketch below)
- 1000 random DAG schedules, 2-10 services
- 1-3 hazards per execution
- Fixed service availability
- All schedules are repairable
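A sketch of how such a trial set might be generated; the edge probability and seed are my placeholders, since the talk does not give the generator's parameters.

```python
# Generate random DAG schedules of 2-10 services with 1-3 injected hazards.
import random

def random_dag(n, p=0.3):
    """Edge list over nodes 0..n-1; edges only point forward, so no cycles."""
    return [(i, j) for j in range(1, n) for i in range(j) if random.random() < p]

def make_trial():
    n = random.randint(2, 10)                             # 2-10 services
    hazards = random.sample(range(n), min(n, random.randint(1, 3)))
    return {"services": n, "edges": random_dag(n), "hazards": hazards}

random.seed(0)
trials = [make_trial() for _ in range(1000)]              # 1000 schedules
print(trials[0])
```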

51 UCSC 2002

“The Numbers”

What is the value of a close finish? (… late)

[Bar chart: schedules finished on time, out of 1000, per repair strategy: do nothing, push-down, replacement, duplication, all, IDEAL; a second series counts finishes within one stdev past the deadline ("On Time + stdev")]

53 UCSC 2002

Why the Differences?

Catastrophic hazard
- Service provider failure
- "do nothing": no solution to the hazard

Pseudo-hazard
- Communication failure, network partition
- Looks exactly like a catastrophic hazard
- "do nothing": the ideal solution

Slowdown hazard
- Not a complete failure; multiple solutions
- "do nothing": ideal, futile, or acceptable

54 UCSC 2002

A Challenge

Observations of progress are only secondary indicators of the current work rate.

[Figure: progress (%) vs. execution time (0-200); the same observed progress extrapolates to very different projected finish times depending on the assumed work rate]

55 UCSC 2002

Open Questions

Simultaneous rescheduling
- Use more than one strategy for a hazard
- Finding the optimal combination is NP, but NP here might not be that hard:
  approximations are acceptable; small set; strong constraints;
  NP is worst case, not average case? (e.g., DFBB search)

Global impact of local schedule preferences
- How do local preferences interact in, and reshape, the global market?

56 UCSC 2002

Open Questions

Monitoring resolution adjustments
- Networks are not free or zero-latency: account for the cost of monitoring
- Frequent monitoring = more cost
- Frequent monitoring = greater accuracy
- Unstudied effect: delayed status information

Accuracy of t0 service cost estimates
- Model as a hazard with delayed detection: a "1-way hazard"
- Penalty adjustments

57 UCSC 2002

Deeper Questions

User preferences are only used in generating the initial (t0) schedule
- Fixed least-cost repair (= surety / repair cost)
- Best-cost repair (is success sensitive to preference?)

Second-order cost effects
- $ left over in the budget is purchasing power; what is the value of that purchasing power?
- Sampling for cost estimates during runtime
- surety = time + progress (+ budgetBalance / valuation)

58 UCSC 2002

Conclusions

- Novel statistical method for service scheduling
- Effective strategies for a varied hazard mix
- Achieves per-user-defined quality of service
- Should translate well "out of the sandbox"
- Clear directions for continued research

More information
- http://www.db.stanford.edu/~nsample/
- http://www.db.stanford.edu/CHAIMS/

59 UCSC 2002

60 UCSC 2002

Steps in Scheduling

[Figure: cycle of Estimation → Planning → Invocation → Monitoring → Completion, with a Rescheduling loop back from Monitoring]

61 UCSC 2002

CHAIMS Scheduler

[Architecture: the input program feeds a Program Analyzer, which passes requirements to a Planner; the Planner haggles with an Estimator/Bidder over costs/times, sends control to a Dispatcher (invoke), and receives status from a Monitor (observe); user requirements (e.g., budget) also drive the Planner]

62 UCSC 2002

Simplified Cost Model

[Figure: timeline from start/run to a finish on time at the target, plus data transportation costs, completing the cost model]

63 UCSC 2002

Full Cost Model

[Figure: full timeline: reservation, a hold fee while the client is not yet ready to start, start/run, and a finish that is early, on time, or late relative to the target, through "client ready for data"; charges (+) and credits (-) apply at each stage, plus data transportation costs, completing the cost model]
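A sketch of this cost structure as a function. The hold-fee fraction and late penalty are illustrative placeholders, since the slide gives only the shape of the model, not its rates.

```python
# Illustrative rendering of the full cost model's structure.

def service_cost(reservation, rate_per_h, run_h, hold_h, late_h, transport,
                 hold_fraction=0.5, late_penalty_per_h=8.0):
    cost = reservation                            # paid to reserve the slot
    cost += rate_per_h * run_h                    # start/run through finish
    cost += hold_fraction * rate_per_h * hold_h   # held while client not ready
    cost += late_penalty_per_h * late_h           # finished past the target
    # (an early finish could earn a credit; the slide marks both + and -)
    cost += transport                             # data transportation costs
    return cost

# On-time run: no hold time, no lateness.
print(service_cost(reservation=10, rate_per_h=5, run_h=6,
                   hold_h=0, late_h=0, transport=4))   # 44.0
```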

64 UCSC 2002

The Eight Fallacies of Distributed Computing (Peter Deutsch)

1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous