Sparrow Distributed Low-Latency Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

SparrowDistributed Low-Latency Scheduling

Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Sparrow schedules tasks in clusters

using a decentralized, randomized approach

Sparrow schedules tasks in clusters

using a decentralized, randomized approach

support constraints and fair sharing

and provides response times

within 12% of ideal

Scheduling Setting

…

…

…

Map Reduce/Spark/

Dryad Job

…

Task

Task

Task

Map Reduce/Spark/

Dryad JobTask

Task

Job Latencies Rapidly Decreasing

10 min.

10 sec.

100 ms

1 ms

2004: MapReducebatch job

2009: Hive

query

2010: Dremel Query

2012: Impala query

2010:In-

memory Spark query

2013:Spark

streaming

Scheduling challenges:

Millisecond Latency

Quality Placement

Fault Tolerant

High Throughput

10 min.

10 sec.

100 ms

1 ms

2004: MapReducebatch job

2009: Hive

query

2010: Dremel Query

2012: Impala query

2010:In-

memory Spark query

2013:Spark

streaming

1000 16-core machines

26decisio

ns/second

Scheduler

throughput

1.6K

decisions/

second

160Kdecisio

ns/second

16M

decisions/

second

Millisecond Latency

Quality Placement

Fault Tolerant

High Throughput

Today: Completely Centralized Less centralization

Sparrow:Completely

Decentralized

✗

✗

✓

✗ ✓

✓

✓

?✓

SparrowDecentralized approach

Existing randomized approachesBatch SamplingLate BindingAnalytical performance evaluation

Handling constraints

Fairness and policy enforcement

Within 12% of ideal on 100 machines

Scheduling with Sparrow

WorkerWorkerWorkerWorkerWorker

Scheduler

Scheduler

Scheduler

SchedulerJob

Worker


Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Random

Simulated Results

100-task jobs in 10,000-node cluster, exp. task durations

Omniscient: infinitely fast centralized

scheduler

Per-task sampling


Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Power of Two Choices

Per-task sampling


Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Power of Two Choices

Simulated Results


70% cluster load

Response Time Grows with Tasks/Job!

Per-Task Sampling


Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

✓

✓

Task 1

Task 2

Per-ta

sk

Per-task Sampling


Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Place m tasks on the least loaded of dm slaves

Per-ta

sk✓

✓

4 probes (d =

2)

Batch

Per-task versus Batch Sampling

70% cluster load

Simulated Results


Queue length poor predictor of wait time

Worker

Worker

80 ms155

ms

530 ms

Poor performance on heterogeneous workloads

Late Binding

Worker

Worker

Worker

Worker

Worker

Scheduler

Scheduler

SchedulerSchedulerJob

Worker


4 probes (d =

2)

Late Binding

Scheduler

Scheduler



4 probes (d =

2)

Worker

Worker

Worker

Worker

Worker

Worker

Late Binding

Scheduler

Scheduler



Worker

requests

task

Worker

Worker

Worker

Worker

Worker

Worker

Simulated Results


What about constraints?

Job Constraints

Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Worker

Worker

Worker

Worker

Worker

Restrict probed machines to those that satisfy the constraint

Per-Task Constraints

Scheduler

Scheduler

Scheduler

SchedulerJob

Worker

Worker

Worker

Worker

Worker

Worker

Probe separately for each task

Technique Recap

Scheduler

Scheduler

Scheduler

SchedulerBatch

sampling+

Late binding+

Constraint handling


Worker

How does Sparrow perform on a real cluster?

Spark on Sparrow


Worker

Query: DAG of Stages

Sparrow

Scheduler

Job

Spark on Sparrow


Worker


Sparrow

Scheduler

Job

Spark on Sparrow


Worker


Sparrow

Scheduler

Job

How does Sparrow compare to Spark’s native scheduler?

100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load

TPC-H Queries: Background

TPC-H: Common benchmark for analytics workloads

Sparrow

Spark: Distributed in-memory analytics framework

Shark: SQL execution engine

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

95

75

25

50

Percentiles

5

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

Within 12% of ideal

Median queuing delay of 9ms

Fault Tolerance

Scheduler 1

Scheduler 2

Spark Client 1 ✗Spark

Client 2

Timeout: 100msFailover: 5ms

Re-launch queries: 15ms

When does Sparrow not work as well?

High cluster load

Related Work

Centralized task schedulers: e.g., Quincy

Two level schedulers: e.g., YARN, Mesos

Coarse-grained cluster schedulers: e.g., Omega

Load balancing: single task

Batch sampling

+Late binding

+Constraint handling

www.github.com/radlab/sparrow

Sparrows provides near-ideal job response

times without global visibility

Scheduler

Scheduler

Scheduler

Scheduler


Worker

Backup Slides

Can we do better without losing simplicity?

Policy Enforcement

SlaveHigh Priority

Low Priority SlaveUser A (75%)

User B (25%)

Fair SharesServe queues using

weighted fair queuing

PrioritiesServe queues based on strict priorities

Documents

Sparrow Distributed Low-Latency Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica