
Sparrow: Distributed Low-Latency Spark Scheduling

Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Outline

The Spark scheduling bottleneck

Sparrow’s fully distributed, fault-tolerant technique

Sparrow’s near-optimal performance

Spark Today

[Figure: Users 1, 2, and 3 all share a single Spark Context, which handles query compilation, storage, and scheduling for a pool of Workers.]


Job Latencies Rapidly Decreasing

[Figure: timeline of typical job latencies, on an axis from 10 min. down to 1 ms: 2004 MapReduce batch job; 2009 Hive query; 2010 Dremel query; 2010 in-memory Spark query; 2012 Impala query; 2013 Spark Streaming.]

Job latencies rapidly decreasing
+ Spark deployments growing in size
= Scheduling bottleneck!

Spark scheduler throughput: 1500 tasks / second

Cluster size (# 16-core machines) at which the scheduler becomes the bottleneck, by task duration:

Task duration    Cluster size (# 16-core machines)
10 seconds       1000
1 second         100
100 ms           10
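A back-of-the-envelope check of those numbers (my own arithmetic, assuming one task slot per core and the 1500 tasks/second figure above):

\[
\text{cluster size} \approx \frac{\text{scheduler throughput} \times \text{task duration}}{\text{cores per machine}}
= \frac{1500\ \text{tasks/s} \times 1\ \text{s}}{16\ \text{cores}} \approx 94\ \text{machines}
\]

which matches the ~100-machine break-even point for 1-second tasks; 10-second tasks push it to roughly 1000 machines, and 100 ms tasks pull it down to roughly 10.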

Optimizing the Spark Scheduler

0.8: Monitoring code moved off critical path

0.8.1: Result deserialization moved off critical path

Future improvements may yield 2-3x higher throughput

Is the scheduler the bottleneck in my cluster?

[Figure: a Cluster Scheduler launches tasks on Workers and receives task-completion messages; scheduler delay is the gap between one task completing and the next task launching on that worker.]
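One rough way to check this in your own cluster is to watch the idle gap between task completions and the next task launches. Below is a minimal sketch against Spark's SparkListener API (the listener class and the printed estimate are my own illustration, not part of the talk); it only approximates scheduler delay when executors would otherwise be kept busy.

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd, SparkListenerTaskStart}
    import scala.collection.mutable

    // Rough per-executor estimate: the gap between a task finishing on an
    // executor and the next task launching there. Large, growing gaps under
    // load suggest the centralized scheduler is the bottleneck.
    class SchedulerDelayListener extends SparkListener {
      private val lastFinish = mutable.Map[String, Long]()  // executorId -> finish time (ms)

      override def onTaskEnd(end: SparkListenerTaskEnd): Unit = synchronized {
        lastFinish(end.taskInfo.executorId) = end.taskInfo.finishTime
      }

      override def onTaskStart(start: SparkListenerTaskStart): Unit = synchronized {
        val info = start.taskInfo
        lastFinish.get(info.executorId).foreach { prev =>
          val gapMs = info.launchTime - prev
          if (gapMs >= 0) println(s"executor ${info.executorId}: ${gapMs} ms idle before launch")
        }
      }
    }

    // Register on an existing SparkContext:
    //   sc.addSparkListener(new SchedulerDelayListener)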

Spark Today

[Figure repeated: users share a single Spark Context for query compilation, storage, and scheduling.]

Future Spark

[Figure: each user (User 1, 2, 3) has its own Scheduler and query compilation front end; all of them share the same pool of Workers.]

Benefits: high throughput, fault tolerance

Future Spark

[Figure repeated: per-user scheduler and query compilation over a shared pool of Workers.]

Storage: Tachyon

Scheduling with Sparrow

[Figure: many distributed Schedulers; each one places a stage's tasks directly on Workers.]

Batch Sampling

[Figure: a Scheduler probes a random subset of Workers for a stage of tasks.]

Place m tasks on the least loaded of 2m workers (for 2 tasks: 4 probes, d = 2).
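A minimal sketch of the batch-sampling step (my own illustration; WorkerRef and queueLength stand in for Sparrow's actual probe RPC): probe d·m randomly chosen workers, then assign the m tasks to the m least-loaded of them.

    import scala.util.Random

    // Stand-in for a worker handle; queueLength() represents the probe reply.
    case class WorkerRef(id: String) {
      def queueLength(): Int = Random.nextInt(10)
    }

    // Batch sampling: probe d*m random workers and place the m tasks on the
    // m workers that reported the shortest queues.
    def placeBatch[T](tasks: Seq[T], workers: Seq[WorkerRef], d: Int = 2): Seq[(T, WorkerRef)] = {
      val m = tasks.size
      val probed = Random.shuffle(workers).take(d * m)  // send d*m probes
      val byLoad = probed.sortBy(_.queueLength())       // sort replies, shortest queue first
      tasks.zip(byLoad.take(m))                         // the m least-loaded workers get the tasks
    }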

Queue length is a poor predictor of wait time

[Figure: workers with tasks of 80 ms, 155 ms, and 530 ms queued; equal queue lengths do not reveal how long a new task will actually wait.]

Poor performance on heterogeneous workloads

Late Binding

[Figure: the Scheduler's probes sit as placeholders in the Workers' queues; when a placeholder reaches the front of a queue, that Worker requests a task from the Scheduler, which sends tasks to the first workers that ask.]

Place m tasks on the least loaded of dm workers (4 probes, d = 2).
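A sketch of the late-binding refinement (again my own illustration, not Sparrow's actual RPC protocol): probes become lightweight reservations, and the scheduler hands out real tasks only when a worker's reservation reaches the front of its queue, so tasks go to the first m workers that actually have a free slot.

    import scala.collection.mutable

    // Scheduler side of late binding: hold the stage's tasks until workers ask.
    class LateBindingScheduler[T](tasks: Seq[T]) {
      private val remaining = mutable.Queue(tasks: _*)

      // Called by a worker when one of our reservations reaches its queue head.
      // The first m callers get a task; later callers get None (a no-op) because
      // every task has already been launched elsewhere.
      def requestTask(workerId: String): Option[T] = synchronized {
        if (remaining.nonEmpty) Some(remaining.dequeue()) else None
      }
    }

    // Worker side, when a reservation reaches the head of its queue:
    //   scheduler.requestTask(myId) match {
    //     case Some(task) => run(task)
    //     case None       => ()  // stale reservation; nothing left to run
    //   }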

What about constraints?


Per-Task Constraints

[Figure: for each constrained task, the Scheduler probes only Workers that can satisfy that task's constraints.]

Probe separately for each task.
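A sketch of per-task sampling under constraints (illustrative names only; the constraint is modeled as a predicate over workers): each task is probed independently, drawing its d candidates from the workers that satisfy its constraint.

    import scala.util.Random

    // Per-task constrained placement: for each task, sample d workers from the
    // set that satisfies the task's constraint and pick the least loaded one.
    def placeConstrained[Task, Worker](
        tasks: Seq[(Task, Worker => Boolean)],  // each task paired with its constraint
        workers: Seq[Worker],
        queueLength: Worker => Int,             // stand-in for the probe RPC
        d: Int = 2): Seq[(Task, Worker)] =
      tasks.map { case (task, satisfies) =>
        val eligible = workers.filter(satisfies)
        require(eligible.nonEmpty, s"no worker satisfies the constraint for $task")
        val probed = Random.shuffle(eligible).take(d)
        (task, probed.minBy(queueLength))
      }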

Technique Recap

Batch sampling
+ Late binding
+ Constraints

[Figure: distributed Schedulers placing tasks on a shared pool of Workers.]

How well does Sparrow perform?

How does Sparrow compare to Spark’s native scheduler?

100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load

TPC-H Queries: Background

TPC-H: common benchmark for analytics workloads

[Figure: software stack: Shark (SQL execution engine) running on Spark, with Sparrow handling scheduling.]

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

[Figure: query response times at the 5th, 25th, 50th, 75th, and 95th percentiles.]

Within 12% of ideal
Median queuing delay of 9 ms

Policy Enforcement

Priorities: each worker keeps separate high- and low-priority queues and serves them using strict priorities.

Fair shares: each worker keeps a queue per user (e.g. User A 75%, User B 25%) and serves the queues using weighted fair queuing.

Weighted Fair Sharing
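A simplified stand-in for the per-worker weighted fair queuing idea (my own sketch, not Sparrow's implementation): keep one queue per user and, whenever a slot frees, serve a non-empty queue chosen with probability proportional to its weight.

    import scala.collection.mutable
    import scala.util.Random

    // One task queue per user; nextTask() picks among users with pending tasks,
    // weighted by their configured shares (e.g. "A" -> 0.75, "B" -> 0.25).
    class WeightedQueues[T](weights: Map[String, Double]) {
      private val queues = mutable.Map(weights.keys.map(_ -> mutable.Queue.empty[T]).toSeq: _*)

      def enqueue(user: String, task: T): Unit = synchronized { queues(user).enqueue(task) }

      def nextTask(): Option[T] = synchronized {
        val candidates = weights.toSeq.filter { case (user, _) => queues(user).nonEmpty }
        if (candidates.isEmpty) None
        else {
          var r = Random.nextDouble() * candidates.map(_._2).sum
          val user = candidates
            .find { case (_, w) => { r -= w; r <= 0 } }  // weighted random pick
            .map(_._1)
            .getOrElse(candidates.last._1)
          Some(queues(user).dequeue())
        }
      }
    }

Over many dequeues this serves User A about three times as often as User B, matching the 75% / 25% shares; the strict-priority policy would instead always drain the high-priority queue first.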

Fault Tolerance

[Figure: Spark Client 1 detects that Scheduler 1 has failed and fails over to Scheduler 2; Spark Client 2 is unaffected.]

Timeout: 100 ms
Failover: 5 ms
Re-launch queries: 15 ms
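A sketch of the client-side failover logic those numbers describe (illustrative; submit stands in for the client-to-scheduler request): if a scheduler misses the timeout, the client moves to the next scheduler in its list and re-launches the query there.

    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration._
    import scala.util.{Failure, Success, Try}

    // Try each scheduler in turn; a request that does not complete within the
    // timeout counts as a failure, triggering failover and query re-launch.
    def submitWithFailover[R](schedulers: List[String],
                              submit: String => Future[R],
                              timeout: FiniteDuration = 100.millis): R =
      schedulers match {
        case Nil => throw new RuntimeException("no live schedulers")
        case scheduler :: rest =>
          Try(Await.result(submit(scheduler), timeout)) match {
            case Success(result) => result
            case Failure(_)      => submitWithFailover(rest, submit, timeout)  // failover + re-launch
          }
      }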

Making Sparrow feature-complete

Interfacing with UI

Delay scheduling

Speculation

(1) Diagnosing a Spark scheduling bottleneck

(2) Distributed, fault-tolerant scheduling with Sparrow: www.github.com/radlab/sparrow

[Figure: distributed Schedulers and a shared pool of Workers.]
