
Sparrow: Distributed Low-Latency Spark Scheduling

Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Outline

The Spark scheduling bottleneck

Sparrow’s fully distributed, fault-tolerant technique

Sparrow’s near-optimal performance

Spark Today

[Figure: Users 1, 2, and 3 all share a single Spark Context, which handles query compilation, storage, and scheduling for a pool of Workers.]


Job Latencies Rapidly Decreasing

[Figure: timeline of typical job latencies, on an axis from 10 min. down to 1 ms: 2004 MapReduce batch job; 2009 Hive query; 2010 Dremel query; 2010 in-memory Spark query; 2012 Impala query; 2013 Spark Streaming.]

Job latencies rapidly decreasing
+ Spark deployments growing in size
= Scheduling bottleneck!

Spark scheduler throughput: 1500 tasks / second

Cluster size (# 16-core machines) at which the scheduler becomes the bottleneck, by task duration:

Task duration    Cluster size (# 16-core machines)
10 seconds       1000
1 second         100
100 ms           10
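A back-of-the-envelope check of those numbers (my own arithmetic, assuming one task slot per core and the 1500 tasks/second figure above):

\[
\text{cluster size} \approx \frac{\text{scheduler throughput} \times \text{task duration}}{\text{cores per machine}}
= \frac{1500\ \text{tasks/s} \times 1\ \text{s}}{16\ \text{cores}} \approx 94\ \text{machines}
\]

which matches the ~100-machine break-even point for 1-second tasks; 10-second tasks push it to roughly 1000 machines, and 100 ms tasks pull it down to roughly 10.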

Optimizing the Spark Scheduler

0.8: Monitoring code moved off critical path

0.8.1: Result deserialization moved off critical path

Future improvements may yield 2-3x higher throughput

Is the scheduler the bottleneck in my cluster?

[Figure: a Cluster Scheduler launches tasks on Workers and receives task-completion messages; scheduler delay is the gap between one task completing and the next task launching on that worker.]
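One rough way to check this in your own cluster is to watch the idle gap between task completions and the next task launches. Below is a minimal sketch against Spark's SparkListener API (the listener class and the printed estimate are my own illustration, not part of the talk); it only approximates scheduler delay when executors would otherwise be kept busy.

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd, SparkListenerTaskStart}
    import scala.collection.mutable

    // Rough per-executor estimate: the gap between a task finishing on an
    // executor and the next task launching there. Large, growing gaps under
    // load suggest the centralized scheduler is the bottleneck.
    class SchedulerDelayListener extends SparkListener {
      private val lastFinish = mutable.Map[String, Long]()  // executorId -> finish time (ms)

      override def onTaskEnd(end: SparkListenerTaskEnd): Unit = synchronized {
        lastFinish(end.taskInfo.executorId) = end.taskInfo.finishTime
      }

      override def onTaskStart(start: SparkListenerTaskStart): Unit = synchronized {
        val info = start.taskInfo
        lastFinish.get(info.executorId).foreach { prev =>
          val gapMs = info.launchTime - prev
          if (gapMs >= 0) println(s"executor ${info.executorId}: ${gapMs} ms idle before launch")
        }
      }
    }

    // Register on an existing SparkContext:
    //   sc.addSparkListener(new SchedulerDelayListener)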

Spark Today

[Figure repeated: users share a single Spark Context for query compilation, storage, and scheduling.]

Future Spark

[Figure: each user (User 1, 2, 3) has its own Scheduler and query compilation front end; all of them share the same pool of Workers.]

Benefits: high throughput, fault tolerance

Future Spark

[Figure repeated: per-user scheduler and query compilation over a shared pool of Workers.]

Storage: Tachyon

Scheduling with Sparrow

[Figure: many distributed Schedulers; each one places a stage's tasks directly on Workers.]

Batch Sampling

[Figure: a Scheduler probes a random subset of Workers for a stage of tasks.]

Place m tasks on the least loaded of 2m workers (for 2 tasks: 4 probes, d = 2).
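A minimal sketch of the batch-sampling step (my own illustration; WorkerRef and queueLength stand in for Sparrow's actual probe RPC): probe d·m randomly chosen workers, then assign the m tasks to the m least-loaded of them.

    import scala.util.Random

    // Stand-in for a worker handle; queueLength() represents the probe reply.
    case class WorkerRef(id: String) {
      def queueLength(): Int = Random.nextInt(10)
    }

    // Batch sampling: probe d*m random workers and place the m tasks on the
    // m workers that reported the shortest queues.
    def placeBatch[T](tasks: Seq[T], workers: Seq[WorkerRef], d: Int = 2): Seq[(T, WorkerRef)] = {
      val m = tasks.size
      val probed = Random.shuffle(workers).take(d * m)  // send d*m probes
      val byLoad = probed.sortBy(_.queueLength())       // sort replies, shortest queue first
      tasks.zip(byLoad.take(m))                         // the m least-loaded workers get the tasks
    }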

Queue length is a poor predictor of wait time

[Figure: workers with tasks of 80 ms, 155 ms, and 530 ms queued; equal queue lengths do not reveal how long a new task will actually wait.]

Poor performance on heterogeneous workloads

Late Binding

[Figure: the Scheduler's probes sit as placeholders in the Workers' queues; when a placeholder reaches the front of a queue, that Worker requests a task from the Scheduler, which sends tasks to the first workers that ask.]

Place m tasks on the least loaded of dm workers (4 probes, d = 2).
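A sketch of the late-binding refinement (again my own illustration, not Sparrow's actual RPC protocol): probes become lightweight reservations, and the scheduler hands out real tasks only when a worker's reservation reaches the front of its queue, so tasks go to the first m workers that actually have a free slot.

    import scala.collection.mutable

    // Scheduler side of late binding: hold the stage's tasks until workers ask.
    class LateBindingScheduler[T](tasks: Seq[T]) {
      private val remaining = mutable.Queue(tasks: _*)

      // Called by a worker when one of our reservations reaches its queue head.
      // The first m callers get a task; later callers get None (a no-op) because
      // every task has already been launched elsewhere.
      def requestTask(workerId: String): Option[T] = synchronized {
        if (remaining.nonEmpty) Some(remaining.dequeue()) else None
      }
    }

    // Worker side, when a reservation reaches the head of its queue:
    //   scheduler.requestTask(myId) match {
    //     case Some(task) => run(task)
    //     case None       => ()  // stale reservation; nothing left to run
    //   }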

What about constraints?


Per-Task Constraints

[Figure: for each constrained task, the Scheduler probes only Workers that can satisfy that task's constraints.]

Probe separately for each task.
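A sketch of per-task sampling under constraints (illustrative names only; the constraint is modeled as a predicate over workers): each task is probed independently, drawing its d candidates from the workers that satisfy its constraint.

    import scala.util.Random

    // Per-task constrained placement: for each task, sample d workers from the
    // set that satisfies the task's constraint and pick the least loaded one.
    def placeConstrained[Task, Worker](
        tasks: Seq[(Task, Worker => Boolean)],  // each task paired with its constraint
        workers: Seq[Worker],
        queueLength: Worker => Int,             // stand-in for the probe RPC
        d: Int = 2): Seq[(Task, Worker)] =
      tasks.map { case (task, satisfies) =>
        val eligible = workers.filter(satisfies)
        require(eligible.nonEmpty, s"no worker satisfies the constraint for $task")
        val probed = Random.shuffle(eligible).take(d)
        (task, probed.minBy(queueLength))
      }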

Technique Recap

Batch sampling
+ Late binding
+ Constraints

[Figure: distributed Schedulers placing tasks on a shared pool of Workers.]

How well does Sparrow perform?

How does Sparrow compare to Spark’s native scheduler?

100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load

TPC-H Queries: Background

TPC-H: common benchmark for analytics workloads

[Figure: software stack: Shark (SQL execution engine) running on Spark, with Sparrow handling scheduling.]

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

[Figure: query response times at the 5th, 25th, 50th, 75th, and 95th percentiles.]

Within 12% of ideal
Median queuing delay of 9 ms

Policy Enforcement

Priorities: each worker keeps separate high- and low-priority queues and serves them using strict priorities.

Fair shares: each worker keeps a queue per user (e.g. User A 75%, User B 25%) and serves the queues using weighted fair queuing.

Weighted Fair Sharing
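A simplified stand-in for the per-worker weighted fair queuing idea (my own sketch, not Sparrow's implementation): keep one queue per user and, whenever a slot frees, serve a non-empty queue chosen with probability proportional to its weight.

    import scala.collection.mutable
    import scala.util.Random

    // One task queue per user; nextTask() picks among users with pending tasks,
    // weighted by their configured shares (e.g. "A" -> 0.75, "B" -> 0.25).
    class WeightedQueues[T](weights: Map[String, Double]) {
      private val queues = mutable.Map(weights.keys.map(_ -> mutable.Queue.empty[T]).toSeq: _*)

      def enqueue(user: String, task: T): Unit = synchronized { queues(user).enqueue(task) }

      def nextTask(): Option[T] = synchronized {
        val candidates = weights.toSeq.filter { case (user, _) => queues(user).nonEmpty }
        if (candidates.isEmpty) None
        else {
          var r = Random.nextDouble() * candidates.map(_._2).sum
          val user = candidates
            .find { case (_, w) => { r -= w; r <= 0 } }  // weighted random pick
            .map(_._1)
            .getOrElse(candidates.last._1)
          Some(queues(user).dequeue())
        }
      }
    }

Over many dequeues this serves User A about three times as often as User B, matching the 75% / 25% shares; the strict-priority policy would instead always drain the high-priority queue first.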

Fault Tolerance

[Figure: Spark Client 1 detects that Scheduler 1 has failed and fails over to Scheduler 2; Spark Client 2 is unaffected.]

Timeout: 100 ms
Failover: 5 ms
Re-launch queries: 15 ms
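A sketch of the client-side failover logic those numbers describe (illustrative; submit stands in for the client-to-scheduler request): if a scheduler misses the timeout, the client moves to the next scheduler in its list and re-launches the query there.

    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration._
    import scala.util.{Failure, Success, Try}

    // Try each scheduler in turn; a request that does not complete within the
    // timeout counts as a failure, triggering failover and query re-launch.
    def submitWithFailover[R](schedulers: List[String],
                              submit: String => Future[R],
                              timeout: FiniteDuration = 100.millis): R =
      schedulers match {
        case Nil => throw new RuntimeException("no live schedulers")
        case scheduler :: rest =>
          Try(Await.result(submit(scheduler), timeout)) match {
            case Success(result) => result
            case Failure(_)      => submitWithFailover(rest, submit, timeout)  // failover + re-launch
          }
      }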

Making Sparrow feature-complete

Interfacing with UI

Delay scheduling

Speculation

(1) Diagnosing a Spark scheduling bottleneck

(2) Distributed, fault-tolerant scheduling with Sparrow: www.github.com/radlab/sparrow

[Figure: distributed Schedulers and a shared pool of Workers.]
