Sparrow: Distributed, Low Latency Scheduling Rinik Kumar <[email protected]>


Page 1: Sparrow: Distributed, Low Latency Scheduling

Sparrow: Distributed, Low Latency Scheduling

Rinik Kumar <[email protected]>

Page 2

Agenda

- Part A: Background

- Part B: Sparrow system design

- Part C: Sparrow experimental evaluation

Page 3

Part A: Background

Page 4

Background: Data Processing Frameworks

• How to distribute data-parallel computations across multiple machines?
  • MapReduce (OSDI ’04)

• Dremel (VLDB ‘10)

• Spark (NSDI ‘12)

• Convert high-level computation description into jobs

• Partition input data and assign jobs to multiple machines

Page 5

Background: Short Tasks

• Common challenges in data processing frameworks

• Problem 1: Stragglers
  • Job response times are dominated by stragglers
  • Causes:
    • Machine performance (e.g. contended CPUs, congested networks, etc.)
    • Data partitioning (tasks take increased time due to computational skew, etc.)

• Problem 2: Sharing
  • Long-running tasks block additional tasks from running

Reference: http://kayousterhout.org/publications/hotos13-final24.pdf

Solution: Shorter Tasks!

Page 6

Solution 1: Straggler Mitigation

Reference: http://kayousterhout.org/talks/tinytasks-hotos-talk.pdf

Page 7

Solution 1: Straggler Mitigation

Reference: http://kayousterhout.org/talks/tinytasks-hotos-talk.pdf

Page 8

Solution 2: Improved Sharing

Reference: http://kayousterhout.org/talks/tinytasks-hotos-talk.pdf

Page 9

Solution 2: Improved Sharing

Reference: http://kayousterhout.org/talks/tinytasks-hotos-talk.pdf

Page 10

Q: Why don’t existing data processing frameworks use short tasks?

Page 11

Background: Short Tasks

• Architectural changes:
  • Cluster must support minimal task launch overhead

• Scalable storage systems:
  • Task runtime could be dominated by time taken to read input data

• Low-latency scheduling:
  • Scheduler must be able to make millions of low-latency scheduling decisions per second

• Framework-controlled I/O:
  • Framework should exploit the smaller resource footprint of short tasks (e.g. pipelined reading of input data)

• And more…
  • Changes to the execution and programming model

Page 12

Background: Scheduling

• Sparrow provides a solution to the scheduling problem!

• Restrictive time requirements:
  • Sparrow has around 1–10 milliseconds to make scheduling decisions

• High throughput requirements:
  • Sparrow must support millions of scheduling decisions per second

Page 13

Background: Spark

• Data processing framework optimizing for efficient data reuse and in-memory computation
  • Resilient distributed datasets (RDDs)
  • Express computation as a sequence of transformations (e.g. map, filter, join, etc.) on RDDs

• Scheduling:
  • Tasks assigned to machines based on delay scheduling
  • Delay scheduling attempts to achieve both fair sharing and data locality
    • Fair sharing: if N jobs are running, each job receives a 1/N share of resources
    • Data locality: place computations near their input data

Reference: https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf

Page 14

Background: Centralized vs. Decentralized

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 15

Background: Centralized vs. Decentralized

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 16

Part B: Sparrow System Design

Page 17

Sparrow’s Execution Model

• Cluster composed of worker machines that execute tasks and schedulers that assign tasks

• Each job composed of 𝑚 tasks

• Wait time:
  • Time until the task begins executing
  • Represents scheduler overhead

• Service time:
  • Time the task spends executing on a worker machine

Page 18

Sparrow’s Optimizations

• Batch sampling
  • Optimization of “power of 2 choices” load balancing
  • Place the 𝑚 tasks in a job on the least loaded of 𝑑 ∙ 𝑚 randomly selected machines

• Late binding
  • Delays assignment of tasks to machines until the machine is ready to run the task

Page 19

Randomized Sampling

• Scheduler chooses a random machine to assign tasks

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 20

Randomized Sampling: Analysis

• Let 𝑛 be the number of machines in the cluster

• Let 𝑝 be the probability that a randomly selected machine is loaded
  • Represents cluster load

• Probability that random sampling assigns all 𝑚 tasks to unloaded machines:
  • (1 − 𝑝)^𝑚
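This probability collapses quickly as jobs grow. A minimal calculation makes the point (the values of 𝑝 and 𝑚 below are illustrative choices, not numbers from the paper):

```python
# Probability that uniform random placement puts every one of m tasks
# on an unloaded machine, when each machine is loaded with probability p.
def p_all_unloaded_random(p, m):
    return (1 - p) ** m

# Illustrative values: even at moderate load, a large job almost never
# gets all of its tasks onto unloaded machines.
for p in (0.5, 0.8):
    for m in (10, 100):
        print(f"p={p}, m={m}: {p_all_unloaded_random(p, m):.3e}")
```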

Page 21

Randomized Sampling: Results

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 22

Power of 2 Choices

• Suppose 𝑛 balls are inserted into 𝑛 bins:
  • Each ball chooses 𝑑 = 2 bins uniformly at random
  • The ball is inserted into the bin that has fewer balls
  • If both bins have an identical number of balls, put the ball in either bin

• Azar et al. proved that the max load is log log 𝑛 + 𝑂(1) with high probability

• This is exponentially better than random allocation:
  • Max load is ≈ log 𝑛 / log log 𝑛

• Increasing 𝑑 does not improve much:
  • Max load is (log log 𝑛) / (log 𝑑) + 𝑂(1)

Reference 1: http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
Reference 2: https://homes.cs.washington.edu/~karlin/papers/balls.pdf
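A short balls-into-bins simulation illustrates the gap between one choice and two (a sketch; 𝑛 and the seed are arbitrary, and the observed max loads vary run to run):

```python
import random

def max_load(n, d, seed=0):
    """Throw n balls into n bins; each ball probes d bins uniformly at
    random and joins the probed bin with the fewest balls. Returns the
    maximum bin load after all n balls are placed."""
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(n):
        probes = [rng.randrange(n) for _ in range(d)]
        bins[min(probes, key=lambda b: bins[b])] += 1
    return max(bins)

n = 100_000
print("d=1 max load:", max_load(n, 1))  # grows like log n / log log n
print("d=2 max load:", max_load(n, 2))  # grows like log log n
```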

Page 23

Per-Task Sampling

• Scheduler chooses 2 random machines; assigns task to least loaded machine

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 24

Per-Task Sampling: Analysis

• Let 𝑑 be the number of machines that are probed

• Probability that per-task sampling assigns all 𝑚 tasks to unloaded machines:
  • (1 − 𝑝^𝑑)^𝑚

• Q: Why not choose a larger 𝑑?

• Problems:
  • Job response time is limited by the longest wait time of any task in the job
  • Sub-optimal placement of tasks
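The improvement over plain random placement follows directly from the formula: a task lands on a loaded machine only if all 𝑑 of its probes do. A quick comparison (𝑝, 𝑑, 𝑚 are illustrative values, not from the paper):

```python
def p_all_unloaded_per_task(p, d, m):
    # Each task probes d machines independently; it is stuck on a loaded
    # machine only if all d probes hit loaded machines (probability p**d).
    return (1 - p ** d) ** m

p, m = 0.8, 10
print("random   (d=1):", (1 - p) ** m)
print("per-task (d=2):", p_all_unloaded_per_task(p, 2, m))
```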

Page 25

Per-Task Sampling: Analysis

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 26

Per-Task Sampling: Results

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 27

Batch Sampling

• Scheduler probes 2𝑚 random machines; assigns 𝑚 tasks to least loaded machines

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 28

Batch Sampling: Analysis

• Probability that batch sampling assigns all 𝑚 tasks to unloaded machines:
  • Equivalent to the probability that ≥ 𝑚 of the 𝑑 ∙ 𝑚 probed machines are unloaded
  • Σ_{𝑖=𝑚}^{𝑑∙𝑚} (𝑑∙𝑚 choose 𝑖) (1 − 𝑝)^𝑖 𝑝^(𝑑∙𝑚−𝑖)

• Problems:
  • Estimating load based on queue length is inaccurate
    • Queue 1 = [ 50 ms, 50 ms, 50 ms ] (longer queue, but only 150 ms of work)
    • Queue 2 = [ 200 ms ] (shorter queue, but 200 ms of work)
  • Multiple schedulers may assign tasks to the same machine
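The binomial-tail formula above can be evaluated directly, and comparing it against the random and per-task expressions shows why pooling the 𝑑 ∙ 𝑚 probes helps (𝑝, 𝑑, 𝑚 below are illustrative values, not from the paper):

```python
from math import comb

def p_batch_success(p, d, m):
    """Probability that at least m of the d*m probed machines are
    unloaded, so every task in the job can be placed on an unloaded
    machine: sum_{i=m}^{d*m} C(d*m, i) (1-p)^i p^(d*m - i)."""
    dm = d * m
    return sum(comb(dm, i) * (1 - p) ** i * p ** (dm - i)
               for i in range(m, dm + 1))

p, d, m = 0.8, 2, 10
print("random  :", (1 - p) ** m)
print("per-task:", (1 - p ** d) ** m)
print("batch   :", p_batch_success(p, d, m))
```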

Page 29

Batch Sampling: Results

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 30

Late Binding

• Scheduler probes 2𝑚 random machines; places a reservation on each probed machine

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 31

Late Binding

• Machine requests task once it reaches front of queue

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 32

Late Binding: Analysis

• Problems:
  • Machines are idle during the RPC to request a task from the scheduler
  • Machines might request tasks from schedulers that have already allocated all tasks in a job

• Solution: Proactive cancellation
  • Upon allocating all tasks in a job, send a cancellation RPC to machines that have pending reservations

• Q: Does Sparrow’s design extend to microsecond-scale tasks?

Page 33

Late Binding: Results

Reference: http://kayousterhout.org/talks/sparrow-sosp-talk.pdf

Page 34

Placement constraints

• Per-job constraints:
  • E.g. job must execute on machines that have a GPU
  • Restrict batch sampling to machines that satisfy the constraint

• Per-task constraints:
  • E.g. task must execute on a machine that has its input data
  • Uses per-task sampling
  • Probed information is shared across tasks:
    • Probe for Task 1: [A loaded, B loaded, C unloaded]
    • Probe for Task 2: [C unloaded, D unloaded, E loaded]
    • Optimal placement?

Page 35

Resource allocation policies

• Strict priorities:
  • Tasks are assigned priorities (e.g. high/low)
  • Sparrow maintains separate high- and low-priority task queues at each machine
  • The high-priority queue is emptied before the low-priority queue

• Weighted fair sharing:
  • Idea from network scheduling
  • Maintain separate per-user queues
  • Each user is assigned a percentage representing their allocated “bandwidth”
    • E.g. 10%, 30%, and 60% to different users
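One way to picture the per-user queues is a worker that samples queues in proportion to weight, which approximates each user's share over many decisions. This is only a hypothetical sketch (the user names, weights, and queue contents are made up, and Sparrow's actual per-machine scheduler may select deterministically rather than by sampling):

```python
import random
from collections import deque

# Made-up example state for a single worker machine: one FIFO queue per
# user, plus the "bandwidth" weight assigned to each user.
queues = {"u1": deque(["u1-t1", "u1-t2"]),
          "u2": deque(["u2-t1"]),
          "u3": deque(["u3-t1"])}
weights = {"u1": 0.1, "u2": 0.3, "u3": 0.6}

def next_task(rng=random):
    """Pick the next task: sample among users with pending tasks,
    weighted by their allocated share, then dequeue from that user."""
    active = [u for u, q in queues.items() if q]
    if not active:
        return None
    user = rng.choices(active, weights=[weights[u] for u in active])[0]
    return queues[user].popleft()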

Page 36

Implementation

• Front-end client converts high-level job descriptions into task specifications
  • Clients and schedulers run on the same machine

• Scheduler assigns tasks to machines

• A local node monitor running on each machine enqueues scheduled tasks

• An executor process on each machine executes tasks

Page 37

Implementation: Fault tolerance

• Schedulers do not maintain persistent state
  • Similar to stateless web server backends

• Clients must send heartbeats to schedulers to detect failure

• Upon failure, the front-end must choose how to handle in-flight tasks
  • Simplest approach: restart all in-flight tasks

• Q: Is this a good design? Is it acceptable to restart in-flight tasks upon scheduler failure?

Page 38

Example: Spark on Sparrow

• Front-end translates functional queries into parallel stages

• Sparrow receives task description and placement constraints

Page 39

Part C: Sparrow evaluation

Page 40

Experimental Setup: TPC-H

• Cluster running on Amazon EC2
  • 100 machines and 10 schedulers
  • 8 cores and 68.4 GB memory per machine

• Performance evaluated using the TPC-H benchmark
  • Representative of ad-hoc queries on business data

• Properties:
  • Cluster utilization fluctuates around 80%
  • Non-uniform task durations (10–100 ms)
  • Mixed constrained/unconstrained scheduling requests

Page 41

Experimental Evaluation: TPC-H

Page 42

Deconstructing Performance

Page 43

How do task constraints affect performance?

Page 44

How do scheduler failures impact job response time?

Page 45

How does Sparrow compare to Spark?

Page 46

How effective is Sparrow’s distributed fairness enforcement?

Page 47

How much can low priority users hurt response times for high priority users?

Page 48

How sensitive is Sparrow to the probe ratio?

Page 49

Conclusion

• Sparrow presents a simple, scalable solution to task scheduling
  • Supports millions of scheduling requests per second
  • Scheduling decisions are made on the order of milliseconds

• Discussion:
  • Q: Suppose the cluster operates at max load (e.g. high job arrival rate). Is Sparrow’s approach still optimal?
  • Q: How could data processing frameworks co-optimize with Sparrow to obtain higher performance?
  • Q: Are there alternative solutions to the straggler problem?