
UNIVERSITY OF ELECTRONIC SCIENCE & TECHNOLOGY OF CHINA IEEE INFOCOM 2015, Hong Kong

RAPIER: Integrating Routing and Scheduling for Coflow-aware Data Center Networks

Yangming Zhao (UESTC), Kai Chen (HKUST), Wei Bai (HKUST), Minlan Yu (USC), Chen Tian (HUST), Yanhui Geng (Huawei), Yiming Zhang (NUDT), Dan Li (Tsinghua), Sheng Wang (UESTC)

[email protected]


Coflow-aware Traffic Optimization

• Why traffic optimization in data center networks?
  – Improve traffic scalability
  – Improve QoS

• Why coflow-aware?
  – Minimize average flow completion time
  – Minimize average coflow completion time

• How to optimize network traffic?
  – Routing (Hedera, Micro-TE)
  – Scheduling (Varys, Baraat, pFabric)

In cluster computing frameworks, a stage cannot complete, or sometimes even start, before it receives all the flows in a coflow from the previous stage

An individual flow can be treated as a special coflow

Why not joint optimization?


Motivation Example

Two coflows: Coflow a: fa1=40Mb, fa2=100Mb; Coflow b: fb1=60Mb, fb2=100Mb

Link bandwidths are all 100Mbps

Case 1: ECMP + Scheduling

Traffic imbalance may occur due to the route collisions caused by ECMP

Average CCT=1.5s


Motivation Example

Two coflows: Coflow a: fa1=40Mb, fa2=100Mb; Coflow b: fb1=60Mb, fb2=100Mb

Link bandwidths are all 100Mbps

Case 2: Coflow-agnostic load balancing + Scheduling

Average CCT=1.5s

Considering routing and scheduling separately cannot optimize the average CCT

Routing should also take the dependence among the flows of a coflow into account


Motivation Example

Two coflows: Coflow a: fa1=40Mb, fa2=100Mb; Coflow b: fb1=60Mb, fb2=100Mb

Link bandwidths are all 100Mbps

Case 3: Coflow-aware routing + scheduling

Average CCT=1.3s

Jointly optimizing routing and scheduling can minimize the average CCT
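The arithmetic behind the three cases can be checked with a small sketch. The slide figures are not preserved here, so the model below is an assumption: two parallel 100 Mbps paths (`p1`, `p2`, hypothetical names), each serving its flows one at a time, with coflow a always scheduled before coflow b. The three placements are illustrative choices that happen to reproduce the averages quoted on the slides; they are not taken from the original figures. With sizes in Mb and links in Mbps, the times come out in seconds.

```python
# Hypothetical simplified model: two parallel 100 Mbps paths, each serving
# its flows sequentially, with coflow a prioritized over coflow b.
LINK_MBPS = 100.0
FLOWS = {  # flow name -> (coflow, size in Mb); sizes are from the slide
    "fa1": ("a", 40.0), "fa2": ("a", 100.0),
    "fb1": ("b", 60.0), "fb2": ("b", 100.0),
}

def average_cct(placement, priority="ab"):
    """Average coflow completion time (seconds) for a path -> flows placement."""
    cct = {}
    for flows in placement.values():
        clock = 0.0
        # Serve the flows on this path in coflow-priority order.
        for name in sorted(flows, key=lambda f: priority.index(FLOWS[f][0])):
            coflow, size_mb = FLOWS[name]
            clock += size_mb / LINK_MBPS
            # A coflow completes when its last flow completes.
            cct[coflow] = max(cct.get(coflow, 0.0), clock)
    return sum(cct.values()) / len(cct)

# Assumed placements (illustrative, not from the original figures).
ecmp_collision = {"p1": ["fa2", "fb2"], "p2": ["fa1", "fb1"]}  # large flows collide
load_balanced = {"p1": ["fa1", "fa2"], "p2": ["fb1", "fb2"]}   # 140 Mb vs 160 Mb
coflow_aware = {"p1": ["fa2", "fb1"], "p2": ["fa1", "fb2"]}    # spreads each coflow

for label, placement in [("ECMP collision", ecmp_collision),
                         ("load balanced", load_balanced),
                         ("coflow-aware", coflow_aware)]:
    print(f"{label}: average CCT = {average_cct(placement):.1f} s")
```

In this model the load-balanced placement evens out the bytes per path but keeps both flows of each coflow on the same path, which is why it does no better than the collision case.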


Desirable Properties of RAPIER


Main idea

• Coflow-level Routing
  – Distribute all the flows in a coflow evenly in the network

• Coflow-level Scheduling
  – Minimal remaining time first principle

• Starvation-free
  – Schedule a coflow first if it has been waiting for a long time

• Work-conserving
  – Allocate all available bandwidth whenever there is demand to serve

• Coexistence
  – Route mice flows with ECMP and give them the highest priority
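The interplay between the minimal-remaining-time-first and starvation-free properties can be sketched as a simple ordering rule. This is only an illustration under assumptions: the slides do not say how "waiting for a long time" is measured, so the `starvation_threshold` parameter, the `Coflow` fields, and the tie-breaking are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Coflow:
    name: str
    remaining_time: float  # estimated time still needed to finish (s)
    waiting_time: float    # time spent waiting without service (s)

def schedule_order(coflows, starvation_threshold):
    """Order coflows: long-waiting ones first, then minimal remaining time first.

    starvation_threshold is a hypothetical knob, not a value from the slides.
    """
    starving = [c for c in coflows if c.waiting_time >= starvation_threshold]
    normal = [c for c in coflows if c.waiting_time < starvation_threshold]
    # Among starving coflows, serve the longest-waiting one first.
    starving.sort(key=lambda c: c.waiting_time, reverse=True)
    # Everything else follows minimal remaining time first.
    normal.sort(key=lambda c: c.remaining_time)
    return starving + normal

coflows = [
    Coflow("A", remaining_time=5.0, waiting_time=1.0),
    Coflow("B", remaining_time=2.0, waiting_time=0.5),
    Coflow("C", remaining_time=9.0, waiting_time=30.0),
]
order = [c.name for c in schedule_order(coflows, starvation_threshold=10.0)]
print(order)
```

Coflow C has the largest remaining time, yet it is served first because it has waited past the assumed threshold; B and A then follow in order of remaining time.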


RAPIER in a Nutshell


For starvation-free

For minimal remaining time first

For work-conserving


Minimize single coflow completion time

Non-linear problem with integer variables

→ Let $a_i = 1/t_i$: still non-linear with integer variables

→ Relax the integer constraint: non-linear without integer variables

→ Let $m^k_{ij} = a_i x^k_{ij}$: linear programming

→ Route demand i to j on the path with the largest x and re-solve (2)
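The rounding step on this slide (route each demand on the path with the largest x) can be sketched as follows. The function name, the demand keys, and the fractional values are hypothetical; in practice the fractions would come from the LP solution, which is not reproduced here.

```python
def round_to_largest_share(fractional_x):
    """Pick, for every demand, the candidate path with the largest fraction x.

    fractional_x maps demand -> {path: x}, where the x values of one demand
    are assumed to be non-negative and to sum to 1 (the relaxed LP solution).
    """
    chosen = {}
    for demand, shares in fractional_x.items():
        best_path = max(shares, key=shares.get)
        # With K candidate paths whose shares sum to 1, the largest is >= 1/K.
        assert shares[best_path] >= 1.0 / len(shares) - 1e-9
        chosen[demand] = best_path
    return chosen

# Illustrative fractional solution (made-up values, not from the slides).
fractional_x = {
    ("h1", "h2"): {"path0": 0.7, "path1": 0.3},
    ("h3", "h4"): {"path0": 0.2, "path1": 0.5, "path2": 0.3},
}
routes = round_to_largest_share(fractional_x)
print(routes)
```

Once every demand is pinned to a single path, the routing variables are fixed and the remaining problem can be solved again for the bandwidth allocation, as the slide indicates.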


Relaxation and Rounding


Problem (2)

Problem (4)

Theorem 1: Let $t_{min}$ be the minimum CCT and $t_{alg}$ the CCT obtained by Algorithm 2. Then

$t_{alg} \le K \, t_{min}$

where K is the number of candidate paths for each flow.


Bandwidth Allocation


Large coflow first for starvation-free

Large flow first to reduce CCT


Implementation

• Central controller
  – Algorithm 1

• End host enforcement modules
  – OpenFlow based explicit routing
  – Bandwidth enforcement


No device modification is required!!


Experiment on Testbed

• Pronto 3295 48-port Gigabit Ethernet switch running PicOS 2.04

• Each server has a 4-core Intel E5-1410 2.8GHz CPU, 8GB of memory, a 500GB hard disk, and 1 Gigabit Ethernet NICs

• The servers run 64-bit Debian 6.0 with the Linux 2.6.38.3 kernel


Experiment Results


Coflow ID  Flow ID  Source  Destination  Volume (GB)  Coflow Completion Time (s)
                                                      RAPIER  Routing  Baseline
1          1        M1      M4           3.17         50.6    84.1     107.1
           2        M2      M5           5.29
           3        M3      M9           5.29
2          4        M8      M6           10.6         100.9   203.0    289.5
           5        M6      M5           5.29
3          6        M7      M4           17.9         201.1   204.1    289.2
           7        M9      M6           10.6
Average completion time                               117.5   163.7    228.6

RAPIER reduces the average CCT by 48.6% compared to the baseline scheme and by 28.22% compared to the routing-only scheme.
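The averages and percentage reductions reported for the testbed can be re-derived from the per-coflow completion times. The sketch below uses only the numbers from the table; the variable and function names are illustrative. The quoted percentages match when computed from the averages rounded to one decimal place, so the code rounds first.

```python
# Per-coflow completion times (s) from the testbed table, coflows 1 to 3.
cct = {
    "RAPIER": [50.6, 100.9, 201.1],
    "Routing": [84.1, 203.0, 204.1],
    "Baseline": [107.1, 289.5, 289.2],
}

# Average CCT per scheme, rounded to one decimal as in the table.
avg = {scheme: round(sum(times) / len(times), 1) for scheme, times in cct.items()}

def reduction_pct(new, old):
    """Relative reduction of `new` with respect to `old`, in percent."""
    return 100.0 * (old - new) / old

vs_baseline = round(reduction_pct(avg["RAPIER"], avg["Baseline"]), 2)
vs_routing = round(reduction_pct(avg["RAPIER"], avg["Routing"]), 2)

print(avg)
print(f"vs baseline: {vs_baseline}%  vs routing-only: {vs_routing}%")
```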


Simulation Settings

• C/C++ based flow level simulator

• CPLEX 10.0 for solving LP

• Fattree and VL2 topologies with 512 servers

• Flows in a coflow arrive simultaneously

• Coflow arrivals follow a Poisson process
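A workload with these arrival properties could be generated along the following lines. This is a sketch, not the authors' simulator (which the slides describe as C/C++ based): Poisson arrivals are produced by drawing exponentially distributed gaps between consecutive coflows, and the rate, coflow count, and seed are hypothetical parameters. All flows of a coflow would share that coflow's arrival time.

```python
import random

def poisson_arrivals(rate_per_s, num_coflows, seed=0):
    """Return coflow arrival times (s) for a Poisson process with the given rate.

    Gaps between consecutive arrivals are exponential with mean 1 / rate_per_s.
    The parameters are illustrative, not settings from the slides.
    """
    rng = random.Random(seed)
    now = 0.0
    arrivals = []
    for _ in range(num_coflows):
        now += rng.expovariate(rate_per_s)
        arrivals.append(now)
    return arrivals

arrivals = poisson_arrivals(rate_per_s=2.0, num_coflows=5)
print([round(t, 3) for t in arrivals])
```

Lowering `rate_per_s` lengthens the average inter-coflow arrival interval, which is the knob varied in the arrival-interval experiment.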


Impact of coflow width

• RAPIER reduces the average CCT by up to 79.44% in Fattree and 55.55% in VL2.

• Routing-only scheme performs better when coflow width is small.

• Scheduling-only scheme performs better when coflow width is large.


Impact of coflow number

• RAPIER keeps relatively stable performance as the number of coflows varies.

• Scheduling-only scheme is more effective in VL2 than in Fattree


Impact of inter-coflow arrival interval

• The average CCT decreases as the average inter-coflow arrival interval increases

• RAPIER follows the same trend as the scheduling-only scheme when the inter-coflow arrival interval is small

• RAPIER follows the same trend as the routing-only scheme when the inter-coflow arrival interval is large


Simulation Results Summary

• In light-load scenarios, routing contributes more by resolving the flow path collisions caused by ECMP.

• In heavy-load scenarios, scheduling contributes more by determining the sending order of flows/coflows.

• RAPIER integrates both schemes and gets the benefits of each.


Conclusion

• RAPIER is a system that optimizes the average coflow completion time in DCNs by integrating routing and scheduling.

• RAPIER follows the minimal remaining time first principle to reduce the average coflow completion time.

• We implemented a prototype of RAPIER.

• Simulation results show that RAPIER can greatly reduce the average coflow completion time in DCNs.


• The end!

• Thanks for your attention!
