Data Center Switch Architecture in the Age of Merchant Silicon

Nathan Farrington, Erik Rubow, Amin Vahdat

The Network is a Bottleneck

• HTTP request amplification
– Web search (e.g. Google)
– Small object retrieval (e.g. Facebook)
– Web services (e.g. Amazon.com)

• MapReduce-style parallel computation
– Inverted search index
– Data analytics

• Need high-performance interconnects

Hot Interconnects August 27, 2009

Nathan Farrington

The Network is Expensive

Figure: Rack 1 through Rack N, each with 40 1U servers under a 48xGbE TOR switch. Link labels in the diagram: 8xGbE and 10GbE.

What we really need: One Big Switch

• Commodity

• Plug-and-play

• Potentially no oversubscription

Figure: Rack 1, Rack 2, Rack 3, ... Rack N.

Why not just use a fat tree of commodity TOR switches?

M. Al-Fares, A. Loukissas, A. Vahdat. A Scalable, Commodity Data Center Network Architecture. In SIGCOMM ’08.

Figure: fat tree with k=4, n=3.

10 Tons of Cable

• 55,296 Cat-6 cables

• 1,128 separate cable bundles

The “Yellow Wall”
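The slide does not say which network the cable count describes. One reading, offered here as an assumption, is a three-tier fat tree built from 48-port GbE TOR switches, counting only switch-to-switch cables. The sketch below checks that arithmetic; `inter_switch_links` is an illustrative name, not something from the talk.

```python
def inter_switch_links(k: int) -> int:
    """Switch-to-switch links in a three-tier fat tree of k-port switches.

    Assumes the usual construction: k pods, each with k/2 edge switches and
    k/2 aggregation switches, and every switch using k/2 ports as uplinks.
    """
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even port count")
    half = k // 2
    edge_to_aggregation = k * half * half  # k pods x k/2 edge switches x k/2 uplinks
    aggregation_to_core = k * half * half  # k pods x k/2 aggregation switches x k/2 uplinks
    return edge_to_aggregation + aggregation_to_core


# Assumed port count of 48; the result matches the cable count on the slide.
print(inter_switch_links(48))  # 55296
```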


Merchant Silicon gives us Commodity Switches

Maker     Broadcom   Fulcrum      Fujitsu
Model     BCM56820   FM4224       MB86C69RBC
Ports     24         24           26
Cost      NDA        NDA          $410
Power     NDA        20 W         22 W
Latency   < 1 μs     300 ns       300 ns
Area      NDA        40 x 40 mm   35 x 35 mm
SRAM      NDA        2 MB         2.9 MB
Process   65 nm      130 nm       90 nm


Eliminate Redundancy

• Networks of packet switches contain many redundant components
– chassis, power conditioning circuits, cooling
– CPUs, DRAM

• Repackage these discrete switches to lower the cost and power consumption

Figure: a discrete switch with its own CPU, ASIC, PHY, SFP+ ports, four fans, and PSU. Label: 8 Ports.

Our Architecture, in a Nutshell

• Fat tree of merchant silicon switch ASICs

• Hiding cabling complexity with PCB traces and optics

• Partition into multiple pod switches + single core switch array

• Custom EEP ASIC to further reduce cost and power

• Scales to 65,536 ports when 64-port ASICs become available, late 2009


3 Different Designs

• 24-ary 3-tree

• 720 switch ASICs

• 3,456 ports of 10GbE

• No oversubscription
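The switch and port counts above follow from the standard three-tier fat tree construction in the SIGCOMM ’08 paper cited earlier. A minimal sketch, assuming every switch ASIC has k ports; the function name `fat_tree_size` is illustrative and does not come from the talk.

```python
def fat_tree_size(k: int) -> dict:
    """Size of a three-tier fat tree built from identical k-port switches.

    Assumes k pods, each with k/2 edge and k/2 aggregation switches,
    plus (k/2)^2 core switches.
    """
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even port count")
    half = k // 2
    edge = k * half          # k pods x k/2 edge switches
    aggregation = k * half   # k pods x k/2 aggregation switches
    core = half * half
    return {
        "switches": edge + aggregation + core,  # equals 5k^2/4
        "host_ports": edge * half,              # equals k^3/4
    }


print(fat_tree_size(24))  # {'switches': 720, 'host_ports': 3456}
print(fat_tree_size(64))  # {'switches': 5120, 'host_ports': 65536}
```

With k = 24 this reproduces the 720 ASICs and 3,456 ports on this slide, and with k = 64 it gives the 65,536 ports quoted for 64-port ASICs.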


Network 1: No Engineering Required

• 720 discrete packet switches, connected with optical fiber

Cost of Parts: $4.88M
Power: 52.7 kW
Cabling Complexity: 3,456
Footprint: 720 RU
NRE: $0

Cabling complexity (noun): the number of long cables in a data center network.

Network 2: Custom Boards and Chassis

• 24 “pod” switches, one core switch array, 96 cables

Cost of Parts: $3.07M
Power: 41.0 kW
Cabling Complexity: 96
Footprint: 192 RU
NRE: $3M (est.)

This design is shown in more detail later.

Switch at 10G, but Transmit at 40G

             SFP      SFP+      QSFP
Rate         1 Gb/s   10 Gb/s   40 Gb/s
Cost/Gb/s    $35*     $25*      $15*
Power/Gb/s   500 mW   150 mW    60 mW

* 2008-2009 Prices
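To see what the per-Gb/s figures mean for a single 40 Gb/s path, the sketch below multiplies them out per module. It uses only the numbers in the table; the helper `cost_and_power` and the whole-module rounding are illustrative assumptions.

```python
# (rate in Gb/s, cost in $ per Gb/s, power in mW per Gb/s), copied from the table.
TRANSCEIVERS = {
    "SFP": (1, 35, 500),
    "SFP+": (10, 25, 150),
    "QSFP": (40, 15, 60),
}


def cost_and_power(kind: str, gbps: int) -> tuple[int, float]:
    """Return (dollars, watts) to carry `gbps` using whole modules of one kind."""
    rate, dollars_per_gbps, mw_per_gbps = TRANSCEIVERS[kind]
    modules = -(-gbps // rate)  # ceiling division: partial modules do not exist
    dollars = modules * rate * dollars_per_gbps
    watts = modules * rate * mw_per_gbps / 1000
    return dollars, watts


for kind in TRANSCEIVERS:
    print(kind, cost_and_power(kind, 40))
# SFP (1400, 20.0)
# SFP+ (1000, 6.0)
# QSFP (600, 2.4)
```

On these numbers one QSFP carries 40 Gb/s for $600 and 2.4 W, against $1,000 and 6 W for four SFP+ modules.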


Network 3: Network 2 + Custom ASIC

• Uses 40GbE between pod switches and core switch array; everything else is the same as Network 2.

Cost of Parts: $2.33M
Power: 36.4 kW
Cabling Complexity: 96
Footprint: 114 RU
NRE: $8M (est.)

EEP: This simple ASIC provides tremendous cost and power savings.
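The three designs can be compared per port, and the NRE estimates can be weighed against the parts savings. The sketch below uses only the figures from the three Network slides. The break-even calculation is an added illustration, not an analysis from the talk: it ignores power bills, assembly, and volume pricing, and the NRE values are themselves estimates.

```python
import math

PORTS = 3456  # 10GbE ports in every design

# Parts cost and NRE in millions of dollars, power in kW, from the slides.
NETWORKS = {
    "Network 1": {"parts": 4.88, "power": 52.7, "nre": 0.0},
    "Network 2": {"parts": 3.07, "power": 41.0, "nre": 3.0},
    "Network 3": {"parts": 2.33, "power": 36.4, "nre": 8.0},
}


def per_port(name: str) -> tuple[int, float]:
    """Return (dollars per port, watts per port), rounded for display."""
    network = NETWORKS[name]
    dollars = network["parts"] * 1e6 / PORTS
    watts = network["power"] * 1e3 / PORTS
    return round(dollars), round(watts, 1)


def break_even_units(name: str, baseline: str = "Network 1") -> int:
    """Smallest number of switches built at which parts savings cover the NRE."""
    saving = NETWORKS[baseline]["parts"] - NETWORKS[name]["parts"]
    return math.ceil(NETWORKS[name]["nre"] / saving)


for name in NETWORKS:
    print(name, per_port(name))
print(break_even_units("Network 2"), break_even_units("Network 3"))
```

Under these simplifications Network 2 recovers its estimated NRE by the second switch built and Network 3 by the fourth.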


Cost of Parts

Bar chart, cost of parts (in millions of dollars):
Network 1: 4.88
Network 2: 3.07
Network 3: 2.33

Power Consumption

Bar chart, power consumption (kW):
Network 1: 52.7
Network 2: 41.0
Network 3: 36.4

Cabling Complexity

Bar chart, cabling complexity:
Network 1: 3,456
Network 2: 96
Network 3: 96

Footprint

Bar chart, footprint (in rack units):
Network 1: 720
Network 2: 192
Network 3: 114

Partially Deployed Switch


Fully Deployed Switch


Pod Switch


Logical Topology


Pod Switch Line Card


Pod Switch Uplink Card


Core Switch Array Card


Why an Ethernet Extension Protocol?

• Optical transceivers are 80% of the cost

• EEP allows the use of fewer and faster optical transceivers

Figure: two EEP chips joined by a 40GbE link; the diagram also carries eight 10GbE link labels.

How does EEP work?

• Ethernet frames are split up into EEP frames

• Most EEP frames are 65 bytes
– Header is 1 byte; payload is 64 bytes

• Header encodes ingress/egress port
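The splitting step can be sketched as follows. This is a simplified model, not the ASIC's logic: each EEP frame is represented as a tuple, the start and end markers correspond to the SOF and EOF fields named in the talk's frame format slide, and the names `segment`, `EEP_PAYLOAD`, and `link_id` are illustrative.

```python
EEP_PAYLOAD = 64  # payload bytes in a full EEP frame, per the slide


def segment(frame: bytes, link_id: int) -> list[tuple[int, bool, bool, bytes]]:
    """Split one Ethernet frame into (link_id, sof, eof, payload) tuples."""
    if not frame:
        raise ValueError("cannot segment an empty frame")
    chunks = [frame[i:i + EEP_PAYLOAD] for i in range(0, len(frame), EEP_PAYLOAD)]
    last = len(chunks) - 1
    # The first chunk starts the Ethernet frame; the last chunk ends it.
    return [(link_id, i == 0, i == last, chunk) for i, chunk in enumerate(chunks)]


segments = segment(bytes(1500), link_id=3)
print(len(segments))         # 24
print(len(segments[-1][3]))  # 28
```

A 1,500-byte frame becomes 23 full EEP frames plus one short final frame, which is why the slide says "most" EEP frames are 65 bytes.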


How does EEP work?

• Round-robin arbiter

• EEP frames are transmitted as one large Ethernet frame

• 40GbE overclocked by 1.6%
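One way to read the 1.6% figure, offered as an inference rather than something the slide states: each full EEP frame adds 1 header byte to 64 payload bytes, so carrying four fully loaded 10GbE links needs slightly more than 40 Gb/s. The sketch ignores short final EEP frames and the framing of the enclosing large Ethernet frame.

```python
HEADER_BYTES = 1    # per full EEP frame
PAYLOAD_BYTES = 64  # per full EEP frame

overhead = HEADER_BYTES / PAYLOAD_BYTES   # extra bytes per payload byte
required_gbps = 4 * 10 * (1 + overhead)   # four 10GbE links plus EEP headers

print(f"{overhead:.2%} overhead, {required_gbps} Gb/s")  # 1.56% overhead, 40.625 Gb/s
```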


Animation frames labeled "Ethernet Frames" and "EEP Frames", with numbered frames moving between two EEP chips.

Sticky note (nfarring): This slide shows an animation of EEP frames being selected round-robin by the EEP chip on the left, transmitted to the EEP chip on the right, and then being reassembled.
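The round-robin selection and the reassembly on the far side can be sketched in software. This is an illustrative model only: strings stand in for EEP frames, and `round_robin` and `reassemble` are hypothetical names, not part of the EEP design.

```python
from collections import deque


def round_robin(queues: dict) -> "list | iter":
    """Yield (link_id, item) by visiting the non-empty per-link queues in turn."""
    pending = {link: deque(items) for link, items in queues.items()}
    while any(pending.values()):  # an empty deque is falsy
        for link in sorted(pending):
            if pending[link]:
                yield link, pending[link].popleft()


def reassemble(stream) -> dict:
    """Group items back per link, preserving their order on the wire."""
    per_link = {}
    for link, item in stream:
        per_link.setdefault(link, []).append(item)
    return per_link


queues = {1: ["a1", "a2", "a3"], 2: ["b1"], 3: ["c1", "c2"]}
wire = list(round_robin(queues))
print(wire)
# [(1, 'a1'), (2, 'b1'), (3, 'c1'), (1, 'a2'), (3, 'c2'), (1, 'a3')]
print(reassemble(wire) == queues)  # True
```

Because every item carries its link number, the receiver can restore each link's frames in order even though the links are interleaved on the shared 40GbE path.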

EEP Frame Format

SOF: Start of Ethernet Frame

EOF: End of Ethernet Frame

LEN: Set if EEP Frame contains less than 64B of payload

Virtual Link ID: Corresponds to port number (0-15)

Payload Length: (0-63B)
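The fields above can be packed into a header in many ways, and the slide text that survives does not give the bit positions. The sketch below picks one layout purely as an assumption: SOF, EOF, and LEN in the top three bits, the virtual link ID in the low four bits, and a second length byte present only when LEN is set.

```python
# Hypothetical bit positions; the real layout is not given in the text.
SOF, EOF, LEN = 0x80, 0x40, 0x20
LINK_MASK = 0x0F  # virtual link ID, 0-15
FULL_PAYLOAD = 64


def encode(link_id: int, sof: bool, eof: bool, payload: bytes) -> bytes:
    """Build one EEP frame: header byte, optional length byte, payload."""
    if not 0 <= link_id <= 15:
        raise ValueError("virtual link ID must be in 0-15")
    if len(payload) > FULL_PAYLOAD:
        raise ValueError("payload is at most 64 bytes")
    short = len(payload) < FULL_PAYLOAD
    flags = (SOF if sof else 0) | (EOF if eof else 0) | (LEN if short else 0)
    header = bytes([flags | link_id])
    if short:
        header += bytes([len(payload)])  # assumed extra byte holding 0-63
    return header + payload


def decode(frame: bytes) -> tuple[int, bool, bool, bytes]:
    """Inverse of encode: return (link_id, sof, eof, payload)."""
    first = frame[0]
    if first & LEN:
        length = frame[1]
        payload = frame[2:2 + length]
    else:
        payload = frame[1:1 + FULL_PAYLOAD]
    return first & LINK_MASK, bool(first & SOF), bool(first & EOF), payload


full = encode(5, True, False, bytes(64))
print(len(full))  # 65
```

A full frame encodes to 65 bytes under this layout, which agrees with the 1-byte header and 64-byte payload stated earlier in the talk.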


Why not use VLANs?

• Because it adds latency and requires more SRAM

• FPGA Implementation
– VLAN tagging
– EEP


Latency Measurements


Related Work

• M. Al-Fares, A. Loukissas, A. Vahdat. A Scalable, Commodity Data Center Network Architecture. In SIGCOMM ’08.

– Fat trees of commodity switches, Layer 3 routing, flow scheduling

• R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric. In SIGCOMM ’09.

– Layer 2 routing, plug-and-play configuration, fault tolerance, switch software modifications

• A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: A Scalable and Flexible Data Center Network. In SIGCOMM ’09.

– Layer 2 routing, end-host modifications


Conclusion

• General architecture

– Fat tree of merchant silicon switch ASICs

– Hiding cabling complexity

– Pods + Core

– Custom EEP ASIC

– Scales to 65,536 ports with 64-port ASICs

• Design of a 3,456-port 10GbE switch

• Design of the EEP ASIC
