Hot Chips 13, August 2001 1
Tyrant: A High Performance
Storage over IP Switch Engine
Stuart Oberman, Rodney Mullendore, Kamran Malik,
Anil Mehta, Keith Schakel,
Michael Ogrinc, Dane Mrazek
Background: Direct-Attached Storage
Direct-Attached Storage
– Expansion beyond server’s internal drive capacity
– Tape backup
– Often used with Network-Attached Storage (NAS)
– High performance SCSI or Fibre Channel connections
Storage Area Networks – SANs
Storage Area Networks
– Pooling of external storage devices for better utilization and availability
– LAN-free backup
– Serverless backup
– Non-disruptive expansion and maintenance
– Leverage existing staff to manage three or four times more storage
– SAN ROI estimates* range from 65 to 297 percent
*Source: CSFB, June 2001
No Native Fibre Channel Extension
[Figure: FC SANs and FC switches]
Wide area networks are based on IP or
Gigabit Ethernet, not Fibre Channel
Integrated IP SAN, MAN, and WAN Fabric
Unified IP SAN-MAN-WAN Fabric
Optimal routing decisions are made by the integrated IP SAN-MAN-WAN fabric
No more “islands”
Familiar IP and Ethernet management tools
Nishan Systems’ Framework for IP Storage
OPEN
Standards-Based
• Future Interfaces
• SCSI Disk, RAID, Tape
• SCSI Host Bus Adapters
• FC Disk, RAID, Tape
• FC Host Bus Adapters
• FC SANs
• SAN, LAN, MAN, WAN
• IP Routers and Gigabit Ethernet Switches
• 10 Gigabit Ethernet
• Network Attached Storage
Tyrant: An Introduction
• Tyrant: a switch engine used to build many sizes of IP Storage Area Network switches
• Two categories of switches:
– Low density: <= 32 ports
– High density: > 32 ports
• Tyrant integrates hardware to build both categories, supporting:
– Fibre Channel switch ports, 1Gbps and 2Gbps
– Fibre Channel arbitrated loop ports, 1Gbps and 2Gbps
– Gigabit Ethernet
– Switching between all port types
Tyrant: Major Features
• Four 1Gbps Ethernet MACs
• Four 1Gbps / 2Gbps Fibre Channel MACs
– 1Gbps or 2Gbps
– FC-AL and FC-SW
• Four Gigabit network processor interfaces
• Four flexible packet schedulers
• Four programmable encapsulation engines
• Distributed shared-memory controller supporting 2MB SRAM per chip
• Chip-to-chip fabric interface
Tyrant Block Diagram
[Figure: Tyrant block diagram. Four port slices (0-3), each with an FC MAC and a GE MAC (Fibre Channel / Gigabit Ethernet), a network processor interface (NP IF) to the network processors, an encapsulation engine (ENCAP ENG), and a fabric interface (FABRIC IF) to and from other Tyrants for ingress and egress flows. A central memory controller and crossbar connects to SRAM.]
Cut-thru
• Each Tyrant has a side-band cut-thru request / grant interface
• Allows each ingress port to request cut-thru to any egress port
• If cut-thru is granted, packets are switched in distributed fashion, with no writes to SRAM
• First-byte-in to MAC to first-byte-out of MAC latency:
– Less than 2us
– Any port to any port
– FC packets using no external network processing
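The request / grant handshake can be pictured with a small software model. This is only a sketch: the slides do not state the grant condition, so the policy here (grant only when the egress port is idle and has nothing queued in SRAM) is an assumption, and the class and method names are hypothetical.

```python
class CutThruArbiter:
    """Toy model of a cut-thru request/grant interface (grant policy is assumed)."""

    def __init__(self, num_ports):
        # busy_with[e] is the ingress port currently cutting through to egress e, or None.
        self.busy_with = [None] * num_ports
        # queued[e] counts packets already waiting in SRAM for egress e.
        self.queued = [0] * num_ports

    def request(self, ingress, egress):
        """Return True (grant) if the egress port is idle with an empty queue."""
        if self.busy_with[egress] is None and self.queued[egress] == 0:
            self.busy_with[egress] = ingress
            return True
        # Denied: the packet takes the SRAM path instead.
        self.queued[egress] += 1
        return False

    def release(self, egress):
        """Called when the cut-thru packet has fully left the egress port."""
        self.busy_with[egress] = None


arb = CutThruArbiter(num_ports=16)
first = arb.request(ingress=0, egress=5)   # egress 5 idle: granted
second = arb.request(ingress=1, egress=5)  # egress 5 busy: denied, stored to SRAM
arb.release(egress=5)
third = arb.request(ingress=2, egress=5)   # denied: a stored packet is ahead of it
```

In this model a denied request falls back to the SRAM path, which keeps packets for the same egress port in order.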
Early Forwarding
• Reduced switch latency possible even when packet stored to SRAM
• Entire store-and-forward not required
• Store only minimum amount of packet:
– Programmable stored amount based upon link speed difference of ingress and egress ports (GE, FC, or 2G FC)
– Ensures no underrun of egress port
• 4-5us latency possible
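A worked sketch of how the stored amount could depend on the link speeds. If egress starts after B bytes of an L-byte packet are stored, it must not finish before the last byte arrives, which gives B >= L * (1 - R_in / R_out) when the egress link is faster. The data rates below (post-8b/10b, ignoring framing overhead), the `floor` parameter, and the function name are assumptions for illustration, not values from the slides.

```python
# Assumed data rates in Mbit/s after 8b/10b coding; preambles, inter-frame
# gaps and other line overheads are ignored in this sketch.
RATE_MBPS = {"GE": 1000, "FC": 850, "2G FC": 1700}


def min_stored_bytes(packet_len, ingress, egress, floor=64):
    """Bytes to buffer before egress may start without underrunning.

    `floor` is a hypothetical minimum (for example, enough to hold a header).
    """
    r_in, r_out = RATE_MBPS[ingress], RATE_MBPS[egress]
    if r_out <= r_in:
        # Egress is no faster than ingress, so it can never catch up.
        return min(floor, packet_len)
    # Integer ceiling of packet_len * (r_out - r_in) / r_out.
    needed = -(-(packet_len * (r_out - r_in)) // r_out)
    return min(max(needed, floor), packet_len)


ge_to_2gfc = min_stored_bytes(2048, "GE", "2G FC")
fc_to_ge = min_stored_bytes(2048, "FC", "GE")
ge_to_fc = min_stored_bytes(2048, "GE", "FC")
```

Under these assumptions the threshold grows with the speed mismatch, and only the floor is needed when the egress link is the slower one.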
Packet Scheduler
• True output queues at each egress port
– Requires fast and wide shared memory
– Guarantees maximum switching throughput
– Single trip to memory
• 256 unique output queues per port
• Scheduler features:
– Three level hierarchy
– Each level may separately use strict priority or weighted-fair-queuing (WFQ) / weighted-round-robin (WRR)
– Virtual time based scheduling => very low jitter
– Minimum and maximum bandwidth guarantees
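To illustrate what virtual time based scheduling means, here is a minimal sketch of self-clocked fair queuing, a simplified virtual-time scheme. The slides do not say which variant Tyrant implements, so the algorithm choice, class name, and weights are assumptions.

```python
import heapq
from fractions import Fraction


class VirtualTimeScheduler:
    """Self-clocked fair queuing over a set of weighted queues."""

    def __init__(self, weights):
        self.weights = weights  # queue id -> integer weight
        self.last_finish = {q: Fraction(0) for q in weights}
        self.virtual_time = Fraction(0)
        self.heap = []  # entries: (finish_tag, sequence, queue, length)
        self.seq = 0

    def enqueue(self, queue, length):
        # A packet starts after the queue's previous packet, or at the
        # current virtual time if the queue was idle.
        start = max(self.virtual_time, self.last_finish[queue])
        finish = start + Fraction(length, self.weights[queue])
        self.last_finish[queue] = finish
        heapq.heappush(self.heap, (finish, self.seq, queue, length))
        self.seq += 1

    def dequeue(self):
        """Serve the packet with the smallest finish tag, or None if idle."""
        if not self.heap:
            return None
        finish, _, queue, length = heapq.heappop(self.heap)
        self.virtual_time = finish  # virtual time follows the served tag
        return queue, length


sched = VirtualTimeScheduler({"a": 3, "b": 1})
for _ in range(4):
    sched.enqueue("a", 300)
for _ in range(4):
    sched.enqueue("b", 300)
served = [sched.dequeue()[0] for _ in range(8)]
```

With weights 3 and 1, queue "a" receives three of the first four transmissions, and its packets are interleaved with those of "b" rather than sent as one burst, which is the low-jitter property the slide refers to.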
Scheduler Organization
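[Figure: scheduler organization. Groups of 8 queues feed L1 schedulers (32 L1 schedulers in total); L1 outputs pass through REG stages to L2 schedulers, whose outputs pass through REG stages to an L3 scheduler that produces the selected queue.]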
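The three-level selection can be sketched in software using the slide's numbers (8 queues per L1 scheduler, 32 L1 schedulers, 256 queues). The L2 fan-in is not given in the slides, so `L1_PER_L2 = 8` is an assumption, as are the function names. Plain round-robin stands in for WRR with equal weights.

```python
QUEUES_PER_L1 = 8   # from the slide: 8 queues per L1 scheduler
NUM_L1 = 32         # from the slide: 32 L1 schedulers (8 * 32 = 256 queues)
L1_PER_L2 = 8       # assumption: the slides do not give the L2 fan-in
NUM_L2 = NUM_L1 // L1_PER_L2
QUEUES_PER_L2 = QUEUES_PER_L1 * L1_PER_L2


def strict_priority(candidates):
    """Pick the lowest-numbered eligible input (assumed highest priority)."""
    return min(candidates)


class RoundRobin:
    """Unweighted round-robin over `size` inputs."""

    def __init__(self, size):
        self.size = size
        self.next = 0

    def __call__(self, candidates):
        for offset in range(self.size):
            idx = (self.next + offset) % self.size
            if idx in candidates:
                self.next = (idx + 1) % self.size
                return idx
        raise ValueError("no eligible input")


def select_queue(nonempty, l1_policy, l2_policy, l3_policy):
    """Return the queue id (0-255) chosen by the hierarchy, or None if idle."""
    if not nonempty:
        return None
    # L3 chooses among L2 schedulers that have a backlogged queue.
    l2 = l3_policy({q // QUEUES_PER_L2 for q in nonempty})
    in_l2 = {q for q in nonempty if q // QUEUES_PER_L2 == l2}
    # L2 chooses among its L1 schedulers (local index 0-7).
    l1_local = l2_policy({(q // QUEUES_PER_L1) % L1_PER_L2 for q in in_l2})
    l1 = l2 * L1_PER_L2 + l1_local
    # L1 chooses among its 8 queues (local index 0-7).
    slot = l1_policy({q % QUEUES_PER_L1 for q in in_l2 if q // QUEUES_PER_L1 == l1})
    return l1 * QUEUES_PER_L1 + slot


backlog = {5, 70, 200}
l3 = RoundRobin(NUM_L2)
picks = [select_queue(backlog, strict_priority, strict_priority, l3) for _ in range(3)]
all_strict = select_queue(backlog, strict_priority, strict_priority, strict_priority)
```

The sketch evaluates the levels top-down for brevity; hardware would more likely let each L1 scheduler propose a queue and pass winners upward, which yields the same choice for these two policies.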
Full-Duplex FC-AL
• FC arbitrated loop supports 126 devices
• Loop operates either half-duplex or full-duplex
• In full-duplex configuration, target device may “open” switch port to send it data
• If switch port can reorganize its output data while receiving target device’s data, bandwidth is maximized
Optimized Full-Duplex FC-AL Performance
• Bin packets destined for different target devices using one output queue per loop device
• Separate packet scheduler interface allows FC MAC to directly select particular queue for transmission
• Switch port opened by loop device:
– MAC “pulls” packet from correct queue
– No head-of-line blocking
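The binning idea can be shown with a short sketch: one output queue per loop device, and a pull operation used when a device opens the switch port. The class, method names, and device addresses are hypothetical; only the one-queue-per-device structure comes from the slide.

```python
from collections import deque


class LoopPortQueues:
    """One output queue per loop device on an FC-AL switch port."""

    def __init__(self):
        self.queues = {}  # loop device address -> deque of packets

    def enqueue(self, device, packet):
        self.queues.setdefault(device, deque()).append(packet)

    def pull_for(self, device):
        """The device has opened the port: pull its next packet, if any."""
        q = self.queues.get(device)
        return q.popleft() if q else None


port = LoopPortQueues()
port.enqueue(0x01, "A1")
port.enqueue(0x02, "B1")
port.enqueue(0x01, "A2")

# Device 0x02 opens the switch port. Its packet is sent immediately, even
# though "A1" arrived earlier and is still waiting for device 0x01.
sent = port.pull_for(0x02)
```

With a single FIFO per port, "B1" would have waited behind "A1"; per-device queues remove that head-of-line blocking.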
Programmable Encapsulation Engine
• Programmable encapsulation engine per egress port used to implement IP storage protocols
– Allows flexibility for evolving IP storage standards
• Network processor on ingress FC port adds Encapsulation Instructions to header of packet
• Encapsulation engine at egress IP port executes microcode to form encapsulated IP storage packets
• Encapsulation engine performs:
– IP address lookups
– Checksum and CRC computations
– GE and IP header creation
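As an example of the checksum work involved in IP header creation, the sketch below computes the standard IPv4 header checksum (the 16-bit one's-complement sum described in RFC 1071). It is a software illustration only; the slides do not show Tyrant's microcode, and the sample header bytes are made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# A 20-byte IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)

# Insert the checksum; a receiver recomputing over the full header gets zero.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
```

Because the checksum covers only the header, it can be computed as soon as the header fields are known, independent of the payload length.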
Application Example 1: 16-port Switch
[Figure: 16-port switch built from four Tyrants (Tyrant0 to Tyrant3) serving ports 0-15, each Tyrant with SRAM and a network processor (NP). Each port may be configured to be GE, 1G FC, or 2G FC.]
Application Example 2: Chassis Line Card
[Figure: chassis line card with four Tyrants (Tyrant0 to Tyrant3) serving ports 0-7, each Tyrant with SRAM and a network processor (NP). Each port may be configured to be GE, 1G FC, or 2G FC. A generic fabric interface connects through an external fabric transceiver to a 10+ Gbps pipe across the backplane to the fabric card, using high-speed serial links.]
Physical Details
• IBM 0.18um ASIC process: SA-27E
• 25 million transistors
• 15mm x 15mm die size
• 928 signal pins
• Nominal power consumption: 16W
• Three major clock domains
– 125 MHz
– 104 MHz
– 53.125 MHz
• Implementation status: internal testing
Chip Layout
Tyrant Photo