Dragon Slayer Consulting

Introduction to the Value Proposition of InfiniBand

Marc Staimer, [email protected], (503) 579-3763

5/27/2002

Introduction to InfiniBand (IB) Agenda

IB defined
IB vs. FC & GbE
IB architecture
Real market problems IB solves
Market projections
Conclusions

Definition of Input/Output

“The transfer of data into and out of a computer”
Maintain data integrity
Protect all other data in the computer from corruption
Through the use of Operating System defined mechanisms
  Usually

Three (3) Distinct Classes of I/O

Block protocol
  Typically disk oriented
Network protocol
  Typically IP oriented
Inter-Process Communication
  IPC

Characteristics of I/O Classes

|                      | Block Protocol               | Network Protocol     | IPC                        |
| Latency Tolerance    | Dozens of milliseconds       | 100s of milliseconds | Dozens of microseconds     |
| Avg Message Size     | Very large                   | Small to large       | Small to large             |
| Context              | Data center/campus FC        | Global               | Server cluster/data center |
| Predominant Protocol | Fibre Channel Protocol (FCP) | Ethernet / TCP/IP    | Emerging - VI              |

IB Defined

The 1st unified, simplified, & consolidated I/O Fabric
Designed from the ground up for all aspects of I/O
Shared memory vs. shared bus
Leverages virtual lanes or pipes (multiple fabrics in one)
Spec’d for today & tomorrow
  1x = 2.5Gbps
  4x = 10Gbps
  12x = 30Gbps
Native VI protocol
  OS bypass
Credit based flow control
Key: extends server I/O outside the box

[Figure: InfiniBand (VI Protocol) with Virtual Lanes. Labels: FC, GbE, SAS, SATA, UltraSCSI]
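
Credit based flow control can be illustrated with a small simulation. This is a minimal sketch and not the InfiniBand link protocol itself: the class name `CreditedLink`, the method names, and the buffer size are hypothetical. The idea shown is only that the sender holds one credit per free receive slot and transmits only while it has credits, so the receive buffer cannot overflow and nothing has to be dropped.

```python
from collections import deque


class CreditedLink:
    """Toy model of a link whose sender may transmit only while it holds credits.

    Hypothetical illustration; names and sizes are assumptions, not IBA values.
    """

    def __init__(self, receiver_slots):
        self.credits = receiver_slots   # one credit per free receive slot
        self.rx_buffer = deque()

    def send(self, packet):
        """Transmit if a credit is available; otherwise the sender must wait."""
        if self.credits == 0:
            return False                # back-pressure: the packet is held, not dropped
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def receiver_drain(self):
        """Receiver consumes one packet and returns a credit to the sender."""
        packet = self.rx_buffer.popleft()
        self.credits += 1
        return packet


def demo():
    link = CreditedLink(receiver_slots=2)
    results = [link.send("p0"), link.send("p1"), link.send("p2")]  # third is refused
    link.receiver_drain()               # frees a slot, so a credit comes back
    results.append(link.send("p2"))     # the retry is now accepted
    return results


print(demo())
```

The refused send is the point of the design: congestion shows up as waiting at the sender rather than as lost packets at the receiver.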

Why Do We Need Yet Another Fabric?

The issue is not the fabric, the issue is server I/O
Current GbE & FC fabrics do not solve server I/O bottlenecks
  Bus contention
GbE & FC fabrics weren’t specifically designed for clustering
  They can do it…AND
  Message queue depths and performance not optimal
  Performance is often inadequate

IB vs. FC vs. GbE Conclusion

Initially complementary: IB will not replace FC or GbE
  Investment protection
Eventually competitive and complementary
  They will compete for some of the same budget dollars

IB Architecture

Host Channel Adapter (HCA)
  Protocol engine
  Moves data via messages queued in memory
Target Channel Adapter (TCA)
  Interface to I/O controller
  FC, GbE, SCSI, etc.
Switch
  Internal or external
  Simple, low cost
  Multi-stage
Routers
  Connect subnets together

[Figure: IB architecture diagram. Labels: CPU, Sys Mem, System Bus, Mem Cntlr, HCA, Link, Multi-Stage Switch, TCA, I/O Cntlr, Interconnect, Controller, Router, Mgt Services]

IB Fabric BW Increases as Switches are Added

[Figure: Subnet A and Subnet B. Labels: End Node (repeated), Switch, router]

I/O Architecture Today

Traditional Server & Infrastructure w/dedicated I/O
Complex and expensive

[Figure: Traditional Server Architecture. Labels: CPU, System Bus, Memory Controller, Memory, I/O Bridge, Local Bus (PCI), HBA, NIC, IPC, FC SAN, Ethernet LAN, Router, IPC Network, Storage, iSCSI? DAFS?]

InfiniBand Based I/O

InfiniBand Server Hardware Architecture
Multiple IBA links
  2.5 Gbps
  10 Gbps
Solve redundancy problem once

[Figure: InfiniBand server. Labels: CPU, System Bus, Memory Controller, Memory, Host Channel Adapter]

InfiniBand Based I/O

InfiniBand Server Hardware Architecture
RDMA based protocols
InfiniBand I/O Unit Hardware Architecture

[Figure: server and I/O unit joined by an IB Switch. Server labels: CPU, System Bus, Memory Controller, Memory, Host Channel Adapter (HCA). I/O unit labels: Target Channel Adapter (TCA), iSCSI I/O Controller, Fibre Channel I/O Controller, Ethernet I/O Controller, UltraSCSI I/O Controller]

Market Problems IB Solves

Higher performance lower cost I/O (Shared I/O)
Converges clustering, networking, & storage into one fabric
  The IAN (I/O Area Network)
Reduces:
  IT management tasks
  Server workloads
  TCO
PCI Bus I/O constraints
Low cost HP/HA server clustering
  Lowers the cost of server blade systems
  Enables higher density server blade clusters

Higher Performance Lower Cost I/O (Shared I/O)

[Figure: IB Fabric. Labels: New IB Server Clusters, IB Storage, I/O Unit (x2), TCA (x2), IB ⇒ iSCSI, IB ⇒ IB, IB ⇒ E-net, IB ⇒ FC, Ethernet SAN, Ethernet LAN/WAN, FC SAN, Mgt LAN, Modem, Remote Monitoring]

Current High Availability I/O Configuration

16 Rack mount servers with dedicated I/O per server
  = 210 connections
  (2) HBA FC paths/server to FC fabric
  (4) FC paths to storage to FC fabric
  (2) Ethernet paths/server to network
  (1) Ethernet maint path/server to network

[Figure: 16 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN]

Non-Productive Costly Connectivity

[Figure: 16 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN. Legend: Non-productive connectivity, Productive connectivity]

InfiniBand Shared I/O Chassis Example

19” rack mount environment
3U high
IBA single high single wide std
Integrated IBA fabric
Up to 45 watts / linecard slot
Hot swappable components
Chassis Management Entity (CME)

[Figure: 3U chassis with front-to-back cooling. Labels: Fabric Card InfiniBand ports (x2); Line Cards, up to 8 GE/FC/IB ports]

IB Enabled High Availability Shared I/O

Add dual redundant IB I/O Chassis
  10 slots each
  IB form factor I/O cards
  Multi-protocol
    FC, GigE, FastE, iSCSI, etc.
Eliminate FC edge switches

[Figure: 16 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN. Callout: Eliminate FC Edge Switches]

IB Enabled High Availability Shared I/O

Reduces LAN switch requirements
Total Connections = 116 = ~ 45% reduction
  (2) IB paths/server - IB fabric
  (6) FC paths to storage - FC fabric
  (2) Ethernet paths/I/O subsystem - network
  (2) E-net maint path/I/O subsystem - network

[Figure: 16 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN]
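
The ~45% figure follows directly from the two totals quoted on the slides: 210 connections with dedicated I/O and 116 with shared IB I/O. A quick arithmetic check is below; the function name `percent_reduction` is only for illustration, and the totals are taken as given because the slides do not show how they were counted.

```python
def percent_reduction(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100.0


dedicated_io_connections = 210   # quoted total, 16 servers with dedicated I/O
shared_ib_connections = 116      # quoted total, IB shared I/O configuration

reduction = percent_reduction(dedicated_io_connections, shared_ib_connections)
print(f"{reduction:.1f}% fewer connections")
```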

Potential Savings

Current dedicated I/O subsystem/server
  Costs = ~ $225,000
IB shared I/O System with improved BW, connectivity, manageability, availability
  Costs = ~ $112,500
  Savings = ~ 50%
Additional non-hardware TCO gains
  Operational Expense
    Estimated at 3x to 8x Capital Expense reduction
  Simpler system design to manage
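
The savings figures can be worked through in a few lines. The capital numbers are the approximate ones quoted above. The operational range is only one possible reading of the "3x to 8x" bullet, namely that operational savings are 3 to 8 times the capital savings; that interpretation is an assumption, not something the slide states.

```python
dedicated_cost = 225_000   # approximate cost quoted for dedicated I/O
shared_cost = 112_500      # approximate cost quoted for IB shared I/O

capex_savings = dedicated_cost - shared_cost
savings_pct = capex_savings / dedicated_cost * 100

# Assumption: "3x to 8x" is read as opex savings = 3 to 8 times the capex savings.
opex_low = 3 * capex_savings
opex_high = 8 * capex_savings

print(f"Capex savings: ${capex_savings:,} ({savings_pct:.0f}%)")
print(f"Opex savings under this reading: ${opex_low:,} to ${opex_high:,}")
```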

System Benefits

Increased BW & connectivity per server
Reduced infrastructure complexity
Reduced power & space
BW migration to bursting servers
Natural low latency IPC network

Managing Scalability w/Traditional I/O

What happens when just 2 more servers are added?
  In the FC SAN
    (1) new switch has to be added
    Fabric will need to be reconfigured
  Maintenance LAN will also need to change
    From a 16-port switch/router to 24-port

[Figure: 18 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN]

Managing Scalability w/Traditional I/O

Adding servers takes a lot of hard work & time
  20 Net new connections
  Disruptive FC fabric reconfigurations

[Figure: 18 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN]

Managing Scalability w/IB based I/O

Adding additional servers is significantly simpler & easier
  8 net new connections = a 60% reduction
  (2) IB paths/new server - IB fabric
  No new switches or reconfigurations
  Faster & non-disruptive implementation

[Figure: 18 PowerEdge 2450 servers connected to the Ethernet LAN/WAN, Maintenance LAN, and FC SAN]
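
The slides do not say how "net new connections" are counted. One reading that reproduces both quoted figures is to count each end of each new cable as a connection, using the per-server paths listed earlier: (2) FC, (2) Ethernet, and (1) maintenance Ethernet for dedicated I/O, versus (2) IB paths with shared I/O. The sketch below works that through; the both-ends counting rule and the function name are assumptions.

```python
ENDS_PER_PATH = 2  # assumption: each cable end counts as one connection


def net_new_connections(new_servers, paths_per_server):
    """Connections added when `new_servers` servers join, under the assumed rule."""
    return new_servers * paths_per_server * ENDS_PER_PATH


traditional_paths = 2 + 2 + 1   # FC + Ethernet + maintenance Ethernet, per server
ib_paths = 2                    # IB paths per new server

traditional = net_new_connections(2, traditional_paths)
ib = net_new_connections(2, ib_paths)
reduction_pct = (traditional - ib) / traditional * 100

print(traditional, ib, f"{reduction_pct:.0f}% reduction")
```

Under this reading the totals come out to 20 and 8, matching the slides; any new switch or inter-switch links in the traditional case would be additional.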

Scalability Net Results w/IB shared I/O

Making adds, moves, or changes means
  Less time
  Less cost
  Less effort
  Less complexity
  Less personnel
  Less disruptions
  More control
  More simplicity
  More stability
  Better RAS

PCI Bus Constraints

PCI bus limitations have been strangling CPU I/O
  Like trying to drink from a fire hose

PCI Bus Constraints

|                | PCI (66 MHz)                   | PCI-X (133 MHz)     | DDR                 | QDR                      | 3GIO                     |
| Max BW         | 4 Gbps                         | 8 Gbps              | 16 Gbps             | 32 Gbps                  | 64 Gbps                  |
| I/O Constraint | (4) GbE w/TOE or (2) 2gig FC   | (4) SCSI 320        | (1) 4x IB           | (1) 10gigE, FC, or 4x IB | (2) 10gigE, FC, or 4x IB |
| Architecture   | Shared parallel bus            | Shared parallel bus | Shared parallel bus | Shared parallel bus      | Switched serial          |
| Issues         | Bus contention                 | Bus contention      | Bus contention      | Bus contention           | Not until 04             |
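
The Max BW row is roughly what a parallel bus delivers at peak: bus width times clock rate times transfers per clock. The sketch below assumes 64-bit slots and treats DDR and QDR as double and quad data rate variants of the 133 MHz PCI-X clock; both are assumptions, since the slide gives only the rounded totals. The function name is illustrative.

```python
BUS_WIDTH_BITS = 64  # assumption: 64-bit slots


def bus_bandwidth_gbps(width_bits, clock_mhz, transfers_per_clock=1):
    """Peak bandwidth of a parallel bus in Gbps (no protocol overhead)."""
    return width_bits * clock_mhz * transfers_per_clock / 1000.0


buses = {
    "PCI (66 MHz)": bus_bandwidth_gbps(BUS_WIDTH_BITS, 66),
    "PCI-X (133 MHz)": bus_bandwidth_gbps(BUS_WIDTH_BITS, 133),
    "PCI-X DDR": bus_bandwidth_gbps(BUS_WIDTH_BITS, 133, transfers_per_clock=2),
    "PCI-X QDR": bus_bandwidth_gbps(BUS_WIDTH_BITS, 133, transfers_per_clock=4),
}

for name, gbps in buses.items():
    print(f"{name}: {gbps:.1f} Gbps peak")
```

These are shared-bus peaks: every device on the bus contends for the same bandwidth, which is the "bus contention" issue the slide names.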

PCI Bus vs. IB

[Table: comparison of advantages and disadvantages for PCI / PCI-X / DDR / QDR / 3GIO versus InfiniBand; cell layout lost in extraction. Entries: Scalability: Ports & BW; QoS; Security; Protects software base; Out-of-box connectivity; Fabric Convergence; Simpler for chip-to-chip; PCB, Copper, & Fiber; Lower cost; Fault Tolerance; Clustering; Multi-cast; Until there is 3GIO, bus contention; Software]

Solution: PCI Bus AND IB

It’s not “either/or”
  They are complementary, not mutually exclusive
The best solutions take advantage of both
  This is why you rarely hear anymore that IB is the PCI replacement
There are new HCAs WITH PCI-X interfaces
  Expect DDR, QDR, & 3GIO as well
  The IB benefits are almost as great
    Eliminates bus contention
    Preserves PCI software base
Provides IB benefits NOW
  Don’t have to wait for native server IB

[Figure: PCI-X HCA Example]

Low Cost HP/HA Server Clustering

IB clustering costs less for scaling out than SMP or NUMA scaling up
IB eliminates fabric messaging performance issues with clustering
  Long queues
  PCI bus contention
IB enables low cost server (shared I/O arguments even stronger here)
  Diskless blades
    Personality on the storage
Higher Fault Tolerance and Availability
  One connection for clustering and shared I/O
  Fewer I/O interfaces than any other interconnect
Higher performance
Lower TCO

Industry Analyst’s IB Enabled Server Forecast

[Figure: chart of forecast IB enabled servers, 2002 to 2005; y-axis 0 to 6,000,000; series: Gartner, IDC]

Analysts are split in their forecast of IB’s TAM, but not on its potential

IB Enabled Servers as a % of Total

[Figure: chart, 2002 to 2005; y-axis 0.0% to 90.0%; series: Gartner, IDC]

Conclusions

Even if the analysts’ views are optimistic
  Huge % of servers will be IB enabled
  The value proposition is far too strong to ignore
Initial deployment will utilize PCI-X HCAs
Native deployments will enable lower cost server blade clusters
As more and more servers become IB enabled
  Clever IT people will realize that they can run IB native for:
    Clustering, Networking, and Storage
When IB becomes native with the server motherboard
  The perception becomes that it’s free
    There is always high market demand for…free.

Questions?

Why Not Just Use GbE or FC?

GbE and FC are the current fabric infrastructures
IT personnel already know & understand the technologies
FC & GbE are already battling it out for SAN infrastructures
  FCP vs. iSCSI

IB vs. FC vs. GbE

| Technology              | Standards Body               | Signaling Speed                            | First Standard | Maximum Frame Size | Primary Application       |
| Gigabit Ethernet        | IEEE & IETF                  | 1.25 Gbps                                  | 1999           | 1.5K               | LAN: Local Area Network   |
| Fibre Channel           | ANSI                         | 2.125 Gbps                                 | 1988           | 2K                 | SAN: Storage Area Network |
| InfiniBand Architecture | InfiniBand Trade Association | 2.5 Gbps (1x), 10 Gbps (4x), 30 Gbps (12x) | 2001           | 4K                 | IAN: I/O Area Network     |
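
The signaling speeds in the table are line rates, not payload rates. Gigabit Ethernet over fiber (1000BASE-X), 1 and 2 Gbps Fibre Channel, and first-generation InfiniBand links all use 8b/10b encoding, which carries 8 data bits in every 10 line bits, so the usable data rate is 80% of the signaling rate. A small sketch of that conversion follows; the function names are illustrative, and frame and protocol overheads are ignored.

```python
IB_LANE_SIGNALING_GBPS = 2.5   # per-lane signaling rate from the table (1x)


def data_rate_gbps(signaling_gbps):
    """Usable bit rate after 8b/10b encoding (8 data bits per 10 line bits)."""
    return signaling_gbps * 8 / 10


def ib_signaling_gbps(lanes):
    """Aggregate signaling rate of a 1x, 4x, or 12x InfiniBand link."""
    return IB_LANE_SIGNALING_GBPS * lanes


links = {
    "Gigabit Ethernet": 1.25,
    "Fibre Channel (2 Gbps)": 2.125,
    "InfiniBand 1x": ib_signaling_gbps(1),
    "InfiniBand 4x": ib_signaling_gbps(4),
    "InfiniBand 12x": ib_signaling_gbps(12),
}

for name, signaling in links.items():
    print(f"{name}: {signaling} Gbps signaling, {data_rate_gbps(signaling):.1f} Gbps data")
```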

How IB compares w/GbE & FC in OSI

[Figure: layer comparison of Ethernet (802.3), Fibre Channel, and IB Architecture; column layout lost in extraction. Labels: Application; Upper Level Protocols; Transport Layer; TCP, UDP; IP; Network Layer; Network; Logical Link Control; Media Access Control; Link Layer; Line Encoding; Physical Layer; Physical Layer Entities; IBA Operations; FC-4: Protocol Mappings; (FC-3); FC-2: Framing, Service Class; FC-1: Encoding; FC-0: Physical Media. Legend: marked layers are not included in the protocol standards]

Data Center Fabric & I/O Consolidation

IB enables convergence through shared server I/O
  One I/O interface for
    Clustering
    Network
    Storage
  Eliminates the need for multiple server I/O blades/ports
IB virtual lanes provide
  Multiple independent logical fabrics multiplexed on one physical one
  QoS to prioritize traffic
  The benefits of independent fabrics with:
    The management and maintenance of one fabric
Switches, directors, and routers provide
  Scalability, redundancy, availability, and flexibility
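
The idea of several logical fabrics sharing one physical link, with QoS deciding who sends next, can be sketched with a weighted round-robin scheduler. The InfiniBand specification defines its own virtual lane arbitration; the simple scheme below is a stand-in for illustration, and the lane names, weights, and function name are hypothetical.

```python
from collections import deque


def arbitrate(lanes, weights, rounds):
    """Multiplex per-lane queues onto one link using weighted round-robin.

    lanes:   lane name -> deque of pending packets
    weights: lane name -> max packets that lane may send per round
    Returns the packets in the order they appear on the shared link.
    """
    wire = []
    for _round in range(rounds):
        for name, queue in lanes.items():
            for _slot in range(weights[name]):
                if not queue:
                    break               # lane is idle, give its turn to the next lane
                wire.append((name, queue.popleft()))
    return wire


lanes = {
    "ipc": deque(["i0", "i1", "i2", "i3"]),
    "storage": deque(["s0", "s1"]),
    "network": deque(["n0", "n1"]),
}
weights = {"ipc": 2, "storage": 1, "network": 1}   # hypothetical QoS weights

wire = arbitrate(lanes, weights, rounds=2)
print(wire)
```

Each lane keeps its own queue, so a backlog in one traffic class does not reorder another, while the weights give latency-sensitive IPC traffic a larger share of the link.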

Requirements of a Shared I/O System

Cooperative Software Architecture
  Ability to productively distribute work between host & external shared I/O system
Virtualization of I/O
  Host manipulates logical resources
  Host has no awareness of underlying physical resources
  All I/O managed external to host
  Host originates requests and receives result
Heterogeneous Operating Systems
3 Classes of I/O
  Efficiently handle small to very large messages
  Microsecond sensitive latency without sacrificing bandwidth
Channel Architecture
  Highly differentiated priority and service levels
  Connection oriented guaranteed delivery mechanism
  Inherent memory semantics and protection
  High speed / low latency

Market Projections

IDC & Gartner-Dataquest