
Mellanox InfiniBand QDR 40Gb/s: The Fabric of Choice for High Performance Computing

Gilad Shainer, shainer@mellanox.com
June 2008

Birds of a Feather Presentation


[Chart] Performance Roadmap (Gigabits per second)

InfiniBand Delivers Ultra Low Latency

InfiniBand Technology Leadership

Industry Standard
• Hardware, software, cabling, management
• Designed for clustering and storage interconnect

Price and Performance
• 40Gb/s node-to-node
• 120Gb/s switch-to-switch
• 1us application latency
• Most aggressive roadmap in the industry

Reliable, with congestion management

Efficient

• RDMA and Transport Offload (a verbs-level sketch follows this list)
• Kernel bypass
• CPU focuses on application processing

Scalable for Petascale computing & beyond
End-to-end quality of service
Virtualization acceleration
I/O consolidation, including storage
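The RDMA, transport offload, and kernel-bypass items above are what an application sees through the verbs API (libibverbs). The fragment below is a minimal, illustrative sketch of posting a one-sided RDMA write from user space; it assumes a connected queue pair (qp), completion queue (cq), registered memory region (mr), and the peer's remote_addr/rkey have already been set up and exchanged, so it shows the idea rather than a complete program.

```c
/* Illustrative only: posting a one-sided RDMA WRITE with libibverbs.
 * Assumes qp (connected RC queue pair), cq, mr (registered local buffer),
 * and the peer's remote_addr/rkey were set up and exchanged elsewhere. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

static int rdma_write(struct ibv_qp *qp, struct ibv_cq *cq,
                      struct ibv_mr *mr, void *local_buf, uint32_t len,
                      uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_buf,  /* local source buffer         */
        .length = len,
        .lkey   = mr->lkey,              /* local key from registration */
    };
    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE;  /* one-sided write        */
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED;  /* request a completion   */
    wr.wr.rdma.remote_addr = remote_addr;        /* peer's registered addr */
    wr.wr.rdma.rkey        = rkey;               /* peer's remote key      */

    /* The work request goes directly from user space to the adapter:
     * no kernel call on the data path, no involvement of the remote CPU. */
    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* Poll the completion queue for the signaled completion. */
    struct ibv_wc wc;
    int n;
    do {
        n = ibv_poll_cq(cq, 1, &wc);
    } while (n == 0);
    return (n == 1 && wc.status == IBV_WC_SUCCESS) ? 0 : -1;
}
```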


InfiniBand in the TOP500

InfiniBand makes the most powerful clusters
• 5 of the top 10 (#1, #4, #7, #8, #10) and 49 of the Top100
• The leading interconnect for the Top200
• InfiniBand clusters responsible for ~40% of the total Top500 performance

InfiniBand enables the most power-efficient clusters
InfiniBand QDR expected Nov 2008
No 10GigE clusters exist on the list

[Chart] Top500 Interconnect Placement: number of clusters in each Top500 ranking bin (1-100, 101-200, 201-300, 301-400, 401-500) for InfiniBand, all proprietary high-speed interconnects, and GigE.

[Chart] InfiniBand Clusters - Performance: aggregate InfiniBand performance (Gflops) on the Top500 lists from Nov 05 through June 08, a 360% CAGR.


Mellanox InfiniBand End-to-End Products

High throughput - 40Gb/s
Low latency - 1us
Low CPU overhead
Kernel bypass
Remote DMA (RDMA)
Reliability

[Diagram] End-to-end fabric: blade/rack servers - adapter - switch - adapter - storage

Adapter ICs & Cards
Cables
Switch ICs
Software

End-to-End Validation

Maximum Productivity


[Chart] ConnectX IB QDR 40Gb/s MPI Bandwidth, PCIe Gen2: bandwidth (MB/s) vs. message size (bytes, 1 to 4194304), IB QDR uni-directional and bi-directional.

ConnectX - Fastest InfiniBand Technology

Performance-driven architecture
• MPI latency 1us, ~6.5GB/s with 40Gb/s InfiniBand (bi-directional); a minimal measurement sketch follows the latency chart below
• MPI message rate of >40 million/sec

Superior real application performance
• Engineering, automotive, oil & gas, financial analysis, etc.

[Chart] ConnectX IB MPI Latency, PCIe Gen2 IB QDR: latency (usec) vs. message size (1 to 1024 bytes); 1.07us latency, 6.47GB/s bandwidth.
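The latency and bandwidth numbers above come from standard MPI-level microbenchmarks. Below is a minimal ping-pong sketch of the kind of test commonly used to produce such figures (illustrative only, not the exact benchmark behind these charts): two ranks bounce a message back and forth, and rank 0 reports half the round-trip time and the resulting uni-directional throughput.

```c
/* Minimal MPI ping-pong sketch (illustrative; not the exact benchmark
 * behind the charts above). Run with two ranks, e.g.:
 *   mpirun -np 2 ./pingpong 8        # small size: latency-dominated
 *   mpirun -np 2 ./pingpong 1048576  # large size: bandwidth-dominated */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 1000;
    const int len = (argc > 1) ? atoi(argv[1]) : 8;  /* message size in bytes */
    char *buf = malloc(len);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double half_rtt_us = (t1 - t0) / (2.0 * iters) * 1e6;
        double mbytes_s    = (2.0 * iters * (double)len) / (t1 - t0) / 1e6;
        printf("half round-trip: %.2f us, uni-dir throughput: %.1f MB/s\n",
               half_rtt_us, mbytes_s);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```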


[Charts] Mellanox ConnectX MPI Latency - Multi-core Scaling: latency (usec) vs. number of CPU cores (number of processes), shown for 1-8 and 1-16 cores.

ConnectX Multi-core MPI Scalability

Scalability to 64+ cores per node, to 20K+ nodes per subnet
Guarantees same low latency regardless of the number of cores
Guarantees linear scalability for real applications


InfiniScale IV Switch: Unprecedented Scalability

36 40Gb/s or 12 120Gb/s InfiniBand ports
• Adaptive routing and congestion control
• Virtual Subnet Partitioning

6X switching and data capacity
• Vs. using 24-port 10GigE Ethernet switch devices

4X storage I/O throughput
• Critical for backup, snapshot, and quickly loading large datasets
• Vs. deploying 8Gb/s Fibre Channel SANs

10X lower end-to-end latency
• Vs. using 10GigE/DCE switches and iWARP-based adapters

3X the server and storage node cluster scalability when building a 3-tier CLOS fabric
• Vs. using 24-port 10GigE Ethernet switch devices
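As a rough, illustrative sanity check on the 3X figure (this calculation is not from the slide): a non-blocking three-tier fat-tree/CLOS built from radix-k switches supports on the order of k^3/4 end nodes, so 36-port switches give 36^3/4 = 11,664 nodes versus 24^3/4 = 3,456 nodes for 24-port switches, roughly a 3.4X difference.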


Addressing the Needs for Petascale Computing

Faster network streaming propagation
• Network speed capabilities
• Solution: InfiniBand QDR

Large clusters
• Scaling to many nodes, many cores per node
• Solution: High-density InfiniBand switch

Balanced random network streaming
• "One to one" random streaming
• Solution: Adaptive routing

Balanced known network streaming
• "One to one" known streaming
• Solution: Static routing (both routing policies are illustrated in the sketch below)

Un-balanced network streaming
• "Many to one" streaming
• Solution: Congestion control

Designed to handle all communications in HW
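To make the routing bullets above concrete, here is a deliberately simplified, hypothetical sketch (not the InfiniScale IV implementation): static routing derives the output port from the destination alone, which suits known traffic patterns, while adaptive routing picks the least-loaded candidate port at forwarding time, which balances random traffic.

```c
/* Toy illustration of static vs. adaptive output-port selection.
 * Hypothetical sketch only -- not the InfiniScale IV algorithm. */
#include <stdio.h>

#define NUM_UP_PORTS 4

/* Static routing: the output port is a fixed function of the destination,
 * so a known ("one to one") traffic pattern can be laid out in advance
 * such that flows do not collide. */
static int static_route(int dest_lid)
{
    return dest_lid % NUM_UP_PORTS;
}

/* Adaptive routing: among the candidate up-ports, forward on the one with
 * the shortest queue right now, which balances random traffic. */
static int adaptive_route(const int queue_depth[NUM_UP_PORTS])
{
    int best = 0;
    for (int p = 1; p < NUM_UP_PORTS; p++)
        if (queue_depth[p] < queue_depth[best])
            best = p;
    return best;
}

int main(void)
{
    int queues[NUM_UP_PORTS] = { 7, 0, 3, 5 };  /* pending packets per port */
    printf("static route for LID 42 -> port %d\n", static_route(42));
    printf("adaptive route now      -> port %d\n", adaptive_route(queues));
    return 0;
}
```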


HPC Applications Demand Highest Throughput

[Chart] Fluent Message Size Profiling: total data sent (bytes, 1.00E+06 to 1.00E+10, log scale) vs. MPI message size bin ([0..256] through [262145..1048576]) for 2-server and 7-server runs. Small messages drive the need for latency; large messages drive the need for bandwidth.

[Chart] LS-DYNA Profiling: total data transferred (bytes, log scale) vs. MPI message size bin ([0..64] through [4194305..infinity]) for 16-core and 32-core runs. The large-message traffic drives the need for bandwidth.

Scalability Mandates Highest Bandwidth and Lowest Latency
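Message-size profiles like the Fluent and LS-DYNA histograms above are commonly gathered by intercepting MPI calls through the standard PMPI profiling interface. The sketch below is illustrative (not the tool used for these charts): it wraps MPI_Send, bins each message by payload size, and prints the per-bin totals when MPI_Finalize is called.

```c
/* Illustrative PMPI-based message-size profiler (not the tool used for the
 * charts above). Link into an MPI application; it intercepts MPI_Send,
 * accumulates bytes per size bin, and prints the histogram at MPI_Finalize.
 * Only MPI_Send is covered here; a real tool would cover all send paths
 * and reduce the counters across ranks. */
#include <mpi.h>
#include <stdio.h>

#define NBINS 8
static long long bytes_in_bin[NBINS];   /* total payload bytes per bin */

/* Bin limits roughly matching the charts: <=256, <=1K, <=4K, ..., rest. */
static int bin_of(long long bytes)
{
    long long limit = 256;
    for (int b = 0; b < NBINS - 1; b++, limit *= 4)
        if (bytes <= limit)
            return b;
    return NBINS - 1;
}

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int type_size;
    MPI_Type_size(datatype, &type_size);
    long long bytes = (long long)count * type_size;
    bytes_in_bin[bin_of(bytes)] += bytes;
    return PMPI_Send(buf, count, datatype, dest, tag, comm);  /* real send */
}

int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        for (int b = 0; b < NBINS; b++)
            printf("bin %d: %lld bytes sent\n", b, bytes_in_bin[b]);
    return PMPI_Finalize();
}
```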


HPC Advisory Council

Distinguished HPC alliance (OEMs, IHVs, ISVs, end-users)
Member activities
• Qualify and optimize HPC solutions
• Early access to new technology, and mutual development of future solutions
• Explore new opportunities within the HPC market
• HPC-targeted joint marketing programs

A community-effort support center for HPC end-users
• Mellanox Cluster Center

• Latest InfiniBand and the HPC Advisory Council member technology
• Development, testing, benchmarking and optimization environment

• End-user support center - HPCHelp@mellanox.com
For details - HPC@mellanox.com


Providing advanced, powerful, and stable high performance computing solutions
