Boosting Scalability of InfiniBand-based HPC Clusters. © 2010 Voltaire Inc. Asaf Wachtel, Senior Product Manager


Page 1

Boosting Scalability of InfiniBand-based HPC Clusters

© 2010 Voltaire Inc.

Asaf Wachtel, Senior Product Manager

Page 2

InfiniBand-based HPC Clusters: Scalability Challenges

► Cluster TCO Scalability
  • Hardware costs
  • Software license costs
  • Space, power & cooling

► Communication Scalability
  • Handle increasing compute power
  • Multi-core, GPUs

► Utilization Scalability
  • Many jobs & users
  • Varying sizes, traffic patterns & QoS

► Application Scalability
  • Home-grown or ISVs
  • MPI Collectives

© 2010 Voltaire Inc. sc10

Page 3

Voltaire 40Gb/s InfiniBand Portfolio

► Fabric provisioning and performance monitoring

► Application Acceleration

► 40Gb/s InfiniBand Switching Platforms
  • 4700: 324/648 x IB ports
  • 4200: 162 x IB ports
  • 4036: 36 x IB ports
  • 4036E: 34 x IB ports + 2 x 1/10GbE
  • HSSM
  • SSI Blade Switch

Page 4

Scalable Architectures

► Fat Tree
  • Full bi-sectional bandwidth at any node count
  • Uniform oversubscription options

► HyperScale
  • Scale to thousands of nodes with linear performance
  • Large non-blocking islands (more than 2,000 cores)
  • 4-hop maximum latency to any port
  • Lowest number of switches and cables

► Torus
  • Lowest cost solution
  • Built entirely with edge switches and copper cables
  • Optimized support by Voltaire software, including Torus2QoS routing
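As a rough illustration of the fat-tree sizing behind these options, the sketch below computes the end-port capacity of a non-blocking two-tier fat tree and an edge-switch oversubscription ratio. The function names and the 32-down/4-up port split are assumptions made for this example; they are not figures from the deck.

```python
def two_tier_fat_tree_ports(radix):
    """End ports of a non-blocking two-tier fat tree built from `radix`-port switches.

    Each leaf switch uses half its ports for nodes and half for uplinks,
    and each spine port connects one leaf, so there can be `radix` leaves.
    """
    return radix * (radix // 2)


def oversubscription(downlinks, uplinks):
    """Ratio of node-facing ports to uplink ports on an edge switch."""
    return downlinks / uplinks


# 36-port building blocks give 648 end ports, the same count as the larger
# 4700 configuration in the portfolio (how the chassis is wired internally
# is an assumption here).
ports_36 = two_tier_fat_tree_ports(36)
print(ports_36)  # 648

# Hypothetical split of a 36-port edge switch: 32 ports down, 4 ports up.
ratio = oversubscription(32, 4)
print(ratio)  # 8.0, i.e. 8:1 oversubscribed
```

A ratio of 1.0 corresponds to the full bi-sectional bandwidth case; larger ratios trade bandwidth for fewer switches and cables.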

Page 5

HyperScale in the Top500

► Large, low-latency, non-blocking islands
► Lowest number of switches & cables
► Scales to thousands of nodes with linear performance

[Figure callouts: 1,200-node interconnect in only 2 racks; 8:1 oversubscribed core; 13 x non-blocking HyperScale islands; 1.05 PFLOPs; 83.7% efficiency]

Page 6

The Challenge: Static Routing Inefficiency

► The Challenge: One Size Routing does not Fit All
  • Static routing assumes uniform traffic across the entire fabric
  • Real life is different
    - Most jobs use a small portion of the cluster
    - Different traffic patterns for different jobs
    - Different requirements for different traffic types (e.g. storage)

► The Solution: Voltaire TARA™ (Traffic Aware Routing Algorithm)
  • A new routing algorithm on top of OpenSM
  • Dynamically optimizes routing according to defined traffic patterns:
    - Fabric topology
    - Job-specific communication patterns
    - Symmetric/asymmetric communication
    - Traffic load/QoS
  • Fully integrated with leading job schedulers
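The contrast between static and traffic-aware routing can be sketched with a toy model. This is not Voltaire's TARA algorithm, whose internals the slides do not describe. It is a minimal illustration, assuming a single leaf switch with four uplinks, a destination-modulo rule standing in for static routing, and a greedy least-loaded assignment standing in for traffic-aware routing. All names and the example job are hypothetical.

```python
from collections import Counter


def static_route(flows, n_uplinks):
    """Static stand-in: the uplink depends only on the destination id."""
    load = Counter()
    for _src, dst in flows:
        load[dst % n_uplinks] += 1
    return [load[u] for u in range(n_uplinks)]


def traffic_aware_route(flows, n_uplinks):
    """Traffic-aware stand-in: each active destination is pinned to the
    least-loaded uplink at the time it is first seen."""
    load = [0] * n_uplinks
    uplink_of = {}
    for _src, dst in flows:
        if dst not in uplink_of:
            uplink_of[dst] = load.index(min(load))
        load[uplink_of[dst]] += 1
    return load


# Hypothetical job: 8 flows whose destination ids are all multiples of 4,
# so the modulo rule sends every one of them to uplink 0.
flows = [(src, 4 * src) for src in range(8)]

static_load = static_route(flows, 4)
aware_load = traffic_aware_route(flows, 4)
print(static_load)  # [8, 0, 0, 0]
print(aware_load)   # [2, 2, 2, 2]
```

The static rule is balanced only when destinations are uniformly spread, which is the assumption the slide calls out. The greedy variant uses knowledge of which flows are actually active.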

Page 7

TARA – Traffic Aware Routing Algorithm: Maximizing Cluster Utilization

[Figure: two bar charts of port weight (y-axis, 0 to 2000) per switch.port (x-axis, 1.18 to 17.30) for the internal ports on the line cards. Left panel: OpenSM without UFM TARA. Right panel: UFM TARA is ON.]

Page 8

The Challenge: Collective Operations Scalability

► Grouping algorithms are unaware of the topology and inefficient
► Network congestion due to “all-to-all” communication
► Slow nodes & OS involvement impair scalability and predictability
► The more powerful servers get (GPUs, more cores), the more poorly collectives scale in the fabric

[Figure: three plots versus # Ranks: % collectives out of total run time, total run time, and run time variance.]

Significant Inhibitor to MPI Application Scalability

Page 9

Introducing: Voltaire Fabric Collective Accelerator

► Grid Director Switches: Fabric Processing Power
  • Collective operations offloaded to switch CPUs

► Unified Fabric Manager (UFM): Topology Aware Orchestrator

► FCA Manager
  • Topology-based collective tree
  • Separate virtual network
  • IB multicast for result distribution

► FCA Agent
  • Inter-core processing localized & optimized

Breakthrough performance with no additional hardware
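The idea of a topology-based collective tree can be sketched as a staged reduction: values are combined inside each node first, then per switch, then at the tree root. This is a minimal illustration only, not the FCA implementation. The nested-dict fabric layout, the function name, and the message-counting convention are all assumptions made for the example.

```python
def hierarchical_reduce(fabric):
    """Sum all per-core values in stages that follow the topology.

    fabric: {switch_name: {node_name: [per-core values]}}
    Returns (total, fabric_messages), where fabric_messages counts one
    message per node to its switch and one per switch to the tree root.
    """
    fabric_messages = 0
    switch_partials = []
    for nodes in fabric.values():
        node_partials = []
        for core_values in nodes.values():
            node_partials.append(sum(core_values))  # stage 1: inside the node
            fabric_messages += 1                    # node -> its switch
        switch_partials.append(sum(node_partials))  # stage 2: at the switch
        fabric_messages += 1                        # switch -> tree root
    return sum(switch_partials), fabric_messages


# Hypothetical fabric: 2 switches, 2 nodes per switch, 4 cores per node,
# with each core contributing its rank id (0..15).
fabric = {
    "sw0": {"n0": [0, 1, 2, 3], "n1": [4, 5, 6, 7]},
    "sw1": {"n2": [8, 9, 10, 11], "n3": [12, 13, 14, 15]},
}

total, fabric_messages = hierarchical_reduce(fabric)
n_ranks = sum(len(cores) for nodes in fabric.values() for cores in nodes.values())
print(total)            # 120
print(fabric_messages)  # 6
print(n_ranks - 1)      # 15 messages arrive at the root in a flat gather-style reduction
```

In this toy model the staged tree sends fewer messages than a flat reduction in which every other rank reports to a single root. The final result would then be distributed back to all ranks, which the slide attributes to IB multicast.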

Page 10

FCA – Fabric Collective Accelerator: Unmatched Application Scalability

► First and only system-wide solution for offloading MPI collectives
► Accelerates MPI collective computation by as much as 100X
► 10-40% improvement in application runtime
► Integrated with leading MPI implementations

[Figure: bar chart for Fluent truck_111m on 192 cores (y-axis, 0 to 180), comparing PMPI with PMPI + FCA.]
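The link between a large collective speedup and a more modest overall runtime gain follows from simple arithmetic: if collectives take a fraction f of the run time and are sped up by a factor s, the run time shrinks by f(1 - 1/s). The sketch below works this through. The fractions used are hypothetical inputs, and the model assumes the non-collective part of the run is unchanged.

```python
def runtime_improvement(collective_fraction, collective_speedup):
    """Fractional reduction in total run time when only the collective
    portion of the run is accelerated."""
    return collective_fraction * (1.0 - 1.0 / collective_speedup)


# Hypothetical collective shares of run time, each accelerated 100X.
results = {f: runtime_improvement(f, 100.0) for f in (0.10, 0.25, 0.40)}
for fraction, gain in results.items():
    print(f"{fraction:.0%} collectives -> {gain:.1%} shorter run time")
```

Under these assumptions, a 100X collective speedup recovers almost the entire collective share, so an application spending 10% to 40% of its time in collectives would see a gain in roughly that range.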

Page 11

Summary

► Reduced total cost of ownership via scalable topologies (HyperScale)

► Increase cluster utilization via Traffic Aware Routing (TARA)

► Boost application scalability using Fabric Collective Acceleration (FCA)

More Performance for each $ Spent

Page 12

Thank You

Asaf Wachtel

Senior Product Manager, InfiniBand [email protected]