FROM HPC TO THE CLOUD WITH AMQP AND OPEN SOURCE SOFTWARE

Carl Trieloff, Red Hat

Lee Fisher, Hewlett-Packard

High Performance Computing on Wall Street conference, 14 September 2009



From simulation to trade

[Diagram labels: trader, scheduler, internal pool, another internal division, external resource (e.g. EC2), trade, Messaging, Latency, Scale up, Scale out, Grid]


Red Hat Enterprise MRG

Integrated platform for high performance distributed computing

High speed, interoperable, open standard Messaging

Deterministic, low-latency Realtime kernel

High performance & throughput computing Grid scheduler for distributed workloads and Cloud computing


AMQP, HP Performance, scale up.

two Intel(R) Xeon(R) CPU X5570 @ 2.93GHz per blade (Nehalem 2.93 GHz, 8MB L3 cache, 95W)

Memory: 24GB (6x4GB), Memory Type DDR3-1333, HT, Turbo 2/2/3/3

Infiniband: 4X QDR IB Dual-port Mezzanine HCAs (1 port connected)

Infiniband Switch: BLc 4X QDR IB Switch

[Chart: Single HP Nehalem BL460c 40G Infiniband AMQP Perftest. X-axis: Number of Brokers on the Server (8, 4, 2, 1 Broker). Y-axis: Messages/Sec (0 to 12,000,000). Series: 8 Bytes, 64 Bytes, 256 Bytes, 1024 Bytes.]


AMQP Messaging on 8-node HP Nehalem Infiniband 40 Gbps: > 11 M msgs/s

[Chart. X-axis: Number of Brokers per Server (4, 2, 1 Broker). Left Y-axis: Messages/Sec (0 to 7,000,000). Right axis: 0 to 3. Series: Nehalem, Harpertown, % Nehalem vs Harpertown (data labels: 3.1, 3.1, 3.1).]


KVM Performance – AMQP Messaging, Intel Nehalem, 2 10Gbit, VT-d: > 1 M msgs/s

[Chart: RHEL 5.4 KVM AMQP 2-Guest. X-axis: Msg Size (bytes): 16, 32, 64, 128, 256, 512, 1024, 2048, 4096. Left Y-axis: Messages/Sec (0 to 1,200,000). Right Y-axis: Throughput MB/sec (0 to 900). Series: Msg/sec, Throughput MB/sec. Msg/sec data labels as extracted, in order: 1046081, 1023869, 902689, 880965, 804045, 741297, 555465, 369145, 210634.]
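The KVM chart plots both message rate and throughput in MB/sec, and the second follows from the first by simple arithmetic. The sketch below shows that conversion. It rests on assumptions not confirmed by the slide: that the nine msg/sec data labels pair with the nine message sizes in increasing order, that throughput counts payload bytes only, and that 1 MB means 1,000,000 bytes. The names `MSG_RATES` and `throughput_mb_per_sec` are illustrative.

```python
# Assumption: the extracted msg/sec labels pair with message sizes in
# increasing order; the slide text does not confirm this pairing.
MSG_RATES = {
    16: 1046081, 32: 1023869, 64: 902689, 128: 880965, 256: 804045,
    512: 741297, 1024: 555465, 2048: 369145, 4096: 210634,
}


def throughput_mb_per_sec(msg_size_bytes: int, msgs_per_sec: int) -> float:
    """Payload throughput in MB/sec, taking 1 MB = 1,000,000 bytes (assumed)."""
    return msg_size_bytes * msgs_per_sec / 1_000_000


if __name__ == "__main__":
    for size, rate in MSG_RATES.items():
        mb = throughput_mb_per_sec(size, rate)
        print(f"{size:>5} B  {rate:>9} msg/s  {mb:8.1f} MB/s")
```

Under these assumptions the 4096-byte point works out to roughly 863 MB/sec, which sits inside the chart's 0 to 900 MB/sec axis range.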


MRG Messaging Infiniband RDMA Latency: Under 40 Microseconds, Reliably Acknowledged

[Chart: MRG Messaging Latency Test on HP BL460c G6 Infiniband, 100K Message Rate. X-axis: ticks 1 to 99 (unlabeled). Y-axis: Average Latency (ms), 0.0340 to 0.0480. Series: 32 Bytes RDMA Nehalem, 256 Bytes RDMA Nehalem, 1024 Bytes RDMA Nehalem.]


Components of the Solution Stack

Solutions still matter in an industry-standard, open source world…

HP reduced SMI BIOSes

Red Hat MRG - Realtime

Red Hat / HP Systems

Red Hat MRG – Messaging / Grid

Red Hat MRG – Tuning tools

Tuning & working in labs

HP – Voltaire / Red Hat RDMA

HP compute & storage

Determinism and performance need to work at each layer; HP & Red Hat are partnered across the stack

FSI-HPC Solution Stack

X86-64 Server Architecture

BIOS

Operating System

Server Interconnect L2 Fabric

Integrated Systems

Workload Middleware

Application Environment

Users

Services


Hardware matters…

Scale-Up Blades

Scale-Out Rack-Optimized SL6000

HP Low Latency Lab with MRG+

Red Hat MRG Lab with HP BL460/BL685 & IB

Today’s RFP Metrics: Performance/Watt, Performance/BTU, Performance/Rack


Dealing with SMIs

HP BIOS Option for Low Latency Apps

Disable frequent SMIs used for Dynamic Power Savings Mode, CPU Utilization monitoring, P-state monitoring and ECC reporting

Benefits both RHEL & MRG operating environments.

[Figures: latency spikes with standard BIOS settings; latencies when SMIs disabled in BIOS]
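One common way to look for stalls of this kind is to spin on a clock and record the largest gap between consecutive reads. The sketch below illustrates only that polling idea; it is not the tool used to produce the slide's plots. In Python, interpreter and OS scheduling noise will dominate, so a large gap here cannot be attributed to an SMI. The function name `max_gap_ns` and its default duration are illustrative choices.

```python
import time


def max_gap_ns(duration_s: float = 0.2) -> int:
    """Spin for duration_s seconds and return the largest gap, in
    nanoseconds, seen between two consecutive clock reads.

    A large gap means the loop was stalled by something outside it
    (scheduler, interrupt, or firmware); this sketch cannot tell which.
    """
    deadline = time.perf_counter_ns() + int(duration_s * 1e9)
    prev = time.perf_counter_ns()
    worst = 0
    while prev < deadline:
        now = time.perf_counter_ns()
        worst = max(worst, now - prev)
        prev = now
    return worst


if __name__ == "__main__":
    print(f"largest observed gap: {max_gap_ns() / 1000:.1f} microseconds")
```

Running it before and after a BIOS change would only give a rough comparison; a measurement meant to isolate SMIs would need a pinned, high-priority native loop.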


MRG – Realtime RHEL on HP systems

Enables applications and transactions to run predictably, with guaranteed response times

Upgrades RHEL 5 to realtime OS

Provides replacement kernel for RHEL5; x86/x86_64

Preserves RHEL Application Compatibility

Certified on HP hardware, see Red Hat / HP certifications

[Chart: Response time vs. Time]


MRG Realtime Scheduling Latency

Vanilla: Min 1, Max 2857, Mean 11.47, Mode 9.00, Median 9.00, Std. Deviation 54.94

MRG RT: Min 4, Max 43, Mean 8.34, Mode 8.00, Median 8.00, Std. Deviation 1.49
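The six figures reported for each kernel are standard summary statistics over a set of latency samples. A minimal sketch of how such a summary can be computed is shown below. The sample values are hypothetical, the helper name `summarize` is illustrative, and the choice of population standard deviation is an assumption; the slide does not say which estimator or measurement tool was used.

```python
import statistics


def summarize(samples):
    """Return the same six summary statistics the slide reports."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "mode": statistics.mode(samples),
        "median": statistics.median(samples),
        # Assumption: population standard deviation.
        "stdev": statistics.pstdev(samples),
    }


if __name__ == "__main__":
    # Hypothetical latency samples with a single outlier.
    hypothetical = [8, 8, 9, 8, 7, 8, 43]
    for name, value in summarize(hypothetical).items():
        print(f"{name:>6}: {value}")
```

Even in this toy data a single outlier moves the max, mean and standard deviation while the mode and median stay put, which is the pattern visible in the Vanilla figures (Max 2857 against a Median of 9.00).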


Networking matters…

Voltaire DDR and QDR InfiniBand:

RoEE – RDMA on Enhanced Ethernet

RoEE is defined to be a verbs compliant IB transport running over the emerging IEEE Converged Enhanced Ethernet standard

www.openfabrics.org/archives/spring2009sonoma/monday/grun.pdf

[Switch photo labels: 36 QDR QSFP ports, Ethernet mngt port, Serial port, USB port, LEDs]

Test Configuration:

Two Nehalem-based servers w/ ConnectX PCI-E HCAs, back-to-back

QDR – ConnectX HCA running at QDR

DDR – ConnectX HCA running at DDR

RHEL5 UPDATE 2

Mellanox VERBs Performance Test


MRG Grid

Provides leading high performance & high throughput computing:

Brings advantages of scale-out and flexible deployment to any application or workload

Delivers better asset utilization, allowing applications to take advantage of all available computing resources

Enables building cloud infrastructure and aggregating multiple clouds:

Integrated support for virtualization as well as public clouds

Seamlessly aggregates multiple cloud resources into one compute pool

Provides seamless and flexible computing across:

Local grids

Remote grids

Private and hybrid clouds

Public clouds (Amazon EC2)

Cycle-harvesting from desktop PCs


Based on Condor and Includes:

Enterprise Supportability

From Red Hat

Web-Based Management Console

Unified management across all of MRG for job, system, license management, and workload management/monitoring

Low Latency Scheduling

Enable job submission to Condor via AMQP Messaging clients

Enable sub-second, low-latency scheduling for sub-second jobs

Virtualization Support via libvirt Integration

Support scheduling of virtual machines on Linux using libvirt APIs

Cloud Integration with Amazon EC2

Enable automatic cloud provisioning, job submission, results storage, teardown via Condor scheduler

Extensible: it can be a dependency for other jobs or executed based on rules (e.g. add capacity in the cloud if the local grid is out of capacity)

Concurrency Limits

Set limits on how much of a certain resource (e.g. software licenses, db connections) can be used at once

Dynamic Slots

Mark slots as partitionable and sub-divide them dynamically so that more than one job can occupy a slot at once
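The last two features, concurrency limits and partitionable slots, can be pictured with a toy admission check. The sketch below is a conceptual model only and is not Condor's implementation or API: the classes `Job` and `PartitionableSlot`, the `try_start` method, and the resource name `license` are all hypothetical. It also folds the pool-wide limits into the slot object for brevity, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int
    # Named resources this job consumes while running, e.g. {"license": 1}.
    limits: dict = field(default_factory=dict)


@dataclass
class PartitionableSlot:
    free_cpus: int
    limit_caps: dict  # maximum concurrent use per named resource
    in_use: dict = field(default_factory=dict)
    running: list = field(default_factory=list)

    def try_start(self, job: Job) -> bool:
        """Carve CPUs out of the slot if the job fits and no cap is exceeded."""
        if job.cpus > self.free_cpus:
            return False
        for res, n in job.limits.items():
            # A resource with no configured cap is treated as cap 0 here.
            if self.in_use.get(res, 0) + n > self.limit_caps.get(res, 0):
                return False
        self.free_cpus -= job.cpus
        for res, n in job.limits.items():
            self.in_use[res] = self.in_use.get(res, 0) + n
        self.running.append(job.name)
        return True


if __name__ == "__main__":
    slot = PartitionableSlot(free_cpus=8, limit_caps={"license": 2})
    for job in [Job("a", 2, {"license": 1}), Job("b", 2, {"license": 1}),
                Job("c", 2, {"license": 1}), Job("d", 4)]:
        print(job.name, "started" if slot.try_start(job) else "held")
```

In this run jobs a, b and d share the one slot, while c is held because both licenses are in use even though CPUs are still free.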


Testing and developing solutions working together…

...Delivered in reference papers & certifications

Red Hat / HP White Paper:

[Chart: Throughput Memory Usage. X-axis: 1-GigE, 10-GigE, IPoIB, IB SDP, IB RDMA. Y-axis: 60 to 74. Series: cache, buff, free.]


Additional Information

www.redhat.com/mrg

www.hp.com/go/fsi