Your Innovations Inspire Our Innovations Dmitry Gryaznov Intel Ukraine October 2009

Hpc Day Oct 09

Page 1: Hpc Day Oct 09

Your Innovations Inspire Our Innovations

Dmitry Gryaznov, Intel Ukraine, October 2009

Page 2: Hpc Day Oct 09

* Other names and brands may be claimed as the property of others.    Copyright © 2009, Intel Corporation. 2

The Technical Computing Architecture

Optimizing the Time From Idea to Reality With a New Generation of Intelligent Processors

Innovation and Discovery

[Diagram: Create, Visualize, Simulate, and Analyze stages around Technical Computing segments: CAE/CFD, Weather, DCC, Life Science, Energy, Finance]

Page 3: Hpc Day Oct 09

Insatiable Demand for Performance, Density, and Efficiency

[Chart: Intel Power Reduction Over Time. Log-scale axis from 1.E+00 down to 1.E-07, years 1970 to 2010.]

[Chart: Your Demand For Performance. Axis from 100 MFlops to 1 ZFlops, years 1993 to 2029. Source: Top500.org]

~ 1 Million Factor Reduction In Energy per Transistor Over 30+ Years

Page 4: Hpc Day Oct 09

Intelligent Performance

Software Versatility

Ease of Deployment

Meeting Today’s HPC Challenges

Genomics Research

Medical Imaging

Weather Prediction

Oil Exploration

Design Simulation

Financial Analysis

Scaling Performance Forward

Page 5: Hpc Day Oct 09

Intelligent Performance

Up to 3x Performance Increase

Performance Optimized For Your Environment

Power Efficiency

Enabling You to Intelligently “Scale Your Performance Forward”

For notes and disclaimers, see legal information slide at end of this presentation.

Page 6: Hpc Day Oct 09

Intel® Xeon® 5500 (codename Nehalem – EP):

Putting More Brain Power into Your Cluster

Performance by Design
• Intel® QuickPath Interconnect
• Integrated memory controller

Intelligent Performance
• Intel® Turbo Boost Technology
• Hyper-Threading technology

Power Efficiency
• More power states
• Faster transition between power states
• Lower idle power

Driving Performance Through Multi-core Technology and Platform Enhancements

[Diagram: four cores with a shared L3 cache, QPI, and an Integrated Memory Controller – 3 Ch DDR3]

Page 7: Hpc Day Oct 09

Intel® Xeon® 5500 Platform

[Diagram: platform components: Intel® 5520 Chipset, PCI Express* 2.0, ICH 9/10, Intel® X25-E SSDs, Intel® 82599 10GbE Controller]

Platform Ready for Future 32nm Products

Up to 3X the performance over previous generation Intel® Xeon® 5400

Optimize your performance for diverse workloads

Lower TCO by providing more energy efficient higher performing solutions

Page 8: Hpc Day Oct 09

[Chart: relative performance against a Xeon 5400 series baseline for Energy, OpenMP, Weather, FEA, and CFD workloads]

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm Copyright © 2009, Intel Corporation. * Other names and brands may be claimed as the property of others.

Source: Published/submitted/approved results March 30, 2009. See backup for additional details

Relative Performance Higher is better

Intel® Xeon 5500: A New Generation of Intelligent Processors

Knows Where to Put the Speed, Knows How to Save Energy

Page 9: Hpc Day Oct 09

Advanced Processors for HPC: Selection Guidance

Processor tiers
Advanced SKUs (Fastest QPI, Fastest Memory): 8M cache, 6.4 GT/s QPI, DDR3 1333, HT, Turbo +3. X5570 (2.93 GHz), X5560 (2.80 GHz), X5550 (2.66 GHz)
Standard SKUs (Faster QPI, Faster memory): 8M cache, 5.86 GT/s QPI, HT, Turbo +2. E5540 (2.53 GHz), E5530 (2.40 GHz), E5520 (2.26 GHz)
Basic SKUs: 4M cache, 4.8 GT/s QPI. E5506 (2.13 GHz), E5504 (2.00 GHz), E5502 (1.86 GHz, 2C)

Memory Technology
DDR3 800: 19.2 GB/s, 144 GB. Memory requirement: Max Capacity. Example usage: Virtualized Environment
DDR3 1066: 25.5 GB/s, 96 GB. Memory requirement: Balanced Performance. Example usage: General Purpose Enterprise workloads
DDR3 1333: 32 GB/s, 48 GB. Memory requirement: Max Bandwidth. Example usage: HPC Technical Computing

SKU groups as listed on the slide
Group 1: E5506, E5540, E5530, E5520, X5570, X5560, X5550
Group 2: E5540, E5530, X5570, X5560, X5550
Group 3 (HPC): X5570, X5560, X5550
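The bandwidth figures quoted for DDR3 800, 1066, and 1333 are consistent with a simple peak-rate calculation: transfers per second, times 8 bytes per transfer on a 64-bit DDR3 channel, times the three channels of the integrated memory controller. A minimal Python sketch under that assumption follows; the function name and defaults are illustrative and not taken from the deck.

```python
def peak_bandwidth_gbs(data_rate_mts, channels=3, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth per socket in GB/s (10^9 bytes/s).

    Assumes each channel is 64 bits (8 bytes) wide and that all channels
    run at the same data rate, given in mega-transfers per second.
    """
    return data_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


# Data rates named on the slide; the computed peaks land close to the
# slide's rounded 19.2, 25.5, and 32 GB/s.
for rate in (800, 1066, 1333):
    print(f"DDR3-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

These are theoretical peaks. Sustained bandwidth on a real workload (for example, the Stream Triad runs listed in the configuration details) would be lower.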

Page 10: Hpc Day Oct 09

Step Function in Performance: Advanced SKUs Offer Significant Performance Gains

[Chart: SPECint_rate_base2006 and SPECfp_rate_base2006 benchmark scores (axis 0 to 250) by SKU, labeled GHz / QPI GT/s / memory MHz. Basic: E5504 (2.00/4.8/800), E5506 (2.13/4.8/800). Standard: E5520 (2.26/5.86/1066), E5530 (2.40/5.86/1066), E5540 (2.53/5.86/1066). Advanced: X5550 (2.66/6.40/1333), X5560 (2.80/6.40/1333), X5570 (2.93/6.40/1333). Maximum Performance: Turbo and HT “ON”.]

Page 11: Hpc Day Oct 09

Intel Technology is Changing HPC: TCO, Performance, Reliability

Extreme Performance
Power Efficient
Reduce System Cost
Increased Reliability

Solid State Disk: Optimize Performance for I/O Intensive Apps and Boot Drive Replacement

10GbE: Bridging the Gap Between 1GbE and InfiniBand®

SSD Proof Points

Intel IT evaluation results.

Page 12: Hpc Day Oct 09

Intel® Xeon® 5500 Putting More Brainpower into the Datacenter

Up to 7.8X Performance

Same Power, Same Space

[Chart: Intel® Xeon 5100 versus Intel® Xeon 5500, axis 0 to 200. Dual-core Intel® Xeon® 5160 Processor (Woodcrest) compared with New Intel® Xeon® 5500 Series With SSDs.]

Source: Intel internal measurements. Test configurations in backup.

Page 13: Hpc Day Oct 09

Nehalem-EX The next step in large scale HPC

– Up to 8 cores per socket

– Up to 24MB shared last level cache

– 4 high bandwidth QPI links / processor

– High memory bandwidth and capacity

Schedule: Target Q4’09 Production Availability

[Diagram: four Nehalem processors and two I/O hubs, each hub with PCI Express*]

High Performance

More Scalability, Cores, and Memory Capacity

Page 14: Hpc Day Oct 09

Delivering Versatility

Performance Gains Today

Optimize Application Performance

Develop Highly Portable and Parallel Software

Enabling You to Easily “Scale Your Performance Forward”

Page 15: Hpc Day Oct 09

Parallel Programming Challenge Ease of Use and Flexibility

Irregular Patterns, Data Structures, and Serial Algorithms

Scale to Multi-Core Today → Hard

Scale to Many-Core Tomorrow → Harder

Increasing Cores (2→64+ Cores)

Vector Instructions (4→8+ Wide)

Cache and Interconnect Latency

Page 16: Hpc Day Oct 09

Scaling Performance Forward One Development Environment – Multi- to Many-core

Simplify Your Development

Performance: Optimize/Tune

Confidence: Correctness

Insight: Architectural Analysis

Page 17: Hpc Day Oct 09

Ease of Deployment: Confidently Deploy and Manage a Cluster

Enabling You to Confidently “Scale Your Performance Forward”

EASY

Certified Cluster Configurations

Intel® Cluster Checker to Validate

Simplification

Application Interoperability “Out of Box” Experience

Page 18: Hpc Day Oct 09

ICR – Intel® Cluster Ready

Simplifying Your Cluster

What is ICR? A specification to help OEMs and PIs manufacture HPC clusters based upon the Intel architecture

Simplify Purchasing with certified cluster configurations
Simplify Manufacturing with defined recipes and Intel® Cluster Checker to validate
Simplify Deployment with registered applications
Simplify Management with Intel® Cluster Checker

Registered ISV/Apps = 18/53

Certified OEM/Platforms = 21/89

Page 19: Hpc Day Oct 09

The Future Looks Bright

Breakthrough Technology Year After Year

45nm
– Intel® Xeon 5400
– Intel® Xeon 5500
– Nehalem EX

32nm
– Westmere – more cores
– Sandy Bridge – higher integration

22nm: Continue to deliver world class processor technology

Future: A leap ahead in technology

Silicon and Software Tools Unleash Performance

Page 20: Hpc Day Oct 09

Solving Your HPC Challenges

– Certified cluster configurations to simplify cluster deployment

– Use Intel® Cluster Checker to validate configurations: ensure a highly reliable solution

– Easily optimize application performance and eliminate the need to increase software resources

– Develop highly portable, parallel software

– Up to 3x performance gains to decrease your time to discovery

– Improved power technology, more efficient data for a lower TCO

Intelligent Performance

Software Versatility

Ease of Deployment

Scaling Performance Forward

Page 21: Hpc Day Oct 09

BACK UP

Page 22: Hpc Day Oct 09

Dual Core Performance Refresh: Data Center perf. optimization with Intel® Xeon 5500 (Nehalem-EP)

2006: 1,000 servers, Dual core Intel Xeon® 5160 Processor (WDC)

2009: 1,000 servers, New Intel Xeon® 5500 series

Performance BENEFIT over WDC, SPECfp_rate_base2006: 4.16x

Business BENEFITS: Up to 4X the performance; up to 14% less power. Save 863,000 kWh per year.

[Chart: WDC versus NHM, axis 0 to 200]

Source: Intel estimates and measurements as of Nov 2008. Performance comparison using SPECfp_rate_base2006. Use this slide in conjunction with backup slide. Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.

Source: Intel internal measurements. Test configurations in backup.

Page 23: Hpc Day Oct 09

Dual Core Performance Refresh: Data Center perf. optimization with Intel® Xeon® 5500 (Nehalem-EP)

2006: 1,000 servers, Dual core Intel® Xeon® 5160 Processor (WDC)

2009: 1,215 servers, New Intel® Xeon® 5500 series with SSDs

Performance BENEFIT over WDC, SPECfp_rate_base2006: 5.06x

Up to 5X Performance*, Same Power Envelope (*without any benefit from SSDs)

[Chart: WDC versus NHM, axis 0 to 200]

Source: Intel estimates and measurements as of Nov 2008. Performance comparison using SPECfp_rate_base2006. Use this slide in conjunction with backup slide.

Source: Intel internal measurements. Test configurations in backup.

Page 24: Hpc Day Oct 09

Improving Performance Efficiency

Intelligent Power Evolution

[Chart: More Operating States, Lower CPU Idle Power (W), and Faster Transitions (msec), each marked “Up to 5X”, comparing 2006 Xeon 5300 Series, 2007-2008 Xeon 5400 Series, and 2009 Xeon 5500 Series. Data labels as extracted: 3, 50, 10, 15, 10, 2.]

† Xeon® 5300 series data based on Xeon® X5365 SKU (B-3 stepping), Xeon® 5400 series based on Xeon® X5470 (E-0 stepping), and Xeon® 5500 based on Xeon® W5580 (D-0 stepping). Number of operating states includes all frequency operating points, including Turbo Boost and base frequency. Idle power based on C6 idle power for Xeon® 5500, and C1E for Xeon® 5300 and 5400 SKUs. C6 also requires OS support and may vary by SKU. Faster transitions based on Package C1E exit transition latency.

Page 25: Hpc Day Oct 09

Extending Performance with SSDs

HPC Opportunities

Usage Models
• Hard Disk Drive Replacement; I/O intense apps
• Boot Drive Replacement

Benefits
• Lower Latency and Higher Throughput: Faster Access to Data
• New Levels of Reliability & Mgmt: Lower TCO
• Less Power and Smaller Footprint: Energy & Space Savings

Unparalleled IOPs with Solid State Disks

Page 26: Hpc Day Oct 09

The Truth of Laws & Observations

Growing the problem size may mitigate the impact of Amdahl’s Law, but ONLY if the serial fraction doesn’t grow in proportion to the problem size.

Amdahl’s Law

[Diagram: 70% Parallel, 30% Serial; the split stays the same as the problem grows]

If Serial Component Remains Proportionately Equal, There Is No Inherent Speed Up Factor Available

If parallel component is 50x faster the max speed up is 3.25X

Gustafson’s Observation

[Diagram: 70% Parallel, 30% Serial, shifting to 95% Parallel, 5% Serial as the problem grows]

If the Serial Component Shrinks in Size as the Problem Expands, There Is Significant Speed Up Opportunity Available

If parallel component is 50x faster the max speed up is 18.26X
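The two cases above can be explored with the textbook form of Amdahl's Law, speedup = 1 / (serial + parallel / gain). The sketch below is a minimal Python version of that textbook formula, not a reconstruction of the slide's own arithmetic; the function name is illustrative. For a 50x faster parallel part it yields roughly 3.18x at 70% parallel and roughly 14.5x at 95% parallel, which differs from the 3.25X and 18.26X quoted on the slide, so the slide presumably rests on assumptions that are not stated in the extracted text.

```python
def amdahl_speedup(parallel_fraction, parallel_gain):
    """Overall speedup when only the parallel fraction runs parallel_gain times faster.

    Textbook Amdahl's Law: the serial fraction is unaffected, so it bounds
    the achievable speedup at 1 / serial_fraction.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_gain)


# The two splits shown on the slide, each with a 50x faster parallel part.
for p in (0.70, 0.95):
    print(f"{p:.0%} parallel, 50x faster: {amdahl_speedup(p, 50):.2f}x")
```

The comparison illustrates the slide's point: shrinking the serial share from 30% to 5% raises the ceiling far more than any further increase in the parallel gain could.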

Page 27: Hpc Day Oct 09

Intel® Xeon® 5500: HPC Leading Capability

– See Intel® Xeon® 5500: An Advance in HPC performance slides (previous 2 slides for turbo and HT details)

Feature | Today | Nehalem-EP | Benefit
Platform: Peak CPU-Chipset BW | 21 GB/s (1333MHz) | 46.1 GB/s (6.4 GT/s) | Up to 2.2x
Platform: Peak Mem BW | 21 GB/s (FBD-667) | 32 GB/s (DDR3-1333) | Up to 1.5x/CPU
Platform: Max Memory Capacity | 64/128 GB (FBD) | 144 GB (DDR3) | Up to 2x*
CPU: Turbo Boost | No | Yes | Performance on demand based on SW needs
CPU: Hyper-Threading | No | Yes | Up to 16 threads for a DP system
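The "Benefit" figures for the two bandwidth rows and the Hyper-Threading row can be checked with simple arithmetic on the values in the feature comparison. A short Python sketch follows; the variable names are illustrative, and the thread count assumes a dual-processor system with quad-core parts and two hardware threads per core.

```python
# Values copied from the feature comparison: (today, Nehalem-EP) in GB/s.
comparisons = {
    "Peak CPU-chipset bandwidth": (21.0, 46.1),
    "Peak memory bandwidth": (21.0, 32.0),
}

# Ratio of the Nehalem-EP figure to the previous-generation figure.
ratios = {name: new / old for name, (old, new) in comparisons.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")

# Assumption: a DP (dual-processor) system with quad-core processors and
# two hardware threads per core when Hyper-Threading is enabled.
sockets, cores_per_socket, threads_per_core = 2, 4, 2
hardware_threads = sockets * cores_per_socket * threads_per_core
print(f"Hardware threads: {hardware_threads}")
```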

Page 28: Hpc Day Oct 09

2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms: Intel® X25-E Extreme SSD Performance Comparison using ANSYS® Mechanical™ 12.0 Preview 7

Relative Performance (higher is better), ANSYS® Mechanical™ 12.0 Preview 7:
Quad-Core Intel Xeon X5482/1600 FSB, HDD RAID0: 1.00
Quad-Core Intel Xeon X5570/6.4 MT, HDD RAID0: 1.25
Quad-Core Intel Xeon X5570/6.4 MT, SSD RAID0: 1.74 (up to 74%)

Quad-Core Intel® Xeon X5570 with Intel® X25-E Extreme SSD is 74% faster than previous quad core processor

Data Source: Approved/published results as of March 30, 2009.

• ISV Application Description

ANSYS 12.0 software is a comprehensive multiphysics tool combining structural, thermal, fluids, acoustic and electromagnetic simulation capabilities in a single engineering software solution. Its comprehensive range of physical models can be applied to simulation-based product development in a broad range of industries and applications.

• Benchmark description

The benchmark uses a FEA model with 1.5 million degrees of freedom to extract 50 dynamic mode frequencies and mode shapes using block Lanczos solver. The workload is IO-intensive with limited scaling. The results are based on 4-process parallel execution; see backup slides for details.

Page 29: Hpc Day Oct 09

2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms: Intel® X25-E Extreme SSD Performance Comparison using MD Nastran benchmarks

Relative Performance (higher is better), MD Nastran R3:
Quad-Core Intel Xeon X5482/1600 FSB, HDD RAID0: 1.00
Quad-Core Intel Xeon X5570/6.4 MT, HDD RAID0: 1.33
Quad-Core Intel Xeon X5570/6.4 MT, SSD RAID0: 1.60 (up to 60%)

Quad-Core Intel® Xeon X5570 with Intel® X25-E Extreme SSD is 60% faster than previous quad core processor

Data Source: Approved/published results as of March 30, 2009.

• ISV Application Description

MD Nastran R3 combines best-in-class solver technologies - Nastran, Marc, Dytran, Adams, and LS-Dyna - into one, fully-integrated, multidiscipline simulation solution for the manufacturing enterprise allowing manufacturers to perform interoperable, multidisciplinary analyses on complex models.

• Benchmark description

MD Nastran benchmarks representing 5 solution sequences including static analysis, normal modes analysis with/without ACMS, direct frequency response, modal frequency response and non-linear analysis using serial, SMP, and DMP execution.

Page 30: Hpc Day Oct 09

LEGAL DISCLAIMERS

Page 31: Hpc Day Oct 09

Nehalem-EP Performance Comparison to Previous Generation 5400 Series on Server and HPC Benchmarks – Config Details

Data source: Intel Internal measurements – November 2008

Benchmark Specific Details (All data based on Intel internal measurements, February 2009)

Benchmark | OS | Memory | Other Software & Hardware details
SPECint*_rate_base2006, SPECfp*_rate_base2006 | Suse Linux 10-64bit | Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz; Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz; Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz | SPEC binaries built with Intel Compiler 11.0 for 32-bit/64-bit Linux. HT ON for Nehalem-EP. Turbo mode disabled
TPC*-C – Oracle*; TPC*-C – SQLServer | RedHat Linux OS; Microsoft Windows Server 2003 | Xeon 5400 Server: 64GB (16x4GB) FB DDR2-667; Nehalem-EP: 288GB memory simulated using 72GB (18x4GB) DDR3-800 MHz. Result recalibrated for 144GB | Oracle* 11g. HT ON for Nehalem-EP. Turbo mode disabled.; Microsoft SQLServer* 2005. HT ON for Nehalem-EP. Turbo mode disabled.
TPC*-H | Microsoft Windows Server 2008 | Xeon 5400 Server: 64GB (16x4GB) FB DDR2-667MHz; Nehalem-EP: 72GB (18x4GB) DDR3-800MHz | Microsoft SQLServer 2008 RTM; HT ON, Turbo mode disabled
SAP-SD* | Suse Linux 10-64bit | Xeon 5400 Server: 32GB (8x4GB) FB DDR2-667MHz; Nehalem-EP: 48GB (12x4GB) DDR3-1066MHz | SAP* 2-Tier SD benchmark. ECC 5.0 Version. Oracle database. HT ON for Nehalem-EP. Turbo mode enabled
SPECjbb*2005 | Various | Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz; Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz | 4 JVM instances for HTN and 2 JVM instances on Nehalem-EP. HT ON for Nehalem-EP. Turbo mode enabled
SPECjvm*2008 | Various | Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz; Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz | Baseline result. 1 JVM instance. HT ON for Nehalem-EP. Turbo mode enabled
SPECweb*2005 | Microsoft Windows Server 2008 | Nehalem-EP: 18GB (18x1GB) DDR3-800MHz | IIS7 with Zend PHP Isapi Dll 5.0; HT ON vs OFF study
vConsolidate | vCon 2.0 - Profile 2 | Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz; Nehalem-EP: 48GB (12x4GB) DDR3-1066MHz | Vmware ESX 3.5 for Xeon 5400; Vmware ESX 4.0 Beta 1 for Nehalem-EP. HT ON, Turbo mode disabled.
All HPC applications | Red Hat EL5-U3 Beta - 64-bit | Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz; Nehalem-EP: 24GB (12x2GB) DDR3-1066MHz | All benchmarks run with 8 process. HT ON for Nehalem-EP. Turbo mode disabled
Linpack | Red Hat EL5-U2 Beta - 64-bit | Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz; Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz | Intel® SMP LINPACK 10.0.4 (Linux) for HTN. Intel® SMP LINPACK 10.1 Beta 2 (Linux) for Nehalem-EP; HT OFF
Stream | Red Hat EL5-U1 64-bit | Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz; Nehalem-EP: All memory configurations were run | 8 Copies. Stream Triad used for comparison. HT OFF for Nehalem-EP

Benchmark comparisons for HT ON vs OFF, Turbo ON vs OFF shown were measured using the same platform configuration as above. Comparisons across different Nehalem-EP skus were measured on the same platform using the above configuration.

Xeon 5400 Server platform common configuration details: Super Micro server platform X7DB3 with two Quad-Core Intel Xeon processor X5460 (HTN 3.16GHz) or X5470 (HTN 3.33GHz) with 2x6M L2 Cache, 1333 MHz system bus, Blackford Chipset

Xeon 5400 HPC platform common configuration details: Super Micro server platform X7DWA-N with two Quad-Core Intel Xeon processor E5472 (HTN 3.0GHz) or X5482 (HTN 3.20GHz) with 2x6M L2 Cache, 1600 MHz system bus, Seaburg Chipset

Nehalem-EP platform common configuration details: Intel server pre-production SuperMicro platform with two Quad-Core Nehalem-EP processor, 2.93GHz with 8M L3 Cache, 6.4QPI, Tylesburg-EP Chipset. (SPECcpu2006 measured on “Green City” platform)

Page 32: Hpc Day Oct 09

Intel® Xeon 5500: A New Generation of Intelligent Processors

System Configuration Information

– All comparisons based on published/submitted/approved results as of March 30, 2009
– Fluent: Comparison based on published/submitted results to www.fluent.com/software/fluent/fl6bench/fl6bench_6.4.x/index.htm as of March 30, 2009. All comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers.
  – Baseline Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*. Performance measured using Fluent Version 12.0 Beta. (Version 12.0.13)*. Six individual benchmarks are shown as a measure of single node performance. "Overall" performance is the geometric mean of the six individual benchmarks.
  – Intel® Xeon® processor X5570 based platform details: SGI Altix ICE 8200EX* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*. Performance measured using Fluent Version 12.0 Beta. (Version 12.0.9) Six individual benchmarks are shown as a measure of single node performance. "Overall" performance is the geometric mean of the six individual benchmarks.
  – Quad-Core AMD Opteron* processor model 2384 platform based details: Server platform with two AMD Opteron 2384 processor 2.7GHz, 6MB L3 cache, Linux OS. Performance measured using Fluent Version 12.0 Beta. (Version 12.0.7) Six individual benchmarks are shown as a measure of single node performance. "Overall" performance is the geometric mean of the six individual benchmarks.
– SPECompM2001
  – Baseline Intel® Xeon® processor E5472 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors E5472 3.0GHz, 12MB L2 cache, 1600MHz FSB, 32GB memory (8x4GB 800MHz DDR2 FB-DIMM), SUSE LINUX 10.1* (X86-64) (Linux 2.6.16.13-4-smp). Binaries built with Intel Compiler 10.1. Referenced as published at 17187. (SPECompMbase2001). For more information see http://www.spec.org/omp/results/res2007q4/omp2001-20071107-00274.html.
  – Intel® Xeon® processor X5570 based platform details: Cisco B-200 M1 server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, 6.4GT/s QPI, 24 GB memory (6x4 GB DDR3-1333MHz), Red Hat EL 5.3, Linux Kernel 2.6.18-128.el5 SMP x86_64, Binaries built with Intel® C/C++ Compiler 11.0 for Linux. Result submitted to www.spec.org for review at 43593 (SPECompMbase2001) as of March 30, 2009.
  – Quad-Core AMD Opteron processor 2384 based platform* details: Supermicro H8DMU Server platform* with two Quad-Core AMD Opteron processors 2386SE* 2.80GHz, 6MB L3 cache, 16GB memory (8x2GB, PC2-6400, Reg, dual-rank CL5), SUSE Linux Enterprise Server 10 64-bit, Binaries built with PathScale Compiler Suite*, Release 3.1. Referenced as published at 22678 (SPECompMbase2001). For more information see http://www.spec.org/omp/results/res2008q4/omp2001-20081021-00320.html.
– Multiphysics Finite Element Analysis using ANSYS* - Comparison based on published/submitted results to www.ansys.com/services/hardware-support-db.htm as of March 30, 2009.
  – Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.1*. Performance measured using ANSYS* Mechanical* 12.0 Preview 7. Benchmark for Ansys-Shared* consists of a suite of 8 workloads and Ansys-Distributed* consists of a suite of 7 workloads. Geo mean of each of these workload groups used for comparison.
  – Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using ANSYS* Mechanical* 12.0 Preview 7. Benchmark for Ansys-Shared consists of a suite of 8 workloads and Ansys-Distributed consists of a suite of 7 workloads. Geo mean of each of these workload groups used for comparison.
– MM5 v4.7.4 - t3a and WRF v3.0.1 - 12km CONUS: Comparison based on measured results as of March 30, 2009. All comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers. Same platform used for both benchmarks.
  – Baseline Intel® Xeon® processor X5482 based platform details: SGI Altix ICE 8200EX* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*.
  – Intel® Xeon® processor X5570 based platform details: SGI Altix ICE 8200EX* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*.
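Several of the configurations above report an "Overall" or group score as the geometric mean of individual benchmark results. A minimal Python sketch of that aggregation follows; the function name and the six sample ratios are hypothetical and are not taken from the deck.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical relative-performance ratios for six benchmarks (illustration only).
example_ratios = [1.8, 2.1, 2.4, 1.9, 2.6, 2.2]
overall = geometric_mean(example_ratios)
print(f"Overall: {overall:.2f}x")
```

The geometric mean is the usual choice for averaging performance ratios because a single outlier benchmark moves it less than it would move an arithmetic mean.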

Page 33: Hpc Day Oct 09

* Other names and brands may be claimed as the property of others.    Copyright © 2009, Intel Corporation. 33

– All comparisons based on published/submitted/approved results as of March 30, 2009.

– Reservoir Simulation using Schlumberger Eclipse*

– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*, and Eclipse version 2008.1 software.

– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4* MT/sec, 24GB (12x2GB 1066MHz DDR3) memory, 64-bit RedHat Enterprise Linux 5.3. Eclipse version 2008.1 software.

– Reservoir Simulation using Landmark Nexus*

– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.2*. Landmark Nexus R5000 software*.

– Intel® Xeon® processor X5560 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5560 2.80GHz, 8MB L3 cache, QPI 6.4 MT/sec, 12GB memory, 64-bit RedHat Enterprise Linux 5.3. Landmark Nexus R5000 software.

– Reservoir Simulation using CMG* IMEX*

– Intel® Xeon® processor X5482 based platform details: Dell Precision T7400 platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 32GB memory, RedHat Enterprise Linux 5.2* (64-bit). CMG IMEX, Version 2008.11.

– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4* MT/sec, 18GB memory, RedHat Enterprise Linux 5.3 (64-bit). CMG IMEX, Version 2008.11.

– Computational Fluid Dynamics analysis using Star-CD* (Single Node): Comparison based on published/submitted results to http://www.cd-adapco.com/products/STAR-CD/performance/406/index.html as of March 30, 2009. All comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers.

– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*. Performance measured using STAR-CD v4.06. Same configuration used for both benchmark results: A-Class and C-Class.

– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI* 6.4 MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using STAR-CD v4.06. Same configuration used for both benchmark results: A-Class and C-Class.

– Crash Simulation analysis using LS-DYNA* (Single Node): Comparison based on published/submitted results to http://www.topcrunch.org/ as of March 30, 2009. All comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers.

– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux* 5.3. Performance measured using LS-DYNA mpp971.s.R321. Same configuration used for all three benchmark results - neon_refined_revised, 3 vehicle collision, car2car.

– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using LS-DYNA mpp971.s.R321. Same configuration used for all three benchmark results - neon_refined_revised, 3 vehicle collision, car2car.

– SPECompL2001

– Baseline Intel® Xeon® processor E5472 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB, 32GB memory (8x4GB 800MHz DDR2 FB-DIMM), SUSE LINUX 10.1* (X86-64) Intel Compiler 11.0. Submitted to www.spec.org for review at 81332 as of March 30, 2009.

– Intel® Xeon® processor X5570 based platform details: Cisco B-200 M1 server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, 6.4GT/s QPI, 24 GB memory (6x4 GB DDR3-1333MHz), Red Hat EL 5.3, Linux Kernel 2.6.18-128.el5 SMP x86_64, Binaries built with Intel® C/C++ Compiler 11.0 for Linux. Result submitted to www.spec.org for review at 234,996 (SPECompMbase2001) as of March 30, 2009.


Intel® Xeon 5500: A New Generation of Intelligent Processors

System Configuration Information

Page 34: Hpc Day Oct 09


Step Function in Performance

NHM-EP SPEC CPU2006 benchmark preliminary results (Read disclaimers below)

Nehalem            Processor   SPECint_rate_base2006   SPECfp_rate_base2006   Memory details                 Compiler
1.86/4.8/800 DC    E5502       No data                 No data                24GB (12x2GB) DDR3-800MHz      Intel Compiler 11.0
2.00/4.8/800       E5504       125                     110                    24GB (12x2GB) DDR3-800MHz      Intel Compiler 11.0
2.13/4.8/800       E5506       130                     113                    24GB (12x2GB) DDR3-800MHz      Intel Compiler 11.0
2.26/5.86/1066     E5520       185                     154                    24GB (12x2GB) DDR3-1066MHz     Intel Compiler 11.0
2.40/5.86/1066     E5530       192                     158                    24GB (12x2GB) DDR3-1066MHz     Intel Compiler 11.0
2.53/5.86/1066     E5540       199                     161                    24GB (12x2GB) DDR3-1066MHz     Intel Compiler 11.0
2.66/6.40/1333     X5550       225                     185                    24GB (6x4GB) DDR3-1333MHz      Intel Compiler 11.0
2.80/6.40/1333     X5560       230                     188                    24GB (6x4GB) DDR3-1333MHz      Intel Compiler 11.0
2.93/6.40/1333     X5570       235                     190                    24GB (6x4GB) DDR3-1333MHz      Intel Compiler 11.0

Numbers were measured using Intel Compiler 11.0 binaries from Oct 2008.

Final numbers (with minor variations from above) will be based on newer binaries.

“Peak” and “SPEED” results are WIP as well. All results are with HT ON and Turbo ON. Jan 26, 2009. Data Source: Kuppuswamy Sivakumar, Intel Corporation, SPG Marketing

Disclaimers: All NHM-EP numbers are preliminary. Numbers in Red are estimates. Others are measured.

Page 35: Hpc Day Oct 09


2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms SSD Performance Comparison using ANSYS® Mechanical™ 12 P7 benchmarks

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104. Copyright © 2007, Intel Corporation. * Other names and brands may be claimed as the property of others.

Test System Configuration and Results

                                            Quad Core Intel® Xeon® 5482     Quad Core Intel® Xeon® 5570     Quad Core Intel® Xeon® 5570
                                            (Harpertown) 3.2/1600           (Nehalem) 2.93/6.4              (Nehalem) 2.93/6.4
System Baseboard                            Supermicro X7DB8+               Supermicro X8DTN+               Supermicro X8DTN+
Processors                                  Intel Xeon 5482                 Intel Xeon 5570                 Intel Xeon 5570
number/type sockets                         2 Quad-core                     2 Quad-core                     2 Quad-core
core frequency                              3.2 GHz                         2.93 GHz                        2.93 GHz
LL cache size                               2x 6144 KB                      8192 KB                         8192 KB
Chipset FSB/QPI                             Seaburg 1600 MT/s               Tylersburg 6400 MT/s            Tylersburg 6400 MT/s
Memory                                      16 GB                           24 GB                           24 GB
DIMMS                                       8x2 GB FBD                      12x2GB DDR3                     12x2GB DDR3
memory speed                                800 MHz                         1067 MHz                        1067 MHz
I/O Subsystem                               4x 15K RPM U320 SCSI RAID0      4x 15K RPM U320 SCSI RAID0      4x SLC SSD RAID0
Operating System                            64-bit Red Hat EL5U1            64-bit Red Hat EL5U3            64-bit Red Hat EL5U3
Elapsed time in seconds (lower is better)   8124                            6522                            4680
Relative performance (higher is better)     1                               1.25                            1.74
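
The relative-performance row follows directly from the elapsed times: each value is the baseline (Xeon 5482) time divided by that configuration's time. A minimal Python sketch of that arithmetic, where the function name and dictionary keys are illustrative labels rather than anything taken from the slide:

```python
def relative_performance(baseline_seconds, elapsed_seconds):
    """Speedup versus the baseline run; higher is better."""
    return baseline_seconds / elapsed_seconds


# Elapsed times in seconds, copied from the table above.
elapsed = {
    "Xeon 5482, SCSI RAID0": 8124,
    "Xeon 5570, SCSI RAID0": 6522,
    "Xeon 5570, SSD RAID0": 4680,
}

baseline = elapsed["Xeon 5482, SCSI RAID0"]
relative = {
    name: round(relative_performance(baseline, seconds), 2)
    for name, seconds in elapsed.items()
}

for name, value in relative.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, this reproduces the 1, 1.25 and 1.74 figures reported on the slide.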

Page 36: Hpc Day Oct 09


2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms Intel® X25-E Extreme SSD Performance Comparison using MD Nastran benchmarks


Test System Configuration and Results

                                            Quad Core Intel® Xeon® 5482         Quad Core Intel® Xeon® 5570         Quad Core Intel® Xeon® 5570
                                            (Harpertown) 3.2/1600               (Nehalem) 2.93/6.4                  (Nehalem) 2.93/6.4
System Baseboard                            Supermicro X7DB8+                   Supermicro X8DTN+ IN001 Rev 1.02    Supermicro X8DTN+ IN001 Rev 1.02
Processors                                  Intel Xeon 5482                     Intel Xeon 5570                     Intel Xeon 5570
number/type sockets                         2 Quad-core                         2 Quad-core                         2 Quad-core
core frequency                              3.2 GHz                             2.93 GHz                            2.93 GHz
LL cache size                               2x 6144 KB                          8192 KB                             8192 KB
Chipset FSB/QPI                             Seaburg 1600 MT/s                   Tylersburg 6400 MT/s                Tylersburg 6400 MT/s
Memory                                      32 GB                               24 GB                               24 GB
DIMMS                                       8x4 GB FBD                          12x2GB DDR3                         12x2GB DDR3
memory speed                                800 MHz                             1067 MHz                            1067 MHz
I/O Subsystem                               4x 15K RPM U320 SCSI RAID0          4x 15K RPM U320 SCSI RAID0          4x Intel® X25-E Extreme SATA Solid-State Drive RAID0
Operating System                            64-bit Red Hat EL5U1                64-bit Red Hat EL5U3                64-bit Red Hat EL5U3
Geomean for 12 workloads (lower is better)  2838.52                             2137.04                             1772.29
Relative performance (higher is better)     1.00                                1.33                                1.60
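
One way to read the three MD Nastran columns is as two successive steps: moving from the Xeon 5482 system to the Xeon 5570 system on the same disk type, and then swapping the SCSI array for SSDs. The sketch below splits the overall gain into those two factors. The split and the variable names are an interpretation, not something the slide states, and the first factor lumps together every difference between the two systems (processor, memory, chipset, OS update).

```python
def speedup(before, after):
    """Ratio of two lower-is-better scores; values above 1 mean 'after' is faster."""
    return before / after


# Geomean values for the 12 workloads, copied from the table above.
geomean = {
    "harpertown_hdd": 2838.52,
    "nehalem_hdd": 2137.04,
    "nehalem_ssd": 1772.29,
}

# Step 1: platform change with the SCSI RAID0 array kept in place.
platform_gain = speedup(geomean["harpertown_hdd"], geomean["nehalem_hdd"])
# Step 2: storage change on the Xeon 5570 platform.
storage_gain = speedup(geomean["nehalem_hdd"], geomean["nehalem_ssd"])
# The two steps multiply to the overall figure.
total_gain = platform_gain * storage_gain

print(f"platform: {platform_gain:.2f}x")
print(f"storage:  {storage_gain:.2f}x")
print(f"total:    {total_gain:.2f}x")
```

Under this reading, the platform step accounts for the reported 1.33 and the storage step contributes roughly a further 1.21, which together give the reported 1.60.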

Page 37: Hpc Day Oct 09


Dual Core Performance Refresh Calculation Details

Intel Estimated (1,000 to 1,000)

                          2006                                            2009                                            Delta / Notes
Product                   Intel® Xeon® 5160 Processor (3.00GHz)           Intel Xeon 5500 series (2.93GHz)
Performance per Server    45.1 SPECfp_rate_base2006                       188 SPECfp_rate_base2006                        Up to 4.16x per server
Server Power (Watts)      365W active / 240W idle                         329W active / 125W idle                         Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed          1,000                                           1,000
# Racks needed            48 racks                                        48 racks                                        Same # of Racks
Total Perf                45,100 total SPECfp_rate_base2006 Performance   188,000 total SPECfp_rate_base2006 Performance  Up to 4.16X performance boost
Annual kWh                6,046,320                                       5,182,560                                       Estimated 14% lower energy utilization
Annual Energy Costs       $604,632                                        $518,256                                        $86,376 less electricity costs per year. Assumes $0.10/kWh and 2x cooling factor

Annual Cost Savings of $

Cost of new HW n/a $86,376
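
The annual energy and cost figures can be reproduced from the stated duty cycle (20 hours active, 4 hours idle), the 2.0 PUE and the $0.10/kWh price. The following Python sketch is a reconstruction, not the original spreadsheet: the function and constant names are made up for illustration, and the 366-day year is an assumption chosen because it matches the slide's totals exactly (365 days gives slightly lower numbers).

```python
HOURS_ACTIVE = 20      # per day, from the slide
HOURS_IDLE = 4         # per day, from the slide
DAYS_PER_YEAR = 366    # assumption: this value reproduces the slide's totals
PUE = 2.0              # cooling factor, from the slide
PRICE_PER_KWH = 0.10   # USD, from the slide


def annual_kwh(active_watts, idle_watts, servers, pue=PUE, days=DAYS_PER_YEAR):
    """Yearly facility energy in kWh for a fleet of identical servers."""
    daily_wh = active_watts * HOURS_ACTIVE + idle_watts * HOURS_IDLE
    return daily_wh * days * servers * pue / 1000


old_kwh = annual_kwh(365, 240, servers=1000)   # Xeon 5160 fleet
new_kwh = annual_kwh(329, 125, servers=1000)   # Xeon 5500 series fleet

savings_usd = round((old_kwh - new_kwh) * PRICE_PER_KWH)
reduction_pct = (1 - new_kwh / old_kwh) * 100

print(f"2006 fleet: {old_kwh:,.0f} kWh")
print(f"2009 fleet: {new_kwh:,.0f} kWh")
print(f"Reduction:  {reduction_pct:.1f}%")
print(f"Savings:    ${savings_usd:,} per year")
```

With these inputs the sketch yields 6,046,320 kWh and 5,182,560 kWh, a reduction of about 14%, and $86,376 in annual savings, in line with the slide.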

Page 38: Hpc Day Oct 09


Dual Core Performance Refresh Calculation Details

Intel Estimated (1,000 servers w/ HDD to 1,215 w/ SSD)

                          2006                                            2009                                            Delta / Notes
Product                   Intel® Xeon® 5160 Processor (3.00GHz)           Intel Xeon 5500 series (2.93GHz)
Performance per Server    45.1 SPECfp_rate_base2006                       188 SPECfp_rate_base2006                        Up to 5x per server
Server Power (Watts)      365W active / 240W idle                         316W active / 117W idle                         Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed          1,000                                           1,215                                           Using SSDs and ½ size boards
# Racks needed            48 racks                                        48 racks                                        Same # of Racks
Total Perf                45,100 total SPECfp_rate_base2006 Performance   188,000 total SPECfp_rate_base2006 Performance  Up to 5X performance boost
Annual kWh                6,046,320                                       6,044,440                                       Similar power requirements
Annual Energy Costs       $604,632                                        $604,444                                        Approximately the same power cost

Page 39: Hpc Day Oct 09


Dual Core Performance Refresh Calculation Details

Intel Estimated (1,000 servers w/ HDD to 1,869 w/ SSD and optimized data center, PUE 2.0 to 1.3)

                          2006                                            2009                                            Delta / Notes
Product                   Intel® Xeon® 5160 Processor (3.00GHz)           Intel Xeon 5500 series (2.93GHz)
Performance per Server    45.1 SPECfp_rate_base2006                       188 SPECfp_rate_base2006                        Up to 7.8x per server
Server Power (Watts)      365W active / 240W idle                         316W active / 117W idle                         Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed          1,000                                           1,869                                           Using SSDs and ½ size boards
# Racks needed            48 racks                                        48 racks                                        Same # of Racks
Total Perf                45,100 total SPECfp_rate_base2006 Performance   188,000 total SPECfp_rate_base2006 Performance  Up to 7.8X performance boost
Annual kWh                6,046,320                                       6,043,690                                       Similar power requirements
Annual Energy Costs       $604,632                                        $604,369                                        Approximately the same power cost
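
The slides do not spell out how the 1,869-server count or the "up to 5X" and "up to 7.8X" figures were derived. The sketch below shows one reading that is consistent with the published numbers: at a fixed facility energy budget, lowering PUE from 2.0 to 1.3 lets the 1,215-server fleet grow by the ratio 2.0/1.3, and the boost is the new fleet's aggregate SPECfp_rate_base2006 divided by the old fleet's. Function names are hypothetical and the derivation is an assumption, not Intel's stated method.

```python
def servers_at_same_energy(servers, pue_before, pue_after):
    """Servers that fit in the same facility energy budget after a PUE change."""
    return int(servers * pue_before / pue_after)


def performance_boost(old_servers, old_perf, new_servers, new_perf):
    """Ratio of aggregate throughput between the new and old fleets."""
    return (new_servers * new_perf) / (old_servers * old_perf)


OLD_SERVERS, OLD_PERF = 1000, 45.1   # Xeon 5160, per-server score from the slides
NEW_PERF = 188                       # Xeon 5500 series, per-server score from the slides

ssd_fleet = 1215                                             # from the SSD slide
optimized_fleet = servers_at_same_energy(ssd_fleet, 2.0, 1.3)

boost_ssd = performance_boost(OLD_SERVERS, OLD_PERF, ssd_fleet, NEW_PERF)
boost_optimized = performance_boost(OLD_SERVERS, OLD_PERF, optimized_fleet, NEW_PERF)

print(f"Servers at PUE 1.3: {optimized_fleet}")
print(f"Boost with {ssd_fleet} servers: {boost_ssd:.2f}x")
print(f"Boost with {optimized_fleet} servers: {boost_optimized:.2f}x")
```

Under these assumptions the server count comes out at 1,869 and the boosts at roughly 5.1x and 7.8x, close to the figures quoted on the slides.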