Copyright HyperTransport Consortium 2009. HyperTransport Extending Technology Leadership. International HyperTransport Symposium 2009, February 11, 2009. Mario Cavalli, General Manager, HyperTransport Technology Consortium.

HyperTransport Extending Technology Leadership

Page 1: HyperTransport Extending Technology Leadership

HyperTransport Extending Technology Leadership

International HyperTransport Symposium 2009
February 11, 2009

Mario Cavalli, General Manager

HyperTransport Technology Consortium

Page 2: HyperTransport Extending Technology Leadership

HyperTransport and Consortium Snapshot

Industry Status and Trends

HyperTransport Leadership Role

February 11, 2009

Mario Cavalli, General Manager

HyperTransport Technology Consortium

HyperTransport Extending Technology Leadership

Page 3: HyperTransport Extending Technology Leadership

HyperTransport Snapshot

Low Latency, High Bandwidth, High Efficiency

Point-to-Point Interconnect Leadership

CPU-to-CPU, CPU-to-I/O, CPU-to-Coprocessor

Page 4: HyperTransport Extending Technology Leadership

Adopted by Industry Leaders in a Wider Range of Applications than Any Other Interconnect Technology

Page 5: HyperTransport Extending Technology Leadership

Snapshot

Formed 2001

Controls, Licenses, Promotes HyperTransport as Royalty-Free Open Standard

World Technology Leaders among Commercial and Academic Members

Newly Elected President: Mike Uhler, VP Accelerated Computing, Advanced Micro Devices

Page 6: HyperTransport Extending Technology Leadership

Industry Status and Trends

Page 7: HyperTransport Extending Technology Leadership

Global Economic Downturn

Tough State of Affairs for All Industries

Consumer Markets Crippled, with a Long Time to Recovery

Commercial Markets Strongly Impacted

Page 8: HyperTransport Extending Technology Leadership

Consequent Business Focus

Cost Effectiveness

No Redundancy

Frugality

Page 9: HyperTransport Extending Technology Leadership

Downturn Breeds Opportunities

Reinforced Need for More Optimized, Cost-Effective Computing Infrastructure

Good for HPC Sector

Page 10: HyperTransport Extending Technology Leadership

Creating Demand for New Technology

Delivering:
• More Value for Same Power and Cost
• Same Value for Less Power and Cost
• Best Investment Preservation
• Minimized Total Cost of Ownership

Through Better:
• Performance and Power Efficiency
• Resource Flexibility and Adaptability
• System Virtualization
• Consolidation

Page 11: HyperTransport Extending Technology Leadership

Producing New Computing Trends

Cloud Computing

Hosted Software, Software as a Service (SaaS)

Replace Costly In-House Infrastructure and Management Resources

Infrastructure Centralization Demands Efficient Data Centers, Server Farms

Page 12: HyperTransport Extending Technology Leadership

Producing New Computing Trends (cont.)

Netbook over Notebook / Desktop

New? No

Innovative? No

Same for Less? No

Less for Much Less? Yes!

Good Enough if Budget Tight? Yes!

Right-Time, Right-Place Products? Right!

Page 13: HyperTransport Extending Technology Leadership

HyperTransport Leadership Role

Page 14: HyperTransport Extending Technology Leadership

Answers Market Trend Expectations

With Core Values

Leading Performance

Full Scalability

Power Efficiency

Low Design Cost

Market-Proven Solidity

Vast Product Ecosystem

Page 15: HyperTransport Extending Technology Leadership

Continued Technology Progression

With Expanding Market Presence

[Timeline figure, 2001 to 2009. Specification milestones shown: HT 1.0, HT 1.1, HT 2.0, HTX, HT 3.0, HTX3, HT 3.1, HNC 1.0 (Note 3)]

17.7M HT-Based Systems Shipped (Note 1)

62.7M HT-Based Systems Shipped (Note 2)

Note 1: by end of 2003 – Source: InStat
Note 2: by end of 2008 – Source: InStat
Note 3: High Node Count HT Specification 1.0 - Accessible/Useable by HTC Promoter and Contributor Members Only

Page 16: HyperTransport Extending Technology Leadership

HT 3.1 Specification

Keeps HT Ahead of Industry Requirements

Clock: 2.6 GHz, 2.8 GHz, 3.0 GHz, 3.2 GHz

Specification   32-Bit Link   16-Bit Link
HT 3.0          41.6 GB/s     20.8 GB/s
HT 3.1          51.2 GB/s     25.6 GB/s

Feature      Current Use Max   HT 3.1 Max   Headroom
Clock Rate   2.0 GHz           3.2 GHz      60%
Bandwidth    16 GB/s           51.2 GB/s    220%
Link Width   16-bit            32-bit       100%

Solidifies HT Leadership
Reinforces HT ROI

The Only 32-Bit-Capable Processor Interconnect in Industry
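The bandwidth and headroom figures on this slide can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes, as stated elsewhere in this deck, that HyperTransport is double data rate and that the quoted bandwidth is the bi-directional aggregate of a link. The function names are hypothetical and not part of any HT specification.

```python
def ht_aggregate_gbytes_per_s(clock_ghz: float, link_width_bits: int) -> float:
    """Aggregate (both directions) bandwidth of one HT link, in GB/s."""
    transfers_g = clock_ghz * 2                          # DDR: two transfers per clock cycle
    per_direction = transfers_g * link_width_bits / 8    # bits per transfer -> bytes
    return per_direction * 2                             # sum of the two directions


def headroom_percent(current_use: float, spec_max: float) -> float:
    """Headroom of the specification maximum over current use, in percent."""
    return (spec_max - current_use) / current_use * 100


for name, clock in (("HT 3.0", 2.6), ("HT 3.1", 3.2)):
    for width in (32, 16):
        print(f"{name} {width}-bit: {ht_aggregate_gbytes_per_s(clock, width):.1f} GB/s")

print(f"Clock headroom: {headroom_percent(2.0, 3.2):.0f}%")
print(f"Bandwidth headroom: {headroom_percent(16, 51.2):.0f}%")
```

Under these assumptions, a 3.2 GHz, 32-bit link works out to 51.2 GB/s and a 2.6 GHz, 16-bit link to 20.8 GB/s, in line with the slide.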

Page 17: HyperTransport Extending Technology Leadership

HTX3™ Specification

3x Bandwidth of HTX™ Connector Standard

• HT3.0 Performance
• HT3.0 Link Splitting Support
• More Power Mgmt. Features
• 100% Backward Compatibility

For Highest Performance Subsystems

Page 18: HyperTransport Extending Technology Leadership

Direct Network / Switched Network

High Node Count HT Specification 1.0

Enables Scalable HPC Systems and Clusters with Low Latency Non-Coherent Shared Memory Architecture

[Figure: Server 1, Server 2, ..., Server n, with nodes M1 to M8 and Mx to Mx+3]

Page 19: HyperTransport Extending Technology Leadership

High Node Count HT Specification 1.0 (cont.)

Answers Ever Compounding On-Chip + In-System Addressing Challenge

[Figure labels: "You Are Here"; Exponential Number of Cores; Exponential Number of CPU Clusters/Subclusters]

Page 20: HyperTransport Extending Technology Leadership

Network

High Node Count Specification 1.0 (cont.)

Supports Global Sharing of Localized Data Storage

Server X, Server Y, Server Z

Page 21: HyperTransport Extending Technology Leadership

Network

High Node Count Specification 1.0 (cont.)

Supports Global Sharing of Localized Data Storage

Especially High-Density DRAM

Server X, Server Y, Server Z

Flash Memory Subsystem

High-Density DRAM

Page 22: HyperTransport Extending Technology Leadership

Network

High Node Count Specification 1.0 (cont.)

Supports Global Sharing of Localized Data Storage

Especially High-Density DRAM and Low-Power Flash-Based Memory Subsystems

Server X, Server Y, Server Z

Flash Memory Subsystem

High-Density DRAM

Page 23: HyperTransport Extending Technology Leadership

High Node Count Specification 1.0 (cont.)

Best System and Performance Scalability

Minimized Power Consumption

Optimized Total Cost of Ownership

Page 24: HyperTransport Extending Technology Leadership

Mature Stability, Mission-Critical Reliability

Field-Proven Dependability for Demanding Markets

63 Million HT-Powered Products by End of 2008

Market                  2007 Capture   2007 Yr/Yr Growth
Defense Applications    8%             17%
Top500 Supercomputers   32%            28%
Core Routers            11%            1.2%
Edge Routers            22%            34%
SAN                     15%            11%
Servers                 23%            38%

Source: InStat

Page 25: HyperTransport Extending Technology Leadership

Ever Expanding Product Ecosystem
• From HT IP to HT Software
• Fosters Technology Strength
• 12 HT-Based Processor Brands
• Widespread Market Utilization

X86 Computing

Graphics

Security

Packet

Media

Comm

Acceleration

System Virtualization

Page 26: HyperTransport Extending Technology Leadership

Expanding Product Ecosystem (cont.)

New Godson Multi-Core Server-Class CPU

• Petascale Performance Target by 2010
• Backed by China’s Government
• MIPS-Based with 200+ More Instructions for x86 Translation and Acceleration
• 16 GFLOPS at 1 GHz and 10 W of Power
• Earlier versions (non-HT) produced by ST Microelectronics and sold to 40 companies in set-top boxes, laptops, etc.
• About 200 developers working on Godson HW, about 100 on SW and compilers

Institute of Computing Technology, Chinese Academy of Sciences

Page 27: HyperTransport Extending Technology Leadership

HyperTransport Book

Covers All HT Link and HTX Specifications

Available Online from MindShare (www.mindshare.com) in Paper and eBook Formats

700 Pages of Must-Have Tutorial

Co-Authored by HTC’s Brian Holden

Page 28: HyperTransport Extending Technology Leadership

Thank You!

Mario Cavalli, General Manager

HyperTransport Technology Consortium

Page 29: HyperTransport Extending Technology Leadership

Corollary Information

Not Part of Live Presentation

Page 30: HyperTransport Extending Technology Leadership

HyperTransport Everywhere!

Also in PowerPC-Based and Intel-Based Products

Page 31: HyperTransport Extending Technology Leadership

Godson Server-Class CPU

4-Core Reconfigurable Architecture

• Interfaces: 2x PCIe, 2x ncHT1.0
• DMA Engine Supports Pre-Fetch and Matrix
• Shared L2 Configurable As Internal RAM, DMA To Internal RAM Directly (Stream Processor)
• 8 Config. Address Windows of Each Master Port Allow Pages Migration Across L2 and Memory
• 8x8 AXI Switch
• 2 Links for Each Node’s 4 Connection Points
• Nodes Organized in Mesh
• Directory-Based Coherence Protocol Safeguards Cache Data
• 65-nm Technology

Institute of Computing Technology - Chinese Academy of Sciences

Page 32: HyperTransport Extending Technology Leadership

Godson Server-Class CPU (cont.)

Godson Versions

8-Core Multi-Chip 20W Version Possible in 2009

Institute of Computing Technology - Chinese Academy of Sciences

Page 33: HyperTransport Extending Technology Leadership

Godson Server-Class CPU (cont.)

Godson Cores Profile

Institute of Computing Technology - Chinese Academy of Sciences

Page 34: HyperTransport Extending Technology Leadership

HTX™ Spotlight

How and Why HyperTransport HTX Proves Best Choice for Compute-Intensive Applications

Page 35: HyperTransport Extending Technology Leadership

HTX™ Values Snapshot

Enables
• HPC Products Demanding Performance Beyond the Reach of PCI-Class Interconnects
• Integration of System Functionality Too New/Complex/Costly for MB Integration

Empowers
• HPC Solution Providers with a Competitive Edge
  – No Risks of Premature MB Integration
  – Shortest Time-to-Market
  – One MB Fits Multiple Markets/Applications
  – Up-Sell Factor

Page 36: HyperTransport Extending Technology Leadership

HTX™ Applications

Compute Intensive
• High Bandwidth + Low Latency
• Multi-Processing, Co-Processing

Target Markets
• Database Analytics
• High Traffic Web Services
• Stock Trading Acceleration
• Server Clustering and SMP
• Streaming Media Servers
• Financial Modeling

Page 37: HyperTransport Extending Technology Leadership

Expanding HTX™ Product Ecosystem

HTX™ Server / MB
• Data Analysis Coprocessor
• Content-Aware Routing Processor
• High-Perf Server Clustering Controller
• Content/Security Processor
• 10GE NIC Ref Design
• Universal HTX/HTX3 Board Ref Design
• FPGA Ref Design Board
• Content/Security Processor

More Innovative HTX™ Systems and Subsystems in the Pipeline

Page 38: HyperTransport Extending Technology Leadership

New HTX™ Systems

ProLiant DL165-G5

ProLiant DL785-G5

Type:  HTX HTX PCIe PCIe PCIe PCIe PCIe PCIe PCIe
Width: x16 x16 x4 x4 x16 x4 x4 x4 x8
Slot:  Blank 9 Blank 8 7 6 5 4 3 2 1

Page 39: HyperTransport Extending Technology Leadership

New HTX™ Subsystems

Cache-Coherent Shared Memory Processor for Scalable Server Clustering

NumaChip Technology

Page 40: HyperTransport Extending Technology Leadership

New HTX™ Subsystems

Vulcan

Content-Aware Routing Processor for Multi-Core Systems

Delivers Unprecedented Multi-Core Processing and Power Optimization

Applications
• High-Traffic Web
• Telecom
• Automated Trading
• High Throughput, Fast Network Access

Page 41: HyperTransport Extending Technology Leadership

New HTX™ Reference Designs

HTX3™ Universal Reference Design Board

HT3 Core IP

Page 42: HyperTransport Extending Technology Leadership

Why HTX3™?

Empowers Future HPC Innovation

• FPGAs Playing Key Role in Compute-Intensive Designs
• HTX3 Paves Way for New Generation FPGA Technology
  – FPGAs from Bandwidth Bottlenecks to Performance Drivers
• Power Optimization Ranks High in HPC Agenda
• HT 3.0 Has Reached Maturity and Stability
• HT 3.0 Capability Now Safely and Stably “Connectorized”

Reinforces HTX Performance Edge over PCI Express

Page 43: HyperTransport Extending Technology Leadership

HTX3™ Features Summary

Feature                         HTX        HTX3        Notes
Max Clock Rate                  800 MHz    2.6 GHz     12” Trace length
Max Bandwidth x Lane            1.6 GT/s   5.2 GT/s    Bi-directional
Max Bandwidth Aggregate         6.4 GB/s   20.8 GB/s   Bi-directional 16-Bit HT link
HT3 Link Splitting Support      NO         YES         HT link can be 1x 16-Bit or 2x 8-Bit for multi-CPU support
HT3 Extended Power Management   NO         YES         LDTREQ# Signal Added to participate in x86 power states
Extended FPGA Guidelines        NO         YES         Incorporated field-proven recommendations
Full Backward Compatibility     --         YES         Level shifters and signal allocation

For more details, see HTX3 specifications on HTC’s web site

Page 44: HyperTransport Extending Technology Leadership

HTX™ a Substitute for PCI Express?

No – HTX Complements and Coexists with PCIe by Providing the Capability that PCIe Cannot Deliver

[Figure labels: DDR Memory (x2); Chipset; Peripheral Interconnects; HTX™ and HTX3™ (16-Bit or 2x 8-Bit): Direct Connect to Compute-Intensive Subsystems]

Page 45: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities

Aggregate Latency Advantage

• 20% Better Physical Layer Latency and Bandwidth due to Absence of 8B/10B Clock Recovery Overhead
  – No SerDes
• 55% Lower Latency per Transaction due to Absence of Intermediate Control Logic Overhead
  – 95 ns of PCIe Gen2’s Estimated Round Trip Penalty out of 170 ns Total on Short, Open-Page DRAM Reads
• Vastly Leaner Protocol (Packet Payload)
  – 12 Fewer Bytes of Overhead per Packet Compared to PCIe
• 20 ns Better Per-Transaction Latency in Heavy Traffic Environments due to HT’s Priority Request Interleaving™
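The two percentages quoted on this slide follow from simple ratios. The sketch below is only a sanity check of that arithmetic, using the 95 ns and 170 ns figures from the slide and the standard 8b/10b property that every 8 data bits travel as 10 line bits. The helper names are hypothetical.

```python
def share_of_total(part: float, total: float) -> float:
    """Fraction of a total that one component accounts for."""
    return part / total


def line_code_overhead(data_bits: int, coded_bits: int) -> float:
    """Fraction of the raw line rate consumed by the line code itself."""
    return (coded_bits - data_bits) / coded_bits


# Round-trip penalty attributed on the slide to intermediate control logic
# on a short, open-page DRAM read: 95 ns out of 170 ns.
print(f"Control-logic share of read latency: {share_of_total(95, 170):.1%}")

# 8b/10b sends 10 line bits for every 8 data bits.
print(f"8b/10b overhead: {line_code_overhead(8, 10):.0%}")
```

The first ratio comes to roughly 56%, in line with the slide's 55% claim, and the second to the 20% attributed to 8B/10B.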

Page 46: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities (cont.)

Up to Twice Packet/Latency Efficiency in Intra-Processor Traffic

[Chart: HTX™ Packet Overhead Efficiency Margins over PCIe. Axes: Data Bytes per Packet vs. Efficiency. Series: Min Overhead, Max Overhead. Highlighted region: Usual Intra-Processor Traffic]
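To see why a fixed per-packet overhead gap matters most for small packets, payload efficiency can be modeled as payload / (payload + overhead). The sketch below is a rough illustration, not the model behind the chart: the only number taken from this deck is the 12-byte overhead difference between HT and PCIe, while the absolute 8-byte HT overhead is a placeholder assumption chosen purely for the example.

```python
def payload_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of transmitted bytes that carry payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)


# ASSUMPTION: hypothetical absolute overhead for an HT packet (not from the deck).
HT_OVERHEAD_BYTES = 8
# The deck states that HT carries 12 fewer overhead bytes per packet than PCIe.
PCIE_OVERHEAD_BYTES = HT_OVERHEAD_BYTES + 12

for payload in (4, 16, 64, 256):
    ht = payload_efficiency(payload, HT_OVERHEAD_BYTES)
    pcie = payload_efficiency(payload, PCIE_OVERHEAD_BYTES)
    print(f"{payload:>3} B payload: HT {ht:.0%}, PCIe {pcie:.0%}, ratio {ht / pcie:.2f}x")
```

With these placeholder values the advantage is largest for the smallest payloads and shrinks as packets grow, which matches the shape of the claim that the gain is greatest in short intra-processor traffic.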

Page 47: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities (cont.)

Considerable Per-Packet Latency Advantage

[Chart: HTX3™ Per-Packet Latency Advantage over PCIe Gen2. Axes: Data Bytes per Packet vs. Latency Advantage (ns). Series: Min Packet Overhead, Max Packet Overhead. Highlighted region: Usual Intra-Processor Traffic]

HTX3: 2.6 GHz - x16 Links
PCIe: 5.0 GHz - x16 Links

The results take into account PCIe’s 20% clock recovery, packet payload and 55% chipset overhead penalties. HTX’s Priority Request Interleaving, if applicable, will add to HTX’s total latency advantage.

Page 48: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities (cont.)

Superior Bandwidth

Feature                            PCIe Gen1    PCIe Gen2     HTX            HTX3
Max Clock Rate                     2.5 GHz      5.0 GHz       800 MHz        2.6 GHz
Double Data Rate                   NO           NO            YES            YES
Max Bandwidth x Lane               2.5 Gbps     5.0 Gbps      1.6 GT/s (*)   5.2 GT/s (*)
8B/10B Penalty                     -20%         -20%          No Penalty     No Penalty
Net Bandwidth x Lane               2.0 Gbps     4.0 Gbps      1.6 GT/s (*)   5.2 GT/s (*)
Net Bandwidth 16-Bit - Aggregate   8 GBytes/s   16 GBytes/s   6.4 GBytes/s   20.8 GBytes/s

(*) HyperTransport supports Double Data Rate (DDR), transferring data on both the leading and trailing edge of the clock. Therefore HyperTransport’s bandwidth is more appropriately represented by the term “Transfers/second” than the term “Bits/second.”
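The PCIe columns of the bandwidth table can be reproduced from the raw per-lane rate. The sketch below assumes a x16 link and that "aggregate" means the sum of both directions, which is what makes the arithmetic agree with the table. The function name is hypothetical, and the HTX figures are copied from the table rather than recomputed.

```python
def pcie_net_aggregate_gbytes_per_s(raw_gbps_per_lane: float, lanes: int = 16) -> float:
    """Net aggregate (both directions) bandwidth of a PCIe link, in GB/s."""
    net_gbps_per_lane = raw_gbps_per_lane * 8 / 10      # 8b/10b: 8 data bits per 10 line bits
    per_direction = net_gbps_per_lane * lanes / 8       # Gbit/s -> GB/s across all lanes
    return per_direction * 2                            # sum of the two directions


# HTX aggregate figures as listed in the table (GB/s, 16-bit link).
HTX_AGGREGATE = {"HTX": 6.4, "HTX3": 20.8}

gen1 = pcie_net_aggregate_gbytes_per_s(2.5)
gen2 = pcie_net_aggregate_gbytes_per_s(5.0)
print(f"PCIe Gen1 x16: {gen1:.1f} GB/s")
print(f"PCIe Gen2 x16: {gen2:.1f} GB/s")
print(f"HTX3 vs PCIe Gen2: {HTX_AGGREGATE['HTX3'] / gen2:.2f}x")
```

Under these assumptions Gen1 and Gen2 come to 8 and 16 GB/s, matching the table, so the HTX3 figure is about 1.3 times the Gen2 one.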

Page 49: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities (cont.)

Tangible Time-to-Result Savings!

HTX3™ Time-to-Result Savings vs. PCIe Gen2

Bytes per Packet   Number of Packets Transferred per Task
Transferred        100,000    1 Million   1 Billion
4                  0.78 ms    7.8 ms      7.8 sec
16                 4 ms       40 ms       40 sec
256                0.32 sec   3.20 sec    53 min
512                1.16 sec   11.62 sec   3.23 hrs

Compute-Intensive Tasks Require 100Ks to Billions of Packet Transactions

The results take into account PCIe’s 20% clock recovery, packet payload and 55% chipset overhead penalties. HTX’s Priority Request Interleaving™, if applicable, will add to HTX’s total time-to-result latency advantage.
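The time-to-result table is a linear scaling of a per-packet latency saving by the number of packets in a task. The sketch below illustrates that scaling. The per-packet savings are not stated in the deck: they are back-derived here from the table's 1 Million column (for example, 7.8 ms / 1,000,000 = 7.8 ns), so treat them as inferred values, and the names as hypothetical.

```python
# ASSUMPTION: per-packet savings in nanoseconds, inferred from the
# "1 Million" column of the table rather than quoted from the source.
PER_PACKET_SAVING_NS = {4: 7.8, 16: 40.0, 256: 3200.0, 512: 11620.0}


def time_saved_seconds(bytes_per_packet: int, packets: int) -> float:
    """Total time saved over a task, assuming a constant saving per packet."""
    return PER_PACKET_SAVING_NS[bytes_per_packet] * packets * 1e-9


for size in PER_PACKET_SAVING_NS:
    row = ", ".join(
        f"{time_saved_seconds(size, n):.5g} s" for n in (100_000, 1_000_000, 1_000_000_000)
    )
    print(f"{size:>3} B/packet: {row}")
```

For a task of one billion 512-byte packets this comes to about 11,620 seconds, or roughly 3.23 hours, as in the table.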

Page 50: HyperTransport Extending Technology Leadership

Unique HTX™ Capabilities (cont.)

Example: Celoxica’s Accelerator

Company’s Benchmark Results

Latency Access to Network Data Regardless of Packet Size

HTX™ Interface   Interface
1.4 µs           <10 µs

Page 51: HyperTransport Extending Technology Leadership

HPC - Industry’s Bright Star

Strong Business Growth Opportunities