Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
12th ANNUAL WORKSHOP 2016
EVOLUTION OF PCI EXPRESS® AS THE UBIQUITOUS I/O INTERCONNECT TECHNOLOGY
Debendra Das Sharma, PhD Senior Principal Engineer and Director I/O Technology and Standards
Data Center Group, Intel Corporation Chair, PHY Logical Group, PCI-SIG®
April 7, 2016
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
2 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
EVOLUTION OF PCI/ PCI EXPRESS® TECHNOLOGY
§ Peripheral Component Interconnect (PCI) started as bus-based PC interconnect in 1992 • 32 bit @ 33 MHz • Evolved to 64 bits @ 33/ 66/ 132 MHz
§ Moved to link-based serial interconnect with full-duplex differential signaling with PCI Express® (PCIe®) with backwards compatibility for software • Doubling data rate every generation
§ Evolution from PC to HPC, servers, clients, hand-held, and Internet-Of-Things usage over three decades
3 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
EVOLUTION OF PCI TECHNOLOGY
PCI/PCI-X SYSTEM PCI EXPRESS BASED SYSTEMS
4 Copyright © 2016 PCI-SIG® - All Rights Reserved
PCI-X Device
CPU
PCI/PCI-X SYSTEM
Connectivity Port(s)
(Enet/…)
PCIe port(s) for Wi-Fi, modem, storage, etc.
Large system with lots of open PCIe® slots – e.g., servers, workstations
Highly integrated systems including hand-held devices
PCI-X Device
PCI Bridge
PCI-X Bridge
AGP GFX Host Bridge Memory
PCI-X Device
PCI EXPRESS® BASED SYSTEMS
CPU
Memory PCI Express GFX
SWITCH
SWITCH
Storage
Network
Root Complex
PCI
PCI Bridge
PCI Bridge
PCI
PCI Bridge
Legacy End Point
End Point
SoC CPU
Core(s)
Memory PCIe
integrated GFX
PCIe integrated Network
SoC Interconnct & Root Complex
OpenFabrics Alliance Workshop 2016
PCI EXPRESS® ROADMAP AND PLATFORM EVOLUTION
5
Source: Intel Corporation
Copyright © 2016 PCI-SIG® - All Rights Reserved
GB
/S
(FU
LL D
UPL
EX B
W F
OR
A X
16)
5
50
500
(Platform evolution in keeping with Moore’s Law: More PCIe® lanes along with speed increases culminating in PCIe being integrated into CPU socket starting PCIe 3.0)
2004 2005 2007 2009 2011 2013 2015 2017
CPU
Mem & I/O Controller
South Bridge
Memory
PCIe to PCI Bridge
PCI/PCI-X*
PCIe 1.0a (~14 Lanes per Socket)
CPU
ESI
PCIe 2.0 (~18/ 36 Lanes Per Socket)
IOH
CPU CPU
South Bridge
CPU CPU
South Bridge
PCIe 3.0 (40 Lanes )
PCIe 3.0 (40 Lanes )
PCIe 1.0 @ 2.5GT/s
• I/O Virtualization • Device Sharing
PCIe 2.0 @ 5GT/s
PCIe 3.0 @ 8GT/s Atomic Ops, Caching Hints,
Improved PM, Lower Latency
PCIe 4.0 @ 16GT/s
OpenFabrics Alliance Workshop 2016
PCI EXPRESS®: A LAYERED ARCHITECTURE
6
Software
Mechanical
Data Link
Transaction
Logical PHY
PCI Express Layering - Enabler for Modularity and Reuse
CON F I G R E G
Electrical
ï PCI compatibility, configuration, driver model ï PCI Express enhanced configuration model
ï Logical connection between devices ï Reliable data transport services (CRC, Retry, Ack/Nak)
ï Market segment specific form factors ï Evolutionary and revolutionary
ï Split-transaction, packet-based protocol ï Credit-based flow control, virtual channels
ï Physical information exchange ï Interface initialization and maintenance
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
PCI EXPRESS®: A LOW-POWER INTERCONNECT
Source: IDF Sept’15
7
Item PCIe® 3.0 PCIe® 2.0 M-‐PHY Gear 3
Line Speed [Gbps] 8 5 5.83
PHY Overhead 128/130, 1[GB/s] 8/10, 500[MB/s] 8/10, 583[MB/s]
AcGve Power [mW] 60 (L0) 46 (L0) 58 (HS)
Standby Power [mW] 0.11 (L1.2) 0.11 (L1.2) 0.2 (Hibern8)
MB/mJ (higher = beRer) 14-‐18 8-‐12 8-‐12
Synopsys* Published Power Data • 5 mW/Gb/Lane – Active • 10 uW/Lane – Standby • Source:
http://news.synopsys.com/2015-05-21-Synopsys-Announces-Industrys-Lowest-Power-PCI-Express-3-1-IP-Solution-for-Mobile-SoCs
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
8 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
PCI EXPRESS® 4.0 SPEED AND CHANNEL § PCIe® 4.0 data rate: 16.0 GT/s § Fully backwards compatible with PCIe 3.x (8.0 GT/s), PCIe 2.x (5.0 GT/s) and PCIe
1.x (2.5 GT/s); Preserves decades of ecosystem investment and innovation § Low cost, high performance, low power I/O technology § Connector improvements to reduce cross-talk and improve insertion loss at 8G
Nyquist § 2 connector 20” server PCIe topology needs either retimer or ultra low-loss PCB to
operate at 16.0 GT/s
Source: Intel Corporation
9 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
TRANSMITTER EQUALIZATION
§ 2.5 GT/s and 5.0 GT/s: Fixed de-emphasis for Link § 8.0 GT/s and 16.0 GT/s: Analysis demonstrates need for per Tx-Rx EQ
• Variations in receiver design, channel, PVT • Adjust each Tx by its Rx individually • Start with a preset value and then adjust dynamically
§ 4 Stages: • Stage 0: Preset values communicated at a lower data rate to downstream component • Stage 1: Link tries to stabilize at the preset at 1E-4 BER • Stages 2 and 3: Each receiver asks its transmitter to adjust till it achieves 1E-12 or better BER
12
BER Eye With Best Pre-Set BER Eye With Best Tx Coeff
Co-efficient based Tx EQ provides better margin
Results from an 18” 2C channel at 8.0 GT/s
Copyright © 2016 PCI-SIG® - All Rights Reserved Source: Intel Corporation
OpenFabrics Alliance Workshop 2016
LINK EQUALIZATION: STAGES 2 AND 3
13
8GT/s
TX
RX
RC
Stage 2: Intended for Upstream Port to achieve BER <= 10-12. Starts at the preset. Coefficients/presets are exchanged in sub-loops until this is accomplished within 24 ms. A port may decide not to make any new requests. Corresponds to Phase 2.
4 3 1 2 0
75
6 8
9
post cursor
Example: start from preset 7 (coef=4/6)
1st sub-loop
0
0
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
2nd sub-loop
a. EP Rx eval needs more pre, post ok
b. à d. repeat with (6/5
a. EP Rx eval reveals need for less post, more pre
b. EP sends (5/5) to RC c. RC applies (5/5) to TX d. RC echo’s (5/5) to EP
3rd sub-loop finds good result with (7/4)
so moves to phase 3
(5/5) (6/5) (7/4)
8GT/s
pre
curs
or
RX
TX
EP
(5/5) (6/5) (7/4)
Stage 3/Phase 3 is same as phase 2 in opposite direction Downstream Port may skip Phase 2/ 3 if presets are good enough for the Link
Source: Intel Corporation
§ Receiver full swing (FS) defines granularity of coeff ü Table at bottom-right is for illustrative purposes ü X-axis is pre-cursor, y-axis post-cursor, diagonal defines the boostline ü Each tile represents a coeff (e.g. p7=4/6, p8=5/5, etc) ü Numbers in tiles represent presets; black tiles are illegal coeff space
7
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
12 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
RETIMER TO EXTEND CHANNEL REACH
§ Channel dxtension devices § Part of PCIe® base spec (3.1+) § Up to two retimers § Critical for longer server channels in
PCIe 4.0 architecture § Has the electrical and PHY Logical – no
link/transaction layer, no config registers, no in-band access by S/W
§ Actively participates in link training, power management, ppm difference adjustment, link equalization
§ Electrically separate links on either end of retimer
13
SoYware
TransacGon
Data Link
Confi
guraGo
n
Registers
ReGm
er Coverage
Mechanical
Logical PHY
Electrical
Upstream Component Downstream Port
Tx(B) Retimer 1
Rx(B) Tx(C) Rx(C)
Tx(A) Rx(A)
Rx(F) Tx(F)
Retimer 2 Rx(D) Tx(E)
Tx(D)
Rx(E)
Downstream Component Upstream Port
Source: Intel Corporation
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
14 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
NVM EXPRESSTM DRIVING PCI EXPRESS® SSDS IN THE DATA CENTER
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
2014 2015 2016 2017 2018 2019
Data Center SSD Units by Interface
PCIe SATA SAS
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
2014 2015 2016 2017 2018 2019
Data Center SSD total GB by Interface
PCIe SATA SAS
Source: Forward Insights Q1’15
*
17 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
DATA CENTER FORM FACTORS FOR PCI EXPRESS®
42, 80, and 110mm lengths, smallest footprint of PCI
Express® (PCIe®) connector form factors, use for boot or for max storage density
2.5in makes up the majority of SSDs sold today because of ease of deployment,
hotplug, serviceability, and small form factor
Single-‐Port x4 or Dual-‐Port x2
M.2 CEM Add-‐in-‐card U.2 2.5in (aka SFF-‐8639)
Add-‐in-‐card (AIC) has maximum system compaGbility with exisGng
servers and most reliable compliance program. Higher power envelope, and opGons for height
and length
BGA
16x20 mm ideal for small
and thin plahorms
18
Source: Intel Corporation
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
INEXPENSIVE CABLING = INDEPENDENT CLOCK + SPREAD SPECTRUM (SSC) (SRIS)
• Challenge: PCI Express® (PCIe®) specification did not support independent clock with SSC ‑ SATA* cable ~ $0.50 ‑ PCIe cables include reference clock > $1 for equivalent cable
• PCIe base specification 3.0 ECNs approved 1) Requires use of larger elasticity buffer 2) Requires more frequent insertion of SKIP ordered set 3) Requires receiver changes (CDR) 4) Second ECN updates Model CDRs
• SRIS will create a number of new form factor opportunities for PCIe
‑ OCuLink* ‑ Lower cost external/internal cabled PCIe ‑ Next generation of PCI-SIG® cable specification
19
Example of possible PCIe cable
Separate Refclk Modes of OperaGon: 5600ppm (New -‐ SRIS) and 600ppm (ExisGng -‐ SRNS)
Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
18 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
Clear Test Output Maps • Directly to Test Spec
PCI-SIG® Specs Describes Device requirements • 3.0 Base and
CEM specs
C&I Test Specs Define Test criteria based on spec requirements • Test Definitions • Pass/Fail Criteria
Test H/W & S/W Validates Test criteria • Compliance • Interoperability
PASS
FAIL Test Tools
And Procedures
PCI-SIG Specs
C&I Test Spec
Workshops
PCI EXPRESS® COMPLIANCE PROCESS
Predictable path to design compliance
21 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
AGENDA
§ Introduction § PCI Express® (PCIe®) 4.0 Specification § Retimers for Extending Channel Reach § Form Factors § Compliance § Conclusions
20 Copyright © 2016 PCI-SIG® - All Rights Reserved
OpenFabrics Alliance Workshop 2016
CONCLUSIONS
§ Single PHY standard covering applications and form factors from handheld to data center
§ Predominant direct I/O interconnect from CPU with high bandwidth § Active development to extend PHY rate to 16 GT/s § A variety of standard form factors covering applications from small/
light mobile to the data center § A robust and mature compliance and interoperability program § Low-power § High-performance
23
Data Center / HPC Embedded Mobile Source: Intel Corporation
Copyright © 2016 PCI-SIG® - All Rights Reserved
12th ANNUAL WORKSHOP 2016
THANK YOU Debendra Das Sharma, PhD
Senior Principal Engineer and Director I/O Technology and Standards Data Center Group, Intel Corporation Chair, PHY Logical Group, PCI-SIG®