GCT Performance & Lessons: Greg Iles, 15 September 2008
Performance and Lessons from the CMS Global Calorimeter Trigger (GCT)
Focus: High Speed Serdes
Introduction
– GCT hardware challenges over the last 2½ years.
– High Speed Serdes
  • Data rate = 1.28 Gb/s
  • Link speed = 1.6 to 2.0 Gb/s
– Two problems
  • Clock routing inside FPGA
  • Reflections on copper links
– Latter forced redesign
  • New design for GCT-GT links
RCT to GCT Links: Optical links based on 8B/10B
GCT to GT Links: Legacy links from old GCT project; DC coupled electrical links
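The gap between the quoted data rate and link speed follows from the 8B/10B overhead (10 line bits per 8 payload bits). A minimal sketch of that arithmetic is below; the 20-bit encoded word width (a 16-bit word after 8B/10B) is an assumption, not something stated on the slides.

```python
# Sketch of the 8B/10B rate arithmetic. Function names are illustrative.

def payload_rate_gbps(line_rate_gbps: float) -> float:
    """Payload rate carried by an 8B/10B link running at line_rate_gbps."""
    return line_rate_gbps * 8 / 10


def word_clock_mhz(line_rate_gbps: float, encoded_word_bits: int = 20) -> float:
    """Parallel word clock, ASSUMING 16-bit words encoded to 20 line bits."""
    return line_rate_gbps * 1000 / encoded_word_bits


for rate in (1.6, 2.0):
    print(f"{rate:.1f} Gb/s link: payload {payload_rate_gbps(rate):.2f} Gb/s, "
          f"word clock {word_clock_mhz(rate):.0f} MHz")
```

Under that assumption a 1.6 Gb/s link corresponds to an 80 MHz word clock and a 2.0 Gb/s link to 100 MHz, which matches the two clock frequencies used in the deck.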
RCT to GCT
Leaf card used to receive 32 fibres @ 1.6 Gb/s
8B/10B encoding with CRC check
Link operating synchronously to TTC clock (80 MHz)
Rx elastic buffer bypassed for low latency operation.
Source card receives 2 RCT cables with 32 differential ECL pairs each
Approx 15m of fibre
Patch Panel
Discovery
– Detected CRC errors in USC55
  • Rare, but reproducible
  • Linked to one particular MGT out of 16
– Hardware verification tests OK
  • Not an identical test, because it used a local oscillator.
– Moved system back to lab
  • Tried different TTC clocking methods
  • Suspected clocking scheme or QPLL
    – Jitter requirement: 40 ps pk-pk; QPLL: < 50 ps pk-pk
– Opted to run links with local 100 MHz oscillator.
100 MHz Links
[Block diagram: Data @ 80 MHz → low latency clock domain bridge (inserts extra commas) → Data & extra comma @ 100 MHz → TLK2501 → 2.0 Gb/s serial link → MGT → Data & extra comma @ 100 MHz → FIFO / dual-port RAM (write enable removed for an 'extra' comma) → Data @ 80 MHz]
Latency: lose ~1 bx in the 80 MHz / 100 MHz bridging, gain ~1 bx because the link is running faster.
Run at 100 MHz, but the problem still persists: NOT a TTC clock issue!
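The bridging idea on this slide (pad the 80 MHz stream with extra commas to fill the 100 MHz link, then drop them on the receive side by removing the FIFO write enable) can be sketched in software. This is only an illustrative model, not the firmware: the `COMMA` marker, the function names and the fixed pattern of 4 data words per 5 link words are assumptions.

```python
# Behavioural sketch of the comma insert/remove bridge (not the real firmware).
COMMA = None  # hypothetical marker standing in for an 'extra' comma word


def tx_bridge(data_words, data_per_frame=4, link_per_frame=5):
    """Pad 80 MHz data up to the 100 MHz word rate by inserting extra commas.

    ASSUMPTION: a fixed pattern of 4 data words followed by 1 comma.
    """
    link_words = []
    for index, word in enumerate(data_words):
        link_words.append(word)
        if (index + 1) % data_per_frame == 0:
            link_words.extend([COMMA] * (link_per_frame - data_per_frame))
    return link_words


def rx_bridge(link_words):
    """Write into the FIFO only when the word is not an extra comma."""
    fifo = []
    for word in link_words:
        write_enable = word is not COMMA  # enable removed for extra commas
        if write_enable:
            fifo.append(word)
    return fifo


data = list(range(8))
link = tx_bridge(data)
print(len(data), "data words ->", len(link), "link words")
print("recovered intact:", rx_bridge(link) == data)
```

The model ignores clock phases entirely; it only shows that the receive side recovers the original word sequence as long as extra commas are distinguishable from data.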
Local Clock Resources
– Changed to asynchronous mode
  • Problem persists
  • Arggghhhh!!!
– Reverted to very old firmware
  • Links work OK…
– The implication is a timing error within the FPGA
  • Look at FIFO bridge from 100 MHz to LHC 80 MHz
– More info from Xilinx app notes:
  • XAPP763, XAPP670
Limited number of global clock nets. Use MGT local clock resources.
Xilinx Virtex II Pro XC2VP70
MGT Rx Recovered Clk (Local)
For low latency operation use MGT Rx Recovered clock and external fifo
16 MGTs in total:
Local clocks = 10
Global clocks = 6
Part 1: Conclusions
– Local clock fix:
  • Removed async signal used in 100 MHz domain, but sourced from 80 MHz domain.
  • Added CRC checker to 100 MHz domain
– Later problems with global clock:
  • Force use of local clock resources only
– Low latency operation can be non-trivial
  • Doesn't follow "standard" design
  • Bypass of the Rx elastic buffer can be challenging
– Use oscillator to provide low jitter reference clock
END – Part 1: RCT to GCT links
NEXT – Part 2: GCT to GT links
GCT to GT
– Legacy link from original GCT. Required by GT
  • National Semiconductor DS92LV16
  • Running just above spec (i.e. 80.1 vs 80 MHz ref clock)
  • DC coupled
  • Uses Infiniband x1 cable, 100 Ohm differential
– Original testing with 3.0 m Leoni cables
  • Final system required extra cables, but unable to obtain more.
– Alternative supplier found: 1.5 m Amphenol cables delivered.
  • Detected errors with random patterns
Problem detected when we changed cables, but NOT a cable issue
Thanks to Jan Troska and Francois Vasey for allowing us to borrow a high bandwidth scope
Cable + Connector
Measured across termination resistor
Trace measured with 1.5m Amphenol cable, connector and 100 ohm termination resistor
Threshold at +/- 100mV for eye diagram data
Powered & Populated Board
Trace measured on current GTI card in loopback mode
Reflections
Amphenol 1.5 m, reflection ~252 mV
Leoni 3.0 m
Leoni 0.5 m, reflection ~226 mV
Leoni 2.0 m, reflection ~168 mV
Part 2: Conclusion
– Try not to operate parts at specification limit
– DC balance with 8B/10B (or equivalent)
– Optically isolate
– Try to have just 1 transmission line
  • i.e. do not have 1 inch of FR4 + 1 m cable + 1 inch of FR4
– Links to GT will be replaced.
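The advice to have just one transmission line can be illustrated with the standard reflection coefficient at an impedance step, gamma = (Z2 - Z1) / (Z2 + Z1). The sketch below is illustrative only: the 100 Ohm cable and termination come from the slides, but the 90 Ohm FR4 trace impedance is a made-up value, and the model ignores connectors, losses and multiple bounces.

```python
# Sketch: reflection coefficient at each boundary of a trace + cable + trace path.

def reflection_coefficient(z_next: float, z_current: float) -> float:
    """Fraction of the incident voltage reflected at a step from z_current to z_next."""
    return (z_next - z_current) / (z_next + z_current)


# HYPOTHETICAL impedances in ohms; only the 100 Ohm figures come from the slides.
path = [
    ("FR4 trace", 90.0),
    ("cable", 100.0),
    ("FR4 trace", 90.0),
    ("termination", 100.0),
]

for (name_a, z_a), (name_b, z_b) in zip(path, path[1:]):
    gamma = reflection_coefficient(z_b, z_a)
    print(f"{name_a} -> {name_b}: gamma = {gamma:+.3f}")
```

Every boundary with a mismatch contributes its own reflection, so a path made of three segments has more opportunities to reflect than a single matched line.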
END – Part 2: GCT to GT links
NEXT – Part 3: New GCT-GT Interface
OGTI: Replacement for GTI card
Optical Global Trigger Interface
– Why necessary
  • In the original GCT project the GCT-GT interface was at the extreme limit of design specs
  • Switching to a shorter cable was sufficient to break the system.
– Capable of both:
  • Transmitting (Tx)
  • Receiving (Rx)
Technology choice
– Xilinx Virtex 5, XC5VLX110T-3FF1136C
  • 3rd generation Xilinx serial link – 3.75 Gb/s
  • Can run at higher speed to reduce latency.
  • Baseline to run links at 2.4 Gb/s
    – 64% of max spec
– POP4 fibre optic transceiver
  • Transceiver: 4 in, 4 out
  • 850 nm multimode, 1 to 3.125 Gb/s
  • Baseline to run links at 2.4 Gb/s
    – 77% of max spec
  • Two suppliers
    – AvagoTech: HFBR-7934Z
    – Zarlink: ZL60304
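The two "max spec" percentages are simply the baseline link speed divided by each part's maximum rated speed. A small sketch of that check follows; the function name is illustrative.

```python
# Sketch: how far the 2.4 Gb/s baseline sits below each part's maximum rating.

def percent_of_max(baseline_gbps: float, max_gbps: float) -> int:
    """Baseline speed as a whole-number percentage of the maximum rated speed."""
    return round(100 * baseline_gbps / max_gbps)


BASELINE_GBPS = 2.4
print("Virtex 5 serial link:", percent_of_max(BASELINE_GBPS, 3.75), "%")
print("POP4 transceiver:", percent_of_max(BASELINE_GBPS, 3.125), "%")
```

Both parts are run well inside their ratings, in line with the earlier lesson about not operating parts at the specification limit.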
OGTI
POP4 (optics): HFBR-7934Z or ZL60304, x4 Tx or Rx
Cross-point switch allows choice of up to 4 clocks
Xilinx V5 XC5VLX110T-3FF1136C
Dual CMC header (~340 I/O)
Theoretical Minimum Latency
All numbers in bx

                                    Current      Virtex5      Virtex5      Virtex5
                                    Design       1.6 Gb/s     2.4 Gb/s     3.2 Gb/s
Send 2nd word                       0.5          0.5          0.5          0.5
IOB and 80MHz->LinkWordSpeed        0.0          0.3 (e)      0.9 (e)      0.8 (e)
Serdes Tx-Rx Delays (datasheet)     1.8          5.0 (d)      3.3 (d)      2.5 (d)
Cable/Fibre (1.5 m)                 0.3          0.3          0.3          0.3
Serdes to FPGA                      0.5          0.0          0.0          0.0
Sync to Local LHC clk               1.0-2.0 (c)  1.0-2.0 (c)  1.0-2.0 (c)  1.0-2.0 (c)
Total                               4.1          7.1          6.0          5.1
Latency increase compared to TDR    -0.4         +2.6         +1.5         +0.6
  (TDR: Link (3 bx) + Sync (1.5 bx) = 4.5 bx)

Notes:
(a) Latency calculated from clk edge sampling data into SerDes/FPGA until FPGA fabric
(b) The numbers are obtained from datasheets and theoretical performance.
(c) Assumed 1.0 bx here, but could be 2.0 depending on sync method.
(d) No elastic buffer. Tx = 9.5 x RXUSRCLK, Rx = 10.5 x RXUSRCLK
(e) Assumed ¼ bx for IOB + 2 link speed clks (4 x ½). Based on current 80 MHz to 100 MHz bridge.
(f) All numbers worst case, but no contingency.
(g) Block RAM performance dependent.
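The totals in the table, and the Serdes Tx-Rx figures from note (d), can be cross-checked with a short script. This is a sketch under stated assumptions: 1 bx is taken as 25 ns, the sync row uses the 1.0 bx value from note (c), and RXUSRCLK is assumed to tick once per 10 line bits (an assumption that happens to reproduce the 5.0 / 3.3 / 2.5 bx entries, not something stated on the slide).

```python
# Cross-check of the latency table. All names here are illustrative.
BX_NS = 25.0  # ASSUMED length of one bunch crossing, in ns


def serdes_delay_bx(line_rate_gbps, tx_cycles=9.5, rx_cycles=10.5, bits_per_usrclk=10):
    """Tx + Rx delay in bx, ASSUMING one RXUSRCLK period per 10 line bits."""
    usrclk_period_ns = bits_per_usrclk / line_rate_gbps
    return (tx_cycles + rx_cycles) * usrclk_period_ns / BX_NS


# Rows per column: send 2nd word, IOB/bridge, serdes, cable, serdes-to-FPGA, sync.
TABLE = {
    "Current design":   [0.5, 0.0, 1.8, 0.3, 0.5, 1.0],
    "Virtex5 1.6 Gb/s": [0.5, 0.3, 5.0, 0.3, 0.0, 1.0],
    "Virtex5 2.4 Gb/s": [0.5, 0.9, 3.3, 0.3, 0.0, 1.0],
    "Virtex5 3.2 Gb/s": [0.5, 0.8, 2.5, 0.3, 0.0, 1.0],
}
TDR_BX = 3.0 + 1.5  # Link (3 bx) + Sync (1.5 bx)


def total_bx(column):
    """Sum of all rows for one column, rounded to 0.1 bx."""
    return round(sum(TABLE[column]), 1)


def increase_vs_tdr(column):
    """Total latency minus the TDR budget, rounded to 0.1 bx."""
    return round(total_bx(column) - TDR_BX, 1)


for column in TABLE:
    print(f"{column}: total {total_bx(column)} bx, vs TDR {increase_vs_tdr(column):+.1f} bx")
for rate in (1.6, 2.4, 3.2):
    print(f"Serdes Tx-Rx at {rate} Gb/s: {serdes_delay_bx(rate):.1f} bx")
```

With the sync row at 2.0 bx instead of 1.0 bx, every total and every difference from the TDR value would rise by 1 bx.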
Future
– Designing & testing at > 5 Gb/s is unpleasant
  • e.g. the inductance of a few mm of scope probe is a problem
  • Xilinx recommends that stubs on vias are removed.
    – remember that the average PCB is only 1.6 mm thick!
– Intel, HP, IBM, Sun Microsystems, Luxtera, Kotura all working on silicon photonics
  • i.e. replace high speed electrical links on the PCB with optical interconnects.
  • Others better placed to comment on this.
End