23
H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010 ASIC FEAFS - A front end chip for SLHC CMS strip module - Hervé CHANAL Yannick ZOCCARATO (CNRS IN2P3 MICHRAU)

presentatio FEAFS 06 2010

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: presentatio FEAFS 06 2010

H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

ASIC FEAFS

- A front end chip for SLHC CMS strip module -

Hervé CHANAL

Yannick ZOCCARATO

(CNRS IN2P3 MICHRAU)

Page 2: presentatio FEAFS 06 2010

2H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Outline

1. Introduction2. Cluster finding

1. Principle2. Algorithm

3. Data links1. FIFO2. RAM3. Frames4. Communication controller5. Simulations

4. Conclusion

Page 3: presentatio FEAFS 06 2010

3H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Electronic design for a SLHC CMS strip module :FEAFS ASIC

Data Concentrator ASIC

Current plans for CMS strip tracker

Page 4: presentatio FEAFS 06 2010

4H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

– Find clusters on a set of 128 strips,– Send two different flows of data to the ASIC

concentrator:• read out data:

– average rate : 100 kHz (100kHz * 128b = 12.8 Mbps)• cluster data for trigger:

– average number of clusters (size<3) : 0.25 at 20MHz – Foreseen data packet:

2 bits (No. Clusters) + [5 bits (add) + 2 bits (width) + 3 bits (time stamp)] = 12 bits for 1 cluster

→ average rate: 12b * 0.25 * 20MHz = 60 Mbps

→ Average bandwidth of the link: 60 + 12.8 = 72.8 Mbps.

Design needs for FEAFS

Page 5: presentatio FEAFS 06 2010

5H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

FEAFS architecture

Page 6: presentatio FEAFS 06 2010

6H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Part I:Finding the clusters

Page 7: presentatio FEAFS 06 2010

7H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Cluster Finding : Principle

• Find all the clusters of triggered strip in two set of 64 strips

• Compute their address (degraded to 5 bits) and the number of adjacent triggered strips

• Report only these with a size bellow a programmable threshold

• Find clusters overlapping in the two layer in programmable window with an offset

… 0 0 1 1 0 0 0 0 0 0 1 1 1 0 …

Strip n+14 n+1…

Clusters

n

… 0 0 0 0 0 0 0 0 0 1 1 0 0 0 …

… 0 0 1 1 0 0 0 0 0 0 1 1 1 0 …

No overlap Overlap

Page 8: presentatio FEAFS 06 2010

8H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Cluster Finding : Architecture

• Two flows up to the overlap finding• The bus size is reduced at each step by priority encoders• Wake up on the internal registers to reduce the consumption

Clusterresearch

Thresholdcut

Priorityencoder

Overlapfinding

Up to 32 clusters

Up to 32 clusters

Up to 6 clusters

Up to 12 clusters

Up to 4 clusters

Priorityencoder

Wake up Wake up Wake up

64 inputsfrom

comparators

Fifofull

Slow control

Clusterresearch

Thresholdcut

Priorityencoder

Up to 32 clusters

Up to 32 clusters

Up to 6 clusters

64 inputsfrom

comparators

Page 9: presentatio FEAFS 06 2010

9H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Cluster Finding : Algorithm

• 2x64 strips ⇒ 2x16 blocks of 4 strips– Same granularity as the degraded address (5 bits ⇒ 128 blocks)– Need for boundary block to merge clusters

• 32 LUT of 16x16 bits used to find clusters in each block (up to 2 clusters/block)

• Address of a cluster =- Address of its block - Arbiter when a clustercross a boundary

• Only 1 clock cycle

LUT

merger

Up to 2 clusters

4 inputsfrom

comparators

LUT

merger

Up to 2 clusters

4 inputsfrom

comparators

LUT Up to 2 clusters

4 inputsfrom

comparators

Up to 2 clusters

Up to 2 clusters

Page 10: presentatio FEAFS 06 2010

10H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Cluster Finding : Bus Size

• The first priority encoder can remove clusters which couldhave been kept in the overlap finding and in the 4 cluster output bus– A dedicated test bench with random inputs– Simulation of the input load (number strips triggered)

• Simulations shows no loss with 6 clusters/bus with 1% stripoccupency

Priorityencoder

Overlapfinding

Up to 32 clusters

Up to 6 clusters

Up to 12 clusters

Priorityencoder

Up to 32 clusters

Up to 6 clusters

Bus location

Simulation results

Page 11: presentatio FEAFS 06 2010

11H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Cluster Finding : Windowing

• Comparison of the degraded address (on 4 bits) between the retreived 6 clusters from the two set

• Offset to handle possible position difference between the twolayers

• Keep clusters if for a cluster with an address a1 from layer 1, another cluster from layer 2 with an address a2 verifyabs(a1+offset-a2)≤threshold.

… 0 0 1 0 0 0 1 0 0 0 0 0 0 0 …

… 0 0 0 0 0 1 1 0 0 0 0 0 0 0 …Plate 1

Plate 2

Cluster dropped

block

Cluster keept

Ex. : window of size 0, offset 1

Page 12: presentatio FEAFS 06 2010

12H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Part II:Sending the data

Page 13: presentatio FEAFS 06 2010

13H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

FEAFS architectureData sending part

Page 14: presentatio FEAFS 06 2010

14H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

write_data wdata read_datardata

wenable

write

MEMORY

waddr

wraddresspointer

waddr

write

wptr

raddr

sync_wptrsync_rptr

wptr rptr

raddr

rdaddresspointer

rptrreadread

Reset management

asynchronous reset

wclk rclkwreset rreset

waddr raddr

writefull

empty

reset

FIFO

• 2 FIFO of 16 words depth for clusters and readout data• Used to safely pass from LHC clock domain (40MHz) to output link clock domain (up to 100MHz).

Page 15: presentatio FEAFS 06 2010

15H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Clusters: Output Frame

27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

Cluster 1

Size 1Add 1

29 2831 30

0 1

0 1

0 1

0 1

Id Nb cl

0 0

0 1

1 0

1 1

Cluster 2 Cluster 3 Cluster 4

Size 1Add 1 Size 2Add 2

Size 1Add 1 Size 2Add 2 Size 3Add 3

Size 1Add 1 Size 2Add 2 Size 3Add 3 Size 4Add 4

Word 1 Word 2 Word 3 Word 4 Word 5 Word 6 Word 7 Word 8

Chip output: 4 bit bus

Cluster Frame:From 3 to 8 words of 4 bits, depending on the number of clusters.

Page 16: presentatio FEAFS 06 2010

16H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Readout :Output Frame

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

N° StripId word

1 0 0 0

Word 1 Word 2 Word 3 Word 4 Word 5

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

N° StripId word

1 0 0 1

Word 1 Word 2 Word 3 Word 4 Word 5

17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

N° StripId word

1 1 1 1

Word 1 Word 2 Word 3 Word 4 Word 5

113114115116117118119120121122123124125126127128

The 128 strips are split in 8 words of 20 bits.

4 MSB as Id and word count number.

Trigger frames can be inserted between two readout frame.

Page 17: presentatio FEAFS 06 2010

17H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Performs the output bus arbitration with 4 running modes as a function of the FIFO states.

Communication controller

Signbal « trigger off » and « busy »activated

Readout DataTrigger DataSurvival(Readout and trigger FIFO full)

Signal « Busy »activated

IgnoredReadout DataBusy(Readout FIFO full)

Signal « trigger off » activated

IgnoredTrigger DataDerated(Trigger FIFO full)

Readout DataTrigger DataNormal

SpecialPriority 2Priority 1Communication Mode

Page 18: presentatio FEAFS 06 2010

18H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Latency

• Study the latency of thefull chip with respect to theL1 rate, the triggered striprate and the output stage frequency

Page 19: presentatio FEAFS 06 2010

19H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

I2C

Monitoring and slow control through an I2C interface :

Slow control for:– Gain of preamplifier,– Discriminator Threshold,– Strip enable,– Cluster threshold,– Coincidence offset,– Coincidence window,– Readout and cluster Data for test,– Configuration register.

Monitoring for :– Counter of cluster loss,– Counter of data trigger loss,– Counter of data trigger loss.

Page 20: presentatio FEAFS 06 2010

20H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Conclusion

• First prototype develloped in130nm from IBM with VCAD Standard cells• Only 32 inputs : internalmutiplexer• Size: 4mm²• Sent 31th May to the foundry• Estimated power (with 20% activity on inputs): 60mW

Page 21: presentatio FEAFS 06 2010

21H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

SPARES

Page 22: presentatio FEAFS 06 2010

22H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Read out data PipelineData from comparators

128

Readout dataFIFO

L1

Data_in

Communication controller

Data_in

Wr

Full

Full

MUX

Data_out

Data_out

status

Data_out

Wr for time

Busy,trigger-off

MUX sel

I2C

SCL

SDA

Wr_Clk Wr

Rd Rd_Clk

TriggerdataFIFO

Rd

Wr_Clk

Rd_Clk

Empty

Empty

registersPriorityencoderCoincidence researcher Up to 8

clusters

Priorityencoder

Priorityencoder

Cut on width

Clusterresearcher

Up to 32cluster

Up to 32cluster

Cut on width

Up to 32cluster

Up to 32cluster

Up to 4cluster

Up to 4cluster

LHC clock

Link clock

SPARE : Numerical part details of FEAFS

Page 23: presentatio FEAFS 06 2010

23H. CHANAL, Y.ZOCCARATO– journées VLSI – juin 2010

Readout data pipeline

Store the readout data while the trigger is being processed.

When the trigger is coming the data is sent in FIFO

Strip 1

Strip 2

Strip 128

Strip 3

Strip 4

LHC clk

T+1 T+2 T+3 T+4 T+5 T+6 T+7 T+8 T+9 T+135