ARTICLE IN PRESS
Nuclear Instruments and Methods in Physics Research A 554 (2005) 444–457
A 96-channel FPGA-based Time-to-Digital Converter
(TDC) and fast trigger processor module with multi-hit
capability and pipeline
Mircea Bogdan^a, Henry Frisch^a, Mary Heintz^a, Alexander Paramonov^a,*, Harold Sanders^a, Steve Chappa^b, Robert DeMaat^b, Rod Klein^b, Ting Miao^b,
Peter Wilson^b, Thomas J. Phillips^c
^a Enrico Fermi Institute, University of Chicago, USA
^b Fermilab National Accelerator Laboratory, USA
^c Duke University, USA
Received 2 July 2005; received in revised form 15 August 2005; accepted 18 August 2005
Available online 7 September 2005
Abstract
We describe a field-programmable gate array (FPGA) based, 96-channel, Time-to-Digital converter (TDC) and
trigger logic board intended for use with the Central Outer Tracker (COT) [T. Affolder et al., Nucl. Instr. and Meth.
A 526 (2004) 249] in the CDF Experiment [The CDF-II detector is described in the CDF Technical Design Report
(TDR), FERMILAB-Pub-96/390-E. The TDC described here is intended as a further upgrade beyond that described in
the TDR] at the Fermilab Tevatron. The COT system is digitized and read out by 315 TDC cards, each serving 96 wires
of the chamber. The TDC is physically configured as a 9U VME card. The functionality is almost entirely programmed
in firmware in two Altera Stratix FPGAs. The special capabilities of this device are the availability of 840 MHz LVDS
inputs, multiple phase-locked clock modules, and abundant memory. The TDC system operates with an input
resolution of 1.2 ns, a minimum input pulse width of 4.8 ns and a minimum separation of 4.8 ns between pulses. Each
input can accept up to 7 hits per collision. The time-to-digital conversion is done by first sampling each of the 96 inputs
in 1.2-ns bins and filling a circular memory; the memory addresses of logical transitions (edges) in the input data are
then translated into the time of arrival and width of the COT pulses. Memory pipelines with a depth of 5.5 μs allow
deadtime-less operation in the first-level trigger; the data are multiple-buffered to diminish deadtime in the second-level
trigger. The complete process of edge-detection and filling of buffers for readout takes 12 μs. The TDC VME interface
allows a 64-bit Chain Block Transfer of multiple boards in a crate with transfer-rates up to 47 Mbytes/s. The TDC
module also produces prompt trigger data every Tevatron crossing via a deadtimeless fast logic path that can be easily
reprogrammed. The trigger bits are clocked onto the P3 VME backplane connector with a 22-ns clock for transmission
to the trigger. The full TDC design and multi-card test results are described. There is no measurable cross-talk between
channels; linearity is limited by the least-count time bin. The physical simplicity ensures low maintenance; the
functionality being in firmware allows reprogramming for other applications.
© 2005 Elsevier B.V. All rights reserved.
Corresponding author. Tel.: +1 773 7027479; fax: +1 773 8345959.
E-mail address: [email protected] (A. Paramonov).
www.elsevier.com/locate/nima
0168-9002/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.nima.2005.08.071
PACS: 29.40.Gx; 07.50.E; 29.40.+r
Keywords: TDC; FPGA; Pipelined; Multi-hit; Fast trigger processor
1. Introduction
The Collider Detector at Fermilab (CDF) is a
large (5000-ton) detector of particles produced in
proton–antiproton collisions at 1.96 TeV at the
Fermilab Tevatron [2]. The detector consists of a
solenoidal magnetic spectrometer surrounded by
systems of segmented calorimeters and muon
chambers. Inside the solenoid, precision tracking
systems measure the trajectories of particles; the
particle momenta are measured from the curvature
in the magnetic field and the energy deposited in
the calorimeters. The tracking systems consist of a
silicon-strip system with >750,000 channels
around the beam-pipe, followed by the Central
Outer Tracker (COT), a large cylindrical drift
chamber with 30,240 sense wires arranged in 96
layers divided into 8 superlayers of 12 wires each [1].
Four of the superlayers have the wires parallel
to the beam axis; the remaining four are tilted by
2° to provide small-angle stereo for 3D reconstruction
of tracks. The maximum drift time of the
COT is 200 ns; the maximum drift length is
0.88 cm.
During the present Run II, which started in
2001, the peak luminosity of the Tevatron has
grown to over 10³² cm⁻² s⁻¹, a factor of more than
five higher than in Run I. The Tevatron operates
with a time between beam crossings of 396 ns, with
the result that the occupancy (hits/channel) in the
COT increases with luminosity as the average
number of proton–antiproton collisions per bunch
crossing is now substantially greater than one.
A broad range of efforts are underway to upgrade
the readout bandwidth to allow operation at
luminosities up to 3 × 10³² cm⁻² s⁻¹.
The COT is used to provide a precise measure-
ment in the magnetic spectrometer of the trajec-
tories of the particles produced in the high-energy
proton–antiproton collisions. These measurements
are made by recording the time of arrival of
electrical charge at sense wires in the COT.
A particle traversing the COT perpendicular to
the beams traverses 96 layers of sense wires; there
are a total of 30,240 sense wires in the COT. The
measurement of the time of arrival at each wire is
made with a multi-channel Time-to-Digital converter
(TDC) [3].
The conversion of time to a digital signal is a
well-developed and sophisticated field [4–7]. In this
note we describe the design of a new 96-channel
TDC and trigger logic module designed for the
COT implemented in two 48-channel field-pro-
grammable gate arrays (FPGAs) [8]. A third,
smaller, FPGA serves as the VME interface. The
other chips on the board are limited to delay lines,
buffers on the input and output signals to the
connectors, and DC-to-DC converters to supply
voltages not available in the existing VME [9,10]
crates.
Using FPGAs for the combined TDC and
trigger logic functionalities has the benefits of
negligible cross-talk, linearity limited only by the
least count, integrated flexible signal processing for
trigger logic, and high-reliability due to very-low
chip count. In addition, because the implementa-
tion is in firmware, the chip can easily be
reconfigured for enhanced capability; the ability
to reconfigure also shortens the design cycle as the
board does not need to be rebuilt (in general)
during prototyping.
Thirty working prototypes have been built and
tested. A comprehensive suite of test routines,
including some that exploit the capabilities of large
FPGAs to implement sources of test data, has
been implemented and documented. We present
results on performance and readout bandwidth.
Fig. 1 shows one of the 30 preproduction boards.
2. TDC Specifications
A summary of the TDC physical and opera-
tional characteristics is given in Table 1. The
schematics of the board are available at [11]. Details of how the TDC operates are given in the
text below.
3. Principle of operation of the TDC
Two functions are implemented in the firmware
of the FPGAs: a conventional TDC, digitizing the
time of arrival of signals, and the processing of
COT data for the trigger system. In this section we
describe the TDC implementation.
Secondary particles from antiproton–proton
(p̄p) collisions traverse the Tevatron beam pipe,
the silicon-strip vertex detector, and then the COT
drift chamber volume. The charged particles ionize
the gas in the drift chamber volume; the tracks are
measured from the time of arrival of the ionization
on the sense wires of the COT [1]. These electrical
pulses (hits) are amplified and shaped by the
Amplifier Shaper Discriminator (ASDQ) cards [1]
directly on the end-plates of the COT, and
transmitted to Repeater cards that drive the
cables to the VME crates on the outside of
the magnet yoke that contain the TDC boards.
The TDC is used to digitize the time of arrival and,
as a measure of pulse height, the width of the
pulses from the Repeater cards.
The time-to-digital conversion is implemented
with two Altera Stratix FPGAs, each handling 48
sense wires. This device has an LVDS differential
I/O interface that consists of one dedicated serial-
izer/deserializer circuit [13] for each of the 48
differential I/O pairs. Serial data are received
along with a low-frequency clock. An internal
Phase-Locked Loop (PLL) multiplies the incoming
clock by a factor of 1 to 10. Each input signal is
sampled at the resulting high-frequency clock
rate, converting it to a (1-bit) serial stream, which
is then shifted serially through a shift register.
The shift register is read out as a parallel word
at the low-frequency clock rate, thus convert-
ing the serial data stream into a parallel data
stream that contains the input data sampled at the
higher clock rate. In this application the low-frequency
clock is internally generated with a 12 ns
period and the multiplier factor is set to 10, for a
resulting 1.2 ns sampling period of the incoming
LVDS signal and a 10-bit wide parallel output
data stream.
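The serial-to-parallel step described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the function name `deserialize`, the use of picoseconds to keep the arithmetic in integers, and the convention that a bin is high when its sample instant falls inside the pulse are all illustrative choices, not details of the firmware.

```python
# Software sketch of the 1.2 ns sampling and 10-bit word packing.
# All names here are hypothetical; the real logic lives in the FPGA SERDES.
BIN_PS = 1200      # one sample every 1.2 ns, expressed in picoseconds
WORD_BITS = 10     # ten samples per 12 ns parallel word


def deserialize(pulse_start_ps, pulse_stop_ps, n_words):
    """Sample one input pulse and return n_words strings of ten '0'/'1' bits.

    The earliest sample of each word is written first (leftmost).
    Assumption: a bin reads '1' if its sample instant lies in
    [pulse_start_ps, pulse_stop_ps).
    """
    words = []
    for w in range(n_words):
        word = ""
        for b in range(WORD_BITS):
            t = (w * WORD_BITS + b) * BIN_PS   # sample instant of this bin
            word += "1" if pulse_start_ps <= t < pulse_stop_ps else "0"
        words.append(word)
    return words


if __name__ == "__main__":
    # A pulse spanning bins 7 through 23 (8.4 ns to 28.8 ns).
    for word in deserialize(8400, 28800, 3):
        print(word)
```

With these inputs the three words come out as 0000000111, 1111111111 and 1111000000, the same pattern as the successive output words printed in Fig. 2.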
Fig. 2 illustrates the serial-to-parallel conversion,
as seen in the Altera Quartus II [14] simulation
window. The input pulse is converted into a 10-bit
parallel data stream, clocked out as successive
words at a 12 ns period. The leading edge of a hit
in the tracking chamber is determined by inspecting
this stream, two words at a time, and counting the
number of '0' bits before the first '1' bit of a
string of at least four '1' bits within the two
consecutive words. The width of a hit is calculated by
counting the number of successive bits (either '0'
or '1') until the start of a string of at least four
consecutive '0' bits occurs.
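The leading-edge and width rules above can be sketched in software as a scan over the sampled bits. This is a minimal illustration, not the firmware: it scans one flat bit string instead of two words at a time, the name `find_hits` is hypothetical, and a pulse still high at the end of the data is assumed to run to the last bin.

```python
# Sketch of the hit definition: a hit starts at the first '1' of a run of at
# least four '1' bits and lasts until the start of a run of four '0' bits.
MIN_RUN = 4   # four 1.2 ns bins = 4.8 ns minimum width and separation


def find_hits(bits):
    """Return a list of (leading_bin, width_bins) for a '0'/'1' string."""
    hits = []
    i = 0
    n = len(bits)
    while i + MIN_RUN <= n:
        if bits[i:i + MIN_RUN] == "1" * MIN_RUN:
            lead = i
            # Gaps shorter than four '0' bits are counted as part of the hit.
            end = bits.find("0" * MIN_RUN, lead)
            if end == -1:
                end = n   # assumption: an unterminated pulse runs to the end
            hits.append((lead, end - lead))
            i = end       # resume the search at the start of the '0' run
        else:
            i += 1
    return hits


if __name__ == "__main__":
    stream = "0000000111" + "1111111111" + "1111000000"
    print(find_hits(stream))
```

For the three words shown in Fig. 2 this gives one hit with leading edge in bin 7 and a width of 17 bins; a pulse of only three '1' bits produces no hit.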
4. TDC Board: Block diagram
The physical layout and the data flow on the
board are presented in Fig. 3. We step through
each element in turn below.
Fig. 1. The CDF-II TDC board. The two large chips with silver
heat sinks are the TDC FPGAs; the large black objects are DC-to-DC
converters (the layout allows additional DC-to-DC
converters, not needed and hence not stuffed on this board).
The FPGA for the VME interface can be seen in the upper left-
hand corner. Connector headers and dip switches near the
center of the board allow debugging with a logic analyzer.
• The Front Panel (on the left in Fig. 3) receives
96 differential inputs, arranged in four 24-channel
connectors, that receive pulses from
the amplifier/shaper circuits of the COT. The
signals are first applied to a receiver block
that converts them from a CDF-specific
quasi-LVDS signal [12] into standard LVDS
and passes 48 of them directly to each of the
two FPGA TDC chips, which have identical
designs.
• Each of the two TDC FPGAs (TDC Chips)
does the time-to-digital conversion for 48 wires
to generate the Hit-Count and Hit-Data results.
These are stored in internal VME Readout
buffers implemented on the chip. The TDC
Chips also generate prompt data for the Level-1
track trigger processor (XFT).
• The VME interface block is implemented with
an Altera Apex FPGA [15]. This block coordinates
VME access to the TDC Chips for
Table 1
The physical and operational characteristics of the CDF-II TDC
Characteristic               Values                  Comment
TDC digitization performance
Channels                     96                      48/FPGA
Time bin size                1.2 ns
Hits/channel                 ≤7 hits                 Configurable via VME
Min time between hits        4.8 ns
Full scale range             304.8 ns                254 × 1.2 ns
Min width/hit                4.8 ns
Max width/hit                304.8 ns
Differential non-linearity   <200 ps                 See Section 8.1
Accuracy (jitter)            ≤1 count                See Section 8.1
Channel-to-channel skew      <100 ps
L1 pipe-line size            512 words = 6.144 μs    480-bit words
Test data RAM size           512 words = 6.144 μs    480-bit words
L2 buffer length             ≤64 words/768 ns        480-bit words; Configurable
Processing time              12 μs after L2A         Includes readout packing
Min interval between L2As    12 μs                   L2A is Level-2 Accept
Prompt (Trigger) Outputs
XFT trigger bits/wire        6                       From 11 time windows
XFT time-window              6 ns                    Min 1.2 ns; Max 12 ns
# of XFT time-windows        11/wire                 Mapped into 6 trigger bits
Trigger latency              80 ns after BC          First word out
Trigger output freq.         32 bits/22 ns
Trigger output               43 pins                 TTL, on VME P3
Readout characteristics
VME interface                VME64                   Implemented in FPGA
VME readout modes            A32/D32, A32/D64        D64 in CBLT mode only
CBLT64 transfer rate         47 MBytes/s             Burst speed
Test modes                   Data generator          Internal 8192 word memory
Physical characteristics
Physical format              9U VME                  ANSI VIPA [9]
Power requirements (V/A)     5 V/15 A; 5 V/2 A
Input connectors             68-pin                  Mini-D Ribbon
Input levels                 LVDS                    CDF uses quasi-LVDS [12]
Front panel LEDs             1 Triple LED/FPGA       Configurable in firmware
Trigger output connector     VME P3
regular and Chain Block Transfers (CBLT) [9]
in both 32- and 64-bit modes. The VME chip
itself is connected to only the 16 least significant
bits of the VME data bus (the TDC Chips
connect to all 32 data lines).
• Control signals from the CDF data acquisition
and trigger systems are brought onto the TDC
board using user-defined pins of the VME P2
backplane connector. The control signals are
bussed on the backplane of the CDF-standard
9U VME/VIPA crate [9,10] from a CDF Tracer
Card [16] to each of the TDC boards. The
specific control signals used by the TDC are as
follows:
  ◦ The CDF system clock: this is the master
    reference signal for the CDF data-acquisition
    system. The clock has a 132-ns period
    and is synchronous with the accelerator RF
    structure. On the TDC board the differential
    PECL signal from the backplane is first
    converted to TTL, then phase-locked, buffered
    and applied to the TDC chips. The
    clock signal is also applied to the TDC chips
    after passing through a pair of programmable
    delay lines (0.25 ns granularity, 64 ns
    range).
  ◦ The Bunch Crossing signal (BC), indicating
    that a clock corresponds to a crossing of
    proton and antiproton bunches.
  ◦ The Bunch 0 signal (B0), which marks the first
    proton bunch and comes once per cycle
    around the Tevatron ring.
  ◦ The Level 1 and 2 trigger Accept/Reject
    signals, as well as the Level 2 Buffer address
    bits.
  ◦ A calibration pulse from the VME backplane.
    This is converted from PECL to ECL
    and can be applied to a pair of pins on each
    of the four front panel connectors. The pulse
    is thus sent to the amplifier-shaper-discriminator
    card (ASDQ) of the COT [1] and is
    used for testing and calibration.
[Fig. 2 traces: 12 ns clock; input pulse with leading edge and trailing edge marked; parallel output bits 9 to 0; OUTPUT row starting at 0 ns for Word 0 to Word 6, including the successive words 0000000111 1111111111 1111000000.]
Fig. 2. An example serial-to-parallel conversion of one of the 96 input channels into a 10-bit-wide parallel data stream as seen in the
Quartus II [14] simulation window. The top two traces are one input line from the tracking chamber and the locally-generated 12-ns
clock. The next 10 traces are the 10 bits of parallel data from the serial-to-parallel conversion, output as one word every 12 ns. The
parallel data, shown at the bottom of the figure, are then examined two words at a time for transitions that signify a leading or trailing
edge of a hit in the COT.
• The VME P3 backplane connector is used to
transmit trigger flags generated by the TDC to
the eXtremely Fast Tracker (XFT) processor to
identify tracks in the COT for the Level-1
trigger.
• Each FPGA is connected to a 20-pin header
so that it is easy to use a logic analyzer for
testing and diagnosis. Signals can be routed
to the header by programming the FPGA
firmware.
5. The TDC FPGA Chip
The block diagram of a TDC Chip is presented
in Fig. 4. There are two major data paths inside
the TDC Chip, one to record the COT hits for
VME readout, and one for the generation of the
prompt trigger bits (Trigger Primitives) for the
XFT trigger track processor. The Chip is also
provided with a Test Data generator, an LVDS
pulse generator and a PLL clock generator.
The major functional blocks inside the TDC
Chip are described individually below. These are
implemented as firmware and are optimized for the
CDF application; other applications can be
accommodated by firmware changes.
5.1. The input block
Each TDC Chip has four banks of 12 high-speed LVDS inputs. From each bank a 120-bit
wide data bus passes data to the MUX/MASK
block (see Fig. 4), which can be set under VME
control to block out any unwanted channel (for
example, a COT wire that is continuously set true
due to some failure). The MUX/MASK block
also allows internal testing of the TDC Chip by
allowing the inputs to be switched to a test pattern
generated inside the Chip with the Test Data
Generator block (described in Section 5.5). The
fast digitization and conversion to a 10-bit wide
data stream for each channel then follows.
[Fig. 3 diagram labels: front-panel connectors FP Conn1 to FP Conn4 (68 pin LVDS, from COT), each followed by a Receiver; TDC Chip_0 and TDC Chip_1 (STRATIX EP1S30F780C6); VME chip (EP20K100QC240-3V); CDF_CLK_DEL (2 x 3D3418-0.25S); VME buffers; Buffer to XFT; CDF Control; backplane connectors P1, P2 and P3.]
Fig. 3. The physical layout of the TDC board. The four input connectors, each with 24 LVDS channels, are on the left; the VME
backplane connectors are on the right. The elements are described in turn in the text.
5.2. The pipeline and the Level 2 buffer system
The CDF Level-1 trigger is deadtime-less, with
all front-end data held in a pipeline for 5.544 μs while
the Level-1 trigger decision is being made
[17]. On the TDC card the delay is implemented
with a clocked pipeline. On receipt of a Beam
Crossing (BC) signal, an input memory address
counter is set to zero; the counter then increments
on every 12 ns clock. The phase of this input pulse
can be adjusted on each board in 0.25 ns steps, set
by VME, to compensate for signal propagation
and input signal length differences. The memory is
512 words deep, each word containing 480 bits (48
channels × 10 bits/channel). The memory has two
ports; the first port is always writing and the
second port is always reading. The adjustable
address of the second port, an offset from the first
port, is set to establish the desired delay period for
the pipeline. The maximum delay is 6.1 μs.
To reduce deadtime in the CDF Level-2 (L2)
trigger system, on a Level-1 Accept signal (L1A)
the data from a given beam crossing are transferred
to one of four Level-2 buffers, awaiting a L2
trigger decision [17]. The four L2 buffers are
independently controlled by an accompanying L2
buffer-selection signal. These signals are in phase
with the CDF clock pulse. The L2 buffers are
two-port memories; each has a respective write
address counter for the input write port and all
share a single address counter for their (output)
read port. This allows the writing of a second
Level-1 buffer while a first is still collecting data.
The write clock is synchronous with the logic in
the pipeline, 12 ns per tick. The data are written
10 bits wide per channel, so that 48 channels are
written at each tick in a 480-bit wide word. To
achieve a maximum of 384 ns for the time range,
the maximum length is 32 words, set via VME at
initialization (the maximum drift time in the CDF
COT is 200 ns).
[Fig. 4 diagram labels: Inputs from COT; SERDES IN; TEST-DATA RAM*; MUX/MASK*; The Pipeline (RAM*, 512 words); L2 Buffers (RAM*, 64 words; 00, 01, 10, 11); Edge Detector*; Hit Count Buffer* (7 VME32 words); Hit Data Buffer* (168 VME32 words); XFT Block*; XFT-DAQ*; PLL; CDF CLOCK; delayed 12 ns, 22 ns and 66 ns clocks; L1A; L2A; Tx Pulse RAM* (512 words); SERDES OUT; Tx Out Pulse to front panel; VME Decoder; VME Access (*); To VME Interface; To P3.]
Fig. 4. The functional block diagram of the TDC FPGA (TDC Chip). All processing is determined by programming in firmware. Each
Chip handles 48 LVDS channels (shown as coming into the SERDES block in the upper left). The prompt trigger flags for the XFT
trigger processor are output through the P3 VME connector. Data are read out by the CDF Data Acquisition system from the Hit
Count and Hit Data Buffers. An asterisk indicates registers or memories that are VME-accessible. The individual blocks in the diagram
are described in the text.
A Level-2 Accept signal, together with an
address pointer, selects one of the four Level-2
buffers to be transferred to the Edge Detector
Block. The read port of the selected memory is
driven through the full range of stored data
addresses to present all the stored data. As not
all buffers contain data from beam crossings that
pass the Level-2 trigger (and therefore these would
not receive a Level-2 Accept signal), the logic
allows any Level-2 buffer to be overwritten
whether or not it has been read. No data memory
buffer is ever erased.
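The pipeline timing in this section can be checked with a short software model of a two-port circular memory. It is only a sketch: the class and function names are hypothetical, the reset of the write address on each Bunch Crossing signal is left out, and each 480-bit word is stood in for by a plain Python integer.

```python
# Sketch of the Level-1 pipeline as a two-port circular memory.
WORD_PERIOD_NS = 12     # one 480-bit word is written every 12 ns
PIPELINE_DEPTH = 512    # words in the pipeline memory


def delay_words_for(delay_ns):
    """Convert a pipeline delay in ns into a read-address offset in words."""
    if delay_ns % WORD_PERIOD_NS:
        raise ValueError("delay must be a multiple of the 12 ns word period")
    return delay_ns // WORD_PERIOD_NS


def words_for_window(window_ns):
    """Number of 12 ns words needed to cover a time range (rounded up)."""
    return -(-window_ns // WORD_PERIOD_NS)


class Pipeline:
    """Write port advances every tick; read port trails it by a fixed offset."""

    def __init__(self, delay_words):
        if not 0 < delay_words < PIPELINE_DEPTH:
            raise ValueError("offset must lie inside the memory depth")
        self.mem = [0] * PIPELINE_DEPTH
        self.write_addr = 0
        self.delay_words = delay_words

    def tick(self, word):
        """Store this tick's word and return the word written delay_words ago."""
        self.mem[self.write_addr] = word
        read_addr = (self.write_addr - self.delay_words) % PIPELINE_DEPTH
        out = self.mem[read_addr]
        self.write_addr = (self.write_addr + 1) % PIPELINE_DEPTH
        return out


if __name__ == "__main__":
    offset = delay_words_for(5544)              # 5.544 us Level-1 latency
    print(offset)                               # 462 words
    print(PIPELINE_DEPTH * WORD_PERIOD_NS)      # 6144 ns memory depth
    print(words_for_window(384))                # 32 words for a 384 ns range
```

Under these assumptions the 5.544 μs latency corresponds to a read offset of 462 words, the 512-word depth to 6.144 μs, and the 384 ns time range to the 32-word Level-2 buffer length quoted above.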
5.3. The edge detector block
The purpose of the Edge Detector Block is to
find hits on the wires. Hits are defined as pulses of
at least 4.8 ns in width. Since the pulses can be
of indefinite length, the techniques of pattern
matching or look-up tables cannot be used. The
technique used in the Edge Detector Block is to
look for leading and trailing edges of a pulse. A
leading edge is defined as a transition from low (0)
to high (1) and a trailing edge is defined as a
transition from high (1) to low (0). It is assumed
that all wires start out in a low (0) state and the
first transition to find is a leading edge.
The Edge Detector Block is made up of two
modules. The first, called the ED, finds and stores
the edges on each of the 48 wires. The second,
called the ED48, controls the timing of the data
transfer into the Edge Detector Block, collects and
packs the output hit data, and signals when the
Block is finished. Each wire has its own dedicated
ED module, making 48 on each TDC Chip; there
is only one ED48 module on each Chip.
The data from the Level-2 Buffer are fed into
the Edge Detector Block in 10-bit words. A single
ED looks for hits in two consecutive 10-bit words
at a time. The beginning of a hit is defined as a
zero followed by at least four ones, or in the case
of the first word, four ones in a row. The end of a
hit is defined as a one followed by four zeroes.
There are three possible transitions in each word
and each transition needs a memory cycle.
The maximum number of words looked at per
beam crossing for each wire is variable, with a
maximum of 33 words (396/12). The maximum is
set by the time between Tevatron beam crossings,
396 ns, and the data clocking period of 12 ns for
each word (1.2 ns per bit times 10 bits per word).
The number of words to be searched by the ED is
set in a register via VME at initialization.
5.3.1. A worked example
Table 2 shows the relationship between the
position of bits in the words sent to the ED and
their respective time value as an example in a word
that has three possible transitions.
In this case, there is a leading edge (transition
#1) starting at time value 1 and ending at time
value 4 (transition #2). There is also possibly
another leading edge (transition #3), starting at
time value 9, depending on what is in the next
word. In this case, the ED, which looks at 2 words
at a time, would find a hit if the next word started
with three ones.
Once a hit has been found, the data describing
the hit are stored in a RAM in the ED and the hit
total is incremented. Each hit is characterized by
the number of the time bin of the leading edge and
the width, expressed as the number of time bins in
the hit. Thus if the data in the example of Table 2
were the first word in the data stream, the leading
edge stored would be 1 and the width would be 4.
If instead the example data were the third word,
the leading edge time would be 21 (10 bits each for
the first and second words plus time value 1 in the
third word) and the width would still be 4.
Table 2
A sample word showing three possible transitions in the data word sent to the Edge Detector
Bit position   [9] [8] [7] [6] [5] [4] [3] [2] [1] [0]
Sample data     0   1   1   1   1   0   0   0   0   1
Time value      0   1   2   3   4   5   6   7   8   9
The first is a leading edge starting at time value 1 and ending at
time value 4, which is the second transition. There is also
possibly another leading edge transition starting at time value 9,
depending on what is in the next word. Bits are ordered from
early (bit position [9]) to late (bit position [0]).
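The bookkeeping of the worked example can be reproduced with a few lines of code. This is an illustrative sketch with hypothetical function names; it lists every level change in a word as a candidate transition and does not apply the four-bin minimum that the ED uses to confirm a hit.

```python
# Sketch of the Table 2 mapping between bit position and time value.
WORD_BITS = 10


def time_value(bit_position):
    """Bit position [9] is the earliest sample, [0] the latest."""
    return WORD_BITS - 1 - bit_position


def absolute_bin(word_index, bit_position):
    """Time bin counted from the start of the first word (word_index 0)."""
    return WORD_BITS * word_index + time_value(bit_position)


def transitions(word, previous_level=0):
    """List (time_value, kind) for each level change inside one 10-bit word.

    A 'trailing' entry marks the first low bin, so a pulse whose last high
    bin is time value 4 shows its trailing transition at time value 5.
    """
    found = []
    level = previous_level
    for t in range(WORD_BITS):
        bit = (word >> (WORD_BITS - 1 - t)) & 1
        if bit != level:
            found.append((t, "leading" if bit else "trailing"))
            level = bit
    return found


if __name__ == "__main__":
    sample = 0b0111100001                  # the sample data of Table 2
    print(transitions(sample))             # three candidate transitions
    print(absolute_bin(2, 8))              # same word arriving third: bin 21
```

The sample word yields a leading edge at time value 1, a trailing transition after four high bins (width 4), and a possible new leading edge at time value 9; placing the same word third in the stream moves the leading edge to bin 21, as stated in the text.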
5.4. The XFT Block
The choice of implementing the TDC function-
ality by firmware in an FPGA facilitates integrat-
ing an associated functionality, the implementation
of complex deadtimeless pattern recognition to
feed the tracking trigger system. The TDC XFT
block generates Trigger Primitives used by the
eXtremely Fast Tracker (XFT) [18], which identi-
fies tracks in the Central Outer Tracker (COT).
The tracks are used in the Level 1 Trigger, as well
as in Level 2, and consequently the complete
pattern recognition and momentum reconstruction
have to be available within 5.5 μs after the beam
crossing. The sense-wire planes of each superlayer
in the COT are tilted in the r–φ plane so that a
high momentum track will traverse each superlayer
[1], and consequently will travel between a pair of
sense wires in each plane, resulting in some hits
that are prompt. The present Run IIa XFT splits
the 396 ns interval between beam crossings into
three time bins, a prompt, a not sure, and a delayed
time bin. These three bins are logically combined to
give 2 trigger bits per wire every 396 ns. The new
TDC/XFT design described here uses 11 time
windows to produce 6 output trigger bits per COT
wire every 396 ns. The 6 bits, referred to below
as Trigger Primitives, are derived from the hit
occupancies in the 11 time windows with Boolean
logic. The larger number of time windows allows
better momentum resolution and fake track rejection.
The Trigger Primitives, consisting of the 6 bits
per wire times 48 wires, are output each 396 ns
from the XFT block on each of the 2 FPGAs.
Every 22 ns the TDC Chip sends out 16 bits to
the P3 backplane (48 bits for 66 ns, and so 48 ×
6 = 288 bits for 396 ns). In order to speed up the
transmission of data to the XFT, the bit for each
of the time bins is transmitted in turn after the
calculation for its corresponding three time-windows
(see below) is completed. The time windows
can be reprogrammed to optimize performance by
changing data in VME registers without firmware
changes.
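The bit budget quoted above can be verified with a short calculation. The function name and its keyword arguments are illustrative only; the numbers are the ones given in the text.

```python
# Check that the P3 output rate matches the Trigger Primitive volume per chip.
def trigger_budget(wires=48, bits_per_wire=6, crossing_ns=396,
                   word_bits=16, word_period_ns=22):
    """Return (bits needed, output cycles, bits available) per crossing."""
    needed = wires * bits_per_wire            # Trigger Primitive bits per chip
    cycles = crossing_ns // word_period_ns    # 22 ns output cycles per crossing
    capacity = cycles * word_bits             # bits one chip can send on P3
    return needed, cycles, capacity


if __name__ == "__main__":
    print(trigger_budget())   # (288, 18, 288)
```

The 288 bits produced per chip in each 396 ns crossing exactly fill the 18 output cycles of 16 bits, so the link has no spare capacity at this word width.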
The TDC XFT block receives two types of
signals, the CDF control pulses Bunch 0 (B0) and
Bunch Crossing (BC), and the primary (digital)
data-stream from the COT. The XFT block sends
out timing alignment signals and the Trigger
Primitives (trigger bits) to the P3 connector on
the VME backplane for transmission to the XFT.
The TDC XFT block is controlled by two VME
registers and an internal RAM [19]. The values of
the registers determine the input and output
delays, and the RAM contents define the time-window
intervals.
The TDC XFT block includes three main blocks
(Fig. 5):
[Fig. 5 diagram labels: 48 Occupancy Detectors; Output Multiplexer; Trigger Logic Control; Data Stream; Trigger Bits; Trigger Primitives; CDF_CLK; BC; B0; BC Delayed; time-window enable bits; time-window clear bits.]
Fig. 5. Block diagram of the TDC XFT Logic. The data stream comes from the MUX/MASK Block (Section 5.1) as 480-bit words
every 12 ns. The CDF_CLK, BC and B0 signals are the 132 ns CDF master clock, and the Bunch Crossing and Bunch Zero signals
generated by the Tevatron, and are transmitted to the VME backplane through the CDF DAQ system [16]. The TDC XFT block sends
out the Trigger Primitives, which are the multiplexed trigger bits, and trigger control signals.
Occupancy Detectors (OD).
The 48 Occupancy Detectors, one per channel,
receive the digitized COT data stream (48
channels 10 bits/channel every 12 ns), per-form hit recognition, and send out 48 6
Trigger Primitive bits every bunch crossing to
the Output Multiplexer. Each OD looks for a
1111 pattern (a hit) in two consecutive words
in the data stream (20-bits), which corresponds
to a 24 ns time interval. The search for hits is
performed separately by two Hit Scanners
inside each OD; one for hits starting in the first
5 bits (6 ns), and the second for hits in the next 5
bits of the 20-bit data segment. Each Occu-
pancy Detector contains 11 independent 1-bit
registers, (Time-Window Registers), controlled
by the Trigger Logic Control, that store hit
information. Each register records if there is a
hit in during the corresponding 12-ns time-
window. In the current firmware design each
time-window is controlled by two bits (time-
window control bits), used to implement
separate hit scanning for the first and the last
6 ns of each 12-ns clock interval. If the first
control bit is high, the register is sensitive to the
Hit Scanner for the first 6 ns of every 12ns
interval (the first 5 bits of 10-bit data word).Similarly, if the second bit is high, the register is
sensitive to the Hit Scanner for the last 6 ns of
every 12 ns (last 5 bits of 10-bit data word). The
time-window register records the logical OR of
the output of the two Hit Scanners. For each of
the 11 registers, the time-windows covering a
396 ns cycle (33 12-ns intervals) are defined by a
2 33 bitmap stored in the Trigger Logic
Control. At the end of every 396-ns cycle the
Time Window Register bits are sent to the
Output Multiplexer. The registers are thencleared.
This implementation allows defining time-win-
dow ranges in units of 6 ns for the CDF XFT.
However, hit scanning can be done separately
for every bit of the 10-bit data word (this would
require configuring the firmware for 10 time-
window control bits per channel). In this case
the time-window unit would be 1.2 ns. The 6 ns
resolution is thought to be sufficient for the
existing XFT system.
Trigger Logic Control (TLC). The TLC con-
trols the XFT logic. Every 12 ns the Trigger
Logic Control block sends 22 bits, two per time-
window to the Occupancy Detectors definingwhich 11 time-windows are enabled for this
clock cycle. All Occupancy Detectors get the
same 22 time-window bits. These time-window
bits are stored in a 22-bit wide RAM, accessible
via VME. The RAMs Address Counter is
incremented every 12-ns clock and hence the
time-window bits refresh every 12ns. The
Address Counter starts to count on receiving
an XFT-Enable pulse, which is adjusted to be
delayed by the same amount as the COT data
stream relative to the Bunch Crossing (BC) and
Bunch Zero (B0).
Output Multiplexer (OM). This receives the
trigger bits from the 48 Occupancy Detectors
and sends the Trigger Primitives in parallel
to the P3 backplane. Each OM sends out a
16-bit Trigger Primitive word every 22 ns on
the P3 connector to the XFT. The signals
are buffered to the P3 connector as TTL levels.
The OM also sends synchronously a Word_0
marker, the B0 marker if appropriate (i.e. the
crossing is that of the Tevatron bunch 0), and
an alignment signal (Data Strobe). The OM
does not perform any logical operations on
the trigger bits, but sends them in the order
required by the XFT [18].
The existing cables to the XFT have too low a
band-width to transmit the 22-ns Main Clock as
a data strobe, and so Trigger Primitive bits are
sent on the leading and trailing edges of a
slower clock which is also transmitted on the
cables. The Data Strobe (DS) is a 44-ns clock
formed by doubling the period of the Main
Clock. Thirty-two bits, 16 from each TDC
Chip, are sent every 22 ns, with 18 such cycles in
the beam-crossing period of 396 ns.
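The time-window logic described above can be sketched in a few lines of Python. This is an illustrative model, not the firmware: the function name, the bit ordering within the 10-bit word (most-significant bit taken as the earliest 1.2-ns sample), and the rule that a Hit Scanner fires when any of its five bits is set are all assumptions made for the example.

```python
INTERVALS_PER_CYCLE = 33  # 33 x 12 ns = 396 ns beam-crossing period
HALF_BITS = 5             # 5 samples of 1.2 ns = 6 ns


def time_window_register(words, bitmap):
    """Model one Time Window Register over a 396-ns cycle.

    words  -- 33 ten-bit integers, one per 12-ns interval
    bitmap -- 33 (first_half, second_half) control-bit pairs
    Assumptions: the MSB of each word is the earliest sample, and a
    Hit Scanner fires when any of its five bits is set.
    """
    if len(words) != INTERVALS_PER_CYCLE or len(bitmap) != INTERVALS_PER_CYCLE:
        raise ValueError("expected 33 data words and 33 control pairs")
    register = 0
    for word, (en_first, en_second) in zip(words, bitmap):
        first_half = (word >> HALF_BITS) & 0x1F   # first 6 ns of the interval
        second_half = word & 0x1F                 # last 6 ns of the interval
        hit = (en_first and first_half) or (en_second and second_half)
        register |= 1 if hit else 0               # logical OR, held until cleared
    return register


# A hit in the last 6 ns of interval 3.
words = [0] * INTERVALS_PER_CYCLE
words[3] = 0b0000000110
only_first = [(1, 0)] * INTERVALS_PER_CYCLE
only_second = [(0, 1)] * INTERVALS_PER_CYCLE
print(time_window_register(words, only_first))   # window misses the hit
print(time_window_register(words, only_second))  # window contains the hit
```

The same bookkeeping shows why the two clock domains line up at the end of a cycle: 33 intervals of 12 ns and 18 output cycles of 22 ns both span 396 ns.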
5.4.1. TDC XFT-DAQ Block
The TDC XFT-DAQ block of the firmware,
used for testing and diagnostic purposes only,
connects to a dedicated DAQ system similar to the
hit-data stream. It has the same structure, consist-
ing of a Pipeline, L2 buffers, and VME-readout
buffers, and follows the same L1A/L2A sequence
as that of the main hit-data stream. The length of
the XFT-L2 buffers is also VME controlled. For
testing, the XFT block is fitted with another simple
VME Readout RAM that contains the current
XFT Trigger Primitives. This RAM can be frozen
and read out via VME for diagnostic tests, in
particular for debugging the TDC-XFT connec-
tion.
5.5. The Test Data Generator
The Test Data Pattern Generator inside the
TDC Chip allows testing of the functionality of the
TDC by running test data through the entire
sequence. The Pattern Generator is implemented
in the firmware as a VME accessible dual-port,
512-bit wide RAM, which can store 8192 32-bit
words. The first 480 bits of the 512 are used to
drive the 48 channels with 10-bit words of test
data. The Pattern Generator is used in the tests
described below in Section 8.
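As a rough illustration of how one 512-bit row of the Pattern Generator RAM could map onto the 48 channels, the sketch below packs and unpacks 10-bit channel words. The helper names and the bit layout (channel 0 in the lowest 10 bits, the top 32 bits unused) are assumptions; the text only states that the first 480 bits drive the channels. The row count follows from the stated capacity: 8192 32-bit words fill 512 rows of 512 bits.

```python
N_CHANNELS = 48
BITS_PER_CHANNEL = 10
RAM_WIDTH = 512
ROWS = 8192 * 32 // RAM_WIDTH          # 512 rows of 512 bits
CHANNEL_MASK = (1 << BITS_PER_CHANNEL) - 1


def pack_pattern_row(channel_words):
    """Pack 48 ten-bit test words into one 512-bit row (assumed layout)."""
    if len(channel_words) != N_CHANNELS:
        raise ValueError("expected one word per channel")
    row = 0
    for ch, word in enumerate(channel_words):
        if not 0 <= word <= CHANNEL_MASK:
            raise ValueError("channel word must fit in 10 bits")
        row |= word << (BITS_PER_CHANNEL * ch)
    return row


def unpack_pattern_row(row):
    """Split one 512-bit row back into 48 ten-bit channel words."""
    if not 0 <= row < (1 << RAM_WIDTH):
        raise ValueError("row must fit in 512 bits")
    return [(row >> (BITS_PER_CHANNEL * ch)) & CHANNEL_MASK
            for ch in range(N_CHANNELS)]


# Put a two-sample-wide pulse on channel 5 only.
test_words = [0] * N_CHANNELS
test_words[5] = 0b0001100000
row = pack_pattern_row(test_words)
print(unpack_pattern_row(row)[5] == test_words[5])
```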
5.6. The ASDQ Pulse Generator
The COT electronics chain of ASDQ-Repeater-
TDC boards can currently be calibrated by having
each TDC send a differential-ECL calibration
pulse upstream on a pair of dedicated lines in the
multi-conductor signal cable to its associated
ASDQ cards. Each ASDQ uses this input to
generate a calibration pulse at its input, which then
propagates back through the data lines of the same
signal cable [1]. This loop allows a time calibration
of the signal path, including the actual cable
length. Multiple chains can be calibrated simulta-
neously by providing the bussed calibration back-
plane signal to each TDC (see Section 4). The
TDC firmware also provides for local generation
on the TDC of the pulses to the ASDQ cards. The
choice of the two calibration modes is made by
VME register selection.
In the local generation mode, the SERDES
OUT block (see Fig. 4) generates a serial LVDS
pulse pattern, available as an ECL signal on the
front panel. The timing and number of pulses are
controlled via VME by writing the contents of the
Tx Pulse RAM (Fig. 4), which is implemented as a
10-bit wide, 512-word memory.
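To make the role of the Tx Pulse RAM concrete, the following sketch builds RAM contents for a list of calibration pulses. It is a hypothetical model only: it assumes that one 10-bit word is serialized per 12-ns clock at 1.2 ns per bit with the most-significant bit sent first, and the function name and pulse representation are invented for illustration.

```python
WORD_BITS = 10   # width of the Tx Pulse RAM
N_WORDS = 512    # depth of the Tx Pulse RAM


def make_tx_pulse_ram(pulses):
    """Return 512 ten-bit words encoding the requested pulses.

    pulses -- iterable of (start_bin, width_bins); one bin is assumed
              to be one serialized bit (1.2 ns), MSB transmitted first.
    """
    total_bins = WORD_BITS * N_WORDS
    bins = [0] * total_bins
    for start, width in pulses:
        if start < 0 or width <= 0 or start + width > total_bins:
            raise ValueError("pulse does not fit in the RAM")
        for i in range(start, start + width):
            bins[i] = 1
    ram = []
    for w in range(N_WORDS):
        word = 0
        for bit in bins[w * WORD_BITS:(w + 1) * WORD_BITS]:
            word = (word << 1) | bit   # earliest bin ends up in the MSB
        ram.append(word)
    return ram


# One pulse, four bins wide, straddling the boundary of words 0 and 1.
ram = make_tx_pulse_ram([(8, 4)])
print(format(ram[0], "010b"), format(ram[1], "010b"))
```

Under these assumptions the RAM spans 5120 bins, so both the number of pulses and their timing within that span are set purely by the written contents.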
5.7. The Clock Generator Block
The Clock Generator Block, implemented in
each TDC FPGA with PLLs, generates the 12-ns
and 22-ns clocks used inside the Chip. All the
clocks are synchronous with the delayed CDF
clock. The prompt CDF clock is also received and
is used to latch the CDF-specific back-plane
control signals (described in Section 4). Four 12-
ns clocks are generated onto output pins of the
FPGA as LVDS signals and routed back into the
Chip onto the FPGA's dedicated high-speed clock
input pins, one for each high-speed I/O bank.
6. The VME Interface Block
The VME Interface is implemented with a third,
smaller, Altera FPGA [15]. The design permits
Chained Block Transfer (CBLT) read commands
in both 32- and 64-bit modes for data transfer.
CBLT uses geographical addresses different from
the ones normally used in the crate, recognized
only by the participating modules. In CDF there
are typically 18 TDC cards per VME crate.
There are two possible CBLT read commands:
(1) Read Block Transfer from virtual slot 30: Hit
Count words are read from every TDC module
in the crate. Each TDC produces 14 words/
board in 32-bit mode and 8 words/board in 64-
bit mode.
(2) Read Block Transfer from virtual slot 31: Hit
Data words are read from every TDC module
in the crate. Each TDC produces up to 336
words/board in 32-bit mode and up to 168
words/board in 64-bit mode.
In this firmware implementation, the CBLT mode
is enabled by default. Up to 18 TDC cards sit in a
VME crate on the CDF detector; the TDC module
closest to the Crate CPU is automatically con-
sidered first in the chain. The setting of a module
as last in the chain, or the possible removal of a
module from the readout, is done by writing to a
register in the module's VME Chip to disable
CBLT mode [19].
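The word counts quoted for the two CBLT read commands translate directly into a maximum payload per crate. The sketch below does that bookkeeping; the table and function names are illustrative, and the figures cover only the words listed above, not any protocol overhead.

```python
# Maximum words per board, as quoted in the text: (read type, bus width).
MAX_WORDS_PER_BOARD = {
    ("hit_count", 32): 14,
    ("hit_count", 64): 8,
    ("hit_data", 32): 336,
    ("hit_data", 64): 168,
}


def max_cblt_bytes(kind, width_bits, n_boards=18):
    """Upper bound on the CBLT payload, in bytes, for a chain of boards."""
    words = MAX_WORDS_PER_BOARD[(kind, width_bits)]
    return n_boards * words * (width_bits // 8)


for key in MAX_WORDS_PER_BOARD:
    print(key, max_cblt_bytes(*key), "bytes for 18 boards")
```

The hit-data limit is the same number of bytes in both modes. It is also consistent with 96 channels of up to 7 hits each (672 hits) packed two per 32-bit word, although the word format itself is not given in this section.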
7. Module power requirements
The TDC board receives 5 V/15 A and
5 V/2 A on the P0, P1, and P2 backplane
connectors. The board generates 1.5 V/15 A
and 3.3 V/10 A with DC/DC converters, and
2.5 V/3 A using a linear regulator. A 3.3 V
voltage is also generated and passed through the
front panel connectors to the amplifier-discriminator-shaper
card (ASDQ) [1]. Spare locations on
the TDC board are provided for two additional
DC/DC converters.
8. Performance of the TDC and Readout
The implementation of both the TDC function-
ality and the processor with its extensive trigger
logic and self-testing capabilities in one chip with
digital inputs and outputs has an impact on
characterizing the properties of the TDC. We
discuss the linearity, cross-talk, and channel-to-
channel time skew in turn below [20]. We also
describe the bandwidth of the module for both 32-
and 64-bit VME block transfers.
8.1. Linearity
Unlike in more conventional analog TDCs, the
time-to-digital conversion is basically digital, with
the time binning determined by the sampling of the
input signal during de-serializing. This is con-
trolled by a phase-locked loop. The PLL locks a
local oscillator; an error bit monitors loss of lock.
The jitter in the clock driven by the PLL is
specified to be less than 200 ps; the drift in the PLL
is negligible [13]. The differential non-linearity
consequently is less than 200 ps. Channel-to-channel skew is specified to be less than 100 ps
[13]. Each of these effects is significantly smaller
than the least count of 1.2 ns; we (conservatively)
quote 1 count to be the inherent precision of the
device.
In the CDF application described here, the data
arrive at the front face of the TDC asynchronously
with the CDF clock. There is consequently an
undetermined phase between the clock and the
data, resulting in a possible additional 1 count
jitter (see footnote 1). Fig. 6 shows measurements of the linearity,
with no deviation from expectations.
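The linearity check can be illustrated with a straight-line fit to the average counts shown in Fig. 6. The sketch below uses the six values as they appear in the figure and assumes that they correspond, in order, to input delays of 0 to 100 ns in 20-ns steps; that pairing is an assumption, and the fit is an ordinary least-squares line rather than the authors' analysis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Values read from Fig. 6; the delay assigned to each count is assumed.
delays_ns = [0, 20, 40, 60, 80, 100]
avg_counts = [91.5, 108.1, 125.0, 142.0, 158.4, 175.0]

slope, intercept = fit_line(delays_ns, avg_counts)
ns_per_count = 1.0 / slope
residuals = [y - (intercept + slope * x)
             for x, y in zip(delays_ns, avg_counts)]

print(f"bin width from fit: {ns_per_count:.3f} ns per count")
print(f"largest residual:   {max(abs(r) for r in residuals):.2f} counts")
```

With these inputs the fitted bin width comes out close to the nominal 1.2 ns, and every residual is well below one count.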
8.2. Channel-to-channel skew
Skew measurements were made from the LVDS
receivers' output pins to the FPGA input pins. The
measuring point on the FPGA was accessed by
soldering test pick-off wires to the FPGA BGA
hole pattern on the solder-side of the PC board.
The channel-to-channel skew from contributions
other than the FPGA was measured to be less than
100 ps. Channel-to-channel variations inside the
FPGA are negligible.
8.3. Crosstalk
To test for crosstalk we compare the output of
the TDC boards with a reference set, computed
automatically by the test routines from the input
data stored in the RAM in the TDC Chips.
All test routines are implemented as C code,
which is compiled and executed on the PowerPC
(MVME2301 or MVME5500) crate controller.
Other crate controllers can also be used without
any significant differences. Walking 1s, walking
0s, and random patterns were used. The test
routines exercise the Edge Detector (ED48 test),
the TDC XFT block (XFT test) and the Chain
Block Transfer in both 32- and 64-bit modes
(CBLT32/64 tests). A total of 5 × 10^9 cycles were
performed; zero bit-errors were observed. This sets
a 90% C.L. upper limit of 5 × 10^-10 on the bit-error
rate due to cross-talk inside the FPGA.

[Fig. 6. Measurement of linearity of one channel: hit-data count (time)
versus the time change of the ASDQ input pulse, from +0 ns to +100 ns
in 20-ns steps relative to T0. The average counts shown are 91.5, 108.1,
125.0, 142.0, 158.4, and 175.0; the slope is 1 count per 1.202 ns.
Because the sampling is controlled by a phase-locked loop, the
differential non-linearity is expected to be less than the least count,
consistent with the measurement.]

Footnote 1: This is not inherent in the TDC electronics, but arises from
the application. For a synchronous application this does not apply.
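The 90% C.L. limit on the cross-talk bit-error rate quoted in this section follows from observing zero errors in N trials. A minimal sketch, assuming Poisson statistics and treating each test cycle as one independent trial (the function name is illustrative):

```python
import math


def zero_event_upper_limit(n_trials, confidence=0.90):
    """Upper limit on a per-trial rate when zero events are observed.

    For a Poisson process, P(0 events) = exp(-rate * n_trials); the limit
    is the rate at which that probability falls to 1 - confidence.
    """
    return -math.log(1.0 - confidence) / n_trials


limit = zero_event_upper_limit(5e9)
print(f"90% C.L. upper limit: {limit:.1e} errors per cycle")
```

This gives roughly 4.6 × 10^-10, which rounds to the quoted 5 × 10^-10.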
8.4. Readout bandwidth
The results of tests of the readout speed are
described below in more detail. The burst-mode
readout speed achieved in CBLT64 mode is
47 MB/s. The Edge Detector (ED48) processing
time is always less than 12 μs (for 7 hits/wire).
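For scale, the burst rate can be turned into an approximate transfer time for a full crate. The sketch below is only an estimate under stated assumptions: 1 MB is taken as 10^6 bytes, every board is assumed to send its maximum of 168 64-bit hit-data words, and any per-transfer overhead is ignored.

```python
def burst_transfer_time_us(n_bytes, rate_mb_per_s=47.0):
    """Time in microseconds to move n_bytes at a sustained burst rate.

    Assumes 1 MB = 1e6 bytes, so the rate in MB/s equals bytes per
    microsecond, and ignores any per-transfer overhead.
    """
    return n_bytes / rate_mb_per_s


# 18 boards x 168 words x 8 bytes: the maximum hit-data payload in CBLT64.
full_crate_bytes = 18 * 168 * 8
t_us = burst_transfer_time_us(full_crate_bytes)
print(f"{full_crate_bytes} bytes in about {t_us:.0f} microseconds")
```

Typical events carry far fewer than 7 hits per wire, so this figure is an upper bound on the payload rather than an expected readout time.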
8.5. Testing chain block transfer: the CBLT32/64
Test
The CBLT 32/64 tests read out multiple boards
in a crate sequentially. Tests were performed with
up to 18 TDC boards in a VME crate. The tests
are designed to check the VME chain-block
transfer capabilities of the board, and also crate-
wide characteristics such as the crate backplane
capability, stability, and multi-board performance.

In the CBLT tests the initial contents of the
Level 2 buffer RAMs are used to predict the
output of the ED48 modules as read from the Hit
Count and the Hit Data RAMs. In the full-crate
test 18 TDC boards are used. For each of the 36
TDC Chips in the test (2/board), the L2 buffer
lengths are set so that all possible patterns are
sampled. The order of accessing the four L2
buffers is also selected differently for each TDC
board, so that all combinations of L2 buffer and
buffer length are sampled. The CBLT32 and
CBLT64 tests were repeated 5 × 10^9 times each
without a single failure, such as would be due to
cross-talk or timing jitter in the logic.
9. Conclusions and summary
A new 96-channel TDC and trigger processor
module has been designed for the CDF Experi-
ment at Fermilab using the multichannel
bit-sampling capabilities of the Altera Stratix FPGA
family. The board, built in a 9U VME format,
contains few components other than the two
TDC FPGAs, a VME controller implemented in a
third FPGA, DC-to-DC converters, and input/output
buffers. The functionality is exceptionally
flexible, being controlled by firmware, so that it
can be reprogrammed for different applications.
The TDC has extensive test capabilities, imple-
mented directly in the FPGAs. Thirty boards have
been built and tested. The reliability of the board is
high as the chip count is very low. The TDC
differential non-linearity is less than 200 ps, i.e. within 1
count at any point in the range, and no cross-talk
was detected in 5 × 10^9 96-channel tests. The
channel-to-channel skew is less than 100 ps. A full
crate of the CDF-II TDC has been operated and
read out in 64-bit block-transfer mode at a speed
of 47 Mbytes/s.
Acknowledgements
We thank Bill Badgett, Frank Chlebana, Pat
Lukens, Aseet Mukherjee, and Kevin Pitts for
help, support, and advice. Nils Krumnack and Ed
Rogers deserve special thanks for providing
critical input on the XFT specifications and the
design of the XFT sections of the TDC. We thank
Rich Northrop for the picture of the board.
This work was supported in part by the
National Science Foundation under Grant No.
5-43270, and the US Department of Energy.
References
[1] T. Affolder, et al., Nucl. Instr. and Meth. A 526 (2004) 249.
[2] The CDF-II detector is described in the CDF Technical
Design Report (TDR), FERMILAB-Pub-96/390-E. The
TDC described here is intended as a further upgrade
beyond that described in the TDR.
[3] R.S. Moore [the CDF Run II collaboration], A custom 96-
channel VME TDC for the CDF detector for Tevatron
collider Run II; FERMILAB-CONF-04-262 Prepared for
2004 IEEE Nuclear Science Symposium and Medical
Imaging Conference (NSS/MIC), Rome, Italy, 16-22
October 2004.
[4] D.I. Porat, IEEE Trans. Nucl. Sci. NS-20 (1973) 36.
[5] J. Kalisz, Metrologia 41 (2004) 17
http://www.iop.org/EJ/abstract/0026-1394/41/1/004.
[6] J.F. Genat, Nucl. Instr. and Meth. A 315 (1-3) (1992) 411.
[7] An extensive list of references can be found in:
A. Mäntyniemi, A High Resolution Time-to-Digital
Converter Based on Stabilised Three-stage Delay Line
Interpolation, Department of Electrical and Information
Engineering, University of Oulu; OULU 2004 Thesis;
ISBN 951-42-7460-I; ISBN 951-42-7460-X;
http://herkules.oulu.fi/isbn951427461X/isbn951427461X.pdf
[8] J. Kalisz, R. Szplet, J. Pasierbinski, A. Poniecki, IEEE
Trans. Instrum. Meas. 46 (1997) 51.
[9] ANSI/VIPA 23-1998, March 22, 1998. These crates
support geographical addressing. Chain Block Transfer is
described in Appendix E.
[10] T. Shaw, G. Sullivan, A Standard Front-End Trigger
VME Bus-Based Readout Crate for the CDF Upgrade,
CDF/DOC/TRIGGER/CDFR/2388, May 12, 1998. The
CDF readout crate is a 21-slot 9U VME crate based on the
VIPA standard, with added bussed signals on rows A and
C of the P2 connector. All 64 of the user-defined pins on
these rows are bussed between slots 2 and 21 using the
same termination scheme as the standard VME bussed
lines. Additional power pins are provided on the P0
connector. The J3 backplane, which is physically separate
from J1/J2, is customized for I/O specific to each CDF
subsystem.
[11] The schematics and test results are available online
at: http://edg.uchicago.edu/~bogdan/tdc/index.html. The
firmware is in CVS at
http://www-cdfonline.fnal.gov/cgi-bin/cvsweb.cgi/NuTDC. The schematics and code are
available on request.
[12] B. Bevensee, et al., IEEE Trans. Nucl. Sci. NS-43 (1996)
1725;
K. Pitts, private communication.
[13] Altera Corporation, Stratix Device Handbook, v3.0, Apr.
2004. The serializer/deserializer block is labeled SERDES
in this Handbook, and in the figures and text here. The
specifications on differential non-linearity and skew in the
chip are from communications from Altera.
[14] Altera Corporation, Quartus II Handbook, v2.1, August
2004.
[15] Altera Apex AP20K100.
[16] T. Shaw, Specification for Trigger And Clock + Event
Readout Module (TRACER), CDF Internal
Note 4686, 8/1/98.
[17] M. Campbell, H. Frisch, M. Shochet, G. Sullivan,
D. Toback, J. Wahl, P. Wilson, CDF Internal Note
2038, April 1993.
[18] E. Rogers, N. Krumnack, CDF Internal Note 7193,
August 2004. This note is the specification for the XFT
implementation.
[19] M. Bogdan, H. Sanders, CDF Internal Note 6998, June
2004.
[20] Tests were made with VxWorks V5.3c [21] and FISION
V2.12 [22] running on a RedHat V6.2 Linux operating
system. CDF control pulses were provided by a CDF
TESTCLK card [23].
[21] © 2003 WindRiver Systems, Inc. MCL-DS-VXW-0309.
[22] J. Pangburn, FISION V2.12 User's Guide,
http://www-cdfonline.fnal.gov/vme/FISION.html.
[23] T. Shaw, W. Stuermer, Internal note TESTCLK V7, CDF
Crate Clock and Trigger Driver, Fermilab, ETT/CDF
Upgrade Group, July, 1998.