d19c user’s and developer’s manual
Laurent Charles, Jarne De Clercq, Mykyta Haranko, Jelena Luetic, Stefano Mersi
March 4, 2018
Abstract
This manual describes the basic usage and the structure of the d19c firmware. Vivado project creation, code synthesis and firmware image loading are described, and are supplemented by a complete description of the internal firmware structure.
Contents
1. Introduction
  1.1. Introduction
  1.2. Introduction to the DAQ chain
  1.3. DUTs
    1.3.1. Outer tracker
      1.3.1.1. 2S modules
      1.3.1.2. PS modules
    1.3.2. Future DUTs
2. Usage
  2.1. Git Repository
  2.2. Vivado Project
  2.3. Front-End Selection
  2.4. Synthesis
  2.5. IP address assignment
  2.6. Firmware Uploading
  2.7. Middleware
3. Firmware Description
  3.1. Top level view
  3.2. IPbus Control
  3.3. Clock Generator Block
  3.4. TTC Decoder Block
  3.5. AMC13 Block
  3.6. Fast Command Block
  3.7. Command Processor Block
  3.8. Physical Interface Abstraction Layer
    3.8.1. IO to the Fast Command Block
    3.8.2. IO to the Command Processor Block
    3.8.3. IO to the Data Readout Block
      3.8.3.1. l1 data
      3.8.3.2. stub data
    3.8.4. FMCIO mappings
    3.8.5. Phase sampling of incoming data
    3.8.6. Extras for the MPA/SSA testing
      3.8.6.1. SLVS Debug Block
      3.8.6.2. SSA like data sending
      3.8.6.3. Lateral data sending
  3.9. Readout Block
    3.9.1. Data Buffering
    3.9.2. Counters
    3.9.3. Data Packing
    3.9.4. Data Readout Buffer
  3.10. External Triggering (DIO5, TLU, NIM+, FMC L12)
4. Firmware development and verification
  4.1. Testing
    4.1.1. Simulation
    4.1.2. Testing with emulators
    4.1.3. Testing with real chips
  4.2. Performance
    4.2.1. Resource utilization
A. IPBus register table
B. I2C Transaction Format
C. Practicalities about the VHDL code
D. FW compilation settings using the emulator
E. Practicalities about the HW
1. Introduction
1.1. Introduction
The d19c project is a firmware (FW) project based on the FC7 [1] general DAQ card. The FW is intended to form the base FW for test set-ups for all CMS tracker phase II chip, hybrid, module and multi-module testing. It is important to stress that this is a firmware dedicated to testing and not the firmware for the final DTC for use in the CMS DAQ. At the time of writing (2nd February 2017) the FW aims to incorporate the following DUTs:
• RD53A read-out
• CBC based single chip test stands, hybrid assemblies and 2S module assemblies with electrical read-out
• MPA based single chip test stands, MAPSA assemblies and PS module assemblies with electrical read-out
• SSA based single chip test stands, MAPSA assemblies and module assemblies with electrical read-out
• CIC based systems with both electrical read-out and optical read-out over GBT chip
At the time of writing (2nd February 2017) the CBC2 and CBC3 DAQ firmware is in place and a prototype version of the firmware is available to facilitate the testing of both MPA and SSA single chips.
1.2. Introduction to the DAQ chain
As previously mentioned, the backbone of the d19c is the FC7 DAQ card hosting a Kintex 7 FPGA. The FC7 FPGA comes with a system FW which allows communication with peripheral devices on the FC7 board and which also allows IPBus communication with the firmware. This IPBus communication is the main means of communication from the user to the board. The d19c FW is developed in the user area of the code and interfaces with the system code where necessary. Communication with the FE (chips, hybrids, modules) is made possible by the 2 FMC connectors present on the board. These multi-pin connectors host custom-made or off-the-shelf FMC cards, which can then connect further to the FE systems. Many FMC cards have been developed, e.g. for electrical read-out of 2CBC2, 8CBC2, CBC3, MPA and SSA systems. FMC cards with optical connectors are also available and will be used later.
Figure 1: FC7 top view.
1.3. DUTs
The phase II CMS tracker can be subdivided into an inner and an outer tracker. From the electrical point of view, the outer tracker will be built from two kinds of modules: 2S (strip-strip) and PS (pixel-strip). An important feature of these outer tracker modules is that they provide so-called stub data as input to the level 1 track finding. The next sections briefly describe the phase II outer tracker; the inner tracker is not yet described, as the implementation of its read-out firmware in the d19c has not started yet.
1.3.1. Outer tracker
The strip sensors of the 2S modules are read out by the CBC FE chip. The pixel sensors on the PS modules are read out by the MPA chip and the strip sensors of the PS module are read out by the SSA chip. From the FW point of view all these chips are fairly similar. They provide 1 line of l1 data at 320MHz, and 5 lines of stub data for CBC3/MPA or 8 lines of stub data for SSA, also at 320MHz. The configuration of the chips happens over i2c [2], and fast commands are sent using 1 line at 320MHz with the same encoding for all three chips [3].
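The line counts above translate into a fixed number of bits per 40MHz bunch crossing. The short Python sketch below works this out, assuming that one bit is transferred per 320MHz clock cycle on each line (an assumption made for this illustration; the names used are not taken from the firmware).

```python
# Assumption for this sketch: each line carries one bit per 320 MHz clock cycle.
LINE_RATE_MHZ = 320
BX_CLOCK_MHZ = 40

# Number of stub data lines per chip type, as quoted in the text.
STUB_LINES = {"CBC3": 5, "MPA": 5, "SSA": 8}


def bits_per_bx(n_lines, line_rate_mhz=LINE_RATE_MHZ, bx_clock_mhz=BX_CLOCK_MHZ):
    """Return the number of bits delivered by n_lines during one bunch crossing."""
    bits_per_line = line_rate_mhz // bx_clock_mhz
    return n_lines * bits_per_line


# Stub payload per bunch crossing for each chip type.
stub_bits = {chip: bits_per_bx(n) for chip, n in STUB_LINES.items()}
l1_bits = bits_per_bx(1)
```

Under this assumption each line delivers 8 bits per bunch crossing, so the five CBC3/MPA stub lines carry 40 bits and the eight SSA stub lines carry 64 bits per bunch crossing.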
1.3.1.1. 2S modules The CBC chips are the FE read-out chips for the strip sensors of the 2S modules. The d19c firmware can read out both the CBC2 and the CBC3 chip. More information on the CBC2 and CBC3 chips can be found in the CBC2 [4] and CBC3 [5] user manuals. The formats of the l1 data for CBC2 and CBC3 are illustrated in figures 2 and 3. The stub data format for CBC3 is illustrated in figure 4. Stub data for CBC2 chips consists of just 1 line at 40MHz, which is active high if a stub was found.
Figure 2: CBC2 l1 data format.
Figure 3: CBC3 l1 data format.
Figure 4: CBC3 stub data format.
Figure 5: SSA l1 data format.
Figure 6: SSA stub data format.
1.3.1.2. PS modules A natural result of the hybrid sensor design of the PS modules is that the read-out of this module is more complicated than in the 2S case. The pixel sensors are read out by MPA chips and the strip sensors are read out by SSA chips. Both chips need to communicate with each other. The MPA chip, for example, receives the l1 and stub data from the SSA chip, processes it and passes it to the CIC. In an assembled module the data to the CIC will come solely from the MPA. However, for chip testing the FW must be able to read out both chips. More information on the MPA and SSA chips can be found in their manuals [6]. The formats of the SSA and MPA l1 and stub data can be found in figures 5, 6, 7 and 8.
1.3.2. Future DUTs
In the future the d19c FW should also implement the read-out of the RD53A chip, it should enable the read-out of outer tracker front-end hybrids via the CIC chip, and the testing FW for MPA and SSA chips should be extended to a DAQ FW. These are the most obvious DUTs that will need read-out, but different hardware configurations might of course require dedicated FW. The scaling of the system to multi-module systems will also need to be developed.
Figure 7: MPA l1 data format.
Figure 8: MPA stub data format.
2. Usage
2.1. Git Repository
The d19c firmware project is hosted on a CERN GitLab repository. The repository can be accessed using this [7] link; login with a CERN account is required. The master branch normally contains the most stable version of the source code, while the other branches are used for integration purposes.
All development has to be carried out in private forks; direct pushes to the upstream repository will not be accepted. Once a stable version of the code is ready, the developer should submit a merge request to the current integration branch. The repository is managed by M. Haranko (as of 10th February 2018).
2.2. Vivado Project
Vivado 2016.4 is currently used as the IDE for firmware development and synthesis. Other versions of Vivado should be avoided, as their behaviour may be unpredictable. To create the Vivado project after cloning the remote repository, one needs to switch to the project scripts directory. In the Vivado tcl command line: $ cd ./d19c-firmware/fw/prj/fpga_fc7_daq_firmware and create the project using the bash script: $ source ./create_vivado_project.sh. Please make sure that the Vivado executable folder is in the PATH variable of your system. Normally, one needs to execute the setup.sh script located in the Vivado IDE installation directory.
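NUM_HYBRIDS  NUM_CHIPS  CHIP  FMC1 or FMC2       IMPLEMENTATION
1            2          CBC2  FMC_2CBC2          ELECTRICAL
1            8          CBC2  FMC_8CBC2          ELECTRICAL
1            1          CBC3  FMC_1CBC3          ELECTRICAL
1            2          CBC3  FMC_2CBC3          ELECTRICAL
1            1          MPA   FMC_MPA_SSA_BOARD  ELECTRICAL
1            1          SSA   FMC_MPA_SSA_BOARD  ELECTRICAL
Table 1: Firmware compilation settings.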
2.3. Front-End Selection
The FMC connectors are used to connect the front-end devices to the FC7. The connectors on the FMC card are labelled FMC l12 (first, top connector) and FMC l8 (second, bottom connector). To configure the synthesis, one needs to specify which front-end is going to be used. There are two possible ways to do so:
1. One can use the Python script, which quickly configures hardware settings such as: FMC l12, FMC l8, readout buffer type (FIFO, DDR3), number of hybrids, number of chips and chip type. The script is located in the ./d19c-firmware/fw/prj/build_configure folder and is called build_configure.py. Please execute it BEFORE launching Vivado.
2. More advanced configuration. One can set all the necessary settings in the user_package_basic.vhd file. Internally in the firmware these settings operate as ON/OFF switches that decide whether certain blocks will be compiled for a given firmware.
Table 1 lists the valid options in user_package_basic.vhd which need to be set in order to compile for a certain hardware configuration. Two hybrids of the same type may also be used; this is mostly a test feature for the moment. In this case, the number of chips is specified per hybrid, not as the total number.
Using the READOUT_BUFFER_TYPE setting, one can choose the type of the data readout buffer (FIFO or DDR3). The IMPLEMENTATION parameter defines the implementation type: electrical implementation or hybrid emulation. An optical implementation is also foreseen, but not yet implemented (for the EMULATION implementation see 4.1.2).
After setting the hardware options, one needs to enable the appropriate constraint files. To do so, in the Vivado Project interface, browse the constraints section, select the needed file according to Table 2, and click "Enable" or "Disable". Please make sure that all the constraint files are set properly; otherwise the Write Bitstream operation will fail.
2.4. Synthesis
Synthesis may also be done in two ways: using the Python script mentioned in Section 2.3, or by clicking Generate Bitstream in the Vivado Project window. Both options lead to the same result: *.bin and *.bit files are generated. The first image (*.bin) is used to load the firmware using Ph2_ACF or by writing the bin image directly to the SD card. The *.bit file may be used to load the firmware using the fc7-d19c.exe script; please see Section 2.6 for details. The *.bit file can also be used when one prefers to load the firmware directly to the FPGA using a JTAG connector.
Constraint file              Suitable FMC type
usr_io_fmc_lXX_lvds.xdc      FMC_2CBC2, FMC_8CBC2, FMC_1CBC3, FMC_FERMI_TRIGGER_BOARD
usr_io_fmc_lXX_uib_main.xdc  FMC_2CBC3, FMC_MPA_SSA_BOARD
usr_io_fmc_lXX_dio5.xdc      FMC_DIO5
Note: lXX is to be replaced with the FMC id: l12 (FMC1, top), l8 (FMC2, bottom).
Note: the no_ddr3_controller.xdc and ddr3_controller.xdc constraints also have to be enabled/disabled depending on the readout type.
Table 2: Constraint selection table.
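The constraint selection of Table 2 can be written down as a small lookup. The following Python sketch is purely illustrative and is not part of the repository: the helper name constraints_to_enable is hypothetical, file and FMC names are written with underscores, and it is assumed that ddr3_controller.xdc goes with the DDR3 readout and no_ddr3_controller.xdc with the FIFO readout.

```python
# Constraint file templates transcribed from Table 2; "{fmc}" stands for lXX.
CONSTRAINT_TEMPLATES = {
    "FMC_2CBC2": "usr_io_fmc_{fmc}_lvds.xdc",
    "FMC_8CBC2": "usr_io_fmc_{fmc}_lvds.xdc",
    "FMC_1CBC3": "usr_io_fmc_{fmc}_lvds.xdc",
    "FMC_FERMI_TRIGGER_BOARD": "usr_io_fmc_{fmc}_lvds.xdc",
    "FMC_2CBC3": "usr_io_fmc_{fmc}_uib_main.xdc",
    "FMC_MPA_SSA_BOARD": "usr_io_fmc_{fmc}_uib_main.xdc",
    "FMC_DIO5": "usr_io_fmc_{fmc}_dio5.xdc",
}

# Assumed pairing between the readout buffer type and the DDR3 constraint file.
READOUT_CONSTRAINTS = {
    "DDR3": "ddr3_controller.xdc",
    "FIFO": "no_ddr3_controller.xdc",
}


def constraints_to_enable(fmc_l12, fmc_l8, readout_buffer):
    """Return the constraint files to enable; pass None for an unused FMC slot."""
    enabled = []
    for fmc_id, fmc_type in (("l12", fmc_l12), ("l8", fmc_l8)):
        if fmc_type is None:
            continue
        enabled.append(CONSTRAINT_TEMPLATES[fmc_type].format(fmc=fmc_id))
    enabled.append(READOUT_CONSTRAINTS[readout_buffer])
    return enabled
```

For example, constraints_to_enable("FMC_2CBC3", "FMC_DIO5", "FIFO") lists the uib_main file for l12, the dio5 file for l8 and the constraint file assumed for the FIFO readout.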
2.5. IP address assignment
For the moment, one is encouraged to use RARP to configure the IP address of the board. A so-called rarpd daemon, installed on one of the computers in the internal network, is used. Roughly speaking, RARP is a lightweight version of DHCP. By default, RARP is enabled on the firmware side, so to set the desired IP one needs to:
1. Download the RARP package [8].
2. Switch to the download location and install the package (the command depends on the OS): $ rpm -Uvh package_full_name.rpm
3. Find out the pre-set MAC address of the board. It is programmed through I2C during commissioning, and there are a couple of ways to get it:
• If one has access to the board through the network, one can simply switch to the ./d19c-firmware/sw/d19cScripts folder and run the $ python fc7_get_mac_address.py command, which will print out the programmed MAC address.
• If there is no access to the board, one can use Wireshark to find out the MAC address: when the FC7 turns on, it starts to broadcast its MAC address. What one needs to do is:
– Power off the FC7.
– Run Wireshark with root privileges and start listening on the Ethernet interface.
– Power on the FC7.
– The broadcast packet with the MAC address of the board will be caught, see Figure 9. In this case, the MAC address is 08:00:30:00:28:27.
Figure 9: Wireshark Screenshot (MAC address broadcasting).
4. Configure the RARP server to set the desired IP of the board: run $ vim /etc/ethers and add a line such as 08:00:30:00:28:27 192.168.1.80.
5. Reload the RARP daemon by running the $ /usr/sbin/rarpd -a command.
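The /etc/ethers line from step 4 is simply a MAC address followed by an IP address. As a convenience, a line of this form could be generated and checked with a few lines of Python; the helper below is a hypothetical sketch, not a script shipped with the d19c software.

```python
import ipaddress
import re

# Six colon-separated pairs of hexadecimal digits, e.g. 08:00:30:00:28:27.
MAC_PATTERN = re.compile(r"^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def ethers_entry(mac, ip):
    """Return one '<MAC> <IP>' line for /etc/ethers after basic validation."""
    if not MAC_PATTERN.match(mac):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    # IPv4Address raises a ValueError subclass if the address is malformed.
    ipaddress.IPv4Address(ip)
    return f"{mac.lower()} {ip}"


line = ethers_entry("08:00:30:00:28:27", "192.168.1.80")
```

The resulting line can then be appended to /etc/ethers by hand before the daemon is reloaded.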
2.6. Firmware Uploading
There are several ways to load the firmware to the FC7:
• Directly writing the firmware to the uSD card; please see the FC7 user manual [1].
• Using the scripts provided by the FC7 developers; please see the FC7 user manual [1].
• Using the fc7-d19c.exe script, located in the ./d19c-firmware/sw/fc7/tests folder. It is simply an extended version of the scripts provided by the FC7 team. Typical invocations are listed below:
– $ ./bin/fc7-d19c.exe -i 192.168.1.80 -l - to list the files on the uSD card.
– $ ./bin/fc7-d19c.exe -i 192.168.1.80 -n name.bin - to load the existing file name.bin from the uSD card to the FPGA.
– $ ./bin/fc7-d19c.exe -i 192.168.1.80 -f name.bit -n name.bin - to load the file name.bit from the computer to the uSD card and afterwards to the FPGA. Please note that the *.bit image has to be used to load the firmware using this method.
– $ ./bin/fc7-d19c.exe -i 192.168.1.80 -d name.bin - to delete the file name.bin from the uSD card.
* where 192.168.1.80 has to be replaced with the board IP.
• Using Ph2_ACF to load the *.bin firmware image. Please consult the help message: $ fpgaconfig -h.
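When the fc7-d19c.exe invocations listed above are scripted, it can help to build the argument list in one place. The Python helper below is a hypothetical sketch that uses only the options shown in this section (-i, -l, -n, -f, -d); it does not run anything by itself and makes no claim about other options of the tool.

```python
def fc7_command(ip, list_files=False, upload=None, load=None, delete=None,
                exe="./bin/fc7-d19c.exe"):
    """Build the argument list for one fc7-d19c.exe call.

    ip         -- board IP address (-i)
    list_files -- list the files on the uSD card (-l)
    upload     -- local *.bit file to copy to the uSD card (-f)
    load       -- name of the image on the uSD card to load to the FPGA (-n)
    delete     -- name of the image to delete from the uSD card (-d)
    """
    args = [exe, "-i", ip]
    if list_files:
        args.append("-l")
    if upload is not None:
        args += ["-f", upload]
    if load is not None:
        args += ["-n", load]
    if delete is not None:
        args += ["-d", delete]
    return args


upload_and_load = fc7_command("192.168.1.80", upload="name.bit", load="name.bin")
```

The returned list can be passed to subprocess.run from the ./d19c-firmware/sw/fc7/tests folder.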
2.7. Middleware
Please see the Middleware Short Guide for guidelines on Ph2_ACF operation.
3. Firmware Description
3.1. Top level view
The top level structure of the user firmware of the d19c is pictorially represented in figure 10. More information on these top level blocks can be found in the subsequent sections. In each of the sections the corresponding top level .vhd file is mentioned at the start of the section.
Figure 10: d19c top level diagram. This diagram approximates the firmware code: it gives a basic idea of the blocks and their interconnects. It also sketches the clocking regions: the communication to the back-end happens over IPBus, most of the logic in the user code is clocked at 40MHz, and the Physical Interface Layer forms the border between the 320MHz region and the 40MHz region. Again, this is an approximation. For instance, parts of the logic in the fast command block run at 320MHz; also, in this sketch the i2c to the front end lives in the 320MHz region, which is of course not the case in the real implementation.
The LHC clock of roughly 40MHz dictates the operation speed of the front-end chips. As a result, most of the logic in the d19c firmware is clocked at 40MHz.
The d19c is designed in such a way that, when a new DUT has to be implemented, minimal changes are needed in the FW code. In essence, all changes should be absorbed by what is called the Physical Interface Abstraction Layer. This layer forms the interface between the FPGA and the front end connected to the FMC connectors. In this layer the incoming data is parsed into a pre-defined bus, which is then sent to the Data Readout Block. When implementing a new chip the parsing has to be integrated, but beyond this no major changes should be needed in the Data Readout Block. The Physical Interface Abstraction Layer also receives the slow control commands for the front-end from the Command Processor Block. The Command Processor Block sends a signal containing the hybrid id, chip address, R/W bit, register address and data to be sent, which the Physical Interface Abstraction Layer then translates into a proper data sequence according to a specific slow control protocol (e.g. i2c), behaving as the master for the communication. Here too, different chips can use different types of protocols (e.g. small differences in the i2c protocol), and different masters can be implemented in the Physical Interface Abstraction Layer. The Physical Interface Abstraction Layer also translates the fast commands from the 40MHz regime to the 320MHz regime.
3.2. IPbus Control
Top files: ipbus_decoder_cnfg.vhd, ipbus_decoder_ctrl.vhd, ipbus_decoder_stat.vhd, ipbus_decoder_ddr3.vhd
IPbus control in the d19c almost completely follows the IPbus slave structure proposed by the FC7 team: the system core contains the IPbus master, which transfers the commands to the slaves, each of which has a specific address space. In the d19c there are four such slaves:
• ipbus_decoder_cnfg.vhd - contains all the configuration registers of the d19c firmware. Each firmware block has its own address space allocated. This block also contains the default values of the configuration registers, which are applied when the reset command is sent to the firmware. Some of the registers (such as the DIO5 configuration and the clock source configuration) are not reset when the reset command is issued, because these registers set the clocking parameters.
• ipbus_decoder_ctrl.vhd - contains all the control registers of the d19c. Most of the bits within these registers are self-clearing on the next clock cycle.
• ipbus_decoder_stat.vhd - status registers of the d19c firmware, used to monitor the status of the different blocks.
• ipbus_decoder_ddr3.vhd - the slave which controls the on-board DDR3 memory and allows the event data to be stored there. This interface is used only if the firmware is compiled in the DDR3 readout mode.
All the interfaces to/from the IPbus slaves are packed into VHDL record structures. The records are defined in the user_package_basic.vhd file, in the section corresponding to the controlled block.
The key distinguishing feature of the d19c IPbus structure is the memory (register) allocation. A developer sets the number of registers allocated to each block in the register_map_package.vhd file, and during synthesis the total number of registers to be allocated is calculated. A short user guide on the register allocation is available here [9]. There are several reasons for this choice, such as:
• To force developers to follow the same tested IPbus communication structure.
• To make the memory allocation smarter, which leads to much faster code synthesis and better resource utilization.
When implementing a new block, one should allocate dedicated space for the block in register_map_package.vhd and then define the records in the user_package_basic.vhd file. Once this is done, one has to route the buses from the block to the corresponding control interface (configuration, control or status).
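The per-block allocation idea can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, not the VHDL used in the register map package: the block names and register counts are invented for the example, and it assumes that the blocks are placed back to back in the address space.

```python
def allocate_registers(block_sizes):
    """Assign each block a base offset and return (offsets, total).

    block_sizes maps a block name to the number of registers it requests.
    Blocks are laid out contiguously in the order given (an assumption of
    this sketch, not a statement about the real register map).
    """
    offsets = {}
    next_free = 0
    for name, size in block_sizes.items():
        if size < 0:
            raise ValueError(f"negative register count for {name}")
        offsets[name] = next_free
        next_free += size
    return offsets, next_free


# Hypothetical request: three blocks with made-up register counts.
requested = {"fast_command_block": 8, "command_processor_block": 2, "readout_block": 4}
offsets, total = allocate_registers(requested)
```

Adding a new block then amounts to adding one entry to the request; the offsets of the following blocks and the total are recomputed automatically.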
3.3. Clock Generator Block
Top file: clock_generator_core.vhd
In normal mode (commissioning or production mode), all the FEDs constituting the back-end DAQ should be synchronized by the common LHC clock (BX clock), generated and propagated by a central point (TCDS system). In the CMS Pixel phase 1 upgrade, which uses the microTCA technology, this clock is recovered by the AMC13 board (connected to the MCH-2 slot of the crate) and forwarded in parallel to each AMC board (FC7) through the backplane. In the current state of the prototyping, the AMC13 is not needed since:
• An external clock coming from the FMC DIO5 can be used in test beams
• Only one FC7 is used in the crate
In the future, however, when the number of hybrids to be read out increases and several FC7s are required for the acquisition, the AMC13 board (or an equivalent board) will be needed. The Clock Generator Block is composed of two sub-blocks (see figure 11).
The first block is responsible for the reception, the re-generation and the propagation of the 40MHz clock input and all its derivatives (160MHz clock, 160MHz clock shifted by 90 degrees and 320MHz clock) to the rest of the firmware. The 40MHz clock input can be sourced from:
• Either the fabric clock (by default the on-board 40MHz oscillator)
• Or the external clock from the FMC DIO5
The second block generates a 200MHz reference clock from the 125MHz Ethernet clock, which is completely independent of the 40MHz clock. The reference clock is notably useful for the tap delay lines inside the iserdes blocks. The locked signals are readable over IPbus for monitoring.
3.4. TTC Decoder Block
Top file: ttc_decoder_block.vhd
The TTC Decoder Block (see figure 12) is useful only if the AMC13 board is used. In this case, the AMC13 should be plugged into the MCH-2 slot of the microTCA crate. The role of the AMC13 is to transmit the L1A triggers and the TTC commands (TTC B-Go commands) in parallel to all the FEDs. The TTC Decoder Block is linked to the AMC13 board via one differential pair routed through the backplane of the crate and carrying the encoded TTC signal clocked at 160MHz (4x the BX clock frequency).
Figure 11: Clock generator.
The TTC Decoder Block handles the decoding of the TTC commands and their propagation to the whole firmware (notably to the Fast Command Block). This is done from the encoded TTC input signal, serialized at 160Mbps, initially coming from the TCDS system and passing through the AMC13 board. Before being sent from the TCDS system, this signal is encoded, which ensures an efficient transmission towards the back-end boards. The encoding scheme uses Time Division Multiplexing (TDM) and Bi-Phase Mark (BPM) encoding to encode and interleave two channels (A and B) onto a single channel. Channel A is dedicated to transmitting the level 1 accept decisions (L1A triggers), a one-bit decision being sent on every BX (bunch crossing). Channel B is used to transmit general or synchronization commands to properly control the acquisition. The TTC decoder implemented within the firmware handles the de-interleaving of the two channels and their decoding before delivering the L1A triggers and the TTC commands to the rest of the firmware. The TTC commands are:
• BC0 : Bunch Counter Reset
• EC0 : Event Counter Reset
• OC0 : Orbit Counter Reset
• RESYNC
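The combination of TDM and BPM described above can be illustrated with a toy model in Python. It is a sketch under simplifying assumptions, not the firmware decoder: the A and B bits simply alternate (A first), each encoded bit is represented by two half-cell levels, and the decoder is assumed to already know where the bit cells start and which slot belongs to channel A. In bi-phase mark coding the level toggles at the start of every bit cell, and a 1 adds a second toggle in the middle of the cell.

```python
def tdm_interleave(a_bits, b_bits):
    """Interleave channel A and channel B bits: A0, B0, A1, B1, ..."""
    out = []
    for a, b in zip(a_bits, b_bits):
        out += [a, b]
    return out


def bpm_encode(bits, level=0):
    """Bi-phase mark encode bits into two half-cell levels per bit."""
    symbols = []
    for bit in bits:
        level ^= 1          # transition at the start of every bit cell
        symbols.append(level)
        if bit:
            level ^= 1      # extra mid-cell transition encodes a 1
        symbols.append(level)
    return symbols


def bpm_decode(symbols):
    """Decode aligned half-cell pairs: differing halves mean 1, equal mean 0."""
    return [int(symbols[i] != symbols[i + 1]) for i in range(0, len(symbols), 2)]


def tdm_deinterleave(bits):
    """Split the decoded stream back into channel A and channel B."""
    return bits[0::2], bits[1::2]


l1a = [1, 0, 0, 1]        # channel A: one L1A decision per bunch crossing
commands = [0, 1, 1, 0]   # channel B: command bits (arbitrary example values)
line = bpm_encode(tdm_interleave(l1a, commands))
decoded_a, decoded_b = tdm_deinterleave(bpm_decode(line))
```

In this model every bunch crossing produces two bits and therefore four half-cell levels, which corresponds to four line symbols per 40MHz period.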
3.5. AMC13 Block
Description coming soon...
Figure 12: TTC decoder.
3.6. Fast Command Block
Top file: fast_command_core.vhd
The Fast Command Block is used to create different fast commands at 40 MHz and forward them to the Physical Interface Abstraction Layer, where a dedicated block (see 3.8.1) creates the OSERDESE2 instance and encodes the commands to be sent at 320 MHz. The block is able to send triggers in continuous mode or in a mode with a fixed number of triggers. The trigger_to_accept register is used to set the mode: 0 - forwards all received triggers until the "stop triggering" command is issued; 1...1000000 - sends the specified number of triggers and then stops. The trigger source may be selected using the trigger_source register. The available options are (the number in the list corresponds to the value to be set in the register):
1. TTC - the commands are received from the TTC Block (see 3.4, 3.5). Make sure that the TTC block is enabled. The connected commands are: BC0, Resync, Calibration Pulse, L1A.
2. Stubs - triggering on stubs recovered from the Physical Interface Abstraction Layer. Currently implemented only for CBC chips. The user may specify the delay of the stub trigger using the stub_trigger_latency register. The triggering can be done using the StubOR signal (requiring at least one stub to be present on the stub lines) or the HitOR signal (requiring at least one fired channel). One more option is the stub trigger veto, which vetoes subsequent triggers within the specified time range (set in 40 MHz clock cycles).
3. User Trigger - internal trigger generator with a user-defined frequency. The trigger frequency may be selected using the user_trigger_frequency register. The value is set in kHz, in the range from 1 kHz to 1 MHz.
4. TLU - uses the DIO5 to interface with the Trigger Logic Unit (see 3.10); in the data handshake mode it also stores the trigger id in the event header. Make sure that the dio5_en and tlu_enabled registers are set to 1.
5. External Trigger Source - is used to get the external trigger either from the DIO5 (input 2), TLU, NIM+ board or the LEMO connectors of the FMC L12 (used for the CBC3, MPA and SSA chips in the electrical readout mode). The delay of the external trigger may be set using the ext_trigger_delay_value register. Please note that the delay value does not take into account the physical signal delay or the internal delay in the FPGA (a couple of clock cycles).
6. Test Pulse FSM - uses a dedicated finite state machine to send the triggers. Each trigger sequence is composed of:
• Fast Reset (Resync)
• Wait for delay_after_fast_reset clock cycles
• Calibration (Test) Pulse
• Wait for delay_after_test_pulse clock cycles
• L1A (trigger) signal
• Wait for delay_before_next_pulse clock cycles
Each of the signals may be disabled using the dedicated registers; disabling a signal also disables the delay that follows it.
7. Antenna trigger - used to trigger the NxCBC3 UIB antenna and then send an L1A signal. The antenna trigger is sent at the user_trigger_frequency frequency. The L1A signal is sent with a delay of antenna_trigger_delay_value.
8. Consecutive FSM - sends two consecutive L1As to test high occupancy conditions.
In the External Trigger and TLU Trigger modes, the phase of the incoming trigger with respect to the 40 MHz clock is also captured. The phase is sampled using the 320 MHz clock and stored in the event header.
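The trigger source values and the Test Pulse FSM sequence described above can be summarised in a short Python model. It is a sketch for documentation purposes only: the enum member names and the helper build_test_pulse_sequence are hypothetical, and only the numeric values and the ordering of the sequence are taken from the list above.

```python
from enum import IntEnum


class TriggerSource(IntEnum):
    """Values of the trigger source register, following the numbered list."""
    TTC = 1
    STUBS = 2
    USER_TRIGGER = 3
    TLU = 4
    EXTERNAL = 5
    TEST_PULSE_FSM = 6
    ANTENNA = 7
    CONSECUTIVE_FSM = 8


def build_test_pulse_sequence(delay_after_fast_reset, delay_after_test_pulse,
                              delay_before_next_pulse, fast_reset_en=True,
                              test_pulse_en=True, l1a_en=True):
    """Return one Test Pulse FSM sequence as a list of (action, value) steps.

    A disabled signal is dropped together with the delay that follows it.
    """
    steps = [
        ("fast_reset", fast_reset_en, delay_after_fast_reset),
        ("test_pulse", test_pulse_en, delay_after_test_pulse),
        ("l1a", l1a_en, delay_before_next_pulse),
    ]
    sequence = []
    for signal, enabled, delay in steps:
        if not enabled:
            continue
        sequence.append(("send", signal))
        sequence.append(("wait", delay))
    return sequence


# Example with arbitrary delay values (in 40 MHz clock cycles).
full = build_test_pulse_sequence(50, 200, 400)
no_reset = build_test_pulse_sequence(50, 200, 400, fast_reset_en=False)
```

With the fast reset disabled, the sequence starts directly with the test pulse and the delay after the fast reset is skipped as well.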
3.7. Command Processor Block
Top file: command_processor_core.vhd
This block is responsible for handling the I2C commands: it receives a command from IPbus and forwards it to the Physical Interface Abstraction Layer. There are two FIFOs: the commands FIFO (where the IPbus slave directly stores the requests from the middleware) and the replies FIFO (where the Command Processor Block stores the replies to be read using IPbus). Each command/reply follows the pre-defined format described in Appendix B. Each command contains the command type field, which specifies the conditions of the transaction:
0. Send transactions, one after the other, to a certain chip on a certain hybrid, specified in the hybrid id and chip id fields.
1. Send transactions, one after the other, to all chips on a certain hybrid, specified in the hybrid id field.
2. Send the transaction to all chips on all hybrids.
3. Send a command to all chips on a certain hybrid, using the native broadcast I2C address. Readback is not possible in this mode.
4. Send a command to all chips on all hybrids, using the native broadcast I2C address. Readback is also not possible in this mode.
8. Send a command to the MPA SSA Board. A different I2C master is used for safety reasons. The master itself is the same, but the wrapper and the command format are different. The command format is described in Appendix B.
Note that commands 1-2 send the transactions only to active chips/hybrids. The slave id field for both I2C masters is not the real I2C address, but the id of the I2C slave. This allows up to 32 slaves to be addressed using only 5 bits. Additionally, this makes it possible to set fixed slave parameters (such as the number of bytes to be read/written, the necessity of the stop condition, and the negative ACK of the slave) upon d19c initialization. All the parameters (including the I2C address) are set using the IPbus registers.
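The indirection between the 5 bit slave id and the real I2C address can be pictured as a small lookup table that is filled once at initialization. The Python model below is a sketch of that idea only: the class and field names are hypothetical, the example address is arbitrary, and the actual parameters live in IPbus registers rather than in software objects.

```python
from dataclasses import dataclass
from enum import IntEnum


class CommandType(IntEnum):
    """Command type values as listed above (member names are illustrative)."""
    SINGLE_CHIP = 0
    ALL_CHIPS_ONE_HYBRID = 1
    ALL_CHIPS_ALL_HYBRIDS = 2
    BROADCAST_ONE_HYBRID = 3
    BROADCAST_ALL_HYBRIDS = 4
    MPA_SSA_BOARD = 8


SLAVE_ID_BITS = 5
MAX_SLAVES = 1 << SLAVE_ID_BITS  # 5 bits address up to 32 slaves


@dataclass(frozen=True)
class SlaveConfig:
    """Fixed per-slave parameters set once at initialization."""
    i2c_address: int
    n_bytes: int
    needs_stop: bool
    nack_expected: bool


class SlaveTable:
    """Maps a slave id (0..31) to its fixed configuration."""

    def __init__(self):
        self._slots = {}

    def configure(self, slave_id, config):
        if not 0 <= slave_id < MAX_SLAVES:
            raise ValueError(f"slave id {slave_id} does not fit in {SLAVE_ID_BITS} bits")
        self._slots[slave_id] = config

    def resolve(self, slave_id):
        return self._slots[slave_id]


def readback_excluded(command_type):
    """True for the native-broadcast types, for which readback is not possible."""
    return command_type in (CommandType.BROADCAST_ONE_HYBRID,
                            CommandType.BROADCAST_ALL_HYBRIDS)


table = SlaveTable()
table.configure(0, SlaveConfig(i2c_address=0x41, n_bytes=1, needs_stop=True, nack_expected=False))
```

A command then only carries the slave id, and the master looks up the address and transfer parameters from the table.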
3.8. Physical Interface Abstraction Layer
Top file: phy_core.vhd
The Physical Interface Abstraction Layer forms the layer between the FPGA and the front-end. It also abstracts the incoming and outgoing data formats of the different front-ends in order to minimise the changes needed in the downstream part of the firmware. The Physical Interface Abstraction Layer interfaces the fast commands, the slow control commands and the data from the front-end with the other blocks of the firmware. More information on these data streams can be found in the following sections.
3.8.1. IO to the Fast Command Block
Top file: fast_cmd_block.vhd
For the fast commands, the Physical Interface Abstraction Layer forms the transition from the 40MHz regime to the 320MHz regime using an OSERDES (see figure 14). The 110XXXX1 pattern [3] is sent continuously, while a 4 bit bus at 40MHz, holding the fast commands to be transmitted to the front end, is received from the Fast Command Block. The ctrl_phy_fast_cmd_phase register can be used to change the phase of the fast command going to the chip with respect to the 320MHz clock sent to the chip.
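The framing of the 4 bit fast command bus into the 110XXXX1 pattern can be sketched in Python as follows. Two points are assumptions of this sketch and are not stated in this section: that the leftmost bit of the pattern is the first one on the line, and how the individual fast commands map onto the four X positions (see [3] for the actual encoding).

```python
FRAME_HEADER = 0b110 << 5   # the fixed "110" prefix in bits 7..5
FRAME_TRAILER = 0b1         # the fixed trailing "1" in bit 0


def fast_command_frame(command_bits):
    """Embed a 4 bit command value into the 8 bit 110XXXX1 frame."""
    if not 0 <= command_bits <= 0xF:
        raise ValueError("the fast command bus is 4 bits wide")
    return FRAME_HEADER | (command_bits << 1) | FRAME_TRAILER


def serialize_msb_first(frame):
    """Return the 8 frame bits in transmission order (MSB first, assumed)."""
    return [(frame >> i) & 1 for i in range(7, -1, -1)]


idle = fast_command_frame(0b0000)      # no command active: 11000001
example = fast_command_frame(0b1000)   # one command bit set: 11010001
```

One such 8 bit frame is produced per 40MHz clock cycle, which matches the 8 to 1 serialization onto the 320MHz line.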
[Figure 13 diagram. Labels recoverable from the drawing: FC7; CBC3 emulator (hybrids 0..i, chips CBC3 0..j, with fast command receiver, I2C slave, stub data, triggered data generator + pipeline + fifo); Command Processor; Fast Command block (40MHz: fast reset, trigger, test pulse trigger, orbit reset); fast command 8 to 1 OSERDES (110XXXX1); i2c master and i2c wrapper per hybrid (SDA MOSI, SDA MISO, SCL, i2c request, i2c reply); slow control muxdemux; 1 to 8 ISERDES + phase aligner + bitslip; 5 x 8 bit stub data buses; 40 bit stub data bus; triggered data readout per hybrid and chip; trig data buses; Data Readout Block; settings and status of the Physical Interface Abstraction Layer.]
Figure 13: Schematic representation of the Physical Interface Abstraction layer, connection tothe other blocks and the most important signals. As an example here the front-end to which it is connected is the CBC3 emulator in the configuration where thisemulator would be running on an other FPGA (see 4.1.2). The case shown here iswhere the FW would be configured to read-out multiple chips on multiple hybrids.This to show the instantiation of the block when scaling up the number of hybridsand chips. This configuration is however not physically possible due to the limitednumber of lines on the FMCs.
19
Figure 14: Data flow on the ports of an 8 bit OSERDES/ISERDES. This naming of the ports is also used in the firmware code [10].
3.8.2. IO to the Command Processor Block
Top file: slow_control_muxdemux.vhd
The input slow control command to the Physical Interface Abstraction Layer specifies whether a broadcast to all chips is requested, the hybrid id (4 bits), the chip id (5 bits), the page (only for CBC chips), the read/write bit, the register address (up to 2 bytes, to support 16 bit addressing for the MPA and SSA) and the data to be sent (multiple bytes, to support sequential read/write) (see appendix B).
The hybrid id is used in the multiplexer (slow_control_muxdemux.vhd), which forwards the slow control command to the master of the correct hybrid and waits for an acknowledge from this master. When a command is forwarded to a hybrid, the multiplexer requires an acknowledge within a specified time; if none is received, it generates a time-out error. This time-out error is propagated to the Command Processor Block.
For each hybrid there is a wrapper (phy_i2c_wrapper.vhd). This wrapper handles, for example, the paging which is applied in the CBC chips [4], [5]. It also translates the chip id (5 bits) to the chip address (7 bits); this translation can be configured over IPbus. In addition, it handles the sending of the correct chip address in case a broadcast to all chips is requested [2]. The state machine which drives the reading and writing of the page also checks whether the slave responds with an error. If it does, an error code is propagated from the wrapper to the Command Processor Block.
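The chip id translation performed by the wrapper can be sketched as a configurable lookup. The Python model below is a simplification with hypothetical names; in particular, the way a broadcast is expanded here (one transaction per configured chip address) is an assumption made for illustration and is not taken from the firmware.

```python
class ChipAddressMap:
    """Per-hybrid lookup from the 5 bit chip id to the 7 bit I2C address."""

    def __init__(self):
        self._addresses = {}

    def set_address(self, chip_id, i2c_address):
        """Configure one entry (in the firmware this is done over IPbus)."""
        if not 0 <= chip_id < 32:
            raise ValueError("chip id must fit in 5 bits")
        if not 0 <= i2c_address < 128:
            raise ValueError("I2C address must fit in 7 bits")
        self._addresses[chip_id] = i2c_address

    def targets(self, chip_id, broadcast=False):
        """Return the I2C addresses to drive for one slow control command."""
        if broadcast:
            # Assumption: a broadcast reaches every configured chip in turn.
            return [self._addresses[k] for k in sorted(self._addresses)]
        return [self._addresses[chip_id]]
```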
The wrapper in turn interfaces with the actual i2c master (phy_i2c_master.vhd), in which the i2c state machine is implemented. It drives both the clock and the SDA line of the i2c bus. The i2c clock frequency is configurable using the register cnfg_phy_i2c_freq.
The response from the slave is then sent back to the Command Processor Block in the reply format defined in appendix B. It carries the data read from the i2c register in case of an i2c read, or just an acknowledge in case of a write. As discussed above, this communication can also be used to propagate an error to the Command Processor Block.
3.8.3. IO to the Data Readout Block
Top file: xxx_connect.vhd
The IO from the Physical Interface Abstraction Layer to the Data Readout Block can be separated into two distinct data streams, one for the l1 data and one for the stub data. In the DAQ one wants to be able to match, for a single event, the stub data with the l1 data. To do this, two parameters need to be tuned: the l1 trigger latency and the stub latency. The l1 trigger latency is a setting on the front-end chip which programs the depth of the RAM for the l1 data. The setting depends on the latency between the reception of the data in this l1 memory and the arrival of a trigger over the fast command line, and is thus directly dependent on the latency induced by the trigger system (speed of the trigger logic, length of the cables, etc.). The stub latency is a setting on the FW side. In the Readout Block (see 3.9) the stub data is passed through a shift register with a programmable depth, which represents the stub latency. Stub data has to be delayed in the firmware in order to be able to match it with the l1 data. Both parameters can be obtained by scanning both latency ranges. When both the l1 and the stub latency are correctly set, the stub data will match the l1 data.
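The role of the stub latency can be illustrated with a software model of the programmable shift register and of a latency scan. This is a minimal sketch, assuming one stub sample per bunch crossing; the function names and the notion of a single "readout BX" at which the l1 data of the event becomes available are hypothetical simplifications.

```python
from collections import deque


def delay_line(samples, depth, fill=0):
    """Delay a per-BX sample stream by `depth` bunch crossings."""
    if depth == 0:
        return list(samples)
    register = deque([fill] * depth, maxlen=depth)
    delayed = []
    for sample in samples:
        delayed.append(register[0])  # the oldest entry leaves the register
        register.append(sample)      # the newest entry enters it
    return delayed


def scan_stub_latency(stub_stream, l1_readout_bx, expected_stub, max_depth):
    """Return all depths for which the delayed stub lines up with the l1 data."""
    matches = []
    for depth in range(max_depth + 1):
        if delay_line(stub_stream, depth)[l1_readout_bx] == expected_stub:
            matches.append(depth)
    return matches
```

For a stub injected at BX 3 and l1 data arriving at BX 10, the scan returns a single matching depth of 7.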
Below, the data paths for both the l1 and the stub data are described up to the point where they are sent to the Readout Block. For every chip data line coming to the FPGA (except for CBC2, see footnote 1) an ISERDES [10] is used to translate a single data line at 320 MHz to an 8 bit wide bus at 40 MHz. In this way the internal logic on the FPGA can run at 40 MHz, which makes timing closure much easier. More information on the sampling of the l1 data by the ISERDES can be found in 3.8.5. In xxx_connect.vhd, where xxx refers to the chip type (e.g. CBC2, CBC3, MPA), the l1 and stub processing blocks are instantiated for the number of hybrids present. Details on how these data flows are processed for each chip can be found below. The CBC3_connect.vhd and MPA_connect.vhd files also contain a CBC3_generator.vhd and an MPA_generator.vhd block, respectively. These blocks generate emulated chips so that the FW can be tested without actual chips connected. More on the emulators can be found in 4.1.2. For the MPA and SSA chips some extra functionality was added for single chip testing. It is located in the SSA_data_generator.vhd and ssa_lateral_data_generator.vhd files. More information on this functionality can be found in 3.8.6.
3.8.3.1. l1 data
Top file: triggered_data_all_xxxs.vhd
The processing of the l1 data is very similar for each chip. In the xxx_trig_data folder there is a triggered_data_readout_xxx.vhd file which decodes the incoming l1 data into a wide bus; the triggered_data_all_xxxs.vhd file checks for the correct header and sets the valid flag. The trigger_data_readout_wrapper_xxx.vhd file instantiates the read-out for the number of chips present.
1 Except for CBC2, where the data lines operate at 40 MHz instead of 320 MHz.
CBC2
Top file: trigger_data_cbc2_readout_wrapper.vhd
Acquiring the data from the CBC2 is more straightforward than from e.g. the CBC3 chip, as the CBC2 chip outputs data at 40 MHz only, compared to 320 MHz for the CBC3. Therefore there is no need to pass the l1 data from the CBC2 chip through an ISERDES. In triggered_data_CBC2_readout.vhd the l1 data packet is decoded at 40 MHz. The l1 header is searched for in the data and, once it is found, the rest of the data packet is decoded and built into a 274 bit wide bus, representing the length of one l1 data packet. In triggered_data_all_CBCs.vhd the packets are checked and a flag is set if they are valid. This flag is used in the Data Readout Block. In trigger_data_cbc2_readout_wrapper.vhd the read-out is instantiated once for each chip. The l1 data goes as this 274 bit wide bus to the Data Readout Block.
CBC3
Top file: trigger_data_readout_wrapper_CBC3.vhd
For the read-out of the l1 data from the CBC3, the block that processes the data in the Physical Interface Abstraction Layer receives 8 bit wide buses from the ISERDESes. Once the header pattern has been detected, these 8 bit wide buses are built into 274 bit wide buses containing the full l1 data. This 274 bit bus is sent for every chip to the Data Readout Block. The actual filling of the 274 bit bus happens in a state machine in triggered_data_readout_CBC3.vhd. The valid flag is set in the triggered_data_all_CBCs.vhd file, and trigger_data_readout_wrapper_CBC3.vhd instantiates the same number of read-outs as there are chips and passes them on to the wrapper.
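The idea of building a wide bus from 8 bit ISERDES words after a header has been seen can be sketched in software as follows. This is not the firmware state machine: the header pattern (two consecutive ones), the MSB-first bit order and the choice to count the header as part of the 274 bits are all assumptions made for this illustration.

```python
FRAME_BITS = 274  # width of the l1 bus quoted in the text


def bytes_to_bits(byte_stream):
    """Flatten 8 bit ISERDES words into single bits, MSB first (assumed)."""
    for word in byte_stream:
        for shift in range(7, -1, -1):
            yield (word >> shift) & 1


def assemble_frame(byte_stream, header=(1, 1), frame_bits=FRAME_BITS):
    """Return the first frame that starts with `header`, or None."""
    bits = list(bytes_to_bits(byte_stream))
    for start in range(len(bits) - frame_bits + 1):
        if tuple(bits[start:start + len(header)]) == tuple(header):
            return bits[start:start + frame_bits]
    return None
```

Note that the frame does not have to start on a byte boundary, which is why the search runs bit by bit rather than word by word.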
MPA
Until now, little has been done for the forwarding of the MPA l1 data to the Data Readout Block. Some code is available in the phy/MPA/trig_data folder, which follows the same file structure as for CBC2 and CBC3. Most of the work is done in triggered_data_readout_mpa.vhd, where the incoming data on the 8 bit buses from the ISERDES is interpreted and a bus with a fixed length is created. The state machine interprets the zero suppressed data format from the MPA (see figure 7) on the fly and decodes it. The output is, however, not yet linked to the Data Readout Block. The SLVS debug block discussed in section 3.8.6.1 can be used to view data from the l1 data lines.
SSA
Until now, nothing has been done for the forwarding of the SSA l1 data to the Data Readout Block. The SLVS debug block discussed in section 3.8.6.1 can be used to view data from the l1 data lines.
3.8.3.2. stub data
Top file: stub_data_all_xxxs.vhd
As for the l1 data, the 8 bit wide buses coming in from the ISERDES for the stub lines are forwarded to the Data Readout Block on buses that contain the full event information. In the stub_data_readout_xxx.vhd files the incoming data is processed and put on a wide bus. In the stub_data_all_xxx.vhd file the read-outs are instantiated for the number of chips present.
CBC2
The CBC2 has only a 0/1 flag on a 40 MHz line, which encodes the presence of a stub. Therefore the stub_data_all_CBC2.vhd file is very straightforward. It contains a lot of zero padding in order to respect the output stub data format towards the Data Readout Block, so that the same data format can be used for CBC2 and CBC3.
CBC3
In stub_data_readout_CBC3.vhd the data is acquired from all 5 stub lines and, once the sync bit has been found, the state machine starts building a 40 bit wide bus representing the full stub data for 1 BX.
MPA
Until now (25th February 2018), not much has been done for the forwarding of the MPA stub data to the Data Readout Block. In stub_data_readout_MPA.vhd the 8 bit wide buses from the ISERDES are decoded into 80 bit wide buses, to accommodate the fact that the stub information from 1 BX might be spread over 2 BXs. The header of the BX packet is decoded on the fly and the stub data is split BX by BX. This bus is not yet output to the Data Readout Block. What has been used until now (25th February 2018) to view data from the stub data lines is the SLVS debug block discussed in section 3.8.6.1.
SSA
Until now (25th February 2018), nothing has been done for the forwarding of the SSA stub data to the Data Readout Block. What has been used until now (25th February 2018) to view data from the stub data lines is the SLVS debug block discussed in section 3.8.6.1.
3.8.4. FMCIO mappings
Top file: fmcio_mapping_core.vhd
In order to cope with the large diversity of signals connected between the front-end and the FPGA in the different hardware configurations, different mappings were introduced, which are enabled according to the switches in table 1. The pin mappings are defined in the fmcio_package.vhd file. Pins have attributes such as serialised, which enables the ISERDES on the line, or wrong_pol, which tells the tool to invert the line. The fmcio_mapping_core.vhd file is sub-divided into several sections, responsible for different hardware options:
• Electrical Implementation. Used when 1 or 2 hybrids are connected using the electrical connection. The tool defines the connections based on the FMC selection in user_package_basic.vhd. In case of 2xCBC2, 8xCBC2, 1xCBC3, 2xCBC3, 1xMPA, 1xSSA, 1xMPA SSA hybrids, the routing from the blocks in the Physical Interface Abstraction Layer to the fmcio pins is defined in the fmcio_mapping_one_hybrid.vhd file. In case of an 8xCBC3 hybrid, the routing is defined in the fmcio_mapping_8cbc3.vhd file.
• Optical Implementation. Used to connect the hybrids with optical fibres. Not implemented yet.
• There is also a dedicated mapping file for sending data from the emulator over the FMC pins (more info in 4.1.2). This mapping is described in the fmcio_mapping_emulator.vhd file and the corresponding mappings are in the fmcio_package_emulator.vhd file.
All three files, fmcio_mapping_one_hybrid.vhd, fmcio_mapping_8cbc3.vhd and fmcio_mapping_emulator.vhd, follow the same approach: they connect the physical interface layer signals to the buffered FMC pins. Each one uses an fmcio_mapping_buf_gen.vhd or fmcio_mapping_buf_gen_with_phase_tuning.vhd instance to generate the buffers (the one with phase tuning also de-serialises the incoming data, if requested; see Section 3.8.5).
The buffers for the triggering FMCs are also generated in fmcio_mapping_core.vhd. Mappings for the DIO5 and the NIM+ are generated. For some FMCs the clock pins are used as the trigger or external clock inputs.
3.8.5. Phase sampling of incoming data
For chips which output data at 320 MHz, phase tuning and subsequent de-serialization are required. Phase tuning is the process of aligning the data with respect to the FC7's internal 320 MHz clock. The goal is to sample in the middle of the data bit, which excludes the possibility of sampling at the moment of a bit change. As the arrival phase is not known, the tuning has to be done by recording the bit transitions and choosing the optimal sampling point.
The XAPP523 [11] note was adapted to suit the d19c needs. The note describes the 4x Asynchronous Oversampling algorithm, using ISERDESE2 instances. The basic principle is shown in figure 15 and described below. Two ISERDESE2 are used: one samples the non-shifted data and the second one samples the data from the same line, but shifted by 45 degrees. Both of them sample on the rising and falling edge of the 160 MHz clock (DDR) and on the rising and falling edge of the 160 MHz clock shifted by 90 degrees. Each of them returns 4 samples per clock cycle. Combining the outputs of the two, one gets 4x oversampling of each 320 MHz data bit.
Figure 15: 4x Asynchronous Oversampling.
The phase selection has to be done afterwards. To do this, a dedicated Data Recovery Unit (DRU) was implemented. Based on the bit transition information (see [11] for additional explanation), the proper phase is selected. The next stage is to convert the 2 bit @160 MHz bus into an 8 bit @40 MHz bus. A sync pattern (by default, "10000000") has to be specified. The DRU aligns the dedicated counter in order to lock onto the sync pattern.
Each line can be set to do the data recovery independently, but only if it is able to produce the synchronization pattern. This is, however, not necessary, as all the lines are supposed to have the same delays. That is why, for the CBC3 chip, only the Stub 5 line runs in the synchronization mode; the other lines simply re-use the phase counter from this line. In case of the MPA and SSA, every line can output the desired pattern using the chip's shift registers, but in practice again only one line is used for synchronization. For the SSA debug mode, the lateral data outputs are tuned separately, as they are not synchronized to the regular stub lines.
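The two steps described above (choosing a sampling phase from the observed bit transitions, then locking the word boundary onto the sync pattern) can be illustrated with a much simplified software model. This sketch is not the XAPP523 DRU: it works on a recorded list of 4x oversampled values, and all function names are hypothetical.

```python
OVERSAMPLING = 4


def pick_sampling_phase(samples):
    """Choose the sample index (0..3) that lies away from the bit edges."""
    edge_counts = [0] * OVERSAMPLING
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            edge_counts[i % OVERSAMPLING] += 1
    edge_phase = edge_counts.index(max(edge_counts))
    # A bit that starts at phase p covers samples p..p+3, so p+2 is one
    # of its two central samples.
    return (edge_phase + OVERSAMPLING // 2) % OVERSAMPLING


def recover_bits(samples, phase):
    """Keep one sample per data bit at the chosen phase."""
    return samples[phase::OVERSAMPLING]


def find_word_offset(bits, sync=(1, 0, 0, 0, 0, 0, 0, 0)):
    """Return the bit offset (0..7) at which `sync` repeats, or None."""
    for offset in range(8):
        words = [tuple(bits[i:i + 8]) for i in range(offset, len(bits) - 7, 8)]
        if words and all(word == tuple(sync) for word in words):
            return offset
    return None
```

The word alignment only succeeds while the line transmits the sync pattern, which mirrors the statement above that a line can do the data recovery on its own only if it is able to produce that pattern.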
3.8.6. Extras for the MPA/SSA testing
3.8.6.1. SLVS Debug Block
Top file: slvs_debug_core.vhd
This part of the code is used for the low-level debugging of the chip data transmission lines. It uses two generic approaches to store the data:
• slvs_debug_generic_line.vhd - the instance stores the data for N 40 MHz clock cycles after receiving the start signal. The data is saved in the allocated registers, which can be read using IPbus. This is reasonable for short periods of data capture and is used to store 200 clock cycles for the L1 data line (starting from the L1A signal), and 40 clock cycles for each of the 5(8) CBC3/MPA(SSA) stub lines (starting from the calibration pulse signal).
• slvs_debug_generic_line_fifo.vhd - the instance also stores the data for N clock cycles after the start signal, but in this case the data is stored in a FIFO. Only 5 stub lines are buffered. In the d19c this is used to store the stub lines of the MPA chip and the first 5 stub lines of the SSA chip, in order to capture the pixel (strip) hit counters in the asynchronous readout mode [6]. There are two modes available:
– RAW Mode - the data is stored for 20000 BX after receiving the start signal.
– Parsed Counters Mode - the parsing is started after receiving the start signal. For the MPA, the counter header is detected, and the parsing is stopped after 2040 counters have been received. For the SSA, no header is present; the parsing is therefore started with a certain delay after receiving the start signal, and 120 counters are then collected, assumed to be present every 8 BX. Manual alignment is needed.
Due to the amount of data to be stored (5 lines x 8 bits = 40 bits per BX), two IPbus registers had to be allocated. The data format is defined in the ipbus_decoder_ctrl.vhd file and depends on the readout mode. Example Python scripts are provided.
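Since 40 bits do not fit in one 32 bit IPbus word, a sample has to be split over two registers. The sketch below shows one possible split; the actual data format is the one defined in ipbus_decoder_ctrl.vhd, so the byte order and the low/high word assignment used here are assumptions for illustration only.

```python
def split_stub_sample(lines):
    """Pack 5 bytes (one per stub line) and split them over two 32 bit words."""
    if len(lines) != 5:
        raise ValueError("expected one byte for each of the 5 stub lines")
    value = 0
    for byte in lines:
        value = (value << 8) | (byte & 0xFF)
    low_word = value & 0xFFFFFFFF  # lower 32 of the 40 bits
    high_word = value >> 32        # remaining 8 bits
    return low_word, high_word


def join_stub_sample(low_word, high_word):
    """Rebuild the 5 bytes from the two register values."""
    value = (high_word << 32) | low_word
    return [(value >> shift) & 0xFF for shift in range(32, -1, -8)]
```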
3.8.6.2. SSA like data sending
Top file: SSA_data_generator.vhd
In order to test the full functionality of the MPA chip, the data inputs of the MPA chip also need to be tested without the chip being connected to an SSA chip. In a module, the SSA chip sends 8 lines of stub data (see figure 6) and 1 line of L1 data (see figure 5) to the MPA. For testing single chips, functionality was added so that one can send configurable, 8 BX long blocks of SSA like stub data after receiving a fast test pulse trigger, and can also send configurable SSA like l1 data after the reception of a fast trigger signal. Both the l1 and the stub data lines can be phase shifted with respect to the 320 MHz clock sent to the chip, using the registers ctrl_phy_ssa_gen_trig_phase and ctrl_phy_ssa_gen_stub_phase. The delay between the fast calibration signal/fast trigger signal and the actual output of the data is also configurable in 25 ns steps, using the cnfg_phy_SSA_gen_delay_trig_data register. This structure is reflected in the SSA_data_generator.vhd file. The l1 data is generated in the trig_data_output_SSA.vhd file. It has a state machine which, when a trigger is received, outputs a configurable SSA like l1 data packet in blocks of 8 bits to a shift register, which can be used to delay the data. This data is then sent to an OSERDES. Almost the same happens in the stub_data_output_SSA.vhd file for the SSA like stub data: a state machine generates the 8 lines of stub data in packets of 8 bits and passes them on to a shift register and an OSERDES. For both the l1 data and the stub data OSERDESes, the clocks are the ones generated in phase_shifted_clocks_ssa.vhd, which instantiates an MMCM that can be used to phase shift the 40 MHz and 320 MHz clocks. In this way the output of the OSERDES is shifted as well.
3.8.6.3. Lateral data sending
Top file: SSA_lateral_data_generator.vhd
In order to test the full functionality of the SSA chip, the data inputs of the SSA chip also need to be tested without the chip being connected to other SSA chips. The SSA chip receives lateral data from the neighbouring chips in the format shown in figure 16. The firmware allows emulated lateral data to be sent when a fast trigger is sent to the SSA chip. The content can be configured using the registers cnfg_phy_SSA_gen_left_lateral_data_format and cnfg_phy_SSA_gen_right_lateral_data_format. The phase of the lateral data with respect to the 320 MHz clock can also be configured, and the sending of the lateral data can be delayed in units of 25 ns with respect to the fast trigger, using the registers ctrl_phy_ssa_gen_lateral_phase_1 and 2 and cnfg_phy_SSA_gen_delay_lateral_data, respectively. The functionality described above is implemented in SSA_lateral_data_generator.vhd. In the lateral_data_output_SSA.vhd file the lateral data is produced according to the configurable input; it is delayed in a shift register and then passed through an OSERDES, which is clocked by the phase-shiftable clocks generated in phase_shifted_clocks_ssa.vhd.
Figure 16: SSA lateral data.
3.9. Readout Block
Top file: be_proc.vhd
The Back-End processor block converts the data from the Physical Interface Abstraction Layer (see Section 3.8.3) and loads it into the readout FIFO (or the DDR3 memory). This block is also responsible for the event formatting. Most of the constants and methods for this block are defined in the user_pkg_be_proc.vhd and data_packing_package.vhd files.
3.9.1. Data Buffering
Top file: data_buffer.vhd
The first step of the data readout is the buffering of the data received from the Physical Interface Abstraction Layer. The generic buffer interface is instantiated here: for each chip type, buffers of different size are generated. The block takes the raw data bus as input and creates separate buffers for the trigger and stub data paths. Currently, only the CBC2/3 data buffers are implemented. Due to the special data formatting in the physical interface layer, there is no difference between CBC2 and CBC3 chips for the back-end processor: the CBC3 event format is used for both, and the unused values are set to 0 for CBC2 chips.
3.9.2. Counters
As information such as the bunch crossing id, L1 counter, TDC value and TLU trigger id needs to be placed in the event header, dedicated counter blocks were developed. The counters are stored in the FIFOs allocated in the BX_cnt_proc.vhd, evnt_cnt_proc.vhd, TDC_cnt_proc.vhd and tlu_fifo_core.vhd files.
3.9.3. Data Packing
Top file: data_packer_main.vhd
All the FIFO outputs of the data and counter buffers mentioned above are connected to this block. The block generates a common synchronous read signal to get the full event information. The data is packed into the next stage FIFO, called packer_fifo.vhd. There are two data packing options available: virgin raw (Figure 17) or zero suppressed (Figure 18) events. Both event formats start with the 6 word header (HEADER1), which is sent once per event, followed by the hybrid header (HEADER2), which is sent once for each hybrid included in the event. The HEADER2 is followed by the chip data. The final event therefore looks as follows (for M+1 hybrids and N+1 chips):
HEADER1 → HEADER2 (Hybrid 0) → Hybrid 0, Chip 0 Data → ... → Hybrid 0, Chip N Data → HEADER2 (Hybrid 1) → Hybrid 1, Chip 0 Data → ... → Hybrid 1, Chip N Data → ... → HEADER2 (Hybrid M) → Hybrid M, Chip 0 Data → ... → Hybrid M, Chip N Data.
The virgin raw format declares a fixed event size, which depends only on the hybrid/chip availability. The chips/hybrids can be excluded from the DAQ chain using the disable IPbus setting. In contrast, the zero suppressed format does not fix the event size: the size of each event is calculated depending on the number of clusters/stubs available.
Some explanations of the variables stored in the event headers are provided below:
• BLOCK_SIZE - total size of the event in 32 bit words, including all the headers. For the VR events this is calculated using the formula: $\mathrm{BLOCK\_SIZE} = \mathrm{HEADER1} + N_{hybrids} \cdot (\mathrm{HEADER2} + N_{chips} \cdot (Size_{trigdata} + Size_{stubdata}))$.
• HEADER1_SIZE - size of the HEADER1 in words. Currently, 6.
• FE_DATA_MASK - the mask of the enabled hybrids.
• DATA_FORMAT_VER - the data format version. Currently, the virgin raw format has the event format version 0x03; the zero suppressed data is sent with the version tag 0xF3.
[Figure 17 layout not reproduced. Each row of the figure is one 32 bit word (bits 31 to 0). Field labels recoverable from the figure, by word: 0: HEADER1_SIZE, FE_DATA_MASK, BLOCK_SIZE; 1: CIC_ID, CHIP_ID, DATA_FORMAT_VER, DUMMY_SIZE; 2: TRIGDATA_SIZE, EVENT_NBR (L1A_CNT); 3: BX_CNT; 4: STUBDATA_SIZE, TLU_TIRGGER_ID, TDC; 5 (HEADER2, Hybrid 0): CHIP_DATA_MASK, HEADER2_SIZE, EVENT_SIZE; following words, L1A data (Hybrid 0, Chip 0): channel hit bits (channel numbers from 253 down to 1 appear in the figure), L1Cnt, pipeAddr, Buf Ovf, Lat Err; then stub data (Hybrid 0, Chip 0): Stub1 to Stub3, Bend1 to Bend3, sync=1, ErrFlag, sOR254, SOvf.]
Figure 17: Example virgin raw event - CBC3 chip.
[Figure 18 layout not reproduced. Each row of the figure is one 32 bit word (bits 31 to 0). Field labels recoverable from the figure, by word: 0: HEADER1_SIZE, FE_DATA_MASK, BLOCK_SIZE; 1: CIC_ID, CHIP_ID, DATA_FORMAT_VER, DUMMY_SIZE; 2: EVENT_NBR (L1A_CNT); 3: BX_CNT; 4: TLU_TIRGGER_ID, TDC; 5 (HEADER2, Hybrid 0): CHIP_DATA_MASK, HEADER2_SIZE, EVENT_SIZE; 6 to 11, clustered hit data (Hybrid 0, Chip 0): CHIP ID, Data Type(0), MASK, STRIP ID, WIDTH, STRIP ID, WIDTH; 12 and 13, stub data (Hybrid 0, Chip 0): CHIP ID, Data Type(1), STUB ADDRESS, STUB BEND, sync=1, ErrFlag, sOR254, SOvf; 14, general data (Hybrid 0, Chip 0): CHIP ID, Data Type(2), Cl.Ovf, L1Cnt, pipeAddr, Buf Ovf, Lat Err.]
Figure 18: Example zero suppressed event - CBC3 chip.
• DUMMY_SIZE - the number of dummy words attached. Dummy words are attached to make the DDR3 readout more efficient: a single DDR3 read transaction is 256 bits, so trailing zero words are appended to make the event size divisible by 8.
• CHIP_DATA_MASK - in HEADER2. The mask of the active chips.
• HEADER2_SIZE - in HEADER2. The size of the HEADER2, currently 1.
• EVENT_SIZE - in HEADER2. The total hybrid data size, including HEADER2.
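The BLOCK_SIZE formula and the dummy word padding described in the list above can be worked through with a short calculation. In this sketch the header sizes are the ones quoted in the text, while the per-chip trigger and stub data sizes used in the example are illustrative values only, not taken from the firmware. Whether BLOCK_SIZE itself includes the dummy words is not modelled.

```python
def virgin_raw_block_size(n_hybrids, n_chips, trig_words, stub_words,
                          header1_words=6, header2_words=1):
    """Event size in 32 bit words for the virgin raw format."""
    per_chip = trig_words + stub_words
    per_hybrid = header2_words + n_chips * per_chip
    return header1_words + n_hybrids * per_hybrid


def dummy_words(block_size, words_per_ddr3_read=8):
    """Trailing zero words needed to reach a multiple of 8 words (256 bits)."""
    return (-block_size) % words_per_ddr3_read


# Illustrative sizes: 9 words of trigger data and 2 words of stub data per chip.
size = virgin_raw_block_size(n_hybrids=1, n_chips=2, trig_words=9, stub_words=2)
padding = dummy_words(size)
```

With these example numbers the event is 6 + 1 * (1 + 2 * 11) = 29 words long, and 3 dummy words bring it to 32 words, i.e. four 256 bit DDR3 transactions.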
Both the VR and the ZS data packers are similar finite state machines, which simply read the event from the buffer, check the enabled flags and store the event in the packer FIFO. The main difference is that the zero suppressed packer (zs_data_packer.vhd) also contains the CBC data clusterizers. The stub data clusterizer simply checks the number of stubs sent by the CBC chip, while the triggered data clusterizer splits the data into odd and even channels and then tries to find the clusters sequentially, starting from strip 127 and going down to strip 1. The output of both clusterizers is the number of stubs (clusters) and the data array. The triggered data is stored two clusters per word: each word contains the MASK parameter, which gives the number of clusters available in the word (0, 1, 2).
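A software analogue of the triggered data clusterizer is sketched below. It is a simplified illustration, not the firmware logic: the channel-to-strip numbering, the convention that a cluster is identified by its highest strip, and the dictionary used to represent a packed word are all assumptions.

```python
def split_layers(channels):
    """Split hit channels (1..254) into odd and even layers of strips 1..127."""
    odd = {(c + 1) // 2 for c in channels if c % 2 == 1}
    even = {c // 2 for c in channels if c % 2 == 0}
    return odd, even


def find_clusters(hits, top_strip=127):
    """Scan from `top_strip` down to strip 1 and group adjacent hits.

    Returns (first_strip, width) pairs, first_strip being the highest
    strip of the cluster.
    """
    clusters = []
    strip = top_strip
    while strip >= 1:
        if strip in hits:
            width = 1
            while strip - width >= 1 and (strip - width) in hits:
                width += 1
            clusters.append((strip, width))
            strip -= width
        else:
            strip -= 1
    return clusters


def pack_two_per_word(clusters):
    """Group clusters in pairs; MASK is the number of clusters in the word."""
    words = []
    for i in range(0, len(clusters), 2):
        pair = clusters[i:i + 2]
        words.append({"mask": len(pair), "clusters": pair})
    return words
```

An odd number of clusters leaves the last word half filled, which is the case MASK = 1 distinguishes from a full word.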
3.9.4. Data Readout Buffer
Top file: readout.vhd
The last stage of the data readout (on the firmware side) is the readout buffer. There are two similar finite state machines: one is responsible for packing the data into the FIFO, the other one stores the data in the DDR3 memory. The data storage option has to be selected at firmware synthesis: the READOUT_BUFFER_TYPE constant in the user_package_basic.vhd file selects the buffer type to be used.
Each FSM operates in two modes: with and without the handshake. In the handshake mode, the FSM stores a specified number of events in the readout FIFO (DDR3), raises the data ready flag and waits until all the events have been read. In the non-handshake mode, the events are stored until the full flag is raised by the readout FIFO (DDR3).
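The difference between the two modes can be made concrete with a toy software model of the buffer. The class below is a hypothetical illustration of the behaviour described above; what the firmware does with events that cannot be stored is not modelled.

```python
from collections import deque


class ReadoutBuffer:
    """Toy model of the readout buffer with and without handshake."""

    def __init__(self, capacity, handshake, events_per_block=1):
        self.capacity = capacity
        self.handshake = handshake
        self.events_per_block = events_per_block
        self.data_ready = False
        self._fifo = deque()

    @property
    def full(self):
        return len(self._fifo) >= self.capacity

    def store(self, event):
        """Try to store one event; return True if it was accepted."""
        if self.full or (self.handshake and self.data_ready):
            return False
        self._fifo.append(event)
        if self.handshake and len(self._fifo) == self.events_per_block:
            self.data_ready = True  # the requested number of events is in
        return True

    def read(self):
        """Read one event; in handshake mode the flag drops once empty."""
        event = self._fifo.popleft()
        if self.handshake and not self._fifo:
            self.data_ready = False
        return event
```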
3.10. External Triggering (DIO5, TLU, NIM+, FMC L12)
Top files: dio5_core.vhd, tlu_core.vhd
Figure 19: DIO5 FMC card.
There are several external triggering options provided in the d19c firmware:
• Using the LEMO connectors on the FMC L12 card, designed for the 2xCBC3, 8xCBC3, MPA and SSA readout. The SPARE1(?) connector is used for the external triggering, SPARE2(?) is used for the external clock.
• Using the NIM+ trigger board (FNAL). This option provides clocking, trigger and back-pressure handling.
• Using the DIO5 FMC. Input 2 is used for the external triggering, input 5 is used for the external clock source. This option also provides the TLU functionality. Additionally, one can connect the TLU clock input to channel 1 of the DIO5, the TLU busy signal to channel 3 and the TLU reset signal to channel 4. This allows communication with the TLU block. In order to use the Ethernet DUT connector of the TLU, one needs to build an LVDS to TTL converter; a sample schematic is shown in Figure 20.
For all of the options mentioned above, the external trigger source has to be selected; the external clock can be used as well.
[Figure 20 schematic not reproduced. Part labels recoverable from the drawing: X1 SN74ACT244N, X2 DS90LV049TMT, X3 LD1117AV33, J1 RJ45, J2 PWR, connectors J3 to J6 labelled TRIGGER CLOCK IN, BUSY IN, RESET OUT and TRIGGER OUT, +5V and +3.3V rails.]
Figure 20: TLU Converter Schematics (as proposed by M. Haranko)
4. Firmware development and verification
4.1. Testing
To commission the testing or DAQ chain of a new chip as fast as possible, emulators of the chips were written, so that the firmware can be tested without having the actual chips or assemblies. The FW is also part of the chain used to test the chips for the first time, so a good understanding of the FW is necessary in order to debug the chip without having to debug the FW at the same time. The emulators implement i2c slaves, handle the fast commands, generate l1 like data after the reception of a trigger and send out stub data in the format specified in the chip documentation. The emulators can be run either internally (i.e. the emulator and the d19c firmware run on the same FPGA and the lines are connected internally) or on 2 separate FC7s, interconnected by a data cable. This external configuration mimics the connection of the real front-ends as closely as possible.
4.1.1. Simulation
Top file: sim_usr_general.vhd
Given the presence of the emulators, one can run a full simulation of the d19c firmware in different configurations. It is possible to follow the complete data flow. A top level simulation of the user code is used to verify the design and the interfaces of the different blocks within the complete framework, and to debug problems that appear during operation.
Top file: sim phy core.vhd
A test bench for the Physical Interface Abstraction Layer is also provided. One can sendfast commands to the phy, start i2c communication and see the flow of the l1 and stubdata coming from the emulators.
Top file: sim be proc.vhd
Provides a test bench for the Back-End processor block. The stimulus sends a stubwith address 255, waits for 8 BX, sends the L1A signal, waits for 2 BX and sends thetriggered data. One can see the zero suppression path: the event is stored in the databuffers (1st stage), then used by clusterizers, then once the event is ready, the readoutblock stores it in the readout FIFO.
Top file: sim fast command core.vhd
A simulation of the Fast Command Block. Initially stimulus sends combined (Cal Pulse+ Resync) command via IPbus, duration is set to 4 BX. Later, the trigger source isswitched to the Test Pulse FSM (see Section 3.6), and 5 sequencies (Resync → CalPulse → L1A) are sent.
Top file: sim command processor core.vhd
A simulation of the Command Processor Core (I2C Processor). An I2C read transaction is sent. The simulation sends the automatic reply to the processor block, which then has to be stored in the reply FIFO and checked by the stimulus.
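To make the check of a reply word concrete, the sketch below unpacks a 32-bit word using the masks of the ctrl command i2c reply fields from appendix A. The dictionary keys and function names are hypothetical, and the example word is made up for illustration; it is not taken from the test bench.

```python
# Masks copied from the "ctrl command i2c reply ..." rows of appendix A.
# The key names are illustrative, not firmware identifiers.
REPLY_FIELDS = {
    "type": 0xF0000000,
    "hybrid_id": 0x07800000,
    "chip_id": 0x007C0000,
    "register": 0x0000FF00,
    "data": 0x000000FF,
}


def extract(word, mask):
    """Return the field selected by mask, shifted down to bit 0."""
    shift = (mask & -mask).bit_length() - 1
    return (word & mask) >> shift


def decode_reply(word):
    """Split one reply FIFO word into its named fields."""
    return {name: extract(word, mask) for name, mask in REPLY_FIELDS.items()}


# Made-up reply: hybrid 3, chip 5, register 0x2A, data byte 0x7F.
example = (3 << 23) | (5 << 18) | (0x2A << 8) | 0x7F
print(decode_reply(example))
```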
4.1.2. Testing with emulators
As discussed in this section’s introduction, the presence of the emulators makes it possible to compile the FW and take data as if a real chip were connected. Emulations of both a CBC3 and an MPA chip are available in the FW.
The FW can be compiled so that the emulator runs within the same FPGA as the d19c FW. It can also be configured so that only the emulator runs on a second FPGA; the FC7 running the d19c FW is then physically connected to this other FC7, mimicking the real-life situation as closely as possible (see figure 21).
Both emulations have an i2c slave, respond to the fast command triggers and produce l1 data accordingly. Stub data is sent continuously. When the emulator runs on a 2nd FC7, it generates the 40MHz clock from the 320MHz clock and the fast command, as a real chip would do. This 40MHz clock drives the logic which sends the emulated data.
For the CBC3 emulator the relevant files can be found in phy CBC3Emulator.vhd. The CBC3 generator.vhd file instantiates the number of hybrids and chips to be emulated. The fast cmd.vhd block is also instantiated here, as the fast signals are distributed to all chips and hybrids in the same way. The block receives the fast command either from the output feedback [10] of the fast command OSERDES (when the emulator is run inside the same FPGA as the d19c) or from one of the FMC pins (when the emulator is run on a different FPGA). This fast command is passed through an ISERDES; bitslips are performed on the ISERDES until the first 3 bits of its output correspond to the 110 pattern, and the 40MHz clock is generated from the sync pattern. Looking further into the CBC3 emulator one arrives at the description of a single CBC3 chip in CBC3 top.vhd, which instantiates an i2c slave and the outputs of the stub and l1 data. An attempt was made here to emulate, to some extent, the physical working of a chip: there is output from the i2c slave to the l1 and stub data processing blocks. This communication represents the i2c settings of a chip and how these can affect the l1 and stub output data. The stub and l1 data, which are created in bytes, are passed to fmcio mapping emulator.vhd, where the bytes at 40MHz are serialised to a single line at 320MHz. This serialisation step is not done when the emulator runs in the same FPGA as the DAQ FW; in that case the bytes are passed directly to the read-out blocks without the transition to the 320MHz regime. The MPA emulator is coded in a very similar way to the CBC3 emulator and will therefore not be discussed here.
In essence one can follow exactly the same discussion as for the CBC3 in the paragraph above, replacing the ’CBC3’ names by ’MPA’.
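The bitslip alignment on the 110 sync pattern described above can be pictured with a small software model. This is a sketch only: it treats one bitslip as a one-bit rotation of an 8-bit word (8 bits per 40MHz cycle on a 320MHz line), which is a simplification of the real ISERDES behaviour, and all names are hypothetical.

```python
SYNC_PATTERN = 0b110  # expected first three bits of each fast command frame
WORD_BITS = 8         # 320 MHz line sampled into one byte per 40 MHz cycle


def rotate_left(word, n=1):
    """Rotate an 8-bit word left by n bits (software stand-in for a bitslip)."""
    n %= WORD_BITS
    return ((word << n) | (word >> (WORD_BITS - n))) & 0xFF


def align(word):
    """Rotate until the three MSBs equal the sync pattern.

    Returns the aligned word and the number of slips that were needed.
    """
    for slips in range(WORD_BITS):
        if word >> (WORD_BITS - 3) == SYNC_PATTERN:
            return word, slips
        word = rotate_left(word)
    raise RuntimeError("sync pattern not found")


# Frame 110 00001 seen with a wrong word boundary (0b00111000).
aligned, slips = align(0b00111000)
print(f"aligned word {aligned:08b} after {slips} bitslips")
```

A single word can also contain the 110 pattern inside its payload bits, so a check of this kind on one word may lock on the wrong boundary; the sketch ignores this case.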
Table 4 in appendix D shows how one can configure the firmware to use the emulator. There are a few additional flags in the user package basic.vhd to consider:
• Implementation - has to be set to Emulation.
• output on fmc with emulator - when set to true the data generated by the emulator will be routed to the FMC pins.
• input on fmc with emulator - when set to true the emulator expects the fast command and the 320MHz clock to come from the FMC pins. This makes it possible to interface with another FPGA running the DAQ firmware.
• EMULATION TYPE - can be set to SENDER or DAQ. The presence of this flag is mainly historical: during firmware development the FMC cards needed to connect to the front-ends were not necessarily available, so connecting one FC7 to another had to be done with whatever FMC cards were at hand. In SENDER mode the firmware is compiled with an emulator inside. The DAQ mode compiles just the DAQ FW, but with an FMC mapping that is not necessarily the default one. In this way, when the IMPLEMENTATION type is EMULATION one can only configure FW for testing with an emulator set-up, and the ELECTRICAL implementation is reserved for connections to real hardware over the correct FMC cards.
Figure 21: Testing d19c firmware with external emulator.
4.1.3. Testing with real chips
Up to now FW has been tested on: 2CBC2, 8CBC2, 1CBC3, 2CBC3, 1MPA, 1SSA.
4.2. Performance
4.2.1. Resource utilization
A dedicated resource utilization report was prepared for different hardware options. It is available in the doc/resource utilization folder of the GitLab repository and will be kept up-to-date with the most recent hardware options available.
A. IPBus register table
Register name | Register address | Mask | Description
clock source | 00000004 | 00030000 | selecting the source of the clock (system-wise: 0 - AMC13, 2 - backplane, 3 - internal oscillator (default))
system fmc l12 pwr en | 00000005 | 10000000 | power to the l12 fmc
system fmc l8 pwr en | 00000005 | 20000000 | power to the l8 fmc
system fmc pg c2m | 00000005 | 40000000 | c2m
i2c settings | 0000000d | ffffffff | SYS I2C: settings
i2c enable | 0000000d | 00000800 | SYS I2C: enable
i2c bus select | 0000000d | 00000400 | SYS I2C: select the bus
i2c prescaler | 0000000d | 000003ff | SYS I2C: speed
i2c command | 0000000e | ffffffff | SYS I2C: i2c command
i2c strobe | 0000000e | 80000000 | SYS I2C: mode
i2c mode16 | 0000000e | 02000000 | SYS I2C: -
i2c mem | 0000000e | 01000000 | SYS I2C: -
i2c write | 0000000e | 00800000 | SYS I2C: write/read
i2c slvaddr 7b | 0000000e | 007f0000 | SYS I2C: address of the slave
i2c regaddr | 0000000e | 0000ff00 | SYS I2C: register
i2c wrdata | 0000000e | 000000ff | SYS I2C: data
i2c reply | 0000000f | ffffffff | SYS I2C: i2c reply
i2c reply status | 0000000f | 0c000000 | SYS I2C: status
i2c reply error | 0000000f | 08000000 | SYS I2C: error
i2c reply done | 0000000f | 04000000 | SYS I2C: done
i2c reply 8b | 0000000f | 000000ff | SYS I2C: short reply
i2c reply 16b | 0000000f | 0000ffff | SYS I2C: long reply
ipb daq system cnfg | 40010000 | ffffffff | Base for the configuration registers
cnfg clock ext clk en | 40010001 | 00000001 | 1 - using external clock, 0 - using clock from clock source register
cnfg ttc enable | 40010002 | 00000001 | Enable the TTC handling (disabled by default)
cnfg amc13 sw TTS state valid | 40010003 | 00000002 | Enable the AMC13
cnfg amc13sw TTS state | 40010003 | 000000f0 | Set the AMC13 state(??)
cnfg hybrid enable | 40010005 | 0000ffff | Mask of the enabled hybrids: 0x0021 would mean that hybrids 0 and 5 are enabled. By default all enabled.
cnfg chips enable hyb 00 | 40010006 | 000000ff | Chips enable mask for hybrid 0
cnfg chips enable hyb 01 | 40010006 | 0000ff00 | Chips enable mask for hybrid 1
cnfg chips enable hyb 02 | 40010006 | 00ff0000 | Chips enable mask for hybrid 2
cnfg chips enable hyb 03 | 40010006 | ff000000 | Chips enable mask for hybrid 3
cnfg chips enable hyb 04 | 40010007 | 000000ff | Chips enable mask for hybrid 4
cnfg chips enable hyb 05 | 40010007 | 0000ff00 | Chips enable mask for hybrid 5
cnfg chips enable hyb 06 | 40010007 | 00ff0000 | Chips enable mask for hybrid 6
cnfg chips enable hyb 07 | 40010007 | ff000000 | Chips enable mask for hybrid 7
cnfg chips enable hyb 08 | 40010008 | 000000ff | Chips enable mask for hybrid 8
cnfg chips enable hyb 09 | 40010008 | 0000ff00 | Chips enable mask for hybrid 9
cnfg chips enable hyb 10 | 40010008 | 00ff0000 | Chips enable mask for hybrid 10
cnfg chips enable hyb 11 | 40010008 | ff000000 | Chips enable mask for hybrid 11
cnfg chips enable hyb 12 | 40010009 | 000000ff | Chips enable mask for hybrid 12
cnfg chips enable hyb 13 | 40010009 | 0000ff00 | Chips enable mask for hybrid 13
cnfg chips enable hyb 14 | 40010009 | 00ff0000 | Chips enable mask for hybrid 14
cnfg chips enable hyb 15 | 40010009 | ff000000 | Chips enable mask for hybrid 15
cnfg i2c settings map slave 0 config | 40011000 | ffffffff | Configuration of the I2C Slave 0
cnfg i2c settings map slave 0 config i2c address | 40011000 | fe000000 | mask: i2c address of the slave
cnfg i2c settings map slave 0 config stop for rd en | 40011000 | 01000000 | mask: enables stop condition before sending repeated start
cnfg i2c settings map slave 0 config nack en | 40011000 | 00800000 | mask: negative ACK of the I2C slave (for the CBC, MPA, SSA should be 0)
cnfg i2c settings map slave 0 config register address nbytes | 40011000 | 00000C00 | mask: number of bytes in the register address
cnfg i2c settings map slave 0 config data wr nbytes | 40011000 | 000003E0 | mask: number of bytes for write transactions
cnfg i2c settings map slave 0 config data rd nbytes | 40011000 | 0000001F | mask: number of bytes for read transactions
cnfg i2c settings map slave 1 config | 40011001 | ffffffff | Configuration of the I2C Slave 1
cnfg i2c settings map slave 2 config | 40011002 | ffffffff | ...
cnfg i2c settings map slave 3 config | 40011003 | ffffffff | ...
cnfg i2c settings map slave 4 config | 40011004 | ffffffff | ...
cnfg i2c settings map slave 5 config | 40011005 | ffffffff | ...
cnfg i2c settings map slave 6 config | 40011006 | ffffffff | ...
cnfg i2c settings map slave 7 config | 40011007 | ffffffff | ...
cnfg i2c settings map slave 8 config | 40011008 | ffffffff | ...
cnfg i2c settings map slave 9 config | 40011009 | ffffffff | ...
cnfg i2c settings map slave 10 config | 4001100A | ffffffff | ...
cnfg i2c settings map slave 11 config | 4001100B | ffffffff | ...
cnfg i2c settings map slave 12 config | 4001100C | ffffffff | ...
cnfg i2c settings map slave 13 config | 4001100D | ffffffff | ...
cnfg i2c settings map slave 14 config | 4001100E | ffffffff | ...
cnfg i2c settings map slave 15 config | 4001100F | ffffffff | ...
cnfg i2c settings map slave 16 config | 40011010 | ffffffff | ...
cnfg i2c settings map slave 17 config | 40011011 | ffffffff | ...
cnfg i2c settings map slave 18 config | 40011012 | ffffffff | ...
cnfg i2c settings map slave 19 config | 40011013 | ffffffff | ...
cnfg i2c settings map slave 20 config | 40011014 | ffffffff | ...
cnfg i2c settings map slave 21 config | 40011015 | ffffffff | ...
cnfg i2c settings map slave 22 config | 40011016 | ffffffff | ...
cnfg i2c settings map slave 23 config | 40011017 | ffffffff | ...
cnfg i2c settings map slave 24 config | 40011018 | ffffffff | ...
cnfg i2c settings map slave 25 config | 40011019 | ffffffff | ...
cnfg i2c settings map slave 26 config | 4001101A | ffffffff | ...
cnfg i2c settings map slave 27 config | 4001101B | ffffffff | ...
cnfg i2c settings map slave 28 config | 4001101C | ffffffff | ...
cnfg i2c settings map slave 29 config | 4001101D | ffffffff | ...
cnfg i2c settings map slave 30 config | 4001101E | ffffffff | ...
cnfg i2c settings map slave 31 config | 4001101F | ffffffff | Configuration of the I2C Slave 31
cnfg fast triggers to accept | 40012000 | ffffffff | number of triggers to accept (0 - continuous triggering)
cnfg fast user frequency | 40012001 | ffffffff | frequency of the trigger source=3 (1...1000 kHz range).
cnfg fast source | 40012002 | 0000000f | trigger source: 1 - TTC, 2 - Stubs, 3 - User Trigger, 4 - TLU, 5 - External trigger, 6 - test pulse FSM, 7 - UIB antenna FSM, 8 - Consecutive triggers FSM
cnfg fast stub mask | 40012003 | 0000ffff | mask of the hybrids to trigger stubs (useful for coincidence)
cnfg fast stub veto length | 40012003 | 01ff0000 | on stubs self-triggering: number of 40MHz clock cycles to raise VETO and ignore the coming triggers
cnfg fast delay after fast reset | 40012004 | ffffffff | trigger source 6: delay after the fast reset command
cnfg fast delay after test pulse | 40012005 | ffffffff | trigger source 6: delay after the test pulse
cnfg fast delay before next pulse | 40012006 | ffffffff | trigger source 6: delay after the l1a before the next fast reset (if enabled)
cnfg fast stub trigger delay | 40012007 | 000001ff | setting the delay of the stub trigger (trigger source 2)
cnfg fast ext trigger delay value | 40012007 | 0003fe00 | delay of the external trigger (trigger source 4,5)
cnfg fast delay after antenna trigger | 40012007 | 07fc0000 | antenna FSM: delay after antenna trigger
cnfg fast delay between consecutive trigeers | 40012008 | 0000ffff | consecutive FSM: delay between two consecutive L1A signals
cnfg fast backpressure enable | 40012009 | 00000001 | enable internal backpressure handling. Should be enabled by default, and disabled only if the triggering system knows how to handle d19c backpressure
cnfg fast stubor enable | 40012009 | 00000002 | trigger source 2: 1 - trigger on StubOR, 0 - trigger on HitOR
cnfg fast initial fast reset enable | 40012009 | 00000004 | enable initial fast reset after sending start trigger command
cnfg fast tp fsm fast reset en | 40012009 | 00000010 | trigger source 6: enable fast reset signal
cnfg fast tp fsm test pulse en | 40012009 | 00000020 | trigger source 6: enable cal pulse signal
cnfg fast tp fsm l1a en | 40012009 | 00000040 | trigger source 6: enable l1a signal
cnfg phy slvs raw mode en | 40013000 | 00000001 | 1 - outputs 20000 clock cycles of the raw data from the stub lines, 0 - outputs parsed counters for the MPA/SSA chips
cnfg phy i2c freq | 40013001 | 0000000f | setting of the i2c frequency clock. (d0, 0.1MHz), (d1, 0.5MHz), (d2, 0.4MHz), (d3, 0.8MHz), (d4, 1MHz), (d5, 2MHz), (d6, 4MHz), (d7, 5MHz), (d8, 8MHz), (d9, 10MHz)
cnfg phy i2c master en | 40013001 | 00000100 | enable I2C master, 1 by default
cnfg phy SSA gen delay trig data | 40013004 | 00001f00 | For testing of the single MPA chip. Register controls the delay (25ns units) with which the fake SSA like l1 data is sent with respect to the l1 trigger.
cnfg phy SSA gen delay stub data | 40013004 | 001f0000 | For testing of the single MPA chip. Register controls the delay (25ns units) with which the fake SSA like stub data is sent with respect to the test pulse trigger.
cnfg phy SSA enable gen l1 data | 40013004 | 01000000 | For testing of the single MPA chip. When 1 the sending of SSA like l1 data is enabled.
cnfg phy SSA gen left lateral data format | 40013005 | 000000ff | For testing of the single SSA chip. Register contains the value that will be sent to the left lateral data line on reception of a test pulse trigger.
cnfg phy SSA gen right lateral data format | 40013005 | 0000ff00 | For testing of the single SSA chip. Register contains the value that will be sent to the right lateral data line on reception of a test pulse trigger.
cnfg phy SSA gen delay lateral data | 40013005 | 001f0000 | For testing of the single SSA chip. Register controls the delay (25ns units) with which the fake SSA like lateral data is sent with respect to the test pulse trigger.
cnfg phy SSA gen stub data format 0 0 | 40013006 | ffffffff | For testing of the single MPA chip. Register to change the data content of the stub data sent by the SSA data generator. What you put here as content will be put on the stub data line. For every line (there are 8 in total) you can configure an event which is 64 bits long, so this is 8 25ns clock cycles. The MSB from i 0 will be output first until the LSB of i 0, followed by i 1 MSB until i 1 LSB, this for every line.
cnfg phy SSA gen stub data format 0 1 | 40013007 | ffffffff | See above
cnfg phy SSA gen stub data format 1 0 | 40013008 | ffffffff | See above
cnfg phy SSA gen stub data format 1 1 | 40013009 | ffffffff | See above
cnfg phy SSA gen stub data format 2 0 | 4001300a | ffffffff | See above
cnfg phy SSA gen stub data format 2 1 | 4001300b | ffffffff | See above
cnfg phy SSA gen stub data format 3 0 | 4001300c | ffffffff | See above
cnfg phy SSA gen stub data format 3 1 | 4001300d | ffffffff | See above
cnfg phy SSA gen stub data format 4 0 | 4001300e | ffffffff | See above
cnfg phy SSA gen stub data format 4 1 | 4001300f | ffffffff | See above
cnfg phy SSA gen stub data format 5 0 | 40013010 | ffffffff | See above
cnfg phy SSA gen stub data format 5 1 | 40013011 | ffffffff | See above
cnfg phy SSA gen stub data format 6 0 | 40013012 | ffffffff | See above
cnfg phy SSA gen stub data format 6 1 | 40013013 | ffffffff | See above
cnfg phy SSA gen stub data format 7 0 | 40013014 | ffffffff | See above
cnfg phy SSA gen stub data format 7 1 | 40013015 | ffffffff | See above
cnfg phy SSA gen l1 data format 0 | 40013016 | 00ffffff | For testing of the single MPA chip. Registers to change the data content of the L1 data sent by the SSA data generator on arrival of l1 trigger. Channel data 119 downto 96, so 8 MSBs are not used in this register
cnfg phy SSA gen l1 data format 1 | 40013017 | ffffffff | See above. Channel data 95 downto 64.
cnfg phy SSA gen l1 data format 2 | 40013018 | ffffffff | See above. Channel data 63 downto 32.
cnfg phy SSA gen l1 data format 3 | 40013019 | ffffffff | See above. Channel data 31 downto 0.
cnfg phy SSA gen HIP data format | 4001301a | 00ffffff | See above. Content of the HIP flags.
cnfg phy SSA gen offset SSA BX cnt format | 4001301b | 000001ff | For testing of the single MPA chip. Register to change the offset value of the BX counter of the L1 data sent by the SSA data generator. After a resync (= fast reset) this counter is set to the value defined here. After a reset of the FW the counter starts at 1. So in general this BX counter starts counting at 1, increments by 1 when it receives a trigger, and is set to the value specified here when it receives a resync.
cnfg readout packet nbr | 40015000 | 0000ffff | number of events in one packet, 0 means 1 event
cnfg readout data handshake enable | 40015001 | 00000001 | enable data handshake - should be enabled
cnfg readout int trig enable | 40015001 | 00000004 | enabling internal emulator to generate trigger to the backend processor block
cnfg readout int trig rate | 40015001 | 000001f0 | internal emulator trigger rate
cnfg readout zero suppression enable | 40015001 | 00000200 | enable zero suppression
cnfg readout trigger type | 40015001 | 0000f000 | selecting the internal emulator as the trigger source (0 - normal l1a, 1 - emulated trigger)
cnfg readout data type | 40015001 | 000f0000 | 0 - normal data, 1 - pattern generated internally in the readout block
cnfg readout common stubdata delay | 40015001 | 1ff00000 | latency of the stub data
cnfg dio5 en | 40016005 | 00000001 | enable DIO5 block - disabled by default
cnfg dio5 ch1 sel | 40016000 | ffffffff | Configuration of the channel 1
cnfg dio5 ch1 out en | 40016000 | 00000001 | 1 - Output, 0 - Input
cnfg dio5 ch1 term en | 40016000 | 00000002 | Enable 50 Ohm termination
cnfg dio5 ch1 threshold | 40016000 | 0000ff00 | 0-255 threshold
cnfg dio5 ch2 sel | 40016001 | ffffffff | Configuration of the channel 2
cnfg dio5 ch3 sel | 40016002 | ffffffff | Configuration of the channel 3
cnfg dio5 ch4 sel | 40016003 | ffffffff | Configuration of the channel 4
cnfg dio5 ch5 sel | 40016004 | ffffffff | Configuration of the channel 5
cnfg tlu handshake mode | 40017000 | 00000003 | TLU handshake mode: 0 - no handshake, 1 - simple handshake, 2 - data handshake
cnfg tlu enabled | 40017000 | 00000010 | Enable TLU block
ipb daq system ctrl | 40020000 | ffffffff | Base for the control registers
clk 40 reset | 40020001 | 00000001 | Clock generator block reset 40MHz clock
ref clk reset | 40020001 | 00000004 | Reset Reference clock generator
ttc dec reset | 40020002 | 00000001 | Reset TTC counters
amc13 link reset | 40020003 | 00000001 | Reset AMC13 link
ctrl command global reset | 40021000 | 00000001 | Global FC7 reset
ctrl command i2c | 40021001 | ffffffff | I2C Master control
ctrl command i2c reset | 40021001 | 00000001 | Reset I2C master
ctrl command i2c reset fifos | 40021001 | 00000004 | Reset I2C command/reply FIFOs
ctrl command i2c command fifo | 40021002 | ffffffff | Command FIFO
ctrl command i2c command type | 40021002 | f0000000 | Command type: 0 - send command to certain chip/hybrid, 1 - send to all chips of certain hybrid, 2 - send to all chips/hybrids, 3 - send native broadcast to certain hybrid, 4 - send native broadcast to all hybrids, 8 - send command to the MPA SSA Board Master
ctrl command i2c command word id | 40021002 | 08000000 | word id: 0/1 alternating
ctrl command i2c command hybrid id | 40021002 | 07800000 | the id of the hybrid
ctrl command i2c command chip id | 40021002 | 007C0000 | slave id
ctrl command i2c command readback | 40021002 | 00020000 | readback the write command
ctrl command i2c command read | 40021002 | 00010000 | 1 - read, 0 - write
ctrl command i2c command register | 40021002 | 0000ffff | register address
ctrl command i2c command page | 40021002 | 00000100 | CBC page
ctrl command i2c command data | 40021002 | 00ffffff | data to write
ctrl command i2c reply fifo | 40021003 | ffffffff | Reply FIFO
ctrl command i2c reply type | 40021003 | f0000000 | command type
ctrl command i2c reply hybrid id | 40021003 | 07800000 | hybrid id
ctrl command i2c reply chip id | 40021003 | 007C0000 | chip id
ctrl command i2c reply register | 40021003 | 0000ff00 | register address
ctrl command i2c reply data | 40021003 | 000000ff | data byte 1
ctrl fast | 40022000 | ffffffff | TBP
ctrl fast reset | 40022000 | 00000001 | reset fast block
ctrl fast start | 40022000 | 00000002 | start triggering
ctrl fast stop | 40022000 | 00000004 | stop triggering
ctrl fast load config | 40022000 | 00000008 | load the configuration from the ipbus registers
ctrl fast reset test pulse | 40022000 | 00000010 | reset test pulse FSM
ctrl fast signal fast reset | 40022000 | 00010000 | send single fast reset from ipbus
ctrl fast signal test pulse | 40022000 | 00020000 | send single cal pulse from ipbus
ctrl fast signal trigger | 40022000 | 00040000 | send single l1a from ipbus
ctrl fast signal orbit reset | 40022000 | 00080000 | send single bc0 from ipbus
ctrl fast signal i2c refresh | 40022000 | 00100000 | send i2c refresh to CBC2
ctrl fast signal duration | 40022000 | f0000000 | duration of the fast signal 0 to 15, has to be encoded in parallel
ctrl phy chip hard reset | 40023000 | 00000001 | hard reset of the chip
ctrl phy phase tune again | 40023000 | 00000002 | do the phase tuning again
ctrl slvs debug fifo1 data | 40023001 | ffffffff | slvs debug fifo 1
ctrl slvs debug fifo2 data | 40023002 | ffffffff | slvs debug fifo 2
ctrl phy fast cmd phase | 40023003 | 000000ff | Change the phase of the fast cmd wrt the 320MHz clock going to the chip. The unit of the phase is 1/168th of the 320MHz clock cycle. Watch out: this is an incremental process.
ctrl phy ssa gen trig phase | 40023003 | 0000ff00 | For testing of the single MPA chip. Change the phase of the fake L1 like data going to the MPA wrt the 320MHz clock going to the chip. The unit of the phase is 1/168th of the 320MHz clock cycle. Watch out: this is an incremental process.
ctrl phy ssa gen stub phase | 40023003 | 00ff0000 | For testing of the single MPA chip. Change the phase of the fake stub data going to the MPA wrt the 320MHz clock going to the chip. All 8 lines are phase shifted by the same amount. The unit of the phase is 1/168th of the 320MHz clock cycle. Watch out: this is an incremental process.
ctrl phy ssa gen lateral phase 1 | 40023004 | 000000ff | For testing of the single SSA chip. Change the phase of the fake left lateral data going to the SSA chip. The unit of the phase is 1/168th of the 320MHz clock cycle. Watch out: this is an incremental process.
ctrl phy ssa gen lateral phase 2 | 40023004 | 0000ff00 | For testing of the single SSA chip. Change the phase of the fake right lateral data going to the SSA chip. The unit of the phase is 1/168th of the 320MHz clock cycle. Watch out: this is an incremental process.
ctrl readout reset | 40025000 | 00000001 | reset readout block. WARNING: it is not clearing
ctrl readout done | 40025000 | 00000002 | ...
ctrl readout run fifo | 40025001 | ffffffff | fifo readout
ctrl dio5 load config | 40026000 | 00000001 | set dio5 thresholds
ipb daq system stat | 40030000 | ffffffff | Base for the status registers
stat error full | 40030001 | 00000fff | i2c error block FIFO
stat error block id | 40030001 | 0000000f | block id
stat error code | 40030001 | 00000ff0 | error code
stat error counter | 40030002 | 000001ff | number of errors in the FIFO
stat firmware implementation | 40030003 | 0000000f | implementation type: 0 - optical, 1 - electrical, 2 - emulation
stat firmware chip type | 40030003 | 000000f0 | chip type: 0 - CBC2, 1 - CBC3, 2 - MPA, 3 - SSA
stat firmware num hybrids | 40030003 | 0000ff00 | number of hybrids
stat firmware num chips | 40030003 | 00ff0000 | number of chips per hybrid
stat firmware fmc1 type | 40030003 | 0f000000 | fmc1 (top) type: 0 - none, 1 - DIO5, 2 - 2xCBC2, 3 - 8xCBC2, 4 - 2xCBC3, 5 - 8xCBC3 FMC1, 6 - 8xCBC3 FMC2, 7 - 1xCBC3, 8 - MPA SSA Board (Rutgers), 9 - Fermi Trigger Board, 14 - OptoQuad
stat firmware fmc2 type | 40030003 | f0000000 | fmc2 (bottom) type, see above
stat clk 40 locked | 40030004 | 00000001 | locked the MMCM
stat ref clk locked | 40030004 | 00000002 | reference clock MMCM locked
stat ttc rdy | 40030005 | 00000001 | TTC block ready
stat ttc dec single err cnt | 40030006 | 0000ffff | ...
stat ttc dec double err cnt | 40030006 | ffff0000 | ...
stat command i2c fsm | 40031001 | 0000000f | Status of the I2C FSM
stat command i2c fifo commands empty | 40031002 | 00000001 | I2C Commands FIFO Empty
stat command i2c fifo commands full | 40031002 | 00000002 | I2C Commands FIFO Full
stat command i2c fifo replies empty | 40031003 | 00000001 | I2C Replies FIFO Empty
stat command i2c fifo replies full | 40031003 | 00000002 | I2C Replies FIFO Full
stat command i2c nreplies present | 40031004 | ffffffff | Number of I2C replies available
stat fast fsm configured | 40032000 | 00000040 | 1 when fast command block is configured
stat fast fsm state | 40032000 | 00000030 | status of the triggering FSM
stat fast fsm source | 40032000 | 0000000f | trigger source
stat fast trigger in counter | 40032001 | ffffffff | trigger in counter - not considering backpressure, so may be bigger than real amounts of triggers sent
stat phy phase tuning done | 40033002 | 00010000 | phase tuning done signal (for all chips)
stat phy phase fsm state chip0 | 40033002 | 00f00000 | status of the tuning fsm, chip 0: 1 - idle (means that chip is in the reset state, or no clock coming to the chip), 4 - tuned, 5 - failed
stat phy phase fsm state chip1 | 40033002 | 0f000000 | same for chip 1
stat phy phase fsm state chip0 ssa lateral l | 40033002 | 0000000f | status of the phase tuning for ssa lateral data l
stat phy phase fsm state chip0 ssa lateral r | 40033002 | 000000f0 | status of the phase tuning for ssa lateral data r
stat slvs debug general | 4003300F | ffffffff | TBP
stat slvs debug mpa l1 ready | 4003300F | 00000001 | SLVS debug L1 line buffered
stat slvs debug mpa stub ready | 4003300F | 00000002 | SLVS debug stub data buffered
stat slvs debug mpa counters ready | 4003300F | 00000004 | SLVS debug - counters buffering done
stat slvs debug mpa counters store fsm state | 4003300F | 000000f0 | TBP
stat slvs debug mpa l1 0 | 40033010 | ffffffff | L1 data first register - next 49 are booked
stat slvs debug mpa stub 0 | 40033042 | ffffffff | Stub data - next 79 are booked
stat slvs debug lateral 0 | 40033092 | ffffffff | Lateral data
evnt cnt | 40034000 | 00ffffff | L1 counter (last value)
evnt cnt buf empty | 40034000 | 01000000 | L1 counter FIFO empty
data payload buf empty | 40034000 | 02000000 | Data Payload FIFO empty (all FIFO’s together)
trigdata buf empty | 40034000 | 04000000 | Triggered data buffer empty
frame cnt chip1 | 40034001 | ffffffff | frame counters for chip 0
frame cnt chip2 | 40034002 | ffffffff | ...
frame cnt chip3 | 40034003 | ffffffff | ...
frame cnt chip4 | 40034004 | ffffffff | ...
frame cnt chip5 | 40034005 | ffffffff | ...
frame cnt chip6 | 40034006 | ffffffff | ...
frame cnt chip7 | 40034007 | ffffffff | ...
frame cnt chip8 | 40034008 | ffffffff | ...
readout req | 40035000 | 00000001 | Readout Requirement for the handshake mode
fsm status | 40035000 | 00000ff0 | Readout FSM Status
words cnt | 40035001 | ffffffff | Number of words in the readout FIFO
stat dio5 not ready | 40036000 | 00000001 | Still configuring the DIO5 thresholds
stat dio5 error | 40036000 | 00000002 | Error when configuring thresholds via I2C
cnfg mpa ssa board i2c master en | 40019000 | 00000001 | Enable bit of the MPA SSA Board I2C master
cnfg mpa ssa board i2c freq | 40019000 | 0000003C | I2C Frequency - same as for the Phy Block Master
cnfg mpa ssa board slave 0 config | 40019002 | ffffffff | Config of the Slave 0
cnfg mpa ssa board slave 0 config i2c address | 40019002 | fe000000 | I2C Address
cnfg mpa ssa board slave 0 config register address nbytes | 40019002 | 000000C0 | Register address number of bytes
cnfg mpa ssa board slave 0 config data wr nbytes | 40019002 | 00000030 | Number of write data bytes
cnfg mpa ssa board slave 0 config data rd nbytes | 40019002 | 0000000C | Number of read data bytes
cnfg mpa ssa board slave 0 config stop for rd en | 40019002 | 00000002 | Stop for RD enable
cnfg mpa ssa board slave 0 config nack en | 40019002 | 00000001 | negative ack at the end
cnfg mpa ssa board slave 1 config | 40019003 | ffffffff | ...
cnfg mpa ssa board slave 2 config | 40019004 | ffffffff | ...
cnfg mpa ssa board slave 3 config | 40019005 | ffffffff | ...
cnfg mpa ssa board slave 4 config | 40019006 | ffffffff | ...
cnfg mpa ssa board slave 5 config | 40019007 | ffffffff | ...
cnfg mpa ssa board slave 6 config | 40019008 | ffffffff | ...
cnfg mpa ssa board slave 7 config | 40019009 | ffffffff | ...
cnfg mpa ssa board slave 8 config | 4001900A | ffffffff | ...
cnfg mpa ssa board slave 9 config | 4001900B | ffffffff | ...
cnfg mpa ssa board slave 10 config | 4001900C | ffffffff | ...
cnfg mpa ssa board slave 11 config | 4001900D | ffffffff | ...
cnfg mpa ssa board slave 12 config | 4001900E | ffffffff | ...
cnfg mpa ssa board slave 13 config | 4001900F | ffffffff | ...
cnfg mpa ssa board slave 14 config | 40019010 | ffffffff | ...
cnfg mpa ssa board slave 15 config | 40019011 | ffffffff | ...
ctrl mpa ssa board reset | 40029000 | 00000001 | board reset signal (is it used??)
mpa ssa i2c request command type | 40021002 | f0000000 | MPA SSA request command type
mpa ssa i2c request word id | 40021002 | 0C000000 | word id
mpa ssa i2c request word0 slave id | 40021002 | 03E00000 | slave id
mpa ssa i2c request word0 board id | 40021002 | 00100000 | board id
mpa ssa i2c request word0 read | 40021002 | 00010000 | read
mpa ssa i2c request word0 register | 40021002 | 0000ffff | register address
mpa ssa i2c request word1 data | 40021002 | 00ffffff | data to write
mpa ssa i2c reply slave id | 40021003 | f8000000 | reply slave id
mpa ssa i2c reply board id | 40021003 | 04000000 | board id
mpa ssa i2c reply err | 40021003 | 01000000 | error
mpa ssa i2c reply data | 40021003 | 00ffffff | data back
mpa ssa i2c wrapper state | 40039000 | 0000000f | state of the I2C wrapper of the mpa ssa board
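Every row of the table pairs a register address with a bit mask, so reading or writing one field reduces to a mask-and-shift operation on the 32-bit register value. The sketch below illustrates this for the hybrid enable mask, the stat firmware fields and the i2c address field of a slave configuration word. All Python names are hypothetical and the example register values are made up; only the masks are taken from the table.

```python
def field_shift(mask):
    """Bit position of the least significant set bit of a mask."""
    return (mask & -mask).bit_length() - 1


def get_field(word, mask):
    """Read the field selected by mask from a 32-bit register value."""
    return (word & mask) >> field_shift(mask)


def set_field(word, mask, value):
    """Return word with the field selected by mask replaced by value."""
    shifted = value << field_shift(mask)
    if shifted & ~mask:
        raise ValueError("value does not fit in the field")
    return (word & ~mask & 0xFFFFFFFF) | shifted


def enabled_hybrids(word):
    """List the bit indices set in the 16-bit cnfg hybrid enable mask."""
    return [i for i in range(16) if (word >> i) & 1]


# Masks of the stat firmware fields (register 40030003).
STAT_FIRMWARE_MASKS = {
    "implementation": 0x0000000F,
    "chip_type": 0x000000F0,
    "num_hybrids": 0x0000FF00,
    "num_chips": 0x00FF0000,
    "fmc1_type": 0x0F000000,
    "fmc2_type": 0xF0000000,
}


def decode_firmware_status(word):
    return {name: get_field(word, mask) for name, mask in STAT_FIRMWARE_MASKS.items()}


print(enabled_hybrids(0x0021))               # bits 0 and 5 are set
print(decode_firmware_status(0x00020112))    # made-up status word
print(hex(set_field(0, 0xFE000000, 0x5F)))   # i2c address field of a slave config
```

Deriving the shift from the mask keeps the table as the only place where bit positions are defined.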
B. I2C Transaction Format
C. Practicalities about the VHDL code
• In the code, files or signals which have trig in the name have something to do with the l1 or triggered data stream.
• The code, and also this document, uses the name fast command to refer to the T1 command; this is the 320 MHz line sent from the FPGA to the chips, which carries the fast reset, l1 trigger, ...
• Most of the type definitions can be found in fw/src/usr/usr/user package basic.vhd
D. FW compilation settings using the emulator.
NUM HYBRIDS | NUM CHIPS | CHIP | IMPLEMENTATION | FMC1 | FMC2 | output on fmc with emulator | input on fmc with emulator | EMULATION TYPE | SLVS DEBUG ENABLED | description
1 | 2 | CBC3 | EMULATION | FMC NONE | FMC NONE | FALSE | FALSE | DAQ | FALSE | 1 FC7 set-up, this FPGA runs the DAQ and the lines are internally connected to the emulator
1 | 2 | CBC3 | EMULATION | FMC 2CBC2 | FMC NONE | FALSE | FALSE | DAQ | FALSE | 2 FC7 set-up, this FPGA runs the DAQ, connection with 2CBC2 FMC
1 | 2 | CBC3 | EMULATION | FMC 2CBC2 | FMC NONE | TRUE | TRUE | SENDER | FALSE | 2 FC7 set-up, this FPGA runs the emulator, connection with 2CBC2 FMC
1 | 1 | MPA | EMULATION | FMC NONE | FMC NONE | FALSE | FALSE | DAQ | TRUE | 1 FC7 set-up, this FPGA runs the DAQ and the lines are internally connected to the emulator
1 | 1 | MPA | EMULATION | FMC 2CBC2 | FMC NONE | FALSE | FALSE | DAQ | TRUE | 2 FC7 set-up, this FPGA runs the DAQ, connection with 2CBC2 FMC
1 | 1 | MPA | EMULATION | FMC 2CBC2 | FMC NONE | TRUE | TRUE | SENDER | FALSE | 2 FC7 set-up, this FPGA runs the emulator, connection with 2CBC2 FMC
1 | 1 | MPA | EMULATION | FMC MPA SSA BOARD | FMC NONE | TRUE | TRUE | SENDER | FALSE | 2 FC7 set-up, this FPGA runs the emulator, connection over the standard FMC for connection to a single MPA or SSA chip. The DAQ FW for this set-up is the default one specified in table ??
Table 4: FW compilation settings for testing with emulator.
E. Practicalities about the HW
The FC7 can be run in 2 different configurations, either in a crate (figure 22; see footnote 2) or in a table top set-up (figure 23). For simple test stands the table top solution is preferred as it is much less expensive. For larger systems where multiple FC7s need to run at the same time a crate set-up is required.
Running in crate requires the following parts:
• uTCA crate with an AMC13 compatible backplane.
• uTCA power module.
• MCH shelf manager.
• Optionally an AMC13 (at the time of writing (3 March 2018) the set-up does not use the AMC13 board yet).
Running on table top requires the following parts:
• Table top interconnect card developed for FC7.
• Ethernet adapter
• Power adapter
In both cases a uSD card for the FC7 is also useful to store FW images and, depending on the set-up, one should acquire the correct FMC cards and cables to connect to the front-end.
Figure 22: FC7 in crate. Figure 23: Table top set-up for FC7.
2 The image shows an example set-up; not all parts are mandatory for every set-up.
References
[1] FC7 Home Page - https://espace.cern.ch/project-FC7/SitePages/Home.aspx
[2] i2c for CMS PS and 2S modules - https://espace.cern.ch/CMS-MPA/SiteAssets/SitePages/Documents/I2C_bus_specifications_V2_0.pdf
[3] Fast commands for outer tracker - https://espace.cern.ch/CMS-MPA/SiteAssets/SitePages/Documents/clk-T1_distribution_V3.pdf
[4] CBC2 user manual - http://www.hep.ph.ic.ac.uk/~dmray/CBC_documentation/CBC2_User_Guide_v1.1.doc
[5] CBC3 user manual - http://www.hep.ph.ic.ac.uk/ASIC/cbc3/
[6] MPA SSA home page - https://espace.cern.ch/CMS-MPA/SitePages/Documents.aspx
[7] d19c gitlab - https://gitlab.cern.ch/cms_tk_ph2/d19c-firmware
[8] RARP Package - https://pkgs.org/download/rarpd
[9] IPBus register allocation memory - https://gitlab.cern.ch/cms_tk_ph2/d19c-firmware/blob/master/doc/IPBus_Tutorial.md
[10] Xilinx UG471 7 Series FPGAs SelectIO Resources - https://www.xilinx.com/support/documentation/user_guides/ug471_7Series_SelectIO.pdf
[11] LVDS 4x Asynchronous Oversampling Using 7 Series FPGAs and Zynq-7000 AP SoCs - https://www.xilinx.com/support/documentation/application_notes/xapp523-lvds-4x-asynchronous-oversampling.pdf