
CERN LHCC 2003-031
LHCb TDR 10
9 September 2003

LHCb
Trigger System Technical Design Report

Printed at CERN
Geneva, 2003

ISBN 92-9083-208-8

ii

iii

The LHCb Collaboration

Brasilian Center for Research in Physics, CBPF, Rio de Janeiro, Brasil
R. Antunes Nobrega, A. Franca Barbosa1), I. Bediaga, G. Cernicchiaro, E. Correa de Oliveira, J. Magnin, J. Marques de Miranda, A. Massafferri, E. Polycarpo, A. Reis

Federal University of Rio de Janeiro, UFRJ, Rio de Janeiro, Brasil
K. Akiba, S. Amato, T. da Silva, J.R.T. de Mello Neto, B. de Paula, L. de Paula, M. Gandelman, J.H. Lopes, B. Marechal, F. Marinho, D. Moraes2), N. Pelloux1), C. Perreira Nunes

LAPP Annecy, IN2P3-CNRS, Annecy-Le-Vieux, France
J. Ballansat, D. Boget, P. Delebecque, I. De Bonis, D. Decamp, C. Drancourt, N. Dumont Dayot, C. Girard, B. Lieunard, M.-N. Minard, B. Pietrzyk, H. Terrier

LPC Clermont, IN2P3-CNRS and University Blaise Pascal, Clermont-Ferrand, France
Z. Ajaltouni, G. Bohner, C. Carloganu, R. Cornat, O. Deschamps, P. Henrard, J. Lecoq, R. Lefevre, S. Monteil, P. Perret, C. Rimbault, A. Robert

CPPM Marseille, IN2P3-CNRS and University of Aix-Marseille II, Marseille, France
E. Aslanides, J.-P. Cachemiche, B. Dinkespiler, P.-Y. Duval, R. Le Gac, V. Garonne, O. Leroy, P.-L. Liotard, M. Menouni, A. Tsaregorodtsev, B. Viaud

LAL Orsay, IN2P3-CNRS and University of Paris-Sud, Orsay, France
G. Barrand, C. Beigbeder-Beau, R. Beneyton, D. Breton, O. Callot1), D. Charlet, B. D’Almagne, B. Delcourt, O. Duarte, F. Fulda Quenzer, B. Jean-Marie, J. Lefrancois, F. Machefert, P. Robbe, M.-H. Schune, V. Tocut, K. Truong

Technical University of Dresden, Dresden, Germany
R. Schwierz, B. Spaan

Max-Planck-Institute for Nuclear Physics, Heidelberg, Germany
M. Agari, C. Bauer, D. Baumeister, J. Blouw, N. Bulian, H.P. Fuchs, W. Hofmann, K.T. Knopfle, S. Lochner, A. Ludwig, M. Schmelling, B. Schwingenheuer

Physics Institute, University of Heidelberg, Heidelberg, Germany
S. Bachmann, P. Bock, H. Deppe, F. Eisele, S. Henneberger, P. Igo-Kemenes, R. Rusnyak, U. Stange, U. Trunk, M. Walter, D. Wiedner, U. Uwer

Kirchhoff Institute for Physics, University of Heidelberg, Heidelberg, Germany
I. Kisel, V. Lindenstruth, M.W. Schulz2)

Laboratori Nazionali dell’INFN, Frascati, Italy
G. Bencivenni, C. Bloise, F. Bossi, P. Campana, G. Capon, P. de Simone, C. Forti, G. Lanfranchi, F. Murtas, L. Passalacqua, V. Patera3), M. Poli Lener, A. Sciubba3)

University of Bologna and INFN, Bologna, Italy
G. Avoni, G. Balbi, M. Bargiotti, A. Bertin, M. Bruschi, A. Carbone, S. de Castro, P. Faccioli, L. Fabbri, D. Galli, B. Giacobbe, F. Grimaldi, I. Lax, U. Marconi, I. Massa, M. Piccinini, N. Semprini-Cesari, R. Spighi, V. Vagnoni, S. Vecchi, M. Villa, A. Vitale, A. Zoccoli

University of Cagliari and INFN, Cagliari, Italy
W. Bonivento, S. Cadeddu, A. Cardini, V. de Leo, C. Deplano, A. Lai, D. Raspino, B. Saitta


University of Ferrara and INFN, Ferrara, Italy
W. Baldini, V. Carassiti, A. Cotta Ramusino, P. Dalpiaz, S. Germani, A. Gianoli, M. Martini, F. Petrucci, M. Savrie

University of Florence and INFN, Florence, Italy
A. Bizzeti, M. Lenti, M. Lenzi, G. Passaleva, P.G. Pelfer, M. Veltri

University of Genoa and INFN, Genoa, Italy
S. Cuneo, F. Fontanelli, V. Gracco, G. Mini, P. Musico, A. Petrolini, M. Sannino

University of Milano-Bicocca and INFN, Milano, Italy
T. Bellunato, M. Calvi, C. Matteuzzi, M. Musy, P. Negri, D. Perego, L. Trentadue4)

University of Rome, “La Sapienza” and INFN, Rome, Italy
G. Auriemma5), V. Bocci, C. Bosio, E. Dane, D. Fidanza5), A. Frenkel, G. Martellotti, G. Penso, S. Petrarca, D. Pinci, G. Pirozzi, W. Rinaldi, R. Santacesaria, C. Satriano5), A. Satta

University of Rome, “Tor Vergata” and INFN, Rome, Italy
G. Carboni, S. de Capua, D. Domenici, R. Messi, G. Natali, L. Pacciani, E. Santovetti

NIKHEF, The Netherlands
G. van Apeldoorn(i,iii), N. van Bakel(i,ii), T.S. Bauer(i), M. van Beuzekom(i), J.F.J. van den Brand(i,ii), H.J. Bulten(i,ii), M. Doets(i), R. Hierck(i), L. Hommels(i), J. van Hunen(i), E. Jans(i), T. Ketel(i,ii), S. Klous(i,ii), M.J. Kraan(i), M. Merk(i), F. Mul(ii), J. Nardulli(i), A. Pellegrino(i), G. Raven(i,ii), H. Schuijlenburg(i), T. Sluijk(i), P. Vankov(i), J. van Tilburg(i), H. de Vries(i), L. Wiggers(i), M. Zupan(i)
(i) Foundation of Fundamental Research of Matter in the Netherlands
(ii) Free University Amsterdam
(iii) University of Amsterdam

Research Centre of High Energy Physics, Tsinghua University, Beijing, P.R.C.
M. Bisset, J.P. Cheng, Y.G. Cui, Y. Dai, Y. Gao, H.J. He, C. Huang, C. Jiang, Y.P. Kuang, Q. Li, Y.J. Li, Y. Liao, J.P. Ni, B.B. Shao, J.J. Su, Y.R. Tian, Q. Wang, Q.S. Yan

Institute of Nuclear Physics and University of Mining and Metallurgy, Krakow, Poland
K. Ciba, K. Galuszka, L. Hajduk, P. Kapusta, J. Michalowski, B. Muryn, Z. Natkaniec, A. Oblakowska-Mucha, G. Polok, M. Stodulski, M. Witek1), P. Zychowski

Soltan Institute for Nuclear Studies, Warsaw, Poland
M. Adamus, A. Chlopik, Z. Guzik, A. Nawrot, K. Syryczynski, M. Szczekowski

National Institute for Physics and Nuclear Engineering, IFIN-HH, Bucharest-Magurele, Romania
C. Coca, O. Dima, G. Giolu, C. Magureanu, M. Orlandea, S. Popescu1), A.M. Rosca6), P.D. Tarta

Institute for Nuclear Research (INR), Moscow, Russia
S. Filippov, J. Gavrilov, E. Guschin, V. Kloubov, L. Kravchuk, S. Laptev, V. Laptev, V. Postoev, G. Rybkine, A. Sadovski, I. Semeniouk, V. Strigin

Institute of Theoretical and Experimental Physics (ITEP), Moscow, Russia
S. Barsuk1), I. Belyaev1), B. Bobchenko, V. Dolgoshein, A. Golutvin, O. Gouchtchine, V. Kiritchenko, V. Kochetkov, I. Korolko, G. Pakhlova, E. Melnikov1), A. Morozov, P. Pakhlov, A. Petriaev, D. Roussinov, V. Rusinov, S. Semenov, S. Shuvalov, A. Soldatov, E. Tarkovski


Budker Institute for Nuclear Physics (INP), Novosibirsk, Russia
K. Beloborodov, A. Berdiouguine, A. Bondar, A. Bozhenok, A. Buzulutskov, S. Eidelman, V. Golubev, P. Krokovnyi, S. Oreshkin, A. Poluektov, S. Serednyakov, L. Shekhtman, B. Shwartz, Z. Silagadze, A. Sokolov, A. Vasiljev

Institute for High Energy Physics (IHEP-Serpukhov), Protvino, Russia
K. Beloous, V. Brekhovskikh, R.I. Dzhelyadin, Yu.P. Gouz, I. Katchaev, V. Khmelnikov, V. Kisselev, A. Kobelev, A.K. Konoplyannikov, A.K. Likhoded, V.D. Matveev, V. Novikov, V.F. Obraztsov, A.P. Ostankov, V. Romanovski, V.I. Rykalin, M.M. Shapkin, A. Sokolov, M.M. Soldatov, V.V. Talanov, O.P. Yushchenko

Petersburg Nuclear Physics Institute, Gatchina, St. Petersburg, Russia
G. Alkhazov, V. Andreev, B. Botchine, V. Ganja, V. Goloubev, S. Guetz, A. Kashchuk, V. Lazarev, E. Maev, O. Maev, G. Petrov, N. Saguidova, G. Sementchouk, V. Souvorov1), E. Spiridenkov, A. Vorobyov, An. Vorobyov, N. Voropaev

University of Barcelona, Barcelona, Spain
E. Aguilo, R. Ballabriga7), M. Calvo, S. Ferragut, Ll. Garrido, D. Gascon, R. Graciani Diaz, E. Grauges Pous, S. Luengo7), D. Peralta, M. Rosello7), X. Vilasis7)

University of Santiago de Compostela, Santiago de Compostela, Spain
B. Adeva, P. Conde8), C. Lois Gomez9), A. Pazos, M. Plo, J.J. Saborido, M. Sanchez Garcia, P. Vazquez Regueiro

University of Lausanne, Lausanne, Switzerland
A. Bay, B. Carron, O. Dormond, L. Fernandez, R. Frei, G. Haefeli, J.-P. Hertig, C. Jacoby, P. Jalocha, S. Jimenez-Otero, F. Legger, L. Locatelli, N. Neufeld1), J.-P. Perroud, F. Ronga, T. Schietinger, O. Schneider, L. Studer, M.T. Tran, S. Villa, H. Voss

University of Zurich, Zurich, Switzerland
R. Bernet, R.P. Bernhard, Y. Ermoline, J. Gassner, St. Heule, F. Lehner, M. Needham, P. Sievers, St. Steiner, O. Steinkamp, U. Straumann, A. Vollhardt, D. Volyanskyy, M. Ziegler10)

Institute of Physics and Technologies, Kharkiv, Ukraine
A. Dovbnya, Yu. Ranyuk, I. Shapoval

Institute for Nuclear Research, National Academy of Sciences, Kiev, Ukraine
V. Aushev, V. Kiva, I. Kolomiets, Yu. Pavlenko, V. Pugatch, Yu. Vasiliev

University of Bristol, Bristol, UK
N.H. Brook, R.D. Head, A. Muir, A. Phillips, A. Presland, F.F. Wilson

University of Cambridge, Cambridge, UK
A. Buckley, K. George, V. Gibson, K. Harrison, C.R. Jones, S.G. Katvars, J. Storey, C.P. Ward, S.A. Wotton

Rutherford Appleton Laboratory, Chilton, UK
C.J. Densham, S. Easo, B. Franek, J.G.V. Guy, R.N.J. Halsall, G. Kuznetsov, P. Loveridge, D. Morrow, J.V. Morris, A. Papanestis, G.N. Patrick, M.L. Woodward

University of Edinburgh, Edinburgh, UK
R. Chamonal, S. Eisenhardt, A. Khan, J. Lawrence, F. Muheim, S. Playfer, A. Walker


University of Glasgow, Glasgow, UK
A.G. Bates, A. MacGregor, V. O’Shea, C. Parkes, A. Pickford, M. Rahman, F.J.P. Soler11)

University of Liverpool, Liverpool, UK
S. Biagi, T. Bowcock, G. Casse, R. Gamet, M. George, D. Hutchcroft, J. Palacios, G. Patel, I. Stavitskiy, M. Tobin, A. Washbrook

Imperial College, London, UK
L. Allebone, G.J. Barber, W. Cameron, D. Clark, P. Dornan, A. Duane, U. Egede, A. Howard, S. Jolly, R. Plackett, D.R. Price, T. Savidge, D. Websdale, R. White

University of Oxford, Oxford, UK
M. Adinolfi, J.H. Bibby, M.J. Charles12), C. Cioffi, G. Damerell, N. Harnew, F. Harris, I.A. McArthur, C. Newby, J. Rademacker, L. Somerville, A. Soroko, N.J. Smale, S. Topp-Jorgensen, G. Wilkinson

CERN, Geneva, Switzerland
G. Anelli, F. Anghinolfi, N. Arnaud, F. Bal, A. Barczyk, J.C. Batista Lopes, M. Benayoun13), V. Bobillier, A. Braem, J. Buytaert, M. Campbell, M. Cattaneo, Ph. Charpentier, J. Christiansen, J. Closier, P. Collins, G. Corti, C. D’Ambrosio, H. Dijkstra, J.-P. Dufey, D. Eckstein, M. Ferro-Luzzi, W. Flegel, F. Formenti, R. Forty, M. Frank, C. Frei, C. Gaspar, P. Gavillet, A. Guirao Elias, T. Gys, F. Hahn, S. Haider, J. Harvey, J.A. Hernando Morata, E. van Herwijnen, H.J. Hilke, R. Jacobsson, P. Jarron, C. Joram, B. Jost, S. Koestner, D. Lacarrere, M. Letheren, C. Lippmann14), R. Lindner, M. Losasso, P. Mato Vila, M. Moritz, H. Muller, T. Nakada15), C. Padilla, U. Parzefall, W. Pokorski, S. Ponce, F. Ranjard, W. Riegler, G. Aglieri Rinella, E.M. Rodrigues16), D. Rodriguez de Llera Gonzalez, S. Roiser, T. Ruf, H. Ruiz Perez, B. Schmidt, T. Schneider, A. Schopper, A. Smith, F. Teubert, N. Tuning, O. Ullaland, P. Vannerem, W. Witzeling, K. Wyllie, Y. Xie

1) also at CERN, Geneva
2) now at CERN, Geneva
3) also at Dipartimento di Energetica, University of Rome, “La Sapienza”
4) also at Universita degli Studi di Parma
5) also at University of Basilicata, Potenza
6) also at Humboldt University, Berlin
7) also at Departament d’Engineria Electronica La Salle, Universitat Ramon Llull, Barcelona
8) now at DESY, Hamburg
9) now at University of Zurich
10) now at University of California, Santa Cruz
11) also at Rutherford Appleton Laboratory, Chilton
12) now at University of Iowa, Iowa City
13) now at Universites de Paris VI et VII (LPNHE), Paris
14) now at GSI, Darmstadt
15) also at Lausanne, on leave from PSI, Villigen
16) supported by a Marie Curie Fellowship under contract number HPMF-CT-2002-01708

Technical Associates Institutes
Espoo-Vantaa Institute of Technology, Espoo, Finland
Ecole d’ingenieurs, Geneva, Switzerland


Acknowledgements

The LHCb Collaboration is greatly indebted to all the technical and administrative staff for their important contributions to the design, testing and prototype activities. We are grateful for their dedicated work and are aware that the successful construction and commissioning of the LHCb experiment will also in future depend on their skills and commitment.

“Trigger” on cover page:
The compact edition of the Oxford English Dictionary
Complete text reproduced micrographically
2 volumes
Oxford University Press
(23rd edition, 1984)


Contents

1 Introduction
  1.1 Physics requirements
  1.2 Front-End Architecture Requirements
  1.3 Implementation Overview
  1.4 Organization of this Document

2 Level-0 Calorimeter Triggers
  2.1 Concepts of the L0 Calorimeter Trigger
  2.2 L0 Calorimeter Trigger performance
  2.3 ECAL and HCAL FE card
  2.4 PreShower FE card
  2.5 Validation Card
    2.5.1 ECAL candidates
    2.5.2 HCAL candidates
  2.6 SPD Multiplicity
  2.7 Backplane and links
  2.8 Selection Crate
    2.8.1 Electromagnetic Candidates
    2.8.2 SPD multiplicity
    2.8.3 HCAL
    2.8.4 Implementation
  2.9 Latency
  2.10 Debugging and Monitoring

3 Level-0 Muon Trigger
  3.1 Overview of the muon system
  3.2 Trigger implementation
  3.3 Trigger performance
    3.3.1 Low-energy background
    3.3.2 Beam halo muons
    3.3.3 Hardware parameters
  3.4 Technical Design
    3.4.1 ODE Trigger interface
    3.4.2 Processing Board
    3.4.3 Muon selection board
    3.4.4 Controller board
    3.4.5 Backplane
    3.4.6 Latency
    3.4.7 DAQ Event size
    3.4.8 Debugging and monitoring tools

4 Level-0 Pile-Up System
  4.1 Concept
  4.2 Simulation Results
  4.3 Technical Design
    4.3.1 Beetle Chip
    4.3.2 Prototype Vertex Finder Board
    4.3.3 Trigger System Architecture

5 Level-0 Decision Unit
  5.1 L0DU inputs
  5.2 L0DU overview
  5.3 L0DU Prototype
  5.4 Studies for the final implementation
  5.5 Debugging and monitoring

6 Level-1 and High Level Trigger
  6.1 Level-1 and HLT hardware implementation
    6.1.1 Architecture
    6.1.2 Implementation
  6.2 The Level-1 Algorithm
    6.2.1 VELO 2D Track Reconstruction
    6.2.2 Primary Vertex Search
    6.2.3 Level-0 Object Matching to 2D Tracks
    6.2.4 VELO 3D Track Reconstruction
    6.2.5 Level-0 Object Matching to 3D Tracks
    6.2.6 VELO TT Matching and Momentum Determination
    6.2.7 L1 timing
  6.3 The HLT Algorithm
    6.3.1 VELO tracking
    6.3.2 VELO TT tracking
    6.3.3 Long tracking
    6.3.4 Tracking summary

7 Performance Simulation
  7.1 Performance of Level-0
    7.1.1 Bandwidth Division
  7.2 Performance of Level-1
    7.2.1 Generic Algorithm
    7.2.2 Specific Algorithm
    7.2.3 Final decision
    7.2.4 Efficiencies and bandwidth division
  7.3 Performance of the High Level Trigger
    7.3.1 Level-1 Confirmation
    7.3.2 Exclusive Selection
  7.4 Trigger Performance Robustness
    7.4.1 Resolutions
    7.4.2 Execution Time and Event Size
    7.4.3 Performance

8 Project Organization
  8.1 Cost and Funding
  8.2 Schedule
  8.3 Division of Responsibilities

A Scalability of Level-1

References


Chapter 1 Introduction

The LHCb experiment [1, 2] is designed to exploit the large number of bb-pairs produced in pp interactions at √s = 14 TeV at the LHC, in order to make precise studies of CP asymmetries and rare decays in b-hadron systems. LHCb is a single-arm spectrometer covering the range¹ 1.9 < η < 4.9. The spectrometer is shown in Figure 1.1 and consists of the Vertex Locator (VELO), the Trigger Tracker (TT), the dipole magnet, two Ring Imaging Cherenkov detectors (RICH1 and RICH2), three tracking stations T1–T3, the Calorimeter system and the Muon system. LHCb uses a right-handed coordinate system with the z-axis pointing from the interaction point towards the muon chambers along the beam-line, and the y-axis pointing upwards. All of these systems are described in detail in their respective Technical Design Reports [3-10], apart from TT and the new layout of RICH1, which are described in the LHCb Reoptimization TDR [2]. The LHCb experiment plans to operate at an average luminosity of 2×10³² cm⁻²s⁻¹, i.e. much lower than the maximum design luminosity of the LHC, which makes the radiation damage more manageable. A further advantage is that at this luminosity the number of interactions per crossing is dominated by single interactions, which facilitates triggering and reconstruction by assuring low channel occupancy. Due to the LHC bunch structure and the low luminosity, the frequency of crossings with interactions visible² to the spectrometer is about 10 MHz, which has to be reduced by the trigger to a few hundred Hz, at which rate the events are written to storage for further offline analysis. This reduction is achieved in three trigger levels: Level-0 (L0), Level-1 (L1) and the High Level Trigger (HLT). Level-0 is implemented in custom electronics, while Level-1 and the HLT are executed on a farm of commodity processors.

¹The pseudo-rapidity η = −ln(tan(θ/2)), where θ is the angle between a track and the beam-line.
²An interaction is defined to be visible if it produces at least two charged particles with sufficient hits in the VELO and T1–T3 to allow them to be reconstructible.

In the remainder of this introduction the requirements from both the physics and the Front-End implementation will be given, followed by an overview of the whole trigger architecture.

1.1 Physics requirements

At a luminosity of 2×10³² cm⁻²s⁻¹ the 10 MHz of crossings with visible pp interactions are expected to contain a rate of about 100 kHz of bb-pairs. However, in only about 15% of these events will at least one B-meson have all its decay products contained in the acceptance of the spectrometer. Furthermore, the branching ratios of B-meson decays used to study CP violation are typically less than 10⁻³. The offline selections exploit the relatively large b mass and lifetime to select those b-hadrons, and stringent cuts have to be applied to enhance signal over background and thus increase the CP sensitivity of the analysis. Hence the requirement for the trigger is to achieve the highest efficiency for these offline-selected events. The trigger should be able to achieve high efficiency for a large variety of final states.


Figure 1.1: The layout of the LHCb spectrometer showing the VELO, the two RICH detectors, the four tracking stations TT and T1–T3, the 4 Tm dipole magnet, the Scintillating Pad detector (SPD), Preshower (PS), Electromagnetic (ECAL) and Hadronic (HCAL) Calorimeters, and the five muon stations M1–M5.

The trigger should allow overlapping and pre-scaled triggers. It should be possible to emulate the trigger from the data written to storage, which will give an additional handle on trigger efficiencies and possible systematics.

1.2 Front-End Architecture Requirements

A detailed description of the requirements on the Front-End (FE) electronics can be found in [11]. A schematic overview of the FE system is given in Figure 1.2. Here we limit ourselves to those requirements which have a particular influence on the design of the Level-0 and Level-1 triggers. The Level-0 and Level-1 decisions are transmitted to the FE electronics by the TTC system [12].

The latency of Level-0, which is the time elapsed between a pp interaction and the arrival of the Level-0 trigger decision at the FE, is fixed to 4 µs [13]. This time includes the time-of-flight, cable lengths and all delays in the FE, leaving 2 µs for the actual processing of the data in the Level-0 trigger to derive a decision. The FE is required to be able to read out events in 900 ns, which, combined with the 16-event-deep de-randomizer before the L1-buffer and a 1 MHz Level-0 accept rate, gives less than 0.5% deadtime [14]. Level-0 will deliver its decision every 25 ns to the Readout Supervisor [15], which emulates the L0-buffer occupancy and prevents buffer overflows by throttling the L0 accept rate.
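The quoted deadtime can be cross-checked with a toy model (added here; the occupancy bookkeeping below is a simplifying assumption, not the calculation of [14]): treat the de-randomizer as a 16-deep queue drained at one event per 900 ns, with Level-0 accepts arriving randomly at 1 MHz on the 25 ns crossing grid, and count the accepts that find the queue full.

```cpp
// Toy de-randomizer deadtime estimate: 16-deep buffer, 900 ns readout,
// 1 MHz random Level-0 accepts on a 25 ns crossing clock.
// Assumption (not from the TDR): accepts are uncorrelated per crossing.
#include <cstdio>
#include <random>

int main() {
    const double pAccept    = 1.0e6 / 40.0e6; // 1 MHz accepts out of 40 MHz crossings
    const int    depth      = 16;             // de-randomizer depth
    const int    drainTicks = 36;             // 900 ns readout = 36 crossings of 25 ns
    std::mt19937 rng(42);
    std::bernoulli_distribution accept(pAccept);

    long long accepted = 0, lost = 0;
    int occupancy = 0, busy = 0; // buffered events; ticks left on current readout
    for (long long tick = 0; tick < 40'000'000LL; ++tick) { // ~1 s of beam
        if (busy > 0 && --busy == 0) --occupancy;          // event fully read out
        if (busy == 0 && occupancy > 0) busy = drainTicks; // start next readout
        if (accept(rng)) {
            ++accepted;
            if (occupancy < depth) ++occupancy; else ++lost; // full buffer = deadtime
        }
    }
    std::printf("deadtime fraction: %.4f%%\n", 100.0 * lost / accepted);
    return 0;
}
```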

Figure 1.2: The FE architecture of LHCb.

The Level-1 trigger is a variable-latency trigger, and the data delivered to the trigger processors by the FE system must be delivered in chronological order and tagged with bunch and event identifiers. Despite the variable latency, Level-1 delivers a decision for each event in the same order to the Readout Supervisor. The maximum Level-1 output rate is fixed to 40 kHz, and overflow of the Level-1 de-randomizer buffers is prevented by a throttle system controlled by the Readout Supervisor. The depth of the L1-buffer³, combined with a 1 MHz Level-0 rate and the requirement to deliver the decisions chronologically ordered, allows a latency of up to 58 ms.

³The size of the Level-1 buffer is 2 M words. With a 36-word-long event fragment, which contains 32 channels and up to four data tags, this allows up to 58254 events to be buffered.
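The 58 ms figure follows directly from the numbers in the footnote; a minimal arithmetic check (added here, not part of the original):

```cpp
// L1-buffer latency budget: 2 Mwords / 36 words per event fragment,
// drained in arrival order at the 1 MHz Level-0 accept rate.
#include <cstdio>

int main() {
    const long buffer_words  = 2L * 1024 * 1024;              // "2 M words"
    const long words_per_evt = 36;                            // 32 channels + up to 4 data tags
    const long events        = buffer_words / words_per_evt;  // 58254 events
    const double l0_rate_hz  = 1.0e6;                         // 1 MHz Level-0 rate
    std::printf("%ld events -> %.1f ms maximum Level-1 latency\n",
                events, 1e3 * events / l0_rate_hz);           // 58.3 ms
    return 0;
}
```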

1.3 Implementation Overview

Given the physics requirement to achieve maximum trigger efficiency for offline-selected events, the main aim of the trigger implementation is to give the trigger algorithms access to the same data as the offline analysis, and to approximate the offline selection algorithms as closely as possible at the highest possible rate. Figure 1.3 shows an overview of the sub-detectors participating in the three trigger levels.

Figure 1.3: Overview of the three trigger levels. Stations M1–M5 are used to reconstruct two muons per quadrant. The SPD, PS, ECAL and HCAL are used to reconstruct the hadron, e, γ and π0 with the largest transverse energy, the charged particle multiplicity, and the total energy. The Pile-Up detector is used to recognize multiple interactions per crossing. Level-1 uses the information from VELO, TT, and Level-0 to reduce the rate to 40 kHz. T1–T3 and M2–M4 could be included in Level-1. The HLT uses all data in the event apart from the RICH to reduce the rate to 200 Hz. Level-0 is executed in full custom electronics, while Level-1 and HLT are software triggers which share a commodity farm of 1800 CPUs.

Level-0

The purpose of Level-0 is to reduce the LHC beam crossing rate of 40 MHz, which contains about 10 MHz of crossings with visible pp interactions, to the rate at which, in principle, all sub-systems could be used for deriving a trigger decision. Due to their large mass, b-hadrons decay to give a large-ET lepton, hadron or photon, hence Level-0 reconstructs:

• the highest ET hadron, electron and photon clusters in the Calorimeter,

• the two highest pT muons in the Muon Chambers,

and this information is collected by the Level-0 Decision Unit to select events. Events can be rejected based on global event variables such as charged track multiplicities and the number of interactions, as reconstructed by the Pile-Up system, to ensure that the selection is based on b-signatures rather than on large combinatorics, and that these events will not occupy a disproportionate fraction of the data-flow bandwidth or of the available processing power in subsequent trigger levels.

All Level-0 triggers are fully synchronous, i.e. their latency depends neither on occupancy nor on history. All Level-0 electronics is implemented in full custom boards.

The implementation of the calorimeter trigger is based on forming clusters by adding the ET of 2×2 cells on the FE boards, and selecting the clusters with the largest ET. Clusters are identified as e, γ or hadron depending on the information from the Scintillating Pad Detector (SPD), Preshower (PS), Electromagnetic (ECAL) and Hadronic (HCAL) Calorimeters. The ET of all HCAL cells is summed to reject crossings without visible interactions. The total number of SPD cells with a hit is counted to provide a measure of the charged track multiplicity in the crossing.

The muon chambers allow stand-alone muon reconstruction with a pT resolution of 20%. Track finding is performed by processing units, which combine the strip and pad data from the five muon stations to form towers pointing towards the interaction region. One crate per quarter houses the trigger boards which reconstruct the two muons with the largest pT.

The Pile-Up system aims at distinguishing between crossings with single and multiple visible interactions. It uses four silicon sensors of the same type as those used in the VELO to measure the radial position of tracks, covering −4.2 < η < −2.9. The Pile-Up system provides the position of the primary vertex candidates along the beam-line and a measure of the total backward charged track multiplicity. The Pile-Up information allows a relative luminosity measurement which is not affected by system deadtime, and monitors the beam conditions.

The Level-0 Decision Unit (L0DU) collects all information from the Level-0 components to form the Level-0 trigger. The L0DU is able to perform simple arithmetic to combine all signatures into one decision per crossing. This decision is passed to the Readout Supervisor, which transmits its Level-0 decision to the FE.

Level-1

At the 1 MHz output rate of Level-0, the remaining analogue data is digitized and all data is stored for the time needed to process the Level-1 algorithm. All sub-systems which deliver data to Level-1 make use of the same TELL1-board [16] to store the data in the L1-buffer, to perform zero-suppression and formatting, and to interface to Level-1. The Level-1 algorithm will be implemented on a commodity processor farm, which is shared between Level-1, the HLT and offline reconstruction algorithms. The Level-1 algorithm uses the information from Level-0, the VELO and TT. The algorithm reconstructs tracks in the VELO, and matches these tracks to Level-0 muons or Calorimeter clusters to identify them and measure their momenta. The fringe field of the magnet between the VELO and TT is used to determine the momenta of particles with a resolution of 20–40%. Events are selected based on tracks with a large pT and a significant impact parameter with respect to the primary vertex.

The event-building architecture is inspired by the one described in the Online System TDR [9], but adapted to profit from new technologies that have become available due to the delayed start-up of the LHC. The same event-building network is used to collect the Level-1 decisions from all the processors, after which they are sorted according to their Level-0 event number and transmitted to the Readout Supervisor, which transmits its Level-1 decision to the FE. The maximum Level-1 output rate has been fixed to 40 kHz to allow the FE to execute more elaborate zero-suppression algorithms than the ones used to prepare the information for the Level-1 trigger. The implementation is easily scalable to allow the inclusion of stations T1–T3 and M2–M5. This would improve the Level-1 performance, and such an implementation is described in Appendix A, but all performance figures given in this TDR assume that the Level-1 algorithm does not use T1–T3 or M2–M5.

High Level Trigger

The HLT will have access to all data. Since the design of the Online System [9], the LHCb spectrometer has been reoptimized [2], resulting in a considerably smaller data size. Level-1 and HLT event building now share the same network, and this TDR supersedes the implementation described in [9]. The HLT and Level-1 algorithms run concurrently on the same CPU nodes, with Level-1 taking priority due to its limited latency budget. The HLT algorithm starts with reconstructing the VELO tracks and the primary vertex, rather than having this information transmitted from Level-1. A fast pattern recognition program links the VELO tracks to the tracking stations T1–T3. The final selection of interesting events is a combination of confirming the Level-1 decision with better resolution, and selection cuts dedicated to specific final states. While the maximum output rates of the first two trigger levels are dictated by the implementations of the FE hardware, the output rate of the HLT is kept more flexible. Considering the channels currently under study, one could envisage output rates of a few Hz. However, the RICH information is not currently used by the HLT, and selection cuts have to be relaxed compared to the final selection, to study the sensitivity of the selections and to profit from refinements to the calibration constants. These considerations lead to an output rate of 200 Hz of events accepted by the HLT. The total CPU farm will contain about 1800 nodes. From the expected CPU power in 2007, and the performance studies discussed below, it is estimated that the L1 and HLT algorithms will use about 55% and 25% of the available computing resources, respectively. The remaining resources are used to fully reconstruct events accepted by the HLT, including the particle identification, before they are written to storage.

1.4 Organization of this Document

While the trigger is logically divided into three trigger levels, its implementation is divided into five hardware sub-systems: four Level-0 sub-systems and one sub-system for Level-1 and the HLT combined. The Level-0 sub-systems are the Calorimeter Triggers, the Muon Trigger, the Pile-Up Trigger and the Level-0 Decision Unit. Level-1 and the HLT form one sub-system from the technical design point of view, since they share the same event-building network and processor farm. In the following chapters each of the five sub-system designs is described separately, including R&D and prototyping. Where applicable, the sensitivity of the performance of a sub-system to the LHC environment, the so-called robustness, will also be included, while the performance of the trigger as a whole will be described in Chapter 7. The last chapter deals with project organization.


Chapter 2 Level-0 Calorimeter Triggers

The purpose of the Calorimeter Triggers is to select and identify particles with a high ET deposit in the calorimeters. A schematic view of the calorimeter system is shown in Figure 2.1, showing the four detectors involved:

Figure 2.1: Schematic side view of the calorimeter system.

• The SPD (Scintillator Pad Detector) identifies charged particles, and allows electrons to be separated from photons.

• The PreShower detector, after 2.5 radiation lengths of lead, identifies electromagnetic particles.

• The electromagnetic calorimeter ECAL, of the shashlik type, measures the energy of electromagnetic showers.

• The hadronic calorimeter HCAL, made of iron with scintillator tiles, measures the energy of hadrons.

The first three detectors have the same cell geometry, displayed in Figure 2.2. The cells are about 4×4 cm² in the central region, 6×6 cm² in the middle region and 12×12 cm² in the outer region. The exact size of the cells is proportional to their distance from the vertex in order to obtain a pointing geometry, and the total number of cells in each detector is 5984. The HCAL contains 1468 cells, with only two sizes, 13×13 cm² and 26×26 cm², such that the HCAL cell boundaries project onto ECAL cell boundaries. More details are given in the Calorimeter TDR [4].

Figure 2.2: Layout of the SPD, Preshower and ECAL cells. Each square represents 16 cells.

2.1 Concepts of the L0 Calorimeter Trigger

The idea of the Calorimeter Triggers is to search for high-ET particles: electrons, photons, π0s or hadrons. The way each type is identified is described in Section 2.5.

Showers are relatively narrow, with energy deposits in a small area. A zone of 2 by 2 cells is used, large enough to contain most of the energy, and small enough to avoid overlap between various particles. Only the particle with the highest ET is looked at. Therefore at each stage only the highest-ET candidate is kept, to minimize the number of candidates to process.

These candidates are provided by a three-step selection system:

• A first selection of high-ET deposits is performed on the Front-End (FE) card, which is the same for ECAL and HCAL. Each card handles 32 cells, and the highest of the 32 sums of 2×2 cells is selected. To compute these 32 sums, access to cells on other cards is an important issue.

• The Validation Card merges the ECAL with the PreShower and SPD information, prepared by the PreShower FE card, to identify the type of electromagnetic candidate: electron, photon or π0. Only one candidate per type is selected and sent to the next stage. The same card also adds the energy deposited in ECAL in front of the hadron candidates. A similar card computes the SPD multiplicity in the PreShower crates.

• The Selection Crate selects the candidate with the highest ET for each type; it also produces a measure of the total ET in HCAL and the total SPD multiplicity.

An overall view of the Calorimeter Triggers is shown in Figure 2.3. Care has been taken to simplify the connections, and the system is fully synchronous, which will facilitate commissioning and debugging.

Figure 2.3: Overall view of the Calorimeter Triggers.

The first two steps are performed on the platform of the calorimeter, at a location with a radiation dose below 50 Gy during the whole lifetime of the experiment, and where Single Event Upsets (SEU) are expected to occur. Each component has been tested for radiation and SEU robustness [17, 18]. Anti-fuse PGAs are used, and "triple voting" techniques for all permanent memories. In this technique, each bit is stored in three copies, and a majority vote is taken before using it, so that a single bit flip does not affect the value.
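A minimal sketch of the triple-voting technique (an illustration added here; the real logic lives in the anti-fuse PGAs):

```cpp
// Triple-voting sketch: each stored word is kept in three copies and a
// bitwise majority vote is taken on every read, so one flipped bit in a
// single copy never changes the voted value.
#include <cstdint>
#include <cstdio>

struct TripleVoted {
    uint8_t copy[3];
    void write(uint8_t v) { copy[0] = copy[1] = copy[2] = v; }
    uint8_t read() const { // bitwise majority of the three copies
        return (copy[0] & copy[1]) | (copy[0] & copy[2]) | (copy[1] & copy[2]);
    }
};

int main() {
    TripleVoted reg;
    reg.write(0xA5);
    reg.copy[1] ^= 0x10;           // simulate a single-event upset in one copy
    std::printf("voted value: 0x%02X\n", reg.read()); // still 0xA5
    return 0;
}
```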

2.2 L0 Calorimeter Trigger performance

The L0 Calorimeter Trigger provides 7 inputs to the Decision Unit, as shown in Figure 2.3. Frequently, the same energy deposit produces several candidates, as the electron, photon and π0 triggers are not exclusive. A hadron with a large deposit in ECAL may also produce electron, photon or π0 candidates. This overlap has advantages in terms of robustness of the system and allows cross-checks of the trigger behaviour, but it makes an analysis of the exclusive performance of each candidate type difficult. The overall performance in terms of trigger efficiency is discussed in Chapter 7.

Figure 2.4: Performance of the Hadron Trigger. [The plot shows the L0 hadron efficiency (%) versus the ET cut (GeV), for minimum-bias and B → ππ events, with the nominal threshold indicated.]

To illustrate how signal events differ from minimum-bias events, Figure 2.4 shows the fraction of events above a given hadron threshold, as a function of this ET threshold, both for minimum-bias events and for offline-selected B → ππ events. The nominal threshold is indicated, at which about 7% of the minimum-bias events are kept, with an efficiency of around 70% on the signal.

2.3 ECAL and HCAL FE card

The processing on the FE card is described in [19]. It is divided into several steps, performed synchronously in pipeline mode:

• Preparation of the data: the Calorimeter Trigger uses as input the 12-bit ADC value of each cell. This digitisation is already pedestal corrected [20], but has to be converted to 8-bit ET.

• Collection of the data: in order to compute all 32 sums over 2×2 cells, one has to access the neighbouring cells. A dedicated backplane connects neighbouring cards, while LVDS multiplexed links are used for the other connections.

• Computation of the 2×2 sums in parallel.

• Selection of the highest of the 32 sums, keeping track of its address.

• Computation of the total ET on the card.

• Sending the result of the processing to the next stage.

The conversion from ADC value to ET is performed by multiplying the ADC value by an 8-bit scale factor, and selecting the proper 8 bits. If the result would be larger than 8 bits, it is saturated to 255. This multiplication-with-saturation is performed in one clock cycle [20]. The nominal scale factor is such that the value of ET covers the range 0 to 5.1 GeV. This choice is a compromise between the highest trigger threshold, around 4.5 GeV, and the loss of accuracy due to rounding effects. The gain is adjusted channel by channel to compensate for possible mis-calibration of the PMs. The ET range can be varied, if needed, from about 3 GeV up to 10 GeV full scale.
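A bit-level sketch of this conversion (illustration only; the TDR specifies the 8-bit scale factor, the one-cycle multiply, and saturation at 255, while the exact 8-bit window selected below, and the function name adcToEt, are assumptions):

```cpp
// ADC (12 bit, pedestal-corrected) -> ET (8 bit) conversion sketch:
// multiply by an 8-bit per-channel scale factor, keep an 8-bit window of
// the product, and saturate at 255.
// Nominal full scale 5.1 GeV -> 1 count = 20 MeV (5.1 GeV / 255).
#include <cstdint>

uint8_t adcToEt(uint16_t adc12, uint8_t scale) {
    uint32_t product = static_cast<uint32_t>(adc12) * scale; // up to 20 bits
    uint32_t et = product >> 8;       // assumed position of the 8-bit window
    return et > 255 ? 255 : static_cast<uint8_t>(et); // saturate at 5.1 GeV
}
```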

The next step is to get the ET value for all the cells involved in the 32 sums. Each FE card covers an area of 8×4 cells. In order to build the 32 sums, 8 right neighbours, 4 top neighbours, and one top-right neighbour are needed, as shown in Figure 2.5.

Figure 2.5: Connections to get the neighbouring cells.

Note that neighbouring cells of different size are not connected, as this would introduce unnecessary complications. The two halves of the detector are also not connected, as this would require disconnecting cables when opening the detector for maintenance. These two limitations introduce only a very small inefficiency, at the per-mil level.

It is clear that local signals have to be delayed, to allow the remote information to arrive. The slowest signal is the corner one, which arrives in two steps. All other information is placed in pipelines, implemented using the internal memory of the PGA, with controllable length. The delay in these pipelines will be adjusted once the final layout is made, but it is estimated to be one clock cycle for every backplane link, and two clock cycles for an LVDS multiplexed link, since the cables will be around 2 m long. The corner signal waits 1 cycle, the LVDS signals wait 2 cycles, the backplane signals 3 cycles and the local signals 4 cycles. Note that these local signals are sent on 80 MHz multiplexed lines on the board, to reduce the number of I/O pins on the PGA. Details of this synchronisation, and of the backplane configuration, can be found in [19].

When the 45 signals are available, the 32 sums are computed in parallel. A sum is saturated when it overflows, around 5.1 GeV, which occurs for about one third of the B → ππ events, as can be seen in Figure 2.4. Saturation has no effect on the performance, as the trigger relies only on the presence of a high-ET cluster. The highest of the 32 sums is then selected in a series of binary comparisons: first 16 comparisons of two sums, then 8 comparisons of the 16 previous results, and so on. Five steps are needed to obtain the highest ET sum of the FE card, with its address on 5 bits. This is performed in pipeline mode; 3 cycles are needed for the sum and the comparisons.
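The five-step selection can be modelled in software as a small tournament that carries the 5-bit address along (a sketch added here, not the PGA code):

```cpp
// Tournament selection of the highest of the 32 2x2 sums, keeping its
// 5-bit address: 16 pairwise comparisons, then 8, 4, 2, 1 (five stages).
#include <cstdint>
#include <utility>

std::pair<uint8_t, uint8_t> highestOf32(const uint8_t sum[32]) {
    uint8_t et[32], addr[32];
    for (int i = 0; i < 32; ++i) { et[i] = sum[i]; addr[i] = i; }
    for (int n = 32; n > 1; n /= 2) {            // five comparison stages
        for (int i = 0; i < n / 2; ++i) {
            int winner = (et[2*i] >= et[2*i + 1]) ? 2*i : 2*i + 1;
            et[i]   = et[winner];
            addr[i] = addr[winner];              // address travels with the ET
        }
    }
    return {et[0], addr[0]};                     // highest ET and its address
}
```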

The total ET of the card is also produced, by summing the appropriate 8 sums of 4 cells. This result is also saturated. This total sum is the input for the local π0 trigger.

The card produces a 21-bit output on the backplane for the Validation Card: the 8-bit highest ET sum, its 5-bit address and the 8-bit total ET. It also sends, on two multiplexed LVDS links towards the PreShower FE card for ECAL and towards the Validation Card for HCAL, the 8-bit highest ET sum and its 5-bit address, together with an 8-bit Bunch Crossing IDentifier (BCID) for synchronisation.

2.4 PreShower FE card

The PreShower FE card digitizes the PreShower analog signals, corrects them for pedestal, gain and the spill-over of earlier signals, and receives and synchronises the SPD information [21]. The PreShower trigger information is obtained by comparing the PreShower signal to a threshold, producing a Yes/No output. The SPD also provides binary information. The PreShower FE card handles 64 channels, and covers two ECAL cards. For each beam crossing, the PreShower FE card receives the address of two ECAL candidates and, for each candidate, sends to the corresponding Validation Card the 4 PreShower bits and the 4 SPD bits in front of the 4 cells of the candidate [21].

As for the Calorimeter card, access to the neighbouring information is needed. The same backplane is used, transporting this time only 2 bits per cell, but there are 8 vertical neighbours instead of 4, as the card covers 8×8 cells. The core of the processing is an 81×2-bit wide pipeline (as the ECAL input arrives several clock cycles after the PreShower and SPD signals are ready) and the appropriate multiplexer to extract the 4×2 bits of the wanted cells for the proper beam crossing. A prototype to test this processing has been built and is shown in Figure 2.6.

The input is the 5-bit address produced by the ECAL card, with 8 bits of BCID to select the proper event. The output is 8 bits of data, plus the BCID for cross-checking. This is sent to the ECAL Validation Card. As already mentioned, there are two independent inputs and two outputs, each corresponding to one half of the card.

The card also computes the SPD multiplicity, by counting how many of the 64 bits have fired. This multiplicity is coded on 7 bits and is sent to the SPD Multiplicity card, using the backplane lines that are used in the ECAL crate to connect to the Validation Card.

As the PreShower crates will be in the same racks as the ECAL crates, the cable length between ECAL and PreShower cards will be around 2 m.

Figure 2.6: Photograph of the prototype of the trigger part of the Preshower FE card.

2.5 Validation Card

The Validation Card [22] has two largely disconnected functions. First, it handles the candidates from the 8 ECAL cards in the half-crate, performing a "validation" with the PreShower and SPD information, in order to produce electron, photon and π0 candidates. From the 8 ECAL cards it is connected to, only one candidate of each type is selected, the one with the highest ET. The second part handles the HCAL candidates, adding to their energy the energy released at the same location in ECAL. Up to 4 HCAL candidates are connected to the Validation Card, and there is one output, with updated energy, for each input.

2.5.1 ECAL candidates

The 8-bit PreShower information is converted to a photon flag (PreShower and not SPD) and an electron flag (PreShower and SPD). A Look-Up Table (LUT) is used, with a 3-bit output for each flag and triple voting, to be insensitive to SEU. Using a LUT allows the requirements to be modified, in particular the way the SPD is used, and a possible veto if too many of the 4 cells have fired. The ECAL inputs are delayed by about 5 cycles to wait for the PreShower information to arrive. Then the photon and electron candidates are sent to a 'highest in eight' selector similar to the one on the ECAL/HCAL FE board. The result is an 8-bit ET candidate with an 8-bit address; the 3 new address bits keep track of which input was selected.
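A sketch of the nominal flag definitions (the helper below hard-codes the "PreShower and (not) SPD" rules quoted above, and treats one fired cell as sufficient, which is an assumption; in the hardware a reprogrammable LUT produces these flags):

```cpp
// Photon/electron flags from the 4 PreShower bits and 4 SPD bits of a
// 2x2 candidate. Nominal rule per cell: photon = PS and not SPD,
// electron = PS and SPD. A LUT keyed by the 8 input bits makes the
// definition (and any cell-multiplicity veto) reprogrammable.
#include <cstdint>

struct Flags { bool photon, electron; };

Flags validateEcal(uint8_t psBits, uint8_t spdBits) { // 4 bits used in each
    uint8_t photonMask   = psBits & static_cast<uint8_t>(~spdBits) & 0x0F;
    uint8_t electronMask = psBits & spdBits & 0x0F;
    return {photonMask != 0, electronMask != 0};
}
```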

The local π0 selection is quite similar, see [23]. For the so-called "local π0", where the two photons are expected on the same FE card, one uses the total ET of the FE card as a measure of the π0 ET. The highest in eight is selected. A similar validation by the PreShower and the SPD is foreseen.

The "global π0" candidate, where the two photons are on neighbouring cards, is obtained by adding the ET candidates of two neighbouring cards. This is a simple add-and-saturate on 8 bits, followed by a 'highest in eight' selection. The address is somewhat arbitrary; it is the address of the candidate of the first card.

These four outputs of the Validation Card are obviously quite similar. There is some flexibility in the validation by the PreShower and SPD, thanks to the use of a LUT to define which combinations are valid.

2.5.2 HCAL candidates

The motivation here is to add the ECAL ET in front to the HCAL ET, in order to improve the hadron energy estimate. Instead of bringing the ECAL information to the HCAL candidates, the HCAL candidates are sent to the ECAL crate. This reduces the number of connections between the two detectors by a factor 2.5, at the price of some duplicate candidates [22]. As a side effect, the selection of the best version of the duplicated candidates has to be done in the Selection Crate.

The first processing step is a time alignment, to handle the same event in ECAL and HCAL. Then the processing is in three steps:

• For each ECAL card, a single HCAL card can match. This is not the same pattern for each Validation Card, and the matching is therefore performed by a programmable multiplexer.

• The ECAL and HCAL addresses are compared using a LUT, with 5+5 bits of input and 3 bits of output for triple voting and SEU immunity. This indicates whether the ECAL candidate is "in front" of the HCAL candidate. Eight such LUTs are needed, one per ECAL card.

• As several ECAL candidates can be in front of an HCAL candidate, one selects for each HCAL card the matched ECAL candidate with the highest ET, and then adds its ET to the HCAL ET to obtain an updated HCAL candidate, which is sent to the Selection Crate.
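The last two steps can be sketched in software as follows (names and data layout are illustrative, not from the TDR):

```cpp
// HCAL validation sketch: among the ECAL candidates flagged "in front"
// of a given HCAL candidate, pick the highest-ET one and add its ET to
// the HCAL ET with 8-bit saturation.
#include <cstdint>

struct Candidate { uint8_t et; uint8_t addr; };

// inFrontLUT models the 5+5 bit address-matching look-up table.
uint8_t validateHcal(Candidate hcal, const Candidate ecal[8],
                     bool (*inFrontLUT)(uint8_t ecalAddr, uint8_t hcalAddr)) {
    uint8_t bestEcalEt = 0;
    for (int i = 0; i < 8; ++i)                       // one ECAL card each
        if (inFrontLUT(ecal[i].addr, hcal.addr) && ecal[i].et > bestEcalEt)
            bestEcalEt = ecal[i].et;
    unsigned sum = hcal.et + bestEcalEt;              // add-and-saturate
    return sum > 255 ? 255 : static_cast<uint8_t>(sum);
}
```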

This section of the Validation Card is shown in Figure 2.7. The internal memory of the PGA is used as a LUT.

Figure 2.7: HCAL validation logic.

All outputs of the Validation Card are optical links (32 bits at 40 MHz) towards the Selection Crate. The information on each link is similar: the 8-bit ET of the candidate, an 8-bit address and 8 bits of BCID. In the remaining 8 bits we intend to send a "cable number" field, allowing cabling checks.

2.6 SPD Multiplicity

In the PreShower crates, a card is located in the same slot as the Validation Card in the ECAL crate. This card receives via the backplane the 8 SPD multiplicity values computed by the 8 PreShower FE cards. It adds these 8 numbers and outputs the sum on an optical fibre, in a format similar to that of the ECAL Validation Card, using the two 8-bit address and ET fields described previously to transport the 10-bit multiplicity value. This allows the total SPD multiplicity to be computed in the Selection Crate.

2.7 Backplane and links

There is a large data flow between the various boards, all at a frequency of 40 MHz. The problem has been simplified as much as possible by using a dedicated backplane [24] to implement most of the links. As shown in Figure 2.5, 9 of the 13 links between FE boards are via the backplane. The link between the FE board and the Validation Card is also via the backplane. They use multiplexed LVDS signals, where 4 pairs allow the transmission of 21 bits at 40 MHz. The same backplane is used for the PreShower, ECAL and HCAL crates, the cost of any unused connection being outweighed by the simplification in debugging and maintenance.

The first part of the backplane, containing the power lines and the distribution of the timing and control signals, is shown in Figure 2.8.

Links between crates are implemented with multiplexed LVDS signals. Using Cat-5 cables, safe transmission is possible for lengths up to 20 m [24]. Most of the connections are between crates in the same rack, either inside ECAL, inside PreShower or between ECAL and PreShower. Longer connections exist between HCAL and ECAL. The crates are on two platforms on top of the calorimeters, which move independently when the detector is opened. The cable length should allow for opening without decabling; 10 m should be enough, which is safe for the quality of the link.

Figure 2.8: Photograph of the power backplane.

2.8 Selection Crate

As can be seen in Figure 2.3, the Selection Crate [25] handles a large number of inputs: 4 times 28 optical links for electromagnetic candidates, and 80 links for HCAL. It is made of two parts. One handles the electromagnetic candidates, essentially selecting the one with the highest ET for each type, and the second part handles the HCAL candidates, in a slightly more complex way. One should note that the Selection Crate is in the barracks, and hence is not subject to radiation or SEU problems, which allows the use of FPGAs.

2.8.1 Electromagnetic Candidates

Upon reception, the processing (after time alignment) is to select the highest ET of the 28 inputs, by successive binary comparisons. The address of the final candidate, 8 bits received plus 5 bits from this selection, is converted to the official calorimeter cell identifier on 14 bits, using a LUT. The resulting candidate, 8-bit ET and 14-bit address, plus 8-bit BCID and a 2-bit status, is sent to the L0 Decision Unit (L0DU). The 4 types of candidates (electron, photon, local π0 and global π0) are handled in exactly the same way.

2.8.2 SPD multiplicity

The functionality is similar: the 16 inputs are time aligned, then the 10-bit numbers are added without saturation, by a cascade of pairwise additions, and the 13-bit result is sent to the L0 Decision Unit. The same hardware board can be used, with a different code for the processing FPGA. There is no address to send, and the 14-bit address field is used to send the result to the L0DU.
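A software model of this cascade of pairwise additions (an illustration added here, not from the TDR):

```cpp
// Cascade of pairwise additions: 16 ten-bit SPD multiplicities summed
// without saturation in four stages (8, 4, 2, 1 additions). The total
// SPD multiplicity (at most 5984 fired cells) fits in the 13-bit result.
#include <cstdint>

uint16_t totalSpdMultiplicity(const uint16_t in[16]) {
    uint16_t stage[16];
    for (int i = 0; i < 16; ++i) stage[i] = in[i];
    for (int n = 16; n > 1; n /= 2)      // four adder stages
        for (int i = 0; i < n / 2; ++i)
            stage[i] = stage[2*i] + stage[2*i + 1];
    return stage[0];                      // 13-bit total, no saturation
}
```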

2.8.3 HCAL

The processing is similar, with two extra steps to eliminate duplicates and to obtain the sum over the whole calorimeter. After time alignment, the duplicates are removed: 30 HCAL cards have their candidates sent to two Validation Cards, and thus to the Selection Crate. For each pair of inputs coming from the same HCAL card, only the one with the highest ET is kept. Then the HCAL card with the highest ET is selected, as in the ECAL case. The sum over the 50 cards is performed without saturating the result. This sum will be used to detect a minimal activity in the detector, with a threshold at a few GeV. It may also be used to detect dirty events, produced by piled-up interactions, and hence saturation at 5 GeV ET is not allowed.

As 80 optical links cannot be received on a single board, the HCAL processing is performed on 3 boards, receiving respectively 28, 28 and 24 links. A simple connection allows one of the boards to perform the final selection for the highest ET and the total sum, based on the 3 intermediate results.

The output of the HCAL selection is then the highest-ET HCAL candidate, with the same cell identifier processing and the same final format as for the ECAL candidates, and the total ET in HCAL. As there is no address in this case, the 14-bit address field is used to send the value, as for the SPD multiplicity.

2.8.4 Implementation

Despite the diverse functionalities, the whole Selection Crate can be implemented with a single type of board, where a small part (inter-card connection, second output) is unused in the ECAL case. The boards will be equipped with three parallel 12-channel optical receivers connected to 28 low-power-consumption TLK2501 deserializers from Texas Instruments. After deserialization, the 28 16-bit words are demultiplexed 2:1 to 32 bits and synchronized to the TTC clock by 28 small FPGAs.

The synchronisation mechanism has already been successfully tested. It simply requires writing the data into a FIFO with the deserializer clock while reading it with the TTC clock.

The processing itself is reasonably simple, but requires a large number of connections. A single FPGA¹ with 812 I/O pins can do the job.

The 28 inputs of each board are saved and transmitted to the DAQ after the Level-0 decision, to enable detailed monitoring of the correct behaviour of the system. As for most sub-systems in LHCb, the TELL1-board [16] is used for this purpose. A simple zero suppression, removing candidates with exactly zero ET, gives an acceptable event size of around 500 bytes for the final readout. The Selection Crate information is also made available to the Level-1 processors, as explained in Chapter 6, with a threshold at 1 GeV to reduce the average data size to 70 bytes.

A prototype of the processing part has been built and tested in 2002, and is shown in Figure 2.9.

Figure 2.9: Photograph of the prototype of the Selection Board.

2.9 Latency

The latency can be analysed in terms of internal processing time, transport time and delays for synchronisation of the inputs.

• FE boards: seven cycles. On the ECAL and HCAL FE boards, the processing is 1 cycle for converting the ADC value to ET and 3 cycles for the computation of the 2 × 2 sums and the selection of the highest. Time alignment of the inputs will require another 3 cycles.

• Validation Card: eight cycles. The processing in the PreShower FE card adds to the latency: the ECAL candidate has to be received (2 cycles), the answer extracted (1 cycle) and then transmitted (2 cycles) to the Validation Card. Five cycles are thus required for these operations, and the ECAL input to the Validation Card has to wait during that time. Processing in the Validation Card itself is quite simple, and will take 3 cycles. The HCAL processing will take the same time, and the slower transmission (due to the longer cables between the HCAL and ECAL crates) is smaller than the latency due to the wait for the PreShower.

• Selection Crate: processing takes 9 cycles for ECAL candidates and 14 for HCAL candidates. Two cycles are required to de-serialize the data fluxes and synchronize them to the TTC clock. Data processing on the board takes 5 clock cycles. It takes two cycles to transfer data to the L0DU or to send them to the hadron master. Three more cycles have to be added to the previous 9 for the final hadron selection, and another two for the hadron data transfer to the L0DU.

The total latency, not counting the optical transmission from the calorimeter platform to the barracks, is then below 30 cycles, or 750 ns (30 cycles × 25 ns), well within the budget, as discussed in Chapter 5.

2.10 Debugging and Monitoring

To monitor the correct behaviour of the system, the inputs are logged with the data: the 8-bit ET of each ECAL and HCAL cell, and the PreShower and SPD bits of each cell, are read out. As mentioned earlier, the inputs of the Selection Crate are also logged with the event, allowing a check that they correspond to what is expected from the individual cell inputs. The result of the Selection Crate is logged by the L0 Decision Unit, which permits monitoring of the Selection Crate. Local tests of the FE cards and of the Validation Card are foreseen, with inputs from a memory writable by ECS and results logged in a FIFO readable by ECS. This will allow the debugging of the system and in-situ checks outside data-taking periods.

Chapter 3 Level-0 Muon Trigger

The muon system has been designed to look for muons with a high transverse momentum: a typical signature of a b-hadron decay.

An overview of the muon system is given first, followed by a description of the L0 muon trigger implementation, its performance as a function of various running conditions, and its technical design.

3.1 Overview of the muon system

The muon detector [6] consists of five muon stations interleaved with muon filters (Figure 1.1). The filter is comprised of the electromagnetic and hadronic calorimeters and three iron absorbers. Stations M2-M3 are devoted to the muon track finding, while stations M4-M5 confirm the muon identification. The first station, M1, is placed in front of the calorimeter and plays an important role in the transverse-momentum measurement of the muon track: it improves the transverse-momentum resolution by about 30%.

Each station has two detector layers with independent readout. A detector layer contains two gaps in stations M2-M5. To achieve the high detection efficiency of 99% per station and to ensure redundancy, the signals of corresponding physical channels in the two gaps and two layers are logically OR-ed on the chamber to form a logical channel. The total number of physical channels in the system is about 120,000, while the number of logical channels is 25,920.

Each station is subdivided into four regions with different logical-pad dimensions, as shown in Figure 3.1. Region and pad sizes scale by a factor of two from one region to the next. The logical layout in the five muon stations is projective in y to the interaction point. It is also projective in x when the bending in the horizontal plane introduced by the magnetic field is ignored.

The logical pad dimensions are summarized in Table 3.1. Compared to M1, the pad size along the x axis is a factor of two smaller for M2-M3 and a factor of two larger for M4-M5.

Pads are obtained by the crossing of horizontal and vertical strips wherever possible. Strips are employed in stations M2-M5, while station M1 and region 1 (R1) of stations M4-M5 are equipped with pads. Strips allow a reduction in the number of logical channels to be transferred to the muon trigger: the processor receives 25,920 bits every 25 ns, forming 55,296 logical pads by crossing the strips.

Each region is subdivided into sectors, as shown in Figure 3.1. They are defined by the size of the horizontal and vertical strips and match the dimensions of the underlying chambers.

The L0 muon trigger looks for muon tracks with a large transverse momentum, pT. The track finding is performed on the logical pad layout. It searches for hits defining a straight line through the five muon stations and pointing towards the interaction point (Figure 3.2). The position of the track in the first two stations allows the determination of its pT.


Figure 3.1: Front view of one quadrant of muon station 2, showing the dimensions of the regions (logical-channel sizes range from 6.3 mm × 31 mm in region 1 to 50 mm × 250 mm in region 4). Inside each region a sector is shown, defined by the size of the horizontal and vertical strips. The intersections of the horizontal and vertical strips, corresponding to the logical channels, are logical pads. The region and channel dimensions scale by a factor of two from one region to the next.

To simplify the processing and to hide the complex layout of the stations, we subdivide the muon detector into 192 towers pointing to the interaction point, as shown in Figure 3.3. A tower contains logical pads with the same layout: 48 pads from M1, 2 × 96 pads from M2 and M3, and 2 × 24 pads from M4 and M5. Therefore the same algorithm can be executed in each tower, the key element of the trigger processor. Each tower is connected to a Processing Unit (PU).

All logical channels belonging to a tower are sent to a PU using six high-speed optical links. The intersection between a tower and a station maps to a sector. The corresponding logical channels are transported on a dedicated optical link, to ease the connectivity between the muon detector and the trigger, and the data distribution within a processor.

The data flow, however, is more complex for stations M2-M3 regions R1 and R2. In region R1, a sector is shared by two towers, while in region R2, a tower maps to two sectors (Figure 3.1 and Figure 3.3). The first case requires an additional exchange of logical channels between PUs, while the second requires eight optical links instead of six, as shown in Table 3.1.


Table 3.1: The logical pad size in the four regions of each station projected to M1, and the number of optical links per tower and their content in terms of logical channels.

  Station    Region   Pad size       Links       Logical channels per link
                      at M1 [cm²]    per tower   Pads   H-strips   V-strips   Total
  M1         R1       1 × 2.5        2           24     –          –          24
             R2       2 × 5          2           24     –          –          24
             R3       4 × 10         2           24     –          –          24
             R4       8 × 20         2           24     –          –          24
  M2 or M3   R1       0.5 × 2.5      1           –      16         12         28
             R2       1 × 5          2           –      4          12         16
             R3       2 × 10         1           –      4          24         28
             R4       4 × 20         1           –      4          24         28
  M4 or M5   R1       2 × 2.5        1           24     –          –          24
             R2       4 × 5          1           –      8          6          14
             R3       8 × 10         1           –      4          6          10
             R4       16 × 20        1           –      4          6          10

Figure 3.2: Track finding by the muon trigger. For each logical-pad hit in M3, hits are sought in M2, M4 and M5, in a field of interest (highlighted) around a line projecting to the interaction region. When hits are found in all four stations, an extrapolation to M1 is made from the hits in M2 and M3, and the M1 hit closest to the extrapolation point is selected. The track direction indicated by the hits in M1 and M2 is used in the pT measurement for the trigger, assuming a particle from the interaction point and a single kick from the magnet. In the example shown, µ+ and µ− cross the same pad in M3.

A unique processing board containing four PUs deals with all cases, by programming the FPGAs differently and by grouping two interconnected PUs in region R2.

The L0 muon trigger is implemented with the four quadrants of the muon system treated independently.

3.2 Trigger implementation

The L0 muon trigger algorithm and its implementation are described in detail in LHCb notes [26, 27].

The logical channels are transported from the Front-End electronics to the muon trigger through a total of 148 high-speed optical ribbons of 12 fibres each.

Track finding in each region of a quadrant is performed by 12 PUs, arranged on processing boards in groups of four for regions R1, R3 and R4, and in pairs for region R2.

A PU collects data from the five muon stations for the pads and strips forming a tower, and also receives information from neighbouring towers, even when they are in another region, to avoid inefficiency at boundaries. Logical channels are merged when they are transferred from region Ri to Ri+1. In the opposite direction, logical channels are transported as-is and replicated into four channels to match the granularity of the receiving PU. Therefore all data collected in a tower have the same granularity.

Figure 3.3: A quadrant of the muon system showing the tower layout. Thick lines delimit the fraction of the system analyzed by a processing board. In this view the interaction point is shifted to ∞.

Track finding in a PU starts from the 96 logical pads defined by the intersections of the horizontal and vertical strips representing the unit's input from station M3. The track search is performed in parallel for all pads.

For each logical-pad hit in M3 (track seed), the straight line passing through the hit and the interaction point is extrapolated to M2, M4 and M5. Hits are looked for in these stations in search windows, termed Fields Of Interest (FOI), approximately centered on the straight-line extrapolation. FOIs are open along the x-axis for all stations, and along the y-axis only for stations M4 and M5. The size of the FOI depends on the station considered, the level of background, and the minimum-bias retention required. When at least one hit is found inside the FOI for each of the stations M2, M4 and M5, a muon track is flagged, and the pad hit in M2 closest to the extrapolation from M3 is selected for subsequent use.

The track position in station M1 is determined by making a straight-line extrapolation from M3 and M2, and identifying in the M1 FOI the pad hit closest to the extrapolation point.

Since the logical layout is projective, there is a one-to-one mapping from pads in M3 to pads in M2, M4 and M5. There is also a one-to-one mapping from pairs of pads in M2 and M3 to pads in M1. This allows the track-finding algorithm to be implemented using only logical operations.

Once track finding is completed, an evaluation of pT is performed for the muon tracks. The pT is determined from the track hits in M1 and M2, using look-up tables. The number of muon tracks per PU is limited to two; when more candidates are found, they are discarded and the PU raises an overflow.
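The per-seed logic can be summarized in a compact sketch. We use a simplified one-dimensional pad index per station and assume equidistant stations for the M1 extrapolation; the real PU operates on bit patterns with purely logical operations, as noted above. The FOI half-widths are taken from Table 3.2:

  #include <cstdlib>
  #include <optional>
  #include <utility>
  #include <vector>

  // Toy 1-D pad model: one pad index per station; the projective
  // layout maps an M3 seed to the same nominal index elsewhere.
  struct Hits { std::vector<int> m1, m2, m3, m4, m5; };

  std::optional<int> closestInFOI(const std::vector<int>& hits,
                                  int centre, int halfWidth) {
      std::optional<int> best;
      for (int h : hits)
          if (std::abs(h - centre) <= halfWidth &&
              (!best || std::abs(h - centre) < std::abs(*best - centre)))
              best = h;
      return best;
  }

  // For one M3 seed: require hits in the M2, M4 and M5 FOIs, then pick
  // the M1 pad closest to the M2/M3 straight-line extrapolation.
  std::optional<std::pair<int, int>> findTrack(const Hits& h, int seed) {
      auto m2 = closestInFOI(h.m2, seed, 5);   // x FOI ±5 in M2
      auto m4 = closestInFOI(h.m4, seed, 3);   // x FOI ±3 in M4
      auto m5 = closestInFOI(h.m5, seed, 3);   // x FOI ±3 in M5
      if (!m2 || !m4 || !m5) return std::nullopt;
      const int extrapM1 = 2 * (*m2) - seed;   // equidistant-station toy
      auto m1 = closestInFOI(h.m1, extrapM1, 3);
      if (!m1) return std::nullopt;
      return std::make_pair(*m1, *m2);         // inputs to the pT look-up
  }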

The two muon tracks of highest pT are selected first for each processor board, and then for each quadrant of the muon system. The information for up to eight selected tracks is transmitted to the L0 Decision Unit.

3.3 Trigger performance

The L0 muon trigger is designed for a minimum-bias output rate of around 200 kHz, of which about two thirds are devoted to a single-muon trigger and one third to a di-muon trigger. This rate is obtained by optimizing the parameters of the algorithm, namely the horizontal and vertical dimensions of the FOI and the cut on pT. Decreasing the dimensions of the FOI and increasing the cut on pT reduce the output rate. The size of the largest FOIs is an important parameter for the processor, since it defines the amount of data exchanged between PUs and hence the dimension of the busses connecting them. The largest FOIs, obtained by optimizing the trigger efficiency while minimizing the FOI size as a function of the output rate and running conditions [28, 29], are given in Table 3.2.

Table 3.2: The maximum size of the FOI along the x and y coordinates, expressed in terms of pads with respect to the pad lying on the straight line passing through the hit in M3 and the interaction point. A FOI of ±3 corresponds to a total width of 7 pads.

        M1    M2    M4    M5
  x     ±3    ±5    ±3    ±3
  y     –     –     ±1    ±1

Figure 3.4 shows the transverse momentum determined for L0 muon candidates found in minimum-bias and B0s → J/ψφ samples when the FOI are optimized for an output rate of 125 kHz (single-muon trigger). The corresponding trigger efficiency is shown in the bottom plot as a function of the cut on pT. The origin of the muon candidates in the accepted minimum-bias events is given in Table 3.3; they mainly come from pion and kaon decays in flight. The resolution on the transverse momentum was measured to be 20% for muons coming from a b-quark.

Table 3.3: Origin of the candidates triggering minimum-bias events when the rate of the single-muon trigger is fixed to 125 kHz. The table includes hadron punch-through.

                                   [%]
  b-hadron                         2.2
  c-hadron                         3.3
  Pion                            63.2
  Kaon                            28.5
  Other particles (p, n, τ, ...)   1.1
  Ghost tracks                     1.7

Figure 3.4: Top: reconstructed transverse momentum for minimum-bias and for B0s → J/ψφ events. It is encoded on 8 bits and saturated at 5 GeV/c. Both samples are normalized to unity. Bottom: the trigger efficiencies for minimum-bias and for B0s → J/ψ(µ+µ−)φ(K+K−) events as a function of the cut on pT. In both plots the dimensions of the FOI are optimized for an output rate of 125 kHz (single-muon trigger). The B0s → J/ψφ events are selected by offline reconstruction.

The robustness of the muon trigger implementation has been studied by varying the minimum-bias retention level and the operating conditions, defined by the level of low-energy background in the chambers, the level of beam-halo muons, and the chamber parameters [28, 29]. The parameters of the trigger algorithm are optimized in each case. The performance on useful events selected by the reconstruction and tagging procedure is given in Chapter 7; here, the relative loss in efficiency when the running conditions deteriorate is presented.

3.3.1 Low-energy background

The energy thresholds of the Geant simulation in the region behind the calorimeters are set to higher values than in the rest of the detector, to save the CPU time spent in tracking inside the iron filters. As a consequence, the low-energy component of the muon chamber hit rate [6] in stations M2-M5 is strongly suppressed. To restore the correct rate, background hits are added during digitization. They are extracted from a parametrization obtained with a different version of the simulation program, which contains lower energy thresholds and a more detailed geometry of the detector and the beam optics. The low-energy background consists of low-energy particles, mainly electrons and charged hadrons and, at large arrival times, thermal neutrons. Since the simulation of these processes is affected by large uncertainties, conservative safety factors from 2 to 5 have been applied in the robustness test to the total number of hits, according to the relative importance of this component in the five muon stations. The loss induced by this level of low-energy background depends on the Level-0 output rate: it varies between 2% (at 300 kHz) and 8% (at 100 kHz).

3.3.2 Beam halo muons

The charged-particle flux associated with the beam halo in the accelerator tunnel contains muons with a rather wide energy spectrum, with the largest flux at small radii [6]. In particular, those halo muons traversing the detector in the same direction as particles from the interaction point can cause a muon trigger. The average number of beam-halo muons depends strongly on the level of residual gas in the beam pipe [30]. We define the nominal condition as the second year after 10 days of running, and the worst condition by applying a safety factor of two on the expected level of residual gas and three on the beam current. In the nominal condition, the average number of beam-halo muons per bunch is 0.015 for particles travelling from the entrance of the cavern toward the muon detector and 0.026 for particles coming from the other side. Studies performed on minimum-bias samples with superimposed beam-halo particles [28, 29] show that the beam halo does not affect the trigger performance in nominal conditions. Increasing the level of residual gas in the beam pipe and the beam current, however, decreases the trigger efficiency by less than 8%. The magnitude of these losses is similar to that induced by the maximum level of low-energy background, and the two add linearly. These studies also show that the muon trigger is rather insensitive to the beam halo coming from the other side of the interaction point, since those particles are out of time.

3.3.3 Hardware parameters

The chamber response depends on several parameters: cluster size, single-gap efficiency and electronic noise. The most sensitive one is the cluster size [28]: an overall increase by 30% decreases the trigger efficiency by less than 5%.

The implementation of the muon trigger algorithm limits the number of muon candidates per PU to two, applies data-compression algorithms when data are transferred between towers belonging to different regions, and encodes the pT on 8 bits. These simplifications have no significant effect on the trigger performance.


3.4 Technical Design

The muon trigger is divided into four independent parts running on the quadrants of the muon detector. They are located behind the shielding wall, in an environment without radiation. A processor is a 9U crate with 15 Processing Boards, 3 Muon Selection Boards and a Controller. All these boards are interconnected through a custom backplane, as shown in Figure 3.5.

Figure 3.5: The data flow of a processor connected to a quadrant of the muon detector.

The architecture is fully synchronous, pipelined and massively parallel. The processing frequency is 40 MHz. The data-exchange frequency between boards and between PUs is 80 MHz.

In this section we present briefly the technical design of these components. A detailed description can be found in [27, 31].

3.4.1 ODE Trigger interface

The main task of the muon front-end electronics is to form the logical channels, to tag each logical signal with a bunch-crossing identifier and to send time-aligned data to the trigger [6], as shown in Figure 3.6. The building of the logical channels from the physical ones is performed by 7632 Front-End boards mounted on the detector and by 152 Intermediate Boards. The remaining tasks are handled by 148 Off Detector Electronics (ODE) boards located on the left and right sides of the muon detector.

Figure 3.6: Simplified scheme of the muon front-end architecture.

A trigger interface located on an ODE board receives up to 192 logical channels and pushes twelve 32-bit words every 25 ns onto 12 high-speed optical links grouped in one optical ribbon cable. The Gigabit Optical Link transmitter (GOL) [32], developed at CERN, encodes and serializes the 32-bit word with its clock using the 8B/10B protocol. The resulting 1.6 GHz electrical signal is converted to an optical signal by a ribbon transmitter from Agilent.

The jitter of the input clock driving the GOL has to be lower than the jitter of the clock delivered by the TTCrx components by about a factor of three, to guarantee a bit error rate below 10^-12. A filtering circuit is therefore implemented in the interface. It will be either the radiation-hard jitter-filter ASIC from CERN, named QPLL [33], or a discrete narrow-bandwidth phase-locked loop (PLL) controlling a voltage-controlled crystal oscillator [34].

Figure 3.7: The prototype of the ODE Board with its trigger interface.

We have developed a prototype of a ribbon of high-speed optical links with a filter based on a narrow-bandwidth PLL controlling a voltage-controlled crystal oscillator [34]. We obtained a bit error rate below 10^-15 with the TTC clock. The effect of single-event upsets has been estimated for the part of the ribbon optical link implemented in the ODE; we obtain an equivalent bit error rate of 3 × 10^-11 in the radiation environment of the muon detector. However, the cross-section of the optical ribbon transmitter still has to be measured.

Figure 3.7 shows a photograph of the prototype of the ODE board with its trigger interface.

The 148 ribbon cables coming from the ODE boards are connected to a passive patch panel. It merges the fibres related to a tower coming from different stations into a single output ribbon. The input cables are about 80 m long; the output ribbons are connected to the processing boards.

3.4.2 Processing Board

The diagram of the processing board [31] is shown in Figure 3.8. A board contains:

• two receivers for two ribbon optical links, corresponding to 24 single optical channels;

• six FPGAs: one for each PU, one for the BCSU (Best Candidates Selection Unit) and one for the L1MU (L1 Management Unit);

• eight look-up tables;

• one ECS interface based on a credit-card PC [9];

• one interface to the custom backplane.

The hardware of the processing board is unique, but the programming of the 60 PUs housed in a processor depends on their location in the system.

Six optical channels coming from a tower are connected to a PU. The corresponding FPGA receives six 16-bit words, their 80 MHz clocks and 6 × 2 control bits. The input data are time aligned using the bunch-crossing identifier and the control bits. Data are exchanged with the neighbouring PUs, and the muon-finding algorithm is executed. Table 3.4 shows the maximum information exchange between PUs. Each PU outputs a 38-bit word with the addresses of the hits in stations M1, M2 and M3 for the two candidates, the bunch-crossing identifier and a status.

For each candidate, the pT is computed using a look-up table and encoded in an 8-bit word. The look-up table is implemented in a 32k × 8 static RAM.
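As an illustration, the 15-bit RAM address can be formed from the candidate's M1 and M2 coordinates. The address layout below is purely an assumption on our part, since the actual mapping is fixed by the board design:

  #include <array>
  #include <cstdint>

  // 32k x 8 static RAM: a 15-bit address returns the encoded pT.
  using PtLut = std::array<uint8_t, 1 << 15>;

  // Hypothetical address layout: low bits from the M2 pad, high bits
  // from the M1 pad.
  uint8_t lookupPt(const PtLut& lut, uint8_t m1Pad, uint8_t m2Pad) {
      const uint16_t address = (uint16_t(m1Pad & 0x7F) << 8) | m2Pad;
      return lut[address & 0x7FFF];  // 8-bit encoded pT
  }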

Figure 3.8: Scheme of the processing board, where the data flow is shown for only one PU.

Table 3.4: The number of logical channels exchanged between PUs. The busses named "Horizontal", "Vertical" and "Crossing" link PUs located on the same board, while the bus named "Backplane" connects PUs spread over several boards.

                        Top             Bottom
                    Left   Right    Left   Right
  Backplane From     94     90       88     86
  Backplane To      110     82       96     96
  Vertical From      42     42       42     42
  Vertical To        42     42       42     42
  Horizontal From    82     72       72     82
  Horizontal To      72     82       82     72
  Crossing From       2     12       12      2
  Crossing To         2     12       12      2

The next step is performed by the BCSU, which selects the two candidates with the highest pT among the eight proposed by the PUs. The results are stored in a 60-bit word, which is sent to the muon selection board via an 80 MHz point-to-point connection. It contains the addresses of the two candidates in stations M1, M2 and M3, their pT, the bunch-crossing identifier and a status.

Each PU and BCSU houses a L0 buffer and its derandomizer buffer, which receive the input and output words of the component. Their widths are 512 bits for a PU and 320 bits for a BCSU.

Each L0 buffer is connected to an L1 buffer through a 16-bit-wide bus running at 40 MHz. The L1MU activates the transfers between the L0 and L1 buffers, and between the L1 buffer and the controller board; it also houses the L1 derandomizer buffer. An L1 buffer is implemented with a 1M × 16 synchronous dynamic RAM. The data are transferred to the controller via a serial point-to-point 4-bit-wide link.

Figure 3.9 shows a photograph of the prototype of the Processing Board. This prototype is very close to the final design, except for the size of the L1 buffer, which is too small. Each FPGA exchanges data with its neighbours at 80 MHz and is connected to the ECS interface.

3.4.3 Muon selection board

The Muon Selection Board contains only one FPGA, with a functionality very similar to that of the Best Candidates Selection Unit. It is connected to five processing boards and receives their best candidates. The chip selects the two candidates with the highest pT among the 10 proposed. The 60-bit output word is sent to the controller board through an 80 MHz point-to-point connection.


Figure 3.10: Scheme of the controller board.

3.4.4 Controller board

The diagram of the controller board is shown in Figure 3.10.

The controller board is connected to the three muon selection boards and receives their best candidates. An FPGA with programming very similar to that of the BCSU selects the two candidates with the highest pT among the six proposed. The result is encoded into two 32-bit words, one for each candidate, containing the address of the candidate in station M1, its pT value and a status. These two words are sent to the L0 Decision Unit. The inputs and output of this component are stored in a 320-bit-wide L0 buffer.

The controller board is also linked to the 15 processing boards of the crate, to receive the output of their L1 derandomizer buffers. An FPGA builds the events, strips duplicated information such as the bunch-crossing and event identifiers, and applies a zero-suppression algorithm. The output is sent to the data acquisition system via two Gigabit Ethernet links.

The controller board is also the interface with the TTC system [12]. It receives the TTC signals through the TTC receiver chip and distributes them to the processing and controller boards via dedicated links running on the backplane.

Figure 3.9: The prototype of the Processing Board, where a PU is implemented in one FPGA with 600,000 gates and 652 pins.

3.4.5 Backplane

The backplane distributes the power supplies (+5 V, +3.3 V, +48 V, GND), the main 40 MHz clock and the service signals coming from the TTC system. Clocks are sent individually from the controller to each processing board by point-to-point links, while service signals are broadcast on a common bus at 40 MHz.

The backplane connects:

• processing boards together, to exchange neighbouring information;

• a processing board to a muon selection board, and the muon selection boards to the controller;

• each processing board to the controller, to transfer the content of the L1 buffers.

All these connections rely on point-to-point links running at 80 MHz.

Analog simulations have been made to find the most appropriate impedance-matching scheme. All point-to-point links are terminated either on the processing-board or on the controller-board side, while bussed signals are adapted on the backplane. Table 3.5 summarizes the number of pins required to connect a board to the backplane. We implement a compact PCI connector with a shield and guide lugs for centering.

Table 3.5: Number of pins required to connect a processing, muon selection or controller board to the backplane.

            Processing   Muon selection   Controller
            board        board            board
  Signal     443          163              219
  Ground     198          198              198
  Power      106          106              106
  Free        43          323              267
  Total      790          790              790

3.4.6 Latency

The latency for the muon trigger is 1050 ns (42 LHC clock cycles of 25 ns), as shown in Table 3.6, well within the specification given in Chapter 5. It starts with the first data arriving on the fastest optical link and ends when the results are serialized on the link connected to the L0 Decision Unit.

Table 3.6: Breakdown of the latency for the muon trigger, expressed in terms of LHC clock cycles.

                                        Clock cycles
  Optical link deserialization             2
  Optical link synchronization             4
  Neighbouring exchange                    5
  Muon tracking                            1
  M1 pad finding                           1
  Candidate selection within a PU          5
  pT computation                           1
  pT selection within a board              4
  Final selection                         17
  Serialization to L0DU                    2
  Total                                   42


3.4.7 DAQ Event size

We log the input and output words of each processing element in the L0 buffers, to monitor the trigger during data taking and to trace any bias which might be introduced. The target event size is about 1 kByte after zero suppression.

3.4.8 Debugging and monitoring tools

The strategy to debug a processing board or a processor relies on the ECS interface, the L0 buffers and a simulation of the hardware. The ECS interface can fill specific buffers located in the PUs with test patterns. They emulate the data transported by the optical links for 16 consecutive events. The test buffers inject data in place of the buffers receiving the output of the optical transceiver stage. The trigger then runs on 16 consecutive events and stops. The input values provided by the test buffers, the neighbouring data from adjacent PUs and the pT computation are logged to the L0 buffers and systematically transferred to the L0 derandomizer buffers. The ECS interface can read the content of the L0 derandomizer buffers, and we can compare the results with the expected values provided by a simulation of the PUs and BCSUs.

Chapter 4 Level-0 Pile-Up System

Upstream of the VELO system [7], a set of two planes of silicon strip detectors is used to determine the number of primary interactions within one bunch crossing. The silicon detectors of this Pile-Up system are equipped with special fast readout electronics, to allow their data to be made available at Level-0. LHCb aims to run at an average luminosity of 2 × 10^32 cm^-2 s^-1; however, to achieve this average, all sub-systems must be able to cope with luminosities up to 5 × 10^32 cm^-2 s^-1. Figure 4.1 shows the rate of crossings with their expected number of visible interactions, in the luminosity range for which the spectrometer has been optimised. (Chapter 7 describes the physics simulation and defines visible interactions, which are expected to have a cross-section σvisible = 63 mb.)

Figure 4.1: Rate of crossings with their number of pp interactions, assuming σvisible = 63 mb, as a function of luminosity.

Crossings with multiple interactions trigger at Level-0 and the subsequent trigger levels based more on combinatorics than on genuine b-decay candidates, and in addition tend to occupy a disproportionately large share of the event-building bandwidth and the available processing power. Removing these crossings can even give a gain in the number of signal events collected, since other trigger cuts can then be relaxed to saturate the allowed bandwidth. Note that the Pile-Up system detects only tracks in the backward direction, and hence it cannot mistake B decays in the acceptance of LHCb for pile-up interactions.

The Pile-Up system also provides a relative measurement of the luminosity, since in the luminosity range of LHCb the rates of crossings with zero, one and multiple interactions allow its determination using Poisson statistics.
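The Poisson argument can be made concrete: if a fraction f0 of bunch crossings shows zero visible interactions, the mean number of interactions per crossing is µ = −ln f0, and the luminosity follows from µ, the colliding-bunch frequency and σvisible. A small illustration (all numbers are toy inputs of our own choosing):

  #include <cmath>
  #include <cstdio>

  // Poisson statistics: P(0) = exp(-mu), so mu = -ln(f0), and
  // L = mu * f_crossing / sigma_visible.
  double luminosity(double zeroFraction,
                    double crossingFreqHz,     // colliding-bunch frequency
                    double sigmaVisibleCm2) {  // 63 mb = 63e-27 cm^2
      const double mu = -std::log(zeroFraction);
      return mu * crossingFreqHz / sigmaVisibleCm2;
  }

  int main() {
      const double L = luminosity(0.53, 30.0e6, 63.0e-27);
      std::printf("L = %.2e cm^-2 s^-1\n", L);  // ~3e32 for these inputs
  }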

4.1 Concept

The Pile-Up system [35]-[37] consists of two planes (A and B) perpendicular to the beam-line and located upstream of the VELO, as shown in Figure 4.2. Each plane consists of two overlapping VELO R-sensors [38], which have strips at constant radii, each strip covering 45°. In both planes the radii of the track hits, ra and rb, are recorded.

Figure 4.2: Top view of the layout of the VELO planes and the Pile-Up detector planes A and B, at -22.0/23.5 and -30.0/31.5 cm respectively. The interaction region containing 95% of the luminosity is expected to be 16 cm wide along the beam-line, and is indicated as well.


The hits belonging to tracks from the same origin obey the simple relation k = rb/ra, giving

  zv = (k za − zb) / (k − 1)    (4.1)

where za and zb are the detector positions and zv is the position of the track origin on the beam axis, i.e. the vertex. The equation is exact for tracks originating from the beam-line. All hits in the same octant of both planes are combined according to Equation 4.1, and the resulting values of zv are entered into an appropriately binned histogram, in which a peak search is performed, as shown in Figure 4.3.

Figure 4.3: Basic principle of detecting vertexes in a crossing. The readout hits of planes A and B are combined in a coincidence matrix. All combinations are projected onto a zv histogram. The peaks indicated correspond to the two interaction vertexes in this particular Monte-Carlo event. After the first vertex finding, the hits corresponding to the two highest bins are masked, resulting in the hatched histogram.

The resolution on zv is limited to around 3 mm by multiple scattering and by the hit resolution of the radial measurements. To limit the number of channels which have to be processed, four neighbouring strips of the sensors are OR-ed on the FE chip, and hence the latter effect dominates. All hits contributing to the highest peak in this histogram are masked, after which a second peak is searched for. The height of this second peak is a measure of the number of tracks coming from a second vertex, and a cut is applied on this number to select crossings.
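A compact sketch of this two-peak search is given below. The binning, the ranges and the masking of a single highest bin are illustrative simplifications of the procedure described above:

  #include <algorithm>
  #include <array>
  #include <vector>

  constexpr int    kBins = 100;
  constexpr double kZmin = -15.0, kZmax = 15.0;   // cm, illustrative range

  // Bin of the vertex estimate from one A-B hit combination (Eq. 4.1),
  // or -1 if the combination falls outside the histogram.
  int vertexBin(double ra, double rb, double za, double zb) {
      const double k = rb / ra;
      if (k == 1.0) return -1;
      const double zv = (k * za - zb) / (k - 1.0);
      const int b = int((zv - kZmin) / (kZmax - kZmin) * kBins);
      return (b >= 0 && b < kBins) ? b : -1;
  }

  std::array<int, kBins> fill(const std::vector<double>& ra,
                              const std::vector<double>& rb,
                              const std::vector<bool>& maskA,
                              const std::vector<bool>& maskB,
                              double za, double zb) {
      std::array<int, kBins> h{};
      for (size_t i = 0; i < ra.size(); ++i)
          for (size_t j = 0; j < rb.size(); ++j)
              if (!maskA[i] && !maskB[j]) {
                  const int b = vertexBin(ra[i], rb[j], za, zb);
                  if (b >= 0) ++h[b];
              }
      return h;
  }

  // Find the highest peak, mask every hit contributing to it, refill,
  // and return the height of the second peak (the quantity cut on).
  int secondPeakHeight(const std::vector<double>& ra,
                       const std::vector<double>& rb,
                       double za, double zb) {
      std::vector<bool> maskA(ra.size(), false), maskB(rb.size(), false);
      auto h = fill(ra, rb, maskA, maskB, za, zb);
      const int peak = int(std::max_element(h.begin(), h.end()) - h.begin());
      for (size_t i = 0; i < ra.size(); ++i)
          for (size_t j = 0; j < rb.size(); ++j)
              if (vertexBin(ra[i], rb[j], za, zb) == peak)
                  maskA[i] = maskB[j] = true;
      h = fill(ra, rb, maskA, maskB, za, zb);
      return *std::max_element(h.begin(), h.end());
  }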

4.2 Simulation Results

The effect on the number of signal events is shown in Figure 4.4, where for different signal channels the combined Level-0 and Level-1 efficiency is plotted as a function of a cut on the number of track candidates from the second-largest-multiplicity vertex detected by the Pile-Up system, at a luminosity of 2 × 10^32 cm^-2 s^-1. For every Pile-Up cut, the thresholds of the other trigger variables are modified to fill the allowed bandwidth of the two trigger levels. The Level-0 algorithm described in Chapter 7 is found to give better results for channels with muons if the Pile-Up cut is not applied when the sum of the transverse momenta of the two largest-pT muons is above its threshold, and these channels therefore show a different sensitivity compared to hadronic decays. Figure 4.4 also shows that the system reduces the average event size, which allows a smaller event-building network in Level-1, and reduces the necessary processing time in subsequent trigger levels. The gain in event yield due to pile-up detection increases with luminosity, as shown in Figure 4.5, where the expected yield of offline-reconstructed and triggered Bs → DsK events is given for the nominal Pile-Up cut of 3 and without the Pile-Up system.

Figure 4.4: Combined L0×L1 efficiency for three physics channels (Bs → DsK, Bd → π+π−, Bs → J/ψ(µµ)φ) as a function of the cut on the number of tracks detected in the second vertex. Also shown is the efficiency without the Pile-Up system (No Cut). The bottom plot shows the corresponding number of VELO+TT clusters and the Level-1 execution time, normalized to their "No Cut" values, in minimum-bias events after Level-0. The nominal Pile-Up cut is indicated by the dashed line.

Figure 4.5: Expected yield of Bs → DsK events per year after Level-1, as a function of the luminosity, with and without the Pile-Up system. The cut on the number of tracks in the second vertex is 2 and 3, respectively.

The number of hits detected in the Pile-Up system gives a measure of the charged-track multiplicity in the event close to the primary vertex, and is used in combination with the SPD multiplicity, in addition to the number of interactions, to flag 'complicated' events [39].

4.3 Technical Design

The Pile-Up system is an integral part of the VELO [7] as far as the sensors, their mechanical mounting and the readout of the analog pipeline after Level-0 are concerned. The control system and the power supplies are also identical to those of the VELO. In addition, however, the Pile-Up system uses the signals of the integrated comparators of the Beetle chips [40] on its four hybrids. The outputs of four neighbouring comparators are OR-ed, resulting in 256 LVDS links per hybrid running at 80 Mbit/s, which send the L0 signals via the Repeater Station on the vertex tank to an Optical Transmission Station. From there, the data of the 1024 signal pairs are transferred via optical links to the trigger logic, which is located in the radiation-free electronics barracks. Figure 4.6 shows an overview of the system.

The radiation levels at the hybrid, the Repeater Station and the Optical Transmission Station are given in Table 4.1. The sensors are located in a radiation area, necessitating their replacement every few years, as for the VELO. The hybrids, although far less sensitive, then have to be replaced as well. For the Pile-Up system, no active elements will be placed at the Repeater Station. The radiation level at the Optical Station is tolerable for the use of commercial optical links at that location.

The design of the optical links of the muon system (Chapter 3) will be followed. Timing information is lost when using serial optical transmission links; therefore a time stamp, consisting of part of the BCID, is included in the data.

Table 4.1: Yearly radiation levels at several locations of the Pile-Up system electronics.

  Location            Dose
  Hybrid              2 kGy
  Repeater Station    200 Gy
  Optical Station     <1 Gy

Figure 4.6: Overview of the Pile-Up system.

Hence a TTCrx chip is included in the Optical Transmission Station, while the receiving side is passive. The time-stamp data occupy part of the optical links. The number of needed connections is then 2 ribbons per hybrid, giving 96 optical fibres in total.

4.3.1 Beetle Chip

The Beetle chip [40] is used in LHCb in the VELO, TT and IT stations, in the RICH and in the Pile-Up system. It is designed in a commercial 0.25 µm CMOS technology and has a die size of 6.1 × 5.4 mm². In the case of the VELO and Pile-Up systems, the readout chip will be positioned only 5 cm from the LHC beam, and the Beetle has been designed to be radiation hard and to avoid the risk of single-event latch-up. The radiation hardness of the Beetle has been demonstrated up to 300 kGy [41].

Figure 4.7: Prototype VELO hybrid with a 182° Si-detector mounted. The detector strips are circular arcs; the pitch increases with the distance to the beam axis.

Beetle 1.1 was used to check the full analog operation [42], including a 16-chip hybrid with a prototype R-sensor, as shown in Figure 4.7.

In the Beetle chip, four detector input channels are combined, at the cost of a decrease in vertex accuracy, by a logic OR at the comparator stage, providing fast signals that can immediately be used in the Level-0 trigger system (Figure 4.8). Two groups are multiplexed onto one output line, giving 16 LVDS outputs at 80 Mbit/s per chip. There are two output modes: (a) tracking mode, with a time-over-threshold pulse, or (b) pulse mode, with a pulse of one clock period. The latter, in which one signal will not produce spillover in the next bunch crossing, is used. The comparator part has been tested for single chips [43]. Typical threshold curves for the Beetle 1.1 are shown in Figure 4.9. The sigma on the threshold is about 0.07 MIP. Despite a satisfactory performance in noise and efficiency for single channels, not all channels could be operated in discriminator mode, due to a too-large offset spread combined with a too-small range of the DACs that set the individual thresholds. These deficiencies have been rectified in the Beetle 1.3 design.

Figure 4.8: Comparator, pipeline and output part of the Beetle chip.

Since the Pile-Up sensors and hybrids are not inside the acceptance of LHCb, sensors thicker than those of the VELO, i.e. 300 µm, will be used to obtain more signal, and the hybrid design is not limited by radiation-length considerations, allowing easier suppression of common mode. The hybrid shown in Figure 4.7 has been tested in the CERN testbeam [42]. Based on this VELO design, an eight-layer Pile-Up hybrid has been designed for the Beetle 1.2/1.3, now including the LVDS outputs of the discriminators.

Figure 4.9: Threshold scans for three Beetle 1.1 comparator channels with different offsets.

4.3.2 Prototype Vertex Finder Board

Since the Vertex Finder Board (VFB) is the most complicated board of the processing system, it has been prototyped first, as a 6U VME board (Figure 4.10). The trigger algorithm has been partitioned into code for two FPGAs. The XCV3200E is, with 4 M system gates, the largest FPGA of the Xilinx Virtex-E 1.8 family. With the present design the device utilisation is about 70%, close to the maximum possible. Timing analysis and tests on the prototype board show that the required 40 MHz operation is feasible. The present implementation of the trigger algorithm in the FPGAs takes 50 steps of 25 ns, close to initial predictions. Additional steps are required for data alignment and serialisation. It is being studied whether the algorithm can be optimized further, to regain some processing steps and fulfill the overall latency requirements.

Figure 4.10: Prototype Vertex Finder Board.

Figure 4.11: Prototype Vertex Finder Board diagram.


A Test Board with a smaller FPGA provides the logic to supply test patterns to the Vertex Finder Board. The test patterns are loaded via VME into the FPGA, which stores them in memory.

In Figure 4.11 the schematics of the prototype Vertex Finder Board are shown. In the left FPGA, all hit patterns of the Si-detectors are first stored in a coincidence matrix: hits from tracks with the same origin have an equal ratio k = rb/ra, and all channel combinations are stored in the matrix. A z-histogram is formed by summing all entries of the wedges between lines of constant ratio k in the matrix. The number of processing steps is always the same; it does not depend on whether detector strips are hit or not. A linear search for the highest peak in the histogram is performed. The input bits related to that peak (hence having the same k value) are then removed from the data stream that is passed on to the next FPGA, where the second-highest peak is searched for. All processing is pipelined, in 100 ns intervals. The results are output via LVDS lines.

4.3.3 Trigger System Architecture

Input data are routed onwards by the Multiplexer Boards. The processing of the vertex-finding algorithm (Figure 4.12) is performed in the Vertex Finder Boards. The Multiplexer Boards distribute the events by round-robin scheduling; each Vertex Finder Board processes one event, as indicated. The processing results are de-multiplexed by the Output Board, which interfaces the processor system to the central Level-0 trigger. Just one 9U/40 cm crate will be needed for the whole processor system. The crate layout and the internal connections are shown in Figure 4.13.

Multiplexer Boards

The number of input signals is 1024. Eight optical ribbons will be connected to four Multiplexer Boards; the optical-to-electrical transition takes place directly at the Multiplexer Board level. Each Multiplexer Board is connected to all Vertex Finder Boards via point-to-point connections running at 80 Mbit/s, using PCI connectors and cables at the backplane. In the Multiplexer Board the data will be round-robin routed by an FPGA to the Vertex Finder Boards. The input data are also copied directly into memory (L0 buffer) for inclusion in the DAQ chain; a TELL1 board [16] will be used for that purpose. Noisy channels can be masked both at the Beetle level and at the Multiplexer Board level.

Figure 4.12: Pile-Up processing plus the data multiplexing and serialising scheme.

Figure 4.13: System crate layout and data connections.


Vertex Finder Boards

In total, four Vertex Finder Boards are planned to be used, with each board handling every fourth event. Minor configuration parameters, such as threshold levels, should be easily adaptable. Algorithms for different beam or geometrical conditions will be pre-programmed and loaded on demand. The binning of the vertex histogram and the masking width can be adapted.

Future FPGAs are expected to be even larger than the XCV3200E, providing the possibility to combine all tasks in just one FPGA. The specific elements for the luminosity processing still have to be defined in detail and will also require extra FPGA resources.

Output Board

The Output Board is a simple board that combines the inputs of the Vertex Finder Boards and outputs the data to the L0DU. The trigger information (0, 1, 2 interactions) is histogrammed at the L0DU level. These histograms will be the basis for determining the luminosity with the Pile-Up system.

Latency

The breakdown of the latency of the Pile-Up system is given in Table 4.2.

Debugging and Monitoring

The following items have to be monitored regularly:

• Noisy channels: channels that give spurious hits could flood the processing system with uncorrelated entries. Such channels should be looked for automatically and removed from the input of the processing system.


Table 4.2: Breakdown of the latency for the Pile-Up system.

                                  Time [ns]
  Beetle                              50
  Copper cable                        90
  Optical Transmission Station       125
  Optical fibre                      270
  Multiplexer Board                  250
  Algorithm                         1175
  Output Board                       150
  To L0DU                             90
  Total                             2200

• Optical Station: the ECS interface can fill specific buffers located at the Optical Station with test patterns. They emulate the data transported by the optical links for consecutive events.

• Readout: the input signals of the Pile-Up system are transferred to the DAQ. Regular offline checks have to be performed to verify that the results of the online and offline processing agree.

• Vertex checks: the number of entries in the histograms and the location of the vertices should be followed closely, to check the behaviour of the system and the machine background conditions.

• Processing: test patterns can be fed into the Vertex Finder Boards to check the overall processing of the system.

Although the requirements on alignment are not very stringent (Δr < 100 µm), a check on the correct position of the detectors is necessary. This check is part of the overall VELO geometry alignment.

Chapter 5 Level-0 Decision Unit

The Level-0 Decision Unit (L0DU) receives information from the Calorimeter, Muon and Pile-Up sub-triggers at 40 MHz; the inputs arrive at different, but fixed, times. The L0DU latency budget is 500 ns, counted from the latest arrival of the sub-system data. Table 5.1 lists the breakdown of the latency budget for the Level-0 sub-systems.

Table 5.1: Breakdown of the latency of Level-0, in ns.

                  Muon    Calo    Pile-Up
  TOF + cables     975     850     1000
  Processing      1200    1200     1200
  Subtotal        2175    2050     2200
  L0DU                     500
  RS + TTC → FE            800
  Total = max             3500
  Contingency              500
  Total latency           4000

The computation of the decision can start with a subset of the information coming from a Level-0 sub-trigger, after which the sub-trigger information is time aligned. An algorithm is executed to determine the trigger decision, and a summary bank (L0Block) is constructed. The L0Block is made available to Level-1 and the HLT. The decision is sent to the Readout Supervisor [15], which has the ultimate decision about whether or not to accept an event. The Readout Supervisor is able to generate and time-in all types of self-triggers (random triggers, calibration, etc.) and to control the trigger rate by taking into account the status of the different components, in order to prevent buffer overflows and to enable or disable the triggers at appropriate times during resets, etc.

The L0DU performs simple arithmetic combining the signatures into one decision per crossing. It can set several thresholds per candidate, and allows the downscaling of triggers. It can also base its decision on some information from the two preceding and two subsequent crossings, and this information is also included in the L0Block. It will monitor the Level-0 performance with counters which are made available via the ECS, and allows quick interrogation of the trigger source via an explanation word included in the L0Block. Despite the available flexibility, the results presented in Chapter 7 are based on a simple algorithm, which sets thresholds on the ET of all the candidates. If the SPD and Pile-Up multiplicities, or the number of tracks from a second vertex, are above a given value, the event is tagged as a Pile-Up event and rejected. In addition, events are accepted if the sum of the two largest muon transverse momenta, Σpµ_T, is larger than a threshold, irrespective of the Pile-Up-event tag.
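This simple algorithm fits in a few lines; the structure below is our illustration of it, not the actual register map of the board:

  #include <cstdint>

  struct L0Input {
      uint8_t  electronEt, photonEt, pi0LocalEt, pi0GlobalEt, hadronEt;
      uint16_t spdMultiplicity, pileUpMultiplicity, tracksSecondVertex;
      uint16_t sumTwoMuonPt;  // sum of the two highest-pT muons
  };

  struct Thresholds {
      uint8_t  et;            // a single candidate threshold, for brevity
      uint16_t spd, pileUp, secondVertex, dimuon;
  };

  // One decision per crossing: candidate ET thresholds, a pile-up veto,
  // and a di-muon condition that ignores the veto.
  bool l0Decision(const L0Input& in, const Thresholds& t) {
      if (in.sumTwoMuonPt > t.dimuon) return true;  // veto does not apply
      const bool pileUpEvent = in.spdMultiplicity    > t.spd ||
                               in.pileUpMultiplicity > t.pileUp ||
                               in.tracksSecondVertex > t.secondVertex;
      if (pileUpEvent) return false;
      return in.electronEt  > t.et || in.photonEt    > t.et ||
             in.pi0LocalEt  > t.et || in.pi0GlobalEt > t.et ||
             in.hadronEt    > t.et;
  }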

The L0DU will be installed in the barracks. It is a custom-built board [44], implemented using 40 MHz pipelined logic in an FPGA.

5.1 L0DU inputs

Table 5.2 summarizes the L0DU input/output ports.

Each L0 trigger processor sends its data synchronously, with its own latency.


Table 5.2: L0DU input/output summary.

  External system      I/O   # bits
  CALO                 I      224 @ 40 MHz
  MUON                 I      256 @ 40 MHz
  Pile-Up              I       64 @ 40 MHz
  Reserved             I       96 @ 40 MHz
  Readout Supervisor   O       16 @ 40 MHz
  L1                   O      704 @ 1 MHz
  HLT                  O     1024 @ 40 kHz
  ECS                  IO     -

The trigger processor data fit into normalized 32-bit words, each corresponding to a candidate. Each word includes a bunch identification number, allowing the synchronization between the data sources.

• The Calorimeter trigger (see Chapter 2) sends seven words to the L0DU. They correspond to the highest-ET electron, photon, local π0 [23], global π0 and hadron trigger candidates, the sum of the total transverse energy measured in the HCAL (Total ET), and the total multiplicity of the event, NSPD, measured with the SPD detector [45].

• The Muon trigger processor (see Chapter 3) sends eight words to the L0DU, corresponding to eight muon candidates, two per quadrant of the muon system.

• The Pile-Up system (see Chapter 4) sends two words to the L0DU, indicating the number of tracks per interaction.

A total of 20 × 32 bits at 40 MHz is expected as input to the L0DU. Only one electrical standard will be used on the L0DU input interface, in order to simplify testing and maintenance.

5.2 L0DU overview

For each data source, a Partial Data Processing system performs a specific part of the algorithm and the synchronisation between the various data sources. A trigger definition unit then combines the information from these systems to form a set of trigger conditions based on multi-source information (Figure 5.1).

The trigger conditions are logically OR-ed to obtain the L0DU decision, after they have been individually downscaled if necessary.

A decision word (16 bits) is then sent to the Readout Supervisor [15]. This word includes the decision itself (1 bit) and 12 bits for the bunch number; one additional bit is reserved for a forced trigger. The L0Block is built for each event and stored in pipeline memories while waiting for the L0 accept signal; it is then transferred to the Level-1 buffer, and a subset is sent to the Level-1 trigger.
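Only the field widths of the decision word are fixed above, so the bit positions in the following packing sketch are an assumption of ours:

  #include <cstdint>

  // 16-bit decision word: decision (1 bit), bunch number (12 bits),
  // forced trigger (1 bit); remaining bits left free. The bit
  // positions chosen here are illustrative only.
  uint16_t packDecisionWord(bool decision, uint16_t bunchNumber, bool forced) {
      uint16_t w = 0;
      w |= uint16_t(decision) << 0;        // bit 0: L0 decision
      w |= (bunchNumber & 0x0FFF) << 1;    // bits 1-12: bunch number
      w |= uint16_t(forced) << 13;         // bit 13: forced trigger
      return w;
  }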

Figure 5.1: L0DU logical architecture.

The ECS control interface manages the algorithm parameters, the algorithm behaviour, reset scenarios, slow control, online debugging and monitoring, and many other tasks such as FPGA programming.

Like a detector Front-End electronics board, the decision unit is able to send data to L1 and the HLT.


5.3 L0DU Prototype

An L0DU prototype was assembled at the beginning of 2002. It is a simplified version of what is foreseen for the final L0DU: it has neither ECS nor TTC connection, and only a reduced number of inputs (96 bits) and outputs (32 bits). Inputs and outputs are transmitted in LVDS format at 40 MHz via standard Ethernet cables. Figure 5.2 shows the prototype of the decision unit being tested. This prototype offers maximum flexibility and adaptability for testing a large part of the final L0DU functionality, including the L0Block building operations.

Figure 5.2: L0DU prototype and its test set-up

To avoid connections between different modules, the final L0DU will be implemented on a single board.

In the prototype, a single FPGA would have been sufficient in terms of the number of inputs/outputs and the internal memory resources but, to be more realistic with respect to the final version, the prototype was made of five interconnected ACEX1K100 FPGAs.

Several simple algorithms were implemented and tested successfully [46], with the latencies of the various sources emulated.

After a first step of time alignment of the different sources, thresholds are applied on the data. Each intermediate condition is individually downscaled, rate divided or masked according to a given set of parameters. Finally, the decision is taken if a combination of conditions is realised.
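A minimal software sketch of this decision step (the structure and parameter names are invented for illustration; the real L0DU implements this logic in FPGA firmware):

    #include <cstdint>
    #include <vector>

    // Each intermediate condition can be masked or downscaled (keep only
    // every N-th occurrence); the surviving conditions are then ORed.
    struct Condition {
        bool satisfied;          // threshold comparison after time alignment
        bool masked;             // condition disabled by parameters
        std::uint32_t divider;   // downscale factor N (1 = no downscaling)
        std::uint32_t counter;   // occurrences seen so far
    };

    bool l0duDecision(std::vector<Condition>& conditions) {
        bool accept = false;
        for (auto& c : conditions) {
            if (!c.satisfied || c.masked) continue;
            if (++c.counter % c.divider == 0)   // survives the downscaling
                accept = true;                  // logical OR of conditions
        }
        return accept;
    }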

Tests of the L0DU require the design of a specific test bench; a first version was built to test the L0DU prototype. It is made up of several “memory” boards synchronized by a “clock generator” board. Each board provides 64 bi-directional inputs/outputs, driven or received over standard Ethernet cable with LVDS levels at a 40 MHz frequency. The memory boards are used both to store the stimuli and to record the outputs of the tested system; the maximum number of data words is 2¹⁶ in the current design. Three memory boards and one clock generator were used to test the first L0DU prototype (Figure 5.2).

The user-defined stimuli and the data from the system under test are downloaded or read out through a VME bus.

In addition, the clock generator board delivers a synchronization signal as a reference for the whole test bench timing. Up to 16 memory boards can be synchronised.

A C++ acquisition program running on a Linux PC controls the memory boards through VME. It makes a bit-to-bit comparison of the L0DU prototype outputs with the results of a C++ simulation performed with the same stimuli.
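The principle of the comparison is simple; a condensed sketch (the function below is our illustration, not the actual acquisition code):

    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Bit-to-bit comparison of the words read back from the memory boards
    // over VME against the words produced by the C++ simulation run on the
    // same stimuli. Any differing bit is reported.
    std::size_t compareOutputs(const std::vector<std::uint32_t>& hw,
                               const std::vector<std::uint32_t>& sim) {
        std::size_t mismatches = 0;
        const std::size_t n = std::min(hw.size(), sim.size());
        for (std::size_t i = 0; i < n; ++i) {
            if (std::uint32_t diff = hw[i] ^ sim[i]) {
                ++mismatches;
                std::printf("word %zu: hw=%08x sim=%08x diff=%08x\n",
                            i, hw[i], sim[i], diff);
            }
        }
        return mismatches;
    }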

The prototype was fully functional. Thanks to its synchronous pipelined architecture, it should scale easily to the final version of the unit, which will use recent FPGA technologies providing more input/output ports in many formats and more internal memory.

5.4 Studies for the final implementation

In order to reduce the number of incoming cables and connectors, the data will be serialized. Optical links, allowing the transfer of 16 bits at 80 MHz, are a good candidate to exchange data with the L0 sub-triggers.

Most sub-detectors of LHCb use the TELL1-board [16] for buffering their data and interfacing to L1 and the HLT. This board includes many components needed by the L0DU; it is described in more detail in Section 6.1.2. Its use to implement the L0DU directly is appealing and under study.

The TELL1-board has modular input mezzanines, on which the receiver part is implemented, linked to the mother board through 4 connectors. The L0DU itself could be a specific mezzanine (9U height and about 16.5 cm wide) including:

• input interfaces from trigger processors (re-using the design of the optical mezzanines);

• the FPGA-implemented L0DU algorithms;

• output interfaces to the Readout Supervisor with LVDS signals;

• Level-0 pipeline.

The mother board connectors would be used to receive TTC and ECS signals and to send data to L1 and the HLT. The L0DU would thus remain a specific board, but strongly linked to the TELL1-board.

5.5 Debugging and monitoring

A simplified version of the external test bench described in Section 5.3 is foreseen to be integrated on the L0DU. It will verify that the L0DU is still working correctly, by injecting data patterns designed to allow an easy and fast diagnosis of possible problems. The full debugging and maintenance will meanwhile be performed with the external test bench; a copy of the L0DU will be maintained permanently to ensure the full availability of the unit.

Online monitoring functions will be implemented through ECS. Counters will provide statistics on the decisions taken and on intermediate results, allowing a measurement of the L0 trigger performance.

Chapter 6 Level-1 and High Level Trigger

In this chapter we describe the technical design of Level-1 and the HLT, addressing both their hardware implementation and the algorithms used to take the trigger decisions.

Essential input parameters for the design are the average values of (a) the amount of event data sent from the FE electronics to the CPU farm and (b) the CPU processing time of the trigger algorithms. The former sets the scale for the network and, for a given readout network technology, defines the number of data sources per subdetector. The latter sets the minimum number of CPUs needed in the farm. The distributions of these input parameters must also be known, to some extent, in order to verify that the system is well-behaved even in the presence of tail events with large data sizes and/or large processing times. All event data sizes and processing times shown in this document have been obtained from the standard LHCb simulation framework, which is described in Chapter 7. Minimum-bias events were processed to produce L0-accepted and L1-accepted event samples with realistic data sizes and processing times. These samples have then been used as input to the L1/HLT network and CPU farm simulations.

The requirement for the implementation is to be flexible in the assignment of processing nodes to either Level-1 or the HLT, and to be easily scalable as the need arises.

L1 makes use of data from the VELO, TT, L0DU and Calorimeter Selection Crate, whereas the HLT has access to all data.

[Histograms: 13325 entries each; VELO clusters per event, mean 1026.78, RMS 424.748; TT clusters per event, mean 461.51, RMS 174.42.]

Figure 6.1: The number of clusters per event in the VELO and TT for events passing Level-0.

The VELO and TT provide the minimum information required to obtain precise impact parameter measurements and a rough estimate of the particle momenta, using the angles and deflections of tracks in the upstream fringe field of the spectrometer magnet. Events are selected by requiring at least two tracks with large pT and significant impact parameter with respect to the primary vertex. The muons from the L0DU and the clusters from the Calorimeter Selection Crate allow a further enhancement of the signal purity, by matching VELO tracks to these L0 high-ET candidates.

For VELO and TT, the L1 cluster information can be encoded in 2 Bytes with sufficient spatial resolution. Hence, the data size per event is roughly Ncl × 2 Bytes, plus some L1 board header information (4 Bytes per board), where Ncl is the number of L1 clusters in the event. The Ncl distributions after L0 for VELO and TT are shown in Figure 6.1. The number of data sources and event fragment sizes are summarized in Table 6.1.

The HLT has access to the full event data, and is executed on the same commodity CPU farm as L1. The algorithm first confirms the L0 and L1 triggers with better precision, and then mimics the offline selection algorithms for the various channels to reduce the rate to 200 Hz, at which rate events are written to storage. The total raw event size is approximately 31 kBytes.

Table 6.1: Number of L1 data sources and average event fragment size per source, not including the transport overhead.

Subsystem     Number of sources    Data/source [Bytes]
VELO          76                   36
TT            48                   24
L0DU          1                    86
Calo Crate    1                    70
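As a rough consistency check (our illustration, not a number from the TDR), the mean cluster counts of Figure 6.1 reproduce these fragment sizes to within about 15%:

    ⟨size⟩ per source ≈ ⟨Ncl⟩ × 2 B / N_sources + 4 B
                      ≈ 1027 × 2 / 76 + 4 ≈ 31 B for the VELO (36 B in the table),
                      ≈ 462 × 2 / 48 + 4 ≈ 23 B for TT (24 B in the table).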

The total size of the readout network for Level-1 and the HLT has been based on the simulation results shown above. Rather than overdesigning the system to cope with the unforeseen, the system is required to be scalable, so that it can adapt quickly to actual needs. In addition, the 40 kHz L1 output rate is dominated by events which trigger because a track is wrongly assigned a large transverse momentum. Hence, the L1 algorithm would benefit significantly from a more precise momentum determination. This can be achieved by providing L1 with Tracker (T1–T3) data. In addition, the use of Muon (M2–M5) data would increase significantly the L1 efficiency for channels with muons in the final state. The scalability of the L1/HLT system is presented in Appendix A.

Next, we give a detailed description of the L1/HLT technical design, discussing first the hardware architecture and implementation, and subsequently the L1 and HLT algorithms.

6.1 Level-1 and HLT hardware implementation

The Level-1 trigger and HLT algorithms operate on general-purpose CPUs. The input data come from the front-end electronics of the detectors included in the system, which are the VELO and TT, together with data from the L0 trigger. In this section the Data Acquisition (DAQ) system for Level-1 and the HLT is described, which collects the event fragments from the Front-End electronics boards, assembles them into complete events and delivers them to a CPU in a computer farm.

In the case of the Level-1 trigger the event data are buffered in the front-end electronics until a decision has been taken. It is therefore important to keep the latency of the whole process of data transport and event assembly as short as possible, to allow maximal time for the execution of the algorithm. The system provides an environment for the physics algorithms in which they can run unchanged from the “offline” environment; some adaptations of low-level services of the software framework (“Gaudi” [53]) are however required.

The technological challenge in the system consists of handling the high rate of data using commercial and (to a large extent) commodity equipment, while transporting and assembling the data as quickly as possible.

The High Level Trigger uses the full event data and operates at the accept rate of the Level-1 trigger. From a data acquisition point of view the problem is very similar, the main difference being that there are many more data sources sending larger fragments, but at a much reduced rate; the aggregated traffic is significantly smaller than that of the Level-1 trigger. The HLT algorithm also needs completely assembled events and runs on a general-purpose CPU. In this case there is no latency limit due to limited front-end buffers, since the buffering of the events is done in the CPUs.

A system performing the DAQ for the HLT has been described in the Online System TDR [9]. The system described here is an evolution of that architecture, and performs the data acquisition and event assembly for both trigger levels using the same infrastructure. It supersedes what has been written on the data-flow in [9]; the other parts of [9], which deal with the TFC, ECS and general infrastructure, remain unchanged, except that their scale is adapted accordingly. The key characteristics of the data-flow system are:

• Copper Gigabit Ethernet is used as the link technology. The connectivity between the sources and the destinations is provided by a large switching network.

• Data are pushed through: every source sends when it is ready to do so. Flow control is exercised centrally via the TFC system, by disabling the trigger at the level preceding the one at which a problem is detected.

• HLT and Level-1 data share the infrastructure, and the HLT and Level-1 algorithms run concurrently in the CPUs.

• Event fragments are packed in the data sources to reduce the packet rate in the system.

6.1.1 Architecture

The architecture is most easily explained by following the data-flow from the sources, the front-end electronics boards, to the ultimate destinations, the CPU nodes, as shown in Figure 6.2.

The front-end electronics is required to be able to store 58254 events [11]. The minimal time between events entering the system is 900 ns. This means that, in order to avoid buffer overflow, the maximum time from the entry of an event into the system until the trigger decision is transmitted to the front-end electronics is 58254 × 900 ns ≈ 52.4 ms. The decisions are distributed via the TFC system, described in [9]. All front-end boards use the same standard Gigabit Ethernet plug-in card to send the data [47]. This card has two output links, one of which is used for HLT data and one for Level-1 data, if applicable (technically it is also possible to use both links for L1 and HLT traffic).

In order to reduce the packet rate in the system, the front-end electronics is required to pack several event fragments into a multi-event packet (MEP). Due to fluctuations in the fragment size, the resulting MEP can become bigger than the Maximum Transmission Unit (MTU) of Ethernet, which is 1500 bytes of payload [48]. The front-end electronics must then split the MEP into several Ethernet frames. Because we want to use standard protocols wherever possible, the front-end electronics is required to format the MEP as an Internet Protocol (IP) packet; the IPv4 standard also defines the way to split a packet into Ethernet frames [49].

The packing factor, i.e. how many event fragments are put into one MEP, is an adjustable parameter of the system. For Level-1 the maximum packing factor is defined to be 32, and for the HLT 16. Packing factors which frequently require a packet to be split into several Ethernet frames are not useful, because it is the Ethernet frame rate which is the problem for the receiving end.

From the current knowledge of the data fragment sizes per board, good working points for the packing factors are 25 for Level-1 and 10 for the HLT.
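These working points can be checked against the 1500-byte MTU with a back-of-the-envelope calculation; the sketch below uses the average fragment sizes of Table 6.1 and the 48-byte transport overhead of Table 6.2 (the function itself is our illustration, not the firmware logic):

    #include <cstddef>

    // Number of Ethernet frames occupied by one MEP, given the packing
    // factor and the average fragment size in bytes.
    std::size_t framesPerMep(std::size_t packingFactor,
                             std::size_t avgFragmentBytes,
                             std::size_t overheadBytes = 48) {
        const std::size_t mtu = 1500;   // Ethernet payload bytes per frame
        const std::size_t mep = packingFactor * avgFragmentBytes + overheadBytes;
        return (mep + mtu - 1) / mtu;   // ceiling division
    }

    // Example: a VELO board at Level-1, framesPerMep(25, 36) -> 948 bytes,
    // i.e. a single frame, so the working point of 25 rarely forces the
    // MEP to be split.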

To reduce the size of the central readout network, it is foreseen to do some multiplexing using Ethernet switches before going into the main readout network switch.



Figure 6.2: The architecture of the DAQ system. The ECS and the throttle signals are not shown.

The multiplexing increases the packet rate on the outgoing link towards the readout network (the total rate is of course not increased). Data are then pushed through a large high-performance Ethernet switch to a sub-farm controller.

The sub-farm controllers (SFCs) sit at the downstream end of the readout network. They perform the event-building, in which individual event fragments from the MEPs are assembled in correct order into events. They distribute the events to the compute nodes connected to them via another Gigabit Ethernet switch, and exercise dynamic load balancing among the nodes. Each node processes only one Level-1 event at any given time. This means that Level-1 events will have to queue in the SFC when there is no available node, and that time-out mechanisms must be implemented in the nodes.


Simulation has been used to investigate the additional latency suffered by events due to the queueing in the SFCs and due to the packing of events into MEPs [50]. Figure 6.3 shows the number of events in time-out as a function of the maximum processing time allowed. The underlying simulation includes a complete model of all features of the queueing and load balancing in a sub-farm, and a model of the expected performance of the Level-1 trigger algorithm. The model for the processing time of the Level-1 algorithm has a cut-off at 50 ms, which explains the steep drop around 50 ms in the insert. No significant increase of the latency has been found.

Since the mean time for reaching a Level-1 decision is much shorter than the allowed maximum, the nodes will not always be busy. Optimal usage of the total available CPU power is achieved by running the HLT as a background task, which is interrupted whenever a Level-1 event needs to be processed. Switching between the two tasks is done by the operating system, and has been measured to take less than 10 µs [50].

Figure 6.3: Fraction of events exceeding the maximum processing time in a sub-farm as a function of the time cut-off. The solid and dotted lines show the simulation result for a scenario with and without event-packing, respectively.

After a trigger algorithm has finished, a result is sent back to the SFC. For a Level-1 event the decision contains only a short summary block, which is forwarded to the Level-1 decision sorter described in the next paragraph. In the case of the HLT, accepted events will undergo full reconstruction, and the reconstructed data will be sent together with the raw data to permanent storage. To this end the SFCs will be connected to the storage either via the event-building network itself or via a small dedicated network.

Level-1 decisions

The Level-1 decisions are small Ethernet packets, which are sent to the TFC system. In particular they are received by the Readout Supervisor, which makes the ultimate decision about whether to accept an event or not, because it knows the state of the throttle signals. Since it also sends the trigger decisions to the front-end, it has an easy way to measure the actual time an event has spent in the system, and can react to a time-out for Level-1 events. (Time-outs for HLT events are not so critical, because there are no front-end buffers which could overflow; they are decided locally in the farm, simply to avoid wasting time on an excessively complicated event.)

The Readout Supervisor requires the Level-1 trigger decisions to arrive in the order they entered the system, i.e. in ascending order of Level-0 event numbers. This is the task of the Level-1 decision sorter, shown on the right side of Figure 6.2. The sorter must be informed when an event enters the system; this is done by the Trigger Receiver Module (TRM), shown in the upper right part of Figure 6.2.

Destination Assignment

The destination assignment is central and static. The Readout Supervisor broadcasts the destination for each Level-1 and HLT MEP. The broadcast contains, among other information, the 10 least significant bits of the IP address of the SFC to which this MEP should be sent. The Readout Supervisor keeps a table from which it selects the destinations; by making some SFCs appear more often than others in this table, a coarse static load balancing can be achieved. The advantage for the front-end electronics is that it avoids keeping a relatively big table of addresses. Details can be found in [54].
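A software sketch of this weighted round-robin selection (the class and the table contents are invented for illustration; in reality this is a lookup table in the Readout Supervisor):

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // The Readout Supervisor cycles through a table of 10-bit SFC address
    // suffixes. Listing an SFC more than once gives it a proportionally
    // larger share of the MEPs: coarse static load balancing.
    class DestinationTable {
        std::vector<std::uint16_t> entries_;   // 10 LSBs of SFC IP addresses
        std::size_t next_ = 0;
    public:
        explicit DestinationTable(std::vector<std::uint16_t> entries)
            : entries_(std::move(entries)) {}
        std::uint16_t nextDestination() {
            const std::uint16_t d = entries_[next_];
            next_ = (next_ + 1) % entries_.size();
            return d;
        }
    };

    // DestinationTable table({0x001, 0x002, 0x002}); // 0x002 gets 2/3 of MEPs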

Control and Monitoring

All the software and hardware components described here will be configured, controlled and monitored using the Experiment Control System (ECS), which is described in [9, 51]. The ECS interfaces to the equipment via Ethernet. Following a general design principle of the online system to separate data and control paths everywhere, a separate Ethernet Local Area Network (LAN) is used to connect the equipment. This is done either directly, by using a second network interface card in the farm-nodes or SFCs, or indirectly, by means of a controls PC which in turn accesses the hardware through one of the agreed interfaces to electronics described in [9]. The ECS also collects data from all farm-nodes for online monitoring and quality checking.

6.1.2 Implementation

The first stage of the system is part of the front-end electronics of the sub-detectors. All detectors are required to use the same custom-made Gigabit Ethernet interface card. Its design is finished and a prototype is expected soon. In many aspects it corresponds to a standard network interface controller (NIC) found in PCs [47]. The current implementation has two independent Gigabit Ethernet ports, and thus can support a theoretical maximum data output rate of 250 MB/s. A future version with 4 ports and a maximum data output rate of 500 MB/s is planned.

FE interface to the DAQ

The Gigabit Ethernet interface is limited to the Ethernet protocol. The formatting of the MEP data into an IPv4 packet, which may span several Ethernet frames, is the responsibility of the motherboard; since IP is a very simple protocol, this poses no problem for the powerful FPGAs used in the FE. All sub-detectors which send data to Level-1 use the TELL1-board [16] to receive the data from the Level-0 electronics, process them for Level-1 and send them to the event builder. The data are buffered until a Level-1 decision has been reached and, if the event is accepted, are sent to the event builder for HLT processing. In the following, a short description of the key features of the TELL1-board is given.

The data from the various sub-detectors can be received either via digital optical or analogue electrical links. For this purpose mezzanine cards are used, which are shown in the top part of Figure 6.4. The data are digitised for the electrical links and deserialised for the optical links. To cope with the quite diverse processing requirements of the sub-detectors, the data are then passed through several large FPGAs. An example of the processing to be done is given in [55], focusing on the VELO but applicable also to TT and IT. The data are then stored in a buffer implemented using standard DDR-SDRAM memory; the Level-1 buffer holds 58254 events. For detectors which send data to the Level-1 trigger, the data are then forwarded to the SyncLink FPGA (shown in the lower part of Figure 6.4), which performs the following tasks (a condensed software sketch is given after the list):

• link the fragments from all processing FPGAs into one fragment;

• perform any processing required on completely assembled fragments;

• buffer the assembled fragments until the number of events in a MEP has been reached;

• pack the MEP into an IPv4 packet;

• if necessary, segment the IPv4 packet into several Ethernet frames;

• send the frames via the Gigabit Ethernet card (RO-Tx in the figure) to the event builder.
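A condensed software analogue of these steps (in reality this logic is FPGA firmware; all types and the helper below are invented for illustration, and the IPv4 packing and frame segmentation are only indicated):

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    using Bytes = std::vector<std::uint8_t>;

    // One call per event: fragments from the processing FPGAs are linked
    // into one event fragment and buffered; once the MEP packing factor is
    // reached, the MEP payload is returned for IPv4 packing and sending.
    Bytes buildMep(const std::vector<Bytes>& perFpgaFragments,
                   std::size_t packingFactor,
                   std::vector<Bytes>& mepBuffer) {
        Bytes event;                                   // link sub-fragments
        for (const Bytes& f : perFpgaFragments)
            event.insert(event.end(), f.begin(), f.end());
        mepBuffer.push_back(std::move(event));         // wait for a full MEP

        Bytes payload;
        if (mepBuffer.size() == packingFactor) {
            for (const Bytes& e : mepBuffer)           // pack the MEP
                payload.insert(payload.end(), e.begin(), e.end());
            mepBuffer.clear();
            // a real implementation would now prepend an IPv4 header and,
            // if necessary, segment the packet into <=1500-byte frames
        }
        return payload;   // empty until a complete MEP is ready
    }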

For all detectors, upon reception of a Level-1 accept via the TTC system, the event fragments are pulled into the SyncLink FPGA. It then goes through the same steps as for Level-1 events, except that the fragment processing for the HLT is different (at the 40 kHz L1-accept rate more sophisticated processing can be performed) and that an additional link is used to send the frames.



Figure 6.4: Building blocks of the TELL1-board. In the first row the input stage is shown, with optical and electrical input mezzanines. The second row shows the processing FPGAs and the Level-1 buffers. The third row shows the SyncLink FPGA, which assembles the sub-fragments and pushes the formatted events to the network. The fourth and last row shows the interfaces to the external systems: the ECS, the TFC and the network (RO-Tx).


PCs and switches

Most of the other components are commercially available. The aggregation and sub-farm switches are relatively cheap Gigabit Ethernet switches, typically found in high-performance LAN (Local Area Network) installations. Full connectivity at maximum speed is not required, because most of the links are not fully loaded. On the other hand, the core switch of the readout network must provide full performance and generous buffering to cope with the traffic; such devices are typically found in the backbone of large campus networks. Key parameters for this switch can be extracted from simulation [50]. If monolithic switches with a sufficient number of ports cannot be found, or are very expensive, the switch can be built from smaller components. Various interconnection topologies are possible. While detailed studies can be found in [50], in general a system built from several layers has advantages because of the distributed buffering; the forwarding latency, however, is slightly increased. From an operational point of view a single unit is easier to manage. In the end this question will be decided based on the cost of the solutions.

The sub-farm controller, whose architecture is described in [9], is a high-performance PC. The emphasis is mainly on the I/O capabilities, because it is required to handle at least two Gigabit/s of data. Such PCs are already available. To achieve maximum throughput, care must be taken in selecting high-performance network interfaces which support advanced DMA (Direct Memory Access) features and buffering. The required packet rate of 80 kHz can be sustained with today's hardware without problems; better hardware and custom software will allow this limit to be raised even further.

The farm-nodes will be chosen according to the best obtainable price/performance ratio. They will be operated disk-less and require, apart from CPU power, sufficient memory to run without a page-file, and two network interfaces in order to maintain the separation of data and control paths.

The detailed implementation of the CPU farm is the subject of ongoing studies. These comprise the physical realisation of the farm as a system of water-cooled racks with rack-mounted PCs, as well as the software to control and monitor the CPUs and the event distribution from the SFCs to the worker CPUs.


Packet loss and error handling

Ethernet does not guarantee reliable frame delivery. For performance reasons, no reliable higher-level protocol like TCP is used in the system, except for sending events to permanent storage. The system is however error-detecting at all stages, using sequence numbers in the transport protocols and time-outs. There are two major reasons for packet loss: bit errors, and packets dropped in the switches due to congestion. Bit errors can be detected reliably by the Ethernet checksum; measurements with modern Ethernet hardware on unshielded twisted-pair cables considerably longer than anything foreseen in the experiment have shown bit error rates better than 10⁻¹⁴. Packet dropping in the switch can only be avoided by selecting a suitable switch. The parameters for such a switch will be obtained from simulation; from current simulations it is known that large output buffer memories are required. Candidate switches will be verified in a test-bed. Experience with recent high-end routers (such as those used for the CERN campus network) shows that packet loss in the switches is extremely rare.
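The sequence-number principle can be sketched as follows (the field and type names are invented for illustration; the actual transport header format is not reproduced here):

    #include <cstdint>

    // Lost frames from a given source are detected as gaps in the per-source
    // sequence number carried by the transport protocol.
    struct SourceState {
        std::uint32_t expected = 0;   // next sequence number we should see
        std::uint64_t lost = 0;       // running count of missing packets
    };

    void onPacket(SourceState& s, std::uint32_t seq) {
        if (seq != s.expected)              // gap: at least one packet lost
            s.lost += seq - s.expected;     // count the missing numbers
        s.expected = seq + 1;
    }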

TRM and Level-1 decision sorter

The TRM is implemented using a TELL1-board [16], which simply sends a packet to the Level-1 decision sorter upon reception of the Level-0 accept decision from the TFC system.

The decision sorter itself is implemented using a commercially available PCI card, which allows fast packet processing using a Network Processor (NP). The code for sorting and the performance results are described in [50]. An alternative, implemented entirely in the TFC system, is described in [54].

The implementation of the remaining infrastructure and hardware is as described in [9].

Size of the system

To determine the size of the system, the number of front-end boards is the first basic input; this defines the number of links to be read out. For the Level-1 trigger, where the VELO, TT, L0DU and the Calorimeter Selection Crate are read out, this results in 126 links before aggregation and 64 links into the event-building network; for the HLT there are 323 before aggregation and 32 after. Assuming a packing factor of 25 (i.e. 25 events are sent in one packet) for Level-1 and 10 for the HLT, multiplexing factors can be calculated to reach an average link load of 80%.

Table 6.2: Base target parameters for the L1/HLT implementation. The first four are given externally; the others are chosen. The overheads describe the size of the header needed to describe the data contents for the event-builder [52].

L0 accept rate                   1 MHz
L1 accept rate                   40 kHz
L1 transport overhead / MEP      48 bytes
HLT transport overhead / MEP     48 bytes
L1 packing factor                25
HLT packing factor               10
Input link rate                  < 100 MB/s
Output link rate                 < 100 MB/s
Frame rate at output             < 80 kHz

The target parameters are summarised in Table 6.2. They represent either available resources, like CPU nodes and switch ports, or load factors, like the link rates and frame rates. The other input to the system design is the expected average data size per fragment per front-end board; these numbers are taken from the full detector simulation described in Chapter 7. At the level of the system design only the overheads due to the data transport format (in particular the IP header) are added. A relatively straightforward minimisation exercise yields the final required number of switch-ports in the event-building switch. Extra ports need to be added to connect the storage system and the L1 decision sorter.

Table 6.3: Key performance figures of the system.

Event Building
Total frame rate at RN input [kHz]       7248
RN output links                          94
RN output link rate [MB/s]               47.9
Frame rate (L1) per link [kHz]           59.9
Frame rate (HLT) per link [kHz]          20.0
Total frame rate at RN output [kHz]      79.6
MEP rate (L1) per link [kHz]             0.47
MEP rate (HLT) per link [kHz]            0.05
Total MEP rate                           0.53

Trigger farms
Sub-farms                                94
Event rate/sub-farm (L1) [kHz]           11.7
Event rate/sub-farm (HLT) [kHz]          0.4
Processors/sub-farm                      21
Event rate per processor (L1) [kHz]      0.56
Event rate per processor (HLT) [kHz]     0.02

In Table 6.3 the key performance figures of the system are summarised, including numbers for a network extended to include the tracking detectors. The system is dominated by the Level-1 traffic. It should be noted that the output links from the event-building network are not very heavily loaded; in fact their number is determined by the frame rate limit of 80 kHz, which is chosen to protect the SFC from too high an interrupt rate. It is likely that with better hardware and custom software significantly better results can be achieved, which would allow the size of the system to be reduced.

6.2 The Level-1 Algorithm

Level-1 exploits the finite lifetime of the B-mesons, in addition to the large B-meson mass, as a further signature to improve the purity of the selected events. All results assume the following information is used by Level-1:

1. The L0DU summary information and data from the Calorimeter Selection Boards.

2. The VELO measurements of the radial and angular positions of the tracks, in silicon planes perpendicular to the beam-line, between radii of 8 mm and 42 mm. The strip layout of the sensors is shown in Figure 6.5.

Figure 6.5: The strip layout of the VELO sensors as viewed along the beam line. Two R-sensors (top) are shown, with their strips subdivided in octants. In φ-sensors (bottom) the strips make an angle between 10–20° with the radial direction, and are subdivided in two regions. The dotted lines indicate the two φ-sensors downstream, which are rotated around the y-axis. The lines routing the signals to the electronics located at the periphery of the sensors are not shown.

3. The Trigger Tracker (TT) measurements from its four silicon planes, two with vertical strips and two with a ±5° stereo angle.



Figure 6.6: Event display of the result of the 2D tracking in the VELO detector, showing all hits and reconstructed tracks in a 45° slice of the VELO R-sensors, in an event where 72 forward 2D tracks were reconstructed.

B-mesons with their decay products in the LHCb acceptance move predominantly forward along the beam-line. This implies that the projection of the impact parameter in the plane defined by the beam-line and the track is large, while in the plane perpendicular to the beam it is almost indistinguishable from that of primary tracks. The L1 algorithm exploits this by reconstructing so-called 2D tracks using only the VELO R-sensors. The 2D tracks are sufficient to measure the position of the primary vertex, since the strips at constant radius are segmented in 45° φ-slices. Muon tracks are identified by matching 2D tracks to Level-0 muon candidates. A fraction of the 2D tracks is selected based on their impact parameter and their match to Level-0 muons, and these 2D tracks are combined with the φ-sensor clusters to form 3D tracks. By combining 3D tracks with hits in TT, the momentum of these tracks can be measured using the fringe field of the magnet between the VELO and TT. In the following sections the reconstruction algorithm and its performance are described in more detail, while the combination of B-signatures to form a trigger decision is given in Chapter 7.

6.2.1 VELO 2D Track Reconstruction

Using the information from the R-sensors, tracks are reconstructed in the rz view, in which tracks originating from the beam-line form straight lines. The method is based on a triplet search, a triplet being defined as three points in consecutive VELO stations, in the same octant, compatible with a straight line. The triplets are combined into longer segments using a dedicated fast algorithm based on a manipulation of integer indices [56]. Figure 6.6 shows an event display of the result of the 2D track search in a 45° slice of the VELO.
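A minimal sketch of the triplet test (the hit layout, tolerance and types are invented here; the actual index-manipulation algorithm is the one described in [56]):

    #include <cmath>

    // An R-measurement in the rz view: radius, z of the station, octant.
    struct RHit { double r, z; int octant; };

    // Three hits in consecutive stations of the same octant are kept as a
    // triplet if the middle hit lies on the straight line through the outer
    // two, within a tolerance (value illustrative).
    bool compatibleTriplet(const RHit& a, const RHit& b, const RHit& c,
                           double tol = 0.05 /* mm */) {
        if (a.octant != b.octant || b.octant != c.octant) return false;
        const double slope = (c.r - a.r) / (c.z - a.z);
        const double rPred = a.r + slope * (b.z - a.z);
        return std::abs(b.r - rPred) < tol;
    }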

6.2.2 Primary Vertex Search

In the next step a primary vertex search is performed [56]. 2D tracks are projected onto the central axis of their octant, and these projections are combined with tracks from perpendicular octants to estimate the position of the primary vertex. The initial estimate of the primary vertex is based on histogramming and peak finding, followed by iteratively rejecting outliers and re-fitting the vertex. After three iterations, a primary-vertex resolution of RMS^PV_{x,y} = 25 µm and RMS^PV_z = 60 µm is reached.


6.2.3 Level-0 Object Matching to 2D Tracks

The electron and hadron candidates above a given ET threshold, and the largest-pT muons, are matched to 2D tracks to identify candidates for the 3D tracking described below. The details of the procedure can be found in [57]. The matching is performed by comparing dr/dz and the azimuthal angle φ between VELO tracks and Level-0 objects. A χ² of the match is formed based on the pT kick and the energy resolution of the Level-0 objects, and on the 2D-track resolution δ(φ) = 45°/√12. For the moment only muon candidates with relatively small χ² are accepted for 3D confirmation.

6.2.4 VELO 3D Track Reconstruction

In addition to the selection explained in the previous section, 2D tracks with an impact parameter between 0.15 and 3 mm with respect to the primary vertex are also selected. Only these 2D tracks are reconstructed in 3D, to reduce the execution time of the algorithm. The φ-sensor information is linked to the 2D tracks taking into account the estimated position of the primary vertex [56]. The combined 2D and 3D reconstruction efficiency for reconstructible B-decay products is 94%, and the ghost rate is 5.9%. (A particle is considered reconstructible if it has at least three hits in VELO R-sensors, three hits in φ-sensors, and at least one x and one stereo hit in each station of T1–T3; a track is considered found if 70% of its reconstructed hits originate from a single Monte Carlo particle.) The momentum information is not available at the level of the VELO reconstruction, and thus the covariance matrix of a track cannot include the contribution from multiple scattering in the traversed material. It was shown [58] that assuming multiple scattering contributions corresponding to a 3 GeV particle gives optimal parameters for the extrapolation to TT and for the measurement of the impact parameter.

6.2.5 Level-0 Object Matching to 3D Tracks

In the next step of the reconstruction, the matching between Level-0 objects and 2D tracks is confirmed for the corresponding 3D tracks, to improve the ghost rejection thanks to the increased precision in φ. The uncertainties on the slopes of the Level-0 objects dominate the error, and hence the contribution of the VELO track uncertainty to the matching χ² is ignored. The purities and efficiencies obtained for typical χ² cuts are presented in Table 6.4. The purity is defined as the fraction of correct matches normalized to all matches in a signal sample, while the efficiency is given for B-decay tracks and shows the rate of tracks matched to an L0 candidate, provided that the track and the L0 candidate are both reconstructed. Only the matching to muons is used for the moment in the performance figures given in Chapter 7.

Table 6.4: The performance of Level-0 object matching to a 3D VELO track.

3D tracks    χ²max    purity    efficiency    σ(p)/p
muons        16       51.2%     94.7%         6%
electrons    4        32.9%     95.8%         12%
hadrons      4        26.9%     92.8%         15%

6.2.6 VELO-TT Matching and Momentum Determination

The reconstructed 3D VELO tracks are combined with hits in TT, which is located about 250 cm downstream of the interaction point. A view of the TT station as it is modelled in the simulation is shown in Figure 6.7.



Figure 6.7: A front view of the TT station. Adjacent silicon sensors of the same colour are daisy-chained in the readout.

The 3D track parameters at the farthest downstream point observed in the VELO are used to search for the track segment passing the four layers of TT. A 3D track is extrapolated to the z-position of each layer, and its distance to all hits inside a ±20 mm window is calculated. These distances are then rescaled to a reference plane perpendicular to the beam line at the middle of TT, taking into account the expected slope of the track, and filled into a histogram. In this histogram, accumulations of at least three clusters in five consecutive bins are searched for. They are analyzed in increasing order of their mean distance to the straight-line extrapolation. For each accumulation a least-squares fit of a straight line is made and, based on their χ², the worst points are rejected iteratively. The procedure stops when the χ² becomes acceptable with at least three TT hits remaining; otherwise the accumulation is rejected and the next one in the list is considered, until the list is exhausted. For each accepted accumulation the momentum is determined by a fit to the slopes of the track at the VELO and at TT, and to the integral of the magnetic field in between. The vertical component of the magnetic field and the corresponding ∫B dl between the VELO and TT are shown in Figure 6.8; this fringe field is close to optimal [59] for the Level-1 performance. The VELO-TT reconstruction was tuned to optimize the Level-1 performance, which requires that for high pT purity prevails over efficiency. Figure 6.9 shows how the efficiency and resolution vary as a function of pT at that optimized working point.

Figure 6.8: The vertical component of the magnetic field (a) and the corresponding ∫B dl (b) of the fringe field. The vertical lines indicate the position of the TT station.
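Schematically, the momentum follows from the standard pT-kick relation (a simplification of the actual fit to the two slopes):

    Δtx = tx(TT) − tx(VELO) ≈ 0.3 · ∫B dl / p    (Δtx in rad, ∫B dl in T·m, p in GeV/c),

so with |∫B dl| ≈ 0.3 T·m between the VELO and TT (Figure 6.8), a 3 GeV/c track receives a slope change of about 30 mrad.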

Figure 6.9: The efficiency (a) and pT resolution (b) of VELO-TT matching at the optimal working point, as a function of pT.

6.2.7 L1 timing

A crucial aspect of the Level-1 algorithm is its execution time. A full study of the execution-time optimization of the Level-1 code, which is used for all performance numbers in this TDR and in the Reoptimization TDR, can be found in [60]. All critical parts were identified and the logic of the algorithm was tuned to meet the timing constraints. The execution flow was set up to minimize the number of calculations needed to reach the negative decision that is expected for the majority of Level-0 accepted events. On average, 8.5 tracks need to be reconstructed in 3D from the 58.4 forward 2D tracks in a minimum-bias event that passes Level-0. The final phase of the algorithm, in which all B-signatures are combined to obtain the Level-1 decision, is described in Chapter 7, but its execution time is accounted for here. The timing of the most important phases of the Level-1 algorithm is detailed in Table 6.5.

Table 6.5: The timing of various phases of the Level-1 algorithm, as measured on a 1 GHz Pentium III Linux processor.

Level-1 phase         time [ms]
initialization        1.1
2D tracks             2.4
PV fit                0.5
2D selection          1.3
3D tracks             1.4
refit of 3D tracks    0.2
TT matching           0.9
L0 3D matching        0.4
decision              0.1

On average, 8.3 ms is spent in Level-1 with the algorithm described above on a 1 GHz Pentium III processor. The time was measured as the real time elapsed between the start and the stop of the Level-1 algorithm, with a granularity of 1 µs. An overhead of about 20%, due to communication between different parts based on the creation of dynamic containers, has been subtracted. The distribution of the total execution time is presented in Figure 6.10.

[Histogram statistics: 36024 entries, mean 8.321 ms, RMS 6.922 ms.]

Figure 6.10: The distribution of the Level-1 execution time on a 1 GHz Pentium III processor, for minimum-bias events accepted by Level-0.

The code described above has not been further optimized for execution-time performance, to allow stable code to be used for the large-scale production for the physics studies. In the meantime, faster code with a similar track-finding efficiency has been developed [62]. With this new code we expect an average execution time per event of less than 1 ms in 2007 (a 1 GHz PIII processor delivers 425 CERN Units (CU), while it is expected [61] that by 2007 the performance will be 2500 CU per node).

6.3 The HLT Algorithm

The purpose of the HLT algorithm is to reject, as fast as possible, events not compatible with an interesting b decay, using the final-quality information from the tracking system. The tracking is performed first in the VELO, then in the whole tracking system, to obtain the track momentum with almost the full accuracy. Particle identification is limited to muon and electron identification, as processing the full RICH information would be too slow. With an L1 processing time of less than 1 ms, the processing power remaining for the HLT corresponds to around 25% of the farm, which translates to about 10 ms per event, i.e. 60 ms on today's 1 GHz Pentium III processor.



6.3.1 VELO tracking

An implementation of the HLT VELO tracking has been developed [62] with both efficiency and speed as the main concerns. Starting from the HLT event, it uses the full VELO information to obtain a more accurate cluster position than in L1. The 2D tracking in the rz-plane is performed first, using only the R-sensors, followed by the full space tracking, collecting clusters in the φ-sensors compatible with the 2D track.

The efficiency is measured on signal datasets, where the main concern is to reconstruct the tracks of the interesting b decay. This is not exactly the same as obtaining the best efficiency on all tracks, as the interesting b-decay products tend to have larger transverse momentum.

Speed is measured on the expected input events, i.e. minimum-bias events passing the L0 and L1 triggers, on a 1.0 GHz Pentium III Linux CPU. Those events tend to be busier than average signal events, and thus the processing is a bit slower. The reconstruction efficiency for tracks passing the magnet is 95.5 ± 0.2%, and for tracks from a b decay 98.3 ± 0.3%. The ghost rate is 5.7%, and the time per minimum-bias event is 5.9 ms.

6.3.2 VELO TT tracking

The first step in measuring the momentum of a track is to find the relevant hits in the TT station. Even though this momentum estimate has a low accuracy, at the 30% level for σ(p)/p, it significantly reduces the search region in the T stations, speeding up the second step of the track search by a factor close to two.

For pattern-recognition purposes, the field between the VELO and TT has the same effect as an angular kick of the track, proportional to 1/p, at a fixed z position z_bend. Multiple scattering, however, is important and dilutes this simple relation. The VELO-TT tracking method is to loop over the VELO tracks and to project all TT hits onto a fixed plane, joining them to the track impact in the z_bend plane. Only a small subset of the hits needs to be handled, as a minimal momentum for the track implies a maximum distance for the hit. The window is typically ±21 mm in the bending direction, corresponding to a minimal momentum of 1.5 GeV; the window is further reduced at low angle, corresponding to a minimal transverse momentum of 100 MeV. The pattern recognition is simply a search for an accumulation of projected hits, with at least 3 of the 4 planes fired. Priority is given to 4-plane candidates, and to the highest-momentum solution.
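The window size follows from the same 1/p scaling; schematically (our illustration of the quoted numbers),

    d(p) ≈ 21 mm × (1.5 GeV/c) / p ,

so halving the minimal momentum would double the search window and correspondingly increase the number of hit combinations to consider.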

As the TT stations do not completely cover the acceptance, mainly around the beam pipe, tracks with fewer than 4 expected hits in TT are accepted without a momentum estimate. Tracks for which no sufficient accumulation is found are rejected; they are low-momentum tracks, and about 30% of the VELO tracks are discarded this way. The efficiency for interesting b-decay tracks is over 99% relative to found VELO tracks. It takes less than 5 ms on a 1 GHz Pentium III to perform this first momentum estimate for about 40 tracks per event.

6.3.3 Long tracking

The search for hits in the T stations is based on the so-called “Forward tracking” [63], which uses the fact that as soon as one point after the magnet is known, the momentum is known and thus the complete trajectory. A fast method, using polynomial parametrizations, allows all hits to be projected onto a common plane. With the momentum estimate from the VELO-TT track, or a minimal momentum requirement of 2 GeV/c, the range of hits can be reduced. An accumulation is searched for in the projection, and an iterative procedure, fitting and removing the worst point, allows the best extrapolation of the VELO track to be found.

The key ingredient is a parametrization of the trajectory in the field, to obtain a fast projection. As the field is reasonably regular and the minimum momentum is 2 GeV/c, only a few polynomial terms need to be computed, as described in [63]. The efficiency to find the correct hits is over 98% for the pions of B → ππ, and a bit lower, 94%, for higher-multiplicity decays. The speed of the algorithm is about 40 ms per event on a 1 GHz PIII CPU, with a momentum resolution σ(p)/p of around 0.6%.

6.3.4 Tracking summary

The total tracking efficiency for b-decay tracks in the HLT varies between 92% and 96.5%, depending on the decay channel. The tracking takes around 50 ms per minimum-bias event accepted by L0 and L1. This figure is already within the allowed range, but can be improved by technical changes such as using a faster compiler, by improving the handling of the geometry information, and by tuning the cuts; for example, the “Long tracking” speed depends quite strongly on the minimum (transverse) momentum. The main concern is to be able to perform high-quality tracking within the available time budget, and hence the tracking package was developed keeping the execution time in mind. The algorithm to efficiently select the signal events is still under development, and hence emphasizes flexibility rather than optimization of execution time; this selection is described in Section 7.3.


Chapter 7 Performance Simulation

Minimum-bias proton-proton interactions at √s = 14 TeV are generated using the PYTHIA 6.2 program [66], including hard QCD processes, single and double diffraction, and elastic scattering. Some of the PYTHIA parameters have been tuned to reproduce measured charged-particle distributions for √s below 1.8 TeV; a detailed description of this tuning can be found in [2] and references therein. PYTHIA predicts a total inelastic cross-section of σinel = 79.2 mb, of which visible collisions correspond to (79.1 ± 0.2)% of σinel. (A collision is defined to be visible if it produces at least two charged particles with sufficient hits in the VELO and T1–T3 to allow them to be reconstructible; elastic scattering never results in tracks observed in the spectrometer.) Several inelastic proton-proton collisions may occur in the same bunch crossing, which is simulated assuming σinel = 80 mb and a non-empty bunch-crossing frequency at the LHCb interaction point of 30 MHz. The luminosity L is assumed to decrease exponentially with a 10-hour luminosity lifetime in the course of 7-hour fills, with an average value of 2 × 10³² cm⁻²s⁻¹, which implies ∼ 2.8 (1.4) × 10³² cm⁻²s⁻¹ at the start (end) of a fill. Studies have shown [67] that over this range of luminosities the efficiency for selecting signal events, while retuning the trigger settings, does not vary significantly. In this chapter all threshold settings are given for a luminosity of 2 × 10³² cm⁻²s⁻¹.

Generated particles are traced through a detailed description of the spectrometer and its surroundings using the GEANT3 package [68], and the detector response is simulated taking the effect of the two preceding and one following bunch crossings into account, but ignoring the LHC bunch structure. All expected performances are based on these simulated detector responses, which include efficiencies, noise and cross-talk as appropriate for each sub-system.
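The quoted luminosity numbers are mutually consistent (a check we add for clarity): for an exponential decay with lifetime τ = 10 h over a fill of length T = 7 h,

    L_end / L_start = e^(−T/τ) = e^(−0.7) ≈ 0.50 ,
    ⟨L⟩ = L_start · (τ/T) · (1 − e^(−T/τ)) ≈ 0.72 · L_start ,

so ⟨L⟩ = 2 × 10³² cm⁻²s⁻¹ corresponds to L_start ≈ 2.8 × 10³² and L_end ≈ 1.4 × 10³² cm⁻²s⁻¹, as quoted.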

Minimum-bias data corresponding to about 2 s of LHCb running have been generated, together with signal events in which a generated B-meson is forced to decay into a specific final state. In the following sections the trigger settings and the corresponding signal efficiencies are presented which allow the minimum-bias event rate to be reduced to 1 MHz, 40 kHz and 200 Hz by Level-0, Level-1 and the HLT, respectively. In the final section of this chapter the sensitivity of the trigger to alternative settings of the PYTHIA parameters, and to a lower-than-expected spectrometer performance, is presented.

7.1 Performance of Level-0

The performance of Level-0 is expressed in terms of the efficiency ε^channel_L0, the fraction of offline-selected events that pass Level-0 for a given signal channel.

The maximization of the Level-0 efficiencies is done by optimizing the bandwidth given to each sub-trigger, taking into account the correlations between them. Ideally, Level-0, Level-1 and the HLT should be optimized simultaneously. This scenario, however, was simplified by first determining cuts on the global event variables, i.e. the SPD and Pile-Up multiplicities and the number of interactions [69], for a chosen set of thresholds on the other variables. Figure 7.1 shows εL0 × εL1 for a few channels as a function of the cut applied on the SPD multiplicity.

Figure 7.1: The top plot shows εL0 × εL1 as a function of the SPD multiplicity cut, for events which have been accepted by the Pile-Up cut and have a Pile-Up hit multiplicity below 112; curves are shown for Bs → DsK, Bd → π+π− and Bs → J/ψ(µµ)φ. The bottom plot shows the data size after Level-0 and the Level-1 execution time for the same SPD multiplicity cuts, normalized to no SPD cut. The dashed line indicates the chosen working point.

For each value, all thresholds on the transverse energy of B-decay candidates were scaled by a common factor to re-adjust the total output of Level-0 to 1 MHz, and of Level-1 to 40 kHz, on minimum-bias events. As explained in the next section, events are accepted if the sum of the transverse momenta of the two largest-pT muons is above its threshold, irrespective of the global cuts; hence, channels with muons in their final state show an increased efficiency as the global cuts are tightened. The values of the cuts on the global event variables are given in Table 7.1, and these cuts are used throughout the evaluation of the trigger performance described in the next sections. Cuts on the global event variables allow the size of the events which pass Level-0, and their corresponding execution time in Level-1, to be tuned with only small variations in signal efficiency. The chosen cuts are rather conservative, just to show the principle.

Events are only accepted if the “Total ET”, a measure of the total energy deposited in the HCAL, is above 5 GeV, to reduce the possibility of triggers on halo muons in crossings without interactions.

Table 7.1: List of L0 cuts on the global event variables.

Global Cuts              Value
Tracks in 2nd vertex     3
Pile-Up Multiplicity     112 hits
SPD Multiplicity         280 hits
Total ET                 5.0 GeV

7.1.1 Bandwidth Division

We determine the L0 thresholds using the set of decay channels given in Table 7.4, which are representative both in giving access to the CKM parameters and in the way they rely on the different trigger components [70].

The Level-0 bandwidth division minimizes the overall loss in efficiency by maximizing

    ∑_channels ( ε_L0^channel / ε_L0−max^channel ) ,    (7.1)

where ε_L0−max^channel is the trigger efficiency for a channel when the full bandwidth is available, and ε_L0^channel is obtained using one fixed set of thresholds for all channels simultaneously. Thresholds are given in Table 7.2, with their corresponding inclusive rates on minimum-bias events after the global event cuts described above.


Table 7.2: List of cuts obtained after the combined Level-0 optimization. The last column gives the inclusive L0 output rate on minimum-bias events after the global event cuts.

ET thresholds    Value (GeV)    M.B. rate (kHz)
hadron           3.6            705
electron         2.8            103
photon           2.6            126
π0 local         4.5            110
π0 global        4.0            145
muon             1.1            110
∑pµT             1.3            145

The thresholds of the ECAL triggers are highly correlated, since these triggers are to a large extent redundant.

An event triggers Level-0 (1) if it passes the global selection and at least one of the candidates' ET exceeds its threshold, or (2) if the sum of the transverse momenta of the two largest-pT muons (∑pµT) is above its threshold, irrespective of the global cuts. The FOIs of the Muon Trigger (see Chapter 3) have been optimized correspondingly, and their values are listed in Table 7.3.

Table 7.3: The size of the FOI along the x and y coordinates, as used in the Muon Trigger for the thresholds listed in Table 7.2.

     M1    M2    M4    M5
x    ±2    ±5    ±1    ±1
y    0     0     ±1    ±1

Figure 7.2 shows the sensitivity of εL0 to the Level-0 output rate. The values of εL0 from 0.5–1.0 MHz were obtained after a combined optimization of Level-0, as described above, for each output rate. The values of εL0−max are shown in the right-hand part of the figure, indicating how much efficiency is lost per channel when sharing the bandwidth over all channels in a democratic way. This bandwidth division results in small losses, apart from channels with the decay J/ψ → ee, since in the bandwidth-division optimization it is combined with J/ψ → µµ for the calculation of ε^channel_L0.

Figure 7.2: Level-0 efficiencies as a function of the Level-0 output rate. The rightmost set of data points refers to the efficiency obtained after individual optimization of each channel.

The Level-0 efficiencies for the set of channels used to tune the thresholds are given in Table 7.4, while Table 7.5 gives the efficiencies for channels which did not participate in the optimization. We also include the inclusive hadronic, electromagnetic (electron, photon and π0) and muon triggers, to show the correlations between them. The contributions from the ECAL triggers have been grouped together, since they are to a large extent redundant and their relative contributions depend on the choice of their highly correlated thresholds.

PYTHIA predicts a bb cross-section of 633 µb, which results in 1.1% of the crossings with at least one inelastic pp-interaction containing at least one bb pair. A cc pair is produced in 5.6% of the crossings, where cc pairs are only counted if no bb pair is present. The beauty enrichment of the data after the various triggers is listed in Table 7.6.


Table 7.4: L0 efficiencies at 1 MHz for several offline-selected signal channels which have been used to determine the thresholds. The last three columns show the inclusive trigger efficiencies for the hadronic, electromagnetic (electron, photon, π0's) and muon triggers.

Decay Channel                     εL0 (%)       had. trig. (%)   elec. trig. (%)   muon trig. (%)
B0d → π+π−                        53.6 ± 0.4    47.6 ± 0.5       14.1 ± 0.3        6.8 ± 0.2
B0s → D−s(K+K−π−)π+               49.4 ± 0.6    42.2 ± 0.6       13.1 ± 0.4        8.3 ± 0.4
B0s → D−s(K+K−π−)K+               47.2 ± 0.3    39.4 ± 0.3       11.7 ± 0.2        8.2 ± 0.2
B0d → J/ψ(µ+µ−)K0S(π+π−)          89.3 ± 0.5    18.6 ± 0.7       8.3 ± 0.5         87.2 ± 0.6
B0d → J/ψ(e+e−)K0S(π+π−)          48.3 ± 1.0    21.5 ± 0.8       37.4 ± 0.9        7.0 ± 0.5
B0s → J/ψ(µ+µ−)φ(K+K−)            89.7 ± 0.1    20.0 ± 0.2       8.4 ± 0.1         87.4 ± 0.1
B0d → K∗0(K+π−)γ                  72.9 ± 1.0    32.7 ± 1.1       68.1 ± 1.1        7.8 ± 0.6

Table 7.5: L0 efficiencies at 1 MHz for offline-selected signal channels which have not been used to tune the thresholds. The last three columns show the inclusive trigger efficiencies for the hadronic, electromagnetic (electron, photon, π0's) and muon triggers.

Decay Channel                     εL0 (%)       had. trig. (%)   elec. trig. (%)   muon trig. (%)
B0d → K+π−                        54.1 ± 0.8    48.3 ± 0.8       12.3 ± 0.5        7.2 ± 0.4
B0s → K−π+                        56.5 ± 1.1    51.2 ± 1.1       13.2 ± 0.7        6.7 ± 0.5
B0s → K+K−                        51.8 ± 0.3    46.0 ± 0.3       11.6 ± 0.2        6.5 ± 0.2
B0d → π+π−π0                      77.2 ± 1.6    39.4 ± 1.9       66.2 ± 1.8        7.9 ± 1.1
B0d → D∗−(D0π−)π+                 49.0 ± 1.1    41.7 ± 1.1       14.0 ± 0.8        8.4 ± 0.6
B0d → D0(K+π−)K∗0(K+π−)           53.0 ± 1.4    45.3 ± 1.4       13.9 ± 0.9        8.1 ± 0.7
B0d → D0(K+K−)K∗0(K+π−)           50.7 ± 1.2    43.4 ± 1.1       13.6 ± 0.8        8.4 ± 0.6
B0d → J/ψ(µ+µ−)K∗0(K+π−)          91.2 ± 0.3    23.1 ± 0.4       9.3 ± 0.3         88.6 ± 0.3
B+u → J/ψ(µ+µ−)K+                 90.3 ± 0.4    26.2 ± 0.5       9.1 ± 0.4         87.1 ± 0.4
B0s → J/ψ(e+e−)φ(K+K−)            49.0 ± 0.6    22.9 ± 0.5       38.3 ± 0.5        7.0 ± 0.3
B0s → J/ψ(µ+µ−)η(γγ)              92.1 ± 0.8    19.2 ± 1.2       37.2 ± 1.5        88.4 ± 1.0
B0s → ηc(4π, 2K2π)φ(K+K−)         47.0 ± 3.0    41.5 ± 2.9       12.5 ± 1.9        8.0 ± 1.6
B0s → φ(K+K−)φ(K+K−)              41.8 ± 0.9    28.7 ± 0.9       9.7 ± 0.6         8.6 ± 0.5
B0d → µ+µ−K∗0(K+π−)               93.6 ± 0.7    24.9 ± 1.2       10.3 ± 0.8        91.8 ± 0.7
B0s → φ(K+K−)γ                    69.6 ± 1.6    33.1 ± 1.6       65.8 ± 1.7        7.7 ± 0.9
B+c → J/ψ(µ+µ−)π+                 92.6 ± 0.5    29.4 ± 0.8       9.9 ± 0.5         89.5 ± 0.6


The bandwidth division described above should be regarded as an example only; another possibility is to optimize on the effective tagging efficiency [70] rather than just considering the Level-0 efficiency as used above.

7.2 Performance of Level-1

The principal idea of Level-1 is to combine the two most characteristic properties of b tracks available at this early stage, impact parameter and transverse momentum, to form an efficient selection of events containing b-hadrons. In addition, use is made of the signatures already found by Level-0 and passed on to Level-1.

Table 7.6: The fraction of crossings with at least one inelastic pp-interaction containing at least one bb-pair or cc-pair; cc-pairs are only counted if no bb-pair is present.

                         bb %    cc %
Generated                 1.1     5.6
Level-0                   3.0    10.6
Level-1                   9.7    14.2
HLT (L1-confirmation)    14.0    14.7



Figure 7.3: Distribution of offline-selected signal events (B0d → π+π− and B0s → DsK) and minimum-bias events in the plane of the two variables ln(PT1)+ln(PT2) versus ln(IPS1)+ln(IPS2). The solid line is an example of the vertical-diagonal discriminant applied to determine the Level-1 trigger variable.

The Level-1 decision algorithm [71] consists of two parts. In the first, generic algorithm, a trigger variable is computed based on the properties of the two tracks with highest transverse momentum pT. This part is sensitive to very generic b-hadron signatures. In the second, specific algorithm, the trigger variable is weighted according to signatures involving L0 objects, such as dimuons or high-ET electrons and photons, that are present in the event. This means that good L0 signatures have the effect of relaxing the generic requirement.

7.2.1 Generic Algorithm

An average (most probable) number of 8.5 (4) tracks per minimum-bias event is reconstructed in 3D (see Section 6.2). Requiring the 3D impact parameter to be between 0.15 and 3 mm reduces this number to 6.5 (4). Using this set of tracks, two event variables are defined as B-signatures, ln(PT1)+ln(PT2) and ln(IPS1)+ln(IPS2), where PT1(2) is the pT of the 3D track with the highest (second-highest) pT, and IPS1(2) are their respective impact parameter significances with respect to the primary vertex, defined as d/σd [72]. The error σd is estimated based on the pT of the track. Figure 7.3 shows the distribution of minimum-bias events and two types of signal events in these two variables. Also shown is the vertical-diagonal discriminant line that is used to determine the trigger variable ∆: the distance of an event to this line, with negative sign if the event lies to its left. The choice of the diagonal discriminant line was originally motivated by an earlier version of the L1 algorithm, which had poorer impact-parameter resolution and therefore a surplus of high-pT tracks originating from the primary vertex, i.e. at low impact parameter. We decided to keep this feature in order to guard against a possible degradation of the tracking performance in the robustness tests.
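To make the generic algorithm concrete, the sketch below computes a trigger variable from a list of 3D tracks. This is a minimal illustration, not the TDR implementation: the function name, the input format and the discriminant-line offset are assumptions, and a single straight diagonal line x + y = const is used in place of the tuned vertical-diagonal discriminant.

    import math

    def level1_generic_delta(tracks, ip_min=0.15, ip_max=3.0, offset=14.0):
        """Illustrative sketch of the generic Level-1 trigger variable.

        tracks -- iterable of (pt, d, sigma_d): pT in GeV, 3D impact
        parameter d in mm and its error sigma_d w.r.t. the primary vertex.
        """
        # Keep tracks whose 3D impact parameter lies between 0.15 and 3 mm.
        selected = [(pt, d / sigma_d) for pt, d, sigma_d in tracks
                    if ip_min < d < ip_max]
        if len(selected) < 2:
            return float("-inf")  # fewer than two tracks: no B signature
        # The two selected tracks with the highest pT define the event.
        selected.sort(key=lambda t: t[0], reverse=True)
        (pt1, ips1), (pt2, ips2) = selected[0], selected[1]
        x = math.log(pt1) + math.log(pt2)    # ln(PT1) + ln(PT2)
        y = math.log(ips1) + math.log(ips2)  # ln(IPS1) + ln(IPS2)
        # Signed distance to the assumed line x + y = offset: negative on
        # the low (background-like) side, positive on the signal-like side.
        return (x + y - offset) / math.sqrt(2.0)

The resulting ∆ enters the final decision of Section 7.2.3.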

7.2.2 Specific Algorithm

The following event variables are considered for relaxing the generic trigger condition:

• m_µµ^max – The highest invariant dimuon mass, where dimuons consist of two VELO tracks that have been matched in 3D to L0 muon tracks, without requirements on the charges or the vertex of the two tracks. This variable is sensitive to channels involving J/ψ → µ+µ− or B → µ+µ−(X) (see Figure 7.4).

• E_T^γ,max – The highest photon transverse energy found by Level-0, if above 3 GeV. Sensitive to channels such as B → K∗γ.

• E_T^e,max – The highest electron transverse energy found by Level-0, if above 3 GeV. Sensitive to channels involving J/ψ → e+e−.

In each case, a “bonus” β is calculated depending on the variable and added to the generic trigger variable.

Figure 7.4: Dimuon invariant mass distribution for several signal channels (B0d → J/ψ(µµ)K0S(π+π−), B0s → J/ψ(µµ)φ(K+K−), B0s → µ+µ−, B0d → µ+µ−K∗0(K+π−)) in comparison with minimum-bias events.

For dimuons, βµµ is set to a very high value (i.e. completely overwriting the decision of the generic trigger) if m_µµ^max lies either within ±500 MeV/c² windows around the J/ψ and the B mass, or above the latter. For other values βµµ increases linearly with m_µµ^max, where the linear coefficient is chosen as a compromise between enhancing the efficiency for B → µ+µ−K∗ and minimizing the used bandwidth.

In a similar way, βγ and βe are computed as functions of E_T^γ,max and E_T^e,max, respectively. A minimum of 3 GeV is required, and β increases linearly starting from this value. In the case of photons, both E_T^γ,max and E_T^e,max are required to be above 3 GeV for βγ > 0, whereas only E_T^e,max is used to compute the value of βe. Again the linear coefficients have been chosen to minimize bandwidth use while giving significant improvements in the efficiencies for channels containing photons and electrons.
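A hedged sketch of the three bonuses follows. The J/ψ and B masses are standard values; the slopes, the override constant and the function names are invented placeholders for the tuned coefficients described above.

    def bonus_mumu(m_max, m_jpsi=3097.0, m_b=5279.0, half_window=500.0,
                   slope=0.002, override=1.0e6):
        """Dimuon bonus; masses in MeV/c^2, coefficients illustrative."""
        if m_max is None:          # no dimuon found in the event
            return 0.0
        if (abs(m_max - m_jpsi) < half_window or
                abs(m_max - m_b) < half_window or m_max > m_b):
            return override        # overrides the generic decision entirely
        return slope * m_max       # linear rise, tuned against bandwidth

    def bonus_gamma(et_gamma, et_e, threshold=3.0, slope=0.5):
        """Photon bonus: both ET values must exceed 3 GeV (illustrative)."""
        if et_gamma > threshold and et_e > threshold:
            return slope * (et_gamma - threshold)
        return 0.0

    def bonus_e(et_e, threshold=3.0, slope=0.5):
        """Electron bonus: linear above the 3 GeV minimum (illustrative)."""
        return slope * max(0.0, et_e - threshold)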

7.2.3 Final decision

An event passes Level-1 if

∆ + βµµ + βγ + βe > ∆_0.04,    (7.2)

where ∆_0.04 is determined empirically by requiring a minimum-bias retention rate of 4% of all L0-triggered events, corresponding to an output rate of 40 kHz.
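In code, the decision of Eq. (7.2) and the empirical determination of ∆_0.04 could look as follows; tune_delta_cut and the NumPy quantile shortcut are illustrative assumptions, not the TDR procedure.

    import numpy as np

    def tune_delta_cut(minbias_scores, retention=0.04):
        """Cut that keeps the requested fraction of L0-triggered
        minimum-bias events (here 4%, i.e. a 40 kHz output rate)."""
        return float(np.quantile(minbias_scores, 1.0 - retention))

    def passes_level1(delta, beta_mumu, beta_gamma, beta_e, delta_cut):
        """Eq. (7.2): Delta + beta_mumu + beta_gamma + beta_e > Delta_0.04."""
        return delta + beta_mumu + beta_gamma + beta_e > delta_cut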

7.2.4 Efficiencies and bandwidth division

Figure 7.5 shows the trigger efficiencies of a few selected signal channels as a function of the Level-1 output rate. Table 7.7 summarizes signal efficiencies for the design output rate of 40 kHz. Figure 7.6 illustrates the bandwidth division between the generic and specific sub-triggers; the specific improvements use up 24.7% of the bandwidth. The current trigger balance should be regarded as an example only: by adjusting the various parameters, the division among the signatures can easily be adapted to the needs of the experiment.

The bb enrichment after Level-1 is listed in Table 7.6.

Figure 7.5: L1 efficiencies as a function of the L1 output rate. The last bin refers to the maximum efficiency obtained after individual optimization of each channel. The efficiencies are normalized to L0-triggered events that have been selected by the offline analysis. Indicated errors are statistical.

Figure 7.6: Bandwidth division among the various trigger components as explained in the text.

7.3 Performance of the High Level Trigger

The HLT can access the same data as used for the offline selection, but at its input rate of 40 kHz. The reconstruction and selection algorithms which will be employed are constrained by the available computing resources of a few hundred CPU nodes. The HLT algorithm is under development; the following strategy guarantees a high selection efficiency and is estimated to give an affordable execution time:

A Confirm the Level-1 algorithm, but using T1–T3 to improve on the momentum resolution compared to the VELO-TT tracks of Level-1.

B Full pattern recognition of long tracks, and lepton identification.

C Exclusive selection of channels.

D Inclusive selection of channels.

Table 7.7: L1 efficiencies at 40 kHz output rate for several signal channels. The efficiencies are normalized to L0-triggered events that are used for offline analysis.

Decay channel                     εL1 (%)
B0d → π+π−                        62.7 ± 0.5
B0d → K+π−                        61.5 ± 1.0
B0s → K−π+                        65.0 ± 1.4
B0s → K+K−                        60.0 ± 0.4
B0d → π+π−π0                      46.6 ± 2.2
B0d → D∗−(D0π−)π+                 56.0 ± 1.6
B0d → D0(K+π−)K∗0(K+π−)           66.7 ± 1.8
B0d → D0(K+K−)K∗0(K+π−)           61.6 ± 1.6
B0s → D−s(K+K−π−)π+               63.0 ± 0.9
B0s → D−s(K+K−π−)K+               62.6 ± 0.4
B0d → J/ψ(µ+µ−)K0S(π+π−)          67.7 ± 0.9
B0d → J/ψ(e+e−)K0S(π+π−)          54.9 ± 1.4
B0d → J/ψ(µ+µ−)K∗0(K+π−)          76.8 ± 0.3
B+u → J/ψ(µ+µ−)K+                 76.0 ± 0.5
B0s → J/ψ(µ+µ−)φ(K+K−)            71.4 ± 0.2
B0s → J/ψ(e+e−)φ(K+K−)            57.2 ± 0.8
B0s → J/ψ(µ+µ−)η(γγ)              70.3 ± 1.5
B0s → ηc(4π, 2K2π)φ(K+K−)         59.0 ± 4.0
B0s → φ(K+K−)φ(K+K−)              60.3 ± 1.5
B0d → µ+µ−K∗0(K+π−)               78.5 ± 1.1
B0d → K∗0(K+π−)γ                  51.9 ± 1.4
B0s → φ(K+K−)γ                    49.3 ± 2.0
B+c → J/ψ(µ+µ−)π+                 65.6 ± 0.9


The purpose of step A is to reject events as soon as possible with an algorithm which is cheap in execution time. With this algorithm the rate is reduced to 20 kHz, and its expected performance is described in the next section. For the remaining events after A, full pattern recognition is performed, including lepton identification, but not using the RICH information. The tracking part of step B will require most of the execution time, and has already been described in Section 6.3. The average number of tracks reconstructed per event with both VELO and T information is 32.

Selection C aims at getting the highest possible efficiency for those channels which are considered the most important for the physics goals. Below it will be shown for a few selected channels that low enough output rates can be achieved, even without the particle identification provided by the RICHes.

The bb-pair content of the 20 kHz of events which have to be analysed by the HLT after step A is given in Table 7.6, and shows that the sample is still dominated by light-quark events. In step D, commonalities in the offline selection of B-decay channels with similar kinematics are used to define a series of selections which should result in high efficiencies for all interesting channels, without having to resort to an exclusive reconstruction per channel, and hence should guarantee that LHCb writes a large spectrum of B-like events to storage with high efficiency for subsequent analysis. This last selection is the least developed, and hence no results on its expected output rate are presented here; they will be reported as part of the Computing TDR.

7.3.1 Level-1 Confirmation

Section 6.3 describes the pattern recognition algorithm and its performance. While the VELO-TT tracks of Level-1 have a momentum resolution between 20–30%, the HLT reconstructs tracks with σ(p)/p around 0.6%. With the HLT tracks the L1 algorithm is repeated as described in Section 7.2, i.e. among the roughly eight tracks with an impact parameter between 0.15–3 mm the momentum is measured as described in Section 6.3, the two tracks with the largest pT are selected, the trigger variable ∆ is computed, and a bonus is added depending on m_µµ^max, E_T^γ,max or E_T^e,max. Figure 7.7 shows the trigger efficiencies of a few selected signal channels as a function of trigger rate. This shows that the L1-confirmation reduces the rate from 40 kHz to 20 kHz with only a few % loss in signal efficiency. To achieve this reduction not all long tracks have to be reconstructed, which reduces the necessary execution time. Unlike the algorithm deployed in Level-1, the HLT algorithm has not yet been packaged to allow an execution-time measurement of the L1-confirmation step separately; however, based on the total track reconstruction it is estimated that this step will take less than 4 ms in 2007, leaving around 14 ms per event for the further analysis of the 20 kHz of accepted events. Of this, about 6 ms will be needed to reconstruct all remaining tracks.

Figure 7.7: L1-confirmation efficiencies for events accepted by L0, L1 and the offline analysis for a few selected signal channels, as a function of minimum-bias rate.
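As a rough consistency check of these numbers, the sketch below folds the 4 ms confirmation step and the 14 ms budget for accepted events into an average cost per input event; the one-event-per-node processing model is an illustrative simplification.

    # Back-of-the-envelope HLT timing budget using the figures quoted above.
    input_rate = 40e3      # Hz, HLT input rate
    confirm_rate = 20e3    # Hz, rate surviving the L1-confirmation step
    t_confirm = 4e-3       # s, estimated L1-confirmation time per event
    t_further = 14e-3      # s, remaining budget per accepted event

    # Every event pays the confirmation cost; half continue into the
    # full analysis, so the average time per input event is 11 ms.
    t_avg = t_confirm + (confirm_rate / input_rate) * t_further

    # CPU-seconds needed per second of data taking: about 440, which is
    # consistent with the "few hundred CPU nodes" quoted in Section 7.3.
    cpus_needed = input_rate * t_avg
    print(f"average time per event: {t_avg*1e3:.0f} ms, CPUs: {cpus_needed:.0f}")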

7.3.2 Exclusive Selection

The combined output rate of all channels under study at the moment in LHCb [2], including the expected background, is less than 1 Hz of data-taking rate. However, the exclusive selection in the HLT will have a significantly larger output rate, due to the need to relax the final selection cuts in order to study the sensitivity and systematics. In particular, side bands in invariant-mass distributions are necessary to fit the contribution of the background. In addition, the global RICH reconstruction benefits from reconstructing all track segments around the RICH detectors, and consequently its execution time only allows it to be executed at a rate of a few hundred Hz.

7.4. TRIGGER PERFORMANCE ROBUSTNESS 67

Table 7.8: Expected rate in the HLT in minimum-bias events due to the exclusive selection algorithms aimed at selecting the indicated channels. All errors are statistical only.

Algorithm                HLT rate (Hz)
B → ψ(µ+µ−)X             21 ± 4
B → h+h−                 12 ± 3
B → Dsh                  12 ± 3
B0d → K∗0(K+π−)γ         13 ± 3
B0s → φ(K+K−)γ           14 ± 3

The exclusive selection power has been checked by passing generated minimum-bias events through the first two trigger levels, and then applying relaxed offline selection cuts [73, 74, 75, 76]. Table 7.8 lists the expected rates for the channels considered. The ψ(µ+µ−) rate contains 5 Hz of genuine B → ψX decays. This shows that the exclusive selection can reduce the rate to well below a few hundred Hz without having to resort to RICH information.

7.4 Trigger Performance Robustness

The expected trigger performance depends on the imperfections of our simulation program: the underlying physics process may be slightly different in real data, the description of the detector and its performance may not be perfect, the LHC machine conditions may be different than expected, and so on.

The LHCb trigger system is designed to be flexible enough to adapt to such unexpected situations. In order to illustrate this point, several robustness scenarios have been fully simulated, and the trigger execution time and performance have been measured on these samples. The scenarios considered are described in detail in [64] and correspond to:

• PYTHIA Test: PYTHIA settings (extrapolated to the LHC energy) from a recent tuning on CDF data [65].

• Global Test: Generally degraded LHCb detector and worse PYTHIA settings (as in [2]).

• VELO Test: Degraded VELO detector, with increased material, worse cluster resolution and worse signal-to-noise ratio (as in [64]).

• Beam Spot Test: LHC beam position uncertainty increased by a factor of three in the plane perpendicular to the beam, from 70 µm to 210 µm.

• LHC Background Test: LHC machine backgrounds worse by an order of magnitude (as in [30]).

The uncertainty on the simulation of the underlying physics process is taken into account by the changes in the PYTHIA settings in the “PYTHIA Test” and the “Global Test”. The most visible effect is the change in the track multiplicity, which turns out to be the most relevant variable in the evaluation of the trigger performance. The degraded VELO detector resolution is intended to simulate the effect of possible misalignments. The “Beam Spot Test” checks the dependence of the trigger performance on the position of the beam.

The effect of the LHC machine background is found to be negligible. In nominal conditions, the average number of beam-halo muons traversing the detector in any direction is 0.04 per bunch. Even with an order of magnitude worse conditions, the event rate after Level-0 and Level-1 increases by only 12 ± 3 kHz. The most significant effect is seen in the Level-0 muon system and has been described in Chapter 3.

In the next two sections, the robustness of the algorithm used in Level-1 is determined through the effect on the resolution of the track parameters, and through the changes in the execution time of the algorithms and in the event size. The overall effect on the trigger decision is discussed in Section 7.4.3.


Figure 7.8: Impact parameter resolution (in mm) as measured by the Level-1 algorithm as a function of the transverse momentum of the track, for the default simulation and the “Global Test”, “Beam Spot Test” and “VELO Test” scenarios. The impact parameter is computed with respect to the reconstructed primary vertex.


7.4.1 Resolutions

The impact parameter and the determination of the pT of the tracks used in the trigger algorithms depend on the performance of the VELO detector and on the ability to reconstruct the primary vertex. The “Global Test” and the “VELO Test” both degrade the measurement of the track parameters in the VELO detector, while the “Beam Spot Test” degrades the ability to reconstruct the primary vertex. The resolution of the impact parameter with respect to the reconstructed primary vertex, as measured by the Level-1 algorithm, is shown in Figure 7.8 as a function of pT. The resolution is worse by 30% when the material of the VELO detector, in particular the RF foil, is increased. The resolution of pT as measured by the Level-1 algorithm is shown in Figure 7.9 as a function of pT. In the case of Level-1, the small changes in the pT resolution have a negligible effect.

Figure 7.9: σ(pT)/pT as measured by the L1 algorithm as a function of the transverse momentum of the track, for the default simulation and the “Global Test”, “Beam Spot Test” and “VELO Test” scenarios.


7.4.2 Execution Time and Event Size

The execution time of the Level-1 algorithm is dominated by the time it takes to perform the track reconstruction. The only significant changes are observed when the expected multiplicity of the event is different, i.e. in the “Global Test” and the “PYTHIA Test”. The event multiplicity is increased by 30% in the “Global Test”, while it is reduced by 20% using the CDF tuning. The measured execution time and event size for Level-1 follow the same pattern.

7.4.3 Performance

The output of Level-0 is fixed to 1.0 MHz, while the output of the Level-1 trigger is fixed to 40 kHz. Hence, the cuts used in the trigger algorithms need to be adapted for each robustness scenario. One convenient way to do this for Level-0 is simply to change the SPD/PU multiplicity cuts quoted in Table 7.1 while keeping the rest of the cuts unchanged. The cut on the Level-1 variable, ∆_0.04, described in Section 7.2 also needs to be modified to keep the 40 kHz output rate.


Table 7.9: Level-0/Level-1 efficiencies for different robustness scenarios divided by the efficiencies using the default settings.

                      B0s → D∓sK±    B0s → J/ψ(µµ)φ   B0s → K+K−    B0d → K∗0γ
L0 Global Test        0.93 ± 0.02    1.01 ± 0.01      0.95 ± 0.01   0.93 ± 0.02
L0 PYTHIA Test        1.06 ± 0.02    1.01 ± 0.01      1.11 ± 0.01   1.03 ± 0.02
L0 Beam Spot Test     0.98 ± 0.02    1.01 ± 0.01      1.00 ± 0.01   0.96 ± 0.02
L0 VELO Test          1.00 ± 0.02    1.01 ± 0.01      1.01 ± 0.01   1.00 ± 0.02
L1 Global Test        0.84 ± 0.03    0.92 ± 0.01      0.87 ± 0.01   0.82 ± 0.03
L1 PYTHIA Test        0.97 ± 0.03    0.99 ± 0.01      0.97 ± 0.01   0.99 ± 0.03
L1 Beam Spot Test     0.90 ± 0.03    1.00 ± 0.01      0.94 ± 0.01   0.97 ± 0.03
L1 VELO Test          0.92 ± 0.03    0.96 ± 0.01      0.89 ± 0.01   0.89 ± 0.03


We have not tried to modify the algorithms to adapt to each robustness scenario; rather, we quote the efficiencies as they come out of the same algorithms, which may be regarded as an estimate of the order of magnitude of the uncertainties in the trigger performance. The results are quoted in Table 7.9. In general, the Level-0 performance is stable within 10%, while the Level-1 performance is stable within 20%.

Independently of the uncertainties in the simulation, we have also considered the scenario in which on “day one” we do not have the full CPU power, and we start running the experiment with part of the event-building network and part of the nominal number of CPUs assigned to L1/HLT. The trigger efficiency for B0s → D∓sK± events, normalized to offline selected events, is shown as a function of different Level-0 and Level-1 output rates in Figure 7.10.

In conclusion, the trigger performance satisfies the physics requirements of the experiment, and the system is robust: the uncertainty on the expected efficiency is evaluated to be no larger than 25%, even for the most pessimistic scenarios considered.

Figure 7.10: Trigger efficiency for B0s → D∓sK± events, normalized to offline selected events, as a function of the Level-1 output rate, for Level-0 output rates of 1.00, 0.75 and 0.50 MHz.


Chapter 8 Project Organization

This chapter deals with the managerial aspects of the trigger systems. Information is presented on the current cost estimates of the systems, the planning schedules and the distribution of responsibilities.

8.1 Cost and Funding

The breakdown of the cost of the trigger is shown in Table 8.1.

For both the Calorimeter Triggers and the Pile-Up System, infrastructure is shared between the trigger components and their respective detectors. The cost listed excludes those items already accounted for in their detector TDRs.

For the prices of commercial hardware like PCs and switches, used in the implementation of Level-1 and the HLT, the numbers projected by the technology tracking group at CERN have been used [61, 77].

In the Online System TDR [9] the cost of the part of the Online System which excludes TFC, ECS and general infrastructure was 4,244 kCHF. This value is now superseded by the 5,711 kCHF of the combined Level-1 & HLT system as indicated in Table 8.1.

8.2 Schedule

The detailed project schedules of the sub-systems are shown in Figure 8.1. All schedules assume that the commissioning of the individual sub-systems has to be finished by September 2006, after which date LHCb starts the overall commissioning of the spectrometer, to be ready for beam at the beginning of 2007.

Some sub-systems contain several different boards, in which case the milestones refer to the total number of boards, irrespective of their type.

The L1&HLT System will have enough functionality in 2006 to allow a test of the whole detector, but not at the design bandwidth. The bulk of the CPUs will be acquired and installed as late as possible. Table 8.2 contains a set of milestones extracted from the project schedules.

8.3 Division of Responsibilities

Institutes currently involved in the LHCb trigger are Annecy, Barcelona, Bologna, CERN, Clermont-Ferrand, Krakow, Lausanne, NIKHEF, Marseille, Orsay and Rio de Janeiro. The sharing of responsibilities is listed in Table 8.3.



Table 8.1: Components (needed + spares) and cost of the trigger.

Item                                    Quantity      Total cost [kCHF]
Calorimeter Triggers
  CaloFE Card, 9U (trigger part)        238 + 31
  PrsFE Card, 9U (trigger part)         100 + 10
  Validation Card, 9U                    28 + 3
  SPD Multiplicity Card (part), 9U       16 + 2
  Crates (part) and Backplane            26 + 4
  Optical links                         208 + 30
  Selection Cards, 9U                     8 + 2
  Selection Crate                         1 + 1
  Readout Card, 9U                        1 + 1
  Total                                                950
Muon Trigger
  PU board, 9U                           60 + 6
  Muon Selection board, 9U               12 + 2
  Controller board, 9U                    4 + 2
  Backplane                               4 + 2
  Crates                                  4 + 2
  Short distance OL ribbon              120 + 12
  Total                                               1000
Pile-Up System
  Hybrids                                 4 + 1
  Cables
  Repeater station                        1
  Optical ribbon connection               8 + 1
  Processing system                       1
  Sensors                                 4 + 2
  SC/HV/LV
  Crates                                  1
  TELL1 board                             5
  Total                                                420
L0 Decision Unit
  TELL1 board                             1 + 2
  Mezzanine Cards                         1 + 2
  Total                                                 60
Level-1 and the HLT DAQ
  Mux switches L1                        62 + 7
  Mux switches HLT                       29 + 3
  Sub-farm switches                      94 + 10
  Readout network ports                 190 + 19
  SFCs                                   94 + 10
  CPUs                                 1786 + 188 (a)
  TRM                                     1 + 1
  Decision sorter                         1 + 1
  Total                                               5711
Total                                                 8141

(a) All CPUs available will be active in the system at any time; the spares can be seen as “hot spares”.


Figure 8.1: Project schedule of the trigger. The chart covers the L0 Calo (Calo FE, PreShower FE, Validation Card, Selection Cards), L0 Muon, L0 Pile-Up System, L0DU and L1&HLT hardware sub-projects, with production readiness reviews, production, acceptance-test and installation milestones spanning 2004 to early 2007.


Table 8.2: List of major milestones.

Milestone                                   Date
Calorimeter Triggers
  Start of board production                 9/2004
  50% of boards produced                    10/2005
  100% of boards produced and tested        6/2006
Muon Trigger
  Start of board production                 8/2005
  25% of boards produced                    2/2006
  100% of boards produced and tested        8/2006
Pile-Up System
  Detector produced                         1/2006
  Trigger boards produced and tested        5/2006
Level-0 Decision Unit
  Boards produced and tested                6/2006
Level-1 & HLT
  Sorter Implementation Decision            5/2004
  Event Builder Switch specification        6/2004
  Subfarm Controller specification          1/2005
  10(5)% of network(farm) ready             7/2005
  Trigger installed                         9/2006
  100% of network ready                     10/2006
  Full farm installed                       3/2007

Table 8.3: Breakdown of the responsibilities of the Trigger Systems.

System                                                      Institute
Level-0 Calorimeter Triggers
  ECAL/HCAL FE + coordination                               Orsay
  PreShower FE                                              Clermont-Ferrand
  SPD multiplicity board                                    Barcelona
  Validation Card                                           Annecy
  Selection Crate                                           Bologna
Level-0 Muon Trigger                                        Marseille
Level-0 Pile-Up System                                      NIKHEF
Level-0 Decision Unit                                       Clermont-Ferrand
Level-1 & HLT
  Data movement upstream of farm and event-building         CERN/Lausanne
  Gigabit Ethernet Mezzanine Card                           CERN
  Installation, configuration and procurement of hardware   CERN
  Implementation of sub-farm                                Bologna
  Simulation of network and switch specification            Bologna/CERN
  Online adaptation of offline software framework           Rio de Janeiro
  L1&HLT algorithms                                         CERN/Krakow/Lausanne

Appendix A Scalability of Level-1

The total size of the readout network for Level-1 and the HLT has been based on simulation. As has been shown in Section 7.4, different conditions would alter the data size and the performance of Level-1. Hence, the DAQ is required to be flexible enough to adapt quickly to actual needs.

In this section an extension of the data made available to Level-1 is discussed as an example of what this scalability encompasses. The additional data comprise Level-1 data from the tracking detectors T1 to T3 and the Muon detector stations M2 to M5; their size is given in Table A.1.

Table A.1: Number of L1 data sources and average event fragment size per source, which does not include any transport overheads. IT, OT and M2–M5 are the extra sources added to L1.

Subsystem     Number of sources    Data/source [Bytes]
VELO          76                   36
TT            48                   24
L0DU          1                    86
Calo Crate    1                    70
IT            42                   23
OT            48                   60
M2–M5         8                    54

The numbers given here for the additional components required for the DAQ system illustrate the scalability of the system. The numbers for the system presented in the main text are repeated for comparison.

The system is perfectly scalable to include only part of this maximum extension.

The input parameters remain as shown in Table 6.2.

The following remarks should illustrate how the numbers in Table A.2 are derived:

• The target number of worker nodes remains the same, ∼1800¹.

• The number of SFCs must be increased to cope with the increased bandwidth at the output of the network. Each sub-farm is still limited to 100 MB/s (c.f. Table 6.2).

• The number of worker nodes is required to be identical for each sub-farm. Thus the actual total number of worker nodes, and consequently the event rate per CPU node, varies slightly compared to Table 6.3.

The figures for the extended system are indicated in the architecture shown in Figure A.1. Table A.3 lists the number of components required if data for IT, OT and Muon stations M2 to M5 were included in Level-1. It illustrates that the system scales quite nicely, taking into account that the event size almost doubles, as shown in Table A.2.
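As a cross-check of that statement, the fragment sizes of Table A.1 can be summed directly; this is a sketch that, like the table, ignores transport overheads.

    # Level-1 event size from the per-source fragment sizes of Table A.1.
    standard  = {"VELO": (76, 36), "TT": (48, 24), "L0DU": (1, 86), "Calo": (1, 70)}
    extension = {"IT": (42, 23), "OT": (48, 60), "M2-M5": (8, 54)}

    size_std = sum(n * b for n, b in standard.values())    # 4044 bytes
    size_ext = sum(n * b for n, b in extension.values())   # 4278 bytes

    # The extension adds roughly as much data as the standard event carries,
    # i.e. the event size almost doubles, as stated in the text.
    print(f"standard: {size_std} B, extended: {size_std + size_ext} B")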

The cost estimate is based on prices for a system ready in 2007. The additional cost for this enlarged system, using the cost figures from [61, 77], is ∼2100 kCHF. An “upgrade” at a later time will be cheaper.

¹ It might seem odd at first that here, with the same number of CPUs, a larger Level-1 event is processed for the Level-1 trigger. However, the increased information available to the Level-1 algorithm will improve its rejection power, thus allowing the Level-1 output rate, and consequently the number of CPUs required for the HLT and reconstruction, to be reduced.



Table A.2: Key performance figures of the standard and the extended Level-1 and HLT DAQ system.

                                         VELO, TT    + IT, OT, M2–M5
Event Building
  Total frame rate at RN input [kHz]     7248        12990
  RN output links                        94          175
  RN output link rate [MB/s]             47.9        52.8
  Frame rate (L1) per link [kHz]         59.9        68.9
  Frame rate (HLT) per link [kHz]        20.0        10.7
  Total frame rate at RN output [kHz]    79.6        79.9
  MEP rate (L1) per link [kHz]           0.47        0.25
  MEP rate (HLT) per link [kHz]          0.05        0.03
  Total MEP rate [kHz]                   0.53        0.28
Trigger farms
  Sub-farms                              94          175
  Event rate/sub-farm (L1) [kHz]         11.7        6.3
  Event rate/sub-farm (HLT) [kHz]        0.4         0.2
  Processors/sub-farm                    21          12
  Event rate per processor (L1) [kHz]    0.56        0.53
  Event rate per processor (HLT) [kHz]   0.02        0.02

Figure A.1: The architecture of the Level-1 and HLT DAQ system. The numbers on the left side show the increase in scale required by the additional data in Level-1.


Table A.3: Number of items for the hardware implementation of the Level-1 and HLT DAQ system. The number of spares is listed separately.

Item                      VELO, TT       + IT, OT, M2–M5
Mux switches L1           62 + 7         87 + 9
Mux switches HLT          29 + 3         29 + 3
Sub-farm switches         94 + 10        175 + 18
Readout network ports     190 + 19       344 + 35
SFCs                      94 + 10        175 + 18
CPUs                      1786 + 188     1925 + 175
TRM                       1 + 1          1 + 1
Decision sorter           1 + 1          1 + 1


References

[1] LHCb Technical Proposal, LHCb, CERN/LHCC 98–4.

[2] LHCb Reoptimized Detector Design and Performance Technical Design Report, LHCb, CERN LHCC 2003–030.

[3] LHCb Magnet Technical Design Report, LHCb, CERN LHCC 2000–007.

[4] LHCb Calorimeter Technical Design Report, LHCb, CERN LHCC 2000–036.

[5] LHCb RICH Technical Design Report, LHCb, CERN LHCC 2000–037.

[6] LHCb Muon System Technical Design Report, LHCb, CERN LHCC 2001–010.

[7] LHCb VELO Technical Design Report, LHCb, CERN LHCC 2001–011.

[8] LHCb Outer Tracker Technical Design Report, LHCb, CERN LHCC 2001–024.

[9] LHCb Online System Technical Design Report, LHCb, CERN LHCC 2001–040.

[10] LHCb Inner Tracker Technical Design Report, LHCb, CERN LHCC 2002–029.

[11] Requirements to the L0 front-end electronics, J. Christiansen, LHCb 2001–014; Requirements to the L1 front-end electronics, J. Christiansen, LHCb 2003–078.

[12] Timing, Trigger and Control (TTC) Systems for the LHC, http://ttc.web.cern.ch/TTC/intro.html

[13] The latency of the Level-0 trigger, J. Christiansen et al., LHCb 99–015.

[14] Simulation of the LHCb front-end, J. Christiansen and I. Garcia Alfonso, LHCb 99–047.

[15] Readout supervisor design specifications, R. Jacobsson, B. Jost and Z. Guzik, LHCb 2001–012.

[16] Common L1 read out board for LHCb specification, A. Bay et al., LHCb 2003–007.

[17] Single Event Effects, Actel AX PGA, F. Machefert, LHCb 2002–072.

[18] LHCb Calorimeter Front-End Electronics, Radiation Dose and Single Event Effects, C. Beigbeder et al., LHCb 2002–021.

[19] The Trigger Part of the Calorimeter Front-End Card, C. Beigbeder et al., LHCb 2003–037.

[20] Functional specification of the PGAs for the ECAL/HCAL Front-End Card, C. Beigbeder et al., LHCb 2003–036.

[21] Front-End Electronics for LHCb Preshower Trigger Part, G. Bohner et al., LHCb 2003–068.

[22] The Validation Card for the Calorimeter Triggers, C. Drancourt et al., LHCb 2003–120.

[23] Implementation and performance of Level-0 trigger system for neutral pions, O. Deschamps and P. Perret, LHCb 2003–067.

[24] The Backplane of the Calorimeter Front-End crate, C. Beigbeder et al., LHCb 2003–038.

[25] The Selection Crate for the L0 Calorimeter Trigger, G. Balbi et al., LHCb 2003–095.

[26] A realistic algorithm for the L0 muon trigger, E. Aslanides et al., LHCb 2002–042.

[27] A synchronous architecture for the L0 muon trigger, E. Aslanides et al., LHCb 2001–010.

[28] Performance of the muon trigger with a realistic simulation, E. Aslanides et al., LHCb 2002–041.


[29] Muon trigger performance with the reoptimized LHCb detector, E. Aslanides et al., LHCb 2003–074.

[30] Machine halo in LHCb for various vacuum conditions, G. Corti and G. von Holtey, LHCb 2003–086.

[31] Specification of the muon trigger processing board, E. Aslanides et al., LHCb 2002–003.

[32] Gigabit Optical Link (GOL) Transmitter, CERN Microelectronics Group, http://proj-gol.web.cern.ch/proj-gol

[33] Quartz Crystal Phase-Locked Loop (QPLL), CERN Microelectronics Group, http://proj-qpll.web.cern.ch/proj-qpll

[34] High speed ribbon optical link for the L0 muon trigger, E. Aslanides et al., LHCb 2003–008.

[35] Study of the LHCb pile-up trigger and the Bs → J/ψφ decay, N. Zaitsev, Thesis, University of Amsterdam, 2000.

[36] The LHCb vertex triggers, N. Zaitsev and L. Wiggers, Nucl. Instrum. Meth. A447, 235–243, 2000.

[37] The LHCb Vertex Locator and Level-1 Trigger, H. Dijkstra, Nucl. Instrum. Meth. A453, 126–130, 2000.

[38] R-Sensor sectors and Strip Pitch, L. Wiggers et al., LHCb 2003–012.

[39] Pile-Up System Simulations, M. Zupan and M. Ferro-Luzzi, LHCb 2003–070.

[40] The Beetle Reference Manual, N. van Bakel et al., LHCb 2001–046.

[41] Performance of the Beetle readout chip for LHCb, N. van Bakel et al., Proc. 8th Workshop on Electronics for LHC Experiments, Colmar, 121–124, 2002.

[42] Investigation of the Beetle.1.1 chip in the X7 testbeam, N. van Bakel et al., LHCb 2002–053.

[43] Beetle Comparator Implementation, M. van Beuzekom and H. Verkooijen, LHCb 2003–071.

[44] The Level 0 Decision Unit for LHCb, R. Cornat, J. Lecoq and P. Perret, LHCb 2003–065.

[45] Using the SPD multiplicity in the Level-0 trigger, O. Callot, M. Ferro-Luzzi and P. Perret, LHCb 2003–022.

[46] Test of the first L0DU prototype, L. Bernard, R. Cornat, J. Lecoq, R. Lefevre and P. Perret, LHCb 2003–066.

[47] GiGabit Ethernet mezzanines for DAQ and Trigger links of LHCb, H. Muller et al., LHCb 2003–021.

[48] Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications, IEEE, IEEE Std 802.3, 2000 ed.

[49] Internet Protocol, IETF, RFC 791, and Requirements for Internet Hosts, IETF, RFC 1122.

[50] A common implementation of the Level-1 and High Level Trigger Data Acquisition, A. Barczyk et al., LHCb 2003–079.

[51] An integrated experiment control system: the LHCb approach, C. Gaspar et al., in Proc. IEEE NPSS Real Time Conference 2003.

[52] Raw data transport format, B. Jost and N. Neufeld, LHCb 2003–063.

[53] Gaudi Project, http://proj-gaudi.web.cern.ch/proj-gaudi

[54] Implementing the L1 trigger path, R. Jacobsson, LHCb 2003–080.

[55] LHCb VELO off detector electronics preprocessor and interface to the Level-1 trigger, A. Bay et al., LHCb 2001–043; L1-type Clustering in the VELO on Test-beam Data and Simulation, N. Tuning, LHCb 2003–073.


[56] The LHCb Level-1 Trigger: Architecture, Prototype, Simulation and Algorithm, V. Lindenstruth et al., LHCb 2003–064, Chapter 5.

[57] Matching VELO tracks to L0 objects, N. Tuning, LHCb 2003–039.

[58] VELO-TT matching and momentum determination at Level-1 trigger, M. Witek, LHCb 2003–060.

[59] The relevance of the magnetic field in the Level-1 trigger, H. Dijkstra et al., LHCb 2003–110.

[60] Execution time optimization of Level-1 algorithm, M. Witek, LHCb 2003–061.

[61] Processors, Memory and Basic Systems, Working Group A, PASTA 2002 ed., http://lcg.web.cern.ch/LCG/PEB/PASTAIII/pasta2002Report.htm

[62] VeloTracking for the High Level Trigger, O. Callot, LHCb 2003–027.

[63] The Forward tracking, an optical model method, M. Benayoun and O. Callot, LHCb 2002–008.

[64] Study of the LHCb Trigger Performance Robustness, F. Teubert, LHCb 2003–059.

[65] R. Field, private communication, and talks available at http://www.phys.ufl.edu/~rfield/cdf/rdf_talks.html; we have considered the PYTHIA 6.2 settings referred to as “tune A” on http://www.phys.ufl.edu/~rfield/cdf/tunes/rdf_tunes.html

[66] High-Energy-Physics Event Generation with PYTHIA 6.1, T. Sjostrand et al., Computer Physics Commun. 135 (2001) 238.

[67] Some comments on the running scenario for LHCb, O. Callot, LHCb 2000–095.

[68] GEANT Detector Description and Simulation Tool, CERN Program Library Long Write-up W5013 (1994).

[69] Effect of Multiplicity Cuts on the L0 and L1 Triggers, M. Ferro-Luzzi et al., LHCb 2003–047.

[70] Level-0 Trigger Bandwidth Division, E. Rodrigues, LHCb 2003–048.

[71] Level-1 decision algorithm and bandwidth division, C. Jacoby and T. Schietinger, LHCb 2003–111.

[72] The use of the TT1 tracking station in the level-1 trigger, H. Dijkstra et al., LHCb 2002–045.

[73] The control channel B0 → J/ψ(µµ)K∗0, L. de Paula and E.C. de Oliveira, LHCb 2003–108.

[74] Selection of B0(s) → h+h− decays at LHCb, V. Vagnoni et al., LHCb 2003–123.

[75] B0s → D∓sK± and B0s → D−sπ+ event selection, A. Golutvin, R. Hierck, J. van Hunen, M. Prodkudin and R. White, LHCb 2003–127.

[76] Radiative B decays with LHCb, I. Belyaev and G. Pakhlova, LHCb 2003–090.

[77] Networking Technology, Working Group D, PASTA 2002 ed., http://lcg.web.cern.ch/LCG/PEB/PASTAIII/pasta2002Report.htm