
Triggers: What, where, why, when and how
ATLAS as an example (Other detectors do exist...)

Alex Martyniuk (UCL)

November 21, 2017


Triggering: What is it even?

Triggering: A system/process to initiate a detector’s readout system to record an event of potential interest

Many modern particle physics experiments deploy multi-level trigger and data acquisition (TDAQ) systems to record their desired events

I will concentrate on how the ATLAS experiment meets this challenge (personal bias)

Hopefully I will explain each of these parts

Disclaimer: Other detectors approach triggers in different ways, depending on their needs/challenges


Triggering: Why even trigger?

First question: Why don’t we just record every single event produced in ATLAS?

Reason #1: The data rates are too damn high!
– Nominal LHC bunch crossing rate is 40 MHz
– A raw ATLAS event is O(2 MB)

Back of the envelope: O(80 TB/s), O(288 PB/hr), O(6.9 EB/day)

i.e. we would need more storage than Google owns after a few days... Silly...

Also, the detector would likely be on fire (possibly literally)
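The back-of-the-envelope numbers above can be reproduced in a few lines. This is only a sketch using the round figures from the slide, and it assumes decimal (SI) prefixes.

```python
# Data rate if every bunch crossing were recorded.
# Inputs are the round numbers quoted on the slide; SI prefixes assumed.
BUNCH_CROSSING_RATE_HZ = 40e6  # nominal LHC bunch crossing rate, 40 MHz
EVENT_SIZE_BYTES = 2e6         # raw ATLAS event, O(2 MB)

bytes_per_second = BUNCH_CROSSING_RATE_HZ * EVENT_SIZE_BYTES
bytes_per_hour = bytes_per_second * 3600
bytes_per_day = bytes_per_hour * 24

print(f"{bytes_per_second / 1e12:.0f} TB/s")
print(f"{bytes_per_hour / 1e15:.0f} PB/hr")
print(f"{bytes_per_day / 1e18:.1f} EB/day")
```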


https://what-if.xkcd.com/63/

Triggering: Why even trigger?

Second question: Do we even want to record all events?

Reason #2: Most events are really quite boring (subjectively)

Of the total cross-section, O(10^11 pb):

– Most collisions are inelastic
– Or jet production (it is a hadron collider)

“Interesting” stuff (subjective) doesn’t start until many orders of magnitude lower in cross-section

The more you record, the more you need to throw away later

[Figure: Standard Model Production Cross Section Measurements (ATLAS Preliminary, Run 1,2, status: July 2017). Measured total and fiducial cross-sections σ [pb] compared with theory, for pp (total, inelastic), jets, dijets, γ, W, Z, tt̄, t, VV, γγ, H, WV, Vγ, tt̄W, tt̄Z, tt̄γ and electroweak/multiboson processes, spanning roughly 10^−3 to 10^11 pb. Data: LHC pp √s = 7 TeV (4.5−4.9 fb^−1), 8 TeV (20.3 fb^−1) and 13 TeV (0.08−36.1 fb^−1).]


Triggering: How?

ATLAS deploys a multi-level trigger system alongside its detector readout

Level-1:
– Hardware based trigger
– Fast, 2.2 µs latency
– Uses coarse data from calorimeters and muon system
– Reduces input rate to 75-100 kHz

High-level trigger (HLT):
– Software based trigger
– Slower, O(1 s) latency
– Uses event data from all detectors
– Reduces input rate to O(1 kHz)
– O(2 GB/s) recorded to tape
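As a rough cross-check of these numbers, the rejection factor of each level and the output bandwidth can be worked out as below. The 2 MB event size is the O(2 MB) raw event size quoted earlier, and the upper end of the Level-1 range is assumed.

```python
# Rate reduction through the trigger levels, using round numbers from the slides.
BUNCH_CROSSING_RATE_HZ = 40e6  # input to Level-1
L1_OUTPUT_RATE_HZ = 100e3      # upper end of the 75-100 kHz Level-1 output (assumed)
HLT_OUTPUT_RATE_HZ = 1e3       # O(1 kHz) written out
EVENT_SIZE_BYTES = 2e6         # O(2 MB) raw event, an approximation

l1_rejection = BUNCH_CROSSING_RATE_HZ / L1_OUTPUT_RATE_HZ
hlt_rejection = L1_OUTPUT_RATE_HZ / HLT_OUTPUT_RATE_HZ
total_rejection = l1_rejection * hlt_rejection
output_bandwidth = HLT_OUTPUT_RATE_HZ * EVENT_SIZE_BYTES

print(f"L1 keeps about 1 in {l1_rejection:.0f} bunch crossings")
print(f"HLT keeps about 1 in {hlt_rejection:.0f} L1-accepted events")
print(f"Overall: about 1 in {total_rejection:.0f}")
print(f"Output bandwidth: {output_bandwidth / 1e9:.0f} GB/s")
```

The last line lands on the O(2 GB/s) to tape quoted above, which is a useful sanity check that the numbers hang together.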


Level-1 – Architecture

Level-1 Aims

– Hardware based trigger, with fast, 2.2 µs latency due to pipelines
– Reduces input rate to 75-100 kHz, partially dependent on detectors/readout

Timing constraints only allow readout of calorimeters and fast-tracking detectors in muon system

– Clearly only a subset of detectors
– Need to reduce rate to allow full readout to occur

Dedicated calo/muon hardware processors digitise and interpret signals

Pass to the Central Trigger Processor (CTP) the multiplicity of thresholds passed (e.g. 2MU4)

L1Topo can perform more complex checks, ∆φ, MJJ ...


Level-1 – Items

Only have muon/calorimeter information, but can do a lot with that

Electrons, muons, taus, jets, E_T^miss, total energy

When you add in L1Topo, this expands to many additional kinematic and topological lists/combinations

L1 Items:

– Individual signatures that the processors search for and count the multiplicity of, e.g. J100, EM22VH, MU24...

L1 Calo

– Sliding window used in L1Calo, finds local maxima with isolation guard ring (8×8 || 4×4)
– Similar method used for electrons/photons/taus/jets
– Simple cone algorithm used in L1Topo
– ΣE_T and E_T^miss done by summing towers
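The sliding-window idea can be sketched in software as follows. This is only an illustration and not the L1Calo firmware. The 2×2 window, the threshold and the tie-breaking rule are assumptions, and the isolation ring is left out.

```python
def window_sums(towers, size):
    """Sum tower ET in every size x size window, keyed by the window's top-left (row, col)."""
    n_rows, n_cols = len(towers), len(towers[0])
    sums = {}
    for r in range(n_rows - size + 1):
        for c in range(n_cols - size + 1):
            sums[(r, c)] = sum(
                towers[r + dr][c + dc] for dr in range(size) for dc in range(size)
            )
    return sums


def local_maxima(towers, size=2, threshold=10.0):
    """Return (row, col, et) for windows above threshold that beat their eight neighbours.

    Ties are broken asymmetrically (>= against lower-index neighbours, > against
    higher-index ones) so that two equal neighbouring windows do not both fire.
    """
    sums = window_sums(towers, size)
    found = []
    for (r, c), et in sums.items():
        if et < threshold:
            continue
        is_max = True
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                other = sums.get((r + dr, c + dc))
                if (dr, dc) == (0, 0) or other is None:
                    continue
                lower_index = (dr, dc) < (0, 0)
                if (other > et) if lower_index else (other >= et):
                    is_max = False
        if is_max:
            found.append((r, c, et))
    return found


# Hypothetical 4x4 grid of tower ET (GeV) with one deposit shared between towers
towers = [
    [0, 0, 0, 0],
    [0, 6, 5, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(local_maxima(towers, size=2, threshold=10))
```

Two overlapping windows are above threshold here, but only the one containing the most energy is kept. That is the point of the local-maximum requirement: one deposit gives one candidate.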



L1 Muon

– TGCs, RPCs and CSCs form the L1 Muon system (Yay, TLAs!)
– Form muon roads, connecting hits in the trigger chambers
– Provides ROI to HLT to search for combined tracks within


Level-1→HLT Handover

Now have a list of multiplicities of L1 Items found in the event

CTP takes all these inputs and checks against a menu

If one item passes, the entire event is read out and handed over to the HLT
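A toy version of this menu check, assuming item names of the form "2MU4" (an optional multiplicity followed by a threshold name). Real L1 items can combine several thresholds, which this sketch ignores. The function names are made up for illustration.

```python
import re

# Optional leading multiplicity, then the threshold name, e.g. "2MU4" -> (2, "MU4")
ITEM_PATTERN = re.compile(r"^(\d*)(\D.*)$")


def parse_item(item):
    """Split an item name into (required multiplicity, threshold name)."""
    match = ITEM_PATTERN.match(item)
    if match is None:
        raise ValueError(f"cannot parse item name: {item!r}")
    multiplicity = int(match.group(1)) if match.group(1) else 1
    return multiplicity, match.group(2)


def fired_items(threshold_counts, menu):
    """Return the menu items satisfied by the per-threshold multiplicities."""
    fired = []
    for item in menu:
        required, threshold = parse_item(item)
        if threshold_counts.get(threshold, 0) >= required:
            fired.append(item)
    return fired


menu = ["2MU4", "MU24", "EM22VH", "J100"]
counts = {"MU4": 2, "MU24": 0, "EM22VH": 0, "J100": 0}  # what the processors report

fired = fired_items(counts, menu)
l1_accept = len(fired) > 0  # one item is enough to read out the event
print(fired, l1_accept)
```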

Read-out system (ROS) collects data from front-end boards
Collates information from the whole detector into an event, which can be sent to HLT when it needs it

Regions of interest (ROIs)

– Items passed with an ROI, so that the HLT does not ‘have’ to look everywhere again
– Could then be combined at HLT into a super-ROI
– Or ignored completely in favour of a full scan


Level-1: Bunch crossings

LHC Fill Patterns

– Bunch structure matters for the L1 triggers
– Triggers formed by a logical AND of an L1 item and a type of bunch crossing: filled/empty, paired/unpaired...
– Response of detectors also changes depending on bunch position in train, which affects rates

Example of possible fill pattern issues:
– The time taken for the ionisation from a hadronic shower to be read out spans ≈ 18-24 bunch crossings
– Many overlapping signals in the detector at the same time
– Pulse shape tries to smooth this out, but position of the bunch in the train can lead to over or under correction


Level-1: Dead time

Deadtime

– Deadtime is there to halt the system in certain situations
– Simple Deadtime: After an event is recorded, no triggers can fire for a set number of bunch crossings
– Complex Deadtime: CTP modelled as a bucket with a hole
  – If there is space for a trigger to be put in then it goes in the bucket
  – No space then complex deadtime holds trigger until there is enough space
– Smooths the output rate of the system
– Backpressure through the system (detector read out issues, HLT farm on fire, etc.) can also halt the system creating deadtime, want to keep this to a minimum
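Both deadtime types can be sketched as below. The parameter names and values are made up for illustration. A trigger arriving at a full bucket is simply vetoed here, which is an assumption about what "holds" means in practice.

```python
def simple_deadtime(trigger_bcs, veto_length):
    """After each accepted trigger, veto the next veto_length bunch crossings."""
    accepted = []
    for bc in trigger_bcs:
        if not accepted or bc - accepted[-1] > veto_length:
            accepted.append(bc)
    return accepted


def complex_deadtime(trigger_bcs, capacity, leak_period):
    """Leaky bucket: each accepted trigger fills one slot, one slot drains
    every leak_period bunch crossings, and a full bucket vetoes the trigger.

    trigger_bcs must be sorted bunch-crossing numbers.
    """
    level = 0
    last_leak_bc = 0
    accepted, vetoed = [], []
    for bc in trigger_bcs:
        leaked = (bc - last_leak_bc) // leak_period
        if leaked > 0:
            level = max(0, level - leaked)
            last_leak_bc += leaked * leak_period
        if level < capacity:
            level += 1
            accepted.append(bc)
        else:
            vetoed.append(bc)
    return accepted, vetoed


burst = [0, 1, 2, 3, 4, 400]  # a burst of triggers, then a quiet period
print(simple_deadtime(burst, veto_length=4))
print(complex_deadtime(burst, capacity=3, leak_period=100))
```

The bucket lets a short burst through but caps the sustained rate at one trigger per leak_period, which is how it smooths the output rate.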


Backpressure == Bad Times!


HLT Farm

– Huge bank of more than 40k cores dedicated to running the HLT
– Receives full event info, ROIs and L1 items fired
– Runs close to offline software to reconstruct objects in ROIs or the full event
– Menu of HLT chains decides whether to keep the event based on reconstructed objects
– Has O(1 s) in which to make its decision


HLT Algorithms

What is the HLT actually doing?
Offline reconstruction is too slow to run online: ≥ 10 s vs the ≤ 1 s needed

Perform step-wise processing with early rejection to reduce time taken

1 Fast reconstruction

– Trigger-specific or special configuration of offline algorithms
– Guided by L1 ROIs

2 Precision reconstruction

– Offline (or close to) algorithms
– Full detector data available

As soon as one step fails, stop processing!

Streaming

Events are always written out if any trigger passes

Written to different streams depending on which trigger passed

Can be written to a debug stream if something went wrong, e.g. timed out
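Step-wise processing with early rejection, followed by streaming, can be sketched like this. The step functions, cut values, event fields and the "Main" stream name are all hypothetical. Only the control flow is the point.

```python
def run_chain(event, steps):
    """Run the steps in order and stop at the first one that fails."""
    for step in steps:
        if not step(event):
            return False  # early rejection: later (slower) steps never run
    return True


def assign_streams(event, chains):
    """Return the set of streams the event is written to (empty set: dropped).

    chains maps a chain name to (stream name, list of steps).
    """
    streams = set()
    for stream, steps in chains.values():
        try:
            if run_chain(event, steps):
                streams.add(stream)
        except TimeoutError:
            streams.add("debug")  # something went wrong, keep the event to inspect
    return streams


calls = []  # records which steps actually ran


def fast_muon(event):
    calls.append("fast_muon")
    return event["fast_pt"] > 20.0


def precision_muon(event):
    calls.append("precision_muon")
    return event["precise_pt"] > 24.0


def stuck_step(event):
    raise TimeoutError("step exceeded its time budget")


muon_chains = {"mu24": ("Main", [fast_muon, precision_muon])}
soft_event = {"fast_pt": 10.0, "precise_pt": 30.0}

print(assign_streams(soft_event, muon_chains))  # fails the fast step
print(calls)                                    # the precision step never ran
```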


Trigger Menu

The trigger menu defines the physics program/reach of ATLAS, i.e. what it records

Each physics signature defines a set of trigger chains
The collection of all signatures forms the full trigger menu
The menu consists of:

– Primary physics triggers
– Support triggers
– Calibration and timing triggers

Current menus contain around 2000-3000 trigger chains
Peak rate of 1.5 kHz, average of 1 kHz

Menu varies with luminosity, time and running conditions

Overall menu design driven by:
– Physics priorities
– Rate limitations at L1/HLT
– Online resources (CPU, bandwidth)


Prescaled Triggers

Not all triggers need to or indeed can run at their full rate

– Rate might be too high
– A sub-sample might be enough to fulfil needs (support triggers)
– Adding in triggers as the luminosity naturally drops (unless levelling) leads to an ‘optimal’ usage of resources

Prescales used to reduce output rate
– Prescale of N means system accepts ‘1 out of N events’
– Prescales can be fractional
– Can be applied at L1 and/or at the HLT
– Prescales can change during the run, i.e. can change the rate of a trigger, add it or remove it
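One way to picture a prescale, including a fractional one, is a random accept with probability 1/N. This is a sketch of the idea and not how the CTP or HLT implements it. Treating values below 1 as "disabled" is a convention chosen for this example.

```python
import random


def passes_prescale(prescale, rng):
    """Accept an event with probability 1/prescale.

    prescale == 1 keeps everything. Values below 1 are treated as
    'trigger disabled' (a convention assumed for this sketch).
    """
    if prescale < 1:
        return False
    return rng.random() < 1.0 / prescale


rng = random.Random(2017)  # fixed seed so the example is reproducible
n_events = 100_000
n_kept = sum(passes_prescale(2.5, rng) for _ in range(n_events))
print(f"kept fraction: {n_kept / n_events:.3f} (expect about {1 / 2.5:.3f})")
```

On the analysis side this is why each event recorded by a prescaled trigger stands in for N events, so yields need to be corrected by the prescale.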


Trigger Configuration

– This all has to hold together in a coherent way, to load into the hardware/farm and run the trigger on events
– Trigger Menu is stored in an Oracle database
  – SMK (super master key): describes the contents of the L1/Topo/HLT menus
  – L1PSK (L1 prescale key): sets the L1 prescales
  – HLTPSK (HLT prescale key): sets the HLT prescales
  – BGK (bunch group key): describes the LHC fill pattern


What does an analyser care about?

Three Main Things

Where is the trigger turn-on?
– Where does the trigger reach maximal efficiency w.r.t. offline objects?

What is the peak efficiency?
– Is it 100%? Or do you need a scale factor?

Is it prescaled?
– Am I getting all the events? Or do I have to correct for a prescale?

Turn-on and peak efficiency are a function of:

– Resolutions
– Inefficiencies
– Online/Offline differences


Measuring efficiencies/turn ons

How do you measure the efficiency of your trigger?

Efficiency usually defined w.r.t. the objects reconstructed offline

$\varepsilon_{\mathrm{trigger}} = \dfrac{N_{\mathrm{trigger}}}{N_{\mathrm{offline}}}$

Measure via various methods:

Tag-and-probe
– Trigger on one particle (the tag), e.g. leading muon from Z → µµ, and measure how often the sub-leading (the probe) passes the trigger selection

Boot-strap
– Use a sample triggered by a looser (prescaled) trigger to measure the efficiency of a higher threshold trigger

Orthogonal trigger
– Use a sample triggered by one trigger (e.g. muon trigger) to measure the efficiency of a different trigger, e.g. a jet trigger (independent samples)

Simulation/emulation
– Emulate the action of the trigger in your MC
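Whichever method supplies the unbiased sample, the turn-on curve is the efficiency ratio evaluated in bins of the offline quantity. A minimal sketch with made-up probe data and bin edges follows. The error is the simple binomial (normal approximation) estimate, which collapses to zero at 0% or 100% efficiency, so real measurements typically use a proper binomial interval instead.

```python
import math


def turn_on_curve(probes, bin_edges):
    """Efficiency N_trigger / N_offline in bins of offline pT.

    probes: list of (offline_pt, passed_trigger) for offline-selected objects.
    Returns a list of (low, high, efficiency, error) for non-empty bins.
    """
    results = []
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [passed for pt, passed in probes if low <= pt < high]
        n_offline = len(in_bin)
        if n_offline == 0:
            continue
        n_trigger = sum(in_bin)
        eff = n_trigger / n_offline
        err = math.sqrt(eff * (1.0 - eff) / n_offline)  # normal approximation
        results.append((low, high, eff, err))
    return results


# Hypothetical probes: (offline pT in GeV, did the probe fire the trigger?)
probes = [
    (22, False), (23, False), (24, True),
    (26, True), (27, True), (28, False),
    (35, True), (36, True),
]
for low, high, eff, err in turn_on_curve(probes, [20, 25, 30, 40]):
    print(f"{low}-{high} GeV: {eff:.2f} +/- {err:.2f}")
```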


Monte Carlo and Scale Factors

Triggers have to be emulated in the simulated data (Monte Carlo)

Problem is, MC samples are produced before data taking starts

The MC production therefore contains a best-guess trigger menu to cover all known triggers

Contains backups to emulate possible future triggers, cannot second guess everything though

Differences between data/MC always slip in though

– Don’t have perfect knowledge of the year’s run conditions, µ, instantaneous lumi etc.
– The trigger menu is not always fixed, it reacts to changes
– Improvements or bug fixes added

Therefore have to provide trigger scale factors
– Correct the MC to match the observed data
– Provided by trigger signature groups where necessary
– Parameterised as needed in pT, η, φ....
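In its simplest form a scale factor is the ratio of the efficiency measured in data to that in MC, looked up per bin and applied as a weight to MC events. The numbers and the table layout below are invented for illustration. Returning 1.0 outside the measured range is an assumption of this sketch and not a recommendation.

```python
def scale_factor(eff_data, eff_mc):
    """Scale factor = efficiency in data / efficiency in MC."""
    if eff_mc <= 0:
        raise ValueError("MC efficiency must be positive")
    return eff_data / eff_mc


def lookup_scale_factor(pt, sf_table):
    """Find the scale factor for an object's pT.

    sf_table: list of (low, high, scale_factor) bins.
    """
    for low, high, sf in sf_table:
        if low <= pt < high:
            return sf
    return 1.0  # assumption: no correction outside the measured range


# Hypothetical per-bin efficiencies (pT bins in GeV)
sf_table = [
    (25, 30, scale_factor(0.80, 0.85)),
    (30, 40, scale_factor(0.90, 0.95)),
]
mc_event_weight = lookup_scale_factor(32.0, sf_table)
print(f"weight for a 32 GeV object: {mc_event_weight:.3f}")
```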


Challenges

One major challenge to the trigger is pileup
I.e. multiple pp collisions in the same bunch crossing, or the effect of collisions in adjacent crossings

More collisions means more tracks, more jets, more muons etc.

More objects to reconstruct takes more CPU and more time

It is a long slog to get trigger objects to look flat in ⟨µ⟩

– Tracking becomes more difficult and CPU intensive as tracks overlap
– Object isolation loses efficiency, harming one route to lower pT thresholds
– Event sizes increase, causing a knock-on effect on the rate

In short, nobody likes pileup (except maybe the jet trigger, we have crazy plans)


The course of true love never did run...

The main problem with trigger systems is their permanent nature
– Make a cut in your analysis, you can undo it and try another one
– Make a cut in a trigger, that data is gone, even if your cut was wrong

The ATLAS trigger system (other trigger systems are available) is incredibly complicated!!!
Nothing could possibly go wrong, right?
Welllllll..... This delightful example from the start of Run 2 shows what can go wrong

– Above a certain energy trigger towers can saturate
– Pulse peak then lasts multiple bunch crossings

Algorithm in place to pick the right bunch crossing from options

– This error was caused by a single ‘3’ in a DB being set as ‘2’
– Meant saturated towers were assigned to the previous bunch crossing, thus triggering the previous event


Trigger Level Analyses

Search analyses don’t tend to like using prescaled triggers

– An automatic efficiency loss at the trigger level
– Signal events could be lost

Prescales are there to keep rates under control
Have another dial to tune though: event size

Reduce the size of the event by only saving the objects you need for your analysis

Can run unprescaled again (caveats exist)
– In this example, only save the leading few HLT trigger jets with selected variables
– Form the dijet invariant mass and push down below the threshold allowed by normal jet triggers
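With only the jet kinematics saved, the dijet invariant mass can still be formed. A sketch assuming massless jets given as (pT, η, φ), for which m² = 2 pT1 pT2 (cosh ∆η − cos ∆φ). The function name and inputs are illustrative.

```python
import math


def dijet_mass(jet1, jet2):
    """Invariant mass of two massless jets, each given as (pt, eta, phi).

    Uses m^2 = 2 * pt1 * pt2 * (cosh(d_eta) - cos(d_phi)).
    """
    pt1, eta1, phi1 = jet1
    pt2, eta2, phi2 = jet2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


# Two back-to-back central jets of 100 GeV each (hypothetical values)
leading = (100.0, 0.0, 0.0)
subleading = (100.0, 0.0, math.pi)
print(f"m_jj = {dijet_mass(leading, subleading):.1f} GeV")
```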


Summary

What I hope you take away...

ATLAS deploys a two level trigger system

– Level-1: Fast first sweep with hardware thresholds
– HLT: Slower ‘offline-like’ software reconstruction and decisions

It is a complex, configurable system that aims to mesh the needs of the physics program with the capabilities of the detector

As an analyser you should care about:
– Does the trigger you need exist?
– When does it turn on?
– Is it fully efficient? Or do you need a scale factor?
– Is it prescaled?

Remember: If you don’t record the right events in the first place, your selection efficiency is always 0.000...

Questions?

