
Boosted Top Tagging with Deep Neural Networks

Jannicke Pearkes, University of British Columbia, Engineering Physics

Wojtek Fedorko, Alison Lister, Colin Gay
Inter-Experimental Machine Learning Workshop

March 22nd, 2017

Overview

• Introduction
• Method
  – Monte Carlo samples
  – Network architecture & training
• Results
  – Preprocessing
  – pT dependence
  – Pileup dependence
  – Learning what is being learnt
• Next Steps

Introduction

• Train a deep neural network to discriminate between jets originating from top quarks and those originating from QCD background

[Figure: top-quark decay topology. At low top pT the W and b decay products are resolved as separate jets; at high top pT the boosted decay products merge into a single large-radius jet. Image: Emily Thompson]

Monte Carlo Samples

• Signal: Z' → ttbar
• Background: dijet
• Generated with PYTHIA v8.219, NNPDF23 LO AS 0130 QED PDF
• DELPHES v3.4.0 using the default CMS card
• Jets clustered from DELPHES energy-flow objects
• Anti-kT jets selected with R = 1.0
• Trimming performed with the kT algorithm, R = 0.2, pT fraction = 5%
• Signal jets are selected where a truth top decays hadronically within ΔR = 0.75 of a large-radius jet
• Jets are required to have |η| ≤ 2.0
• Jets are subsampled to be flat in pT and signal-matched in η
• Jets with pT between 600 and 2500 GeV are used
• ~4 million signal jets and ~4 million background jets
• Sample divided 80% / 10% / 10% into training, validation, and testing sets

Examples of Jet Images

[Figure: six example jet images in the translated (pseudorapidity η, azimuthal angle) plane, color scale showing jet pT per pixel in GeV. Background jets with pT = 1370, 702, and 2376 GeV; signal jets with pT = 781, 1480, and 2358 GeV.]

Jet images are typically very sparse: roughly 5-10% pixel activation on average when using a 0.1 × 0.1 grid [1].

[1] L. de Oliveira, M. Kagan, L. Mackey, B. Nachman, and A. Schwartzman, "Jet-images -- deep learning edition", JHEP 07 (2016) 069, arXiv:1511.05190 [hep-ph].

Neural Network Inputs

• Use a sequence of jet constituents rather than an image
• Advantages:
  – No loss of information due to pixelization in an image
  – Inputs are more information-dense
• Using 120 constituents, the average activation is 30-50%
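Since the network described on the next slide is fully connected, the ordered constituent list has to be packed into a fixed-length vector. A minimal sketch of that packing, assuming (pT, η, φ) features per constituent and zero-padding; only the 120-constituent cap is from the slides:

```python
import numpy as np

def jet_to_input(constituents, n_max=120, n_features=3):
    """Pad or truncate the pT-ordered constituent list (rows of pT, eta, phi)
    to a fixed-length flat vector for a fully connected network."""
    arr = np.zeros((n_max, n_features))
    n = min(len(constituents), n_max)
    arr[:n] = np.asarray(constituents)[:n]
    return arr.flatten()
```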


Training and Network Architecture

• Implemented with Keras
• Initially planned on using an LSTM, but ended up using a fully connected network
• We found that performance between the LSTM and the fully connected network was very similar, but the deep networks were much faster to train (~10 times), which allowed for faster experimentation with preprocessing techniques and network architectures

Network type: Fully connected
Number of layers: 5, [300, 150, 50, 10, 5, 1]
Number of free parameters: 41,323
Activation function: Rectified linear units, sigmoid on output
Optimizer: Adam
Loss: Binary cross-entropy
Early stopping: Patience of 5
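A minimal Keras sketch of the quoted architecture. The input dimension is not stated on the slides (360 below assumes 120 constituents × 3 features and is a guess; with it, the parameter count will not reproduce the quoted 41,323), and the commented training call is illustrative only:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

# Layer widths from the table; input_dim is an assumption (see above).
model = Sequential()
model.add(Dense(300, activation='relu', input_dim=360))
model.add(Dense(150, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # top (1) vs. QCD (0)

model.compile(optimizer='adam', loss='binary_crossentropy')

# Early stopping on validation loss with patience 5, as quoted.
early_stop = EarlyStopping(monitor='val_loss', patience=5)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
```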

Preprocessing

• Large-radius (R = 1.0) jets are trimmed using R = 0.2 subjets found with the kT algorithm and a pT fraction of 5%
• Order subjets by subjet pT, and jet constituents by constituent pT within each subjet
• We use only the 120 highest-pT jet constituents
• Perform preprocessing using domain knowledge about the physics at hand
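The subjet-then-constituent ordering can be sketched as a nested sort; the (subjet_pt, constituents) pair structure below is an assumed representation, not the authors' data format:

```python
def order_constituents(subjets):
    """Order subjets by descending subjet pT and, within each subjet, order
    constituents (rows of pT, eta, phi) by descending constituent pT."""
    ordered = []
    for _, constituents in sorted(subjets, key=lambda s: s[0], reverse=True):
        ordered.extend(sorted(constituents, key=lambda c: c[0], reverse=True))
    return ordered  # the first 120 entries are fed to the network
```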

No Preprocessing

[Figure: ROC curve, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV, trimming only.]

Trimming only: AUC = 0.83, Rε=50% = 8.85, Rε=80% = 3.36
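The quoted working points (AUC and background rejection R at a fixed signal efficiency ε) can be computed from tagger scores as sketched below; using scikit-learn here is an assumed tool choice, not something stated in the talk:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def tagger_metrics(scores, labels, eff=0.50):
    """AUC and background rejection (1 / false-positive rate) at a given
    signal efficiency. labels: 1 for top jets, 0 for QCD jets.
    Assumes the false-positive rate is nonzero at the working point."""
    fpr, tpr, _ = roc_curve(labels, scores)
    rejection = 1.0 / np.interp(eff, tpr, fpr)  # tpr is monotonically increasing
    return auc(fpr, tpr), rejection
```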

Scale

• Scale pT of all jet constituents by a common factor to ensure that the constituent pT is approximately between 0 and 1
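A one-line version of this scaling; the common factor below (1/2500 GeV⁻¹, the top of the sample's pT range) is an assumption, since the slide only says the factor maps constituent pT roughly into [0, 1]:

```python
def scale_jet(constituents, factor=1.0 / 2500.0):
    """Scale every constituent pT (column 0 of an N x 3 array of
    pT, eta, phi rows) by a common factor."""
    out = constituents.copy()
    out[:, 0] *= factor
    return out
```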

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: trimming only vs. scaling.]

Scaling: AUC = 0.900, Rε=50% = 21.3, Rε=80% = 6.02

Translate

• Center the jet about the highest-pT subjet in the η, φ plane
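A sketch of the centering step, assuming constituents are stored as (pT, η, φ) rows and φ differences are wrapped into (−π, π]:

```python
import numpy as np

def translate_jet(constituents, subjet_eta, subjet_phi):
    """Center constituents on the leading subjet in the (eta, phi) plane."""
    out = constituents.copy()
    out[:, 1] -= subjet_eta
    dphi = out[:, 2] - subjet_phi
    out[:, 2] = np.mod(dphi + np.pi, 2.0 * np.pi) - np.pi  # wrap to (-pi, pi]
    return out
```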

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: trimming only, scaling, translation.]

Translation: AUC = 0.924, Rε=50% = 33.2, Rε=80% = 8.48

Rotate

• Designed the method of rotation to preserve jet mass
• Transform (pT, η, φ) into (px, py, pz)
• Rotate so that the second-highest-pT subjet is aligned with the negative y-axis
• Transform (px, py, pz) back to (pT, η, φ)
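After translation the leading subjet points along the x-axis, so a rotation about the x-axis leaves it fixed and, being a rotation in momentum space, preserves all invariant masses. A sketch under that reading of the slide (the (pT, η, φ) row layout is assumed):

```python
import numpy as np

def rotate_jet(constituents, subjet2):
    """Rotate all constituents about the x-axis so the second-highest-pT
    subjet lies along the negative y-axis. Inputs are (pT, eta, phi) rows."""
    def to_cartesian(pt, eta, phi):
        return pt * np.cos(phi), pt * np.sin(phi), pt * np.sinh(eta)

    # Rotation angle that takes the second subjet's (py, pz) to (-r, 0).
    _, py2, pz2 = to_cartesian(*subjet2)
    beta = np.pi - np.arctan2(pz2, py2)

    rotated = []
    for pt, eta, phi in constituents:
        px, py, pz = to_cartesian(pt, eta, phi)
        py_r = py * np.cos(beta) - pz * np.sin(beta)
        pz_r = py * np.sin(beta) + pz * np.cos(beta)
        pt_r = np.hypot(px, py_r)
        rotated.append((pt_r, np.arcsinh(pz_r / pt_r), np.arctan2(py_r, px)))
    return np.array(rotated)
```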

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: trimming only, scaling, translation, rotation.]

Rotation: AUC = 0.932, Rε=50% = 42.3, Rε=80% = 9.57

Flip

• The third subjet is not constrained, but can be moved to the right half of the plane
• Flip the jet if the pT-weighted average position is in the left half of the plane
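A sketch of the flip, reflecting about the vertical axis when the pT-weighted centroid falls in the left half-plane; interpreting "left/right" as the sign of the translated-and-rotated horizontal coordinate is an assumption:

```python
import numpy as np

def flip_jet(constituents):
    """Reflect the jet (negate column 1, the horizontal coordinate of the
    pT, eta, phi rows) if the pT-weighted centroid is in the left half-plane."""
    pt, x = constituents[:, 0], constituents[:, 1]
    if np.average(x, weights=pt) < 0:
        constituents = constituents.copy()
        constituents[:, 1] *= -1.0
    return constituents
```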

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: trimming only, scaling, translation, rotation, flip.]

Flip: AUC = 0.933, Rε=50% = 44.3, Rε=80% = 9.75

Performance on Truth vs Reconstructed Jets

Performance after preprocessing

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: DNN and τ32 on truth jets and on reconstructed jets.]

Performance at 50% overall Signal Efficiency

[Figure: signal efficiency and background rejection vs. jet pT at a fixed 50% overall signal efficiency, shown separately for reconstructed jets and truth jets.]

Truth jets: AUC = 0.947, Rε=50% = 66, Rε=80% = 13
Reconstructed jets: AUC = 0.933, Rε=50% = 44, Rε=80% = 9.7

Pileup

Performance at different levels of pileup

[Figure: ROC curves, background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV: no pileup, pileup = 23, pileup = 50.]

Extremely stable performance with respect to pileup.

Performance at different levels of pileup

[Figure: signal efficiency and background rejection vs. jet pT for no pileup, pileup = 23, and pileup = 50.]

pT dependence is also stable with respect to pileup.

Learning what is being learnt

Jet Mass

[Figure: P(jet mass | DNN output) for background jets, and jet mass distributions for signal and background (flat pT distribution, 600 < jet pT < 2500 GeV).]


Next Steps

Short term:
• We plan to revisit LSTMs
• Thorough Bayesian hyper-parameter optimization

Longer term:
• Both top and W tagging with deep neural networks are now reasonably well-established on Monte Carlo
• "But does it work on data?"
• Start working towards evaluating the performance of these techniques on data
• Investigate the effects of systematics and strategies for mitigating their impact

Thank you!


W-tagging performance on truth

"QCD-Aware Recursive Neural Networks for Jet Physics", Louppe, Cho, Becot, Cranmer, https://arxiv.org/abs/1702.00748

Zooming

"Parton Shower Uncertainties in Jet Substructure Analyses with Deep Neural Networks", Barnard, Dawe, Dolan, Rajcic, https://arxiv.org/pdf/1609.00607v2.pdf

Performance when trained and tested on different levels of pileup

[Figure: signal efficiency and background rejection vs. jet pT for networks trained on µ = 0, 23, and 50, each tested on µ = 0, 23, and 50 (all nine train/test combinations).]

• Examined how a neural network trained at one pileup level performs at another level of pileup
• The NN seems relatively robust to the changes in pileup expected at the LHC in the next few years

Jet Mass

[Figure: P(jet mass | DNN output) for background jets, and jet mass distributions for signal and background (flat pT distribution, 600 < jet pT < 2500 GeV).]

[Figure: τ32 distributions for signal and background (flat pT distribution, 600 < jet pT < 2500 GeV), and P(τ32^wta | DNN output) for background jets.]
