Connecting the Dots 2015
Tuesday Meeting
Tim Head
École Polytechnique Fédérale de Lausanne
24 March 2015
Question: What is pattern recognition in sparsely sampled data?
Obvious answer: Track reconstruction!
Interesting answer: Computer vision, track reconstruction, space object tracking, face
recognition, jet reconstruction, self-driving cars, ''Ok, Google ...''
Tim Head (EPFL) 24 March 2015 2
© BerkeleyLab
• A new conference series, this time in Berkeley
• February 2015
• Check the agenda for lots of interesting talks
• (the views are amazing)
What is the future of fast track finding for trigger applications beyond ATLAS and CMS Phase II Upgrade?
1. Is an aggressive R&D in this field sufficiently motivated?
2. Which are the most promising directions we should explore?
   1. Associative Memory ASICs vs. FPGAs
   2. Retina/Hough transform
   3. Tracklets
   4. Cellular Automata
   5. GPUs
   6. Commercial CPUs
   7. .....
Where charm leads, beauty goes. Followed by the Higgs.
Luciano Ristori
Is an aggressive R&D in this field sufficiently motivated? An example
• In the post-Higgs era, in the absence of new physics, the key to progress in our field will be precision measurements
• The HL-LHC at 10^35 will produce ~10^14 Beauty and Charm decays/year. If we can harvest most of them we could bring the precision of CP violation measurements in rare decays from the present ~10^-2 to below ~10^-4
• To do this we will need to change the way we perform experiments
• 10^14 x 1 MB = 10^20 bytes = 10^5 PB/year -> No way!
• We need to read out the detector for every single crossing, perform an almost complete analysis in real time and retain only the information relevant to the process of interest (e.g. the few tracks involved in the decay)
• This involves finding all tracks down to low momentum, identifying decay vertices, computing invariant masses ... the complexity of this problem is 10-100 times worse than what we are now trying to solve for CMS Phase II
• 10^14 x 1 KB = 100 PB/year -> Possible!
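The two storage estimates on this slide are plain arithmetic and can be checked in a few lines. This is only a sketch: it assumes decimal units (1 KB = 10^3 bytes, 1 MB = 10^6 bytes, 1 PB = 10^15 bytes), and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the yearly data volumes quoted on the slide.
# Assumption: decimal units (1 KB = 1e3 B, 1 MB = 1e6 B, 1 PB = 1e15 B).
DECAYS_PER_YEAR = 10**14          # beauty and charm decays per year (from the slide)
KB, MB, PB = 10**3, 10**6, 10**15


def yearly_volume_pb(bytes_per_decay: int) -> float:
    """Return the yearly data volume in petabytes (hypothetical helper)."""
    return DECAYS_PER_YEAR * bytes_per_decay / PB


full_event = yearly_volume_pb(1 * MB)   # store the whole event for every decay
reduced = yearly_volume_pb(1 * KB)      # store only the few relevant tracks

print(f"1 MB per decay: {full_event:.0e} PB/year")
print(f"1 KB per decay: {reduced:.0f} PB/year")
```

Shrinking what is stored per decay by a factor of 1000 is what moves the estimate from 10^5 PB/year to 100 PB/year, which is why the analysis has to run in real time.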
To stay ahead, we need completely new ideas.
Luciano Ristori
It is all About Representation
[Figure: "Original Data", scatter plot with axes X and Y (ticks from 1.5 to 1.5 through 0.0), shown next to a "One dimensional representation" of the same data (axis 2.5 to 6.5)]
Separating black from white is hard work ... until you learn about spherical coordinates.
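A minimal sketch of the idea, under loud assumptions: the original figure is not recoverable, so the two classes are modelled here as two concentric rings, and the change of representation is the radius of each point (polar coordinates, the two-dimensional analogue of spherical ones). All names and numbers are invented for illustration.

```python
import math
import random

random.seed(0)


def ring(radius: float, n: int, spread: float = 0.05):
    """Sample n points scattered around a circle of the given radius."""
    points = []
    for _ in range(n):
        phi = random.uniform(0.0, 2.0 * math.pi)
        r = random.gauss(radius, spread)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points


inner = ring(0.5, 200)   # stands in for the "black" class
outer = ring(1.2, 200)   # stands in for the "white" class


def radius(point):
    """One-dimensional representation: distance from the origin."""
    x, y = point
    return math.hypot(x, y)


# In (x, y) no single straight cut separates a ring from the ring around it,
# but in the radius a single threshold is enough.
threshold = 0.85
inner_ok = all(radius(p) < threshold for p in inner)
outer_ok = all(radius(p) > threshold for p in outer)
print(inner_ok, outer_ok)
```

The classifier did not get smarter here; only the representation changed, which is the point of the slide.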
How Jets are Like YouTube
Jet Clustering 101
Detecting Jets
Michael Kagan
Jet Clustering 101
The HEP Problem at Hand
[Figure: diagram in which the label "QCD" appears eight times]
Decay products of the
W and Z all end up in
the same jet.
Michael Kagan
N-subjettiness
HEP Approach to Boosted Particle Tagging
• “Substructure” techniques to analyze constituents of jet, e.g.
  – Is it a 1-prong, 2-prong, or 3-prong like decay?
  – Is the energy split evenly amongst “sub-jets”?
  – Many sub-structure related variables / algorithms
• Example substructure variable:
  – N-subjettiness τ21 = τ2 / τ1
  – Continuous version of subjet counting
• Example classification problem: Separate W boson jet from a QCD light jet
[Figure: normalised entries versus τ21 (0.2 to 1.4) for W jets and QCD jets. ATLAS Simulation Preliminary, √s = 8 TeV; trimmed anti-kt jets with R = 1.0, |η^TRUTH| < 1.2, 200 < pT^TRUTH < 350 GeV, M^RECO window]
N-subjettiness: after
a lot of thinking, cook
up a variable that can
separate QCD from W
jets.
Michael Kagan
N-subjettiness
The Jet-Image
• Jets built from calorimeter towers
• Build NxN grid of towers containing the jet (here 25x25)
• The Jet-Image → calorimeter towers like pixels in image!
[Figure: example jet from W → qq' decay, shown as a jet and as its Jet-Image]
Calorimeter towers are
like the pixels of an
image.
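The tower-to-pixel mapping can be sketched with a weighted two-dimensional histogram. This is an illustration only: the function name, the half-width of the window, and the toy tower values are assumptions, not taken from the original analysis, and any preprocessing applied there is not reproduced.

```python
import numpy as np


def jet_image(eta, phi, energy, jet_eta, jet_phi, n_pixels=25, half_width=1.0):
    """Bin tower energies into an n_pixels x n_pixels grid centred on the jet axis."""
    deta = np.asarray(eta) - jet_eta
    # Wrap the azimuthal difference into [-pi, pi) so that towers on either
    # side of the phi boundary end up next to each other.
    dphi = (np.asarray(phi) - jet_phi + np.pi) % (2.0 * np.pi) - np.pi
    edges = np.linspace(-half_width, half_width, n_pixels + 1)
    image, _, _ = np.histogram2d(
        deta, dphi, bins=(edges, edges), weights=np.asarray(energy)
    )
    return image


# Toy jet with three towers (made-up values).
eta = [0.10, 0.35, -0.20]
phi = [3.10, -3.10, 3.00]
energy = [50.0, 30.0, 5.0]

image = jet_image(eta, phi, energy, jet_eta=0.1, jet_phi=3.1)
print(image.shape, image.sum())
```

Once a jet is a fixed-size array, a 25x25 image can be flattened into a 625-element vector and handed to any method that expects feature vectors.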
Michael Kagan
N-subjettiness
Class Averages
[Figure: two panels, "Average W jet" and "Average Light jet from QCD"; axes Q1 and Q2 (0.0 to 2.5); colour scale "Cell Coefficient" from 10^-9 to 10^-1]
How can we extract the important features? How can we convert this into discrimination power?
After some preprocessing, there is a difference!
Michael Kagan
N-subjettiness
Fisher Discriminant
• Finds direction that maximizes between-class scatter / within-class scatter
  – Extract “most important” feature, a, for discrimination for this metric
  – This can be written as a “Generalized” eigenvalue problem
• If data is high dimensional, e.g. 625 elements, then St has huge number of independent components, e.g. 192,495!
  – Not enough data to build full rank matrix → Must regularize!
  – Details of analytic solution: Z. Zhang et al., Regularized Discriminant Analysis, Ridge Regression and Beyond, Journal of Machine Learning Research 11 (2010) 2199-2228
A complicated way of
saying ...
Michael Kagan
Fisher's Linear Discriminant
[Figure: two-dimensional example data; horizontal axis -4 to 4, vertical axis -6 to 6]
Find an axis along which we can separate the data.
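As a rough sketch of what "finding an axis" means for two classes: the direction that maximises between-class over within-class scatter is proportional to the inverse within-class scatter matrix applied to the difference of the class means. The code below assumes this textbook two-class form with a simple ridge term as the regulariser; the jet analysis cited earlier follows Zhang et al., and its exact formulation is not reproduced here. The toy Gaussian data and all names are invented.

```python
import numpy as np


def fisher_direction(signal, background, ridge=1e-3):
    """Unit vector maximising between-class over within-class scatter (two classes)."""
    mu_s, mu_b = signal.mean(axis=0), background.mean(axis=0)
    # Within-class scatter: scatter of each class around its own mean, summed.
    s_w = (signal - mu_s).T @ (signal - mu_s) + (background - mu_b).T @ (background - mu_b)
    # Ridge term (an assumed regulariser) keeps s_w invertible when there are
    # fewer events than dimensions, e.g. for 625-pixel jet images.
    s_w = s_w + ridge * np.eye(s_w.shape[0])
    a = np.linalg.solve(s_w, mu_s - mu_b)
    return a / np.linalg.norm(a)


rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]   # correlated features, made-up values
signal = rng.multivariate_normal([1.0, -1.0], cov, size=500)
background = rng.multivariate_normal([-1.0, 1.0], cov, size=500)

a = fisher_direction(signal, background)
# Projecting onto a gives one number per event; a cut on it separates the classes.
sig_proj, bkg_proj = signal @ a, background @ a
print(a, sig_proj.mean(), bkg_proj.mean())
```

The output is a single discriminating variable per event, obtained without designing that variable by hand.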
Performance
[Figure: background rejection (log scale, 1 to 100) versus signal efficiency [%] (0 to 100) for Fisher-Jet and N-subjettiness (τ2/τ1); inset "Sig Eff @ Bkg Rej" (garbled: 776% @ x2319% @ x10196% @ x2060% @ x100)]
We did not have to
think long and hard
about a variable, and
are competitive!
Michael Kagan
Computer Vision Applied Blindly
• By mapping concepts from images to jets you gain access to well studied CV techniques
• No need to think up ''clever'' variables a priori
  → flexible method!
• Computers can discover good ways to represent the data ''by themselves''
• Fisher's Linear Discriminant was state of the art in 1997, things have moved on since then!
What About YouTube?
Let a computer watch YouTube and
it will learn that cats are a useful
thing (variable) to know about.
The automatic physicist?
Deep Learning
detecting the Higgs boson
A two-class supervised learning problem:
[Figure: two panels, "Higgs-production" and "Primary background"]
Machine learning classifier:
∙ 28 features
  ∙ 21 low-level features
  ∙ 7 high-level features derived by physicists
∙ 10M simulated collisions for training (50% each)
∙ 500k validation set
∙ 500k test set
Do the seven high
level variables help?
Peter Sadowski
Deep Learning
detecting the Higgs boson
∙ Current approach: shallow models
  ∙ Boosted decision trees* (BDT)
  ∙ Shallow neural networks (NN)
∙ Our approach: deep neural networks (DNN)
[Figure: three panels, BDT, NN and DNN]
*Used for Higgs discovery in 2012
Things we knew in the 80s have finally started working!
Peter Sadowski
Deep Learning
deep learning for particle collider data analysis
Motivated by successes of deep learning in vision and speech.
∙ Huge progress on benchmark supervised learning tasks
∙ Replacement of engineered features with learned features
[Figure: two panels, "Engineered features" and "Learned features"]
Deep Neural Networks can learn better representations of the data without human input.
Peter Sadowski
Deep Learning
detecting the Higgs boson
Area Under ROC Curve for Test Set
Technique   Low-level features   All features
BDT         0.73                 0.81
NN          0.733 (0.007)        0.816 (0.004)
DNN         0.880 (0.001)        0.885 (0.002)
Deep learning improves AUC by 8% over shallow methods.
Deep learning does not require engineered features.
Baldi et al, Nature Communications 2014
No, adding high level features does not improve performance.
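For readers less familiar with the metric in the table: the area under the ROC curve equals the probability that a randomly chosen signal event receives a higher score than a randomly chosen background event. The sketch below computes it directly from that definition on made-up scores. The last line shows one possible reading of the quoted 8%, namely the relative gain of the DNN over the shallow NN with all features; that reading is an assumption.

```python
def roc_auc(signal_scores, background_scores):
    """Probability that a random signal event outscores a random background event."""
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5   # ties count as half
    return wins / (len(signal_scores) * len(background_scores))


# Made-up classifier outputs, for illustration only.
signal_scores = [0.9, 0.8, 0.7, 0.4]
background_scores = [0.6, 0.3, 0.2, 0.1]
auc = roc_auc(signal_scores, background_scores)

# Assumed reading of "8%": DNN (0.885) relative to NN (0.816), all features.
relative_gain = (0.885 - 0.816) / 0.816
print(auc, round(relative_gain, 3))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so the values in the table should be read against that range.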
Peter Sadowski
Nice, ... what does all this have to do with LHCb?
The Physics Equivalent of the Cat
What variables does a NN learn when you show it physics? We should find out!
Learn Expensive Parts of the Simulation
detecting the Higgs boson
Mean Squared Error of networks trained to compute 7 high-level features from 21 low-level features.
Technique           Feature Regression MSE
Linear Regression   0.1468
NN                  0.0885
DNN 3 layers        0.0821
DNN 4 layers        0.0818
DNN 5 layers        0.0815
DNN 6 layers        0.0812
High-level features easier to learn with deep nets
Use a NN with multiple
regression outputs to
learn a fast simulation
of some parts of the
simulation?
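To make "multiple regression outputs" concrete, here is a sketch of the simplest entry in the table above, a linear regression that predicts several derived quantities at once. Everything in it is a toy assumption: the low-level features are random momentum components and the two high-level targets are invented stand-ins, not the features of the original study. A neural network would replace the least-squares fit while keeping the same input and output layout.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "low-level" features: transverse momentum components of two particles.
low = rng.normal(size=(5000, 4))             # columns: px1, py1, px2, py2

# Toy "high-level" features derived from them (both are non-linear).
pt1 = np.hypot(low[:, 0], low[:, 1])
pt_sum = np.hypot(low[:, 0] + low[:, 2], low[:, 1] + low[:, 3])
high = np.column_stack([pt1, pt_sum])        # two regression targets

train, test = slice(0, 4000), slice(4000, None)

# Multi-output linear regression: one least-squares fit, one weight column per target.
design = np.column_stack([low, np.ones(len(low))])   # append an intercept column
weights, *_ = np.linalg.lstsq(design[train], high[train], rcond=None)

prediction = design[test] @ weights
mse = np.mean((prediction - high[test]) ** 2, axis=0)   # one MSE per target
print(mse)
```

Because the toy targets are non-linear in the inputs, the linear fit leaves a sizeable error, which is the gap a deeper model is meant to close.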
Peter Sadowski
Isolation or Flavour Tagging
Can we use "jets-are-like-images" ideas for this?
Visualisation
[Figure: two-dimensional scatter of points, each drawn as its digit label (0 to 9)]
t-SNE projecting a 64
dimensional space
into 2D, without using
labels.
The End
• It is all about representation.
• A small conference with an unusual mix of attendees.
  → check the agenda for more on traditional tracking, etc.
• LHCb is leading the way when it comes to ''real time'' tracking, others are following.
• To stay ahead of the other experiments we should investigate these new ML tools.