Automatic recognition of anomalous patterns in discharges by applying Deep Learning

Gonzalo Farias1, Ernesto Fabregas2, Sebastián Dormido-Canto2, Jesús Vega3, Sebastián Vergara1

3rd IAEA Technical Meeting on Fusion Data Processing, Validation and Analysis
May 31, 2019, Vienna, Austria
Outline
❑ Introduction
❑ Background
  ❑ Anomaly detection
  ❑ Autoencoder (AE)
❑ Proposed solution
❑ Preliminary results
❑ Conclusions
Introduction
❑ Experimental nuclear fusion devices have enormous databases.
A simple shot of a few seconds can generate a huge quantity of data:
• The TJ-II device has over 1000 measurement channels.
• A discharge in JET can take a couple of seconds (10 GB/shot, around 100 TB/year).
• ITER could generate 1 TB/shot, around 1 PB/year.
It is estimated that only 10% of this data is analyzed!
❑ The idea is to use Artificial Intelligence to deal with this huge
quantity of data.
❑ Create systems that allow specialists to analyze and interpret
data more quickly and efficiently than manually.
❑ One important issue is anomaly detection.
Introduction
❑ Anomaly: Something that deviates from what is standard,
normal, or expected.
❑ One type of anomaly is known as 'outlier', which is a value
located outside of the normal/regular class.
❑ In our case, anomaly detection refers to the problem of
finding patterns in data that do not conform to expected
behaviour.
Background – Anomalies
[Figure: anomaly examples found by clustering and by pattern recognition.]
❑ We try to find anomalies in signals (known and unknown).
❑ Unknown (most anomalies).
❑ Known: disruptions, ELMs, L-H transitions, and stray light in the TS diagnostic.
1. Vega, J., et al. (2013). Results of the JET real-time disruption predictor in the ITER-like wall campaigns. Fusion Engineering and Design, 88(6-8), 1228-1231.
2. Farias, G., Vega, J., Gonzalez, S., Pereira, A., Lee, X., Schissel, D., Gohil, P. (2012). Automatic determination of L/H transition times in DIII-D through a collaborative distributed environment. Fusion Engineering and Design, 87(12), 2081-2083.
3. Farias, G., Dormido-Canto, S., Vega, J., Santos, M., Pastor, I., Fingerhuth, S., Ascencio, J. (2014). Iterative noise removal from temperature and density profiles in the TJ-II Thomson Scattering. Fusion Engineering and Design, 89(5), 761-765.
Background – Anomalies
The importance of anomaly detection is due to the fact that anomalies in data
frequently involve significant and critical information.
Background – Anomaly Detection Using ML
❑ Prediction (trained model)
[Figure: observed vs. predicted signal over time; the prediction error E is compared against a band of ±K*std, and samples falling outside the band are flagged as anomalies.]
Anomaly Detection – Threshold (th=k*std)
❑ How is the anomaly detected?
❑ We fix a threshold proportional to the standard deviation of the error.
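A minimal sketch of this thresholding on a synthetic signal with one injected spike; the signal shape, noise level and k value are illustrative assumptions, not TJ-II values:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 6.0, 200)

predicted = np.sin(t)                                   # model output
observed = predicted + 0.05 * rng.standard_normal(200)  # regular noise
observed[120] += 1.0                                    # injected anomaly

error = observed - predicted
k = 3.0
th = k * np.std(error)               # threshold proportional to the std of the error
anomalies = np.abs(error) >= th      # points outside the ±k*std band

print(np.flatnonzero(anomalies))     # the spike at index 120 is flagged
```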
❑ Anomaly detection using Autoencoder (AE).
❑ The AE learns the regular waveforms of data through the coding and
decoding of the input data.
❑ Important differences between predicted and observed signals are
considered as anomalies.
❑ The occurrence of one anomaly at a similar time in several signals
during a discharge could reveal that something significant has happened.
Anomaly Detection - Using ML/AE
❑ Autoencoder (AE)
Anomaly Detection – Autoencoder
[Figure: autoencoder architecture with input layer, hidden layer and output layer; the encoder maps the input data to the hidden representation and the decoder maps it back to the output data.]
The AE adjusts its biases and weights to learn a function in an unsupervised way.
The AE tries to learn an approximation to the identity function, so it outputs something similar to its input.
By placing constraints on the network, such as limiting the number of hidden units, we can discover interesting structure in the data.
The network is forced to learn a "compressed" representation of the input.
Farias, G., et al. (2016). Automatic feature extraction in large fusion databases by using deep learning approach. Fusion Engineering and Design, Volume
112, Pages 979–983.
Farias, G., et al. (2018). Applying deep learning for improving image classification in Nuclear Fusion Devices. IEEE Access, vol. 6, pp. 72345–72356.
Anomaly Detection – Autoencoder
❑ Autoencoder (AE) cost function:

J = MSE + λ · Ω_weights + β · Ω_sparsity

MSE = (1/N) Σ_{n=1}^{N} (x_kn − x̂_kn)²

Ω_weights = (1/2) Σ_l^L Σ_j^n Σ_i^k (w_ji^(l))²

Ω_sparsity = Σ_{i=1}^{s2} [ ρ log(ρ/ρ̂_i) + (1−ρ) log((1−ρ)/(1−ρ̂_i)) ]

where ρ̂_i = (1/n) Σ_{j=1}^{n} h( (w_i^(l))ᵀ x_j + b_i^(l) )
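The cost above can be evaluated directly. A small numeric sketch for a one-hidden-layer AE with logistic units; the sizes, λ, β and ρ values are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 16))            # N=100 inputs of dimension 16
W1 = 0.1 * rng.standard_normal((16, 4))       # encoder weights (4 hidden units)
W2 = 0.1 * rng.standard_normal((4, 16))       # decoder weights
b1, b2 = np.zeros(4), np.zeros(16)

H = sigmoid(X @ W1 + b1)                      # hidden activations
Xhat = H @ W2 + b2                            # reconstruction

mse = np.mean((X - Xhat) ** 2)                                   # MSE term
omega_weights = 0.5 * (np.sum(W1 ** 2) + np.sum(W2 ** 2))        # weight decay
rho, rho_hat = 0.05, H.mean(axis=0)           # target vs. mean hidden activation
omega_sparsity = np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

lam, beta = 1e-3, 1.0                         # illustrative λ and β
J = mse + lam * omega_weights + beta * omega_sparsity
```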
Anomaly Detection – Learning
The AE adjusts its biases and weights to learn a function in an unsupervised way:

h_W,b(x) ≈ x

[Figure: network mapping inputs x1, x2, x3, …, xn to reconstructed outputs x1′, x2′, x3′, …, xn′.]

The above is similar to learning the waveform.

Illustrative example: the AE is trained on a database of noisy sinusoidal signals with different frequencies, phases and amplitudes; it is then tested with a square-pulse signal.

Forward: [x1, x2, x3, …, xn] ≈ [x1′, x2′, x3′, …, xn′]
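The illustrative example above can be sketched with a closed-form optimal linear autoencoder (PCA via SVD) standing in for the trained network; the training database of noisy sinusoids and the square-pulse test follow the slide, but all sizes and parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)

def sinusoid():
    """One noisy training signal: random frequency, phase and amplitude."""
    f = rng.choice([1.0, 2.0, 3.0])
    phi = rng.uniform(0.0, 2.0 * np.pi)
    a = rng.uniform(0.5, 1.5)
    return a * np.sin(2.0 * np.pi * f * t + phi) + 0.05 * rng.standard_normal(t.size)

X = np.stack([sinusoid() for _ in range(500)])

# Closed-form linear AE: encode/decode with the top-8 right singular
# vectors of the training matrix (the "hidden layer" of 8 units).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:8]

def recon_error(x):
    return float(np.mean((x - (x @ V.T) @ V) ** 2))

err_sine = recon_error(np.sin(2.0 * np.pi * 2.0 * t + 0.3))    # regular waveform
err_square = recon_error(np.where(t < 0.5, 1.0, -1.0))         # square pulse
print(err_sine, err_square)    # the square pulse reconstructs much worse
```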
Anomaly Detection – Algorithm
1. Set the parameters (di, α, Δt).
2. For each signal Xi, predict Xi′ with the trained AE: Xi′ = Predict(Xi).
3. Calculate the difference between predicted and observed: Diff_it = X_it − X′_it.
4. Mark outliers per signal and time instant:
   Outlier_it = 1 if |Diff_it| ≥ di, 0 otherwise.
5. An anomaly is detected at time instant t if at least α of the n signals show an outlier within the time window, i.e., if
   Σ_{i=1}^{n} [ Σ_{k=t+1−Δt}^{t} Outlier_ik ≥ 1 ] ≥ α.
   With Δt = 1 this requires anomalies at the same time instant; with Δt > 1 it accepts anomalies during a time window.

• di: threshold of the difference band for signal i
• α: number of signals involved in an anomaly
• Δt: time window
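The detection steps can be sketched as follows, on synthetic data where two of three signals spike at the same instant; the signal shapes, thresholds and spike position are illustrative assumptions:

```python
import numpy as np

def detect(X, Xpred, d, alpha, dt):
    """Anomaly at time t if at least `alpha` of the n signals show an
    outlier (|Diff| >= d_i) within the window [t+1-dt, t]."""
    outlier = np.abs(X - Xpred) >= d[:, None]         # Outlier_it, shape (n, T)
    T = X.shape[1]
    in_window = np.zeros_like(outlier)
    for t in range(T):                                # any outlier in the window
        in_window[:, t] = outlier[:, max(0, t + 1 - dt):t + 1].any(axis=1)
    return in_window.sum(axis=0) >= alpha             # anomaly per time instant

# Three flat, perfectly predicted signals, except two simultaneous spikes.
X = np.zeros((3, 100))
Xpred = np.zeros((3, 100))
X[0, 50] = 1.0
X[1, 50] = 1.0
d = np.full(3, 0.5)                                   # per-signal thresholds d_i

anomalies = detect(X, Xpred, d, alpha=2, dt=1)
print(np.flatnonzero(anomalies))                      # anomaly at t = 50
```

Raising α to 3 suppresses the detection, since only two signals are involved.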
Anomaly Detection – In action (Δt = 1)
❑ We look for simultaneous anomalies (same time instant) in different signals
within the same shot.
Preliminary Results
❑ Database from the TJ-II fusion device:
  ❑ 430 shots with 9 signals each (330 for training, 100 for testing, randomly selected).
  ❑ Training time: 1 minute for each type of signal (GPU).
  ❑ Testing time (prediction): 10 ms for each type of signal (GPU).
Preliminary Results – AE structure
❑ How many hidden units to use?
❑ We train networks with different numbers of hidden units and measure the RMSE for
each type of signal.

RMSE
Hidden Units  Signal 1  Signal 2  Signal 3  Signal 4  Signal 5  Signal 6  Signal 7  Signal 8  Signal 9
1             0.42912   0.21228   0.19873   0.08722   0.04503   0.24234   0.13889   0.25375   0.18487
10            0.16928   0.09222   0.07663   0.04399   0.03544   0.08213   0.07262   0.10284   0.07730
100           0.082032  0.043349  0.031299  0.024351  0.013658  0.043946  0.043788  0.070419  0.04304
1000          0.089530  0.052544  0.025972  0.025922  0.017132  0.040409  0.037944  0.043563  0.04426

Underfitting (too few hidden units): larger RMSE, more anomalies detected.
Overfitting (too many hidden units): fewer anomalies detected.
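The hidden-unit sweep can be sketched with a closed-form linear autoencoder (PCA via SVD) standing in for the trained networks; the synthetic sinusoid database and the sweep sizes are illustrative assumptions, not the TJ-II signals:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 64)

# Synthetic "database" of noisy sinusoids (stand-in for real signals).
X = np.stack([np.sin(2 * np.pi * rng.uniform(1, 3) * t + rng.uniform(0, 2 * np.pi))
              + 0.05 * rng.standard_normal(64) for _ in range(300)])

_, _, Vt = np.linalg.svd(X, full_matrices=False)       # PCA components

def rmse(k):
    """Training RMSE of a linear AE with k hidden units (top-k components)."""
    V = Vt[:k]
    return float(np.sqrt(np.mean((X - X @ V.T @ V) ** 2)))

results = {k: rmse(k) for k in (1, 10, 50)}
print(results)                                         # RMSE shrinks as k grows
```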
Preliminary Results – Anomalies (Δt = 1)
❑ Simultaneous anomaly detection
*100 shots randomly selected
The wider the band, the fewer anomalies are detected; the more simultaneity is required, the fewer anomalies are detected.
Example: 4 simultaneous anomalies in 5 signals for a threshold with k = 1 at a given time instant (t).
di = k * STD
α \ k  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4
1      387  381  343  299  274  264  258  249  248  238  233  225  210  194
2      387  354  284  252  238  217  203  184  163  143  113  104   79   65
3      377  293  240  213  172  143  112   87   61   42   36   28   23   14
4      350  253  210  159  108   74   56   28   18   10    8    6    5    5
5      302  214  146   96   54   32   17   10    6    4    3    3    3    3
6      240  155   86   44   16   11    5    2    1    1    0    0    0    0
7      164   70   26   10    5    1    1    0    0    0    0    0    0    0
8       65   20    7    2    1    0    0    0    0    0    0    0    0    0
9        0    0    0    0    0    0    0    0    0    0    0    0    0    0
Preliminary Results – Anomalies (Δt > 1)
❑ Simultaneous anomaly detection in a time window (Δt = 5)
The wider the band, the fewer anomalies are detected; the more simultaneity is required, the fewer anomalies are detected; the wider the time window, the more anomalies are detected.
Example: 3 simultaneous anomalies in 6 signals for k = 0.8 with Δt = 5.
di = k * STD
α \ k  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4
1      383  383  348  313  276  266  263  253  245  236  229  213  180  168
2      383  376  288  262  218  201  172  148  128  110   97   75   56   45
3      383  318  234  205  184  137  109   90   68   55   47   38   32   26
4      362  244  209  167  113   74   53   31   28   22   19   12    9    5
5      321  212  150   95   56   28   20   12    7    5    5    4    4    2
6      273  184   90   44   27   12    6    3    3    2    2    2    2    0
7      237  114   37   23   11    5    3    0    0    0    0    0    0    0
8      210   59   20    8    3    0    0    0    0    0    0    0    0    0
9      152   26    8    3    1    0    0    0    0    0    0    0    0    0
Preliminary Results – Anomalies (Δt = 1)
❑ Simultaneous anomaly detection with a deep autoencoder (DAE)
The wider the band, the fewer anomalies are detected; the more simultaneity is required, the fewer anomalies are detected.
Example: 2 simultaneous anomalies in 6 signals for k = 0.6 at a given time instant (t).
di = k * STD
α \ k  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4
1      387  365  303  259  230  213  196  184  171  162  152  122  107   96
2      373  277  218  183  153  103   75   58   50   36   30   24   18   14
3      335  217  165  104   64   44   31   21   17   16   14   10    7    5
4      258  157   92   50   28   14    7    3    3    2    2    0    0    0
5      215   99   40   18   12    5    4    1    1    0    0    0    0    0
6      171   51   18   14    6    2    0    0    0    0    0    0    0    0
7      121   23   12    6    0    0    0    0    0    0    0    0    0    0
8       60   14    2    0    0    0    0    0    0    0    0    0    0    0
9       17    6    1    0    0    0    0    0    0    0    0    0    0    0
Preliminary Results – Discussion
❑ How to define the parameters of the approach (di, α, Δt)?
❑ This mainly depends on the anomaly to be detected, i.e., the expert has to tune these values according to what he/she is looking for.
❑ We could determine a range for these parameters automatically if the expert defines the fraction of discharges that generate anomalies.
❑ Instead of AEs, we could use other ML algorithms to learn the regular waveforms (LSTM/GRU).
  Automatic recognition of anomalous patterns in discharges by recurrent neural networks. 12th IAEA Technical Meeting on Control, Data Acquisition and Remote Participation for Fusion Research (CODAC 2019).
❑ Once an unknown anomaly is regularly detected (i.e., it becomes a known anomaly), we could use supervised learning to detect it.
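One way the automatic parameter range above could be read is as a quantile rule: given a per-shot anomaly score, pick the threshold so that the expert's target fraction of discharges is flagged. A sketch under that assumption; the gamma-distributed scores are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical per-shot peak prediction errors (one score per discharge).
peak_errors = rng.gamma(2.0, 1.0, size=430)

target_fraction = 0.10                  # fraction of shots the expert expects to flag
d = np.quantile(peak_errors, 1.0 - target_fraction)
flagged = float(np.mean(peak_errors >= d))
print(d, flagged)                       # about 10% of shots exceed the threshold
```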
Conclusions
❑ AE networks can learn the shape of a waveform (one model for each signal).
❑ AE networks can be used for anomaly detection in signals.
❑ The specialists have to define the parameters that distinguish noise from real anomalies.
❑ It is possible to design supervised systems that allow the detection of previously detected/studied anomalies.