Fast tomography of soft X-ray data using
Gaussian processes and neural networks
Presenter : Tianbo Wang
Author list : Tianbo Wang1,2,3, Didier Mazon3, Jakob Svensson4, Yuan Li6, Zhiyin Zhou7, Junhao Guo7, Rong Hou7, Liqing Xu5, Liqun Hu5, Yanmin Duan5, Geert Verdoolaege2
1 Southwestern Institute for Physics, CNNC, C-610200 Chengdu, People’s Republic of China
2 Department of Applied Physics, Ghent University, B-9000 Ghent, Belgium
3 CEA, IRFM, F-13108 Saint-Paul-lez-Durance, France
4 Max-Planck-Institut für Plasmaphysik, D-17491 Greifswald, Germany
5 Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031, People’s Republic of China
6 Remark Holdings, Inc., 89169 Las Vegas, United States of America
7 China Science IntelliCloud Technology Co. LTD, Hefei 230031, People’s Republic of China
Outline
• Line-of-sight tomography problems
• Gaussian process tomography (GPT)
• Neural network tomography (NNT)
• Conclusion and perspectives
Line-of-sight tomography problems
Line-of-sight diagnostics
• Bolometry
• Soft X-ray spectrometry
• Hard X-ray spectrometry
• Interferometry
• Hα measurements, and others…
[Figures: ITER bolometry system; ITER Radial Neutron Camera]
LOS tomography problems
➢ An example: the WEST two-camera SXR system
• 75 vertical lines of sight (LOS)
• 128 horizontal lines of sight (LOS)
• 100 × 100 pixels, 16 mm × 16 mm pixel size
[Figure: LOS geometry; axes: major radius (mm), minor radius (mm)]
D. Mazon et al., Design of soft-X-ray tomographic system in WEST using GEM detectors, Fusion Eng. Des. 96–97, 856 (2015).
➢ Forward model of LOS tomography
$$\bar{d}_M = \bar{\bar{R}}\,\bar{E}_N$$
• $\bar{d}_M$: SXR array detection signal (203 × 1 for WEST SXR)
• $\bar{\bar{R}}$: response matrix (LOS geometric dependence)
• $\bar{E}_N$: emissivity value of each pixel (100 × 100 for WEST SXR)
$$\bar{\bar{R}} = \begin{pmatrix} L_1^1 & L_2^1 & \cdots & L_n^1 \\ L_1^2 & L_2^2 & \cdots & L_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ L_1^m & L_2^m & \cdots & L_n^m \end{pmatrix}$$
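The forward model can be illustrated with a small NumPy sketch. It does not use the WEST geometry: the function `toy_response_matrix`, the axis-aligned chords and the 4 × 4 grid are assumptions made only for illustration, with each row of `R` holding the path length of one line of sight through each pixel.

```python
import numpy as np

def toy_response_matrix(n_side, pixel_size):
    """Toy response matrix for axis-aligned chords (illustrative assumption).

    One horizontal chord per pixel row and one vertical chord per pixel
    column; each chord crosses n_side pixels with path length pixel_size.
    Returns an array of shape (2 * n_side, n_side * n_side).
    """
    R = np.zeros((2 * n_side, n_side * n_side))
    pixel_index = np.arange(n_side * n_side).reshape(n_side, n_side)
    for i in range(n_side):
        R[i, pixel_index[i, :]] = pixel_size           # horizontal chord i
        R[n_side + i, pixel_index[:, i]] = pixel_size  # vertical chord i
    return R

n_side, pixel_size = 4, 16.0          # toy grid: 4 x 4 pixels of 16 mm
R = toy_response_matrix(n_side, pixel_size)
E = np.ones(n_side * n_side)          # flat emissivity field (arbitrary units)
d = R @ E                             # forward model: d_M = R E_N
```

With a uniform emissivity every chord integrates the same value, which makes the toy case easy to check by hand.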
➢ Forward and inverse problem
• Forward problem: deductive, predictive; the answer is perfectly known.
• Inverse problem: inductive, inferential; many possible answers, with uncertainty.
Many tomography methods exist: least squares with Tikhonov regularization, minimum Fisher information (MFI) regularization, …
A probabilistic solution: Gaussian Process Tomography
The logic of Bayesian inference
$$p(\text{emissivity} \mid \text{measurement}) \propto p(\text{measurement} \mid \text{emissivity})\; p(\text{emissivity})$$
Likelihood, given by the forward model:
• Suppose we knew the emissivity field: what measurements would it lead to?
Prior information:
• What are the possible emissivity fields before taking measurements?
Posterior information:
• What is the probability distribution of the emissivity, given the measurements?
• What is the most likely emissivity, given the measurements, and how certain are we?
Gaussian Process modelling
• $\bar{\mu}_E$: mean vector of the Gaussian process (most probable emissivity value in each pixel)
• $\bar{\bar{\Sigma}}_E$: covariance matrix of the Gaussian process (encodes smoothness; may also be related to temporal resolution)
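Squared exponential kernel function:
$$\bar{\bar{\Sigma}}_E = \begin{pmatrix} k(\vec{r}_1, \vec{r}_1) & \cdots & k(\vec{r}_1, \vec{r}_n) \\ \vdots & \ddots & \vdots \\ k(\vec{r}_n, \vec{r}_1) & \cdots & k(\vec{r}_n, \vec{r}_n) \end{pmatrix}$$
$$k(\vec{r}, \vec{r}\,') = \sigma_f^2 \exp\left(-\frac{d^2}{2\sigma_l^2}\right), \qquad d = \left|\vec{r} - \vec{r}\,'\right|$$
➢ $\vec{r}$: location of a pixel
➢ $d$: distance between pixels
➢ $\sigma_f$: basic variance value (hyperparameter)
➢ $\sigma_l$: length scale (hyperparameter)
➢ $\bar{\theta}$: vector collecting all the hyperparameters
Gaussian Process prior:
$$p(\bar{E}_N \mid \bar{\theta}) = \frac{1}{(2\pi)^{N/2}\,\left|\bar{\bar{\Sigma}}_E\right|^{1/2}} \exp\left(-\frac{1}{2}\left(\bar{E}_N - \bar{\mu}_E\right)^T \bar{\bar{\Sigma}}_E^{-1} \left(\bar{E}_N - \bar{\mu}_E\right)\right)$$
Gaussian Process likelihood:
$$p(\bar{d}_M \mid \bar{E}_N, \bar{\theta}) = \frac{1}{(2\pi)^{M/2}\,\left|\bar{\bar{\Sigma}}_d\right|^{1/2}} \exp\left(-\frac{1}{2}\left(\bar{\bar{R}}\,\bar{E}_N - \bar{d}_M\right)^T \bar{\bar{\Sigma}}_d^{-1} \left(\bar{\bar{R}}\,\bar{E}_N - \bar{d}_M\right)\right)$$
Covariance matrix of the posterior:
$$\bar{\bar{\Sigma}}_E^{\,post} = \left(\bar{\bar{R}}^T \bar{\bar{\Sigma}}_d^{-1} \bar{\bar{R}} + \bar{\bar{\Sigma}}_E^{-1}\right)^{-1}$$
Mean vector of the posterior (with $\bar{\mu}_E$ the prior mean):
$$\bar{\mu}_E^{\,post} = \bar{\mu}_E + \left(\bar{\bar{R}}^T \bar{\bar{\Sigma}}_d^{-1} \bar{\bar{R}} + \bar{\bar{\Sigma}}_E^{-1}\right)^{-1} \bar{\bar{R}}^T \bar{\bar{\Sigma}}_d^{-1} \left(\bar{d}_M - \bar{\bar{R}}\,\bar{\mu}_E\right)$$
For tomography problems the forward model is linear, so the product of the two Gaussian probability density functions is also a Gaussian. The posterior mean and covariance are therefore available in closed form, which allows real-time calculation.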
J. Svensson and J.-E. Contributors, JET Intern. Rep. (2010).
T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege. Gaussian Process tomography for soft X-ray
spectroscopy at WEST without equilibrium information. Rev Sci Instrum. 2018 Jun. 89(6).
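The closed-form posterior can be sketched in a few lines of NumPy. This is a toy illustration, not the authors' MATLAB implementation: the 5 × 5 grid, the random response matrix `R`, the hyperparameter values and the small diagonal jitter added for numerical stability are all assumptions.

```python
import numpy as np

def se_kernel(coords, sigma_f, sigma_l):
    """Squared-exponential covariance between all pairs of pixel positions."""
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return sigma_f ** 2 * np.exp(-d2 / (2.0 * sigma_l ** 2))

def gp_posterior(R, d, mu_prior, Sigma_E, Sigma_d):
    """Posterior mean and covariance for the linear model d = R E + noise."""
    Sigma_d_inv = np.linalg.inv(Sigma_d)
    precision = R.T @ Sigma_d_inv @ R + np.linalg.inv(Sigma_E)
    Sigma_post = np.linalg.inv(precision)
    mu_post = mu_prior + Sigma_post @ R.T @ Sigma_d_inv @ (d - R @ mu_prior)
    return mu_post, Sigma_post

rng = np.random.default_rng(0)
n_side, m = 5, 8                                   # toy sizes (assumed)
yy, xx = np.mgrid[0:n_side, 0:n_side]
coords = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
n = coords.shape[0]

Sigma_E = se_kernel(coords, sigma_f=1.0, sigma_l=1.0) + 1e-6 * np.eye(n)
Sigma_d = 1e-4 * np.eye(m)                         # measurement noise covariance
R = rng.uniform(0.0, 1.0, size=(m, n))             # hypothetical response matrix
E_true = np.exp(-np.sum((coords - 2.0) ** 2, axis=1) / 4.0)
d = R @ E_true + rng.normal(0.0, 1e-2, size=m)     # noisy synthetic measurements

mu_post, Sigma_post = gp_posterior(R, d, np.zeros(n), Sigma_E, Sigma_d)
E_map = mu_post.reshape(n_side, n_side)            # reconstructed emissivity map
```

The diagonal of `Sigma_post` gives a per-pixel variance, which is how the reconstruction comes with an uncertainty estimate.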
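Gaussian Process regularization
Kernel function → Prior covariance → Regularization (smoothness)
Prior covariance:
$$\bar{\bar{\Sigma}}_E = \begin{pmatrix} k(\vec{r}_1, \vec{r}_1) & \cdots & k(\vec{r}_1, \vec{r}_n) \\ \vdots & \ddots & \vdots \\ k(\vec{r}_n, \vec{r}_1) & \cdots & k(\vec{r}_n, \vec{r}_n) \end{pmatrix}$$
Covariance kernel function (with equilibrium information applied):
$$k_{SE} = \sigma_f^2 \exp\left(-\left(\frac{d_\perp^2}{2\sigma_{l\perp}^2} + \frac{d_\parallel^2}{2\sigma_{l\parallel}^2}\right)\right)$$
The distances between pixels along the perpendicular and parallel directions are calculated in advance.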
T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege. Incorporating magnetic equilibrium information
in Gaussian Process tomography for soft X-ray spectroscopy at WEST. Rev Sci Instrum. 2018 Oct. 89(10).
[Figure: points 1, 2 and 3, illustrating the two distances]
• Perpendicular distance: $d_\perp = D_\perp(\text{point 2}, \text{point 3})$
• Parallel distance: $d_\parallel = D_\parallel(\text{point 1}, \text{point 3})$
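A minimal sketch of a kernel with separate perpendicular and parallel length scales, assuming the two distance matrices have already been computed. The 3-pixel distance matrices and the hyperparameter values below are made up for illustration; how the distances are obtained from the magnetic equilibrium is not shown.

```python
import numpy as np

def anisotropic_se_kernel(d_perp, d_par, sigma_f, sigma_l_perp, sigma_l_par):
    """Squared-exponential kernel with separate length scales for the
    perpendicular and parallel distances between pixels."""
    exponent = (d_perp ** 2 / (2.0 * sigma_l_perp ** 2)
                + d_par ** 2 / (2.0 * sigma_l_par ** 2))
    return sigma_f ** 2 * np.exp(-exponent)

# Hypothetical precomputed distance matrices for three pixels
d_perp = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 1.0],
                   [2.0, 1.0, 0.0]])
d_par = np.array([[0.0, 3.0, 0.0],
                  [3.0, 0.0, 3.0],
                  [0.0, 3.0, 0.0]])

# A longer parallel length scale keeps pixels correlated over larger
# parallel distances than perpendicular ones.
Sigma_E = anisotropic_se_kernel(d_perp, d_par, sigma_f=1.0,
                                sigma_l_perp=1.0, sigma_l_par=3.0)
```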
GPT’s attractive features
• Non-parametric inference: the hyperparameters can be optimized through the marginal likelihood.
• Non-iterative algorithm: the inference consists of a single step, the direct calculation of the posterior mean.
• Low computational complexity: $O(n^2 m)$, compared to $O(n^3)$ for least-squares optimization.
• Real-time capability (WEST SXR case study):
  – Number of pixels n = 10000, number of measurements m = 203
  – GPT execution time: 30 ms using MATLAB
  – 50 times faster than a single MFI iteration
J. Svensson and J.-E. Contributors, JET Intern. Rep. (2010).
T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege. Rev Sci Instrum. 2018 Jun. 89(6).
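One way an $O(n^2 m)$ cost can arise, offered as a sketch rather than a description of the authors' code: by the Woodbury identity the posterior mean can be computed with an m × m solve instead of an n × n inverse. The function names and the random test matrices below are assumptions; the example only checks that the two forms agree numerically.

```python
import numpy as np

def posterior_mean_data_space(R, d, mu_prior, Sigma_E, Sigma_d):
    """Posterior mean via an m x m solve (Woodbury form of the update)."""
    SRt = Sigma_E @ R.T                    # n x m product, O(n^2 m)
    S = R @ SRt + Sigma_d                  # m x m matrix
    return mu_prior + SRt @ np.linalg.solve(S, d - R @ mu_prior)

def posterior_mean_pixel_space(R, d, mu_prior, Sigma_E, Sigma_d):
    """Posterior mean via the n x n system of the closed-form expression."""
    Sigma_d_inv = np.linalg.inv(Sigma_d)
    precision = R.T @ Sigma_d_inv @ R + np.linalg.inv(Sigma_E)
    rhs = R.T @ Sigma_d_inv @ (d - R @ mu_prior)
    return mu_prior + np.linalg.solve(precision, rhs)

rng = np.random.default_rng(1)
n, m = 30, 6                               # toy sizes (assumed)
A = rng.normal(size=(n, n))
Sigma_E = A @ A.T + n * np.eye(n)          # well-conditioned SPD prior covariance
Sigma_d = 0.1 * np.eye(m)
R = rng.uniform(size=(m, n))               # hypothetical response matrix
d = rng.normal(size=m)
mu0 = np.zeros(n)

mean_data = posterior_mean_data_space(R, d, mu0, Sigma_E, Sigma_d)
mean_pixel = posterior_mean_pixel_space(R, d, mu0, Sigma_E, Sigma_d)
```

When m is much smaller than n, as in the WEST case (m = 203, n = 10000), the data-space form avoids ever inverting an n × n matrix.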
GPT real-time application on EAST SXR
➢ Observation of a kink mode triggered by a sawtooth (#70750)
• $\bar{d}_M$: SXR array detection signal (92 × 1 for EAST SXR)
• $\bar{\bar{R}}$: response matrix (92 × 2500 for EAST SXR)
• $\bar{E}_N$: emissivity value of each pixel (50 × 50 for EAST SXR)
• Execution time: less than 8 ms
T. Wang, G. Verdoolaege, D. Mazon, J. Svensson, Bayesian data analysis for Gaussian process tomography, J. Fusion Energ.(2018)
Singular value decomposition
➢ MHD mode structure analysis: (1,1) kink mode structure identified.
[Figure panels: first, second and third SVD topos]
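A minimal sketch of how topos (spatial modes) and chronos (temporal modes) can be extracted from a sequence of reconstructions with an SVD. The synthetic data, a static profile plus one oscillating structure, is an assumption standing in for real emissivity maps.

```python
import numpy as np

n_time, n_side = 200, 10                      # toy sizes (assumed)
t = np.linspace(0.0, 1.0, n_time)
yy, xx = np.mgrid[0:n_side, 0:n_side]

# Two synthetic spatial structures: a peaked background and an odd perturbation
background = np.exp(-((xx - 4.5) ** 2 + (yy - 4.5) ** 2) / 8.0)
mode = (xx - 4.5) * background

# Data matrix: one row per time slice, one column per pixel
X = (np.outer(np.ones(n_time), background.ravel())
     + 0.2 * np.outer(np.sin(2.0 * np.pi * 10.0 * t), mode.ravel()))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
chronos = U[:, :3]                            # leading temporal modes
topos = Vt[:3].reshape(3, n_side, n_side)     # leading spatial modes
```

Because the toy data contains only two structures, the singular values beyond the second are numerically zero; real data would show a gradual decay instead.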
GPT real-time application on EAST XUV
➢ Impurity accumulation caused by plasma-wall interaction
• $\bar{d}_M$: SXR array detection signal (64 × 1 for EAST XUV)
• $\bar{\bar{R}}$: response matrix (64 × 2500 for EAST XUV)
• $\bar{E}_N$: emissivity value of each pixel (50 × 50 for EAST XUV)
• Execution time: less than 4 ms
More real-time applications?
Adaptive hyperparameters?
• The hyperparameter optimization is very time-consuming.
• Not feasible in real time.
Integrated data analysis & other diagnostics?
• The forward model is not always linear.
• Full Bayesian analysis and MCMC are unavoidable.
• Not feasible in real time.
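To illustrate why hyperparameter optimization is costly, here is a sketch of the log marginal likelihood for the linear-Gaussian model, where the data are distributed as a Gaussian with mean $\bar{\bar{R}}\bar{\mu}_E$ and covariance $\bar{\bar{R}}\bar{\bar{\Sigma}}_E\bar{\bar{R}}^T + \bar{\bar{\Sigma}}_d$. The 1D toy grid, the random response matrix and the candidate length scales are assumptions; a simple scan replaces whatever optimizer is actually used.

```python
import numpy as np

def log_marginal_likelihood(R, d, mu_prior, Sigma_E, Sigma_d):
    """log p(d | theta) for d ~ N(R mu, R Sigma_E R^T + Sigma_d)."""
    m = d.shape[0]
    S = R @ Sigma_E @ R.T + Sigma_d
    r = d - R @ mu_prior
    L = np.linalg.cholesky(S)                 # S = L L^T
    alpha = np.linalg.solve(L, r)             # so r^T S^-1 r = alpha . alpha
    return (-0.5 * alpha @ alpha
            - np.sum(np.log(np.diag(L)))      # equals -0.5 * log det S
            - 0.5 * m * np.log(2.0 * np.pi))

def se_kernel_1d(x, sigma_f, sigma_l):
    """Squared-exponential kernel on a 1D grid of positions."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return sigma_f ** 2 * np.exp(-d2 / (2.0 * sigma_l ** 2))

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 20)                 # toy 1D pixel positions
R = rng.uniform(size=(6, 20))                 # hypothetical response matrix
Sigma_d = 1e-2 * np.eye(6)
d = R @ np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, size=6)

# Every candidate needs a new covariance matrix and a new factorization
candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
scores = [log_marginal_likelihood(R, d, np.zeros(20),
                                  se_kernel_1d(x, 1.0, s), Sigma_d)
          for s in candidates]
best_sigma_l = candidates[int(np.argmax(scores))]
```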
Learn deeper: Neural network tomography
Establish training data set
➢ Synthetic training data (phantom test data) from GPT results
30000 GPT results from EAST SXR #70750 → EAST SXR response matrix $\bar{\bar{R}}$ → 30000 synthetic measurements with 5% random noise
• Input data $= (d_1, d_2, d_3, d_4, \dots, d_1 d_1, d_1 d_2, d_1 d_3, \dots)$
• Input data dimension: 92 + 92 × 92
• Activation function: rectified linear unit (ReLU)
• $\mathrm{ReLU}(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$
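The data preparation steps can be sketched as follows. The noise model (Gaussian, with a standard deviation of 5% of each clean signal value) is an assumption, since the slides only state "5% random noise"; the random response matrix and emissivity maps are placeholders for the EAST SXR matrix and the GPT results.

```python
import numpy as np

def synthetic_measurements(R, emissivities, noise_level, rng):
    """Forward-project emissivity maps and add relative Gaussian noise.

    emissivities has shape (n_samples, n_pixels); the result has shape
    (n_samples, n_channels).
    """
    clean = emissivities @ R.T
    return clean + noise_level * clean * rng.standard_normal(clean.shape)

def expand_interactions(d):
    """Append all pairwise products d_i * d_j along the last axis."""
    outer = d[..., :, None] * d[..., None, :]
    return np.concatenate([d, outer.reshape(*d.shape[:-1], -1)], axis=-1)

def relu(x):
    """Rectified linear unit: x for x > 0, otherwise 0."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(4)
n_samples, n_channels, n_pixels = 10, 92, 2500
R = rng.uniform(size=(n_channels, n_pixels))           # placeholder response matrix
emissivities = rng.uniform(size=(n_samples, n_pixels))  # placeholder GPT results

d_noisy = synthetic_measurements(R, emissivities, noise_level=0.05, rng=rng)
features = expand_interactions(d_noisy)                 # network input
```

Each input vector then has 92 + 92 × 92 = 8556 entries, matching the input dimension quoted above.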
Convolutional Neural Network (CNN)
➢ 6-layer network structure
• Input layer: the 92-channel SXR measurement, with interaction dimension expansion
• Layers 1 to 4
• Output layer: (50, 50) matrix with ReLU activation, giving the SXR emissivity map
➢ Reconstruction results
• Averaged maximum error in 1000 tests = 6.73%.
• Averaged root-mean-squared deviation (RMSD) in 1000 tests = $3 \times 10^{-4}$
• Compared to GPT RMSD = $3.4 \times 10^{-3}$
• $\mathrm{RMSD} = \sqrt{\dfrac{\sum_{t=1}^{n} \left(E_{t,i}^{rec} - E_{t,i}\right)^2}{n}}$
• Execution time for one time-slice reconstruction: 3 ms with GPU.
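The two error metrics can be sketched as below. The RMSD follows the usual root-mean-square definition; the normalisation of the maximum error (relative to the peak of the reference map) is an assumption, since the slides do not say how the percentage is computed.

```python
import numpy as np

def rmsd(E_rec, E_true):
    """Root-mean-squared deviation between reconstruction and reference."""
    diff = np.asarray(E_rec, dtype=float) - np.asarray(E_true, dtype=float)
    return np.sqrt(np.mean(diff ** 2))

def max_relative_error(E_rec, E_true):
    """Largest absolute error divided by the peak of the reference
    (assumed normalisation)."""
    E_rec = np.asarray(E_rec, dtype=float)
    E_true = np.asarray(E_true, dtype=float)
    return np.max(np.abs(E_rec - E_true)) / np.max(np.abs(E_true))

# Tiny hand-checkable example
E_true = np.array([1.0, 2.0, 5.0])
E_rec = np.array([1.0, 2.0, 3.0])
example_rmsd = rmsd(E_rec, E_true)                 # sqrt(4 / 3)
example_max = max_relative_error(E_rec, E_true)    # 2 / 5
```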
Fully connected Neural Network (FCNN)
➢ 11-layer network structure
• Input: 5% noisy SXR data, with interaction dimension expansion
• Five fully connected layers, one sigmoid layer and three ReLU layers
• Data dimensions: 92 + 92 × 92 → 2500 → 2500 → 2500 → 2500 → 2500
• Output layer: (50, 50) matrix with ReLU activation, giving the SXR emissivity map
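A forward pass through a network of this kind can be sketched in plain NumPy. The ordering of the activations (sigmoid after the first fully connected layer, ReLU after the others) is an assumption, the weights are random rather than trained, and the demo uses a scaled-down size; the real network would take 92 + 92 × 92 = 8556 inputs and use 2500 units per layer.

```python
import numpy as np

def init_fcnn(n_in, n_hidden, n_layers, rng):
    """Random weights and zero biases for n_layers fully connected layers."""
    sizes = [n_in] + [n_hidden] * n_layers
    return [(rng.normal(0.0, 1.0 / np.sqrt(a), size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def fcnn_forward(params, x):
    """Fully connected layers: sigmoid after the first, ReLU after the rest
    (assumed ordering)."""
    for i, (W, b) in enumerate(params):
        z = x @ W + b
        x = 1.0 / (1.0 + np.exp(-z)) if i == 0 else np.maximum(z, 0.0)
    return x

rng = np.random.default_rng(5)
n_channels, n_side = 6, 4                      # scaled-down toy sizes
n_in = n_channels + n_channels * n_channels    # measurements plus pairwise products
params = init_fcnn(n_in, n_hidden=n_side * n_side, n_layers=5, rng=rng)

batch = rng.uniform(size=(3, n_in))            # three expanded input vectors
maps = fcnn_forward(params, batch).reshape(3, n_side, n_side)
```

The final ReLU guarantees non-negative output maps, which is consistent with emissivity being a non-negative quantity.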
➢ Reconstruction results
• Averaged maximum error in 1000 tests = 2.36%.
• Averaged root-mean-squared deviation (RMSD) in 1000 tests = $1.1 \times 10^{-4}$
• Compared to GPT RMSD = $3.4 \times 10^{-3}$
• Compared to standard CNN RMSD = $3 \times 10^{-4}$
• Execution time for one time-slice reconstruction: 1.6 ms with GPU.
Training & Testing
➢ Training data set from #70750
➢ Testing data set from #70754
Benchmark of FCNN & Equi-GPT
➢ On training set #70750
[Figure: FCNN result. Green curve: GPT pseudo-signal; blue curve: FCNN pseudo-signal; red dots: measurement data]
➢ SVD analysis of training set #70750
[Figure: first, second and third SVD topos, from FCNN results and from GPT results]
➢ On testing set #70754
[Figure: FCNN result. Green curve: GPT pseudo-signal; blue curve: FCNN pseudo-signal; red dots: measurement data]
➢ SVD analysis of testing set #70754
[Figure: first, second and third SVD topos, from FCNN results and from GPT results]
Conclusion and perspectives
Conclusion
• Gaussian process tomography (GPT) has been successfully implemented for soft X-ray emissivity tomography in the EAST and WEST geometries, with a view to real-time control of impurity transport and MHD activity.
• Because the emissivity field is modelled as a Gaussian process, the posterior emissivity distribution is also a Gaussian process, permitting fast reconstruction (execution time on the order of 10 ms per time step on a dual-core CPU with MATLAB).
• Based on a database of GPT results, a complementary neural network approach has been implemented, using both fully connected and convolutional architectures. This decreases the computational load significantly (GPU time on the order of 1 ms with Python).
• This is a first attempt at using neural networks in real-time tomography. Good results are achieved using linear diagnostic forward models.
Perspectives
• The neural network approach can easily be transferred to other diagnostics (e.g. bolometry) or plasma physics processes.
• The control system could also be implemented within a neural network framework. It may be possible to use artificial intelligence to control tokamaks in the future.
• The neural network approach has even more potential for non-linear forward models. Additional studies will be performed on more complex non-linear diagnostic models.
[Diagram: AI control system, with the blocks Control & heating, Plasma response, Diagnostic data, Data analysis & inference]
THANKS!