Fast tomography of soft X-ray data using Gaussian processes and neural networks

Presenter: Tianbo Wang

Author list: Tianbo Wang¹,²,³, Didier Mazon³, Jakob Svensson⁴, Yuan Li⁶, Zhiyin Zhou⁷, Junhao Guo⁷, Rong Hou⁷, Liqing Xu⁵, Liqun Hu⁵, Yanmin Duan⁵, Geert Verdoolaege²

1 Southwestern Institute for Physics, CNNC, C-610200 Chengdu, People’s Republic of China
2 Department of Applied Physics, Ghent University, B-9000 Ghent, Belgium
3 CEA, IRFM, F-13108 Saint-Paul-lez-Durance, France
4 Max-Planck-Institut für Plasmaphysik, D-17491 Greifswald, Germany
5 Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031, People’s Republic of China
6 Remark Holdings, Inc., 89169 Las Vegas, United States of America
7 China Science IntelliCloud Technology Co. LTD, Hefei 230031, People’s Republic of China

Outline

• Line-of-sight tomography problems
• Gaussian process tomography (GPT)
• Neural network tomography (NNT)
• Conclusion and perspectives

Line-of-sight tomography problems

Line-of-sight diagnostics

• Bolometry

• Soft X-ray spectrometry

• Hard X-ray spectrometry

• Interferometry

• Hα measurements, and others…

[Figures: ITER bolometry system; ITER Radial Neutron Camera]

LOS tomography problems

➢ An example: the WEST two-camera SXR system

[Figure: WEST SXR lines of sight on the reconstruction grid; axes: major radius (mm), minor radius (mm)]

• 75 vertical lines of sight (LOS)
• 128 horizontal lines of sight (LOS)
• 100 × 100 pixels
• 16 mm × 16 mm pixel size

D. Mazon et al., Design of soft-X-ray tomographic system in WEST using GEM detectors, Fusion Eng. Des. 96–97, 856 (2015).

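➢ Forward model of LOS tomography

$$\bar{d}_M = \bar{\bar{R}}\,\bar{E}_N$$

• $\bar{d}_M$: SXR array detection signal (203 × 1 for WEST SXR)
• $\bar{\bar{R}}$: response matrix (LOS geometric dependence)
• $\bar{E}_N$: emissivity value of each pixel (100 × 100 for WEST SXR)

$$\bar{\bar{R}} = \begin{pmatrix} L_1^1 & L_2^1 & \cdots & L_n^1 \\ L_1^2 & L_2^2 & \cdots & L_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ L_1^m & L_2^m & \cdots & L_n^m \end{pmatrix}$$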
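The forward model above can be sketched numerically. The example below is a minimal illustration under loud assumptions: the geometry is a hypothetical toy (one vertical chord per pixel column and one horizontal chord per pixel row, each crossing a pixel over its full 16 mm width), not the WEST camera geometry, and the function name `toy_response_matrix` is invented for this sketch.

```python
import numpy as np

def toy_response_matrix(n_rows, n_cols, pixel_size):
    """Response matrix for a hypothetical toy geometry (NOT the WEST layout):
    one vertical LOS per pixel column and one horizontal LOS per pixel row.
    Entry (i, j) is the chord length of LOS i inside pixel j."""
    n_pix = n_rows * n_cols
    R = np.zeros((n_cols + n_rows, n_pix))
    pixel_index = np.arange(n_pix).reshape(n_rows, n_cols)
    for c in range(n_cols):                      # vertical chords
        R[c, pixel_index[:, c]] = pixel_size
    for r in range(n_rows):                      # horizontal chords
        R[n_cols + r, pixel_index[r, :]] = pixel_size
    return R

n_rows, n_cols, pixel_size = 4, 5, 16.0          # toy grid, pixel size in mm
R = toy_response_matrix(n_rows, n_cols, pixel_size)
E = np.ones((n_rows, n_cols))                    # uniform emissivity map
d = R @ E.ravel()                                # forward model: d = R E
```

With the emissivity map flattened to a vector, the measurement vector has one entry per line of sight, which is the same shape relationship as the 203 × 1 signal and 100 × 100 pixel grid quoted for WEST.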

➢Forward and inverse problem

• Forward problem: deductive, predictive; the answer is perfectly known.
• Inverse problem: inductive, inferential; many possible answers, with uncertainty.

Many tomography methods exist: least squares with Tikhonov regularization, minimum Fisher information regularization…

A probabilistic solution: Gaussian Process Tomography

The logic of Bayesian inference

$$p(\text{emissivity} \mid \text{measurement}) \propto p(\text{measurement} \mid \text{emissivity})\; p(\text{emissivity})$$

Likelihood, given by the forward model:
• If we knew the emissivity field, to what measurements would it lead?

Prior information:
• What are the possible emissivity fields before taking measurements?

Posterior information:
• What is the probability distribution of the emissivity, given the measurements?
• What is the most likely emissivity, given the measurements, and how certain are we?

Gaussian Process modelling

• $\bar{\mu}_E$: mean vector of the Gaussian process (most probable emissivity value in each pixel)
• $\bar{\bar{\Sigma}}_E$: covariance matrix of the Gaussian process (sets the smoothness and might also be related to temporal resolution)

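Squared exponential kernel function:

$$\bar{\bar{\Sigma}}_E = \begin{pmatrix} k(\vec{r}_1, \vec{r}_1) & \cdots & k(\vec{r}_1, \vec{r}_n) \\ \vdots & \ddots & \vdots \\ k(\vec{r}_n, \vec{r}_1) & \cdots & k(\vec{r}_n, \vec{r}_n) \end{pmatrix}$$

$$k(\vec{r}, \vec{r}') = \sigma_f^2 \exp\left(-\frac{d^2}{2\sigma_l^2}\right), \qquad d = |\vec{r} - \vec{r}'|$$

➢ $\vec{r}$: location of pixel
➢ $d$: distance between pixels
➢ $\sigma_f$: basic variance value (hyperparameter)
➢ $\sigma_l$: length scale (hyperparameter)
➢ $\bar{\theta}$: all the hyperparameters, summarized by a vector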
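As an illustration of how the prior covariance is assembled from the squared exponential kernel, here is a minimal NumPy sketch. The function name `squared_exponential_cov`, the 5 × 5 toy grid and the hyperparameter values are assumptions chosen for the example, not values from the talk.

```python
import numpy as np

def squared_exponential_cov(coords, sigma_f, sigma_l):
    """Prior covariance from the squared exponential kernel.

    coords  : (n, 2) array of pixel centre positions
    sigma_f : amplitude hyperparameter (diagonal of the result is sigma_f**2)
    sigma_l : length-scale hyperparameter
    """
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise differences
    d2 = np.sum(diff**2, axis=-1)                    # squared distances d^2
    return sigma_f**2 * np.exp(-d2 / (2.0 * sigma_l**2))

# Toy 5 x 5 pixel grid with unit spacing (hypothetical, for illustration only)
x, y = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([x.ravel(), y.ravel()])
Sigma_E = squared_exponential_cov(coords, sigma_f=2.0, sigma_l=1.5)
```

A larger `sigma_l` makes distant pixels more strongly correlated, which is how the kernel acts as a smoothness regularizer.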

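Gaussian Process prior:

$$p(\bar{E}_N \mid \bar{\theta}) = \frac{1}{(2\pi)^{N/2}\, |\bar{\bar{\Sigma}}_E|^{1/2}} \exp\left[-\frac{1}{2} (\bar{E}_N - \bar{\mu}_E)^T\, \bar{\bar{\Sigma}}_E^{-1}\, (\bar{E}_N - \bar{\mu}_E)\right]$$

Gaussian Process likelihood:

$$p(\bar{d}_M \mid \bar{E}_N, \bar{\theta}) = \frac{1}{(2\pi)^{M/2}\, |\bar{\bar{\Sigma}}_d|^{1/2}} \exp\left[-\frac{1}{2} (\bar{\bar{R}}\,\bar{E}_N - \bar{d}_M)^T\, \bar{\bar{\Sigma}}_d^{-1}\, (\bar{\bar{R}}\,\bar{E}_N - \bar{d}_M)\right]$$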

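Covariance matrix of posterior:

$$\bar{\bar{\Sigma}}_E^{post} = \left(\bar{\bar{R}}^T\, \bar{\bar{\Sigma}}_d^{-1}\, \bar{\bar{R}} + \bar{\bar{\Sigma}}_E^{-1}\right)^{-1}$$

Mean vector of posterior (with $\bar{\mu}_E$ the prior mean):

$$\bar{\mu}_E^{post} = \bar{\mu}_E + \left(\bar{\bar{R}}^T\, \bar{\bar{\Sigma}}_d^{-1}\, \bar{\bar{R}} + \bar{\bar{\Sigma}}_E^{-1}\right)^{-1} \bar{\bar{R}}^T\, \bar{\bar{\Sigma}}_d^{-1} \left(\bar{d}_M - \bar{\bar{R}}\,\bar{\mu}_E\right)$$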

For tomography problems the forward model is linear, so the product of the two Gaussian probability density functions (likelihood and prior) is also Gaussian.

The posterior mean and covariance therefore have closed forms, which permits real-time calculation.
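The closed-form posterior for a linear forward model with Gaussian prior and likelihood can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation. It uses the algebraically equivalent matrix-inversion-lemma form, which only requires solving an m × m system; the toy sizes, the random response matrix and the noise level are assumptions made for the example.

```python
import numpy as np

def gp_posterior(R, d, mu_E, Sigma_E, Sigma_d):
    """Posterior mean and covariance of the emissivity for a linear forward
    model d = R E + noise, with Gaussian prior N(mu_E, Sigma_E) and Gaussian
    noise covariance Sigma_d. Only an (m x m) linear system is solved."""
    K = Sigma_E @ R.T                      # (n, m) prior cross-covariance
    S = R @ K + Sigma_d                    # (m, m) predicted data covariance
    gain = np.linalg.solve(S, K.T).T       # K S^-1 (S is symmetric)
    mu_post = mu_E + gain @ (d - R @ mu_E)
    Sigma_post = Sigma_E - gain @ K.T
    return mu_post, Sigma_post

# Toy problem (hypothetical sizes and geometry, for illustration only)
rng = np.random.default_rng(0)
n, m = 20, 6
R = rng.random((m, n))                                     # placeholder response matrix
positions = np.arange(n, dtype=float)
Sigma_E = np.exp(-(positions[:, None] - positions[None, :])**2 / 2.0)
Sigma_d = 0.01 * np.eye(m)                                 # noise std 0.1
mu_E = np.zeros(n)
E_true = np.exp(-(positions - 10.0)**2 / 20.0)
d = R @ E_true + 0.1 * rng.standard_normal(m)
mu_post, Sigma_post = gp_posterior(R, d, mu_E, Sigma_E, Sigma_d)
```

The inference is a single linear-algebra step with no iteration, and the diagonal of `Sigma_post` gives a per-pixel uncertainty alongside the reconstruction.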


J. Svensson and J.-E. Contributors, JET Intern. Rep. (2010).

T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege, Gaussian Process tomography for soft X-ray spectroscopy at WEST without equilibrium information, Rev. Sci. Instrum. 89(6), June 2018.

Gaussian Process regularization

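Kernel function → Prior covariance → Regularization (smoothness)

Prior covariance:

$$\bar{\bar{\Sigma}}_E = \begin{pmatrix} k(\vec{r}_1, \vec{r}_1) & \cdots & k(\vec{r}_1, \vec{r}_n) \\ \vdots & \ddots & \vdots \\ k(\vec{r}_n, \vec{r}_1) & \cdots & k(\vec{r}_n, \vec{r}_n) \end{pmatrix}$$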

Covariance kernel function (with equilibrium information applied):

$$k_{SE} = \sigma_f^2 \exp\left[-\left(\frac{d_\perp^2}{2\sigma_{l\perp}^2} + \frac{d_\parallel^2}{2\sigma_{l\parallel}^2}\right)\right]$$

The distances between pixels along the perpendicular and parallel directions are calculated in advance.
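A minimal sketch of the anisotropic kernel, assuming the perpendicular and parallel distance matrices have already been computed. The slab geometry used to build those matrices here (parallel direction along x, perpendicular direction along y) is a hypothetical stand-in for the equilibrium-based distances, and the function name and hyperparameter values are invented for the example.

```python
import numpy as np

def anisotropic_se_cov(d_perp, d_par, sigma_f, l_perp, l_par):
    """Squared exponential covariance with separate length scales for the
    perpendicular and parallel directions. d_perp and d_par are precomputed
    (n, n) matrices of pairwise distances."""
    exponent = d_perp**2 / (2.0 * l_perp**2) + d_par**2 / (2.0 * l_par**2)
    return sigma_f**2 * np.exp(-exponent)

# Hypothetical slab geometry: "parallel" = x direction, "perpendicular" = y
x, y = np.meshgrid(np.arange(4.0), np.arange(3.0))
x, y = x.ravel(), y.ravel()
d_par = np.abs(x[:, None] - x[None, :])
d_perp = np.abs(y[:, None] - y[None, :])
Sigma_E_aniso = anisotropic_se_cov(d_perp, d_par, sigma_f=1.0, l_perp=0.5, l_par=3.0)
```

With a longer parallel length scale, two pixels one unit apart along the parallel direction are far more strongly correlated than two pixels one unit apart across it.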


T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege, Incorporating magnetic equilibrium information in Gaussian Process tomography for soft X-ray spectroscopy at WEST, Rev. Sci. Instrum. 89(10), October 2018.

[Figure: three points (Point 1, Point 2, Point 3) illustrating the two distances]

• Perpendicular distance: $d_\perp = D_\perp(\text{point 2}, \text{point 3})$
• Parallel distance: $d_\parallel = D_\parallel(\text{point 1}, \text{point 3})$

GPT’s attractive features

Non-parametric inference

The hyperparameters can be optimized through the marginal likelihood.

J. Svensson and J.-E. Contributors, JET Intern. Rep. (2010).
T. Wang, D. Mazon, J. Svensson, D. Li, A. Jardin and G. Verdoolaege, Rev. Sci. Instrum. 89(6), June 2018.

Non-iterative algorithm

The inference consists of a single step: direct calculation of posterior mean

Low computational complexity

$O(n^2 m)$, compared with $O(n^3)$ for least-squares optimization.

Real-time capability (WEST SXR case study):

• Number of pixels $n = 10000$, number of measurements $m = 203$
• GPT execution time: 30 ms using MATLAB
• 50 times faster than a single iteration of MFI
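As a rough worked example, the two complexity estimates can be evaluated for the WEST numbers quoted above. These are operation-count scalings only (constant factors ignored), not measured timings:

$$n^2 m = (10^4)^2 \times 203 \approx 2.0 \times 10^{10}, \qquad n^3 = (10^4)^3 = 10^{12}, \qquad \frac{n^3}{n^2 m} = \frac{n}{m} = \frac{10000}{203} \approx 49.$$

The advantage of the $O(n^2 m)$ scaling therefore grows with the ratio of pixels to measurements.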


GPT real time application on EAST SXR

➢ Observation of a kink mode triggered by a sawtooth, shot #70750

• $\bar{d}_M$: SXR array detection signal (92 × 1 for EAST SXR)
• $\bar{\bar{R}}$: response matrix
• $\bar{E}_N$: emissivity value of each pixel
• Execution time less than 8 ms

T. Wang, G. Verdoolaege, D. Mazon, J. Svensson, Bayesian data analysis for Gaussian process tomography, J. Fusion Energ.(2018)

Singular value decomposition

➢ MHD mode structure analysis: (1,1) kink mode structure identified.

[Figure panels: SVD first topos, SVD second topos, SVD third topos]
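The SVD analysis of a sequence of reconstructed emissivity maps can be sketched as below. This is a minimal illustration on synthetic data, not the EAST analysis: the frames are a hypothetical static profile plus one oscillating mode, and the temporal singular vectors are called "chronos" here by the usual convention (the slides only name the spatial "topos").

```python
import numpy as np

def svd_topos_chronos(frames):
    """SVD of a stack of emissivity maps.

    frames : (n_time, ny, nx) array of reconstructions
    Returns singular values s, chronos (n_time, k) temporal vectors and
    topos (k, ny, nx) spatial structures, ordered by decreasing s.
    """
    n_time, ny, nx = frames.shape
    X = frames.reshape(n_time, ny * nx)            # time x space matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return s, U, Vt.reshape(-1, ny, nx)

# Synthetic test data (hypothetical): static profile + one oscillating mode
n_time, ny, nx = 40, 8, 8
yy, xx = np.mgrid[0:ny, 0:nx]
profile = np.exp(-((xx - 3.5)**2 + (yy - 3.5)**2) / 8.0)
mode = (xx - 3.5) * profile
t = np.linspace(0.0, 1.0, n_time)
frames = profile[None] + 0.2 * np.sin(2 * np.pi * 5 * t)[:, None, None] * mode[None]
s, chronos, topos = svd_topos_chronos(frames)
```

Because the synthetic data contain only two independent space-time components, only the first two singular values are significant; in real data the leading topos separate the background profile from the mode structures.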

GPT real time application on EAST XUV

➢ Impurity accumulation caused by plasma-wall interaction

• $\bar{d}_M$: SXR array detection signal (64 × 1 for EAST XUV)
• $\bar{\bar{R}}$: response matrix
• $\bar{E}_N$: emissivity value of each pixel
• Execution time less than 4 ms

More real-time applications?

Adaptive hyperparameters?

• The hyperparameter optimization is very time consuming.
• Not feasible in real time.

Integrated Data Analysis & other diagnostics?

• The forward model is not always linear.
• Full Bayesian analysis and MCMC are then unavoidable.
• Not feasible in real time.

Learn Deeper: Neural Network Tomography

Establish training data set

➢ Synthetic training data (phantom test data) from GPT results

• 30000 GPT results from EAST SXR #70750
• Multiplied by $\bar{\bar{R}}$, the EAST SXR response matrix
• Result: 30000 synthetic measurements, with 5% random noise
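The construction of the synthetic training set can be sketched as follows. The slides do not specify the noise model, so this example assumes zero-mean Gaussian noise with a standard deviation of 5% of each clean signal value. The response matrix and emissivity maps are random placeholders, and the function name is invented for the sketch.

```python
import numpy as np

def synthesize_measurements(R, emissivity_maps, noise_level=0.05, rng=None):
    """Apply the forward model d = R E to each emissivity map and add noise.

    ASSUMPTION: noise is Gaussian and proportional to the clean signal
    (noise_level = 0.05 means 5% relative standard deviation).
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = emissivity_maps.reshape(emissivity_maps.shape[0], -1)  # (n_samples, n_pixels)
    clean = flat @ R.T                                           # (n_samples, n_channels)
    noise = noise_level * clean * rng.standard_normal(clean.shape)
    return clean + noise

# Placeholder data: 10 random maps instead of the 30000 GPT reconstructions
rng = np.random.default_rng(1)
n_samples, n_channels, grid = 10, 92, (50, 50)
R_toy = rng.random((n_channels, grid[0] * grid[1]))   # stand-in response matrix
maps = rng.random((n_samples, *grid))                 # stand-in emissivity maps
measurements = synthesize_measurements(R_toy, maps, rng=rng)
```

Each pair (noisy measurement, emissivity map) then serves as one input and target example for network training.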

• Input data $= (d_1, d_2, d_3, d_4, \dots, d_1 d_1, d_1 d_2, d_1 d_3, \dots)$
• Input data dimension: 92 + 92 × 92
• Activation function: rectified linear unit (ReLU)
• $\mathrm{ReLU}(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$
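The interaction dimension expansion and the ReLU activation listed above can be written in a few lines. This is a minimal sketch: the function names are invented, and the ordering of the product terms (row-major over all pairs) is an assumption, since the slides only show the first few terms.

```python
import numpy as np

def expand_with_interactions(d):
    """Append all pairwise products d_i * d_j to the measurement vector.
    For 92 channels this gives 92 + 92 * 92 = 8556 input features."""
    d = np.asarray(d, dtype=float)
    return np.concatenate([d, np.outer(d, d).ravel()])

def relu(x):
    """Rectified linear unit: x for x > 0, otherwise 0."""
    return np.maximum(x, 0.0)

d = np.linspace(-1.0, 1.0, 92)            # placeholder 92-channel signal
features = expand_with_interactions(d)
```

The expansion hands the network second-order terms explicitly, at the cost of a much wider input layer.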


Convolutional Neural Network (CNN)

➢ 6-layer network structure

• Input layer: the 92-channel SXR measurement with interaction dimension expansion
• Layer 1, Layer 2, Layer 3, Layer 4
• Output layer: (50, 50) matrix with ReLU activation, the SXR emissivity map

• Average maximum error over 1000 tests = 6.73%.
• Average root-mean-squared deviation (RMSD) over 1000 tests = $3 \times 10^{-4}$.
• Compared with GPT RMSD = $3.4 \times 10^{-3}$.
• $\mathrm{RMSD} = \sqrt{\dfrac{\sum_{t=1}^{n} \left(E_{t,i}^{rec} - E_{t,i}\right)^2}{n}}$.
• Execution time for reconstructing one time slice: 3 ms with GPU.
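The RMSD metric in the list above can be computed as in the sketch below. It assumes the average runs over every element of the arrays passed in; the slides do not state precisely which indices are averaged, so treat that as an assumption of the example.

```python
import numpy as np

def rmsd(E_rec, E_true):
    """Root-mean-squared deviation between reconstructed and reference
    emissivity values, averaged over all elements of the inputs."""
    E_rec = np.asarray(E_rec, dtype=float)
    E_true = np.asarray(E_true, dtype=float)
    return np.sqrt(np.mean((E_rec - E_true) ** 2))

# Small worked case: deviations of 3 and 4 give sqrt((9 + 16) / 2)
example = rmsd([0.0, 0.0], [3.0, 4.0])
```

For the two-element case shown, the result is the square root of 12.5, about 3.54.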


Convolutional Neural Network (CNN)

➢Reconstruction results

Fully connected Neural Network (FCNN)

➢ 11-layer network structure

• Input layer: 5% noisy SXR data with interaction dimension expansion (data dimension 92 + 92 × 92)
• Hidden structure (from the diagram): five fully connected layers, each with output dimension 2500, together with one sigmoid layer and three ReLU layers
• Output layer: (50, 50) matrix with ReLU activation, the SXR emissivity map
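A forward pass through a network of this general shape can be sketched in plain NumPy. Several things here are assumptions: the exact ordering of the activations (sigmoid after the first fully connected layer, ReLU after the others) is inferred from the diagram labels, the weights are random and untrained, and the layer sizes are shrunk for the example (the talk quotes an input of 92 + 92 × 92 and layers of width 2500).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

def init_params(layer_sizes, rng):
    """Random (untrained) weights and zero biases for each dense layer."""
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)
        params.append((W, np.zeros(n_out)))
    return params

def fcnn_forward(x, params):
    """ASSUMED ordering: sigmoid after the first dense layer, ReLU after the
    remaining ones, so the output map is non-negative."""
    activations = [sigmoid] + [relu] * (len(params) - 1)
    for (W, b), act in zip(params, activations):
        x = act(x @ W + b)
    return x

rng = np.random.default_rng(0)
n_channels, side = 8, 5                         # reduced sizes for the sketch
n_in = n_channels + n_channels**2               # interaction-expanded input
layer_sizes = [n_in] + [side * side] * 5        # five fully connected layers
params = init_params(layer_sizes, rng)
d = rng.random(n_channels)                      # placeholder measurement
x = np.concatenate([d, np.outer(d, d).ravel()])
emissivity_map = fcnn_forward(x, params).reshape(side, side)
```

Once trained, inference is only a handful of matrix-vector products, which is consistent with the millisecond-scale execution times reported in the talk.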

• Average maximum error over 1000 tests = 2.36%.
• Average root-mean-squared deviation (RMSD) over 1000 tests = $1.1 \times 10^{-4}$.
• Compared with GPT RMSD = $3.4 \times 10^{-3}$.
• Compared with standard CNN RMSD = $3 \times 10^{-4}$.
• Execution time for reconstructing one time slice: 1.6 ms with GPU.


Fully connected Neural Network (FCNN)

➢Reconstruction results


Training & Testing

➢Training data set from #70750

➢ Testing data set from #70754

Benchmark of FCNN & Equi-GPT

➢ On training set #70750

[Figure: FCNN result. Green curve: GPT pseudo-signal; blue curve: FCNN pseudo-signal; red dots: measurement data]

➢ SVD analysis of training set #70750

[Figure: SVD first, second and third topos, from FCNN results and from GPT results]


➢ On testing set #70754

[Figure: FCNN result. Green curve: GPT pseudo-signal; blue curve: FCNN pseudo-signal; red dots: measurement data]

➢ SVD analysis of testing set #70754

[Figure: SVD first, second and third topos, from FCNN results and from GPT results]


Conclusion and perspectives

Conclusion

• Gaussian process tomography (GPT) has been successfully implemented for the inversion of soft X-ray emissivity in the EAST and WEST geometries, with a view to real-time control of impurity transport and MHD activity.
• Because the emissivity field is modelled as a Gaussian process, the posterior emissivity distribution is also a Gaussian process, permitting fast reconstruction (execution time of order 10 ms for one time step on a dual-core CPU with MATLAB).
• Based on a database of GPT results, a complementary neural network approach has been implemented, using both fully connected and convolutional architectures. This decreases the computational load significantly (of order 1 ms GPU time with Python).
• This is a first attempt at using neural networks for real-time tomography. Good results are achieved using linear diagnostic forward models.

Perspectives

• The neural network approach can be easily transferred to other diagnostics (e.g. bolometry) or plasma physics processes.
• The control system could also be implemented within a neural network framework. It may be possible to use artificial intelligence to control tokamaks in the future.
• The neural network approach has even more potential for non-linear forward models. Additional studies will be performed on more complex non-linear diagnostic models.

[Diagram: Control & heating; Plasma response; Diagnostic data; Data analysis & inference; AI control system]

THANKS!
