Benchmarking robust regression techniques

for global energy confinement scaling in tokamaks

Geert Verdoolaege

Department of Applied Physics, Ghent University, Ghent, Belgium

Laboratory for Plasma Physics, Royal Military Academy (LPP-ERM/KMS), Brussels, Belgium

IAEA TM Fusion Data Processing, Validation and Analysis, May 30, 2017

1 Motivation

2 Geodesic least squares regression (GLS)

3 Energy confinement scaling

4 Conclusion

Overview

Parametric dependencies

Validation, prediction

Ordinary least squares

Uncertainties:

All variables (x and y)

Heterogeneous data, outliers

Model: deterministic + stochastic component

Collinearity: regularization

$y = \beta_0 + \beta_1 x + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

Power scaling laws: astronomy, biology, geology, finance, ...

Regression analysis

Robust regression analysis

Need a robust, general-purpose regression technique that is easy to apply.

Two measurements

Zooming in...

Example 1: electron density

Example 1: electron density distribution

Example 2: inter-ELM time

Example 2: inter-ELM time distribution

Difference/distance between measurements

Euclidean distance

Which distance?

A point and a distribution

Sum of squares

Mahalanobis distance

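$$p(y_i \mid x_i, \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(y_i - \mu_i)^2}{2\sigma^2}\right]$$

→ maximum likelihood

$$\mu_i = f_i(x_i, \theta), \quad \text{e.g. } \mu_i = \beta_0 + \beta_1 x_i$$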
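The step from the Gaussian likelihood to the sum of squares can be written out explicitly. The derivation below is a sketch that assumes $N$ independent measurements sharing one known $\sigma$ (an assumption made here for illustration):

```latex
-\ln \prod_{i=1}^{N} p(y_i \mid x_i, \theta)
  = N \ln\!\left(\sqrt{2\pi}\,\sigma\right)
  + \frac{1}{2\sigma^2} \sum_{i=1}^{N} \left(y_i - \mu_i\right)^2
```

Only the last term depends on $\theta$, so maximizing the likelihood is equivalent to minimizing $\sum_i (y_i - \mu_i)^2$. If each point has its own $\sigma_i$, the terms become $(y_i - \mu_i)^2 / \sigma_i^2$, the squared Mahalanobis distance between the measurement and the model prediction.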

Mahalanobis distance

Telling cats from dogs

Rao geodesic distance

Information geometry

Pseudosphere model

The Gaussian probability space
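For two univariate Gaussians the Rao geodesic distance has a closed form. The sketch below uses the standard Fisher-Rao expression for the normal family; the function name `rao_geodesic_distance` is illustrative and not taken from the talk.

```python
import math

def rao_geodesic_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    dmu2 = (mu1 - mu2) ** 2
    # delta lies in [0, 1) for positive standard deviations
    delta = math.sqrt((dmu2 + 2.0 * (sigma1 - sigma2) ** 2)
                      / (dmu2 + 2.0 * (sigma1 + sigma2) ** 2))
    return 2.0 * math.sqrt(2.0) * math.atanh(delta)

# Same mean, standard deviations differing by a factor e:
# the distance is sqrt(2) * |ln(sigma2 / sigma1)| = sqrt(2).
d_same_mean = rao_geodesic_distance(0.0, 1.0, 0.0, math.e)
```

Unlike the Euclidean distance between the means, this distance also grows when only the widths of the two distributions differ.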

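Modeled distribution:

$$p_\mathrm{mod}(y) = \frac{1}{\sqrt{2\pi}\,\sqrt{\sigma_y^2 + \sum_{j=1}^{m} \beta_j^2 \sigma_{x,j}^2}} \exp\left\{-\frac{1}{2}\,\frac{\left[y - \left(\beta_0 + \sum_{j=1}^{m} \beta_j x_{ij}\right)\right]^2}{\sigma_y^2 + \sum_{j=1}^{m} \beta_j^2 \sigma_{x,j}^2}\right\}$$

Observed distribution:

$$p_\mathrm{obs}(y) = \frac{1}{\sqrt{2\pi}\,\sigma_\mathrm{obs}} \exp\left[-\frac{1}{2}\,\frac{(y - y_i)^2}{\sigma_\mathrm{obs}^2}\right]$$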

Rao GD

To be estimated: σobs, β0, β1, . . . , βm

iid data: minimize sum of squared GDs ⇒ geodesic least squares (GLS) regression

If σmod = σobs ⇒ Mahalanobis distance
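The link with the Mahalanobis distance can be checked from the standard closed-form Rao GD between two univariate Gaussians (the closed form itself is not shown in the slides). For equal standard deviations $\sigma_\mathrm{mod} = \sigma_\mathrm{obs} = \sigma$, and writing $d_M = |\mu_\mathrm{mod} - \mu_\mathrm{obs}| / \sigma$:

```latex
\mathrm{GD} = 2\sqrt{2}\,\operatorname{artanh}\!\left(\frac{d_M}{\sqrt{d_M^2 + 8}}\right)
\approx d_M \qquad (d_M \ll 1)
```

The GD is then a monotonically increasing function of $d_M$ alone, and to first order it equals $d_M$.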

G. Verdoolaege et al., Nucl. Fusion 55, 113019, 2015

GLS with linear model
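A minimal sketch of the GLS objective for a single predictor, assuming the closed-form Rao GD between univariate Gaussians and a modeled standard deviation built from known error bars `sigma_x` and `sigma_y`. The names `rao_gd` and `gls_objective` and the toy data are illustrative, not taken from the cited paper.

```python
import math

def rao_gd(mu1, s1, mu2, s2):
    """Closed-form Rao geodesic distance between two univariate Gaussians."""
    d2 = (mu1 - mu2) ** 2
    delta = math.sqrt((d2 + 2.0 * (s1 - s2) ** 2) / (d2 + 2.0 * (s1 + s2) ** 2))
    return 2.0 * math.sqrt(2.0) * math.atanh(delta)

def gls_objective(params, x, y, sigma_x, sigma_y):
    """Sum of squared GDs for the model y = b0 + b1 * x.

    params = (b0, b1, sigma_obs); sigma_obs is estimated along with b0, b1.
    """
    b0, b1, sigma_obs = params
    # Modeled standard deviation: error on y plus propagated error on x (assumed form)
    sigma_mod = math.sqrt(sigma_y ** 2 + (b1 * sigma_x) ** 2)
    return sum(rao_gd(b0 + b1 * xi, sigma_mod, yi, sigma_obs) ** 2
               for xi, yi in zip(x, y))

# Toy data lying exactly on y = 1 + 2x (hypothetical values)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
sigma_x, sigma_y = 0.1, 0.2

sigma_match = math.sqrt(sigma_y ** 2 + (2.0 * sigma_x) ** 2)
at_truth = gls_objective((1.0, 2.0, sigma_match), x, y, sigma_x, sigma_y)
off_slope = gls_objective((1.0, 2.5, sigma_match), x, y, sigma_x, sigma_y)
```

The objective can be passed to any general-purpose minimizer over (β0, β1, σobs), for example `scipy.optimize.minimize`, with σobs kept positive.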

Engineering parameters:

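$$\tau_{E,\mathrm{th}} = \beta_0\, I_p^{\beta_I}\, B_t^{\beta_B}\, \bar{n}_e^{\beta_n}\, P_l^{\beta_P}\, R^{\beta_R}\, \kappa^{\beta_\kappa}\, \varepsilon^{\beta_\varepsilon}\, M^{\beta_M}$$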

Dimensionless variables:

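$$\omega_{ci}\,\tau_{E,\mathrm{th}} = \alpha_0\, \rho_*^{\alpha_\rho}\, \beta_t^{\alpha_\beta}\, \nu_*^{\alpha_\nu}\, q_{95}^{\alpha_q}\, \kappa^{\alpha_\kappa}\, \varepsilon^{\alpha_\varepsilon}\, M^{\alpha_M}$$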

ITPA global H-mode database: 1296 measurements from 9 tokamaks

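IPB98(y,2):

$$\tau_{E,\mathrm{th}} \propto I_p^{0.93}\, B_t^{0.15}\, \bar{n}_e^{0.41}\, P_l^{-0.69}\, R^{1.97}\, \kappa^{0.78}\, \varepsilon^{0.58}\, M_\mathrm{eff}^{0.19}$$

$$\omega_{ci}\,\tau_{E,\mathrm{th}} \propto \rho_*^{-2.70}\, \beta_t^{-0.90}\, \nu_*^{-0.01}\, q_{95}^{-3.0}\, \kappa^{3.3}\, \varepsilon^{0.73}\, M_\mathrm{eff}^{0.96}$$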

Global confinement scaling
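A power-law scaling becomes linear in the coefficients after taking logarithms, which is what the log-linear fits rely on. The sketch below evaluates a scaling of the engineering-parameter form both ways, using the IPB98 coefficients listed in the regression table of this talk. The input values are hypothetical placeholders, not parameters of any machine, and the function names are illustrative.

```python
import math

# Coefficients of the IPB98 row in the regression table (b0 is the prefactor)
IPB98_Y2 = {"b0": 0.056, "I": 0.93, "B": 0.15, "n": 0.41, "P": -0.69,
            "R": 1.97, "kappa": 0.78, "eps": 0.58, "M": 0.19}

def tau_power_law(coef, values):
    """Evaluate tau = b0 * product(value ** exponent)."""
    tau = coef["b0"]
    for name, value in values.items():
        tau *= value ** coef[name]
    return tau

def log_tau_linear(coef, values):
    """Same quantity in log-linear form: ln tau = ln b0 + sum(exponent * ln value)."""
    return math.log(coef["b0"]) + sum(coef[name] * math.log(value)
                                      for name, value in values.items())

# Hypothetical inputs; units must match those used when the scaling was fitted
values = {"I": 2.0, "B": 2.5, "n": 5.0, "P": 10.0,
          "R": 3.0, "kappa": 1.6, "eps": 0.3, "M": 2.0}

tau = tau_power_law(IPB98_Y2, values)
doubled_R = tau_power_law(IPB98_Y2, {**values, "R": 2.0 * values["R"]})
```

Doubling the major radius multiplies the predicted confinement time by 2^1.97, which shows how strongly the extrapolation to a larger device depends on the fitted exponent βR.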

ITER-relevance

Unconfirmed predictions

New predictor variables

Not robust:

Heterogeneous data

Outliers

Log-linear vs. nonlinear

Issues with IPB98

Proportional error bars

Unconstrained

100 bootstrap samples:

Average

95% confidence interval

Benchmarking:

Ordinary least squares (OLS)

Iteratively reweighted least squares (ROB)

Bayesian: uninformative priors, marginalized σ (BAY)

Kullback-Leibler least squares (KLD)

Geodesic least squares (GLS)

Methodology
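The bootstrap step of the methodology can be sketched as follows: resample the data with replacement 100 times, refit, and report the average and a 95% interval of the estimates. This is a minimal illustration using an OLS slope on synthetic data. The percentile interval and all names are assumptions made here; the slides do not state how the interval was computed.

```python
import random
import statistics

def ols_slope(x, y):
    """Closed-form OLS slope for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def bootstrap_slope(x, y, n_boot=100, seed=0):
    """Return (average slope, (lower, upper)) from n_boot resamples."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):
            continue  # skip degenerate resamples with no spread in x
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lower = slopes[int(0.025 * len(slopes))]
    upper = slopes[min(len(slopes) - 1, int(0.975 * len(slopes)))]
    return statistics.fmean(slopes), (lower, upper)

# Synthetic data: y = 1 + 2x plus unit Gaussian noise (hypothetical)
noise = random.Random(1)
x = [float(i) for i in range(50)]
y = [1.0 + 2.0 * xi + noise.gauss(0.0, 1.0) for xi in x]

mean_slope, (ci_low, ci_high) = bootstrap_slope(x, y)
```

The same loop applies to any of the benchmarked estimators by replacing `ols_slope` with the corresponding fit.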

Method    β0     βI    βB    βn    βP     βR    βκ    βε    βM     τ̂E,th (s)
IPB98     0.056  0.93  0.15  0.41  −0.69  1.97  0.78  0.58  0.19   4.9
OLS ll.   0.049  0.78  0.32  0.44  −0.67  2.24  0.39  0.58  0.18   4.3 ± 0.25
OLS nl.   0.058  0.67  0.50  0.47  −0.83  2.60  1.0   0.86  −0.26  3.5 ± 0.33
ROB       0.046  0.77  0.32  0.45  −0.66  2.26  0.33  0.57  0.24   4.4 ± 0.24
BAY       0.051  0.87  0.13  0.47  −0.67  2.13  0.17  0.49  0.23   4.3
KLD ll.   0.056  0.61  0.49  0.46  −0.81  2.53  0.93  1.0   0.18   3.2 ± 0.29
KLD nl.   0.053  0.60  0.49  0.49  −0.81  2.57  0.94  1.0   0.18   3.3 ± 0.37
GLS ll.   0.048  0.65  0.44  0.49  −0.76  2.52  0.63  0.87  0.27   4.0 ± 0.23
GLS nl.   0.047  0.65  0.44  0.50  −0.75  2.52  0.62  0.85  0.22   4.1 ± 0.25

ll. = log-linear, nl. = nonlinear

Regression results

J.G. Cordey et al., Nucl. Fusion 45, 1078, 2005

Robustness w.r.t. error bars

Interpretation on pseudosphere: JET data

Geodesic least squares regression: flexible and robust

Easy to use, fast optimization

Works for linear and nonlinear relations and any distribution model

Revisit established scaling laws, contribute to new regression analyses

Robust estimation of confinement scaling

Comparing probability distributions:

Quantification of stochasticity

Model validation

Conclusions
