Inverse regression approach to (robust) non-linear high-to-low dimensional mapping

Emeline Perthame

Joint work with Florence Forbes

INRIA, team MISTIS, Grenoble

LMNO, Caen

October 27, 2016

Outline

1. Non-linear mapping problem

2. GLLiM/SLLiM: inverse regression approach

3. Estimation of parameters

4. Results and conclusion

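A non-linear mapping problem

• A non-linear mapping problem

$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_D \end{bmatrix} \;\xrightarrow{\;g\;}\; \begin{bmatrix} x_1 \\ \vdots \\ x_L \end{bmatrix} = x$$

• Prediction of X from Y through a non-linear regression function g

$$E(X \mid Y = y) = g(y)$$

with $Y \in \mathbb{R}^D$, $X \in \mathbb{R}^L$, $D \gg L$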

• Application: Ω (Omega) mission on Mars → launch of a spectrometer around Mars

• Problem: Retrieving physical properties from hyperspectral images

− Y: spectrum (D=184)

− X: composition of the ground (L=3)

Mars Express - Omega (2004) [http://geops.geol.u-psud.fr/]

[Figure: reflectance versus wavelength (spectrum); legend: prop. of dust, prop. of CO2 ice, prop. of water ice]

Some approaches

• Difficulty: D large → curse of dimensionality

• Solutions: via dimensionality reduction

− Reduce dimension of y before regression: e.g. PCA on y

→ Risk: poor prediction of x

− Take x into account: PLS, SIR, Kernel SIR, PC-based methods

→ Two-step approaches, not expressed as a single optimization problem

→ Our approach: inverse regression to reduce dimension

Proposed Method: An inverse regression strategy

• $x \in \mathbb{R}^L$: low-dimensional space,

• $y \in \mathbb{R}^D$: high-dimensional space,

• $(y, x)$ are realizations of $(Y, X) \sim p(Y, X; \theta)$, with $\theta$ the parameters

Inverse conditional density: $p(Y \mid X; \theta)$

• Y is a noisy function of X

• Modeled via mixtures → Tractable estimation of $\theta$

Forward conditional density: $p(X \mid Y; \theta^*)$, with $\theta^* = f(\theta)$

→ High-to-low prediction, e.g. $\hat{x} = E[X \mid Y = y; \theta^*]$

Student Locally-linear Mapping (SLLiM)

A piecewise affine model:

• Introduce a missing variable Z → Z = k ⇔ Y is the image of X by an affine transformation

$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k X + b_k + E_k)$$

Definition of SLLiM

$$p(Y \mid X, Z = k; \theta) = \mathcal{S}(Y; A_k X + b_k, \Sigma_k, \alpha_k^y, \gamma_k^y)$$

• Affine transformations are local: mixture of K Student laws

$$p(X \mid Z = k; \theta) = \mathcal{S}(X; c_k, \Gamma_k, \alpha_k, 1)$$

$$p(Z = k; \theta) = \pi_k$$

• The set of all model parameters is:

$$\theta = \{\pi_k, c_k, \Gamma_k, A_k, b_k, \Sigma_k, \alpha_k,\; k = 1, \dots, K\}$$
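To make the piecewise affine construction concrete, here is a minimal Python sketch that draws (x, y) pairs from a model of this form. It assumes NumPy, uses made-up parameter values, and draws one Gamma weight per sample shared by x and y, which is an assumption about how the heavy tails are generated rather than something stated on this slide. The function name `sample_piecewise_affine` is hypothetical.

```python
import numpy as np

def sample_piecewise_affine(n, pi, c, Gamma, A, b, Sigma, alpha, rng):
    """Draw n pairs (x, y) from a K-component piecewise affine model.

    Heavy tails come from a Gamma weight u (shape alpha_k, rate 1) that
    scales the Gaussian covariances, one weight per sample (assumption).
    """
    K, L = c.shape
    D = b.shape[1]
    z = rng.choice(K, size=n, p=pi)              # component labels Z
    x = np.empty((n, L))
    y = np.empty((n, D))
    for i, k in enumerate(z):
        u = rng.gamma(shape=alpha[k], scale=1.0)
        x[i] = rng.multivariate_normal(c[k], Gamma[k] / u)
        # Y is the image of X by the k-th affine map, plus noise E_k
        y[i] = rng.multivariate_normal(A[k] @ x[i] + b[k], Sigma[k] / u)
    return x, y, z

rng = np.random.default_rng(0)
K, L, D = 2, 1, 3                                # illustrative sizes only
pi = np.array([0.4, 0.6])
c = np.array([[-2.0], [2.0]])
Gamma = np.stack([np.eye(L)] * K)
A = rng.normal(size=(K, D, L))
b = rng.normal(size=(K, D))
Sigma = np.stack([0.1 * np.eye(D)] * K)
alpha = np.array([3.0, 3.0])

x, y, z = sample_piecewise_affine(500, pi, c, Gamma, A, b, Sigma, alpha, rng)
print(x.shape, y.shape)
```

Each sample only ever uses one affine map, selected by its label z, which is what makes the overall x-to-y relation non-linear while each piece stays linear.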

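Why a Student mixture?

• Dealing with outliers → Generalized Student distribution for the joint density of (X, Y)

$$\mathcal{S}_M(y; \mu, \Sigma, \alpha, \gamma) = \frac{\Gamma(\alpha + M/2)}{|\Sigma|^{1/2}\,\Gamma(\alpha)\,(2\pi\gamma)^{M/2}} \left[1 + \frac{\delta(y, \mu, \Sigma)}{2\gamma}\right]^{-(\alpha + M/2)},$$

where $\delta(y, \mu, \Sigma) = (y - \mu)^\top \Sigma^{-1} (y - \mu)$ is the squared Mahalanobis distance.

• Gaussian scale mixture representation (using a weight variable U distributed according to a Gamma distribution)

$$\mathcal{S}_M(y; \mu, \Sigma, \alpha, \gamma) = \int_0^\infty \mathcal{N}_M(y; \mu, \Sigma/u)\, \mathcal{G}(u; \alpha, \gamma)\, du$$

• Parameter estimation is tractable with an EM algorithm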
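The equivalence between the closed-form density and the scale-mixture integral can be checked numerically. The sketch below does so in dimension M = 1 with the standard library only. It assumes the Gamma density is parameterized by shape α and rate γ (the reading under which the two formulas agree), and the parameter values are arbitrary.

```python
import math

def student_pdf(y, mu, sigma2, alpha, gamma):
    """Closed-form generalized Student density in dimension M = 1."""
    delta = (y - mu) ** 2 / sigma2               # squared Mahalanobis distance
    log_norm = (math.lgamma(alpha + 0.5) - math.lgamma(alpha)
                - 0.5 * math.log(sigma2) - 0.5 * math.log(2 * math.pi * gamma))
    return math.exp(log_norm - (alpha + 0.5) * math.log1p(delta / (2 * gamma)))

def gaussian_pdf(y, mu, var):
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gamma_pdf(u, alpha, gamma):
    """Gamma density with shape alpha and rate gamma (assumed convention)."""
    return math.exp(alpha * math.log(gamma) - math.lgamma(alpha)
                    + (alpha - 1) * math.log(u) - gamma * u)

def scale_mixture_pdf(y, mu, sigma2, alpha, gamma, u_max=60.0, steps=100_000):
    """Midpoint-rule approximation of the integral over the weight u.

    u_max must be large compared with 1/gamma for the truncation to be harmless.
    """
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += gaussian_pdf(y, mu, sigma2 / u) * gamma_pdf(u, alpha, gamma)
    return total * h

for y in (-3.0, 0.0, 1.5):
    exact = student_pdf(y, 0.5, 2.0, 2.0, 1.0)
    approx = scale_mixture_pdf(y, 0.5, 2.0, 2.0, 1.0)
    print(f"y={y:+.1f}  closed form={exact:.6f}  integral={approx:.6f}")
```

The two columns should agree to several decimals. Small values of u correspond to inflated variances, which is how the mixture accommodates outliers.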

[Figure: densities of a Gaussian and a Student distribution (α = 0.1); x-axis: x, y-axis: Density]

Low-to-high (Inverse) Regression

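• If X and Y are both observed

− The parameter vector $\theta$ can be estimated in closed form using an EM inference procedure

− This yields the inverse conditional density, which is a Student mixture:

$$p(Y \mid X; \theta) = \sum_{k=1}^{K} \frac{\pi_k\, \mathcal{S}(X; c_k, \Gamma_k, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j\, \mathcal{S}(X; c_j, \Gamma_j, \alpha_j, 1)}\; \mathcal{S}(Y; A_k X + b_k, \Sigma_k, \alpha_k^y, \gamma_k^y)$$

• Both the inverse and forward conditional densities are Student mixtures parameterized by $\theta$, which gives:

− A low-to-high inverse regression function:

$$E[Y \mid X = x; \theta] = \sum_{k=1}^{K} \frac{\pi_k\, \mathcal{S}(x; c_k, \Gamma_k, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j\, \mathcal{S}(x; c_j, \Gamma_j, \alpha_j, 1)}\; (A_k x + b_k).$$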

High-to-low (Forward) Regression

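• The forward conditional density is a Student mixture as well:

$$p(X \mid Y; \theta^*) = \sum_{k=1}^{K} \frac{\pi_k^*\, \mathcal{S}(Y; c_k^*, \Gamma_k^*, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j^*\, \mathcal{S}(Y; c_j^*, \Gamma_j^*, \alpha_j, 1)}\; \mathcal{S}(X; A_k^* Y + b_k^*, \Sigma_k^*, \alpha_k^x, \gamma_k^x)$$

• The forward parameter vector $\theta^*$ has an analytic expression as a function of $\theta$

• Both the inverse and forward conditional densities are Student mixtures parameterized by $\theta$, which gives:

− A high-to-low forward regression function:

$$E[X \mid Y = y; \theta^*] = \sum_{k=1}^{K} \frac{\pi_k^*\, \mathcal{S}(y; c_k^*, \Gamma_k^*, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j^*\, \mathcal{S}(y; c_j^*, \Gamma_j^*, \alpha_j, 1)}\; (A_k^* y + b_k^*).$$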
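The forward regression function is a weighted sum of K affine predictions, with weights given by normalized Student densities evaluated at y. A minimal NumPy sketch follows. The helper names `log_student` and `forward_predict` and all numerical values are hypothetical, and the forward parameters are supplied directly here rather than derived from θ.

```python
import math
import numpy as np

def log_student(y, mu, Sigma, alpha, gamma):
    """Log of the generalized Student density S_M(y; mu, Sigma, alpha, gamma)."""
    M = y.shape[0]
    diff = y - mu
    delta = diff @ np.linalg.solve(Sigma, diff)   # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(Sigma)
    return (math.lgamma(alpha + M / 2) - math.lgamma(alpha) - 0.5 * logdet
            - (M / 2) * math.log(2 * math.pi * gamma)
            - (alpha + M / 2) * math.log1p(delta / (2 * gamma)))

def forward_predict(y, pi_s, c_s, Gamma_s, alpha, A_s, b_s):
    """E[X | Y = y] as a weighted sum of the K affine maps A*_k y + b*_k."""
    K = len(pi_s)
    logw = np.array([math.log(pi_s[k])
                     + log_student(y, c_s[k], Gamma_s[k], alpha[k], 1.0)
                     for k in range(K)])
    w = np.exp(logw - logw.max())                 # numerically stable normalization
    w /= w.sum()
    experts = np.stack([A_s[k] @ y + b_s[k] for k in range(K)])
    return w @ experts, w

# Toy forward parameters (D = 2, L = 1, K = 2), chosen for illustration only.
pi_s = np.array([0.5, 0.5])
c_s = np.array([[0.0, 0.0], [5.0, 5.0]])
Gamma_s = np.stack([np.eye(2), np.eye(2)])
alpha = np.array([2.0, 2.0])
A_s = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
b_s = np.array([[0.0], [10.0]])

y_obs = np.array([0.1, -0.2])
x_hat, w = forward_predict(y_obs, pi_s, c_s, Gamma_s, alpha, A_s, b_s)
print(x_hat, w)
```

Working in log space before normalizing avoids underflow when D is large and the densities become extremely small.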

The forward parameter vector θ∗ from θ

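$$c_k^* = A_k c_k + b_k,$$

$$\Gamma_k^* = \Sigma_k + A_k \Gamma_k A_k^\top,$$

$$A_k^* = \Sigma_k^* A_k^\top \Sigma_k^{-1},$$

$$b_k^* = \Sigma_k^* \left(\Gamma_k^{-1} c_k - A_k^\top \Sigma_k^{-1} b_k\right),$$

$$\Sigma_k^* = \left(\Gamma_k^{-1} + A_k^\top \Sigma_k^{-1} A_k\right)^{-1}.$$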
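These five formulas translate directly into a few lines of linear algebra. The sketch below, assuming NumPy and random illustrative parameters for a single component k, computes the forward parameters and checks them against two identities that follow from Gaussian conditioning. The function name `forward_params` is hypothetical.

```python
import numpy as np

def forward_params(c, Gamma, A, b, Sigma):
    """Map the inverse parameters of one component to its forward parameters."""
    Sigma_inv = np.linalg.inv(Sigma)
    Gamma_inv = np.linalg.inv(Gamma)
    c_star = A @ c + b
    Gamma_star = Sigma + A @ Gamma @ A.T
    Sigma_star = np.linalg.inv(Gamma_inv + A.T @ Sigma_inv @ A)
    A_star = Sigma_star @ A.T @ Sigma_inv
    b_star = Sigma_star @ (Gamma_inv @ c - A.T @ Sigma_inv @ b)
    return c_star, Gamma_star, A_star, b_star, Sigma_star

rng = np.random.default_rng(1)
L, D = 2, 4                                       # illustrative sizes only
A = rng.normal(size=(D, L))
b = rng.normal(size=D)
c = rng.normal(size=L)
Gamma = 0.5 * np.eye(L)
Sigma = np.diag(rng.uniform(0.1, 1.0, size=D))

c_s, Gamma_s, A_s, b_s, Sigma_s = forward_params(c, Gamma, A, b, Sigma)

# Check 1: A* equals Gamma A^T (Gamma*)^{-1}, the usual conditioning formula.
print(np.allclose(A_s, Gamma @ A.T @ np.linalg.inv(Gamma_s)))
# Check 2: the forward map sends the mean of Y back to the mean of X.
print(np.allclose(A_s @ c_s + b_s, c))
```

Only the L x L matrix inside `Sigma_star` depends on both Γ and Σ. When Σ is diagonal its inverse is trivial, which keeps the mapping cheap even for large D.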

A joint model approach to reduce the number of parameters

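• Joint model

$$p(X = x, Y = y \mid Z = k) = \mathcal{S}_{L+D}\left(\begin{bmatrix} x \\ y \end{bmatrix}; m_k, V_k, \alpha_k, 1\right)$$

with

$$m_k = \begin{bmatrix} c_k \\ A_k c_k + b_k \end{bmatrix} \quad \text{and} \quad V_k = \begin{bmatrix} \Gamma_k & \Gamma_k A_k^\top \\ A_k \Gamma_k & \Sigma_k + A_k \Gamma_k A_k^\top \end{bmatrix}$$

• Reduce the number of parameters to estimate

− Forward strategy + $\Gamma_k$ diagonal

∗ nb. par. = $\tfrac{1}{2} D(D - 1) + DL + 2L + D$

∗ D = 500, L = 2 → 126 254 parameters

− Inverse strategy + $\Sigma_k$ diagonal

∗ nb. par. = $\tfrac{1}{2} L(L - 1) + DL + 2D + L$

∗ D = 500, L = 2 → 2 003 parameters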
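The two counts can be reproduced with a few lines of arithmetic. This sketch simply evaluates the two formulas from the slide. The function names are hypothetical.

```python
def forward_param_count(D, L):
    """Forward strategy with diagonal Gamma_k: D(D-1)/2 + DL + 2L + D."""
    return D * (D - 1) // 2 + D * L + 2 * L + D

def inverse_param_count(D, L):
    """Inverse strategy with diagonal Sigma_k: L(L-1)/2 + DL + 2D + L."""
    return L * (L - 1) // 2 + D * L + 2 * D + L

D, L = 500, 2
print(forward_param_count(D, L))   # count for the forward strategy
print(inverse_param_count(D, L))   # count for the inverse strategy
```

With D = 500 and L = 2 this gives 126 254 and 2 003, matching the figures above. The gap comes from the quadratic term, which scales with D² in the forward strategy but only with L² in the inverse one.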

Extension to partially observed responses

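• Incorporate a latent component into the low-dimensional variable:

$$X = \begin{bmatrix} T \\ W \end{bmatrix}$$

where $T \in \mathbb{R}^{L_t}$ is observed and $W \in \mathbb{R}^{L_w}$ is latent ($L = L_t + L_w$)

• Example on Mars data: lighting? temperature? grain size?

• Observed pairs $(y_n, T_n)$, $n = 1, \dots, N$ ($T \in \mathbb{R}^{L_t}$)

• Additional latent variable W ($W \in \mathbb{R}^{L_w}$)

• Assuming the independence of T and W given Z:

$$p(X = (T, W)^\top \mid Z = k) = \mathcal{S}_L((T, W)^\top; c_k, \Gamma_k, \alpha_k, 1)$$

with

$$c_k = \begin{bmatrix} c_k^t \\ 0 \end{bmatrix}, \quad \Gamma_k = \begin{bmatrix} \Gamma_k^t & 0 \\ 0 & I_{L_w} \end{bmatrix}$$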

• Extension of SLLiM to more general covariance structure

• With $A_k = \begin{bmatrix} A_k^t & A_k^w \end{bmatrix}$,

$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k^t T + A_k^w W + b_k + E_k)$$

rewrites

$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k^t T + b_k + E'_k)$$

with $\mathrm{Var}(E'_k) \propto \Sigma_k + A_k^w A_k^{w\top}$

− Diagonal $\Sigma_k$ → Factor analysis with $L_w$ factors (at most)

− A compromise between full $O(D^2)$ and diagonal $O(D)$ covariances
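As a rough illustration of this compromise, the sketch below counts free covariance parameters per component for the three structures. The function name is hypothetical, the "diagonal + factors" count is a simple upper bound that ignores rotational redundancy in the factor loadings, and D = 184, Lw = 2 are illustrative values.

```python
def covariance_param_counts(D, Lw):
    """Free parameters of a D x D noise covariance under three structures."""
    return {
        "full": D * (D + 1) // 2,             # symmetric matrix, O(D^2)
        "diagonal": D,                        # O(D)
        "diagonal + Lw factors": D + D * Lw,  # Sigma_k diagonal plus A_k^w (D x Lw)
    }

counts = covariance_param_counts(184, 2)
for name, value in counts.items():
    print(f"{name}: {value}")
```

The latent dimension Lw therefore acts as a dial between the diagonal and the full covariance models.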

Estimation of $\theta = (c_k, \Gamma_k, A_k, b_k, \Sigma_k, \pi_k, \alpha_k)_{1 \le k \le K}$ by EM algorithm

• E-step

− Update posterior probabilities

(EZ) $p(Z = k \mid t, y, \theta^{(i)})$ → "SMM-like"

(EW) $p(W \mid Z = k, t, y, \theta^{(i)})$ → Probabilistic PCA or Factor Analysis like

(EU) $E(U \mid Z = k, t, y, \theta^{(i)})$ → Down-weighting extreme/atypical values in estimators → More robust

• M-step

(MX) $(\pi_k, c_k, \Gamma_k)$ → "SMM-like"

(MY|X) $(A_k, b_k, \Sigma_k)$ → Hybrid between linear regression and PPCA/FA

$$A_k = Y_k X_k^\top \left( \begin{bmatrix} 0 & 0 \\ 0 & S_k^w \end{bmatrix} + X_k X_k^\top \right)^{-1}$$

(Mα) $\alpha_k$ → Not in closed form but standard (specific to Student)
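The down-weighting effect of the (EU) step can be illustrated in one dimension. Under the Gamma scale-mixture representation of the Student density, the weight U given an observation is again Gamma distributed, with mean (α + M/2) / (γ + δ/2), so points far from the center receive small weights. The sketch below is a simplified stand-in that estimates only a location with fixed scale. It is not the SLLiM update itself, whose exact expressions are not given on these slides, and the data are made up.

```python
def expected_weight(delta, alpha, gamma, M):
    """Posterior mean of U for a point at squared Mahalanobis distance delta.

    U | y ~ Gamma(alpha + M/2, rate = gamma + delta/2), hence the ratio below.
    """
    return (alpha + M / 2) / (gamma + delta / 2)

def robust_mean(values, alpha=1.0, gamma=1.0, sigma2=1.0, iters=50):
    """1-D location estimate alternating weight and weighted-mean updates."""
    mu = sum(values) / len(values)
    for _ in range(iters):
        # E-step analogue: small weights for points far from the current mean
        w = [expected_weight((v - mu) ** 2 / sigma2, alpha, gamma, 1)
             for v in values]
        # M-step analogue: weighted mean
        mu = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
    return mu

data = [0.9, 1.1, 1.0, 0.8, 1.2, 25.0]   # made-up sample with one outlier
plain = sum(data) / len(data)
robust = robust_mean(data)
print(plain, robust)
```

The plain average is pulled toward the outlier, while the weighted estimate stays close to the bulk of the data. The same mechanism operates in the regression estimators of the M-step.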

Application L = D = 1

• RATP → Subway in Paris

• Measure of air quality at Chatelet station, line 4

• March 2015 → N = 341 measurements

• Prediction of NO (L=1) from NO2 (D=1)

→ Robustness of SLLiM

[Figure: scatter plot of NO (y-axis) against NO2 (x-axis)]

Application L = D = 1 / SLLiM compared to GLLiM

[Figure: two panels plotting NO (y-axis) against NO2 (x-axis); legend: GLLiM, SLLiM]

→ Illustration of robustness of the proposed model

[Figure: NRMSE as a function of K (1 to 10); legend: GLLiM, SLLiM, GLLiM-WO, SLLiM-WO]

→ SLLiM achieves better prediction rates than GLLiM on complete data

→ SLLiM becomes equivalent to GLLiM when outliers are removed
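The slides do not define NRMSE. The sketch below assumes one common definition, the root of the squared prediction error divided by the squared deviation of the targets from their mean, so that always predicting the mean scores 1. This definition and the toy numbers are assumptions, not taken from the talk.

```python
import math

def nrmse(y_true, y_pred):
    """Normalized RMSE: 0 for perfect predictions, 1 for predicting the mean."""
    mean = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean) ** 2 for t in y_true)
    return math.sqrt(num / den)

targets = [10.0, 20.0, 30.0, 40.0]       # made-up values
predictions = [12.0, 18.0, 33.0, 39.0]   # made-up values
print(nrmse(targets, predictions))
```

Under this convention, values below 1 indicate a gain over the trivial mean predictor.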

Other applications and augmented version of SLLiM

• Application when D ≫ L

− Hyperspectral data on Mars (D=184, L=2, N=6983)

→ Comparison with other non-linear regression methods

Table: Mars data: average NRMSE, with standard deviations in parentheses, for proportions of CO2 ice and dust over 100 runs.

Method          Prop. of CO2 ice    Prop. of dust
SLLiM (K=10)    0.168 (0.019)       0.145 (0.020)
GLLiM (K=10)    0.180 (0.023)       0.155 (0.023)
MARS            0.173 (0.016)       0.160 (0.021)
SIR             0.243 (0.025)       0.157 (0.016)
RVM             0.299 (0.021)       0.275 (0.034)

Results - Application to hyperspectral image analysis

[Figure: image grid; columns: GLLiM, SLLiM, Splines; rows: proportion of CO2 ice, proportion of dust]

Conclusion and future work

• Mixture model used for prediction

• Addition of latent variables for partially observed responses

• Selection of K and Lw

− K fixed? Or selected by BIC?

− Lw selected by BIC?

Thank you for your attention! Any questions?
