We propose a regularized method for multivariate linear regression when the number of predictors may exceed the sample size. This method is designed to strengthen the estimation and the selection of the relevant input features with three ingredients: it takes advantage of the dependency pattern between the responses by estimating the residual covariance; it performs selection on direct links between predictors and responses; and selection is driven by prior structural information. To this end, we build on a recent reformulation of the multivariate linear regression model to a conditional Gaussian graphical model and propose a new regularization scheme accompanied with an efficient optimization procedure. On top of showing very competitive performance on artificial and real data sets, our method demonstrates capabilities for fine interpretation of its parameters, as illustrated in applications to genetics, genomics and spectroscopy.
Structured Regularization for conditional Gaussian Graphical Models

Julien Chiquet, Stéphane Robin, Tristan Mary-Huard

MAP5 – April 4th, 2014

arXiv preprint: http://arxiv.org/abs/1403.6168
Application to multi-trait genomic selection (MLCB 2013 NIPS Workshop)
R package spring: https://r-forge.r-project.org/projects/spring-pkg/
Multivariate regression analysis
Consider n samples and, for individual i, let
- $y_i$ be the q-dimensional vector of responses,
- $x_i$ be the p-dimensional vector of predictors,
- B be the p × q matrix of regression coefficients,
- $\varepsilon_i$ be a noise term with q-dimensional covariance matrix R:
$$y_i = B^\top x_i + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, R), \quad \forall i = 1, \dots, n.$$

Matrix notation
Let Y (n × q) and X (n × p) be the data matrices; then
$$Y = XB + \varepsilon, \quad \mathrm{vec}(\varepsilon) \sim \mathcal{N}(0, R \otimes I_n).$$
Remark
If X is a design matrix, this is called the “General Linear Model” (GLM).
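A minimal simulation sketch in base R (an illustration only, not part of the spring package): draw from the model above, using a Cholesky factor so that each noise row has covariance R.

```r
set.seed(0)
n <- 100; p <- 6; q <- 3
X <- matrix(rnorm(n * p), n, p)
B <- matrix(rnorm(p * q), p, q)
R <- 0.5^abs(outer(1:q, 1:q, "-"))           # a Toeplitz covariance, as used later
E <- matrix(rnorm(n * q), n, q) %*% chol(R)  # each row of E ~ N(0, R)
Y <- X %*% B + E
```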
Motivating example: cookie dough data
Osborne, B.G., Fearn, T., Miller, A.R., and Douglas, S. Application of near infrared reflectance spectroscopy to compositional analysis of biscuits and biscuit doughs. J. Sci. Food Agr., 1984.
[Figure: distributions of the four responses (fat, sucrose, dry flour, water) and reflectance spectra of the predictors along the wavelength axis.]

- q = 4 responses related to the composition of biscuit dough.
- p = 256 wavelengths equally sampled between 1380nm and 2400nm.
- n = 70 biscuit dough samples.
From Low to High dimensional setup

Low dimensional setup
Mardia, Kent and Bibby, Multivariate Analysis, Academic Press, 1979.
- The mathematics is the same for both the GLM and MLR.
- Maximum likelihood, least squares and generalized least squares all lead to the estimator
$$\hat B = (X^\top X)^{-1} X^\top Y,$$
which is not defined when n < p.

High dimensional setup: regularization is a popular answer
Bias B towards a given feasible set to enhance both prediction performance and interpretability.
- What features are required for the coefficients? (sparsity, and...)
- How do we shape this feasible set?

Our proposal: SPRING (Structured selection of Primordial Relationships IN the General linear model)
1. Account for the dependency structure between the outputs, if it exists, by estimating R.
2. Pay attention to the direct links between predictors and responses by means of a sparse GGM.
3. Integrate prior information about the predictors by means of graph-regularization.
Outline
Statistical Model
Regularizing Scheme and Optimization
Inference and Optimization
Simulation Studies
Spectroscopy and the cookie dough data
Multi-trait genomic selection for a biparental population (Colza)
Connection between multivariate regression and GGM (I)
Multivariate Linear Regression (MLR)
The model writes $y_i \mid x_i \sim \mathcal{N}(B^\top x_i, R)$, with negative log-likelihood
$$-\log L(B, R) = \frac{n}{2}\log|R| + \frac{1}{2}\mathrm{tr}\big((Y - XB)\,R^{-1}(Y - XB)^\top\big) + \text{cst},$$
which is only bi-convex in (B, R).
Connection between multivariate regression and GGM (II)
Used in Sohn & Kim (2012) and others.

Assume that $x_i, y_i$ are centered and jointly Gaussian, such that
$$\begin{pmatrix} x_i \\ y_i \end{pmatrix} \sim \mathcal{N}(0, \Sigma), \quad \text{with } \Sigma = \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix}, \quad \Omega \triangleq \Sigma^{-1} = \begin{pmatrix} \Omega_{xx} & \Omega_{xy} \\ \Omega_{yx} & \Omega_{yy} \end{pmatrix}.$$

A convex log-likelihood
The model writes $y_i \mid x_i \sim \mathcal{N}\big(-\Omega_{yy}^{-1}\Omega_{yx}x_i,\; \Omega_{yy}^{-1}\big)$ and
$$-\frac{2}{n}\log L(\Omega_{xy}, \Omega_{yy}) = -\log|\Omega_{yy}| + \mathrm{tr}(S_{yy}\Omega_{yy}) + 2\,\mathrm{tr}(S_{xy}\Omega_{yx}) + \mathrm{tr}\big(\Omega_{yx}S_{xx}\Omega_{xy}\Omega_{yy}^{-1}\big) + \text{cst}$$
(with $S_{xx} = X^\top X / n$ and so on).
CGGM: interpretation (I)
Matrix Ω is related to partial correlations (direct links):
$$\mathrm{cor}(X_i, X_j \mid \text{rest}) = \frac{-\Omega_{ij}}{\sqrt{\Omega_{ii}\Omega_{jj}}}, \quad \text{so } \Omega_{ij} = 0 \Leftrightarrow X_i \perp\!\!\!\perp X_j \mid \text{rest}.$$

Linking parameters of MLR to cGGM
The cGGM "splits" the regression coefficients into two parts:
$$B = -\Omega_{xy}\Omega_{yy}^{-1}, \qquad R = \Omega_{yy}^{-1}.$$
1. $\Omega_{xy}$ describes the direct links between predictors and responses.
2. $\Omega_{yy}$ is the inverse of the residual covariance R.

B entails both direct and indirect links, the latter due to correlation between responses.
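As a sanity check of this partial-correlation reading, here is a small base-R illustration on simulated data (not from the talk): the entry of the inverse covariance reproduces the correlation of residuals after regressing out the remaining variables.

```r
set.seed(6)
Z <- matrix(rnorm(500 * 4), 500, 4)
Z[, 2] <- Z[, 1] + 0.5 * Z[, 2]              # induce some dependence
Omega <- solve(cov(Z))                       # empirical precision matrix
pc_12 <- -Omega[1, 2] / sqrt(Omega[1, 1] * Omega[2, 2])
r1 <- resid(lm(Z[, 1] ~ Z[, 3] + Z[, 4]))    # regress out variables 3 and 4
r2 <- resid(lm(Z[, 2] ~ Z[, 3] + Z[, 4]))
all.equal(pc_12, cor(r1, r2))                # identical up to numerical error
```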
CGGM: interpretation (II)
Illustrative examples

Settings (shown with and without structure along the predictors):
- $\Omega_{xy}$: p = 40 predictors, q = 5 outcomes.
- R: Toeplitz scheme $R_{ij} = \tau^{|i-j|}$, with $\tau \in \{0.1, 0.5, 0.9\}$ (low, medium, high).
- $B = -\Omega_{xy} R$.

Direct relationships are masked in B in case of strong correlations between the responses: the higher τ, the less the support of $\Omega_{xy}$ can be read off B (a numerical sketch follows the remarks below).

[Figure: heatmaps of $\Omega_{xy}$, R and B for the three levels of τ, with and without structure along the predictors.]

Consequence
Our regularization scheme will be applied on the direct links $\Omega_{xy}$.

Remarks
- Sparsity on $\Omega_{xy}$ does not necessarily induce sparsity on B.
- The prior structure on the predictors is identical on $\Omega_{xy}$ and B, as it applies on the "rows".
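Below is a minimal numerical sketch of this masking effect in base R (made-up values, not the spring package):

```r
# How correlation between responses masks the direct links of Omega_xy
# inside B = -Omega_xy %*% R.
set.seed(1)
p <- 40; q <- 5
Omega_xy <- matrix(0, p, q)                   # sparse direct links
nz <- sample(p * q, 25)
Omega_xy[nz] <- sample(c(-1, 1), 25, replace = TRUE)
make_R <- function(tau, q) tau^abs(outer(1:q, 1:q, "-"))  # Toeplitz R
for (tau in c(0.1, 0.5, 0.9)) {
  B <- -Omega_xy %*% make_R(tau, q)
  # With tau = 0.1, B is almost as sparse as Omega_xy; with tau = 0.9,
  # many more entries of B move away from zero although Omega_xy is unchanged.
  cat("tau =", tau, "- share of |B| > 0.05:", mean(abs(B) > 0.05), "\n")
}
```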
Ball crafting towards structured regularization (1)
Elastic-Net
Grouping effect that catches highly correlated predictors simultaneously:
$$\underset{\beta \in \mathbb{R}^p}{\mathrm{minimize}} \;\; \|X\beta - y\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\|\beta\|_2^2.$$

Zou, H. and Hastie, T. Regularization and variable selection via the elastic net. JRSS B, 2005.
Ball crafting towards structured regularization (2)

Fused-Lasso
Encourages sparsity and identical consecutive parameters:
$$\underset{\beta \in \mathbb{R}^p}{\mathrm{minimize}} \;\; \|X\beta - y\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\sum_{j=1}^{p-1}|\beta_{j+1} - \beta_j|,$$
or equivalently, with the (p−1) × p first-difference matrix D,
$$\underset{\beta \in \mathbb{R}^p}{\mathrm{minimize}} \;\; \|X\beta - y\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\|D\beta\|_1, \qquad D = \begin{pmatrix} -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{pmatrix}.$$

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. Sparsity and smoothness via the fused lasso. JRSS B, 2005.
Ball crafting towards structured regularization (3)

Structured/Generalized Elastic-Net
A "smooth" version of the fused-Lasso (neighbors should be close, not identical):
$$\underset{\beta \in \mathbb{R}^p}{\mathrm{minimize}} \;\; \|X\beta - y\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\sum_{j=1}^{p-1}(\beta_{j+1} - \beta_j)^2,$$
where the quadratic term equals $\lambda_2\,\beta^\top D^\top D\,\beta$, with
$$L = D^\top D = \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}.$$

Slawski, zu Castell and Tutz. Feature selection guided by structural information. Ann. Appl. Stat., 2010.
Hebiri and van de Geer. The smooth-lasso and other ℓ1 + ℓ2 penalized methods. EJS, 2011.
Generalized fused penalty: the univariate case

Graphical interpretation of the fusion penalty
$$\underbrace{\sum_{j=1}^{p-1}|\beta_{j+1} - \beta_j|}_{\text{Fused Lasso}} \qquad\qquad \underbrace{\sum_{j=1}^{p-1}(\beta_{j+1} - \beta_j)^2}_{\text{Generalized ridge}}$$
Both correspond to a chain graph between the successive (ordered) predictors.

Generalization via a graphical argument
Let $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{W})$ be a graph with weighted edges. Then
$$\sum_{(i,j)\in\mathcal{E}} w_{ij}\,|\beta_j - \beta_i| = \|D\beta\|_1, \qquad \sum_{(i,j)\in\mathcal{E}} w_{ij}\,(\beta_j - \beta_i)^2 = \beta^\top L\,\beta,$$
where $L = D^\top D \succeq 0$ is the graph Laplacian associated to $\mathcal{G}$.
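A quick base-R check of this graphical correspondence on the chain graph (a sketch; D and L built by hand, unweighted edges):

```r
# Chain graph over p ordered predictors: incidence matrix D, Laplacian L.
p <- 6
D <- matrix(0, p - 1, p)
for (j in 1:(p - 1)) { D[j, j] <- -1; D[j, j + 1] <- 1 }
L <- t(D) %*% D                       # tridiagonal: 1, 2, ..., 2, 1 diagonal
beta <- rnorm(p)
fused <- sum(abs(diff(beta)))         # sum_j |beta_{j+1} - beta_j|
stopifnot(all.equal(fused, sum(abs(D %*% beta))))
ridge <- sum(diff(beta)^2)            # sum_j (beta_{j+1} - beta_j)^2
stopifnot(all.equal(ridge, drop(t(beta) %*% L %*% beta)))
```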
Adapting this scheme to multivariate settings

Bayesian interpretation
Suppose the prior structure is encoded in a matrix L.
- Univariate case: the conjugate prior for β is $\mathcal{N}(0, L^{-1})$.
- Multivariate case: combine with the covariance, then $\mathrm{vec}(B) \sim \mathcal{N}(0, R \otimes L^{-1})$.
- Using vec and ⊗ properties, we have for the direct links $\mathrm{vec}(\Omega_{xy}) \sim \mathcal{N}(0, R^{-1} \otimes L^{-1})$.

Corresponding regularization term
$$\log P(\Omega_{xy} \mid L, R) = -\frac{1}{2}\,\mathrm{tr}\big(\Omega_{xy}^\top L\,\Omega_{xy} R\big) + \text{cst}.$$
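The trace form of the prior follows from the identity $\mathrm{vec}(\Omega)^\top (R \otimes L)\,\mathrm{vec}(\Omega) = \mathrm{tr}(\Omega^\top L\,\Omega R)$, which can be checked numerically (base R, random matrices):

```r
# Numerical check of the Kronecker/trace identity behind the penalty.
set.seed(2)
p <- 4; q <- 3
L <- crossprod(matrix(rnorm(p * p), p))   # positive definite "structure"
R <- crossprod(matrix(rnorm(q * q), q))   # positive definite "covariance"
O <- matrix(rnorm(p * q), p, q)           # stands for Omega_xy
lhs <- drop(t(c(O)) %*% (R %x% L) %*% c(O))   # vec form
rhs <- sum(diag(t(O) %*% L %*% O %*% R))      # trace form
stopifnot(all.equal(lhs, rhs))
```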
Optimization problem
Penalized criterion
Encourage sparsity with a structuring prior on the direct links:
$$J(\Omega_{xy}, \Omega_{yy}) = -\frac{1}{n}\log L(\Omega_{xy}, \Omega_{yy}) + \frac{\lambda_2}{2}\,\mathrm{tr}\big(\Omega_{yx} L\,\Omega_{xy}\Omega_{yy}^{-1}\big) + \lambda_1\|\Omega_{xy}\|_1.$$

Proposition
The objective function is jointly convex in $(\Omega_{xy}, \Omega_{yy})$ and admits at least one global minimum, which is unique when n ≥ q and $(\lambda_2 L + S_{xx})$ is positive definite.
Algorithm
Alternate optimization
$$\Omega_{yy}^{(k+1)} = \underset{\Omega_{yy} \succ 0}{\arg\min}\; J_{\lambda_1\lambda_2}\big(\Omega_{xy}^{(k)}, \Omega_{yy}\big), \qquad (1a)$$
$$\Omega_{xy}^{(k+1)} = \underset{\Omega_{xy}}{\arg\min}\; J_{\lambda_1\lambda_2}\big(\Omega_{xy}, \Omega_{yy}^{(k+1)}\big). \qquad (1b)$$

- (1a) boils down to the diagonalization of a q × q matrix: O(q³).
- (1b) can be recast as a generalized Elastic-Net problem of size pq: O(npqk), where k is the final number of nonzero entries in $\Omega_{xy}$.

Convergence
Despite the nonsmoothness of the objective, the ℓ1 penalty is separable in $(\Omega_{xy}, \Omega_{yy})$, so the results of Tseng (2001, 2009) on the convergence of coordinate descent apply.
First block: covariance estimation
Analytic resolution for R̂

If $\Omega_{xy} = 0$, then $\hat\Omega_{yy} = S_{yy}^{-1}$. Otherwise we rely on the following:

Proposition
Let n > q. Assume that the following eigen decomposition holds,
$$\Omega_{yx}\Sigma_{xx}^{\lambda_2}\Omega_{xy}\,S_{yy} = U\,\mathrm{diag}(\zeta)\,U^{-1}$$
(where $\Sigma_{xx}^{\lambda_2} = S_{xx} + \lambda_2 L$ collects the quadratic terms of the criterion), and denote by $\eta = (\eta_1, \dots, \eta_q)$ the roots of $\eta_j^2 - \eta_j - \zeta_j = 0$. Then
$$\hat\Omega_{yy} = U\,\mathrm{diag}(\eta/\zeta)\,U^{-1}\,\Omega_{yx}\Sigma_{xx}^{\lambda_2}\Omega_{xy} \;(=\hat R^{-1}), \qquad (2a)$$
$$\hat\Omega_{yy}^{-1} = S_{yy}\,U\,\mathrm{diag}(\eta^{-1})\,U^{-1} \;(=\hat R). \qquad (2b)$$

Proof. Differentiation of the objective, the commuting-matrices property, algebra.
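The proposition can be probed numerically: differentiating the objective in $\Omega_{yy}$ gives the first-order condition $R + RMR = S_{yy}$ with $M = \Omega_{yx}\Sigma_{xx}^{\lambda_2}\Omega_{xy}$, and the candidate of (2b) satisfies it. A base-R sketch with random inputs (taking the positive roots; not the spring implementation):

```r
# Check that Rhat from eq. (2b) solves R + R M R = Syy.
set.seed(3)
q <- 4; p <- 10; n <- 50
Syy <- crossprod(matrix(rnorm(n * q), n, q)) / n
Oxy <- matrix(rnorm(p * q), p, q)
Sig <- crossprod(matrix(rnorm(n * p), n, p)) / n   # plays Sigma_xx^lambda2
M   <- t(Oxy) %*% Sig %*% Oxy
ev  <- eigen(M %*% Syy)                            # U diag(zeta) U^{-1}
U   <- Re(ev$vectors); zeta <- Re(ev$values)
eta <- (1 + sqrt(1 + 4 * zeta)) / 2                # positive roots of eta^2 - eta - zeta
Rhat <- Syy %*% U %*% diag(1 / eta) %*% solve(U)   # eq. (2b)
stopifnot(all.equal(Rhat + Rhat %*% M %*% Rhat, Syy, tolerance = 1e-8))
```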
Second block: parameter estimation
Reformulation as an Elastic-Net problem

Proposition
The solution $\hat\Omega_{xy}$ for a fixed $\Omega_{yy}$ is given by $\mathrm{vec}(\hat\Omega_{xy}) = \hat\omega$, where $\hat\omega$ solves the Elastic-Net problem
$$\underset{\omega \in \mathbb{R}^{pq}}{\arg\min}\;\; \frac{1}{2}\|A\omega - b\|_2^2 + \lambda_1\|\omega\|_1 + \frac{\lambda_2}{2}\,\omega^\top\big(\Omega_{yy}^{-1} \otimes L\big)\,\omega,$$
where A and b are defined through the Cholesky decomposition $C^\top C = \Omega_{yy}^{-1}$, as
$$A = \big(C \otimes X/\sqrt{n}\big), \qquad b = -\mathrm{vec}\big([Y C^{-1}/\sqrt{n}]^\top\big).$$

Proof. Tedious algebra with vec/tr/⊗ properties.
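The construction of A and b is easy to mirror in base R; as a consistency check, the mixed-product property gives $A^\top A = \Omega_{yy}^{-1} \otimes S_{xx}$ (a sketch with simulated matrices):

```r
# Build A and b from the Cholesky factor C of Omega_yy^{-1}.
set.seed(4)
n <- 30; p <- 5; q <- 3
X <- matrix(rnorm(n * p), n, p)
Y <- matrix(rnorm(n * q), n, q)
Oyy_inv <- crossprod(matrix(rnorm(2 * q * q), 2 * q, q)) / (2 * q)
C <- chol(Oyy_inv)                    # upper triangular, t(C) %*% C = Oyy_inv
A <- C %x% (X / sqrt(n))              # (nq) x (pq) design on omega = vec(Omega_xy)
b <- -c(t(Y %*% solve(C) / sqrt(n)))  # b = -vec([Y C^{-1} / sqrt(n)]^T)
Sxx <- crossprod(X) / n
stopifnot(all.equal(crossprod(A), Oyy_inv %x% Sxx))
```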
Monitoring convergence
Example on the cookie dough data

[Figure: objective value vs. iteration, monitored along the whole path of λ1 (from 3e−01 down to 2e−04).]

[Figure: log-likelihood vs. iteration, monitored along the same path of λ1.]
Tuning the penalty parameters

K-fold cross-validation
Computationally intensive, but works! For a fold assignment $\kappa : \{1, \dots, n\} \to \{1, \dots, K\}$,
$$(\lambda_1^{cv}, \lambda_2^{cv}) = \underset{(\lambda_1,\lambda_2) \in \Lambda_1 \times \Lambda_2}{\arg\min}\; \frac{1}{n}\sum_{i=1}^{n} \big\| x_i^\top \hat B^{\lambda_1,\lambda_2}_{-\kappa(i)} - y_i \big\|_2^2.$$
(One may also cross-validate the log-likelihood $\log L(\hat\Omega_{xy}^{\lambda_1,\lambda_2}, \hat\Omega_{yy}^{\lambda_1,\lambda_2};\, x_i, y_i)$ instead of the prediction error.)

Information criteria adapted to regularized methods
$$(\lambda_1^{pen}, \lambda_2^{pen}) = \underset{\lambda_1,\lambda_2}{\arg\min}\; \Big\{ -2\log L(\hat\Omega_{xy}^{\lambda_1,\lambda_2}, \hat\Omega_{yy}^{\lambda_1,\lambda_2}) + \mathrm{pen}(\mathrm{df}_{\lambda_1,\lambda_2}) \Big\}.$$

Proposition (Generalized degrees of freedom)
$$\mathrm{df}_{\lambda_1,\lambda_2} = \mathrm{card}(\mathcal{A}) - \lambda_2\,\mathrm{tr}\Big( (R \otimes L)_{\mathcal{A}\mathcal{A}}\, \big(R \otimes (S_{xx} + \lambda_2 L)\big)^{-1}_{\mathcal{A}\mathcal{A}} \Big),$$
where $\mathcal{A} = \big\{ j : \mathrm{vec}(\hat\Omega_{xy}^{\lambda_1,\lambda_2})_j \neq 0 \big\}$ is the set of active coefficients.
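The grid search itself is straightforward to set up; here is a minimal runnable sketch in base R where the spring estimator is replaced by a hypothetical ridge stand-in `fit_B` (the fold machinery is the point, not the fit):

```r
set.seed(5)
n <- 60; p <- 8; q <- 2; K <- 5
X <- matrix(rnorm(n * p), n, p)
B_true <- matrix(rnorm(p * q) * rbinom(p * q, 1, 0.3), p, q)
Y <- X %*% B_true + matrix(rnorm(n * q), n, q)
kappa <- sample(rep(1:K, length.out = n))       # fold assignment kappa(i)
fit_B <- function(X, Y, lambda)                 # hypothetical stand-in, NOT spring
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, Y))
lambdas <- 10^seq(-2, 2, length.out = 10)
cv_err <- sapply(lambdas, function(l)
  mean(sapply(1:K, function(k) {
    train <- kappa != k
    B <- fit_B(X[train, ], Y[train, ], l)
    mean((X[!train, ] %*% B - Y[!train, ])^2)   # held-out prediction error
  })))
lambdas[which.min(cv_err)]                      # penalty selected by CV
```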
Assessing the gain brought by covariance estimation
Simulation settings

Parameters
- $\Omega_{xy}$: p = 40 predictors, q = 5 outcomes; 25 non-null entries in {−1, 1}, no particular structure along the predictors.
- R: Toeplitz scheme $R_{ij} = \tau^{|i-j|}$ with $\tau \in \{0.1, 0.5, 0.9\}$.
- $B = -\Omega_{xy} R$.

Data generation
Draw $n_{train} = 50$ plus $n_{test} = 1000$ samples from
$$y_i = B^\top x_i + \varepsilon_i, \quad \text{with } x_i \sim \mathcal{N}(0, I) \text{ and } \varepsilon_i \sim \mathcal{N}(0, R).$$

Evaluating performance
Compare the prediction error over 100 runs between the Lasso, the group-Lasso and SPRING.

[Figure: heatmaps of $\Omega_{xy}$, R and B for the three levels of τ.]
Assessing the gain brought by covariance estimation
Results

[Figure: prediction error over 100 runs for SPRING (oracle), SPRING, Lasso and group-Lasso, illustrating the influence of the correlations between outcomes; scenarios low, medium, high map to τ ∈ {.1, .5, .9}.]
Assessing the gain brought by structure integration
Simulation settings

Parameters
- p = 100, q = 1, to remove the covariance effect.
- $\Omega_{xy} \triangleq \omega_{xy}$: a vector with two successive bumps,
$$\omega_j = \begin{cases} -\big((30-j)^2 - 100\big)/200 & j = 21, \dots, 39, \\ \big((70-j)^2 - 100\big)/200 & j = 61, \dots, 80, \\ 0 & \text{otherwise}. \end{cases}$$
- $R \triangleq \rho = 5$: a residual (scalar) variance.
- $\beta = -\omega_{xy}/\rho$.

[Figure: the two bumps of ω over the 100 predictor positions, with values in (−0.50, 0.50).]

Data generation
Draw $n_{train} = 120$ plus $n_{test} = 1000$ samples from
$$y_i = \beta^\top x_i + \varepsilon_i, \quad \text{with } x_i \sim \mathcal{N}(0, I) \text{ and } \varepsilon_i \sim \mathcal{N}(0, \rho).$$
A base-R sketch of the bump vector follows.
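```r
# The two-bump coefficient vector omega of this scenario, and beta = -omega/rho.
p <- 100
omega <- numeric(p)
j1 <- 21:39; omega[j1] <- -((30 - j1)^2 - 100) / 200
j2 <- 61:80; omega[j2] <-  ((70 - j2)^2 - 100) / 200
rho <- 5
beta <- -omega / rho
plot(omega, type = "h", xlab = "predictor index", ylab = "omega_j")
```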
Assessing the gain brought by structure information
Results (1): predictive performance

[Figure: mean prediction error + standard error over 100 runs along a grid of λ1, for SPRING with (λ2 = .01) and without (λ2 = 0) structural regularization (L = DᵀD), and for the Lasso.]
Assessing the gain brought by structure information
Results (2): robustness

What if we introduce a "wrong" structure? Evaluate performance with the same settings, but:
- randomly swap all elements of $\omega_{xy}$ to remove any structure,
- keep exactly the same $x_i$, $\varepsilon_i$,
- draw $y_i$ with the swapped and unswapped parameters,
- use the same folds for cross-validation,
then replicate 100 times.

Method               Scenario     MSE           PE
LASSO                –            .336 (.096)   58.6 (10.2)
E-Net (L = I)        –            .340 (.095)   59.0 (10.3)
SPRING (L = I)       –            .358 (.094)   60.7 (10.0)
S. E-Net (L = DᵀD)   unswapped    .163 (.036)   41.3 (4.08)
S. E-Net (L = DᵀD)   swapped      .352 (.107)   60.3 (11.42)
SPRING (L = DᵀD)     unswapped    .062 (.022)   31.4 (2.99)
SPRING (L = DᵀD)     swapped      .378 (.123)   62.9 (13.15)
Cookie dough data: performance

Method        fat    sucrose  flour  water
Step. MLR     .044   1.188    .722   .221
Decision th.  .076   .566     .265   .176
PLS           .151   .583     .375   .105
PCR           .160   .614     .388   .106
Bayes. Reg.   .058   .819     .457   .080
LASSO         .045   .860     .376   .104
grp LASSO     .127   .918     .467   .102
str E-Net     .039   .666     .365   .100
MRCE          .151   .821     .321   .081
SPRING (CV)   .065   .397     .237   .083
SPRING (BIC)  .048   .389     .243   .066

Table: test error.

Brown, P.J., Fearn, T., and Vannucci, M. Bayesian wavelet regression on curves with applications to a spectroscopic calibration problem. JASA, 2001.

How each competitor structures the problem:
- The Lasso induces sparsity on B: no structure along the predictors, no structure between responses.
- The group-Lasso induces sparsity on B group-wise across the responses: no structure along the predictors, (too) strong structure between responses.
- The Structured Elastic-Net induces sparsity on B with a smooth neighborhood prior along the predictors (L = DᵀD): structure along the predictors, no structure between responses.
- MRCE induces sparsity on B and sparsity on R⁻¹: no structure along the predictors, (supposed to add) structure between responses.
- SPRING uses the cGGM to induce structured sparsity on the direct links between the responses and the predictors, plus the smooth neighborhood prior via L = DᵀD.

[Figure: estimated coefficient profiles along the wavelengths (1500–2250) for each method.]
Cookie dough data: parameters

[Figure: estimated B and −Ω̂xy profiles along the wavelengths (1500–2250), and heatmap of the estimated residual covariance R̂ between dry flour, fat, sucrose and water (values from −0.25 to 0.50).]
Cookie dough data: model selection with BIC

[Figure: BIC value along the λ1 path for λ2 ∈ {0.01, 0.1, 1, 10}.]
Quantitative Trait Loci (QTL) study in Colza
Doubled haploid samples
- n = 103 homozygous lines of Brassica napus, obtained by crossing the 'Stellar' and 'Major' cultivars.

Biparental markers
- p = 300 markers with known loci, spread over the 19 chromosomes, with values in {Major, Stellar, Missing} → {1, −1, 0}.

Traits
Consider q = 8 traits, including
- survival traits (% survival in winter): surv92, surv93, surv94, surv97, surv99;
- flowering traits (no vernalization, 4 weeks or 8 weeks of vernalization): flower0, flower4, flower8.
Include genetic linkage information

Genetic distance between markers A1 and A2
Let $r_{12}$ be the recombination rate between $A_1$ and $A_2$; then
$$d_{12} = -\frac{1}{2}\log(1 - 2r_{12}).$$

Linkage disequilibrium as covariance between the markers
In a biparental population with independent recombination events, one has
$$\mathrm{cor}(A_1, A_3) = \rho^{d_{13}} = \rho^{d_{12} + d_{23}}, \quad \text{with } \rho = e^{-2}.$$

Proposition (Including LD information in the model)
The matrix L is given by inverting the covariance matrix, which can be done analytically.
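A small base-R illustration of these two formulas (with made-up recombination rates):

```r
# Map distance from a recombination fraction r, and correlation decay along
# the chromosome: cor(A1, A3) = rho^(d12 + d23).
map_dist <- function(r) -0.5 * log(1 - 2 * r)
rho <- exp(-2)
r12 <- 0.05; r23 <- 0.10              # hypothetical recombination rates
d12 <- map_dist(r12); d23 <- map_dist(r23)
rho^(d12 + d23)                       # implied correlation between A1 and A3
```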
Analytical form of L as a precision matrix
As usually met in AR(1) processes

L is given by the inverse of the correlation matrix of the markers:
$$L = U^\top \Lambda\, U,$$
with
$$U = \begin{pmatrix} 1 & -\rho^{d_{12}} & 0 & \cdots & 0 \\ 0 & 1 & -\rho^{d_{23}} & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & -\rho^{d_{m-1,m}} \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \qquad \Lambda = \mathrm{diag}\Big( (1-\rho^{2d_{12}})^{-1}, \dots, (1-\rho^{2d_{m-1,m}})^{-1},\, 1 \Big).$$
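This analytic inverse can be verified directly (a base-R sketch with made-up distances): building U and Λ as above yields an L with LC = I, where C is the marker correlation matrix:

```r
rho <- exp(-2)
d <- c(0.3, 0.8, 0.5, 1.2)                # adjacent genetic distances (made up)
m <- length(d) + 1
a <- rho^d                                # correlations between adjacent markers
pos <- c(0, cumsum(d))
C <- rho^abs(outer(pos, pos, "-"))        # full marker correlation matrix
U <- diag(m); for (k in 1:(m - 1)) U[k, k + 1] <- -a[k]
Lambda <- diag(c(1 / (1 - a^2), 1))
L <- t(U) %*% Lambda %*% U
stopifnot(all.equal(L %*% C, diag(m)))    # L is exactly the inverse of C
```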
Predictive performance

1. Split the data into training/test sets (n1 = 70, n2 = 33).
2. Adjust each procedure using 5-fold CV for model selection.
3. Compute the test (prediction) error.

Method              surv92  surv93  surv94  surv97  surv99  Mean PE
LASSO               .79     .98     .90     1.02    1.00    .938
group-LASSO         .90     1.00    .92     .99     .92     .946
Enet (no LD)        .87     1.01    .97     1.03    1.03    .983
Gen-Enet (LD)       .75     .98     .89     1.03    1.02    .934
our proposal (LD)   .77     .96     .84     1.00    1.02    .918

Table: survival traits.

Method              flower0  flower4  flower8  Mean PE
LASSO               .58      .53      .74      .616
group-LASSO         .59      .55      .74      .626
Enet (no LD)        .55      .54      .69      .593
Gen-Enet (LD)       .55      .50      .74      .596
our proposal (LD)   .48      .46      .68      .540

Table: flowering traits.
Estimated Residual Covariance R

[Figure: heatmap of the estimated residual correlations (scale −1 to 1) between the eight traits flower0, flower4, flower8, surv92, surv93, surv94, surv97, surv99.]
Estimated Regression Coefficients B

[Figure: estimated coefficients of B along the marker positions (0–1500), colored by outcome (survival and flowering traits).]
Estimated Direct Effects Ωxy

[Figure: estimated direct effects Ω̂xy along the marker positions (0–1500), colored by outcome.]
QTL Mapping (chr. 2, 8, 10), regression coefficients B

[Figure: markers selected in B on chromosomes 2, 8 and 10, displayed by locus and outcome.]
QTL Mapping (chr. 2, 8, 10), direct links Ωxy

[Figure: markers selected in Ω̂xy on chromosomes 2, 8 and 10, displayed by locus and outcome.]
QTL Mapping (all chromosomes), B

[Figure: markers selected in B across the 19 chromosomes, displayed by locus and outcome.]
QTL Mapping (all chromosomes), Ωxy

[Figure: markers selected in Ω̂xy across the 19 chromosomes, displayed by locus and outcome.]
Some concluding remarks

Perspectives
1. Modelling
   - Generalized Fused-Lasso penalty
   - Automatic inference of L
   - Environment? Multiparental aspect, multi-population?
2. Technical algorithmic points
   - active-set strategy in the alternating algorithm
   - smart screening of irrelevant predictors
   - full C++ implementation
3. Applications to regulatory motif discovery
   - Y is a matrix of q microarrays for n genes (the individuals),
   - X is the matrix of motif counts in the promoter of each gene,
   - L is a matrix based upon the edit distance between motifs.
   A first attempt is made in the paper, but we would like to consider large-scale problems (10s/100s of q, 1000s of n, 10,000s of p).
Thanks
Hiring! We are looking for a post-doc with a strong background in optimization and statistics.

Thanks for your patience, and thanks to my co-workers.