Luc Perreault (Hydro-Québec Research Institute)
James Merleau (Hydro-Québec Research Institute)
Guillaume Évin (University of Newcastle)
Mixture of normal, gamma and Gumbel distributions
Workshop on Statistical Methods for Meteorology and Climate Change
12-14 January 2011
2 Groupe – Technologie
Outline
> Motivation
> Models, estimation and selection
> Applications
• Spring floods
• Tree-rings time series
> Conclusions and future work
K-component mixture: Titterington et al. (1985), Robert (1996), Stephens (2000), Marin and Robert (2007)

$$ g(y_t \mid \mathbf{w}, \boldsymbol{\theta}, d) = \sum_{k=0}^{K-1} w_k \, f(y_t \mid \boldsymbol{\theta}_k, d) $$

where
• $w_k$ : probability that $y_t$ belongs to population $k$
• $y_t$ : observation at time $t$
• $f(y_t \mid \boldsymbol{\theta}_k, d)$ : probability density function for a given family $d$ with vector of parameters $\boldsymbol{\theta}_k$
Motivation
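A minimal sketch of how the K-component mixture density can be evaluated numerically. The normal component family, the function names and the numbers in the example are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def normal_pdf(y, mean, sd):
    """Normal density, standing in for the component family f(y | theta_k, d)."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def mixture_pdf(y, weights, params, component_pdf=normal_pdf):
    """g(y | w, theta, d) = sum_k w_k f(y | theta_k, d), k = 0, ..., K-1."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("mixture weights must sum to 1")
    y = np.asarray(y, dtype=float)
    return sum(w * component_pdf(y, *theta) for w, theta in zip(weights, params))

# Hypothetical two-population example (illustrative numbers only).
weights = [0.6, 0.4]
params = [(1800.0, 250.0), (2600.0, 300.0)]   # (mean, sd) for each component
grid = np.linspace(0.0, 5000.0, 5001)
density = mixture_pdf(grid, weights, params)

# Trapezoidal check that the mixture integrates to (almost exactly) one.
area = float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(grid)))
```

Passing a different `component_pdf` (gamma or Gumbel, say) changes the family d without touching the mixing logic.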
> Probabilistic structure of some observed hydrometeorological data can be complex
• Hydrologists use elaborate parameterised distributions (log-Pearson Type 3, Wakeby, Kappa, Halphen, …)
• Some authors have also proposed nonparametric approaches (Adamowski, 1985; O’Connell, 2005)
Model flexibility
[Figure: histogram (frequency) and time series (1970-2000) of volumes (hm3)]
Spring volumes of the Aux Écorces River (m3/s)
Motivation
> Mixture components can be interpreted
> Mixture models address formally the problem of non-homogeneous data
Spring peak discharge of the Moisie River (m3/s)
Heterogeneity
[Figure: histogram (frequency) and time series (1970-2000) of spring flood peak (m3/s)]
Motivation
> Significant changepoints in annual inflows for numerous watersheds in Québec
> Confirmed in recent studies
• Perreault et al. (2000, 2007)
• Jandhyala et al. (2009)
• Saïd et al. (2009)
• Jalbert et al. (2010)
[Figure from Jalbert et al. (2010); legend: no changepoints, changepoints]
Motivation
> Alternatives to normal mixtures
• gamma and Gumbel
> With and without persistence
> Automatic model selection
> Bayesian framework
Objectives
[Figure: panels of example densities for the normal, gamma and Gumbel families]
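Models, estimation and selection
Introducing latent variables (i.d. case)

$$ g(y_t \mid w, \boldsymbol{\theta}, d) = w \, f(y_t \mid \boldsymbol{\theta}_0, d) + (1 - w) \, f(y_t \mid \boldsymbol{\theta}_1, d) $$

$$ g(y_t \mid z_t, \boldsymbol{\theta}, d) = f(y_t \mid \boldsymbol{\theta}_0, d)^{1 - z_t} \, f(y_t \mid \boldsymbol{\theta}_1, d)^{z_t} $$

Latent variables: $z_t \mid w \sim \mathrm{Bern}(w)$, with $w = \Pr[z_t = 0]$
Observations: $y_t \mid z_t = k, \boldsymbol{\theta}_k \sim f(y_t \mid \boldsymbol{\theta}_k, d)$
Prior: $\pi(\boldsymbol{\theta}_k, w)$
Estimation: EM or MCMC

[Diagram: graphical model linking $w$, the latent variables $z_1, \dots, z_n$, the parameters $\boldsymbol{\theta}_{z_t}$ and the observations $y_1, \dots, y_n$]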
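To make the latent-variable representation concrete, the following sketch simulates from a two-component mixture by first drawing the labels and then the observations. It assumes normal components and made-up parameter values; only the convention w = Pr[z_t = 0] comes from the slide.

```python
import numpy as np

def simulate_two_component(n, w, theta0, theta1, rng):
    """Draw (z, y) with Pr[z_t = 0] = w and y_t | z_t = k ~ Normal(mean_k, sd_k)."""
    z = (rng.random(n) >= w).astype(int)          # z_t = 0 with probability w
    means = np.where(z == 0, theta0[0], theta1[0])
    sds = np.where(z == 0, theta0[1], theta1[1])
    y = rng.normal(means, sds)                    # one draw per observation
    return z, y

rng = np.random.default_rng(0)
# Illustrative parameters: theta_k = (mean, sd) of component k.
z, y = simulate_two_component(20000, w=0.7, theta0=(0.0, 1.0), theta1=(5.0, 1.0), rng=rng)

share_regime0 = float(np.mean(z == 0))            # should be close to w
sample_mean = float(y.mean())                     # close to 0.7 * 0 + 0.3 * 5
```

Marginally over z, the simulated y follow the mixture w f(y | θ_0, d) + (1 - w) f(y | θ_1, d), which is why EM and MCMC can work with the pair (z, y) instead of the mixture directly.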
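Models, estimation and selection
Introducing latent variables (d. case)

Latent variables: $z_t \mid z_{t-1}, \mathbf{W} \sim \mathrm{Markov}(\mathbf{W})$

$$ \mathbf{W} = \begin{pmatrix} w_{00} & w_{01} \\ w_{10} & w_{11} \end{pmatrix} = \begin{pmatrix} \Pr(z_t = 0 \mid z_{t-1} = 0) & \Pr(z_t = 1 \mid z_{t-1} = 0) \\ \Pr(z_t = 0 \mid z_{t-1} = 1) & \Pr(z_t = 1 \mid z_{t-1} = 1) \end{pmatrix} $$

Estimation: EM or MCMC

[Diagram: graphical model linking $\mathbf{W}$, the latent variables $z_1, \dots, z_n$, the parameters $\boldsymbol{\theta}_{z_t}$ and the observations $y_1, \dots, y_n$]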
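A short sketch of what persistence means in the dependent case: the labels follow a Markov chain with transition matrix W, so regimes tend to last several years. The matrix entries, the starting state and the function name are illustrative assumptions.

```python
import numpy as np

def simulate_markov_states(n, W, rng, z0=0):
    """Simulate z_1, ..., z_n with W[i, j] = Pr(z_t = j | z_{t-1} = i)."""
    W = np.asarray(W, dtype=float)
    if not np.allclose(W.sum(axis=1), 1.0):
        raise ValueError("each row of W must sum to 1")
    z = np.empty(n, dtype=int)
    prev = z0                                   # assumed starting state
    for t in range(n):
        prev = rng.choice(W.shape[0], p=W[prev])
        z[t] = prev
    return z

rng = np.random.default_rng(1)
W = np.array([[0.95, 0.05],
              [0.10, 0.90]])                    # illustrative, strongly persistent
z = simulate_markov_states(20000, W, rng)

# Empirical estimate of w_00 = Pr(z_t = 0 | z_{t-1} = 0).
stay_in_0 = float(np.mean(z[1:][z[:-1] == 0] == 0))
```

When both rows of W are identical the chain loses its memory and the model reduces to the case without persistence.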
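Models, estimation and selection
Parameterization: Nelder and Wedderburn (1972)

> normal (d = 1)
$$ f(y_t \mid \zeta, \phi, d = 1) = \sqrt{\frac{\phi}{2\pi}} \exp\left\{ -\frac{\phi}{2} (y_t - \zeta)^2 \right\}, \quad -\infty < y_t < \infty $$

> gamma (d = 2)
$$ f(y_t \mid \zeta, \phi, d = 2) = \frac{\phi^{\phi}}{\zeta^{\phi} \, \Gamma(\phi)} \, y_t^{\phi - 1} \exp\left( -\phi \, y_t / \zeta \right), \quad 0 \le y_t < \infty $$

> Gumbel (d = 3)
$$ f(y_t \mid \zeta, \phi, d = 3) = \phi \exp\left\{ -\phi (y_t - \zeta) - c_E - \exp\left[ -\phi (y_t - \zeta) - c_E \right] \right\} $$

$$ g(y_t \mid \mathbf{w}, \boldsymbol{\theta}, d) = \sum_{k=0}^{K-1} w_k \, f(y_t \mid \boldsymbol{\theta}_k, d) $$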
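The three densities can be coded directly from this parameterization. The sketch below reads ζ as the mean of each family, φ as a precision-type parameter (precision for the normal, shape for the gamma, inverse scale for the Gumbel) and c_E as Euler's constant; that reading, like the test values, is an assumption checked numerically rather than something stated on the slide.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329   # c_E, assumed to be Euler's constant

def pdf_normal(y, zeta, phi):
    """d = 1: mean zeta, precision phi."""
    return np.sqrt(phi / (2.0 * np.pi)) * np.exp(-0.5 * phi * (y - zeta) ** 2)

def pdf_gamma(y, zeta, phi):
    """d = 2: mean zeta, shape phi, defined for y > 0."""
    log_pdf = (phi * math.log(phi / zeta) - math.lgamma(phi)
               + (phi - 1.0) * np.log(y) - phi * y / zeta)
    return np.exp(log_pdf)

def pdf_gumbel(y, zeta, phi):
    """d = 3: mean zeta, inverse scale phi."""
    u = phi * (y - zeta) + EULER_GAMMA
    return phi * np.exp(-u - np.exp(-u))

def trapezoid(values, grid):
    """Plain trapezoidal rule, kept local to avoid version-specific helpers."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

# Numerical check: each density integrates to 1 and has mean zeta = 10.
wide = np.linspace(-40.0, 100.0, 200001)
positive = np.linspace(1e-9, 120.0, 200001)
cases = [("normal", pdf_normal, (10.0, 0.25), wide),
         ("gamma", pdf_gamma, (10.0, 4.0), positive),
         ("gumbel", pdf_gumbel, (10.0, 0.5), wide)]
checks = {}
for name, pdf, args, grid in cases:
    dens = pdf(grid, *args)
    checks[name] = (trapezoid(dens, grid), trapezoid(grid * dens, grid))
```

Sharing the same mean parameter ζ across families is what makes the fitted components directly comparable between the normal, gamma and Gumbel mixtures.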
> Normal
• Conjugate priors for all parameters
• Gibbs sampling
> Gamma
• Use of the Stirling approximation : conjugate priors for all parameters
• Gibbs sampling
> Gumbel
• No conjugate prior for 1 parameter
• Metropolis-Hastings within Gibbs sampling
Models, estimation and selection
Estimation : MCMC
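As an illustration of the normal case, here is a minimal Gibbs sampler for a two-component normal mixture without persistence. The priors (uniform on w, a vague normal on each mean, a gamma on each precision), the hyperparameter values and the synthetic data are assumptions made for the sketch; they are not the priors used by the authors.

```python
import numpy as np

def gibbs_normal_mixture(y, n_iter=300, seed=0, m0=0.0, p0=1e-4, a0=1.0, b0=1.0):
    """Gibbs sampler: means mu_k, precisions phi_k, weight w = Pr[z_t = 0]."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    mu = np.array([y.min(), y.max()])           # crude, well-separated start
    phi = np.full(2, 1.0 / y.var())
    w = 0.5
    draws = np.empty((n_iter, 5))
    for it in range(n_iter):
        # 1) Allocations z_t given everything else.
        log_p0 = np.log(w) + 0.5 * np.log(phi[0]) - 0.5 * phi[0] * (y - mu[0]) ** 2
        log_p1 = np.log1p(-w) + 0.5 * np.log(phi[1]) - 0.5 * phi[1] * (y - mu[1]) ** 2
        prob1 = 1.0 / (1.0 + np.exp(np.clip(log_p0 - log_p1, -700.0, 700.0)))
        z = (rng.random(n) < prob1).astype(int)
        # 2) Weight: Beta full conditional under a uniform prior.
        n1 = int(z.sum())
        n0 = n - n1
        w = rng.beta(1.0 + n0, 1.0 + n1)
        # 3) Component parameters: normal and gamma full conditionals.
        for k, nk in enumerate((n0, n1)):
            yk = y[z == k]
            post_prec = p0 + nk * phi[k]
            post_mean = (p0 * m0 + phi[k] * yk.sum()) / post_prec
            mu[k] = rng.normal(post_mean, 1.0 / np.sqrt(post_prec))
            rate = b0 + 0.5 * np.sum((yk - mu[k]) ** 2)
            phi[k] = rng.gamma(a0 + 0.5 * nk, 1.0 / rate)
        draws[it] = (w, mu[0], mu[1], phi[0], phi[1])
    return draws

# Synthetic data with two well-separated populations (illustrative only).
data_rng = np.random.default_rng(42)
y = np.concatenate([data_rng.normal(0.0, 1.0, 150), data_rng.normal(6.0, 1.0, 150)])
draws = gibbs_normal_mixture(y)
posterior_mean = draws[100:].mean(axis=0)        # discard 100 burn-in draws
```

For the Gumbel case, the step for the parameter without a conjugate prior would be replaced by a Metropolis-Hastings update inside the same loop.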
> Which family of probability distributions should be considered ?
> How many components ?
• Have significant changepoints occurred in the time series ?
• How many changepoints and switching regimes occurred ?
Models, estimation and selection
Model selection
Models, estimation and selection
Model selection

> Evaluation of the marginal densities [Évin et al., 2010]

$$ m(\mathbf{y}) = \int \prod_{t=1}^{n} \left[ \sum_{k=0}^{K-1} w_k \, f(y_t \mid \boldsymbol{\theta}_k, d) \right] \pi(\boldsymbol{\Lambda}) \, d\boldsymbol{\Lambda} $$

where $\boldsymbol{\Lambda}$ = (all the parameters and latent variables)

• Chib (1995) when full conditional posterior distributions are directly sampled
• Chib and Jeliazkov (2001) generalize this method when full conditional posterior distributions contain an unknown constant of proportionality.
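As a reminder of why MCMC output is enough to evaluate the marginal density, Chib's approach rests on a rearrangement of Bayes' theorem. In the line below, $\boldsymbol{\psi}$ is an assumed shorthand for the model parameters only (latent variables integrated out) and $\boldsymbol{\psi}^*$ is any fixed point, typically one of high posterior density.

```latex
\log m(\mathbf{y}) = \log f(\mathbf{y} \mid \boldsymbol{\psi}^*)
                   + \log \pi(\boldsymbol{\psi}^*)
                   - \log \pi(\boldsymbol{\psi}^* \mid \mathbf{y})
```

The likelihood and the prior are evaluated directly; the posterior ordinate is the term estimated from the sampler's draws.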
Peaks Moisie River

Log-marginals for the Moisie River spring flood peaks, log[m(y)]

k    normal    gamma    Gumbel    Dep.
1    -279      -279     -281      NO
2    -281      -280     -281      NO
2    -279      -277     -279      YES

[Figure: histogram (frequency) and time series by year (1970-2005) of spring flood peak (m3/s)]
Applications : Spring flood
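Log-marginals are compared through their differences. The sketch below turns the Moisie River values into posterior model probabilities and a Bayes factor, assuming equal prior probability for every candidate model; the values are as read from the slide's table (k = 1, then k = 2 without and with dependence), and the helper name is hypothetical.

```python
import math

# log[m(y)] keyed by (family, number of components k, dependence between regimes).
log_marginals = {
    ("normal", 1, False): -279.0, ("gamma", 1, False): -279.0, ("Gumbel", 1, False): -281.0,
    ("normal", 2, False): -281.0, ("gamma", 2, False): -280.0, ("Gumbel", 2, False): -281.0,
    ("normal", 2, True): -279.0, ("gamma", 2, True): -277.0, ("Gumbel", 2, True): -279.0,
}

def posterior_model_probabilities(log_m):
    """Posterior model probabilities under equal prior model probabilities."""
    top = max(log_m.values())                    # subtract the max for stability
    weights = {model: math.exp(value - top) for model, value in log_m.items()}
    total = sum(weights.values())
    return {model: weight / total for model, weight in weights.items()}

probs = posterior_model_probabilities(log_marginals)
best = max(probs, key=probs.get)

# Bayes factor of the best model against the next-highest log-marginal.
runner_up = sorted(log_marginals.values())[-2]
bayes_factor = math.exp(log_marginals[best] - runner_up)
```

With these values the two-component gamma mixture with dependence has the highest log-marginal, two log units above the next candidates.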
Peaks Moisie River
Inference about the parameters for the gamma mixture with persistence
[Figure: histogram (frequency) and time series by year (1970-2005) of spring flood peak (m3/s)]
Applications : Spring flood
Peaks Moisie River
Fit: mixture of gamma with persistence
[Figure: histogram (frequency) of spring flood peaks (m3/s)]
Classification
[Figure: time series by year (1965-2005) of spring flood peaks (m3/s), with Prob(Z = 0 | y) and Prob(Z = 1 | y)]
Applications : Spring flood
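Classification probabilities such as Prob(Z = k | y) can be illustrated with a forward-backward pass over a hidden Markov chain. The sketch below conditions on fixed parameter values, which is a simplification: in a Bayesian fit these probabilities would be averaged over MCMC draws. The data, the normal components and the transition matrix are illustrative assumptions.

```python
import numpy as np

def smoothed_state_probabilities(lik, W, init):
    """Pr(z_t = k | y_1, ..., y_n) for fixed parameters.

    lik[t, k] = f(y_t | theta_k, d), W[i, j] = Pr(z_t = j | z_{t-1} = i),
    init[k] = Pr(z_1 = k).
    """
    n, K = lik.shape
    alpha = np.empty((n, K))
    beta = np.ones((n, K))
    a = init * lik[0]
    alpha[0] = a / a.sum()
    for t in range(1, n):                        # forward pass, rescaled each step
        a = (alpha[t - 1] @ W) * lik[t]
        alpha[t] = a / a.sum()
    for t in range(n - 2, -1, -1):               # backward pass, rescaled each step
        b = W @ (lik[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Toy series with an obvious regime shift (illustrative numbers only).
y = np.array([0.1, -0.3, 0.2, 5.1, 4.8, 5.3])
means = np.array([0.0, 5.0])
sd = 1.0
lik = np.exp(-0.5 * ((y[:, None] - means[None, :]) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
W = np.array([[0.95, 0.05],
              [0.05, 0.95]])
post = smoothed_state_probabilities(lik, W, init=np.array([0.5, 0.5]))
regimes = post.argmax(axis=1)                    # most probable regime per time step
```

Rescaling at every step only changes each row by a constant factor, so the final row-wise normalisation still returns proper probabilities while avoiding numerical underflow on long series.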
> Have the changepoints and regimes observed in hydrometeorological data really occurred ?
> No more than 50 years of data !!
> Archives Project [Bégin et al., 2010]
Applications : Tree-rings
Black spruce tree-ring time series [Bégin et al., 2010]
Applications : Tree-rings
14012664805793155-29924
54413566825995159-26933
13113550804794159-31932
-201104-1859-260140-82841
HER RT630 ROZX ROZM ROZI HM2 HM1 DA1X DA1M CANE K
Sites
Log-marginals for tree-rings time series, log[m(y)] (normal mixtures with persistence)
Model selection : How many components ?
Applications : Tree-rings
Inference : site ROZM
Classification
[Figure: standardized tree-ring width by year (1850-2004), with Pr(z_t = 0), Pr(z_t = 1) and Pr(z_t = 2)]
Parameters
[Figure: posterior densities of Mean and Sigma]
Applications : Tree-rings
Inference : site ROZM
Smoothing with 90% credibility interval

$$ \widehat{sm}_t = \sum_{k=0}^{K} \Pr(z_t = k) \, \mu_k $$

[Figure: standardized tree-ring width by year (1840-2020) with the smoothed series]
Parameters
[Figure: posterior densities of Mean and Sigma]
Applications : Tree-rings
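The smoothed series is a probability-weighted average of the component means at each time step. A small worked example, with made-up classification probabilities and means for three regimes (none of these numbers come from the slides):

```python
import numpy as np

def smoothed_series(state_probs, mu):
    """sm_t = sum_k Pr(z_t = k) * mu_k, one value per time step."""
    state_probs = np.asarray(state_probs, dtype=float)
    mu = np.asarray(mu, dtype=float)
    if not np.allclose(state_probs.sum(axis=1), 1.0):
        raise ValueError("each row of state_probs must sum to 1")
    return state_probs @ mu

# Rows: time steps; columns: Pr(z_t = 0), Pr(z_t = 1), Pr(z_t = 2).
state_probs = [[0.9, 0.1, 0.0],
               [0.2, 0.7, 0.1],
               [0.0, 0.1, 0.9]]
mu = [0.8, 1.0, 1.3]                             # hypothetical regime means
sm = smoothed_series(state_probs, mu)            # 0.82, 0.99, 1.27
```

When one regime is nearly certain the smoothed value sits on that regime's mean; during a transition it moves gradually between the two means.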
Inference : site DA1X
Classification
[Figure: standardized tree-ring width by year (1850-2004), with Pr(z_t = 0), Pr(z_t = 1) and Pr(z_t = 2)]
Parameters
[Figure: posterior densities of Mean and Sigma]
Applications : Tree-rings
Inference : site DA1X
Smoothing with 90% credibility interval

$$ \widehat{sm}_t = \sum_{k=0}^{K} \Pr(z_t = k) \, \mu_k $$

[Figure: standardized tree-ring width by year (1840-2020) with the smoothed series]
Parameters
[Figure: posterior densities of Mean and Sigma]
Applications : Tree-rings
Conclusions
> Alternatives to normal mixtures
• better fit to environmental r.v.
• parsimony
> Automatic model selection
• normal, gamma and Gumbel mixtures
• i.d. and with Markovian dependence
• number of components
> Can be generalized to other distributions
References
> Évin, G., Merleau, J. and Perreault, L. (2010). Mixtures of normal, gamma, and Gumbel distributions. Submitted to Water Resources Research.
> Perreault, L., Garçon, R. and Gaudet, J. (2007). Modelling hydrologic time series using regime switching models and measures of atmospheric circulation. La Houille Blanche, 6 : 111-123.