Upload
sydney-flood
View
219
Download
3
Tags:
Embed Size (px)
Citation preview
Parametric Families of Distributions and Their Interaction
with the Workshop Title
Chris Jones
The Open University, U.K.
How the talk will pan out …
• it will start as a talk in distribution theory– concentrating on generating one family of
distributions• then will continue as a talk in distribution theory
– concentrating on generating a different family of distributions
• but in this second part, the talk will metamorphose through links with kernels and quantiles …
• … and finally get on to a more serious application to smooth (nonparametric) QR
• the parts of the talk involving QR are joint with Keming Yu
Set
Starting point: simple symmetric g
How might we introduce (at most two) shape parameters a and b which will account for skewness and/or “kurtosis”/tailweight (while retaining unimodality)?
Modelling data with such families of distributions will, inter alia, afford robust estimation of location (and maybe scale).
.1,0
FAMILY 1
g
Actual density of order statistic:
)1,(
))(1()()()(
1
iniB
xGxGxgxf
ini
Generalised density of order statistic:
),(
))(1()()()(
11
baB
xGxGxgxf
ba
)),((1 baBetaGX
(i,n integer)
(a,b>0 real)
Roles of a and b
• a=b=1: f = g
• a=b: family of symmetric distributions
• a≠b: skew distributions
• a controls left-hand tail weight, b controls right
• the smaller a or b, the heavier the corresponding tail
Properties of (Generalised) Order Statistic Distributions
• Distribution function: • Tail behaviour. For large x>0:
– power tails:– exponential tails:
• Limiting distributions:– a and b large: normal distribution– one of a or b large, appropriate extreme value
distribution
),()( baI xG
1)1( ~~ bxfxg
Other properties such as moments and modality need to be examined on a case-by-case basis
bxx efeg ~~
For more, see Jones (2004, Test)
Tractable Example 1
1
2/1
2
2/1
2
2),(
11
)(
ba
ba
babaB
xba
x
xba
x
xf
Jones & Faddy’s (2003, JRSSB) skew t density
When a=b, Student t density on 2a d.f.
Some skew t densities
4)2,2( tt )2,4(t
)2,8(t
)2,128(t
… and with a and b swopped
4)2,2( tt
)4,2(t
)8,2(t
)128,2(t
f = skew t density arises from ???
g
Yes, the t distribution on 2 d.f.!
2/32 )2(
1)(
xxg
221
2
1)(
x
xxG
)1(2
12)(
uu
uuQ
Tractable Example 2
Q: The (order statistics of the) logistic distribution generate the ???
A : Log F distribution– This has exponential tails
These examples, seen before, are therefore log F distributions
The log F distribution
bax
ax
e
e
baBxf
)1(),(
1)(
axexfx ~)(
bxexfx ~)(
The simple exponential tail property is shared by:
• the log F distribution
• the asymmetric Laplace distribution
• the hyperbolic distribution
Is there a general form for such distributions?
)0()0(exp)(
xbxIxaxIba
abxf
2122
exp)( xba
xba
xf
FAMILY 2: distributions with simple exponential tails
Starting point: simple symmetric g with distribution function G and
x
dttGxG .)()(]2[
General form for density is:
)()(exp)( ]2[ xGbaaxxf
Special Cases
• G is point mass at zero, G^[2]=xI(x>0)☺f is asymmetric Laplace• G is logistic, G^[2]=log(1+exp(x))☺f is log F• G is t_2, G^[2]=½(x+√(1+x^2))☺f is hyperbolic• G is normal, G^[2]= xΦ(x)+φ(x)• G uniform, G^[2]=½(1+x)I(-1<x<1)+I(x>1)
solid line: log Fdashed line: hyperbolic
dotted line: normal-based
Practical Point 1
• the asymmetric Laplace is a three parameter distribution; other members of family have four;
• fourth parameter is redundant in practice: (asymptotic) correlations between ML estimates of σ and either of a or b are very near 1;
• reason: σ, a and b are all scale parameters, yet you only need two such parameters to describe main scale-related aspects of distribution [either (i) a left-scale and a right-scale or (ii) an overall scale and a left-right comparer]
Practical Point 2
Parametrise by μ, σ, a=1-p, b=p.Then, score equation for μ reads:
This is kernel quantile estimation, with kernel G and bandwidth σ
n
i
iXGn
p1
1
Includes bandwidth selection by choosing σ to solve the second score equation:
But its simulation performance is variable:
n
i
ii
XGpX
n 1
)(1
And so to Quantile Regression:
The usual (regression) log-likelihood,
,)1(log1
]2[
n
iiiii XY
GXY
pn
is kernel localised to point x by
n
iiiiii xXY
GxXY
pnh
XxK1
1]2[1 )()()1(log
this (version of) DOUBLE KERNEL LOCALLINEAR QUANTILE REGRESSION satisfies
Writing )()()( 1
1
)(i
kn
j iki XxhKxXx
and ,)(1
)(
n
i
kik xS
.1,0,)(
)()( 11
)(0
k
YxXGxxpS iin
i
ki
Contrast this with Yu & Jones (1998, JASA) version of DKLLQR:
,1,0,)())()()((1
2120
k
YGxwxSxSxSp in
i i
).()()()()( 1)1(
2)0( xSxxSxxw iii where
The ‘vertical’ bandwidth σ=σ(x) can also be estimated by ML: solve
.)(
)]()[()( 111
)0(0
ii
ii
n
i i
YxXGpxXYxxS
Compare 3 versions of DKLLQR:
,~pq
,
0ˆ pq
,
1ˆ pq
Yu & Jones (1998) including r-o-t σ and h;
new version including r-o-t σ and h;
new version including above σ and r-o-t h.
Based on this limited evidence:
• Clear recommendation:– replace Yu & Jones (1998) DKLLQR method
by (gently but consistently improved) new version
• Unclear non-recommendation:– use new bandwidth selection?
References