Statistical Data Analysis
2011/2012
M. de Gunst
Lecture 4
Statistical Data Analysis: Introduction
Topics
Summarizing data
Exploring distributions
Bootstrap (continued)
Robust methods
Nonparametric tests
Analysis of categorical data
Multiple linear regression
Today’s topics: Bootstrap (Chapter 4: 4.3, 4.4)
4. Bootstrap
4.1. Simulation (read yourself) (last week)
4.2. Bootstrap estimators for distribution (last week)
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests
Bootstrap: recap (1)
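Situation: $x_1, \dots, x_n$ are realizations of $X_1, \dots, X_n$, independent, with unknown distribution $P$.
Bootstrap: used to estimate the distribution $Q_P$ of $T_n = T_n(X_1, \dots, X_n)$, an estimator or test statistic.
Which steps?
Step 1. Estimate $Q_P$ by $Q_{\hat P_n}$, where $\hat P_n$ is an estimate of $P$ (first error).
Step 2. Estimate $Q_{\hat P_n}$ by the empirical distribution of bootstrap values $T^*_{n,1}, \dots, T^*_{n,B}$ (second error).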
Bootstrap: recap (2)
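Step 1: Determine the theoretical bootstrap estimator (first error)
i) Estimate $P$ by $\hat P_n$: either the empirical distribution of $X_1, \dots, X_n$, or a parametric distribution with estimated parameter. $\hat P_n$ is stochastic: an estimator.
ii) Estimate $Q_P$, the distribution of $T_n$, by $Q_{\hat P_n}$. This is stochastic: the (theoretical) bootstrap estimator.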
Bootstrap: recap (3)
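Step 2: From estimator to estimate (second error): $x_1, \dots, x_n$ fixed, so the estimate $\hat P_n$ of $P$ is fixed
i) If the bootstrap estimator $Q_{\hat P_n}$ of the distribution of $T_n$ has an explicit expression, then done.
ii) If not, then estimate the estimate: use a bootstrap (sampling) scheme to estimate $Q_{\hat P_n}$ by the empirical distribution of $T^*_{n,1}, \dots, T^*_{n,B}$, where $T^*_{n,i} = T_n(X^*_{i,1}, \dots, X^*_{i,n})$ and $X^*_{i,1}, \dots, X^*_{i,n}$ are drawn from $\hat P_n$.
The empirical distribution of $T^*_{n,1}, \dots, T^*_{n,B}$ is stochastic: an estimator. The empirical distribution of the simulated realizations $t^*_{n,1}, \dots, t^*_{n,B}$ is the estimate.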
Bootstrap: recap (4)
Obtain the empirical distribution of simulated realizations of $T^*_n$ with the bootstrap (sampling) scheme.
With the B bootstrap values, get an impression of (characteristics of) the unknown distribution of Tn:
draw histogram
compute sample variance
compute sample sd
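The sampling scheme and the three summaries can be sketched in R as follows. This is a minimal sketch under assumptions: the data vector `x`, the median as statistic, and `B = 1000` are illustrative choices that do not come from the lecture. P is estimated by the empirical distribution, so bootstrap samples are drawn with replacement from `x`.

```r
set.seed(1)                       # reproducible illustration
x <- rexp(50)                     # hypothetical data x_1, ..., x_n
B <- 1000                         # number of bootstrap samples (assumed)

# T_n: the statistic of interest; the median is only an example
Tn <- function(s) median(s)

# Draw B bootstrap samples from the empirical distribution of x
# and compute the statistic on each of them
tstar <- replicate(B, Tn(sample(x, replace = TRUE)))

hist(tstar, prob = TRUE)          # impression of the distribution of T_n
var(tstar)                        # bootstrap estimate of Var(T_n)
sd(tstar)                         # bootstrap estimate of sd(T_n)
```

For a parametric bootstrap, the call to `sample()` would be replaced by a draw from the fitted parametric distribution.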
4.3. Bootstrap confidence intervals (1)
Tn : estimator of unknown parameter θ
Seen: accuracy of estimator Tn : variance of estimator’s distribution
Now: accuracy of estimator Tn : confidence interval
A (1 - 2α)×100% confidence interval for θ is an interval around Tn that contains the ‘true’ θ with probability ≥ 1 - 2α
If the interval is [Tn - b1, Tn + b2], how to determine b1 and b2?
(blackboard)
Bootstrap confidence intervals (2)
A (1 - 2α)×100% confidence interval for θ is an interval around Tn that contains the ‘true’ θ with probability ≥ 1 - 2α
If the interval is [Tn - b1, Tn + b2], then b1 and b2 are determined by
$[T_n - b_1,\; T_n + b_2] = [T_n - G^{-1}(1-\alpha),\; T_n - G^{-1}(\alpha)]$
with $G$ the distribution function of Tn – θ.
So b1 and –b2 are quantiles of the unknown distribution $G$: $b_1 = G^{-1}(1-\alpha)$ and $-b_2 = G^{-1}(\alpha)$.
How to estimate the quantiles b1 and –b2?
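The step from the coverage requirement to the quantiles can be written out as follows. The symbol $G$ for the distribution function of $T_n - \theta$ is notation assumed here, and $G$ is taken to be continuous and strictly increasing so that the quantiles are unique.

```latex
\begin{align*}
P(T_n - b_1 \le \theta \le T_n + b_2)
  &= P(-b_2 \le T_n - \theta \le b_1) \\
  &= G(b_1) - G(-b_2).
\end{align*}
Choosing $G(b_1) = 1-\alpha$ and $G(-b_2) = \alpha$ gives coverage $1-2\alpha$, so
\[
  b_1 = G^{-1}(1-\alpha), \qquad -b_2 = G^{-1}(\alpha),
\]
\[
  [T_n - b_1,\; T_n + b_2] = [\,T_n - G^{-1}(1-\alpha),\; T_n - G^{-1}(\alpha)\,].
\]
```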
Bootstrap confidence intervals (3)
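Interval is $[T_n - b_1,\; T_n + b_2] = [T_n - G^{-1}(1-\alpha),\; T_n - G^{-1}(\alpha)]$, with $G$ the distribution function of $T_n - \theta$
How to estimate the quantiles $b_1$ and $-b_2$ of the unknown distribution of $T_n - \theta$?
Estimate $G$ with $\hat G$: use bootstrap
Gives estimate of confidence interval:
$[T_n - \hat G^{-1}(1-\alpha),\; T_n - \hat G^{-1}(\alpha)]$ (4.1)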
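Bootstrap confidence intervals (4)
Estimate of confidence interval: $[T_n - \hat G^{-1}(1-\alpha),\; T_n - \hat G^{-1}(\alpha)]$ (4.1), with $\hat G$ an estimate of the distribution $G$ of $T_n - \theta$
In practice, determine in steps:
1. Estimate the unknown distribution $G$ of $T_n - \theta$ with $\hat G$: use bootstrap
Same as before? No: the statistic is now $T_n - \theta$, so we need bootstrap values $Z^*_n = T^*_n - T_n$
2. Estimate the quantiles of $\hat G$ by the empirical quantiles $z^*_{(\alpha)}$ and $z^*_{(1-\alpha)}$ of the $B$ bootstrap values of $Z^*_n$
3. Bootstrap confidence interval:
$[T_n - z^*_{(1-\alpha)},\; T_n - z^*_{(\alpha)}]$ (4.2)
(You have to know this formula!!)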
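The three steps can be sketched in R for a single sample. This is a minimal sketch under assumptions: the data `x`, the sample mean as estimator, `alpha = 0.05`, and `B = 2000` are illustrative and not taken from the lecture; `zstar` holds the bootstrap values of $T^*_n - T_n$, and the empirical bootstrap (resampling with replacement) is used.

```r
set.seed(2)
x <- rexp(40, rate = 1)            # hypothetical data
alpha <- 0.05                      # gives a 90% interval (1 - 2*alpha)
B <- 2000                          # number of bootstrap samples (assumed)

Tn <- mean(x)                      # estimator T_n of theta (here: the mean)

# Step 1: bootstrap values Z* = T_n* - T_n
tstar <- replicate(B, mean(sample(x, replace = TRUE)))
zstar <- tstar - Tn

# Step 2: empirical alpha- and (1 - alpha)-quantiles of the Z* values
q <- quantile(zstar, c(alpha, 1 - alpha), names = FALSE)

# Step 3: interval [T_n - z*_(1-alpha), T_n - z*_(alpha)]
ci <- c(Tn - q[2], Tn - q[1])
ci
```

Note that the upper quantile of `zstar` determines the lower endpoint and vice versa, which is why this construction is sometimes called the reflection method.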
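Bootstrap confidence intervals (5)
We just discussed:
Estimate of confidence interval: $[T_n - \hat G^{-1}(1-\alpha),\; T_n - \hat G^{-1}(\alpha)]$ (4.1)
Corresponding bootstrap confidence interval: $[T_n - z^*_{(1-\alpha)},\; T_n - z^*_{(\alpha)}]$ (4.2)
This is the original bootstrap confidence interval, also called reflection method. We will use this one!!
Other method: percentile method
Estimate of confidence interval: $[T_n + \hat G^{-1}(\alpha),\; T_n + \hat G^{-1}(1-\alpha)]$
Corresponding bootstrap confidence interval: $[T_n + z^*_{(\alpha)},\; T_n + z^*_{(1-\alpha)}]$
Only suitable if the distribution of $T_n - \theta$ is symmetric around 0. (Asymptotically the two methods give the same result.)
Here $\hat G$ estimates the distribution $G$ of $T_n - \theta$, and $z^*_{(\alpha)}$ is the empirical α-quantile of the bootstrap values $Z^*_n = T^*_n - T_n$.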
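To see how the two methods differ on skewed data, both intervals can be computed from the same bootstrap values. This is a minimal sketch under assumptions: the data, the median as estimator, `alpha`, and `B` are illustrative and not from the lecture.

```r
set.seed(3)
x <- rexp(40)                      # hypothetical, skewed data
alpha <- 0.05
B <- 2000

Tn <- median(x)
tstar <- replicate(B, median(sample(x, replace = TRUE)))
zstar <- tstar - Tn                # bootstrap values of T_n* - T_n
q <- quantile(zstar, c(alpha, 1 - alpha), names = FALSE)

reflection <- c(Tn - q[2], Tn - q[1])   # [T_n - z*_(1-a), T_n - z*_(a)]
percentile <- c(Tn + q[1], Tn + q[2])   # [T_n + z*_(a),   T_n + z*_(1-a)]

# The percentile interval equals the quantiles of the T_n* values themselves
all.equal(percentile, quantile(tstar, c(alpha, 1 - alpha), names = FALSE))
rbind(reflection, percentile)
```

Both intervals have the same length; the reflection interval is the percentile interval mirrored around `Tn`, so they coincide only when the quantiles of `zstar` are symmetric around 0.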
Bootstrap confidence intervals (5)
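How to obtain the (sample) α-quantile $z^*_{(\alpha)}$ of the bootstrap values?
R: if zstar contains the bootstrap values and alpha the value of α
> quantile(zstar, alpha)
Note: $T^*_n$ is always the same function of $X^*_1, \dots, X^*_n$ as $T_n$ is of $X_1, \dots, X_n$
For two samples $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ the method is the same
Example: if $T_{n,m} = \bar X_n - \bar Y_m$, then $T^*_{n,m} = \bar X^*_n - \bar Y^*_m$
and $Z^*_n = \bar X^*_n - \bar Y^*_m - (\bar X_n - \bar Y_m)$ (cf. Example 4.4 in Reader)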
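The two-sample case can be sketched in R along the same lines. This is a minimal sketch, not the reader's Example 4.4: the samples `x` and `y`, `alpha`, and `B` are hypothetical, and each sample is resampled separately with replacement.

```r
set.seed(4)
x <- rnorm(30, mean = 5, sd = 2)   # hypothetical first sample
y <- rnorm(25, mean = 4, sd = 2)   # hypothetical second sample
alpha <- 0.05
B <- 2000

Tnm <- mean(x) - mean(y)           # difference of sample means

# Bootstrap values: same function of the resampled data, minus T_{n,m}
zstar <- replicate(B, {
  xstar <- sample(x, replace = TRUE)
  ystar <- sample(y, replace = TRUE)
  (mean(xstar) - mean(ystar)) - Tnm
})

q <- quantile(zstar, c(alpha, 1 - alpha), names = FALSE)
ci <- c(Tnm - q[2], Tnm - q[1])    # [T - z*_(1-alpha), T - z*_(alpha)]
ci
```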
4.4. Bootstrap Tests (1)
Remember last week’s slide:
From lecture 3: Kolmogorov-Smirnov test (5)
Example
Data: y
H0: F is normal ← composite null hypothesis
H1: F is not normal
Test statistic: $D_{adj}$, computed with the sample mean and sample sd of y
R:
> ks.test(y,pnorm)
D = 0.6922, p-value = 6.661e-16
Correct? Incorrect: this is a test for
H0: F = N(0,1)
H1: F ≠ N(0,1)
> ks.test(y,pnorm,mean=mean(y),sd=sd(y))
D = 0.1081, p-value = 0.5655
> mean(y)
[1] 3.62158
> sd(y)
[1] 3.043356
Correct? Incorrect: this is a test for
H0: F = N(3.62158, (3.04335)²)
H1: F ≠ N(3.62158, (3.04335)²)
We have not used $D_{adj}$!! The p-value should be 0.126 (next week)
Bootstrap Tests (2)
Solve this with bootstrap test!
General idea on blackboard
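One common way to set up such a bootstrap test is sketched below in R. It is not necessarily the scheme from the blackboard: the data `y`, the helper `Dadj`, and `B = 1000` are assumptions made for illustration. The idea is to simulate samples under the fitted normal distribution, recompute the statistic with re-estimated mean and sd on each sample, and compare the observed value with these simulated values.

```r
set.seed(5)
y <- rexp(50, rate = 0.3)          # hypothetical data; the lecture's y is not available
B <- 1000                          # number of bootstrap samples (assumed)

# Adjusted KS statistic: mean and sd are estimated from the sample itself
Dadj <- function(v) {
  unname(ks.test(v, "pnorm", mean = mean(v), sd = sd(v))$statistic)
}

d_obs <- Dadj(y)                   # observed value of the test statistic

# Simulate under H0 from the fitted normal and recompute the statistic,
# re-estimating mean and sd on every bootstrap sample
dstar <- replicate(B, Dadj(rnorm(length(y), mean(y), sd(y))))

# Bootstrap p-value: fraction of simulated statistics at least as large
pval <- mean(dstar >= d_obs)
pval
```

Because the parameters are re-estimated on every simulated sample, the simulated values reflect the null distribution of the adjusted statistic, which is what the plain `ks.test` p-value ignores.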
Bootstrap Tests (3) Example
Bootstrap Tests (4) Example
R:
> hist(dprec, prob=TRUE)
> qqnorm(dprec)
Bootstrap Tests (5) Example
Bootstrap Tests (6) Example
Bootstrap Tests (7) Example
Bootstrap Tests (8) Example
Bootstrap Tests (9) Example
Recap
Bootstrap
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests
Bootstrap
The end