(Better) Bootstrap Confidence Intervals
TAU Bootstrap Seminar 2011
Dr. Saharon Rosset
Shachar Kaufman
Based on Efron and Tibshirani’s “An introduction to the bootstrap”
Chapter 14
Agenda
• What’s wrong with the simpler intervals?
• The (nonparametric) BCa method
• The (nonparametric) ABC method
  – Not really
Example: simpler intervals are bad
• Under the assumption that the data are i.i.d. draws from a known parametric family
  – Have exact analytical interval
  – Can do parametric bootstrap
• Under the assumption only that the data are i.i.d. draws from some unknown distribution
  – Can do nonparametric bootstrap
Why are the simpler intervals bad?
• Standard (normal) confidence interval assumes symmetry around $\hat{\theta}$
• Bootstrap-t often erratic in practice
  – “Cannot be recommended for general nonparametric problems”
• Percentile suffers from low coverage
  – Assumes the nonparametric distribution of $\hat{\theta}^*$ is representative of that of $\hat{\theta}$ (e.g. has mean $\theta$ like $\hat{\theta}$ does)
• Standard & percentile methods assume homogeneous behavior of $\hat{\theta}$, whatever $\theta$ is
  – (e.g. standard deviation of $\hat{\theta}$ does not change with $\theta$)
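To make the symmetry point concrete, here is a minimal sketch of the standard (normal) interval built from a bootstrap standard error. The helper names `bootstrap_replicates` and `standard_interval`, the toy data, and the choice of statistic are illustrative assumptions, not part of the original slides.

```python
import random
from statistics import NormalDist

def bootstrap_replicates(x, stat, B=1000, seed=0):
    """Nonparametric bootstrap: recompute `stat` on B resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    return [stat(rng.choices(x, k=n)) for _ in range(B)]

def standard_interval(theta_hat, reps, alpha=0.05):
    """theta_hat -/+ z * se, with se taken as the stdev of the bootstrap replicates."""
    m = sum(reps) / len(reps)
    se = (sum((r - m) ** 2 for r in reps) / (len(reps) - 1)) ** 0.5
    z = NormalDist().inv_cdf(1.0 - alpha)
    return theta_hat - z * se, theta_hat + z * se

def variance(v):
    """Plug-in variance (divides by n)."""
    m = sum(v) / len(v)
    return sum((t - m) ** 2 for t in v) / len(v)

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 10.0, 4.0, 6.0]  # toy data, hypothetical
    reps = bootstrap_replicates(data, variance)
    print(standard_interval(variance(data), reps))
```

Whatever the shape of the replicates, the two endpoints sit at equal distances from the point estimate, which is exactly the assumption questioned above.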
A more flexible inference model
Account for higher-order statistics of $\hat{\theta}^*$: mean, standard deviation, skewness
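A more flexible inference model
• If $\hat{\theta} \sim N(\theta, \sigma^2)$ doesn’t work for the data, maybe we could find a monotone transform $\phi = m(\theta)$, $\hat{\phi} = m(\hat{\theta})$, and constants $z_0$ and $a$ for which we can accept that
  $\hat{\phi} \sim N(\phi - z_0 \sigma_\phi,\ \sigma_\phi^2)$, with $\sigma_\phi = 1 + a\phi$
• Additional unknowns
  – $m(\cdot)$ allows a flexible parameter-description scale
  – $z_0$ allows bias: $E[\hat{\phi}] = \phi - z_0 \sigma_\phi$
  – $a$ allows “$\sigma$” to change with $\theta$
• As we know, “more flexible” is not necessarily “better”
• Under broad conditions, in this case it is (TBD)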
Where does this new model lead?
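Assume known $m(\cdot)$ and $a$, and initially that $z_0 = 0$, hence
  $\hat{\phi} \sim N(\phi, \sigma_\phi^2)$, with $\sigma_\phi = 1 + a\phi$
Calculate a standard $\alpha$-confidence endpoint from this, using the stdev at $\hat{\phi}$
  $\phi^{(1)} = \hat{\phi} + z^{(\alpha)} \sigma_{\hat{\phi}}$, with $\sigma_{\hat{\phi}} = 1 + a\hat{\phi}$
Now reexamine the actual stdev, this time assuming that $\phi = \phi^{(1)}$
According to the model, it will be
  $1 + a\phi^{(1)} = \sigma_{\hat{\phi}} (1 + a z^{(\alpha)})$
Where does this new model lead?
Ok but this leads to an updated endpoint
  $\phi^{(2)} = \hat{\phi} + z^{(\alpha)} \sigma_{\hat{\phi}} (1 + a z^{(\alpha)})$
Which leads to an updated stdev
  $1 + a\phi^{(2)} = \sigma_{\hat{\phi}} (1 + a z^{(\alpha)} + (a z^{(\alpha)})^2)$
If we continue iteratively to infinity this way (a geometric series, convergent for $|a z^{(\alpha)}| < 1$) we end up with the confidence interval endpoint
  $\phi[\alpha] = \hat{\phi} + \sigma_{\hat{\phi}} \frac{z^{(\alpha)}}{1 - a z^{(\alpha)}}$
Where does this new model lead?
• Do this exercise considering $z_0 \neq 0$ and get
  $\phi[\alpha] = \hat{\phi} + \sigma_{\hat{\phi}} \frac{z_0 + z^{(\alpha)}}{1 - a (z_0 + z^{(\alpha)})}$
• Similarly for $\theta$, with $\theta[\alpha] = m^{-1}(\phi[\alpha])$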
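The iteration described above can be checked numerically. The sketch below assumes the model stdev $1 + a\phi$ and $z_0 = 0$; it repeatedly re-evaluates the stdev at the current candidate endpoint and compares the limit with the closed form $\hat{\phi} + (1 + a\hat{\phi})\, z / (1 - a z)$. The function names and the values of `a` and `phi_hat` are illustrative assumptions.

```python
from statistics import NormalDist

def iterate_endpoint(a, z, phi_hat=0.0, steps=200):
    """Fixed-point iteration: endpoint -> stdev at that endpoint -> new endpoint."""
    phi = phi_hat  # start from the point estimate
    for _ in range(steps):
        sigma = 1.0 + a * phi      # model stdev evaluated at the current endpoint
        phi = phi_hat + z * sigma  # standard endpoint using that stdev
    return phi

def closed_form_endpoint(a, z, phi_hat=0.0):
    """Limit of the iteration (geometric series), valid when |a * z| < 1."""
    return phi_hat + (1.0 + a * phi_hat) * z / (1.0 - a * z)

if __name__ == "__main__":
    z = NormalDist().inv_cdf(0.95)  # standard normal percentile z^(alpha)
    a = 0.1                         # hypothetical acceleration constant
    print(iterate_endpoint(a, z, phi_hat=0.5), closed_form_endpoint(a, z, phi_hat=0.5))
```

The two printed numbers should agree to floating-point precision as long as $|a z| < 1$; outside that range the iteration diverges and the closed form no longer describes its limit.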
Enter BCa
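• “Bias-corrected and accelerated”
• Like percentile confidence interval
  – Both ends are percentiles $\hat{\theta}^{*(\alpha_1)}$, $\hat{\theta}^{*(\alpha_2)}$ of the $B$ bootstrap instances of $\hat{\theta}^*$
  – Just not the simple $\alpha_1 = \alpha$, $\alpha_2 = 1 - \alpha$
BCa
• Instead
  $\alpha_1 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(\alpha)})}\right)$
  $\alpha_2 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(1-\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(1-\alpha)})}\right)$
• $\hat{z}_0$ and $\hat{a}$ are parameters we will estimate
  – When both are zero, we get the good-old percentile CI
• Notice we never had to explicitly find $m(\cdot)$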
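A minimal sketch of the level adjustment, using the standard BCa formula from Efron and Tibshirani's Chapter 14. The function name `bca_levels` is an illustrative assumption.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal: cdf is Phi, inv_cdf gives z^(alpha)

def bca_levels(z0, a, alpha=0.05):
    """Return (alpha_1, alpha_2), the percentile levels used by the BCa interval."""
    def adjust(level):
        z = _norm.inv_cdf(level)
        return _norm.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))
    return adjust(alpha), adjust(1.0 - alpha)

if __name__ == "__main__":
    print(bca_levels(0.0, 0.0))  # no correction: plain percentile levels
    print(bca_levels(0.1, 0.05)) # hypothetical z0 and a shift both levels
```

With `z0 = a = 0` the function returns `alpha` and `1 - alpha`, i.e. the ordinary percentile interval.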
BCa
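• $\hat{z}_0$ tackles bias
  $\hat{z}_0 = \Phi^{-1}\left(\#\{\hat{\theta}^*(b) < \hat{\theta}\} / B\right)$
  (the same proportion as $\#\{\hat{\phi}^*(b) < \hat{\phi}\} / B$, since $m$ is monotone)
• $\hat{a}$ accounts for a standard deviation of $\hat{\theta}$ which varies with $\theta$ (linearly, on the “normal scale” $\phi$)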
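The bias-correction constant is just the normal quantile of the fraction of bootstrap replicates that fall below the original estimate. In this sketch the name `estimate_z0` and the toy replicate values are assumptions; the second call illustrates why a monotone increasing transform leaves the estimate unchanged.

```python
import math
from statistics import NormalDist

def estimate_z0(theta_star, theta_hat):
    """z0 = Phi^{-1}(proportion of replicates strictly below theta_hat).

    Raises statistics.StatisticsError if that proportion is exactly 0 or 1.
    """
    below = sum(1 for t in theta_star if t < theta_hat)
    return NormalDist().inv_cdf(below / len(theta_star))

if __name__ == "__main__":
    reps = [0.8, 1.1, 1.3, 1.9, 2.4, 2.6, 3.0, 3.7]  # hypothetical replicates
    theta_hat = 2.5
    print(estimate_z0(reps, theta_hat))
    # Same value on a transformed scale, because exp preserves the ordering:
    print(estimate_z0([math.exp(t) for t in reps], math.exp(theta_hat)))
```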
BCa
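• One suggested estimator for $a$ is via the jackknife
  $\hat{a} = \frac{\sum_{i=1}^{n} (\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)})^3}{6 \left\{ \sum_{i=1}^{n} (\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)})^2 \right\}^{3/2}}$
  where $\hat{\theta}_{(i)}$ is the estimate computed with the $i$-th observation deleted
  and $\hat{\theta}_{(\cdot)} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}$
• You won’t find the rationale behind this formula in the book (though it is clearly related to one of the standard ways to define skewness)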
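Putting the pieces together, the following is a minimal sketch of a nonparametric BCa interval with the jackknife acceleration. The function names, the toy statistic, and the simple order-statistic rule used to pick the percentile index are assumptions made for illustration; the book discusses finer conventions for choosing the order statistic.

```python
import random
from statistics import NormalDist

def jackknife_acceleration(x, stat):
    """a_hat = sum(d^3) / (6 * (sum(d^2))^1.5), d_i = theta_(.) - theta_(i)."""
    n = len(x)
    theta_i = [stat(x[:i] + x[i + 1:]) for i in range(n)]  # leave-one-out estimates
    theta_dot = sum(theta_i) / n
    d = [theta_dot - t for t in theta_i]
    return sum(v ** 3 for v in d) / (6.0 * sum(v ** 2 for v in d) ** 1.5)

def bca_interval(x, stat, alpha=0.05, B=2000, seed=0):
    """Return (lower, upper) BCa endpoints for stat(x); x must be a list."""
    rng = random.Random(seed)
    nd = NormalDist()
    n = len(x)
    theta_hat = stat(x)
    theta_star = sorted(stat(rng.choices(x, k=n)) for _ in range(B))
    z0 = nd.inv_cdf(sum(t < theta_hat for t in theta_star) / B)  # bias correction
    a = jackknife_acceleration(x, stat)                           # acceleration

    def endpoint(level):
        z = nd.inv_cdf(level)
        adj = nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))
        k = min(B - 1, max(0, int(adj * (B + 1)) - 1))  # assumed index rule
        return theta_star[k]

    return endpoint(alpha), endpoint(1.0 - alpha)

def variance(v):
    """Plug-in variance (divides by n)."""
    m = sum(v) / len(v)
    return sum((t - m) ** 2 for t in v) / len(v)

if __name__ == "__main__":
    gen = random.Random(42)
    data = [gen.gauss(0.0, 1.0) for _ in range(30)]  # synthetic data, hypothetical
    print(variance(data), bca_interval(data, variance))
```

Note that the bootstrap replicates are used only for $\hat{z}_0$ and the final percentiles; the acceleration comes entirely from the $n$ leave-one-out estimates.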
Theoretical advantages of BCa
• Transformation respecting
  – If the interval for $\theta$ is $(\hat{\theta}_{lo}, \hat{\theta}_{up})$ then the interval for a monotone $\phi = m(\theta)$ is $(m(\hat{\theta}_{lo}), m(\hat{\theta}_{up}))$
  – So no need to worry about finding transforms of $\theta$ where confidence intervals perform well
    • Which is necessary in practice with bootstrap-t CI
    • And with the standard CI (e.g. Fisher’s correlation-coefficient transformation)
• Percentile CI is also transformation respecting
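Transformation respect is easy to see for the percentile interval, since a monotone increasing map preserves the ordering of the replicates. In this sketch the helper `percentile_interval`, its indexing rule, and the synthetic replicates are assumptions.

```python
import math
import random

def percentile_interval(values, alpha=0.05):
    """Endpoints taken directly as order statistics of the replicates."""
    s = sorted(values)
    B = len(s)
    return s[int(alpha * B)], s[int((1.0 - alpha) * B) - 1]

if __name__ == "__main__":
    rng = random.Random(0)
    reps = [rng.gauss(1.0, 0.3) for _ in range(1000)]  # stand-in replicates
    lo, hi = percentile_interval(reps)
    # Interval computed on the transformed replicates ...
    t_lo, t_hi = percentile_interval([math.exp(r) for r in reps])
    # ... matches the transformed endpoints of the original interval.
    print((math.exp(lo), math.exp(hi)), (t_lo, t_hi))
```

The standard interval has no such property: transforming $\hat{\theta} \pm z \cdot se$ generally gives a different interval from the one computed on the transformed scale.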
Theoretical advantages of BCa
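• Accuracy
  – We want $\hat{\theta}_{lo}$ s.t. $\mathrm{Prob}(\theta < \hat{\theta}_{lo}) = \alpha$
  – But a practical $\hat{\theta}_{lo}$ is an approximation where $\mathrm{Prob}(\theta < \hat{\theta}_{lo}) \approx \alpha$
  – BCa (and bootstrap-t) endpoints are “second order accurate”, where $\mathrm{Prob}(\theta < \hat{\theta}_{lo}) = \alpha + c_{lo}/n$
  – This is in contrast to the standard and percentile methods, which only converge at rate $1/\sqrt{n}$ (“first order accurate”): errors one order of magnitude greater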
But BCa is expensive
• The use of direct bootstrapping to calculate delicate statistics such as $\hat{z}_0$ and the tail percentiles $\hat{\theta}^{*(\alpha_1)}$, $\hat{\theta}^{*(\alpha_2)}$ requires a large number of replications $B$ to work satisfactorily
• Fortunately, BCa can be analytically approximated (with a Taylor expansion, for differentiable $\hat{\theta}$) so that no Monte Carlo simulation is required
• This is the ABC method, which retains the good theoretical properties of BCa
The ABC method
• Only an introduction (Chapter 22)
• Discusses the “how”, not the “why”
• For additional details see DiCiccio and Efron 1992 or 1996
The ABC method
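• Given the estimator in resampling form $\hat{\theta}^* = T(\mathbf{P}^*)$
  – Recall $\mathbf{P}^*$, the “resampling vector”, is an $n$-dimensional random variable with components $P_i^* = \#\{x_j^* = x_i\}/n$
  – Recall $\mathbf{P}^0 = (1/n, \dots, 1/n)$, for which $T(\mathbf{P}^0) = \hat{\theta}$
• Second-order Taylor analysis of the estimate
  – as a function of the bootstrap resampling vector, around $\mathbf{P}^0$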
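A small sketch of what “resampling form” means in practice: the statistic is written as a function of a weight vector over the original observations, and a bootstrap sample is represented by its vector of proportions. The names `weighted_variance` and `resampling_vector`, and the choice of the variance as the statistic, are illustrative assumptions.

```python
import random
from collections import Counter

def weighted_variance(P, x):
    """T(P): variance of the distribution putting mass P[i] on x[i]."""
    mu = sum(p * v for p, v in zip(P, x))
    return sum(p * (v - mu) ** 2 for p, v in zip(P, x))

def resampling_vector(n, rng):
    """P*: proportion of the bootstrap sample equal to each original point."""
    counts = Counter(rng.choices(range(n), k=n))
    return [counts[i] / n for i in range(n)]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 10.0]        # toy data, hypothetical
    n = len(x)
    P0 = [1.0 / n] * n               # equal weights reproduce the original estimate
    rng = random.Random(0)
    P_star = resampling_vector(n, rng)
    print(weighted_variance(P0, x), P_star, weighted_variance(P_star, x))
```

Evaluating `T` at `P0` gives the plug-in estimate, while evaluating it at a random `P_star` gives one bootstrap replicate without ever materializing the resampled data.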
The ABC method
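• Can approximate all the BCa parameter estimates (i.e. estimate the parameters in a different way)
  – $\hat{a} = \frac{1}{6} \sum_i \dot{T}_i^3 / \left(\sum_i \dot{T}_i^2\right)^{3/2}$ and $\hat{\sigma} = \left(\sum_i \dot{T}_i^2\right)^{1/2} / n$, where $\dot{T}_i$ is the derivative of $T$ at $\mathbf{P}^0$ in the direction of the $i$-th observation
  – $\hat{z}_0 = \Phi^{-1}\left(2\,\Phi(\hat{a})\,\Phi(-\hat{\gamma})\right)$, where $\hat{\gamma} = \hat{b}/\hat{\sigma} - \hat{c}_q$ and $\hat{b}$ is a bias estimate built from second derivatives of $T$
    • $\hat{c}_q$ is something akin to a Hessian component but along a specific direction not perpendicular to any natural axis (the “least favorable family” direction)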
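As a rough illustration of how the first-derivative quantities can be obtained without any resampling, the sketch below approximates the directional derivatives of $T$ at the equal-weight vector by finite differences and forms the standard error and acceleration from them. The function names, the step size `eps`, and the weighted mean used as `T` are assumptions; the curvature and bias terms of the full ABC method are not covered here.

```python
import math

def weighted_mean(P, x):
    """T(P) for the mean: sum of P[i] * x[i]."""
    return sum(p * v for p, v in zip(P, x))

def influence_values(T, x, eps=1e-4):
    """Finite-difference derivative of T at P0 towards each observation."""
    n = len(x)
    P0 = [1.0 / n] * n
    base = T(P0, x)
    out = []
    for i in range(n):
        P = [(1.0 - eps) * p for p in P0]  # shrink all weights ...
        P[i] += eps                        # ... and move mass eps onto point i
        out.append((T(P, x) - base) / eps)
    return out

def sigma_and_acceleration(T, x, eps=1e-4):
    """sigma_hat = sqrt(sum(Tdot^2)) / n, a_hat = sum(Tdot^3) / (6 * sum(Tdot^2)^1.5)."""
    t_dot = influence_values(T, x, eps)
    s2 = sum(t * t for t in t_dot)
    sigma = math.sqrt(s2) / len(x)
    a = sum(t ** 3 for t in t_dot) / (6.0 * s2 ** 1.5)
    return sigma, a

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 10.0]  # toy data, hypothetical
    print(influence_values(weighted_mean, x))
    print(sigma_and_acceleration(weighted_mean, x))
```

For the mean, each influence value reduces to the observation minus the sample mean, so the resulting standard error is the usual plug-in standard deviation divided by $\sqrt{n}$.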
The ABC method
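• And the ABC interval endpoint
  $\hat{\theta}_{ABC}[\alpha] = T\left(\mathbf{P}^0 + \lambda \hat{\delta} / \hat{\sigma}\right)$
• Where
  – $\lambda = \frac{w}{(1 - \hat{a} w)^2}$ with $w = \hat{z}_0 + z^{(\alpha)}$
  – $\hat{\delta}$ is the direction with components $\dot{T}_i / n^2$, $\dot{T}_i$ being the derivative of $T$ at $\mathbf{P}^0$ towards the $i$-th observation
• Simple and to the point, ain’t it?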