BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS
Mathematical expectation
The mean (x̄) of random variable x is:
x̄ = (Σ xᵢ) / n
where n is the number of observations. The variance (s²) is:
s² = Σ (xᵢ - x̄)² / (n - 1)
The standard deviation (s) is:
s = √(s²)
The coefficient of variation is:
CV = s / x̄
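As a quick check of these definitions, a minimal Python sketch; the sample of fish lengths is hypothetical:

import math

x = [152, 148, 160, 155, 149, 158, 151, 157]  # hypothetical fish lengths (mm)

n = len(x)
mean = sum(x) / n                                   # x-bar
var = sum((xi - mean) ** 2 for xi in x) / (n - 1)   # s^2, sample variance
sd = math.sqrt(var)                                 # s
cv = sd / mean                                      # coefficient of variation

print(f"mean={mean:.2f}, s2={var:.2f}, s={sd:.2f}, CV={cv:.3f}")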
Basic probability
The probability of an event occurring is expressed as P(event).
The probability of the event not occurring is 1 - P(event), or P(~event).
If events are independent, the probability of events A and B both occurring is estimated as P(A) * P(B).
Probability example
The probability of catching a fish during a single sampling occasion is p(capture).
The probability of catching it on all three sampling occasions is p(capture)*p(capture)*p(capture) = p(capture)³.
The probability of not catching it during any of the 3 occasions is (1 - p(capture))*(1 - p(capture))*(1 - p(capture)) = (1 - p(capture))³.
The probability of catching it on at least 1 occasion is the complement of not catching it during any of the occasions: 1 - (1 - p(capture))³.
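The same arithmetic in a short Python sketch, assuming a hypothetical per-occasion capture probability of 0.3:

p_capture = 0.3  # hypothetical per-occasion capture probability

p_all_three = p_capture ** 3        # caught on all 3 occasions
p_none = (1 - p_capture) ** 3       # caught on none of the 3 occasions
p_at_least_one = 1 - p_none         # caught on at least 1 occasion

print(round(p_all_three, 3), round(p_none, 3), round(p_at_least_one, 3))
# 0.027 0.343 0.657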
Models and fisheries management
“True” models
• Fundamental assumption: there is no “true” model that generates biological data.
•Truth in biological sciences has essentially infinite dimension; hence, full reality cannot be revealed with finite samples.
•Biological systems are complex with many small effects, interactions, individual heterogeneity, and environmental covariates.
•Thus all models are approximations of reality
•Greater amounts of data are required to model smaller effects.
• Several models can represent a single hypothesis
Models = hypotheses
• Models are tools for evaluating hypotheses
• Models are very explicit representations of hypotheses
• Hypotheses are unproven theories, suppositions that are tentatively accepted to explain facts or as the basis for further investigation
Models and hypotheses
Hypothesis: shoal bass reproductive success is greater when there are more reproductively active adults
Y = aN: number of young is proportional to the number of adults.
Y = aN/(1+bN): number of young increases with the number of adults until nesting areas are saturated.
Y = aN*e^(-bN): number of young increases until the carrying capacity of nesting and rearing areas is reached.
Models and hypotheses: example
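To make the competing hypotheses concrete, here is a minimal sketch that evaluates each model as a function; the parameter values a and b are purely hypothetical:

import math

a, b = 2.0, 0.05  # hypothetical parameters

def proportional(N):      # Y = a*N
    return a * N

def beverton_holt(N):     # Y = a*N/(1 + b*N): saturates as nesting areas fill
    return a * N / (1 + b * N)

def ricker(N):            # Y = a*N*exp(-b*N): limited by carrying capacity
    return a * N * math.exp(-b * N)

for N in (10, 50, 100):
    print(N, proportional(N), round(beverton_holt(N), 1), round(ricker(N), 1))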
Tapering effect sizes
• In biological systems there are often large, important effects, followed by smaller effects, and then yet smaller effects.
• These effects might be sequentially revealed as sample size increases because information content increases
• Rare events are yet more difficult to study (e.g., fire, flood, volcanism)
(Figure: effect sizes taper from a few big effects to many small effects.)
Model selection
• Determine what is the best explanation given the data
• Determine what is the best model for predicting the response
• Two approaches in fisheries/ecology: null hypothesis testing and information theoretic approaches
Null hypothesis testing
1. Develop an a priori hypothesis
2. Deduce testable predictions (i.e., models)
3. Carry out a suitable test (experiment)
4. Compare test results with predictions
5. Retain or reject the hypothesis
Hypothesis testing example: density independence for lake sturgeon populations
Hypothesis: lake sturgeon reproduction is density independent
Prediction: there is no relation between adult density and age 0 density
Test: measure age 0 density for various adult densities over time
Compare: linear regression between age 0 and adult sturgeon densities, P value = 0.1839
Result: retain the hypothesis that lake sturgeon reproduction is density independent
Using a critical α-level = 0.05, we conclude there is no significant relationship
Model: Y = β0
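A sketch of this comparison in Python using scipy; the density data here are hypothetical stand-ins, so the p-value will not match the 0.1839 reported above:

from scipy.stats import linregress

# Hypothetical paired densities observed over several years
adult_density = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 2.5]
age0_density = [5.2, 4.8, 6.1, 5.0, 5.9, 4.6, 5.5]

result = linregress(adult_density, age0_density)
alpha = 0.05  # critical level
if result.pvalue >= alpha:
    print(f"p = {result.pvalue:.4f} >= {alpha}: retain H0 (density independence)")
else:
    print(f"p = {result.pvalue:.4f} < {alpha}: reject H0")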
Model selection based on p-values
• No theoretical basis for model selection
• P-values ~ precision of estimate
• P-values strongly dependent on sample size
P(the data (or more extreme data)| Model) vs. L(model | the data)
JUST SAY NO TO STATISTICAL SIGNIFICANCE TESTING
If you really need a p-value….
MARK implements likelihood ratio tests (nested models only).
e.g., Full model: S = f(temperature, flow); Nested model: S = f(flow)
H0: survival related to flow; Ha: survival related to temperature and flow
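A sketch of the likelihood ratio test arithmetic; the log likelihoods and parameter counts below are hypothetical (in practice MARK reports these values):

from scipy.stats import chi2

# Hypothetical maximized log likelihoods and parameter counts
lnL_full, K_full = -120.4, 4        # full model: S = f(temperature, flow)
lnL_nested, K_nested = -123.1, 3    # nested model: S = f(flow)

lrt = -2 * (lnL_nested - lnL_full)  # likelihood ratio test statistic
df = K_full - K_nested              # difference in number of parameters
p = chi2.sf(lrt, df)                # upper-tail chi-square probability

print(f"LRT = {lrt:.2f}, df = {df}, p = {p:.4f}")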
Information theory
If full reality cannot be included in a model, how do we tell how close we are to truth?
Entropy is synonymous with uncertainty
The Kullback-Leibler (K-L) distance is based on information theory.
It measures how much information is lost when a model is used to approximate truth.
K-L distance (information) is represented by: I(truth | model)
AIC is based on the concept of minimizing K-L distance
It represents information lost when the candidate model is used to approximate truth; thus SMALL values mean better fit.
Akaike noticed that the maximum log likelihood, log(L(model or parameter estimate | the data)), was related to K-L distance.
Information theory
Sums of squares in regression are also a measure of the relative fit of a model.
What is a maximum likelihood estimate?
It is those parameter values that maximize the value of the likelihood, given the data.
(Figure: scatterplot of data with a fitted regression line; SSE = Σ(deviations)².)
The maximum log likelihood (and SSE) is a biased estimate of K-L distance
Akaike's contribution was that he showed that:
AIC = -2*ln(L(model | the data)) + 2*K
The first term measures model lack of fit; the second is a penalty for increasing model size (it enforces parsimony).
It is based on the principle of parsimony
(Figure: heuristic interpretation of parsimony; as the number of parameters increases from few to many, bias² decreases while variance increases.)
AIC: small sample bias adjustment
AICc = -2*ln(likelihood | data) + 2*K + (2*K*(K+1))/(n-K-1)
If the ratio n/K is < 40, then use AICc.
As n gets big, the correction term (2*K*(K+1))/(n-K-1) becomes a fixed numerator over a very large denominator and approaches 0.
So… AICc converges to AIC.
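A minimal sketch of both formulas, with a hypothetical log likelihood, showing the correction vanish as n grows:

def aic(lnL, K):
    """AIC = -2*ln(likelihood) + 2*K."""
    return -2 * lnL + 2 * K

def aicc(lnL, K, n):
    """Small-sample corrected AIC; recommended when n/K < 40."""
    return aic(lnL, K) + (2 * K * (K + 1)) / (n - K - 1)

# Hypothetical log likelihood with K = 4 parameters
for n in (20, 100, 10000):
    print(n, round(aicc(-120.4, 4, n) - aic(-120.4, 4), 4))  # correction -> 0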
AIC by itself is relatively meaningless. Recall that we find the best model by comparing various models and examining their relative distance to the “truth”.
Model selection with AIC
What is model selection?
We do this by calculating the difference between the best fitting model (lowest AIC) and the other models.
Model selection uncertainty
Which model is the best? What if you collect data at the same spot next year, next week, next door?
AIC weights: long-run interpretation vs. Bayesian.
A confidence set of models is analogous to a confidence interval.
Interpreting AIC
Best model (lowest AICc)
Difference between the lowest AIC and each model's AIC (relative distance from truth)
Interpreting AIC
AICc weight ranges from 0 to 1, with 1 = best model.
Interpreted as the relative likelihood that a model is the best, given the data and the other models in the set.
Interpreting AIC
Ratio of 2 weights interpreted as the strength of evidence for one model over another
Here the best model is 0.86748/0.13056 = 6.64 times more likely to be the best model for estimating striped bass population size.
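A sketch of the full chain from AICc values to weights and evidence ratios; the three candidate models and their AICc values are hypothetical:

import math

aicc_vals = {"M1": 210.3, "M2": 214.1, "M3": 219.8}  # hypothetical model set

best = min(aicc_vals.values())
delta = {m: v - best for m, v in aicc_vals.items()}         # distance from best model
rel_lik = {m: math.exp(-d / 2) for m, d in delta.items()}   # relative likelihoods
total = sum(rel_lik.values())
weights = {m: rl / total for m, rl in rel_lik.items()}      # Akaike weights, sum to 1

print({m: round(w, 3) for m, w in weights.items()})
print(round(weights["M1"] / weights["M2"], 2), "times more likely to be best")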
Confidence model set
Using a 1/8 (0.125) rule for weight of evidence, my confidence set includes the top two models (both model likelihoods > 0.125).
Analogous to a confidence interval for a parameter estimate
Linear models review
Y: response variable (dependent variable)
X: predictor variable (independent variable)
Y = β0 + β1*X + e
β0 is the intercept, β1 is the slope (parameter) associated with X, and e is the residual error.
Linear models review
When Y is a probability, it is bounded by 0 and 1.
Y = β0 + β1*X can provide values < 0 and > 1, so we need to transform or use a link function.
For probabilities, the logit link is the most useful
Log linear models (logistic regression)
θ = β0 + β1*X is the log odds.
β0 is the intercept; β1 is the slope (parameter) associated with X.
Betas are on a logit scale, and the log odds needs to be back-transformed.
Back transformation: inverse logit link
p = 1 / (1 + exp(-θ))
where θ is the log odds and p is the probability of an event.
Back transformation example
θ = -2.5 + 0.5*2 = -1.5
p = 1 / (1 + exp(1.5)) = 0.18, or 18%
Interpreting beta estimates
Betas are on a logit scale; to interpret them, calculate odds ratios using the exponential function.
β1 = 0.5
exp(0.5) = 1.65
Interpretation: for each 1 unit increase in X, the event is 1.65 times more likely to occur.
For example, for each 1 inch increase in length, a fish is 1.65 times more likely to be caught.
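Both worked examples in a few lines of Python, using the slide values (β0 = -2.5, β1 = 0.5, X = 2):

import math

b0, b1, x = -2.5, 0.5, 2

theta = b0 + b1 * x                 # log odds: -1.5
p = 1 / (1 + math.exp(-theta))      # inverse logit: ~0.18
odds_ratio = math.exp(b1)           # ~1.65 per 1-unit increase in X

print(round(theta, 2), round(p, 2), round(odds_ratio, 2))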
Overdispersion
Extra variability
• Missing covariates
• Heterogeneity in S, p, etc.
Possible solutions
• Include additional covariates
• Heterogeneity models
• c-hat adjustment in MARK: quasi-AIC (QAIC), adjusted variances and confidence intervals
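A sketch of the quasi-AICc adjustment under the usual conventions; the log likelihood, parameter count, sample size, and c-hat value are hypothetical (K conventionally includes one extra parameter for estimating c-hat):

def qaicc(lnL, K, n, c_hat):
    """Quasi-AICc: deviance divided by the overdispersion estimate c-hat."""
    return (-2 * lnL) / c_hat + 2 * K + (2 * K * (K + 1)) / (n - K - 1)

# c-hat > 1 signals overdispersion; it also inflates precision estimates:
# SE_adjusted = SE * sqrt(c_hat)
print(round(qaicc(-120.4, 5, 60, c_hat=1.8), 2))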