Inference Mary M. Whiteside, Ph.D. Nonparametric Statistics




Two Sides of Inference

Parametric
- Interval estimation, xbar
- Hypothesis testing, H0: μ = μ0

Nonparametric
- Interval estimates, EDF
- Hypothesis testing, P(X<Y) > P(X>Y)


Meaning of Nonparametric

- Not about parameters
- Methods for non-normal distributions
- Methods for ordinal data

Data Scales
- Nominal (categorical, qualitative)
- Ordinal
- Interval
- Ratio (natural zero)


Random Sample - Type 1

Random sample from a finite population
- Simple
- Stratified
- Cluster

Inferences are about the finite population
- Audit comprised of a sample from a population of invoices
- Public opinion polls
- QC samples of delivered goods


Random Sample - Type 2

Observations of (iid) random variables

Inferences are about the probability distributions of the random variables
- Weekly average miles per gallon for your new Lexus
- Chi square tests of independence in medical treatment offered men and women
- Effect of female literacy on infant mortality worldwide


Transition from data sets to distributions

All random variables, by definition, have probability functions (pmf or pdf) and cumulative probability distributions

Random variables defined on a random sample (Type 1 or 2) are called statistics; their probability distributions are called sampling distributions


Sampling Distributions

Statistics support both sides of inference
- Estimators - random variables used to create interval estimates
- Test statistics - random variables used to test hypotheses


Consider Xbar - a parametric statistic

Type 1 sample - subset of invoices where X = sales tax paid on an invoice randomly selected from a finite population
- Xbar is the average sales tax of n randomly selected invoices
- Xbar is an estimator of μ, the average sales tax paid for the population of invoices (with standard deviation σ)
- Xbar is a test statistic for testing hypotheses H0: μ = μ0
- Xbar is a random variable with sampling distribution asymptotically normal as n increases, with mean μ and standard deviation σ/√n


Consider Xbar - a parametric statistic

Type 2 sample - the complete set of miles per gallon observations made by you since buying your Lexus, where X = mpg for your Lexus in a given week
- Xbar is the average mpg for n observations of X
- Xbar is an estimator of the expected value (μX) of the RV X
- Xbar is a test statistic for testing hypotheses H0: μ = μ0
- Xbar is a random variable with sampling distribution asymptotically normal as n increases, with mean μX and standard deviation σX/√n
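The claim about the sampling distribution of Xbar can be checked by simulation. The sketch below is only an illustration: the exponential model for X, the constants MU, SIGMA and N, and the helper name simulate_xbar are assumptions made for the example, not part of the slides. It draws many Type 2 samples and compares the mean and standard deviation of the resulting Xbar values with the theoretical mean and the standard deviation divided by the square root of n.

```python
import math
import random
import statistics

def simulate_xbar(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from n iid draws of draw(rng)."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

# Assumed (hypothetical) skewed model for X: exponential with mean 2 and sd 2.
MU, SIGMA, N = 2.0, 2.0, 50
xbars = simulate_xbar(lambda rng: rng.expovariate(1 / MU), n=N, reps=4000)

# Compare the simulated sampling distribution with the stated mean and sd.
print("mean of Xbar:", statistics.fmean(xbars), "theory:", MU)
print("sd of Xbar:  ", statistics.stdev(xbars), "theory:", SIGMA / math.sqrt(N))
```

Even though X itself is skewed here, a histogram of xbars would look close to a normal curve for n this large.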


X in the Type 1 sample

If X from a Type 1 sample is regarded as a random variable, then it has the discrete uniform distribution

Prob [X = x] = 1/N for all x in the population (where the N values of x are assumed to be unique)
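A small simulation makes the 1/N statement concrete. This is a sketch under assumptions: the five invoice amounts are made up, and drawing a single invoice with random.choice stands in for selecting one unit at random from the finite population.

```python
import random
from collections import Counter

# Made-up finite population of N = 5 distinct sales tax amounts.
population = [12.50, 7.25, 31.00, 4.75, 18.40]

rng = random.Random(1)
reps = 20000

# Draw one invoice at random, many times, and tabulate the observed values.
counts = Counter(rng.choice(population) for _ in range(reps))
rel_freq = {x: counts[x] / reps for x in population}

for x, f in rel_freq.items():
    print(x, round(f, 3), "vs 1/N =", 1 / len(population))
```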


Order statistics of rank k - a nonparametric statistic

the kth order statistic is the kth smallest observation

the first order statistic is the smallest observation in a sample

the nth order statistic is the largest

Large body of literature on sampling distributions of order statistics
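The definition translates directly into code: sort the sample and take the kth value. The function name order_statistic and the sample values below are hypothetical, chosen only to illustrate the definition.

```python
def order_statistic(sample, k):
    """Return the kth order statistic (k = 1 gives the smallest observation)."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and the sample size")
    return sorted(sample)[k - 1]

sample = [23.1, 19.8, 25.4, 21.0, 22.7]  # made-up observations, n = 5

smallest = order_statistic(sample, 1)            # first order statistic
largest = order_statistic(sample, len(sample))   # nth order statistic
middle = order_statistic(sample, 3)              # third smallest observation
```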


Estimation

Definitions
- EDF
- pth sample quantile
- sample mean, variance, and standard deviation
- unbiased estimators (S² and s²)
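These definitions can be sketched in a few lines. The quantile rule used here (the smallest observation whose EDF value is at least p) is one common convention and is an assumption, since the slides do not state which rule they use; the data are made up. The two variance calls differ only in the divisor: n for pvariance and n-1 for variance, and the n-1 version is the unbiased estimator of the variance under iid sampling.

```python
import math
import statistics

def edf(sample, x):
    """Empirical distribution function: fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def sample_quantile(sample, p):
    """pth sample quantile: smallest observation with EDF >= p (assumed convention)."""
    ordered = sorted(sample)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

data = [2.0, 4.0, 4.0, 5.0, 10.0]  # made-up sample

mean = statistics.fmean(data)
var_n = statistics.pvariance(data)         # divisor n
var_n_minus_1 = statistics.variance(data)  # divisor n - 1
sd = statistics.stdev(data)                # square root of the n - 1 version
```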


Intervals for parameter estimation

(point estimate - q*standard error of the estimator, point estimate - r*standard error of the estimator)

where r is the α/2 quantile and q is the (1-α/2) quantile from the sampling distribution of the standardized estimator, (point estimate - parameter)/standard error
- r equals -q in symmetric distributions with mean 0 (z = +/- 1.96 or t = +/- 2.02581), so the interval reduces to point estimate +/- q*standard error
- r does not equal -q in skewed distributions such as Chi squared and F
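As a worked illustration for the symmetric case, the sketch below builds a large-sample interval for a mean from standard normal quantiles. It assumes the standardized estimator (Xbar minus the mean, divided by the standard error) is approximately standard normal, so the interval runs from Xbar - q*SE to Xbar - r*SE; because r = -q, this is Xbar +/- q*SE. The function name z_interval and the data are hypothetical.

```python
import math
import statistics

def z_interval(sample, alpha=0.05):
    """Large-sample interval for the mean using standard normal quantiles."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error of Xbar
    z = statistics.NormalDist()                    # standard normal, mean 0, sd 1
    r = z.inv_cdf(alpha / 2)                       # alpha/2 quantile, about -1.96
    q = z.inv_cdf(1 - alpha / 2)                   # (1 - alpha/2) quantile, about +1.96
    return xbar - q * se, xbar - r * se

# Made-up weekly mpg observations; n is kept small only to keep the example short.
mpg = [24.1, 22.8, 25.3, 23.9, 24.6, 23.2, 25.0, 24.4, 22.9, 23.7]
lower, upper = z_interval(mpg)
```

For a skewed sampling distribution the same two-quantile recipe applies, but r and q must be looked up separately because r is no longer -q.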


Sampling distribution of the estimator

Parametric procedures - Assumed normal, or approximately normal based on the Central Limit Theorem and sample size
- Xbar is approximately normal if n is large
- Standardized Xbar, (Xbar - μ)/(s/√n), is t if X is normal and σ is unknown
- Xbar's distribution is unknown if X's distribution is unknown and n is small


Sampling distribution of the estimator

Nonparametric distribution-free procedures, i.e. the sampling distribution of the statistic (estimator or test statistic) is "free" from the distribution of X
- rank order statistics
- bootstrapped distributions - α/2 and 1-α/2 quantiles
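The bootstrap idea can be sketched as follows: resample the observed data with replacement, recompute the statistic each time, and read the lower and upper tail quantiles off the resulting bootstrap distribution. This is a minimal percentile-interval sketch, not necessarily the variant the slides have in mind; the function name, the number of resamples, and the data are assumptions.

```python
import random
import statistics

def bootstrap_percentile_interval(sample, stat, alpha=0.05, reps=2000, seed=0):
    """Percentile interval from the bootstrap distribution of stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on `reps` resamples drawn with replacement.
    boot = sorted(stat(rng.choices(sample, k=n)) for _ in range(reps))
    lo_idx = round((alpha / 2) * (reps - 1))        # position of the alpha/2 quantile
    hi_idx = round((1 - alpha / 2) * (reps - 1))    # position of the 1 - alpha/2 quantile
    return boot[lo_idx], boot[hi_idx]

# Made-up weekly mpg observations.
mpg = [24.1, 22.8, 25.3, 23.9, 24.6, 23.2, 25.0, 24.4, 22.9, 23.7,
       26.1, 21.9, 24.8, 23.5, 24.0]
low, high = bootstrap_percentile_interval(mpg, statistics.median)
```

No distributional form for X is assumed; only the observed sample is used.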


Parametric vs nonparametric sampling distributions

Exact distributions with approximate models

Exact distributions with exact models (but usually small samples)

or

Asymptotic distributions with exact models