Inference Mary M. Whiteside, Ph.D. Nonparametric Statistics

Two Sides of Inference

Parametric
  Interval estimation, xbar
  Hypothesis testing, μ0

Nonparametric
  Interval estimates, EDF
  Hypothesis testing, P(X<Y) > P(X>Y)

Meaning of Nonparametric

Not about parameters
Methods for non-normal distributions
Methods for ordinal data

Data Scales

Nominal, categorical, qualitative
Ordinal
Interval
Ratio - natural zero

Random Sample - Type 1

Random sample from a finite population
  Simple
  Stratified
  Cluster

Inferences are about the finite population
  Audit consisting of a sample from a population of invoices
  Public opinion polls
  QC samples of delivered goods

Random Sample - Type 2

Observations of (iid) random variables

Inferences are about the probability distributions of the random variables
  Weekly average miles per gallon for your new Lexus
  Chi square tests of independence in medical treatment offered men and women
  Effect of female literacy on infant mortality worldwide

Transition from data sets to distributions

All random variables, by definition, have probability functions (pmf or pdf) and cumulative probability distributions

Random variables defined on a random sample (Type 1 or 2) are called statistics; their probability distributions are called sampling distributions

Sampling Distributions

Statistics support both sides of inference
  Estimators - random variables used to create interval estimates
  Test statistics - random variables used to test hypotheses

Consider Xbar - a parametric statistic

Type 1 sample - subset of invoices, where X = sales tax paid on an invoice randomly selected from a finite population
  Xbar is the average sales tax of n randomly selected invoices
  Xbar is an estimator of μ, the average sales tax paid for the population of invoices (with standard deviation σ)
  Xbar is a test statistic for testing hypotheses H0: μ = μ0
  Xbar is a random variable whose sampling distribution is asymptotically normal as n increases, with mean μ and standard deviation σ/√n

Consider Xbar - a parametric statistic

Type 2 sample - the complete set of miles per gallon observations made by you since buying your Lexus, where X = mpg for your Lexus in a given week
  Xbar is the average mpg for n observations of X
  Xbar is an estimator of the expected value μX = E(X) of the RV X
  Xbar is a test statistic for testing hypotheses H0: μX = μ0
  Xbar is a random variable whose sampling distribution is asymptotically normal as n increases, with mean μX and standard deviation σX/√n
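The claim about the sampling distribution of Xbar can be checked by simulation. The sketch below is only an illustration: it assumes, hypothetically, that X is exponential with mean 1 (a deliberately skewed choice, not a model of real mpg data), draws many Type 2 samples of size n, and compares the mean and standard deviation of the simulated Xbar values with μX and σX/√n. The function name and all numbers are made up for the example.

```python
import random
import statistics

def simulate_xbar(n, reps, seed=0):
    """Return `reps` simulated values of Xbar, each from n iid draws of X.

    X is assumed (for illustration only) to be exponential with mean 1,
    so its standard deviation is also 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

n = 50
means = simulate_xbar(n=n, reps=4000)

mean_of_means = statistics.fmean(means)  # should be close to mu_X = 1
sd_of_means = statistics.stdev(means)    # should be close to sigma_X / sqrt(n)
theoretical_sd = 1.0 / n ** 0.5

print(f"mean of simulated Xbar values: {mean_of_means:.4f}")
print(f"sd of simulated Xbar values:   {sd_of_means:.4f}")
print(f"sigma_X / sqrt(n):             {theoretical_sd:.4f}")
```

A histogram of `means` would look roughly bell-shaped even though X itself is strongly skewed, which is the asymptotic normality the slide refers to.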

X in the Type 1 sample

If X from a Type 1 sample is regarded as a random variable, then it has the discrete uniform distribution

Prob [X = x] = 1/N for all x in the population (where the N values of x are assumed to be unique)

Order statistics of rank k - a nonparametric statistic

the kth order statistic is the kth smallest observation

the first order statistic is the smallest observation in a sample

the nth order statistic is the largest

Large body of literature on sampling distributions of order statistics
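As a small illustration of the definition, the kth order statistic can be read off the sorted sample. The helper name `order_statistic` and the data values are hypothetical.

```python
def order_statistic(sample, k):
    """Return the kth order statistic (kth smallest value), with k counted from 1."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return sorted(sample)[k - 1]

# Made-up observations for illustration only.
sample = [23.1, 19.8, 25.4, 21.0, 22.7]

smallest = order_statistic(sample, 1)            # first order statistic
middle = order_statistic(sample, 3)              # third smallest of five
largest = order_statistic(sample, len(sample))   # nth order statistic

print(smallest, middle, largest)
```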

Estimation

Definitions
  EDF
  pth sample quantile
  sample mean, variance, and standard deviation
  unbiased estimators (S² and s²)
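The definitions listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the pth sample quantile is taken here as the smallest observation whose EDF value is at least p, which is one common convention and may differ from the one used in the course; the two variance estimates are labelled by their divisor (n and n - 1) because the slides do not say which symbol goes with which. The helper names and data are hypothetical.

```python
import bisect
import math
import statistics

def edf(sample):
    """Return the empirical distribution function F(x) = (# observations <= x) / n."""
    data = sorted(sample)
    n = len(data)
    def F(x):
        return bisect.bisect_right(data, x) / n
    return F

def sample_quantile(sample, p):
    """Smallest observation x with EDF(x) >= p (an assumed convention), for 0 < p <= 1."""
    data = sorted(sample)
    k = max(1, math.ceil(len(data) * p))
    return data[k - 1]

# Made-up observations for illustration only.
sample = [23.1, 19.8, 25.4, 21.0, 22.7]
n = len(sample)

F = edf(sample)
median_like = sample_quantile(sample, 0.5)

xbar = statistics.fmean(sample)
var_div_n = statistics.pvariance(sample)         # divisor n
var_div_n_minus_1 = statistics.variance(sample)  # divisor n - 1
sd = statistics.stdev(sample)                    # square root of the n - 1 version

print(F(22.7), median_like, xbar, var_div_n, var_div_n_minus_1, sd)
```

The two variance estimates differ only by the factor n/(n - 1), so they agree closely for large samples.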

Intervals for parameter estimation

(point estimate - q*standard error of the estimator, point estimate - r*standard error of the estimator), where r is the α/2 quantile and q is the (1-α/2) quantile from the sampling distribution of the standardized estimator
  r equals -q in symmetric distributions with mean 0 (z = +/- 1.96 or t = +/-2.02581), so the interval reduces to point estimate +/- q*standard error
  r does not equal -q in skewed distributions such as Chi squared and F
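A worked sketch of a quantile-based interval for a mean follows. It assumes the standardized estimator is approximately standard normal, so r and q come from `statistics.NormalDist`; for a sample this small a t quantile would usually be preferred, but the standard library has no t distribution. If (estimate - parameter)/SE lies between r and q with probability 1 - α, the parameter lies between estimate - q*SE and estimate - r*SE. The data are invented weekly mpg values and the function name is hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev

def normal_interval(sample, alpha=0.05):
    """Approximate (1 - alpha) interval for the mean using normal quantiles."""
    n = len(sample)
    estimate = fmean(sample)
    se = stdev(sample) / math.sqrt(n)      # estimated standard error of Xbar
    r = NormalDist().inv_cdf(alpha / 2)      # alpha/2 quantile (negative)
    q = NormalDist().inv_cdf(1 - alpha / 2)  # (1 - alpha/2) quantile (positive)
    return estimate - q * se, estimate - r * se

# Made-up weekly mpg observations for illustration only.
mpg = [24.1, 22.8, 25.3, 23.9, 26.0, 21.7, 24.8, 23.2, 25.5, 22.4, 24.4, 23.6]

r = NormalDist().inv_cdf(0.025)
q = NormalDist().inv_cdf(0.975)
lower, upper = normal_interval(mpg)

print(f"r = {r:.4f}, q = {q:.4f}")
print(f"interval: ({lower:.3f}, {upper:.3f})")
```

Because the normal distribution is symmetric about 0, r equals -q here and the interval is centered on the point estimate; with a skewed reference distribution the two offsets would differ.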

Sampling distribution of the estimator

Parametric procedures - assumed normal, or normal-based from the Central Limit Theorem and sample size
  Xbar is approximately normal if n is large
  Standardized Xbar, (Xbar - μ)/(s/√n), is t if X is normal and σ is unknown
  Xbar's distribution is unknown if X's distribution is unknown and n is small

Sampling distribution of the estimator

Nonparametric distribution-free procedures, i.e. the sampling distribution of the statistic (estimator or test statistic) is "free" from the distribution of X
  rank order statistics
  bootstrapped distributions - α/2 and 1-α/2 quantiles
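The bootstrap idea can be sketched as follows: resample the observed data with replacement many times, recompute the statistic each time, and take the α/2 and 1-α/2 quantiles of the resulting values as interval endpoints. This is the simple percentile variant, chosen here for brevity; the slides do not say which bootstrap interval is intended. The function name, the data, and the number of resamples are assumptions made for the example.

```python
import math
import random
import statistics

def bootstrap_percentile_interval(sample, stat, alpha=0.05, reps=2000, seed=0):
    """Percentile bootstrap interval for `stat`, using `reps` resamples."""
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on each resample drawn with replacement.
    boot = sorted(stat(rng.choices(sample, k=n)) for _ in range(reps))
    lo_index = math.floor((alpha / 2) * (reps - 1))
    hi_index = math.ceil((1 - alpha / 2) * (reps - 1))
    return boot[lo_index], boot[hi_index]

# Made-up weekly mpg observations for illustration only.
mpg = [24.1, 22.8, 25.3, 23.9, 26.0, 21.7, 24.8, 23.2, 25.5, 22.4, 24.4, 23.6]

lo, hi = bootstrap_percentile_interval(mpg, statistics.median)
print(f"bootstrap interval for the median: ({lo}, {hi})")
```

No distributional form for X is assumed; the sampling distribution of the median is approximated entirely from the observed sample.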

Parametric vs nonparametric sampling distributions

Exact distributions with approximate models

Exact distributions with exact models (but usually small samples)

or

Asymptotic distributions with exact models
