89

4-1 Statistical Inference

Embed Size (px)

DESCRIPTION

4-1 Statistical Inference. The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population . These methods utilize the information contained in a sample from the population in drawing conclusions. - PowerPoint PPT Presentation

Citation preview

Page 1: 4-1 Statistical Inference
Page 2: 4-1 Statistical Inference
Page 3: 4-1 Statistical Inference
Page 4: 4-1 Statistical Inference

4-1 Statistical Inference

• The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population.

• These methods utilize the information contained in

a sample from the population in drawing conclusions.

Page 5: 4-1 Statistical Inference

4-1 Statistical Inference

Page 6: 4-1 Statistical Inference

4-2 Point Estimation

Page 7: 4-1 Statistical Inference

4-2 Point Estimation

Page 8: 4-1 Statistical Inference

4-2 Point Estimation

Page 9: 4-1 Statistical Inference

4-3 Hypothesis Testing

We like to think of statistical hypothesis testing as the data analysis stage of a comparative experiment, in which the engineer is interested, for example, in comparing the mean of a population to a specified value (e.g. mean pull strength).

4-3.1 Statistical Hypotheses

Page 10: 4-1 Statistical Inference

4-3 Hypothesis Testing

For example, suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems.

• Now burning rate is a random variable that can be described by a probability distribution.

• Suppose that our interest focuses on the mean burning rate (a parameter of this distribution).

• Specifically, we are interested in deciding whether or not the mean burning rate is 50 centimeters per second.

4-3.1 Statistical Hypotheses

Page 11: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.1 Statistical HypothesesTwo-sided Alternative Hypothesis

One-sided Alternative Hypotheses

Page 12: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.1 Statistical HypothesesTest of a Hypothesis • A procedure leading to a decision about a particular

hypothesis

• Hypothesis-testing procedures rely on using the information in a random sample from the population of interest.

• If this information is consistent with the hypothesis, then we will conclude that the hypothesis is true; if this information is inconsistent with the hypothesis, we will conclude that the hypothesis is false.

Page 13: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

Page 14: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

Page 15: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

Sometimes the type I error probability is called the significance level, or the -error, or the size of the test.

Page 16: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical HypothesesReduce α1. push critical regions further2. Increasing sample size increasing Z value

Page 17: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

Page 18: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

error when μ =52 and n=16.

Increasing the sample size results in a decrease in β.

Page 19: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

error when μ =50.5 and n=10.

β increases as the true value of μ approaches the hypothesized value.

Page 20: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

Page 21: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.2 Testing Statistical Hypotheses

• The power is computed as 1 - b, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties.

• For example, consider the propellant burning rate problem whenwe are testing H 0 : m = 50 centimeters per second against H 1 : m not equal 50 centimeters per second . Suppose that the true value of the mean is m = 52. When n = 10, we found that b = 0.2643, so the power of this test is 1 - b = 1 - 0.2643 = 0.7357 when m = 52.

Page 22: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.3 P-Values in Hypothesis Testing

P-value carries info about the weight of evidence against H0.The smaller the p-value, the greater the evidence against H0.

Page 23: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.4 One-Sided and Two-Sided Hypotheses

Two-Sided Test:

One-Sided Tests:

Page 24: 4-1 Statistical Inference

4-3 Hypothesis Testing

4-3.5 General Procedure for Hypothesis Testing

Page 25: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

Assumptions

Page 26: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.1 Hypothesis Testing on the Mean

We wish to test:

The test statistic is:

Page 27: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.1 Hypothesis Testing on the Mean

Reject H0 if the observed value of the test statistic z0 is either:

or

Fail to reject H0 if

Page 28: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.1 Hypothesis Testing on the Mean

Page 29: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known4-4.1 Hypothesis Testing on the Mean

Page 30: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

Ex. 4-3X – burning rate μ = 50, s = 2, a = 0.05, n = 25, = 51.3

i) HypothesisH0: μ = 50 H1: μ ≠ 50

ii) Test Statistic = 1.3/0.4 = 3.25

iii) p-value = 2[1 – Φ(3.25)] = 0.0012Critical Region (CR) 또는 Region of Rejection (ROR)z0 > z0.025= 1.96 or z0< - z0.025= -1.96

iv) Conclusionp-value=0.0012 값이 유의수준 a = 0.05 보다 작기 때문에 ( 또는 , 통계량 =3.25 값이 기각역 (z0 = z0.025> 1.96) 에 포함됨으로 ) H0 를 기각함 . 즉 , 평균 burning rate 이 50m/sec 이 아니다라고 결론내림 .

Page 31: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

OPTIONS NODATE NONUMBER;DATA EX43;MU=50; SD=2; N=25; ALPHA=0.05; XBAR=51.3;SEM=SD/SQRT(N); /*MEAN STANDARD ERROR*/Z=(XBAR-MU)/SEM;PVAL=2*(1-PROBNORM(Z));PROC PRINT;

VAR Z PVAL ALPHA;TITLE 'TEST OF MU=50 VS NOT=50';

RUN; QUIT;

TEST OF MU=50 VS NOT=50

OBS Z PVAL ALPHA 1 3.25 .001154050 0.05

Page 32: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.2 Type II Error and Choice of Sample Size

Finding The Probability of Type II Error

Page 33: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.2 Type II Error and Choice of Sample Size Sample Size Formulas

Page 34: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.2 Type II Error and Choice of Sample Size

Sample Size Formulas

Page 35: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known4-4.2 Type II Error and Choice of Sample Size

Page 36: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known4-4.2 Type II Error and Choice of Sample Size

Page 37: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.3 Large Sample Test

In general, if n 30, the sample variance s2 will be close to σ2 for most samples, and so s can be substituted for σ in the test procedures with little harmful effect.

Page 38: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.4 Some Practical Comments on Hypothesis Testing

The Seven-Step Procedure

Only three steps are really required:

Page 39: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.4 Some Practical Comments on Hypothesis Testing

Statistical versus Practical Significance

Page 40: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.4 Some Practical Comments on Hypothesis Testing

Statistical versus Practical Significance

Page 41: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Two-sided confidence interval:

One-sided confidence intervals:

Confidence coefficient:

Page 42: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Page 43: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

Page 44: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Page 45: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Relationship between Tests of Hypotheses and Confidence Intervals

If [l,u] is a 100(1 - a) percent confidence interval for the parameter, then the test of significance level a of the hypothesis

will lead to rejection of H0 if and only if the hypothesized value is not in the 100(1 - a) percent confidence interval [l, u].

Page 46: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Confidence Level and Precision of Estimation

The length of the two-sided 95% confidence interval is

whereas the length of the two-sided 99% confidence interval is

Page 47: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Choice of Sample Size

Page 48: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Choice of Sample Size

Page 49: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

Choice of Sample Size

Page 50: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Confidence Interval on the Mean

One-Sided Confidence Bounds

Page 51: 4-1 Statistical Inference

4-4 Inference on the Mean of a Population, Variance Known

4-4.6 General Method for Deriving a Confidence Interval

Page 52: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 53: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 54: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 55: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Calculating the P-value

Page 56: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 57: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 58: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Page 59: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

OPTIONS NODATE NONUMBER;DATA ex47;xbar=0.83725; sd=0.02456; mu=0.82; n=15;alpha=0.05; df=n-1; t0=(xbar-mu)/(sd/sqrt(n)); pval = 1- PROBT(t0, df); cr=tinv(1-alpha, df); /* critical region */PROC PRINT;

var t0 pval cr;title 'Test of mu=0.82 vs mu>0.82';title2 'Ex. 4-7 in pp193';

RUN; QUIT;

Test of mu=0.82 vs mu>0.82 Ex. 4-7 in pp193

OBS t0 pval cr 1 2.72023 .008292926 1.76131

Page 60: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

OPTIONS NOOVP NODATE NONUMBER;DATA ex47;input coeffs @@;coeffs1=coeffs-0.82;cards;

0.8411 0.8191 0.8182 0.8125 0.87500.8580 0.8532 0.8483 0.8276 0.79830.8042 0.8730 0.8282 0.8359 0.8660

proc univariate data=ex47 plot normal;var coeffs1;title 'using proc uinivariate for t-test for one population mean';title2 “Example 4-7 in page 191”;

RUN; QUIT; 

Page 61: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

using proc uinivariate for t-test for one population mean Example 4-7 in page 191

UNIVARIATE 프로시저 변수 : coeffs1

위치모수 검정 : Mu0=0

검정 -- 통계량 --- -------p 값 -------

스튜던트의 t t 2.718979 Pr > |t| 0.0166 부호 M 2.5 Pr >= |M| 0.3018 부호 순위 S 39 Pr >= |S| 0.0256

정규성 검정

검정 ---- 통계량 ---- -------p 값 -------

Shapiro-Wilk W 0.960869 Pr < W 0.7075 Kolmogorov-Smirnov D 0.110275 Pr > D >0.1500 Cramer-von Mises W-Sq 0.026454 Pr > W-Sq >0.2500 Anderson-Darling A-Sq 0.193435 Pr > A-Sq >0.2500

Page 62: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

using proc uinivariate for t-test for one population meanExample 4-7 in page 191

UNIVARIATE 프로시저 변수 : coeffs1 극 관측치 ------ 최소 ------ ------ 최대 ------ 값 관측치 값 관측치 -0.0217 10 0.0332 7 -0.0158 11 0.0380 6 -0.0075 4 0.0460 15 -0.0018 3 0.0530 12 -0.0009 2 0.0550 5

줄기 잎 # 상자그림 5 35 2 | 4 6 1 | 3 38 2 +-----+ 2 18 2 | | 1 6 1 *--+--* 0 88 2 | | -0 821 3 +-----+ -1 6 1 | -2 2 1 | ----+----+----+----+ 값 : ( 줄기 . 잎 )*10**-2

정규 확률도 0.055+ * +++* | *++++ | *+*++ | **++ 0.015+ ++*+ | ++++** | ++*+* * | +++* -0.025+ ++*+ +----+----+----+----+----+----+----+----+----+----+ -2 -1 0 +1 +2

Page 63: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

OPTIONS NOOVP NODATE NONUMBER LS=80;DATA ex47;input coeffs @@;cards;

0.8411 0.8191 0.8182 0.8125 0.87500.8580 0.8532 0.8483 0.8276 0.79830.8042 0.8730 0.8282 0.8359 0.8660

proc ttest data=ex47 h0=0.82 sides=u; /* h0 는 귀무가설 , sides 는 양측일 경우 2, lower 단측 검정일경우 L, upper 단측 검정일 경우 u*/

var coeffs;title 'using t-test for one population mean';title “Example 4-7 in page 191”;

RUN;QUIT; 

using t-test for one population mean  The TTEST Procedure Variable: coeffs  N Mean Std Dev Std Err Minimum Maximum 15 0.8372 0.0246 0.00634 0.7983 0.8750  Mean 95% CL Mean Std Dev 95% CL Std Dev 0.8372 0.8261 Infty 0.0246 0.0180 0.0387  DF t Value Pr > t 14 2.72 0.0083

Page 64: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Fig 4-20 Normal probability plot of the coefficient of restitution data from Example 4-7.

Page 65: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.2 Type II Error and Choice of Sample Size

Fortunately, this unpleasant task has already been done, and the results are summarized in a series of graphs in Appendix A Charts Va, Vb, Vc, and Vd that plot for the t-test against a parameter d for various sample sizes n.

Page 66: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.2 Type II Error and Choice of Sample Size

These graphics are called operating characteristic (or OC) curves. Curves are provided for two-sided alternatives on Charts Va and Vb. The abscissa scale factor d on these charts is defined as

Page 67: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.3 Confidence Interval on the Mean

Page 68: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.3 Confidence Interval on the Mean

Page 69: 4-1 Statistical Inference

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.4 Confidence Interval on the Mean

Page 70: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 71: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 72: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 73: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 74: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 75: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Page 76: 4-1 Statistical Inference

4-6 Inference on the Variance of a Normal Population

4-6.2 Confidence Interval on the Variance of a Normal Population

Page 77: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.1 Hypothesis Testing on a Binomial Proportion

We will consider testing:

Page 78: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.1 Hypothesis Testing on a Binomial Proportion

Page 79: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.1 Hypothesis Testing on a Binomial Proportion

Page 80: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.1 Hypothesis Testing on a Binomial Proportion

Page 81: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.2 Type II Error and Choice of Sample Size

Page 82: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.2 Type II Error and Choice of Sample Size

Page 83: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.3 Confidence Interval on a Binomial Proportion

Page 84: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.3 Confidence Interval on a Binomial Proportion

Page 85: 4-1 Statistical Inference

4-7 Inference on Population Proportion

4-7.3 Confidence Interval on a Binomial ProportionChoice of Sample Size

Page 86: 4-1 Statistical Inference

4-8 Other Interval Estimates for a Single Sample

4-8.1 Prediction Interval

Page 87: 4-1 Statistical Inference

4-8 Other Interval Estimates for a Single Sample

4-8.2 Tolerance Intervals for a Normal Distribution

Page 88: 4-1 Statistical Inference

4-10 Testing for Goodness of Fit

• So far, we have assumed the population or probability distribution for a particular problem is known.

• There are many instances where the underlying distribution is not known, and we wish to test a particular distribution.

• Use a goodness-of-fit test procedure based on the chi-square distribution.

Page 89: 4-1 Statistical Inference