
SAA 2023 COMPUTATIONAL TECHNIQUE FOR BIOSTATISTICS


Inferential Statistics

Computational Statistics
- All you need to do is choose the statistic
- The computer does all the other steps for you


For instance, if I am asked by a layperson what exactly is meant by statistics, I will refer to the following Old Persian saying: “Mosht Nemouneyeh Kharvar Ast” (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained by an analogy like this one.

Inferential statistics involves using obtained sample statistics to estimate the corresponding population parameters.
- The most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)
- Drawing conclusions from sample to population
- The sample should be representative

Inferential statistics allow us to determine whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the assumption that data are normally distributed.

[Diagram: a sample of observations is drawn from the entire population of observations by random selection; statistical inference goes from the sample statistic X̄ back to the population parameter μ = ?]

Inferential statistics is based on a strange and mystical concept called falsification.

Although you might think the process is simple (write a hypothesis, test it, hope to prove it), inferential statistics works this way:

- Write a hypothesis you believe to be true
- Write the OPPOSITE of this hypothesis, which is called the null hypothesis
- Test the null, hoping to reject it
  - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true
  - If you fail to reject the null, reach no conclusion

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:
- Sampling error: chance, random error
- Sample bias: constant error due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics address only random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples really come from the same population with regard to the dependent variable (outcome), that is, if the null hypothesis is true. It is not the probability that the null hypothesis itself is true.

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are compatible with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result (or a more extreme one) by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome. Failing to reject is not proof of no effect.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96
- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
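The two GPA intervals can be reproduced from the summary statistics alone. The following is a minimal sketch, assuming the decimal readings N = 50, mean = 2.9580 and StDev = 0.4101 of the output; the critical values 1.6766 and 2.0096 are standard t-table entries for 49 degrees of freedom supplied here as an assumption (they do not appear in the slides), and `t_interval` is a hypothetical helper name.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean: mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary statistics read from the GPA output (n = 50, so df = 49).
mean, sd, n = 2.9580, 0.4101, 50

# Assumed t-table values for df = 49: upper 5% point and upper 2.5% point.
ci90 = t_interval(mean, sd, n, 1.6766)
ci95 = t_interval(mean, sd, n, 2.0096)

print(f"90% CI: ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
print("95% interval is wider:", (ci95[1] - ci95[0]) > (ci90[1] - ci90[0]))
```

Only the critical value changes between the two calls, which is why a larger confidence coefficient always gives a wider interval for the same data.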


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI               T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)     -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs, test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

- The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.
- The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.
- The rejection region consists of all p-values less than α.
- p-value = 0.236 > α = 0.05, so we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
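To see where T = -0.72 and the three P values (0.236, 0.472, 0.764) come from, here is a hedged sketch that recomputes them from the summary statistics (mean 2.9580, StDev 0.4101, N = 50). The helper `t_cdf` is a hypothetical name; it approximates the Student t distribution function by Simpson's rule instead of calling a statistics package, so it illustrates the idea and is not what Minitab does internally.

```python
import math

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t, by Simpson's rule on the density from 0 to t."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps  # signed step, so a negative t subtracts area from 0.5
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

# Summary statistics from the GPA output and the hypothesized mean.
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_lower = t_cdf(t_stat, df)        # HA: mu < 3
p_upper = 1 - p_lower              # HA: mu > 3
p_two = 2 * min(p_lower, p_upper)  # HA: mu != 3

print(f"t = {t_stat:.2f}, df = {df}")
print(f"p (mu < 3) = {p_lower:.3f}, p (two-sided) = {p_two:.3f}, p (mu > 3) = {p_upper:.3f}")
```

The same t statistic serves all three tests; only the tail area that counts as "extreme" changes with the alternative hypothesis.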

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{24,\,0.05} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0. The difference is not statistically significant.
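The cholesterol calculation can be checked in a few lines. This is a small sketch using only the numbers given in the example (sample mean 220, s = 38.6, n = 25, hypothesized mean 211, table value 2.064); the variable names are illustrative.

```python
import math

# Values from the cholesterol example.
x_bar, s, n, mu0 = 220.0, 38.6, 25, 211.0
t_crit = 2.064  # two-sided critical value of t(24) at alpha = 0.05, as quoted in the example

# One-sample t statistic: distance of the sample mean from mu0 in standard errors.
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Table method: reject H0 only if |t| reaches the critical value.
reject = abs(t_stat) >= t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}")
print("reject H0" if reject else "fail to reject H0")
```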

Hypothesis Testing

- We set a standard beyond which results would be rare (outside the expected sampling error).
- We observe a sample and infer information about the population.
- If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test


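We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{20,\,0.05} = 2.086$$

Since |t| > t_{20, 0.05}, reject H0 and accept HA.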
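The pooled-variance calculation for the healthy versus cystic fibrosis example can be sketched from the summary statistics (n = 9, mean 18.9, SD 5.9 and n = 13, mean 11.9, SD 6.3) and the table value 2.086 for 20 degrees of freedom. `pooled_t` is a hypothetical helper name, and the sketch assumes equal population variances, as the example does.

```python
import math

def pooled_t(n1, m1, s1, n2, m2, s2):
    """Two-sample t statistic assuming a common variance (pooled estimate)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))           # std. error of the mean difference
    return (m1 - m2) / se, df, sp2, se

# Healthy group first, cystic fibrosis group second.
t_stat, df, sp2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_crit = 2.086  # two-sided critical value of t(20) at alpha = 0.05, from the example

reject = abs(t_stat) >= t_crit
print(f"pooled variance = {sp2:.2f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject else "fail to reject H0")
```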

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
- Choose Samples in one column
- Samples: Select … (SalePrice)
- Subscripts: Select … (Location)
- Click Graphs, check Boxplots of data, OK
- Click Options
- Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 − μ2 = 0
HA: μ1 ≠ μ2 or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) − μ (West) < $73,719.

p-value = 0.660 > α = 0.05 (and 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
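The reported T-Value of -0.44 and DF = 44 can be recovered from the group summaries (East: N = 34, mean 312702, StDev 188163; West: N = 26, mean 339143, StDev 255571). Because DF is 44 and not 34 + 26 − 2 = 58, the output appears to use the unequal-variance (Welch) form of the test; that reading is an inference from the numbers, not something the slides state. `welch_t` is a hypothetical helper name.

```python
import math

def welch_t(n1, m1, s1, n2, m2, s2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # squared standard errors of each mean
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df, se

# East first, West second, matching Difference = mu(East) - mu(West).
t_stat, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"SE of difference = {se:.0f}")
print(f"t = {t_stat:.2f}, df = {math.floor(df)} (unrounded {df:.2f})")
```

Statistical packages commonly truncate the Welch degrees of freedom to an integer before looking up the p-value, which is the convention assumed in the printout above.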

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:
- Calculate the difference between the two measurements for each individual: d_j = X_{1j} − X_{2j}
- Calculate:

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μ_d = 0    HA: μ_d ≠ 0

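Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μ_d = 0    HA: μ_d ≠ 0

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

Since t exceeds the critical value, H0 is rejected at α = 0.05.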
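The paired calculation for the tobacco-leaf data can be reproduced directly from the eight pairs of counts (Virus 1: 9, 17, 31, 18, 7, 8, 20, 10; Virus 2: 10, 11, 18, 14, 6, 7, 17, 5). This is a minimal sketch using Python's standard library; the critical value 2.365 for 7 degrees of freedom is the one quoted in the example.

```python
import math
import statistics

# Number of dead spots per plant under each virus (plants 1 to 8).
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences only.
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)
d_bar = statistics.mean(d)   # mean difference
s_d = statistics.stdev(d)    # sample SD of the differences (n - 1 denominator)
se = s_d / math.sqrt(n)      # standard error of the mean difference
t_stat = d_bar / se

t_crit = 2.365  # two-sided critical value of t(7) at alpha = 0.05, from the example
reject = abs(t_stat) >= t_crit

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("reject H0" if reject else "fail to reject H0")
```

Because the test uses only the differences, the variances of the two original samples never enter the calculation.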

Advantages
1) It is usually easier to detect a true small difference
2) There is no need to consider whether the variances of the two populations are the same

22

21 xxd ss


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 0.10 significance level]

[Figure: Rejection region at the α = 0.05 significance level]

[Figure: Rejection region at the α = 0.10 significance level]

3) p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the “True State of Nature” and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

                                  NULL HYPOTHESIS
                                  TRUE                                FALSE
Reject the null hypothesis        Type I error (α):                   CORRECT
                                  rejecting a true null hypothesis
Fail to reject the null           CORRECT                             Type II error (β):
hypothesis                                                            failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read “Type one”) error is when the person is truly innocent but the jury finds them guilty. A Type II (read “Type two”) error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as “beyond reasonable doubt”. Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter “alpha”) and the probability of a Type II error is β (Greek letter “beta”). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for “beyond reasonable doubt”? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term “Probability of Type I Error” is not reader-friendly. For this reason, the phrase “Chances of Getting it Wrong” is used instead of “Probability of Type I Error”.

Most people would agree that putting an innocent person in jail is “Getting it Wrong”, and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
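One way to make the "chances of getting it wrong" concrete is to simulate many studies in which the null hypothesis really is true and count how often a test at level α rejects it anyway. The sketch below is an illustration under stated assumptions, not part of the original slides: it uses a two-sided z-test with known standard deviation (chosen so that only the standard library is needed), and `type_one_rate`, the sample size of 25, the 4000 trials and the seed are all arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def type_one_rate(alpha, n=25, trials=4000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis (mu = 0)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 holds: mean 0, SD 1.
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean * math.sqrt(n)  # z statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    rate = type_one_rate(alpha)
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.1%} of simulated studies")
```

The simulated rejection rates should land close to the chosen α, which is the sense in which α is the probability of a Type I error.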

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
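The trade-off between α, β and n can be illustrated numerically. The sketch below assumes a one-sided z-test with known standard deviation, for which β has a closed form; the function name `beta`, the effect size of 0.5 standard deviations and the sample sizes are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

def beta(alpha, n, effect, sigma=1.0):
    """Type II error probability of a one-sided z-test when the true mean
    exceeds the hypothesized mean by `effect` (known sigma)."""
    z = NormalDist()
    # H0 is rejected when the z statistic exceeds z_(1 - alpha); under the
    # alternative the statistic is shifted by effect * sqrt(n) / sigma.
    return z.cdf(z.inv_cdf(1 - alpha) - effect * math.sqrt(n) / sigma)

effect = 0.5  # assumed true effect, in standard-deviation units

# Fixed alpha: a larger n gives a smaller beta.
print(f"alpha = 0.05, n = 25:  beta = {beta(0.05, 25, effect):.3f}")
print(f"alpha = 0.05, n = 100: beta = {beta(0.05, 100, effect):.4f}")

# Fixed n: a smaller alpha gives a larger beta.
print(f"alpha = 0.01, n = 25:  beta = {beta(0.01, 25, effect):.3f}")
```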

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.
- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:
- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value: The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                 Reality
Decision         H0 is true                        H0 is false
Retain H0        Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)     Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
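As a rough illustration of such a power calculation, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2((z_{1−α/2} + z_{1−β})σ/δ)². The formula choice, the function name `n_per_group` and the example effect of half a standard deviation are assumptions made for illustration; dedicated power software based on the t distribution will give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison
    of means (normal approximation, equal group sizes, common sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error rate
    z_beta = z.inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of 0.5 SD, alpha = 0.05, beta = 0.2 (power 0.80).
print("n per group for 80% power:", n_per_group(effect=0.5, sigma=1.0))

# Asking for beta = 0.1 (power 0.90) requires more subjects.
print("n per group for 90% power:", n_per_group(effect=0.5, sigma=1.0, power=0.90))
```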

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test
- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise, two-tailed (most cases)

5. Determine Degrees of Freedom
- Number of components that are free to vary about a parameter
- df = sample size − number of parameters estimated
- df is n − 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary
- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two Types of ANOVA

1. Independent groups: two different sets of individuals. In the accompanying graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

Computational Statistics
- All you need to do is choose the statistic
- The computer does all the other steps for you

For instance, if I am asked by a layperson what exactly is meant by statistics, I will refer to the following Old Persian saying: "Mosht Nemouneyeh Kharvar Ast" (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained by an analogy similar to this one.

Inferential statistics:
- Involves using obtained sample statistics to estimate the corresponding population parameters
- The most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)
- Draws conclusions from sample to population
- The sample should be representative

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Figure: a sample of observations is drawn by random selection from the entire population of observations; the sample statistic (X̄) is used, through statistical inference, to estimate the population parameter (μ).]

Inferential statistics is based on a strange and mystical concept called falsification. You might think the process is simple: write a hypothesis, test it, and hope to prove it. Instead, inferential statistics works this way:

- Write a hypothesis you believe to be true.
- Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
- Test the null hypothesis, hoping to reject it.
  - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
  - If you fail to reject the null, reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:

- Sampling error: chance, random error
- Sample bias: constant error due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics address only random error (chance).

p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually, the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result (or a more extreme one) by chance if in fact the null hypothesis were true.
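One way to make "how often we would get the obtained result by chance" concrete is to simulate it. The sketch below is illustrative only: the null mean of 3.0, standard deviation of 0.4, sample size of 50, and observed mean of 2.96 are assumed values, and `simulated_p_value` is a hypothetical helper rather than a function from any statistics package.

```python
import random
import statistics


def simulated_p_value(null_mean, sd, n, observed_mean, reps=5000, seed=1):
    """Estimate a lower-tail p-value by drawing samples under the null hypothesis."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Each replicate is a fresh sample from a population where H0 is true.
        sample = [rng.gauss(null_mean, sd) for _ in range(n)]
        if statistics.fmean(sample) <= observed_mean:
            hits += 1
    return hits / reps


# Assumed illustrative inputs, not taken from the slides.
p = simulated_p_value(null_mean=3.0, sd=0.4, n=50, observed_mean=2.96)
print(f"estimated p-value: {p:.3f}")
```

The estimate is simply the share of simulated sample means that fall at or below the observed one, which is the "how often by chance" reading of the p-value.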

If the p-value is small (< α), reject the null hypothesis and conclude that the samples differ with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis: the data do not provide sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion

One-Sample T: GPAs

Variable   N    Mean   StDev  SE Mean        90% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8608, 3.0552)

Variable   N    Mean   StDev  SE Mean        95% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
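The two intervals can be reproduced from the summary statistics alone. This is a minimal sketch: the critical values of t for 49 degrees of freedom are assumed from a standard t table (they do not appear in the output), and `confidence_interval` is a hypothetical helper name.

```python
import math

# Summary statistics as printed in the One-Sample T output for GPAs.
n, mean, sd = 50, 2.9580, 0.4101
se = sd / math.sqrt(n)  # standard error of the mean

# Assumed two-sided critical values of t with n - 1 = 49 degrees of freedom,
# looked up in a t table (not shown in the output).
T_CRIT = {90: 1.6766, 95: 2.0096}


def confidence_interval(level):
    """Return (lower, upper) for the given confidence level in percent."""
    margin = T_CRIT[level] * se
    return mean - margin, mean + margin


ci90 = confidence_interval(90)
ci95 = confidence_interval(95)
print(f"SE mean = {se:.4f}")
print(f"90% CI = ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI = ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

Because the inputs are rounded, the last printed digit may differ slightly from the software output; the 95% interval is wider only because its critical value is larger.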

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed, lower tail)

                                        95% Upper
Variable   N    Mean   StDev  SE Mean      Bound      T      P
GPAs      50  2.9580  0.4101   0.0580     3.0552  -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean   StDev  SE Mean        95% CI            T      P
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)  -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tailed, upper tail)

                                        95% Lower
Variable   N    Mean   StDev  SE Mean      Bound      T      P
GPAs      50  2.9580  0.4101   0.0580     2.8608  -0.72  0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

                                        95% Upper
Variable   N    Mean   StDev  SE Mean      Bound      T      P
GPAs      50  2.9580  0.4101   0.0580     3.0552  -0.72  0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
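The T and P columns can be checked by hand. The sketch below uses only the Python standard library; since that library has no t distribution, `t_cdf` is a hypothetical helper that integrates the t density numerically with Simpson's rule. It illustrates the calculation and is not the routine the statistical software uses.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


# Summary statistics from the GPA output; mu0 is the hypothesized mean.
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))

p_lower = t_cdf(t_stat, n - 1)          # HA: mu < 3
p_two = 2 * min(p_lower, 1 - p_lower)   # HA: mu != 3
p_upper = 1 - p_lower                   # HA: mu > 3
print(f"t = {t_stat:.2f}")
print(f"p (<) = {p_lower:.3f}, p (two-sided) = {p_two:.3f}, p (>) = {p_upper:.3f}")
```

The same t statistic gives three different p-values, depending only on which tail (or tails) the alternative hypothesis points to.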

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus, no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test, we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$$

$$t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
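The cholesterol calculation can be sketched in a few lines. The critical value 2.064 is the table value quoted in the example; the variable names are illustrative.

```python
import math

# Values from the cholesterol example.
x_bar, mu0, s, n = 220.0, 211.0, 38.6, 25
t_crit = 2.064  # two-sided critical value of t(24) at alpha = 0.05, from the example

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```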

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus, no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2 versus HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


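We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N   Mean   Std Dev
Healthy           9   18.9       5.9
Cystic Fibrosis  13   11.9       6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086$$

$$|t| > t_{0.05,\,20}: \text{reject } H_0, \text{ accept } H_A$$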
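The pooled-variance calculation for the healthy versus cystic fibrosis example can be sketched as follows. `pooled_t` is a hypothetical helper name, and the critical value is the table value quoted in the example.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Healthy group first, cystic fibrosis group second.
t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_crit = 2.086  # table value of t(0.05, 20) quoted in the example
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {abs(t_stat) > t_crit}")
```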

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

- Choose: Samples in one column
- Samples: Select … (SalePrice)
- Subscripts: Select … (Location)
- Click Graphs, check Boxplots of data, OK
- Click Options
- Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000.]

Result: Two-sample T for SalePrice

Location   N    Mean   StDev  SE Mean
East      34  312702  188163    32270
West      26  339143  255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
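The T-Value and DF in this output can be reproduced from the summary rows. The reported DF = 44, rather than 34 + 26 - 2 = 58, is consistent with an unpooled (Welch) test; that is an inference from the numbers, not something the output states. `welch_t` is a hypothetical helper name.

```python
import math


def welch_t(mean1, se1, n1, mean2, se2, n2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = se1 ** 2, se2 ** 2  # squared standard errors of each group mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# East first, West second: mean, SE Mean, N as printed in the output.
t_stat, df = welch_t(312702, 32270, 34, 339143, 50122, 26)
print(f"T-Value = {t_stat:.2f}, DF = {int(df)}")
```

The degrees of freedom are truncated to a whole number here to match the integer DF shown in the output.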

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Paired t-test

Example: testing virus infection on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1             9              10         -1
2            17              11          6
3            31              18         13
4            18              14          4
5             7               6          1
6             8               7          1
7            20              17          3
8            10               5          5
Total       120              88         32
Mean         15              11          4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, we reject H0 at the α = 0.05 level.
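The paired calculation can be checked directly from the table of counts. This is a small sketch using the standard library; the critical value is the table value quoted in the example.

```python
import math
import statistics

# Dead-spot counts per plant from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]  # per-plant differences
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)      # sample standard deviation (n - 1 denominator)
se_d = s_d / math.sqrt(n)      # standard error of the mean difference
t_stat = d_bar / se_d
t_crit = 2.365                 # t(0.05, 7) from the example

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, se = {se_d:.2f}, t = {t_stat:.2f}")
print("reject H0:", abs(t_stat) > t_crit)
```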

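Advantages of the paired t-test

1) It is usually easier to detect a true small difference, because the standard error of the mean difference is typically smaller than that of the difference between two independent means ($s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$ when the paired measurements are positively correlated).
2) There is no need to consider whether the variances of the two populations are the same.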

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to Carry Out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

Decision rule 3, p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
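The table method and the p-value method can each be written as a one-line rule. The function names below are hypothetical; the numbers passed in are the test statistics, critical values, and p-values reported in the worked examples of this deck.

```python
def table_method(t_stat, t_crit):
    """Two-sided table rule: reject H0 when |t| reaches the critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"


def p_value_method(p_value, alpha=0.05):
    """p-value rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Table method with values from the worked examples.
print("cholesterol:", table_method(1.17, 2.064))
print("healthy vs cystic fibrosis:", table_method(2.63, 2.086))

# p-value method with p-values from the software output.
print("GPA (mu < 3):", p_value_method(0.236))
print("sale prices:", p_value_method(0.660))
```

Each rule is applied on its own, with its own inputs, which is the point of "don't mix them up": a test statistic is compared with a critical value, a p-value with α.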

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

                                        NULL HYPOTHESIS
                                 TRUE                     FALSE
Reject the null hypothesis       Type I error (α):        CORRECT
                                 rejecting a true
                                 null hypothesis
Fail to reject the null          CORRECT                  Type II error (β):
hypothesis                                                failing to reject a
                                                          false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000
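The meaning of these percentages can be checked by simulation: if the null hypothesis really is true and we test at level α, we should wrongly reject it in roughly a fraction α of studies. The sketch below assumes a two-sided z-test with known standard deviation; the sample size, number of repetitions, and the helper name `type_i_error_rate` are illustrative choices.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(alpha, n=30, reps=4000, seed=7):
    """Fraction of two-sided z-tests that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # The data really come from the null: mean 0, known sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / reps


for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: observed Type I error rate = {type_i_error_rate(alpha):.3f}")
```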

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
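These relationships can be seen numerically in a simple case. The sketch assumes a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known σ, and a true mean equal to `effect`; the effect size, σ, and the function name `beta` are assumptions for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test (H0: mu = 0, HA: mu > 0)
    when the true mean is `effect` and sigma is known."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - effect * n ** 0.5 / sigma)


for alpha in (0.10, 0.05, 0.01):
    for n in (10, 20, 40):
        print(f"alpha = {alpha:.2f}, n = {n:2d}: beta = {beta(alpha, n):.3f}")
```

Reading down a fixed α shows β falling as n grows; reading across a fixed n shows β rising as α is made smaller.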

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result:

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
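The three probability sampling strategies can be sketched with the standard `random` module. The sampling frame below is hypothetical (100 subjects in 4 groups, such as clinics), and the function names are illustrative.

```python
import random

rng = random.Random(42)

# Hypothetical sampling frame: 100 subjects spread evenly over 4 groups.
population = [{"id": i, "group": i % 4} for i in range(100)]


def simple_random_sample(frame, k):
    """Every subject has the same chance of selection."""
    return rng.sample(frame, k)


def stratified_sample(frame, k_per_group):
    """Draw a simple random sample separately within each group (stratum)."""
    chosen = []
    for g in sorted({p["group"] for p in frame}):
        members = [p for p in frame if p["group"] == g]
        chosen.extend(rng.sample(members, k_per_group))
    return chosen


def cluster_sample(frame, n_clusters):
    """Pick whole groups (clusters) at random and keep everyone in them."""
    groups = sorted({p["group"] for p in frame})
    picked = set(rng.sample(groups, n_clusters))
    return [p for p in frame if p["group"] in picked]


print(len(simple_random_sample(population, 10)))
print(len(stratified_sample(population, 5)))
print(len(cluster_sample(population, 2)))
```

In each case the probability of any subject entering the sample is known in advance, which is what makes these probability samples.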

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the p-value (observed significance level) of the test.

P-value: the probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Reality ->      H0 is true                      H0 is false
Decision
Retain H0       Correct Decision                Type II Error (β) (.2, .1)
Reject H0       Type I Error (α) (.05, .01)     Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
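A minimal sketch of such a calculation, assuming the simplest setting: a two-sided one-sample z-test with known σ, using the usual normal-approximation formula n = ((z for 1 - α/2 plus z for 1 - β) · σ / effect)². The effect size and σ passed in below are assumed illustrative values, and `sample_size` is a hypothetical helper name; real trial designs often need more elaborate formulas.

```python
import math
from statistics import NormalDist


def sample_size(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate smallest n for a two-sided one-sample z-test with power 1 - beta."""
    z = NormalDist().inv_cdf
    n = ((z(1 - alpha / 2) + z(1 - beta)) * sigma / effect) ** 2
    return math.ceil(n)


# Assumed expected effect of half a standard deviation.
print(sample_size(effect=0.5, sigma=1.0))             # beta = .2
print(sample_size(effect=0.5, sigma=1.0, beta=0.10))  # beta = .1 needs more subjects
```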

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, which gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
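The link between an interval and a two-sided test can be shown with the GPA output from earlier in this deck, where the 95% CI for the mean was (2.8414, 3.0746). The helper name `consistent_with` is hypothetical, and the second hypothesized value of 3.2 is an assumed example.

```python
def consistent_with(ci, hypothesized_value):
    """True when the hypothesized value lies inside the confidence interval,
    i.e. a two-sided test at the matching alpha would fail to reject it."""
    lower, upper = ci
    return lower <= hypothesized_value <= upper


ci_95 = (2.8414, 3.0746)  # 95% CI for mean GPA from the One-Sample T output

print(consistent_with(ci_95, 3.0))  # 3.0 is inside: fail to reject H0 at alpha = 0.05
print(consistent_with(ci_95, 3.2))  # 3.2 is outside: reject at alpha = 0.05
```

This agrees with the two-sided p-value of 0.472 reported for μ = 3, while the interval also shows the whole range of plausible means.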

Choosing the right test

Typically, one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive statistics summarize measures of central tendency and variability
- Inferential statistics determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of the variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t-tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- The t-test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 3: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS


Inferential Statistics

For instance, if I am asked by a layperson what is meant exactly by statistics, I will refer to the following Old Persian saying: “Mosht Nemouneyeh Kharvar Ast” (translated: a handful represents the heap).

This brief statement will describe in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that particular population can be explained by using an analogy similar to this one.

Involves using obtained sample statistics to estimate the corresponding population parameters.

Most common inference is using a sample mean to estimate a population mean (surveys, opinion polls).

Drawing conclusions from sample to population.

Sample should be representative.

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Diagram: random selection draws a sample of observations from the entire population of observations; statistical inference goes from the sample statistic (X̄) back to the population parameter (μ).]

Inferential statistics is based on a strange and mystical concept called falsification.

Although you might think the process is simple (write a hypothesis, test it, hope to prove it), inferential statistics works this way:

Write a hypothesis you believe to be true.

Write the OPPOSITE of this hypothesis, which is called the null hypothesis.

Test the null, hoping to reject it:
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true
- If you fail to reject the null, reach no conclusion

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error: chance, random error
- Sample bias: constant error due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias. That is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the observed data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
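The two intervals above can be reproduced from the summary statistics alone. The following is a minimal sketch, not the MINITAB implementation: the helper name `t_confidence_interval` is made up for illustration, and the critical values for t with 49 degrees of freedom are assumed to be read from a t table.

```python
import math

def t_confidence_interval(mean, stdev, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * stdev / sqrt(n)."""
    se = stdev / math.sqrt(n)          # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin

# Summary statistics from the GPA output above.
mean, stdev, n = 2.9580, 0.4101, 50

# Assumed t-table values for n - 1 = 49 degrees of freedom.
T_CRIT_90 = 1.6766   # upper 5% point, used for a 90% interval
T_CRIT_95 = 2.0096   # upper 2.5% point, used for a 95% interval

ci_90 = t_confidence_interval(mean, stdev, n, T_CRIT_90)
ci_95 = t_confidence_interval(mean, stdev, n, T_CRIT_95)

print("90%% CI: (%.4f, %.4f)" % ci_90)
print("95%% CI: (%.4f, %.4f)" % ci_95)
```

Because the 95% critical value is larger than the 90% one, the 95% interval is necessarily wider for the same data.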


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sample mean) that the sample mean 2.96 is from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
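The T and P columns of the GPA output can be checked with a short standard-library sketch. This is only an illustration under stated assumptions: `t_pdf` and `t_cdf` are hypothetical helper names, and the t distribution's cumulative probability is approximated here by Simpson's rule rather than by whatever method the statistical package uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics from the GPA output; H0: mu = 3.
mean, stdev, n, mu0 = 2.9580, 0.4101, 50, 3.0
t_stat = (mean - mu0) / (stdev / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)              # HA: mu < 3 (left tail)
p_greater = 1 - p_less                  # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)  # HA: mu != 3

print("t = %.2f" % t_stat)
print("p (<) = %.3f, p (!=) = %.3f, p (>) = %.3f" % (p_less, p_two_sided, p_greater))
```

The three p-values come from the same t statistic; only the tail (or tails) of the distribution being counted changes with the alternative hypothesis.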

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t_{0.05(24)} = 2.064$$

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
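The cholesterol calculation follows the table method: compute t from the summary statistics and compare its absolute value with the tabled critical value. A minimal sketch, assuming the figures quoted in the example (the function name `one_sample_t` is illustrative only):

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for H0: mu = mu0 from summary statistics."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Values from the cholesterol example.
t_stat = one_sample_t(sample_mean=220, mu0=211, s=38.6, n=25)

# Two-sided critical value for alpha = 0.05 with 24 degrees of freedom,
# as quoted in the example (table t).
T_CRIT = 2.064

reject_h0 = abs(t_stat) >= T_CRIT
print("t = %.2f, reject H0: %s" % (t_stat, reject_h0))
```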

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



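Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$t_{0.05(20)} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05(20)} \Rightarrow \text{reject } H_0, \text{ accept } H_A$$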
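The pooled-variance calculation for the healthy versus cystic fibrosis comparison can be sketched as follows. The function name `pooled_t` is an assumption made for illustration; the critical value 2.086 is the tabled t for 20 degrees of freedom quoted in the example.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance.

    Returns (t, df, pooled variance, standard error of the difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff

# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3.
t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # two-sided alpha = 0.05, t(20), from the example
reject_h0 = abs(t_stat) > T_CRIT

print("s^2 = %.2f, SE = %.2f, t = %.2f, df = %d" % (pooled_var, se_diff, t_stat, df))
print("reject H0:", reject_h0)
```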

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs: Check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441.

90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10

Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
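The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which is consistent with an unpooled (Welch) t-test rather than the pooled test shown earlier. That reading is an inference from the output, not something the slides state. A minimal sketch of the unpooled calculation from the summary table, with `welch_t` as an illustrative name:

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1                  # squared standard error, group 1
    v2 = sd2 ** 2 / n2                  # squared standard error, group 2
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se_diff

# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571.
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print("t = %.2f, df = %d (truncated from %.2f)" % (t_stat, math.floor(df), df))
```

Truncating the fractional degrees of freedom to a whole number is a conservative convention; it is assumed here only because it matches the DF shown in the output.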

Paired t-test

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

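Test infection of virus on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$s_d^2 = \frac{\sum_j (d_j - \bar{d})^2}{n - 1} = \frac{\sum_j d_j^2 - (\sum_j d_j)^2/n}{n - 1} = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so H0 is rejected at the 0.05 level.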
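The paired calculation on the tobacco-leaf data can be sketched with the standard library. The variable names are illustrative, and the critical value 2.365 is the tabled t for 7 degrees of freedom quoted in the example.

```python
import math
import statistics

# Counts per plant from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences only.
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)

d_bar = statistics.mean(diffs)      # mean difference
s_d = statistics.stdev(diffs)       # sample SD of differences (n - 1 denominator)
se = s_d / math.sqrt(n)             # standard error of the mean difference
t_stat = d_bar / se

T_CRIT = 2.365                      # two-sided alpha = 0.05, t(7), from the example
reject_h0 = abs(t_stat) > T_CRIT

print("d_bar = %s, s_d = %.2f, SE = %.2f, t = %.2f" % (d_bar, s_d, se, t_stat))
print("reject H0:", reject_h0)
```

Computed without intermediate rounding, t comes out slightly below the hand-calculated 2.63 (which divides by the rounded 1.52); the decision is the same.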

Advantages:

1) It is usually easier to detect a true small difference
2) Do not need to consider whether the variances of the two populations are the same or not

[Equation: comparison of $s_d^2$ with $s_{\bar{x}_1 - \bar{x}_2}^2$]


Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA)
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0)


2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region
- Fail to reject the null if the test statistic does not fall in the critical region

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case, i.e. the plaintiff (prosecutor), the defendant (accused) and the judge. The judge will form hypotheses as below before hearing a case:

H0: The evidence is not significantly strong enough to prove the defendant guilty

H1: The evidence is significantly strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the “True State of Nature” and reflect if the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS TRUE                  NULL HYPOTHESIS FALSE
Reject the null hypothesis           Type I error (α):                     CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                               Type II error (β):
                                                                           failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read “Type one”) error is when the person is truly innocent but the jury finds them guilty. A Type II (read “Type two”) error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as “beyond reasonable doubt”. Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter “alpha”) and the probability of a Type II error is β (Greek letter “beta”). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for “beyond reasonable doubt”? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term “Probability of Type I Error” is not reader-friendly. For this reason, the phrase “Chances of Getting it Wrong” is used instead of “Probability of Type I Error”.

Most people would agree that putting an innocent person in jail is “Getting it Wrong”, and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage     Chances of sending an innocent man to jail
20% chance     1 in 5
5% chance      1 in 20
1% chance      1 in 100
0.01% chance   1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
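The relationships among α, β and n can be made concrete with a small calculation. The sketch below is an illustration under simplifying assumptions that are not in the slides: a one-sided z-test with known standard deviation, a specific true effect size, and a hypothetical helper named `type2_error`.

```python
from statistics import NormalDist

def type2_error(alpha, n, effect, sigma=1.0):
    """beta for a one-sided z-test of H0: mu = mu0 against mu = mu0 + effect.

    Assumes a known standard deviation sigma (an illustrative simplification).
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)         # rejection cutoff on the z scale
    shift = effect * (n ** 0.5) / sigma   # how far the true mean moves the z statistic
    return z.cdf(z_crit - shift)          # probability of falling short of the cutoff

# Fixed alpha: a larger sample gives a smaller beta.
beta_n25 = type2_error(alpha=0.05, n=25, effect=0.5)
beta_n50 = type2_error(alpha=0.05, n=50, effect=0.5)

# Fixed n: a smaller alpha gives a larger beta.
beta_strict = type2_error(alpha=0.01, n=25, effect=0.5)

print("alpha=0.05: beta(n=25)=%.3f, beta(n=50)=%.3f" % (beta_n25, beta_n50))
print("n=25: beta(alpha=0.01)=%.3f" % beta_strict)
```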

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision \ Reality   H0 is true                       H0 is false
Retain H0            Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0            Type I error (α) (0.05, 0.01)    Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
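As an illustration of the sample-size step, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2((z for α/2 + z for β)σ/δ)². The formula choice, the two-sided test, and the function name `n_per_group` are assumptions made for this example; the slides do not specify a method, and dedicated power software will give slightly larger answers because it uses the t distribution.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate sample size per group for a two-sided, two-group comparison.

    effect: expected difference in means; sigma: common standard deviation.
    Uses the normal approximation, so treat the result as a rough planning figure.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff for the two-sided test
    z_beta = z.inv_cdf(1 - beta)         # quantile giving power 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2.
n_80 = n_per_group(effect=0.5, sigma=1.0)

# Tightening beta to 0.1 (power 0.9) requires more subjects.
n_90 = n_per_group(effect=0.5, sigma=1.0, beta=0.1)

print("n per group: power 0.8 ->", n_80, "; power 0.9 ->", n_90)
```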

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
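The selection rules above can be written as a small lookup. This is only a sketch of the rules as listed for a continuous outcome; the function name `choose_test` and its parameters are made up for illustration, and it deliberately ignores assumption checks (normality, equal variances, pairing).

```python
def choose_test(n_predictors, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules listed above.

    n_predictors: number of independent variables.
    n_groups: number of groups when there is a single categorical predictor.
    all_categorical: whether every independent variable is categorical.
    """
    if n_predictors == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("a single categorical predictor needs n_groups >= 2")
    if n_predictors > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, all_categorical=False))
```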

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Common Inferential Stats
- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of the variables to choose the correct test
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Summary

Two types of ANOVA
1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.
Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups, i.e., matched on everything but the treatment (the level of the independent variable).
Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics
For instance, if I am asked by a layperson what exactly is meant by statistics, I will refer to the following Old Persian saying: "Mosht Nemouneyeh Kharvar Ast" (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained by an analogy similar to this one.

Inferential Statistics
Involves using obtained sample statistics to estimate the corresponding population parameters
- Most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)
- Drawing conclusions from sample to population
- Sample should be representative

Inferential statistics allow us to determine whether groups are significantly different from each other.

Inferential Statistics
Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Diagram: a sample of observations is drawn by random selection from the entire population of observations; statistical inference uses the sample statistic (X̄) to estimate the population parameter (μ).]

Inferential Statistics
Inferential statistics is based on a strange and mystical concept called falsification.
You might think the process is simple: write a hypothesis, test it, hope to prove it.
Instead, inferential statistics works this way:
1. Write a hypothesis you believe to be true.
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
3. Test the null, hoping to reject it.
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
- If you fail to reject the null, reach no conclusion.

Inferential Statistics
Two major sources of error in research
Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:
- Sampling error: chance, random error
- Sample bias: constant error due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value
The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples really come from the same population with regard to the dependent variable (outcome).
Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.
The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.
If the p-value is large (> α), fail to reject the null hypothesis: there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Inferential Statistics
Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion


Inferential Statistics
One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96
- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07
95% CI width > 90% CI width
The larger the confidence coefficient, the greater the CI width.
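The two intervals in the GPA output can be reproduced from the summary statistics alone. The sketch below is a minimal illustration, not Minitab's implementation; the critical values 1.6766 and 2.0096 are assumptions taken from a t table for df = 49, and the variable names are hypothetical.

```python
import math

# Summary statistics from the One-Sample T output for GPAs.
n, mean, sd = 50, 2.9580, 0.4101
se = sd / math.sqrt(n)  # standard error of the mean

# Two-sided t critical values for df = n - 1 = 49 (assumed, from a t table).
t_crit = {90: 1.6766, 95: 2.0096}

intervals = {}
for level, t in t_crit.items():
    half_width = t * se
    intervals[level] = (mean - half_width, mean + half_width)
    low, high = intervals[level]
    print(f"{level}% CI: ({low:.4f}, {high:.4f})")
```

The 95% interval is wider only because its critical value is larger; the standard error is the same in both cases.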


Inferential Statistics
One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)
Variable   N    Mean     StDev    SE Mean   95% Upper Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552             -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)
Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)
Variable   N    Mean     StDev    SE Mean   95% Lower Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608             -0.72   0.764

Inferential Statistics
One-Sample T: GPAs
Test of μ = 3 vs < 3
Variable   N    Mean     StDev    SE Mean   95% Upper Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552             -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.
The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.
The rejection region consists of all p-values less than α.
p-value = 0.236 > α = 0.05, so we fail to reject H0.
Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
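The three p-values reported for the GPA data are linked through the single test statistic. The sketch below recomputes t from the summary statistics and derives the other two p-values from the reported lower-tail value; the variable names are hypothetical, and the p-value 0.236 is taken from the output rather than computed.

```python
import math

# Summary statistics and hypothesized mean from the GPA output.
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
se = sd / math.sqrt(n)
t_stat = (mean - mu0) / se  # distance from mu0 in standard errors

# Lower-tail p-value as reported for HA: mu < 3.
p_lower = 0.236

# The t distribution is symmetric and t is negative, so:
p_two_sided = 2 * p_lower  # HA: mu != 3
p_upper = 1 - p_lower      # HA: mu > 3

print(round(t_stat, 2), round(p_two_sided, 3), round(p_upper, 3))
```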

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211
Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25
H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$
Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.
The difference is not statistically significant.
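The cholesterol calculation can be checked with a few lines of code. This is a minimal sketch from the summary statistics on the slide; `one_sample_t` is a hypothetical helper name, and the critical value 2.064 is the tabled value quoted in the example.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """t statistic for a one-sample test of the mean."""
    return (xbar - mu0) / (s / math.sqrt(n))


t_stat = one_sample_t(xbar=220, mu0=211, s=38.6, n=25)
t_table = 2.064  # two-sided critical value, alpha = 0.05, df = 24 (from the slide)

# Table method: reject H0 only if |t| reaches the critical value.
reject_h0 = abs(t_stat) >= t_table
print(round(t_stat, 2), reject_h0)
```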


Hypothesis Testing
- We set a standard beyond which results would be rare (outside the expected sampling error)
- We observe a sample and infer information about the population
- If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population



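Two-sample t-test
We test the hypothesis of equal means for the two populations, assuming a common variance.
H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$t_{0.05,\,20} = 2.086$; since $|t| > t_{0.05,\,20}$, reject H0 and accept HA.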
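The pooled-variance calculation for the healthy and cystic fibrosis groups can be reproduced from the summary table. The sketch below assumes a common variance, as the slide does; `pooled_t` is a hypothetical helper name and 2.086 is the tabled critical value for df = 20.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    s2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(s2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, s2, se


t_stat, df, s2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_table = 2.086  # two-sided critical value, alpha = 0.05, df = 20
reject_h0 = abs(t_stat) > t_table
print(round(s2, 1), round(se, 2), round(t_stat, 2), df, reject_h0)
```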

Two-sample t-test
MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)
Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 − μ2 = 0
HA: μ1 ≠ μ2 or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Two-sample t-test
Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441.
90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) − μ (West) < $73719).
p-Value = 0.660 > α = 0.05, 0.10
Fail to reject H0.
We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
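The reported DF = 44 is smaller than n1 + n2 − 2 = 58, which suggests the output uses the unpooled (Welch) form of the test rather than the pooled form. Under that assumption, the T-value and DF can be approximated from the summary table; `welch_t` is a hypothetical helper name, not a Minitab function.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df


t_stat, df = welch_t(34, 312702, 188163, 26, 339143, 255571)

# Truncating df to a whole number matches the DF shown in the output.
print(round(t_stat, 2), int(df))
```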

Paired t-test
Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs after treatment, or very close plots).
Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n
Method:
Calculate the differences between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$
Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$$

Paired t-test
Test of virus infection on tobacco leaves: number of death spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$
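The paired calculation for the tobacco-leaf data can be reproduced directly from the eight pairs of counts. This is a minimal sketch using only the standard library; the variable names are hypothetical and 2.365 is the tabled critical value for df = 7.

```python
import math
import statistics

# Counts per plant from the table (Virus 1 and Virus 2).
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences only.
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)  # sample standard deviation (n - 1 denominator)
se = s_d / math.sqrt(n)
t_stat = d_bar / se

t_table = 2.365  # two-sided critical value, alpha = 0.05, df = 7
reject_h0 = abs(t_stat) > t_table
print(d, d_bar, round(s_d, 2), round(se, 2), round(t_stat, 2), reject_h0)
```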

Paired t-test
Advantages
1) Usually it is easier to find a true small difference, since typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$
2) There is no need to consider whether the variances of the two populations are the same or not

Hypothesis Testing
Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up):

Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.
[Figure: rejection region at the α = 10% significance level]
[Figure: rejection region at the α = 5% significance level]
[Figure: rejection region at the α = 10% significance level]

p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.
The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.
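The table method and the p-value method can each be expressed as a one-line rule. The sketch below is illustrative; the function names are hypothetical, and the inputs are values quoted in the worked examples of this deck (the cholesterol and cystic fibrosis t statistics, and the GPA p-value).

```python
def table_method(t_stat, t_critical):
    """Compare |t| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_critical else "fail to reject H0"


def p_value_method(p_value, alpha=0.05):
    """Compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(table_method(1.17, 2.064))   # cholesterol example
print(table_method(2.63, 2.086))   # healthy vs cystic fibrosis example
print(p_value_method(0.236))       # GPA example, HA: mu < 3
```

For the same test and the same α, the two rules give the same decision; the warning not to mix them up is about comparing a test statistic with α or a p-value with a critical value.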


Hypothesis Testing
An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutor), the defendant (accused), and the judge. The judge forms the hypotheses below before hearing a case.
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.
The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.
The judge's role is to collect the information supplied by the plaintiff and the defendant and decide on the validity of the hypothesis.

Hypothesis Testing
To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:
H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

                                     NULL HYPOTHESIS
                                     TRUE                                FALSE
Reject the null hypothesis           Type I error (α):                   CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                             Type II error (β):
                                                                         failing to reject a false null hypothesis

Type I and Type II Errors
Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000
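The meaning of α as a long-run "chance of getting it wrong" can be illustrated by simulation. The sketch below is an assumption-laden toy: it draws samples from a population where the null hypothesis is true by construction and uses a two-sided z-test with known σ = 1 (not one of the t-tests from the slides) so that it needs only the standard library. The function name and defaults are hypothetical.

```python
import math
import random
from statistics import NormalDist


def type_i_error_rate(alpha, n=25, trials=4000, seed=0):
    """Fraction of true-null samples that a two-sided z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 (mu = 0) is true: every sample comes from N(0, 1).
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


for alpha in (0.20, 0.05, 0.01):
    print(alpha, type_i_error_rate(alpha))
```

The observed rejection rates should land close to the chosen α values, which is the sense in which α = 20% means roughly 1 wrong conviction in 5.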

Controlling Type I and Type II Errors
α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
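The trade-off between the two error probabilities and the sample size n can be made concrete with a simple model. The sketch below assumes a one-sided z-test with known σ and a fixed true effect; the function name `beta`, the effect size 0.5, and σ = 1 are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist


def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test (illustrative)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    # Probability that the test statistic stays below the critical value
    # even though the true mean is shifted by `effect`.
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)


print(round(beta(0.05, 25), 3))  # baseline
print(round(beta(0.05, 50), 3))  # larger n, same alpha: beta falls
print(round(beta(0.01, 25), 3))  # smaller alpha, same n: beta rises
```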

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result:
- What is the population?
- How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
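Two of these probability sampling strategies can be sketched with the standard library. The population, the stratum labels, and the helper `stratified_sample` are all hypothetical; cluster sampling is not shown.

```python
import random

rng = random.Random(42)

# Hypothetical population of 100 subjects in two strata (60 in A, 40 in B).
population = [{"id": i, "stratum": "A" if i < 60 else "B"} for i in range(100)]

# Simple random sampling: every subject has the same chance of selection.
simple = rng.sample(population, 10)


def stratified_sample(units, key, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, round(len(members) * fraction)))
    return chosen


stratified = stratified_sample(population, "stratum", 0.10, rng)
print(len(simple), len(stratified))
```

With a 10% fraction the stratified sample always contains 6 subjects from stratum A and 4 from stratum B, whereas the simple random sample only matches those proportions on average.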

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.
All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
In this case, you are trying to disprove that treatment X has no effect on outcome Y.
You start out with two hypotheses:
- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing
If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value
The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?
This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.
How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                      H0 is false
Retain H0     Correct decision                Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)   Correct decision

Hypothesis testing
Power analysis is used to try to minimize Type II errors.
Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
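A power analysis of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². This is a textbook approximation chosen for illustration, not a formula given on the slides; the function name and the example effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate n per group for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error
    z_beta = z.inv_cdf(1 - beta)        # controls the Type II error
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.20.
print(sample_size_per_group(effect=0.5, sigma=1.0))
# Larger expected effects need fewer subjects.
print(sample_size_per_group(effect=1.0, sigma=1.0))
```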

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.
The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
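Reading a two-sided test off a confidence interval amounts to checking whether the hypothesized value lies inside it. The sketch below applies that check to two intervals quoted earlier in this deck (the 90% interval for the sale-price difference and the 95% interval for mean GPA); the helper name is hypothetical.

```python
def interval_rejects(hypothesized, lower, upper):
    """True if the hypothesized value falls outside the confidence interval."""
    return not (lower <= hypothesized <= upper)


# 90% CI for mu(East) - mu(West); H0: difference = 0, alpha = 0.10.
print(interval_rejects(0, -126602, 73719))

# 95% CI for mean GPA; H0: mu = 3, alpha = 0.05 (two-sided).
print(interval_rejects(3, 2.8414, 3.0746))
```

Both checks return False, which agrees with the "fail to reject H0" decisions reached from the p-values 0.660 and 0.472.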

Choosing the right test
Typically one is interested in comparing group means.
If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA


Page 5: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential Statistics

If a layperson asks me what exactly is meant by statistics, I refer to the following old Persian saying: "Mosht Nemouneyeh Kharvar Ast" (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained with an analogy like this one.

Inferential statistics involves using obtained sample statistics to estimate the corresponding population parameters.

The most common inference is using a sample mean to estimate a population mean (surveys, opinion polls).

It means drawing conclusions from a sample about the population.

The sample should be representative.

Inferential statistics allow us to determine whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Diagram: a sample of observations is drawn by random selection from the entire population of observations; through statistical inference, the sample statistic (X̄) is used to estimate the population parameter (μ).]

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it.

Instead, inferential statistics works this way:

Write a hypothesis you believe to be true.

Write the OPPOSITE of this hypothesis, which is called the null hypothesis.

Test the null hypothesis, hoping to reject it.
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
- If you fail to reject the null, reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples really are from the same population with regard to the dependent variable (outcome), that is, if the null hypothesis is true.

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result (or a more extreme one) by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses:
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
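The two intervals above can be reproduced from the summary statistics alone. The sketch below is only an illustration: the function name is hypothetical, and the critical values 1.6766 and 2.0096 are assumed t-table values for 49 degrees of freedom, not numbers taken from the slides.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the interval
    return mean - margin, mean + margin

# Summary statistics from the GPA output.
mean, sd, n = 2.9580, 0.4101, 50

# Assumed two-sided table values for df = n - 1 = 49.
T_CRIT_90 = 1.6766
T_CRIT_95 = 2.0096

ci90 = t_confidence_interval(mean, sd, n, T_CRIT_90)
ci95 = t_confidence_interval(mean, sd, n, T_CRIT_95)

print("90% CI:", tuple(round(x, 3) for x in ci90))
print("95% CI:", tuple(round(x, 3) for x in ci95))
print("95% CI is wider:", (ci95[1] - ci95[0]) > (ci90[1] - ci90[0]))
```

Small differences in the last decimal place relative to the Minitab output are expected, because Minitab works from the unrounded data.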


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The null hypothesis is rejected when the p-value is less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
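To see where the T and P columns come from, the following sketch recomputes them from the summary statistics using only the Python standard library. It is an illustration under stated assumptions: `t_pdf` and `t_cdf` are hypothetical helper names, and the t-distribution CDF is approximated by Simpson's rule rather than taken from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and the symmetry of t."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics from the GPA output; H0: mu = 3.
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                 # HA: mu < 3 (left tail)
p_greater = 1 - p_less                     # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)   # HA: mu != 3

print(f"t = {t_stat:.2f}")
print(f"p (mu < 3)  = {p_less:.3f}")
print(f"p (mu != 3) = {p_two_sided:.3f}")
print(f"p (mu > 3)  = {p_greater:.3f}")
```

The same t statistic gives three different p-values depending on the alternative hypothesis, which is why the direction has to be chosen before the analysis.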

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample from a town: x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$.

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
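The cholesterol calculation can be checked in a few lines. This is a minimal sketch of the table method; the variable names are illustrative, and the critical value 2.064 is the table value quoted in the example.

```python
import math

# Values from the cholesterol example.
x_bar, s, n, mu0 = 220.0, 38.6, 25, 211.0
T_CRIT = 2.064  # two-sided table value for alpha = 0.05, df = 24

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
reject_h0 = abs(t_stat) >= T_CRIT   # table method decision rule

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```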

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

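We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086$$

Since |t| > t_{0.05, 20}, reject H0 and accept HA.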
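The pooled-variance calculation for the healthy versus cystic fibrosis groups can be sketched as follows. The function name `pooled_t` is hypothetical; the inputs and the critical value 2.086 are the ones given in the example.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    return (mean1 - mean2) / se, df, pooled_var, se

# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3.
t_stat, df, pooled_var, se = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
T_CRIT = 2.086  # two-sided table value for alpha = 0.05, df = 20

print(f"pooled variance = {pooled_var:.2f}, SE = {se:.2f}")
print(f"t = {t_stat:.2f} on {df} df, reject H0: {abs(t_stat) > T_CRIT}")
```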

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, from 100,000 to 1,000,000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  P-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05 or 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
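The reported DF = 44 is not n1 + n2 - 2 = 58, which suggests the output used the unequal-variance (Welch) form of the test. The sketch below recomputes the statistic under that assumption; `welch_t` is a hypothetical helper, and truncating the degrees of freedom to an integer is an assumption that happens to match the output.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and Welch-Satterthwaite df (unequal variances)."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df

# East: n = 34; West: n = 26 (summary statistics from the output).
t_stat, df = welch_t(312702, 188163, 34, 339143, 255571, 26)

print(f"t = {t_stat:.2f}, df = {math.floor(df)}")
```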

Paired t-test

A test for two observations with strong comparability (e.g. two treatments on the same individuals, or one individual before vs. after treatment; very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μ_d = 0    HA: μ_d ≠ 0

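Example: Test infection of virus on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μ_d = 0    HA: μ_d ≠ 0

$$s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}} = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = \sqrt{18.57} = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, reject H0.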
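The paired calculation for the tobacco leaf data can be reproduced with the standard library. This is a small sketch; the variable names are illustrative, and 2.365 is the critical value t0.05(7) quoted in the example.

```python
import math
import statistics

# Number of spots per plant under each virus (from the table).
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [a - b for a, b in zip(virus1, virus2)]   # d_j = X_1j - X_2j
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)        # sample SD, divisor n - 1
se = s_d / math.sqrt(n)              # standard error of the mean difference
t_stat = d_bar / se
T_CRIT = 2.365                       # two-sided table value, alpha = 0.05, df = 7

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}")
print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > T_CRIT}")
```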

Advantages of the paired t-test:
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

When the paired measurements are positively correlated, typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused) and the judges. Before hearing a case, the judge forms the hypotheses below.

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision | Null hypothesis is TRUE | Null hypothesis is FALSE
Reject the null hypothesis | Type I error (α): rejecting a true null hypothesis | CORRECT
Fail to reject the null hypothesis | CORRECT | Type II error (β): failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
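One way to make the "chances of getting it wrong" concrete is to simulate many studies in which the null hypothesis really is true and count how often it is rejected. The sketch below is only an illustration under assumed settings: it uses a two-sided z-test with known σ = 1, samples of size 25, and a fixed random seed, none of which come from the slides.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=4000, seed=1):
    """Fraction of two-sided z-tests that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw from N(0, 1): the null hypothesis mu = 0 is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)    # sigma = 1 assumed known
        if abs(z) >= z_crit:
            rejections += 1                     # a Type I error
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: observed Type I error rate = {type_i_error_rate(alpha):.3f}")
```

The observed rejection rate should land close to the chosen α, which is what "1 in 5" or "1 in 20" means in the table.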

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
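These trade-offs can be checked numerically. The sketch below assumes a simplified setting that is not in the slides: a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known σ = 1 and a true effect of 0.5. The function name is hypothetical.

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test (H0: mu = 0, HA: mu > 0)
    when the true mean equals `effect` and sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # rejection cutoff under H0
    shift = effect * n ** 0.5 / sigma           # mean of the z statistic under HA
    return std_normal.cdf(z_crit - shift)       # P(fail to reject | HA true)

print("alpha = 0.05, n = 25:", round(beta_one_sided_z(0.05, 25), 3))
print("alpha = 0.05, n = 50:", round(beta_one_sided_z(0.05, 50), 3))  # larger n, smaller beta
print("alpha = 0.01, n = 25:", round(beta_one_sided_z(0.01, 25), 3))  # smaller alpha, larger beta
```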

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the significance level or p-value of the test.

P-value

The probability of the observed outcome (or a more extreme one), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision | Reality: H0 is true | Reality: H0 is false
Retain H0 | Correct decision | Type II error (β) (.2, .1)
Reject H0 | Type I error (α) (.05, .01) | Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
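As a rough illustration of such a sample-size calculation, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The formula choice, the function name, and the example effect of 0.5 standard deviations are assumptions for illustration; dedicated power software based on the t distribution typically gives a slightly larger n.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample comparison
    of means (normal approximation, equal group sizes, common sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # controls the Type I error rate
    z_beta = z.inv_cdf(1 - beta)         # controls the Type II error rate
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = .05, beta = .2 (power = .8).
print(n_per_group(effect=0.5, sigma=1.0))
# Asking for beta = .1 (power = .9) requires more subjects.
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))
```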

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and a lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA, if all independent variables are categorical
- ANCOVA or multiple linear regression, if some independent variables are continuous
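The selection rules above can be written as a small lookup function. This is only a sketch of the rules as stated on the slide for a continuous outcome; the function name and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the slide's rules."""
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("one independent variable needs at least two groups")
        return "t-test" if n_groups == 2 else "ANOVA"
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))               # two groups
print(choose_test(1, n_groups=4))               # three or more groups
print(choose_test(2, all_categorical=False))    # some continuous predictors
```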

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive statistics summarize the data: measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample hold in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric statistics are the two types; choosing between them requires checking assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines whether there is a significant difference between 2 group means.
- ANOVA determines whether there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
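As a small illustration of the χ² statistic mentioned in the summary, the sketch below computes it for a 2 x 2 table of counts. The counts are hypothetical (they do not come from the slides), the function name `chi_square_statistic` is made up for this example, and the critical value 3.841 is the usual tabled value for df = 1 at α = 0.05.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two categorical variables were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical counts: rows = group, columns = outcome category
observed = [[20, 30],
            [30, 20]]

chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI2_CRIT_DF1 = 3.841  # tabled critical value, df = 1, alpha = 0.05
reject_h0 = chi2 > CHI2_CRIT_DF1

print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up counts every expected cell is 25, so the statistic is easy to check by hand.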

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the slide's graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

For instance, if I am asked by a layperson what exactly is meant by statistics, I will refer to the following Old Persian saying: "Mosht Nemouneyeh Kharvar Ast" (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained by an analogy like this one.

Inferential statistics:
- Involves using obtained sample statistics to estimate the corresponding population parameters
- Most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)
- Drawing conclusions from sample to population
- Sample should be representative

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data are approximately normally distributed.

Inferential Statistics

[Figure: a sample of observations is drawn by random selection from the entire population of observations; the sample statistic X̄ is used for statistical inference about the population parameter μ.]

Inferential Statistics

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it.

Instead, inferential statistics works this way:

1. Write a hypothesis you believe to be true.
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
3. Test the null, hoping to reject it.
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
- If you fail to reject the null, reach no conclusion.

Inferential Statistics

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:

- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential statistics take sampling error into account. They do not correct for sample bias, which is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples really are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result (or a more extreme one) by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion


Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean   StDev  SE Mean  90% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8608, 3.0552)

Variable   N    Mean   StDev  SE Mean  95% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
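The two intervals above can be reproduced from the summary statistics alone. The following is a minimal sketch, not the Minitab procedure itself: the critical values for 49 degrees of freedom are copied from a t table (an assumption of this example, since the slides do not list them), and the helper name `t_confidence_interval` is made up here.

```python
import math

# Summary statistics from the One-Sample T output for GPAs
n = 50
mean = 2.9580
stdev = 0.4101

# Two-sided critical values of t with n - 1 = 49 degrees of freedom,
# taken from a t table (assumed values, not computed here)
T_CRIT_49 = {0.90: 1.6766, 0.95: 2.0096}


def t_confidence_interval(mean, stdev, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * stdev / sqrt(n)."""
    se = stdev / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin


ci90 = t_confidence_interval(mean, stdev, n, T_CRIT_49[0.90])
ci95 = t_confidence_interval(mean, stdev, n, T_CRIT_49[0.95])

for level, (lo, hi) in ((0.90, ci90), (0.95, ci95)):
    print(f"{level:.0%} CI: ({lo:.4f}, {hi:.4f}), width {hi - lo:.4f}")
```

The only thing that changes between the two intervals is the critical value, which is why a higher confidence level always gives a wider interval for the same data.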


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean   StDev  SE Mean  95% Upper Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           3.0552  -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean   StDev  SE Mean  95% CI                 T      P
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)   -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean   StDev  SE Mean  95% Lower Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           2.8608  -0.72  0.764

Interpreting the test of μ = 3 vs < 3:

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
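To see where the three p-values come from, the sketch below recomputes the test statistic and integrates the t density numerically with Simpson's rule. This is an illustration using only the Python standard library, not how Minitab computes it; the function names `t_pdf` and `t_cdf` are made up for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary statistics from the GPAs output; H0: mu = 3
n, mean, stdev, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (stdev / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                  # HA: mu < 3 (left tail)
p_greater = 1 - p_less                      # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)    # HA: mu != 3 (both tails)

print(f"t = {t_stat:.2f}")
print(f"p (mu < 3)  = {p_less:.3f}")
print(f"p (mu != 3) = {p_two_sided:.3f}")
print(f"p (mu > 3)  = {p_greater:.3f}")
```

The same t statistic gives three different p-values because each alternative hypothesis counts a different tail area as "at least as extreme".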

Inferential Statistics

Two sample T-test

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
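The cholesterol calculation can be checked in a few lines. This is a minimal sketch using the numbers from the example; the critical value 2.064 is the table value quoted on the slide, and the variable names are chosen for this illustration.

```python
import math

# Values from the cholesterol example
x_bar, s, n = 220.0, 38.6, 25      # sample mean, sample SD, sample size
mu0 = 211.0                        # hypothesized population mean (H0)
t_table = 2.064                    # two-sided critical value of t(24), alpha = 0.05

se = s / math.sqrt(n)              # standard error of the mean
t_stat = (x_bar - mu0) / se        # one-sample t statistic
reject_h0 = abs(t_stat) > t_table  # table-method decision rule

print(f"SE = {se:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```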

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. For two samples, the null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


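We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N   Mean  Std Dev
Healthy           9   18.9      5.9
Cystic Fibrosis  13   11.9      6.3

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086 $$

|t| > t(0.05, 20): reject H0 and accept HA.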
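The pooled-variance calculation above can be reproduced from the summary table. This is a minimal sketch; the function name `pooled_t` is made up for this illustration, and 2.086 is the table value of t(20) quoted in the example.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance.

    Returns (t, df, pooled variance, standard error of the difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff


# Healthy (n = 9) vs Cystic Fibrosis (n = 13), from the table above
t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

t_table = 2.086                    # two-sided critical value of t(20), alpha = 0.05
reject_h0 = abs(t_stat) > t_table

print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, "
      f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```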

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: select SalePrice
Subscripts: select Location
Click Graphs, check Boxplots of data, OK
Click Options
Confidence level: enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean   StDev  SE Mean
East      34  312702  188163    32270
West      26  339143  255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  P-Value = 0.660  DF = 44

The difference in mean sale prices is $26,441. We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices between the east and west markets.
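The T-value and DF in this output can be approximated from the summary rows. The reported DF = 44 (rather than 34 + 26 - 2 = 58) is consistent with an unpooled, Welch-type test, so the sketch below assumes that variant; this is an inference from the output, not something the slides state. The function name `welch_t` is made up for this illustration.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df.

    Returns (t, df, standard error of the difference).
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # squared standard errors
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se_diff


# East vs West, from the Minitab summary rows above
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"difference = {312702 - 339143}")
print(f"SE of difference = {se_diff:.0f}")
print(f"t = {t_stat:.2f}, df = {df:.2f} (truncated: {math.floor(df)})")
```

Truncating the fractional degrees of freedom to a whole number gives the DF shown in the output.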

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

H0: μ_d = 0    HA: μ_d ≠ 0

Example: test of virus infection on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)    dj
1                   9              10    -1
2                  17              11     6
3                  31              18    13
4                  18              14     4
5                   7               6     1
6                   8               7     1
7                  20              17     3
8                  10               5     5
Total             120              88    32
Mean               15              11     4

$$ s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05\,(7)} = 2.365 $$

Since t exceeds the critical value, H0: μ_d = 0 is rejected at the 0.05 level.
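The paired calculation for the tobacco-leaf data can be reproduced directly from the eight pairs. This is a minimal sketch with the standard library; 2.365 is the table value of t(7) quoted in the example, and the variable names are chosen for this illustration.

```python
import math
from statistics import mean, stdev

# Counts per plant from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the per-plant differences, not on the two samples separately
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)

d_bar = mean(diffs)                # mean difference
s_d = stdev(diffs)                 # sample SD of the differences (n - 1 divisor)
se_d = s_d / math.sqrt(n)          # standard error of the mean difference
t_stat = d_bar / se_d              # paired t statistic, df = n - 1

t_table = 2.365                    # two-sided critical value of t(7), alpha = 0.05
reject_h0 = abs(t_stat) > t_table

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, "
      f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```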

Paired t-test

Advantages

1) It is usually easier to detect a small true difference (compare $s_{\bar{d}}^2$ with $s_{\bar{x}_1 - \bar{x}_2}^2$).

2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the hypotheses below.

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                            Null hypothesis TRUE                      Null hypothesis FALSE
Reject the null hypothesis          Type I error (α):                         CORRECT
                                    rejecting a true null hypothesis
Fail to reject the null hypothesis  CORRECT                                   Type II error (β):
                                                                              failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
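The "chances of getting it wrong" can be made concrete with a small simulation: draw many samples from a population in which the null hypothesis really is true, test each one, and count how often the test rejects anyway. The sketch below is an illustration under stated assumptions, not part of the original slides: it uses a two-sided z-test with known standard deviation (rather than a t-test) so that it needs only the standard library, and the sample size, number of trials and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha, n=10, trials=10_000, seed=0):
    """Fraction of true-null samples that a two-sided z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known SD 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * math.sqrt(n)   # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                # a Type I error
    return rejections / trials


rates = {alpha: type_one_error_rate(alpha) for alpha in (0.20, 0.05, 0.01)}

for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.1%} of trials")
```

The observed rejection rates should sit close to the chosen α in each case, which is exactly what "probability of a Type I error" means.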

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
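These relationships can be checked numerically. The sketch below assumes a one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with known standard deviation, for which β = Φ(z(1-α) - effect·√n/σ). The effect size of 0.5 and the function name `beta` are hypothetical choices for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test.

    effect is the true shift of the mean away from the H0 value
    (a hypothetical value chosen for illustration).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - effect * math.sqrt(n) / sigma)


# Fixed alpha: a larger sample size lowers beta
for n in (10, 20, 40):
    print(f"alpha = 0.05, n = {n:2d}: beta = {beta(0.05, n):.3f}")

# Fixed n: a smaller alpha raises beta, a larger alpha lowers it
for alpha in (0.01, 0.05, 0.10):
    print(f"n = 20, alpha = {alpha:.2f}: beta = {beta(alpha, 20):.3f}")
```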

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome (or one more extreme)?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                      H0 is false
Retain H0     Correct decision                Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)   Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
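A sample-size calculation of this kind can be sketched with the normal approximation n = ((z(1-α/2) + z(1-β))·σ / effect)². The sketch assumes a two-sided, one-sample z-test with known standard deviation; the expected effect (0.5) and σ (1.0) are hypothetical values, and the function name `sample_size` is made up for this illustration. Real power software typically refines this with the t distribution.

```python
import math
from statistics import NormalDist


def sample_size(effect, sigma, alpha=0.05, beta=0.20):
    """Smallest n giving power 1 - beta to detect `effect`
    with a two-sided z-test at level alpha (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected effect of half a standard deviation
n_needed = sample_size(effect=0.5, sigma=1.0)              # beta = 0.20
n_stricter = sample_size(effect=0.5, sigma=1.0, beta=0.10)

print(f"n for 80% power: {n_needed}")
print(f"n for 90% power: {n_stricter}")
```

Asking for a smaller β (more power) or a smaller detectable effect both increase the required sample size.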

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
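The selection rules above for a continuous outcome can be written as a small lookup function. This is only a sketch of the rules as stated on the slide; the function name `choose_test` and its parameters are hypothetical, and it does not check any of the assumptions a real analysis would need.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the slide's rules.

    n_groups is only used when there is a single independent variable;
    all_categorical is only used when there is more than one.
    """
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs n_groups >= 2")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=True))
print(choose_test(3, all_categorical=False))
```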

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of the variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t-tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- The t-test determines whether there is a significant difference between 2 group means
- ANOVA determines whether there is a significant difference between 3 or more means
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

For instance, if I am asked by a layperson what exactly is meant by statistics, I will refer to the following Old Persian saying: "Mosht Nemouneyeh Kharvar Ast" (translated: a handful represents the heap).

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that particular population can be explained by using an analogy similar to this one.

Inferential statistics involves using obtained sample statistics to estimate the corresponding population parameters.

The most common inference is using a sample mean to estimate a population mean (surveys, opinion polls).

Drawing conclusions from sample to population

Sample should be representative
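
The idea of estimating a population mean from a sample mean can be sketched with a small simulation. The population below is entirely hypothetical (10,000 simulated values with a chosen mean and spread); it is only meant to show a statistic landing close to the parameter it estimates when the sample is drawn at random.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated measurements (illustrative values only).
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Parameter: the population mean, which is usually unknown in practice.
mu = statistics.mean(population)

# Statistic: the mean of a simple random sample of 200 observations.
sample = rng.sample(population, 200)
x_bar = statistics.mean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

A different random sample would give a slightly different sample mean; that sample-to-sample variation is the sampling error that inferential statistics account for.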

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Figure: a sample of observations is drawn by random selection from the entire population of observations; through statistical inference, the sample statistic X̄ is used to estimate the population parameter μ.]

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it.

Inferential statistics instead works this way:

- Write a hypothesis you believe to be true
- Write the OPPOSITE of this hypothesis, which is called the null hypothesis
- Test the null, hoping to reject it
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true
- If you fail to reject the null, reach no conclusion

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error - chance, random error
- Sample bias - constant error due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are in fact from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are consistent with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width
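
As a rough check, the two intervals can be reproduced from the summary statistics alone. This is a minimal sketch: the critical t values for 49 degrees of freedom are assumed from a t table rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from the One-Sample T output for GPAs.
n, mean, stdev = 50, 2.9580, 0.4101
se = stdev / math.sqrt(n)  # standard error of the mean

# Critical values of t with n - 1 = 49 degrees of freedom, taken from a t table
# (assumed here rather than computed).
t_crit = {90: 1.6766, 95: 2.0096}

intervals = {}
for level, t in t_crit.items():
    half_width = t * se
    intervals[level] = (mean - half_width, mean + half_width)
    print(f"{level}% CI: ({intervals[level][0]:.4f}, {intervals[level][1]:.4f})")
```

The 95% interval is wider only because its critical value is larger; the standard error is the same in both cases.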


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs, test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
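
The T and P columns above can be approximated from the summary statistics. The sketch below is not how statistical software computes them; it assumes a simple numerical integration (Simpson's rule) of the Student t density, and the helper names `t_pdf` and `t_cdf` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and the symmetry of t."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

n, mean, stdev, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (stdev / math.sqrt(n))

p_lower = t_cdf(t_stat, n - 1)      # HA: mu < 3
p_upper = 1 - p_lower               # HA: mu > 3
p_two = 2 * min(p_lower, p_upper)   # HA: mu != 3

print(f"t = {t_stat:.2f}")
print(f"p (mu < 3) = {p_lower:.3f}, p (mu != 3) = {p_two:.3f}, p (mu > 3) = {p_upper:.3f}")
```

The same t statistic gives three different p-values depending on which alternative hypothesis is chosen, which is why the direction must be fixed before the analysis.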

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

$$t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
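
The cholesterol calculation is short enough to verify directly. This is a minimal sketch of the table method; the critical value 2.064 is the one quoted in the example, not computed here.

```python
import math

# Values from the cholesterol example.
x_bar, mu0, s, n = 220, 211, 38.6, 25

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
t_table = 2.064  # two-tailed critical value, alpha = 0.05, df = 24 (as quoted in the example)

reject_h0 = abs(t_stat) > t_table
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```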

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test


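We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086$$

Since $|t| > t_{0.05,\,20}$, reject $H_0$ and accept $H_A$.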
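
The pooled-variance calculation can be reproduced from the summary table. The sketch below follows the formulas above step by step; the critical value 2.086 is taken from the example rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from the example (group 1 = Healthy, group 2 = Cystic Fibrosis).
n1, mean1, sd1 = 9, 18.9, 5.9
n2, mean2, sd2 = 13, 11.9, 6.3

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df   # common variance s^2
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))         # standard error of the difference
t_stat = (mean1 - mean2) / se_diff

t_table = 2.086  # two-tailed critical value, alpha = 0.05, df = 20 (as quoted in the example)
reject_h0 = abs(t_stat) > t_table

print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```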

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ………… (SalePrice)

Subscripts: Select ……… (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis Sale Price from 100000 to 1000000.]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
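
The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which suggests the output was produced without assuming equal variances. Under that assumption (an unequal-variance, Welch-type test with Satterthwaite degrees of freedom), the T-Value and DF can be reproduced from the summary table as follows.

```python
import math

# Summary statistics from the Minitab output.
n1, mean1, sd1 = 34, 312702, 188163   # East
n2, mean2, sd2 = 26, 339143, 255571   # West

# Squared standard errors of each group mean.
v1, v2 = sd1**2 / n1, sd2**2 / n2

se_diff = math.sqrt(v1 + v2)
t_stat = (mean1 - mean2) / se_diff

# Welch-Satterthwaite approximation; rounded down to a whole number here.
welch_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"t = {t_stat:.2f}, df = {math.floor(welch_df)}")
```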

Paired t-test

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs. after treatment; very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

Test infection of virus on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05,\,7} = 2.365$$

so H0 is rejected at the 0.05 level.
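
The paired calculation can be checked against the table of spot counts. This minimal sketch uses only the data shown above; the critical value 2.365 is the one quoted in the example.

```python
import math
import statistics

# Spot counts per plant from the tobacco leaf example.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work with the within-plant differences only.
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)

d_bar = statistics.mean(d)      # mean difference
s_d = statistics.stdev(d)       # sample standard deviation of the differences
se = s_d / math.sqrt(n)         # standard error of the mean difference
t_stat = d_bar / se

t_table = 2.365  # two-tailed critical value, alpha = 0.05, df = 7 (as quoted in the example)
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("reject H0:", abs(t_stat) > t_table)
```

Because the test works on differences within each plant, plant-to-plant variation drops out of the standard error, which is the source of the first advantage listed for the paired design.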

Advantages

1) Usually it is easier to find a true small difference (compare $s_{\bar{d}}^2$ with $s_{\bar{x}_1 - \bar{x}_2}^2$)
2) There is no need to consider whether the variances of the two populations are the same or not

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

Decision rule 3, p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form the hypotheses below before hearing a case.

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

                                    NULL HYPOTHESIS
Decision                            TRUE                            FALSE
Reject the null hypothesis          Type I error (α):               CORRECT
                                    rejecting a true
                                    null hypothesis
Fail to reject the null hypothesis  CORRECT                         Type II error (β):
                                                                    failing to reject a
                                                                    false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000
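
The meaning of α as a long-run error rate can be illustrated by simulation. The sketch below is an assumption-laden toy: it repeatedly draws samples for which the null hypothesis is true by construction, applies a two-sided z-test with known σ = 1 (a simplification of the t-tests used elsewhere in these slides), and counts how often the true null is rejected. The function name and settings are hypothetical.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=30, trials=2000, seed=1):
    """Share of trials in which a true H0 (mu = 0) is rejected by a two-sided z-test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # H0 is true by construction
        z = (sum(sample) / n) / (1 / n ** 0.5)         # known sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: observed rejection rate = {type_i_error_rate(alpha):.3f}")
```

The observed rejection rates should land near the chosen α values, which is the sense in which α = 5% corresponds to a 1 in 20 chance of getting it wrong.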

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
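
These relationships can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known σ and a hypothetical true effect of 0.5; the function `beta` and its default values are illustrative, not part of the original material.

```python
from statistics import NormalDist

def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test of H0: mu = 0 vs HA: mu > 0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    # Under HA the z statistic is centred at effect * sqrt(n) / sigma.
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)

print(f"alpha = 0.05, n = 25: beta = {beta(0.05, 25):.3f}")
print(f"alpha = 0.05, n = 50: beta = {beta(0.05, 50):.3f}")   # larger n, smaller beta
print(f"alpha = 0.01, n = 25: beta = {beta(0.01, 25):.3f}")   # smaller alpha, larger beta
```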

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the significance level or p-value of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

             Reality
Decision     H0 is true                          H0 is false
Retain H0    Correct decision                    Type II error (β) (0.2, 0.1)
Reject H0    Type I error (α) (0.05, 0.01)       Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
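
A sample size calculation of this kind can be sketched with the normal approximation. The formula below, n = ((z_{1-α/2} + z_{1-β}) σ / effect)², applies to a two-sided one-sample z-test with known σ; this is an assumption made for illustration, and real studies typically rely on dedicated power analysis software. The function name and example values are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n for a two-sided one-sample z-test with power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2 (power 0.8).
print(sample_size(effect=0.5, sigma=1.0))
# Asking for a smaller beta (more power) requires a larger sample.
print(sample_size(effect=0.5, sigma=1.0, beta=0.1))
```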

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous


Inferential Statistics

This brief statement describes in one sentence the general concept of inferential statistics. In other words, learning about a population by studying a randomly chosen sample from that population can be explained by an analogy like this one.

- Involves using obtained sample statistics to estimate the corresponding population parameters
- Most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)
- Drawing conclusions from sample to population
- Sample should be representative

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Diagram: random selection draws a sample of observations from the entire population of observations; statistical inference uses the sample statistic X̄ to estimate the population parameter μ]

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it. Inferential statistics instead works this way:

1. Write a hypothesis you believe to be true.
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
3. Test the null, hoping to reject it.
   - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
   - If you fail to reject the null, you reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:

- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).

p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples really come from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. It provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

- If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.
- If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not enough evidence that the treatment or predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
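The two intervals in the GPA output can be reproduced from the summary statistics alone. The sketch below is only an illustration: the function name is hypothetical, and the critical values 1.6766 and 2.0096 for 49 degrees of freedom are assumed from a standard t table rather than taken from the slides.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Two-sided CI for a population mean: mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the interval
    return mean - margin, mean + margin

# Summary statistics from the GPA output.
MEAN, SD, N = 2.9580, 0.4101, 50

# ASSUMPTION: critical values of t with N - 1 = 49 df, read from a t table.
# 1.6766 leaves 5% in each tail (90% CI); 2.0096 leaves 2.5% in each tail (95% CI).
ci90 = t_confidence_interval(MEAN, SD, N, t_crit=1.6766)
ci95 = t_confidence_interval(MEAN, SD, N, t_crit=2.0096)

print("90%% CI: (%.4f, %.4f)" % ci90)
print("95%% CI: (%.4f, %.4f)" % ci95)
```

Because the 95% critical value is larger than the 90% one, the 95% interval is necessarily the wider of the two.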

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

For the test of μ = 3 vs < 3:

- The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.
- The p-value is the probability that a random sample mean is less than or equal to 2.96 when Ho: μ = 3 is true.
- The rejection region consists of all p-values less than α.
- p-value = 0.236 > α = 0.05, so we fail to reject Ho.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
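The three p-values in the GPA output differ only in which tail of the t distribution is counted. As a rough sketch (not the routine MINITAB uses), the t density can be integrated numerically with the standard library; the helper names `t_pdf` and `t_cdf` are hypothetical, and Simpson's rule is an assumed choice of integration method.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and the symmetry of t."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics from the GPA output; mu0 is the hypothesized mean.
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                 # HA: mu < 3 (left tail)
p_greater = 1 - p_less                     # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)   # HA: mu != 3 (both tails)

print("t = %.2f" % t_stat)
print("p (<) = %.3f, p (!=) = %.3f, p (>) = %.3f" % (p_less, p_two_sided, p_greater))
```

The two one-sided p-values always sum to 1, and the two-sided p-value is twice the smaller of them, which is the pattern seen in the output.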

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected result is specified. The null hypothesis states equality and the alternative states "not equal", as in the worked example that follows.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml. A sample from a town gives

x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

For an α = 0.05 test we use the critical value determined from the t(24) distribution.

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
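The cholesterol calculation follows the table method: compute t from the summary statistics and compare its absolute value with the tabled critical value. A minimal sketch, with a hypothetical function name and the critical value 2.064 taken from the example itself:

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for H0: mu = mu0, given sample mean, sample SD and sample size."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Values from the cholesterol example.
t_stat = one_sample_t(sample_mean=220, mu0=211, s=38.6, n=25)

T_CRIT = 2.064                      # tabled two-sided value, alpha = 0.05, df = 24
reject_h0 = abs(t_stat) >= T_CRIT   # table-method decision rule

print("t = %.2f, reject H0: %s" % (t_stat, reject_h0))
```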

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

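Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086; \qquad |t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0, \text{ accept } H_A$$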
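The pooled-variance calculation for the healthy and cystic fibrosis groups can be checked step by step. This is a sketch under the same common-variance assumption as the example; the function name is hypothetical and the critical value 2.086 is the tabled value quoted in the example.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common population variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    t_stat = (mean1 - mean2) / se_diff
    return t_stat, df, pooled_var, se_diff

# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3.
t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086                      # tabled two-sided value, alpha = 0.05, df = 20
reject_h0 = abs(t_stat) >= T_CRIT

print("s^2 = %.2f, SE = %.2f, t = %.2f, df = %d" % (pooled_var, se_diff, t_stat, df))
print("reject H0:", reject_h0)
```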

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

- Choose "Samples in one column"
- Samples: Select … (SalePrice)
- Subscripts: Select … (Location)
- Click Graphs > Check "Boxplots of data" > OK
- Click Options
- Confidence level: Enter 90 > OK > OK

Ho: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10, so we fail to reject Ho.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
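The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests that this output was produced without pooling the variances. Assuming that is the case, the T-value and degrees of freedom can be approximated from the summary table with the unequal-variance (Welch) formulas; the function name below is hypothetical, and truncating the degrees of freedom to an integer is an assumption about how the output was rounded.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic and approximate df without assuming equal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2     # squared standard errors of each mean
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se_diff

# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571.
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print("t = %.2f, df = %d (truncated), SE of difference = %.0f"
      % (t_stat, math.floor(df), se_diff))
```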

Paired t-test

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

1. Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$
2. Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Example: Test infection of virus on tobacco leaves (number of death spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = \sqrt{18.57} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.632 > t_{0.05(7)} = 2.365$$

Since |t| exceeds the critical value, H0 is rejected at the α = 0.05 level.
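The tobacco-leaf example can be recomputed directly from the eight pairs of counts. The sketch below uses only the standard library; variable names are illustrative, and the critical value 2.365 is the tabled value quoted in the example. Working at full precision gives t of about 2.63, slightly below the 2.632 obtained from the rounded standard error 1.52.

```python
import math
import statistics

# Counts per plant for the two viruses, from the example table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Paired differences d_j = X1j - X2j.
diffs = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(diffs)

d_bar = statistics.mean(diffs)      # mean difference
s_d = statistics.stdev(diffs)       # sample SD of the differences (n - 1 divisor)
se_d = s_d / math.sqrt(n)           # standard error of the mean difference
t_stat = d_bar / se_d

T_CRIT = 2.365                      # tabled two-sided value, alpha = 0.05, df = 7
reject_h0 = abs(t_stat) >= T_CRIT

print("d_bar = %.1f, s_d = %.2f, SE = %.2f, t = %.2f" % (d_bar, s_d, se_d, t_stat))
print("reject H0:", reject_h0)
```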

Paired t-test

Advantages

1) Usually it is easy to find a true small difference (typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$)
2) Do not need to consider whether the variances of the two populations are the same or not

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

3 Steps to carry out a Hypothesis Test (continued)

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3 Steps to carry out a Hypothesis Test (continued)

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused) and the judges. The judge forms the hypotheses below before hearing a case.

Ho: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                             Null hypothesis TRUE                        Null hypothesis FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null     CORRECT
Fail to reject the null hypothesis   CORRECT                                     Type II error (β): failing to reject a false null

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10,000

Controlling Type I and Type II Errors

- α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
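The trade-off among the Type I error rate, the Type II error rate and the sample size n can be made concrete with a simple case. The sketch below is an illustration only, not part of the slides: it assumes a one-sided z-test with known standard deviation, and the function name, the effect size 0.5 and the sample sizes are all hypothetical choices.

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """Type II error probability for a one-sided z-test of H0: mu = mu0
    when the true mean is mu0 + effect (effect > 0) and sigma is known."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)          # rejection cutoff under H0
    shift = effect * n ** 0.5 / sigma      # how far the true mean moves the statistic
    return z.cdf(z_crit - shift)           # chance the statistic stays below the cutoff

# ASSUMED illustrative settings: effect of 0.5 SD, n of 25 or 50.
base = beta_one_sided_z(alpha=0.05, n=25, effect=0.5)
larger_n = beta_one_sided_z(alpha=0.05, n=50, effect=0.5)
smaller_alpha = beta_one_sided_z(alpha=0.01, n=25, effect=0.5)

print("alpha = 0.05, n = 25: beta = %.3f" % base)
print("alpha = 0.05, n = 50: beta = %.3f" % larger_n)
print("alpha = 0.01, n = 25: beta = %.3f" % smaller_alpha)
```

With alpha held fixed, doubling n lowers beta; with n held fixed, tightening alpha from 0.05 to 0.01 raises beta. Power is 1 minus the value this function returns.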

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:
- Null Hypothesis (HO): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision     Reality: HO is true               Reality: HO is false
Retain HO    Correct Decision                  Type II Error (β) (0.2, 0.1)
Reject HO    Type I Error (α) (0.05, 0.01)     Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous


Inferential Statistics

Involves using obtained sample statistics to estimate the corresponding population parameters

Most common inference is using a sample mean to estimate a population mean (surveys, opinion polls)

Drawing conclusions from sample to population

Sample should be representative

Inferential statistics allow us to make determinations about whether groups are significantly different from each other

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Figure: random selection draws a sample of observations from the entire population of observations; statistical inference uses the sample statistic (X̄) to estimate the population parameter (μ).]

Inferential statistics is based on a strange and mystical concept called falsification.

Although you might think the process is simple (write a hypothesis, test it, hope to prove it), inferential statistics works this way:

1. Write a hypothesis you believe to be true.

2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.

3. Test the null, hoping to reject it.
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
- If you fail to reject the null, reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error - chance, random error
- Sample bias - constant error, due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias. That is a research design issue. Inferential statistics only address random error (chance).

- Sampling error - chance, random error
- Sample bias - constant error, due to inadequate design


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are consistent with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis: the sample does not provide sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion


One-Sample T: GPAs

Variable  N   Mean    StDev   SE Mean  90% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8608, 3.0552)

Variable  N   Mean    StDev   SE Mean  95% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width
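The two intervals in the GPA output can be reproduced from the summary statistics alone. The sketch below is only an illustration: the helper name `t_interval` is hypothetical, and the critical values 1.677 and 2.010 are assumed t-table values for 49 degrees of freedom, not numbers given on the slide.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the interval
    return mean - margin, mean + margin

# Summary statistics from the GPA output
n, mean, sd = 50, 2.9580, 0.4101

# Assumed two-sided t-table values for df = n - 1 = 49
ci90 = t_interval(mean, sd, n, t_crit=1.677)
ci95 = t_interval(mean, sd, n, t_crit=2.010)

print("90% CI:", tuple(round(x, 4) for x in ci90))
print("95% CI:", tuple(round(x, 4) for x in ci95))
```

Only the critical value changes between the two calls, which is why the 95% interval is wider than the 90% interval.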


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable  N   Mean    StDev   SE Mean  95% Upper Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552           -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable  N   Mean    StDev   SE Mean  95% CI            T      P
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)  -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable  N   Mean    StDev   SE Mean  95% Lower Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   2.8608           -0.72  0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable  N   Mean    StDev   SE Mean  95% Upper Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552           -0.72  0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when Ho: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject Ho.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
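The T and P columns of the three GPA tests can be checked with a short standard-library sketch. The helper names `t_pdf` and `t_cdf` are hypothetical, and the tail area is obtained by numerical integration (Simpson's rule) rather than by whatever routine the statistics package uses, so treat it as an approximation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] combined with symmetry."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 + area if t > 0 else 0.5 - area

# Summary statistics and hypothesized mean from the GPA output
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                 # HA: mu < 3 (left tail)
p_greater = 1 - p_less                     # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)   # HA: mu != 3

print(round(t_stat, 2), round(p_less, 3), round(p_two_sided, 3), round(p_greater, 3))
```

The same t statistic serves all three alternatives; only the tail used for the p-value changes.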

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
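The cholesterol calculation follows the table method: compute t from the summary statistics and compare its absolute value with the tabled critical value. A minimal sketch, where the function name `one_sample_t` is hypothetical and 2.064 is the table value quoted in the example:

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for H0: mu = mu0 from summary statistics."""
    return (xbar - mu0) / (s / math.sqrt(n))

t_stat = one_sample_t(xbar=220, mu0=211, s=38.6, n=25)
t_crit = 2.064                     # two-tailed table value for alpha = 0.05, df = 24
reject = abs(t_stat) >= t_crit     # table-method decision rule

print(round(t_stat, 2), reject)    # prints: 1.17 False
```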

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. For two samples, the null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


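Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group            N   Mean  Std Dev
Healthy          9   18.9  5.9
Cystic Fibrosis  13  11.9  6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

Pooled variance:

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{8(5.9)^2 + 12(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 $$

Standard error of the difference:

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

Test statistic (under H0, μ1 - μ2 = 0):

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086 $$

|t| > t(0.05, 20): reject H0, accept HA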
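The pooled-variance calculation for the Healthy and Cystic Fibrosis groups can be retraced step by step. This is a sketch from the summary statistics only; the function name `pooled_t` is hypothetical, and 2.086 is the table value quoted in the example.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / se_diff
    return pooled_var, se_diff, t_stat, df

# Healthy: n = 9, mean = 18.9, sd = 5.9; Cystic Fibrosis: n = 13, mean = 11.9, sd = 6.3
pooled_var, se_diff, t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

t_crit = 2.086                      # two-tailed table value for alpha = 0.05, df = 20
reject = abs(t_stat) > t_crit

print(round(pooled_var, 2), round(se_diff, 2), round(t_stat, 2), df, reject)
```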

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

Ho: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location  N   Mean    StDev   SE Mean
East      34  312702  188163  32270
West      26  339143  255571  50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441

90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719)

p-Value = 0.660 > α = 0.05, 0.10

Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
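The reported DF = 44, rather than 34 + 26 - 2 = 58, suggests that this output comes from the unequal-variance (Welch) form of the two-sample t-test. Under that assumption, the T-Value and DF can be approximately reproduced from the summary table; the function name `welch_t` is hypothetical, and the rounded group means are used, so small differences from the printed estimate are expected.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df, se_diff

# East: n = 34, mean = 312702, sd = 188163; West: n = 26, mean = 339143, sd = 255571
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(round(t_stat, 2), math.floor(df))
```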

Paired t-test

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs. after treatment; very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

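Paired t-test

Test infection of virus on tobacco leaves: number of dead spots on leaves

Plant  Virus 1 (X1j)  Virus 2 (X2j)  dj
1      9              10             -1
2      17             11             6
3      31             18             13
4      18             14             4
5      7              6              1
6      8              7              1
7      20             17             3
8      10             5              5
Total  120            88             32
Mean   15             11             4

H0: μd = 0    HA: μd ≠ 0

$$ s_d^2 = \frac{\sum_j d_j^2 - (\sum_j d_j)^2/n}{n - 1} = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1} = 18.57 $$

$$ s_d = \sqrt{18.57} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level.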
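The paired calculation on the tobacco-leaf data can be checked with the standard library. A minimal sketch: the variable names are illustrative, and 2.365 is the table value quoted in the example.

```python
import math
import statistics

# Counts per plant under each virus, from the table
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]   # per-plant differences
n = len(d)
d_bar = statistics.mean(d)        # mean difference
s_d = statistics.stdev(d)         # sample standard deviation (n - 1 denominator)
se = s_d / math.sqrt(n)           # standard error of the mean difference
t_stat = d_bar / se

t_crit = 2.365                    # two-tailed table value for alpha = 0.05, df = 7
reject = abs(t_stat) > t_crit

print(d_bar, round(s_d, 2), round(se, 2), round(t_stat, 2), reject)
```

Working on the differences turns the two-sample problem into a one-sample test of μd = 0 with n - 1 = 7 degrees of freedom.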

Paired t-test

Advantages
1) Usually it is easier to detect a true small difference
2) No need to consider whether the variances of the two populations are the same or not

Typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$ when the paired measurements are positively correlated.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

(1) Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


3. Use one of three possible decision rules (don't mix them up):

(2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the NULL hypothesis if the test statistic falls in the critical region.
- Fail to reject the NULL if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

3. Use one of three possible decision rules (don't mix them up):

(3) p-value method:
- Reject the NULL hypothesis if the p-value is less than α.
- Fail to reject the NULL if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the hypotheses below before hearing a case.

Hypothesis
Ho: The evidence is not significantly strong enough to prove the defendant guilty
H1: The evidence is significantly strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
                                     TRUE                     FALSE
Reject the null hypothesis           Type I error (α):        CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                  Type II error (β):
                                                              failing to reject a
                                                              false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage  Chances of sending an innocent man to jail
20%         1 in 5
5%          1 in 20
1%          1 in 100
0.01%       1 in 10000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
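The trade-off among α, β, and n can be seen numerically. The sketch below assumes a one-sided z-test with known σ, which is a simplification chosen for illustration and not the slide's own setup; the effect size, σ, and the function name `beta_one_sided` are all hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def beta_one_sided(alpha, n, effect, sigma):
    """Type II error rate of a one-sided z-test of H0: mu = mu0 vs HA: mu > mu0
    when the true mean is mu0 + effect and sigma is known."""
    z_crit = Z.inv_cdf(1 - alpha)          # rejection cut-off under H0
    shift = effect * n ** 0.5 / sigma      # true mean in standard-error units
    power = 1 - Z.cdf(z_crit - shift)      # probability of rejecting H0
    return 1 - power

effect, sigma = 0.5, 2.0                   # hypothetical values

beta_n25 = beta_one_sided(0.05, 25, effect, sigma)
beta_n100 = beta_one_sided(0.05, 100, effect, sigma)    # larger n, same alpha
beta_alpha01 = beta_one_sided(0.01, 25, effect, sigma)  # smaller alpha, same n

print(round(beta_n25, 3), round(beta_n100, 3), round(beta_alpha01, 3))
```

With α fixed, the larger sample gives the smaller β; with n fixed, tightening α from 0.05 to 0.01 raises β.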

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
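The three probability sampling strategies named above can be sketched with the standard library. Everything here is hypothetical: the sampling frame of 100 subjects, the `clinic` grouping variable, and the function names are made up for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 subjects spread evenly over 4 clinics
population = [{"id": i, "clinic": i % 4} for i in range(100)]

def simple_random(pop, k):
    """Every subject has the same chance of selection."""
    return random.sample(pop, k)

def stratified(pop, key, k_per_stratum):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, k_per_stratum))
    return sample

def cluster(pop, key, n_clusters):
    """Randomly choose whole clusters and keep every subject in them."""
    clusters = sorted({unit[key] for unit in pop})
    chosen = set(random.sample(clusters, n_clusters))
    return [unit for unit in pop if unit[key] in chosen]

srs = simple_random(population, 20)
strat = stratified(population, "clinic", 5)
clus = cluster(population, "clinic", 2)

print(len(srs), len(strat), len(clus))
```

In each case the selection probability of a subject is known in advance, which is what makes the result a probability sample.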

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (HO): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

             Reality
Decision     HO is true                    HO is false
Retain HO    Correct Decision              Type II Error (β) (.2, .1)
Reject HO    Type I Error (α) (.05, .01)   Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1-β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
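A power analysis of this kind can be sketched with the common normal-approximation formula for comparing two group means, n per group = 2((z(1-α/2) + z(1-β))σ/δ)². The formula assumes a two-sided test, equal group sizes, and a known common standard deviation; these assumptions, the function name `n_per_group`, and the numbers used are illustrative and not taken from the slides.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample comparison
    of means (normal approximation, equal group sizes, common sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)     # cut-off for the two-sided test
    z_beta = z.inv_cdf(1 - beta)           # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical expected effect of 5 units with a standard deviation of 10
n_80 = n_per_group(effect=5.0, sigma=10.0)               # beta = .2, power = .8
n_90 = n_per_group(effect=5.0, sigma=10.0, beta=0.10)    # beta = .1, power = .9

print(n_80, n_90)
```

Asking for a smaller β at the same α and effect size raises the required sample size, consistent with the relationship among α, β, and n.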

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
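The selection rules above can be written as a small lookup function. This is only a sketch of the rules as stated for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_predictors, n_groups=None, has_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_predictors < 1:
        raise ValueError("need at least one independent variable")
    if n_predictors == 1:
        # One independent variable: the number of groups decides the test
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if has_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2))
print(choose_test(3, has_continuous_predictor=True))
```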

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled value



Inferential Statistics

Inferential statistics allow us to make determinations about whether groups are significantly different from each other.

Inferential statistics is not a system of magic and trick mirrors. Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally.

[Diagram: random selection draws a sample of observations from the entire population of observations; statistical inference uses the sample statistic X̄ to estimate the population parameter μ.]

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it. Inferential statistics instead works this way:

1. Write a hypothesis you believe to be true.
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
3. Test the null, hoping to reject it.
   - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
   - If you fail to reject the null, reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn. These are:

- Sampling error - chance, random error
- Sample bias - constant error, due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome), that is, if the null hypothesis is true.

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and conclude that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics.
2. Calculate an inferential statistic.
3. Find its probability (p-value).
4. Based on the p-value, reject or fail to reject the null hypothesis (H0).
5. Draw a conclusion.


One-Sample T: GPAs

Variable   N    Mean   StDev  SE Mean  90% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8608, 3.0552)

Variable   N    Mean   StDev  SE Mean  95% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
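The two intervals in the GPA output can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not the Minitab computation itself: the helper name `t_interval` is illustrative, and the critical values 1.6766 and 2.0096 are assumed to have been read from a t table for 49 degrees of freedom.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the interval
    return mean - margin, mean + margin

# Summary statistics taken from the GPA output
mean, sd, n = 2.9580, 0.4101, 50

# Assumed table values of t for df = n - 1 = 49
T_90 = 1.6766   # two-sided 90% interval
T_95 = 2.0096   # two-sided 95% interval

ci90 = t_interval(mean, sd, n, T_90)
ci95 = t_interval(mean, sd, n, T_95)

print(f"90% CI: ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
print("95% interval is wider:", (ci95[1] - ci95[0]) > (ci90[1] - ci90[0]))
```

Only the critical value changes between the two calls, which is why a larger confidence coefficient always gives a wider interval for the same sample.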


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

                                       95% Upper
Variable   N    Mean   StDev  SE Mean      Bound      T      P
GPAs      50  2.9580  0.4101   0.0580     3.0552  -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean   StDev  SE Mean  95% CI                 T      P
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)   -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

                                       95% Lower
Variable   N    Mean   StDev  SE Mean      Bound      T      P
GPAs      50  2.9580  0.4101   0.0580     2.8608  -0.72  0.764

One-Sample T: GPAs, test of μ = 3 vs < 3 (N = 50, Mean = 2.9580, StDev = 0.4101, SE Mean = 0.0580, 95% upper bound = 3.0552, T = -0.72, P = 0.236)

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05, so we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
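To see where T = -0.72 and the three P values come from, the test statistic and its tail probabilities can be computed directly. This is a rough sketch using only the Python standard library; the functions `t_pdf` and `t_cdf` are illustrative helpers that integrate the Student t density numerically (Simpson's rule), which is an assumption of this example and not how Minitab obtains its p-values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics from the GPA output; H0: mu = 3
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                  # HA: mu < 3
p_greater = 1 - p_less                      # HA: mu > 3
p_two_sided = 2 * min(p_less, p_greater)    # HA: mu != 3

print(f"t = {t_stat:.2f}, df = {df}")
print(f"p (mu < 3)  = {p_less:.3f}")
print(f"p (mu != 3) = {p_two_sided:.3f}")
print(f"p (mu > 3)  = {p_greater:.3f}")
```

The same t statistic serves all three alternatives; only the tail area that is counted differs, which is why the two one-sided p-values sum to 1 and the two-sided p-value is twice the smaller of them.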

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0, HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2, HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


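We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N   Mean  Std Dev
Healthy           9   18.9      5.9
Cystic Fibrosis  13   11.9      6.3

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.74 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086 $$

Since |t| > t(0.05, 20), reject H0 and accept HA.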
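The hand calculation for the Healthy vs. Cystic Fibrosis comparison can be checked step by step. Below is a minimal Python sketch of the pooled-variance (equal variance) two-sample t-test working from summary statistics; the function name `pooled_t` is illustrative, and the critical value 2.086 is the table value quoted on the slide.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance.

    Returns (t, df, pooled_variance, standard_error_of_difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Healthy: n = 9, mean = 18.9, sd = 5.9; Cystic Fibrosis: n = 13, mean = 11.9, sd = 6.3
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086   # table value for alpha = 0.05 (two-tailed), df = 20
reject = abs(t) > T_CRIT

print(f"pooled variance = {pooled_var:.2f}")
print(f"SE of difference = {se_diff:.2f}")
print(f"t = {t:.2f} on {df} df")
print("reject H0" if reject else "fail to reject H0")
```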

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean   StDev  SE Mean
East      34  312702  188163    32270
West      26  339143  255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10, so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
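The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests that this output used the unpooled form of the test, often called Welch's t-test; that is an inference from the numbers, not something the slide states. Under that assumption, the sketch below recomputes the T-value, the degrees of freedom and the 90% interval from the summary table. The function name `welch_t` is illustrative, and the critical value 1.6802 is an assumed t-table value for 44 degrees of freedom.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df.

    Returns (t, df, standard_error_of_difference).
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2       # squared standard errors
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

# East: n = 34, mean = 312702, sd = 188163; West: n = 26, mean = 339143, sd = 255571
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)

T_CRIT_90 = 1.6802            # assumed table value: two-sided 90%, df = 44
diff = 312702 - 339143        # difference of the rounded means
ci = (diff - T_CRIT_90 * se, diff + T_CRIT_90 * se)

print(f"t = {t:.2f}, df = {math.floor(df)}")
print(f"90% CI for difference: ({ci[0]:.0f}, {ci[1]:.0f})")
```

Because the interval contains 0, it leads to the same decision as the p-value: no significant difference at the 10% level.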

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Example: Test infection of virus on tobacco leaves (number of death spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1              9              10        -1
2             17              11         6
3             31              18        13
4             18              14         4
5              7               6         1
6              8               7         1
7             20              17         3
8             10               5         5
Total        120              88        32
Mean          15              11         4

H0: μd = 0    HA: μd ≠ 0

$$ \bar{d} = 4, \qquad s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

Since t exceeds the critical value, we reject H0.
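The tobacco-leaf example can be recomputed from the raw counts. This is a small Python sketch of the paired t-test using the standard library; the variable names are illustrative, and the critical value 2.365 is the table value t(0.05, 7) quoted on the slide.

```python
import math
import statistics

# Death spots per plant under each virus (plants 1 to 8)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work with the within-plant differences d_j = X1j - X2j
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)

d_bar = statistics.mean(d)      # mean difference
s_d = statistics.stdev(d)       # sample SD of the differences (n - 1 divisor)
se = s_d / math.sqrt(n)         # standard error of the mean difference
t = d_bar / se

T_CRIT = 2.365                  # table value for alpha = 0.05 (two-tailed), df = 7
reject = abs(t) > T_CRIT

print("differences:", d)
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}")
print(f"t = {t:.2f} on {n - 1} df")
print("reject H0" if reject else "fail to reject H0")
```

Reducing each plant to a single difference is what turns the problem into a one-sample test on the differences, with n - 1 degrees of freedom.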

Advantages

1) Usually it is easier to find a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

$$ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 $$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.

- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.

- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form hypotheses as below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                             FALSE
Reject the null hypothesis            Type I error (α):                CORRECT
                                      rejecting a true
                                      null hypothesis
Fail to reject the null hypothesis    CORRECT                          Type II error (β):
                                                                       failing to reject a
                                                                       false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10000
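The "chances of getting it wrong" can be made concrete with a simulation: if the null hypothesis really is true and we test at level α, we should wrongly reject it in roughly a fraction α of repeated studies. The sketch below is an illustration under stated assumptions, not part of the original material: it draws samples from a standard normal population (so H0: μ = 0 is true), uses a two-sided z-test with known σ = 1, and the function name, sample size, trial count and seed are all arbitrary choices.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=4000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true: every observation comes from N(0, 1)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / n ** 0.5)   # z = mean / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                        # a Type I error
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.1%} of studies")
```

The simulated rates should land close to 1 in 5, 1 in 20 and 1 in 100, matching the first three rows of the table, up to simulation noise.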

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
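The trade-off between α, β and n can be seen numerically. The following is a minimal sketch, assuming a one-sided z-test with known standard deviation and a true effect of 0.5 standard deviations; the function name `beta` and the chosen effect size are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def beta(alpha, n, effect=0.5):
    """Type II error probability of a one-sided z-test.

    effect is the true shift in standard-deviation units, (mu1 - mu0) / sigma.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)              # reject H0 when Z > z_crit
    # Under the alternative, Z ~ N(effect * sqrt(n), 1)
    return z.cdf(z_crit - effect * n ** 0.5)

print("Fixed alpha = 0.05, increasing n:")
for n in (10, 25, 50):
    print(f"  n = {n:2d}: beta = {beta(0.05, n):.3f}")

print("Fixed n = 25, changing alpha:")
for alpha in (0.10, 0.05, 0.01):
    print(f"  alpha = {alpha:.2f}: beta = {beta(alpha, 25):.3f}")
```

With α held fixed, β falls as n grows; with n held fixed, shrinking α pushes β up. Only a larger sample lowers both at once.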

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome.

- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                 Reality
Decision         H0 is true                          H0 is false
Retain H0        Correct decision                    Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)       Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
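As an illustration of such a power calculation, the sketch below uses the standard normal-approximation formula for a two-sided one-sample z-test, n = ((z_{1-α/2} + z_{1-β}) / effect)², where effect is the expected difference in standard-deviation units. The function name `sample_size` and the example effect size of 0.5 are assumptions made for this example; a real study would usually use dedicated power-analysis software and a t-based correction.

```python
import math
from statistics import NormalDist

def sample_size(effect, alpha=0.05, beta=0.20):
    """Approximate n so a two-sided z-test has power 1 - beta.

    effect is the expected difference in standard-deviation units.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # guards against Type I error
    z_beta = z.inv_cdf(1 - beta)         # guards against Type II error
    return math.ceil(((z_alpha + z_beta) / effect) ** 2)

n_80 = sample_size(0.5, alpha=0.05, beta=0.20)   # 80% power
n_90 = sample_size(0.5, alpha=0.05, beta=0.10)   # 90% power

print("n for 80% power:", n_80)
print("n for 90% power:", n_90)
```

Demanding a smaller β (higher power) for the same effect and α always requires a larger sample.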

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:

- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise, two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- df = sample size - number of parameters estimated
- df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test - compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²) - tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement


Page 11: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential statistics is not a system of magic and trick mirrors Inferential statistics are based on the concepts of probability (what is likely to occur) and the idea that data distribute normally

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Sample of observations

Entire population of observations

StatisticX

Parametermicro=

Random selection

Statistical inference

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential statistics is based on a strange and mystical concept called falsification

Although you might think the process is simple

Write a hypothesis test it hope to prove it

Inferential statistics works this way

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Write a hypothesis you believe to be true

Write the OPPOSITE of this hypothesis which is called the null hypothesis

Test the null hoping to reject it- If the null is rejected you have evidence

that the hypothesis you believe to be true may be true

- If the null is failed to reject reach no conclusion

Inferential Statistics

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential Statistics

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics address only random error (chance).

Inferential Statistics

p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining results at least as extreme as those observed if the samples really are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

Inferential Statistics

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the observed data are consistent with the null hypothesis.

The p-value provides an estimate of how often we would get a result at least as extreme as the obtained one by chance if in fact the null hypothesis were true.
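The idea of "how often we would get the obtained result by chance" can be made concrete with an exact permutation test. The sketch below is only an illustration: the two small groups are hypothetical data (not from the slides), and the function name `permutation_p_value` is made up for this example. It relabels the pooled observations in every possible way and counts how often the difference in means is at least as large as the observed one.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Under the null hypothesis the group labels are arbitrary, so every
    # way of choosing n_a values for "group A" is equally likely.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical outcome scores for two small groups (illustrative only).
control = [12, 14, 15, 17]
treated = [20, 21, 23, 25]

p = permutation_p_value(control, treated)
print(f"p = {p:.4f}")
```

With these made-up data only the two most extreme relabelings out of 70 reach the observed difference, so the p-value is 2/70. The t-tests in the following slides replace this brute-force count with a theoretical distribution.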

Inferential Statistics

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not enough evidence that the treatment or the predictor variable had an effect on the outcome.

Inferential Statistics

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion

Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean   StDev  SE Mean  90% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8608, 3.0552)

Variable   N    Mean   StDev  SE Mean  95% CI
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
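The two intervals for the GPA data can be approximately reproduced from the summary statistics alone. The sketch below assumes the usual formula, mean ± t × s/√n, and takes the critical t values for 49 degrees of freedom from a standard t table (they are rounded and are not shown on the slides, so treat them as assumptions). The function name is illustrative.

```python
import math


def t_confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) for mean +/- t_critical * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    half_width = t_critical * se
    return mean - half_width, mean + half_width


# Summary statistics for the GPA sample.
mean, sd, n = 2.9580, 0.4101, 50

# Assumed two-sided critical values of t with n - 1 = 49 degrees of
# freedom, rounded from a standard t table.
T_CRIT = {90: 1.677, 95: 2.010}

intervals = {
    level: t_confidence_interval(mean, sd, n, t) for level, t in T_CRIT.items()
}
for level, (lower, upper) in intervals.items():
    print(f"{level}% CI: ({lower:.4f}, {upper:.4f}), width = {upper - lower:.4f}")
```

Because only the critical value changes between the two calls, the 95% interval is necessarily wider than the 90% interval.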

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean   StDev  SE Mean  95% Upper Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           3.0552  -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean   StDev  SE Mean  95% CI                T      P
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)  -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean   StDev  SE Mean  95% Lower Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           2.8608  -0.72  0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean   StDev  SE Mean  95% Upper Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           3.0552  -0.72  0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05, so we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
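For readers without Minitab, the t statistic and the three p-values for the GPA example can be approximated with the Python standard library. This is a sketch, not the software's method: the t distribution's cumulative probability is obtained here by numerically integrating its density with Simpson's rule, and the helper names `t_pdf` and `t_cdf` are made up for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary statistics for the GPA sample and the hypothesized mean.
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0

se = sd / math.sqrt(n)
t_stat = (mean - mu0) / se
df = n - 1

p_less = t_cdf(t_stat, df)                 # HA: mu < 3
p_greater = 1 - p_less                     # HA: mu > 3
p_two_sided = 2 * min(p_less, p_greater)   # HA: mu != 3

print(f"t = {t_stat:.2f}, df = {df}")
print(f"p (mu < 3) = {p_less:.3f}")
print(f"p (mu != 3) = {p_two_sided:.3f}")
print(f"p (mu > 3) = {p_greater:.3f}")
```

The three p-values should come out close to the 0.236, 0.472 and 0.764 reported in the Minitab output, and they show why the direction of the alternative hypothesis has to be fixed before looking at the data.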

One-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution, so no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

$H_0: \mu = 211$ mg/ml

$H_A: \mu \neq 211$ mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
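The cholesterol calculation uses the table (critical value) decision rule rather than a p-value. A minimal sketch of that rule follows; the critical value 2.064 is the tabled t value quoted on the slide, and the function name is illustrative.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, mu0):
    """t statistic for H0: mu = mu0 from summary statistics."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


# Cholesterol example: sample mean 220, s = 38.6, n = 25, hypothesized mean 211.
t_stat = one_sample_t(220, 38.6, 25, 211)

# Two-sided critical value for t(24) at alpha = 0.05, as quoted on the slide.
T_TABLE = 2.064

reject_h0 = abs(t_stat) > T_TABLE
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Because |t| is below the tabled value, the rule returns "fail to reject", matching the slide's conclusion.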

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Degrees of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution, so no direction for the expected results is specified. The null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

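We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2$, $H_A: \mu_1 \neq \mu_2$

Group             N   Mean  Std Dev
Healthy           9   18.9      5.9
Cystic Fibrosis  13   11.9      6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

$t_{0.05,\,20} = 2.086$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$|t| > t_{0.05,\,20}$: reject $H_0$, accept $H_A$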
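The pooled-variance calculation for the healthy versus cystic fibrosis groups can be checked from the summary statistics. The sketch below follows the common-variance formulas for this example; the function name `pooled_t` is illustrative, and 2.086 is the tabled t value for 20 degrees of freedom quoted on the slide.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common population variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / se_diff
    return t_stat, df, pooled_var, se_diff


# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3.
t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_TABLE = 2.086  # two-sided critical value for t(20) at alpha = 0.05
print(f"pooled variance = {pooled_var:.2f}, SE of difference = {se_diff:.2f}")
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {abs(t_stat) > T_TABLE}")
```

The pooled form is only appropriate when the two sample standard deviations are similar, as they are here (5.9 and 6.3).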

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: select SalePrice

Subscripts: select Location

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (West, East); vertical axis SalePrice from 100000 to 1000000.]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean   StDev  SE Mean
East      34  312702  188163    32270
West      26  339143  255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
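The reported DF = 44, rather than 34 + 26 - 2 = 58, suggests that the output comes from the unpooled (Welch) form of the two-sample t-test; that is an inference from the numbers, not something the slides state. Under that assumption, the T-value and degrees of freedom can be approximately recovered from the summary table with the Welch-Satterthwaite formula. The function name is illustrative.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1 = sd1**2 / n1  # squared standard error of the first mean
    v2 = sd2**2 / n2  # squared standard error of the second mean
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df


# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571.
t_stat, df = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t_stat:.2f}, df = {df:.1f} (truncated to {math.floor(df)})")
```

Truncating the fractional degrees of freedom to a whole number is a common software convention and is assumed here to explain the reported value of 44.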

Paired t-test

Test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$H_0: \mu_d = 0$, $H_A: \mu_d \neq 0$

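Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1                   9              10   -1
2                  17              11    6
3                  31              18   13
4                  18              14    4
5                   7               6    1
6                   8               7    1
7                  20              17    3
8                  10               5    5
Total             120              88   32
Mean               15              11    4

$H_0: \mu_d = 0$, $H_A: \mu_d \neq 0$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, H0 is rejected at α = 0.05.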
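The paired calculation for the tobacco-leaf data (plants 1 to 8: virus 1 counts 9, 17, 31, 18, 7, 8, 20, 10; virus 2 counts 10, 11, 18, 14, 6, 7, 17, 5) can be reproduced directly from the raw counts. The sketch below uses only the standard library; 2.365 is the tabled t value for 7 degrees of freedom quoted on the slide, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Number of dead spots per plant under each virus preparation.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences, not on the two samples separately.
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)

d_bar = mean(d)            # mean difference
s_d = stdev(d)             # sample SD of the differences (n - 1 denominator)
se_d = s_d / math.sqrt(n)  # standard error of the mean difference
t_stat = d_bar / se_d

T_TABLE = 2.365  # two-sided critical value for t(7) at alpha = 0.05
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}")
print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > T_TABLE}")
```

Note that the unrounded t is slightly below the 2.63 obtained by hand from the rounded standard error 1.52; the decision is the same.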

Paired t-test

Advantages
1) It is usually easier to detect a true small difference
2) There is no need to consider whether the variances of the two populations are the same

22

21 xxd ss

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

[Figure: Rejection region at the α = 0.10 significance level]

[Figure: Rejection region at the α = 0.05 significance level]

[Figure: Rejection region at the α = 0.10 significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

3. p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

Hypothesis Testing

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
                                     TRUE                                FALSE
Reject the null hypothesis           Type I error (α):                   CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                             Type II error (β):
                                                                         failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. The evidence is not sufficient, so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
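The "chances of getting it wrong" can be checked by simulation: if the null hypothesis really is true and we test at level α, we should wrongly reject it in roughly a fraction α of repeated studies. The sketch below is an illustration under simplifying assumptions that are not in the slides: a two-sided z-test with a known standard deviation of 1, samples of size 25, and a fixed random seed. The function name is made up for this example.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha, n=25, trials=20000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Data are generated with mean 0, so H0: mu = 0 is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # sigma = 1 is assumed known
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rates = {alpha: type_i_error_rate(alpha) for alpha in (0.20, 0.05, 0.01)}
for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.3f} of studies")
```

Each simulated rejection rate should land close to the α that was chosen, which is what it means for α to be the probability of a Type I error.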

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
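The trade-off between the Type I error rate, the Type II error rate and the sample size n can be seen numerically. The sketch below is an illustration under assumptions that are not in the slides: a one-sided z-test of H0: μ = 0 with known σ = 1 and a true mean of 0.5. In that setting the Type II error probability is Φ(z₁₋α − δ√n/σ), where Φ is the standard normal CDF. The function name `type_ii_error` is made up for this example.

```python
from statistics import NormalDist


def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 when the true mean is `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # reject H0 when the z statistic exceeds this
    # Under the alternative, the z statistic is centred at effect * sqrt(n) / sigma.
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)


print(f"alpha=0.05, n=20: beta = {type_ii_error(0.05, 20):.3f}")
print(f"alpha=0.05, n=40: beta = {type_ii_error(0.05, 40):.3f}  (larger n, smaller beta)")
print(f"alpha=0.01, n=20: beta = {type_ii_error(0.01, 20):.3f}  (smaller alpha, larger beta)")
print(f"alpha=0.10, n=20: beta = {type_ii_error(0.10, 20):.3f}  (larger alpha, smaller beta)")
print(f"alpha=0.01, n=80: beta = {type_ii_error(0.01, 80):.3f}  (both smaller, but n had to grow)")
```

The last line shows the only way to lower both error rates at once: collect more data.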

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality:
Decision      H0 is true            H0 is false
Retain H0     Correct decision      Type II error (β)
                                    (0.2, 0.1)
Reject H0     Type I error (α)      Correct decision
              (0.05, 0.01)

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
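A sample-size calculation of this kind can be sketched with the normal approximation n = ((z₁₋α/₂ + z₁₋β) · σ / δ)², rounded up. This formula is a standard textbook approximation for a two-sided, one-sample z-test with known σ; it is offered here as an assumption for illustration, and real studies (t-tests, two groups, dropouts) need adjustments. The function name and the effect size of half a standard deviation are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n so that a two-sided z-test has power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error rate
    z_beta = z.inv_cdf(1 - beta)        # controls the Type II error rate
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected effect of half a standard deviation.
n_80 = sample_size(effect=0.5, sigma=1.0, beta=0.2)  # 80% power
n_90 = sample_size(effect=0.5, sigma=1.0, beta=0.1)  # 90% power
print(f"n for 80% power: {n_80}")
print(f"n for 90% power: {n_90}")
```

Demanding a smaller β (more power) or expecting a smaller effect both push the required sample size up.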

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and a lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
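The selection rules on these two slides, for a continuous outcome, can be written as a small lookup function. This is only a restatement of the slides in code; the function and parameter names are hypothetical, and it deliberately covers nothing beyond the cases listed.

```python
def choose_test(n_groups=None, n_independent_vars=1, any_continuous_predictor=False):
    """Suggest a test for a CONTINUOUS outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, give n_groups >= 2")
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("n_independent_vars must be at least 1")


print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_independent_vars=2))
print(choose_test(n_independent_vars=3, any_continuous_predictor=True))
```

Categorical outcomes (for example, the chi-square test) are outside the scope of this helper.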

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter

df = sample size - number of parameters estimated

df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct test.
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable)

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class, then give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 12: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Sample of observations

Entire population of observations

StatisticX

Parametermicro=

Random selection

Statistical inference

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential statistics is based on a strange and mystical concept called falsification

Although you might think the process is simple

Write a hypothesis test it hope to prove it

Inferential statistics works this way

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Write a hypothesis you believe to be true

Write the OPPOSITE of this hypothesis which is called the null hypothesis

Test the null hoping to reject it- If the null is rejected you have evidence

that the hypothesis you believe to be true may be true

- If the null is failed to reject reach no conclusion

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsTwo major sources of error in researchInferential statistics are used to make generalizations

from a sample to a population There are two sources of error (described in the Sampling module) that may result in a samples being different from (not representative of) the population from which it is drawn

These are

Sampling error - chance random error Sample bias - constant error due to inadequate design

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsInferential statistics take into account sampling

error These statistics do not correct for sample bias That is a research design issue Inferential statistics only address random error (chance)

Sampling error - chance random error Sample bias - constant error due to inadequate design

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

p-p-valuevalueThe reason for calculating an inferential statistic is

to get a p-value (p = probability) The p value is the probability that the samples are from the same population with regard to the dependent variable (outcome)

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome The p- value is directly related to the null hypothesis

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis; it is not the probability that the null hypothesis is true.

The p-value provides an estimate of how often we would get a result at least as extreme as the obtained one by chance if in fact the null hypothesis were true.

If the p-value is small (p < α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (p > α), fail to reject the null hypothesis: the data do not provide sufficient evidence that the treatment or the predictor variable had an effect on the outcome. This is not proof that there was no effect.

Steps for testing hypotheses

1. Calculate descriptive statistics.
2. Calculate an inferential statistic.
3. Find its probability (p-value).
4. Based on the p-value, reject or fail to reject the null hypothesis (H0).
5. Draw a conclusion.

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
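The two intervals above can be reproduced from the summary statistics alone. The following is a minimal sketch, not the software's own routine: the critical values in `T_CRIT` are an assumption copied from a t table for 49 degrees of freedom, and the function name `t_interval` is purely illustrative.

```python
import math

# Summary statistics from the GPA output above.
n, mean, sd = 50, 2.9580, 0.4101

# Assumption: two-sided critical values of t with n - 1 = 49 df,
# copied from a t table rather than computed.
T_CRIT = {0.90: 1.6766, 0.95: 2.0096}

def t_interval(mean, sd, n, level):
    """Return (lower, upper) for mean +/- t* x sd/sqrt(n)."""
    se = sd / math.sqrt(n)          # standard error of the mean
    margin = T_CRIT[level] * se     # half-width of the interval
    return mean - margin, mean + margin

ci90 = t_interval(mean, sd, n, 0.90)
ci95 = t_interval(mean, sd, n, 0.95)
print(ci90, ci95)
```

Because the 95% critical value is larger than the 90% one, the 95% interval is necessarily wider for the same data.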

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The null hypothesis is rejected when the p-value is less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
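The t statistic and the three p-values reported for the GPA data can be checked with a short script. This is a sketch under stated assumptions: it uses only the Python standard library, so the t distribution's cumulative probability is approximated by numerically integrating its density with Simpson's rule. The helper names `t_pdf` and `t_cdf` are illustrative, not part of any statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): integrate the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics from the GPA output above.
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                # HA: mu < 3 (left tail)
p_greater = 1 - p_less                    # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)  # HA: mu != 3
print(round(t_stat, 2), round(p_less, 3), round(p_two_sided, 3), round(p_greater, 3))
```

The same t statistic gives three different p-values, which is why the direction of the alternative hypothesis has to be fixed before the data are analysed.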

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows:

H0: μ = μ0    HA: μ ≠ μ0

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17 $$

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t_{0.05,\,24} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.

[Figure omitted]

Two-sample t-test

Degrees of Freedom (df)

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows:

H0: μ1 = μ2    HA: μ1 ≠ μ2

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


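We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df), \qquad df = n_1 + n_2 - 2 $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ t_{0.05,\,20} = 2.086 $$

Since |t| > t(0.05, 20), reject H0 and accept HA.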
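The pooled-variance calculation for the healthy and cystic fibrosis groups can be sketched in a few lines. The function name `pooled_t` is illustrative, and the critical value 2.086 is taken from the text rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3
t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT_20 = 2.086  # two-sided 5% critical value for 20 df, as quoted in the text
reject_h0 = abs(t_stat) > T_CRIT_20
print(round(t_stat, 2), df, reject_h0)
```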

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice; x-axis: Location (East, West); y-axis: Sale Price, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05 (or 0.10). Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
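The reported T-Value and DF = 44 are consistent with an unpooled (Welch) two-sample test, in which the degrees of freedom come from the Welch-Satterthwaite approximation rather than n1 + n2 - 2 = 58. That reading is an assumption based on the numbers in the output, not something the slides state. A minimal sketch, with the illustrative function name `welch_t`:

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df, se_diff

# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(round(t_stat, 2), math.floor(df))
```

The approximate df is truncated to a whole number here, which matches the DF shown in the output above.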

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$ s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57 $$

$$ s_d = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05} $$

$$ t_{0.05\,(7)} = 2.365 $$
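The paired calculation for the eight tobacco plants can be reproduced directly from the table. This sketch uses only the standard library; the critical value 2.365 is the tabled value quoted in the text, and the variable names are illustrative.

```python
import math
import statistics

# Counts per plant, copied from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [a - b for a, b in zip(virus1, virus2)]   # d_j = X1j - X2j
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)      # sample SD, n - 1 in the denominator
se_d = s_d / math.sqrt(n)          # standard error of the mean difference
t_stat = d_bar / se_d

T_CRIT_7 = 2.365                   # two-sided 5% critical value for 7 df
reject_h0 = abs(t_stat) > T_CRIT_7
print(d_bar, round(s_d, 2), round(t_stat, 2), reject_h0)
```

Only the differences enter the test, so the plant-to-plant variation that is common to both measurements drops out of the standard error.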

Advantages

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

22

21 xxd ss


Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic.
   - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
   - If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
   - Reject the null hypothesis if the test statistic falls in the critical region.
   - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
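The table method and the p-value method are two ways of writing the same comparison. The sketch below is only an illustration with hypothetical function names; the inputs are the cholesterol example (|t| = 1.17 against a critical value of 2.064) and the GPA example (p = 0.236 at α = 0.05) from earlier in the deck.

```python
def decide_by_table(t_stat, t_crit):
    """Table method: reject H0 when |t| reaches the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

table_decision = decide_by_table(t_stat=1.17, t_crit=2.064)   # cholesterol example
p_decision = decide_by_p_value(p_value=0.236, alpha=0.05)     # GPA example
print(table_decision, "|", p_decision)
```

Whichever rule is chosen, it should be applied consistently: a test statistic is compared with a critical value, and a p-value is compared with α, never the other way round.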

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                     FALSE
Reject the null hypothesis           Type I error (α):        CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                  Type II error (β):
                                                              failing to reject a
                                                              false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
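These trade-offs can be seen numerically. The sketch below is an illustration only: it assumes a one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with known standard deviation, and the effect size, σ and sample sizes are hypothetical values chosen for the example. The function name `beta` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error rate of a one-sided z-test (H0: mu = mu0 vs HA: mu > mu0).

    `effect` is the assumed true shift mu - mu0; all inputs are illustrative.
    """
    z_crit = Z.inv_cdf(1 - alpha)                 # rejection cut-off under H0
    return Z.cdf(z_crit - effect * n ** 0.5 / sigma)

b_base = beta(alpha=0.05, n=25)
b_larger_n = beta(alpha=0.05, n=50)        # same alpha, larger sample
b_smaller_alpha = beta(alpha=0.01, n=25)   # same n, stricter alpha
print(round(b_base, 3), round(b_larger_n, 3), round(b_smaller_alpha, 3))
```

With α fixed, doubling n lowers β; with n fixed, tightening α from 0.05 to 0.01 raises β.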

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:

- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the p-value (observed significance level) of the test.

P-value

The probability of an outcome at least as extreme as the observed one, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Reality ->     H0 is true                         H0 is false
Decision
Retain H0      Correct decision                   Type II error (β) (0.2, 0.1)
Reject H0      Type I error (α) (0.05, 0.01)      Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
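A sample-size calculation of this kind can be sketched with the usual normal approximation. The assumptions here are mine, not the slides': a two-sided one-sample z-test with known σ, and hypothetical values for the expected effect and σ. The function name `sample_size` is illustrative.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate smallest n for a two-sided one-sample z-test with power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cut-off for the significance level
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: expected effect of half a standard deviation.
n_needed = sample_size(effect=0.5, sigma=1.0)
n_higher_power = sample_size(effect=0.5, sigma=1.0, beta=0.1)
print(n_needed, n_higher_power)
```

Asking for a smaller β (higher power) or trying to detect a smaller effect both push the required sample size up.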

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:

- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
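The selection rules above amount to a small decision function. The sketch below simply encodes them for a continuous outcome; the function and parameter names are hypothetical and cover only the cases listed.

```python
def choose_test(n_groups=2, n_predictors=1, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_predictors == 1:
        if n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_predictors=2, any_continuous_predictor=True))
```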

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA: analysis of variance
- more than 2 means to compare, or more than 2 testings of means
- interval/ratio level of measurement

Chi-square (χ2): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

The t test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ2 is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 13: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

[Figure: a sample of observations (statistic X̄) is drawn by random selection from the entire population of observations (parameter μ = ?); statistical inference runs from the sample back to the population]

Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, hope to prove it.

Instead, inferential statistics works this way:

1. Write a hypothesis you believe to be true.
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis.
3. Test the null, hoping to reject it.
   - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
   - If you fail to reject the null, reach no conclusion.

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsTwo major sources of error in researchInferential statistics are used to make generalizations

from a sample to a population There are two sources of error (described in the Sampling module) that may result in a samples being different from (not representative of) the population from which it is drawn

These are

Sampling error - chance random error Sample bias - constant error due to inadequate design

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsInferential statistics take into account sampling

error These statistics do not correct for sample bias That is a research design issue Inferential statistics only address random error (chance)

Sampling error - chance random error Sample bias - constant error due to inadequate design

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

p-p-valuevalueThe reason for calculating an inferential statistic is

to get a p-value (p = probability) The p value is the probability that the samples are from the same population with regard to the dependent variable (outcome)

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome The p- value is directly related to the null hypothesis

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

The p-value determines whether or not we reject the null hypothesis We use it to estimate whether or not we think the null hypothesis is true

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

If the p-value (lt α- value) is small reject the null hypothesis and accept that the samples are truly different with regard to the outcome

If the p-value (gt α- value) is large fail to reject the null hypothesis and conclude that the treatment or the predictor variable had no effect on the outcome

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsSteps for testing hypotheses Calculate descriptive statistics Calculate an inferential statistic Find its probability (p-value) Based on p-value accept or reject the null hypothesis (H0) Draw conclusion

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Variable N Mean StDev SE Mean 90 CI

GPAs 50 29580 04101 00580 (28608 30552)

Variable N Mean StDev SE Mean 95 CI

GPAs 50 29580 04101 00580 (28414 30746)

Sample mean = 296

- 90 confident that μ (population mean) is between 286 and 306

- 95 confident that μ (population mean) is between 284 and 307

95 CI width gt 90 CI width

The larger confidence coefficient the greater the CI width


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed, lower tail)

Variable   N    Mean   StDev  SE Mean  95% Upper Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           3.0552  -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean   StDev  SE Mean  95% CI                 T      P
GPAs      50  2.9580  0.4101   0.0580  (2.8414, 3.0746)   -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tailed, upper tail)

Variable   N    Mean   StDev  SE Mean  95% Lower Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           2.8608  -0.72  0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean   StDev  SE Mean  95% Upper Bound      T      P
GPAs      50  2.9580  0.4101   0.0580           3.0552  -0.72  0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
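As a check on the three MINITAB outputs, the test statistic and the p-value for each alternative can be computed from the summary statistics. This is a sketch assuming SciPy; the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

# Summary statistics from the GPAs output; mu0 is the hypothesized mean
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0

se = sd / sqrt(n)              # standard error of the mean
t_stat = (mean - mu0) / se     # distance from mu0 in standard errors
df = n - 1

p_less = stats.t.cdf(t_stat, df)                  # HA: mu < 3 (lower tail)
p_greater = stats.t.sf(t_stat, df)                # HA: mu > 3 (upper tail)
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)     # HA: mu != 3

print(f"t = {t_stat:.2f}")
print(f"p (mu < 3)  = {p_less:.3f}")
print(f"p (mu != 3) = {p_two_sided:.3f}")
print(f"p (mu > 3)  = {p_greater:.3f}")
```

The two one-tailed p-values always sum to 1, and the two-tailed p-value is twice the smaller of them, which is why the three outputs report 0.236, 0.472 and 0.764 for the same t.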

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0, HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
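The cholesterol calculation can be checked numerically. This sketch assumes SciPy and uses `stats.t.ppf` in place of the printed t table; the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

# Values from the cholesterol example
x_bar, mu0, s, n = 220.0, 211.0, 38.6, 25
alpha = 0.05

t_stat = (x_bar - mu0) / (s / sqrt(n))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # two-tailed critical value, t(24)

reject_h0 = abs(t_stat) >= t_crit
print(f"t = {t_stat:.2f}, critical value = {t_crit:.3f}, reject H0: {reject_h0}")
```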

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population


Two-sample t-test

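We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N   Mean  Std Dev
Healthy           9   18.9      5.9
Cystic Fibrosis  13   11.9      6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Critical value: $t_{0.05,\,20} = 2.086$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

Since |t| > $t_{0.05,\,20}$, reject H0 and accept HA.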
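The pooled-variance calculation can be followed step by step in code and cross-checked against SciPy's `ttest_ind_from_stats`. This is a sketch under the equal-variance assumption stated above; the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

# Summary statistics from the healthy vs cystic fibrosis example
n1, mean1, sd1 = 9, 18.9, 5.9
n2, mean2, sd2 = 13, 11.9, 6.3

# Pooled variance and standard error of the difference in means
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_manual = (mean1 - mean2) / se_diff

# The same test computed by SciPy (equal_var=True gives the pooled test)
res = stats.ttest_ind_from_stats(mean1, sd1, n1, mean2, sd2, n2, equal_var=True)

t_crit = stats.t.ppf(0.975, df)   # two-tailed critical value at alpha = 0.05
print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}")
print(f"t = {t_manual:.2f} (SciPy: {res.statistic:.2f}), p = {res.pvalue:.3f}")
print(f"critical value = {t_crit:.3f}, reject H0: {abs(t_manual) > t_crit}")
```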

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

1. Stat > Basic Statistics > 2-Sample t
2. Choose "Samples in one column"
3. Samples: select SalePrice
4. Subscripts: select Location
5. Click Graphs, check "Boxplots of data", OK
6. Click Options; Confidence level: enter 90; OK, OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean   StDev  SE Mean
East      34  312702  188163    32270
West      26  339143  255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The estimated difference in mean sale prices is -$26,442 (East minus West). We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 or 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
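The DF = 44 in this output suggests that the unequal-variance (Welch) form of the two-sample test was used, since a pooled test would have 34 + 26 - 2 = 58 degrees of freedom. The sketch below reproduces the output from the summary statistics under that assumption, using SciPy. Truncating the Welch degrees of freedom to an integer before computing the interval is also an assumption made here so that the bounds match.

```python
from math import sqrt, floor
from scipy import stats

# Summary statistics from the SalePrice output
n_e, mean_e, sd_e = 34, 312702.0, 188163.0   # East
n_w, mean_w, sd_w = 26, 339143.0, 255571.0   # West

# Welch test: variances are not pooled
res = stats.ttest_ind_from_stats(mean_e, sd_e, n_e, mean_w, sd_w, n_w,
                                 equal_var=False)

# Welch-Satterthwaite degrees of freedom
a, b = sd_e**2 / n_e, sd_w**2 / n_w
welch_df = (a + b) ** 2 / (a**2 / (n_e - 1) + b**2 / (n_w - 1))

# 90% confidence interval for mu(East) - mu(West)
diff = mean_e - mean_w
se_diff = sqrt(a + b)
t_crit = stats.t.ppf(0.95, floor(welch_df))
ci_low, ci_high = diff - t_crit * se_diff, diff + t_crit * se_diff

print(f"t = {res.statistic:.2f}, p = {res.pvalue:.3f}, df = {welch_df:.1f}")
print(f"90% CI for difference: ({ci_low:.0f}, {ci_high:.0f})")
```

Because zero lies inside the 90% interval, the interval and the p-value lead to the same decision.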

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1                   9              10   -1
2                  17              11    6
3                  31              18   13
4                  18              14    4
5                   7               6    1
6                   8               7    1
7                  20              17    3
8                  10               5    5
Total             120              88   32
Mean               15              11    4

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so P < 0.05 and H0 is rejected.
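The tobacco-leaf example can be verified with SciPy's paired test. The sketch below also recomputes the intermediate quantities by hand so they can be compared with the worked values; the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

# Dead spots per plant for the two viruses (from the table above)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Hand calculation on the paired differences
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)
d_bar = sum(d) / n
s_d = sqrt(sum((dj - d_bar) ** 2 for dj in d) / (n - 1))
se_d = s_d / sqrt(n)
t_manual = d_bar / se_d

# The same test via SciPy
res = stats.ttest_rel(virus1, virus2)
t_crit = stats.t.ppf(0.975, n - 1)   # two-tailed, alpha = 0.05, df = 7

print(f"mean difference = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}")
print(f"t = {t_manual:.2f}, p = {res.pvalue:.3f}, critical value = {t_crit:.3f}")
```

The unrounded t is slightly below the 2.63 obtained from the rounded standard error 1.52, but the decision is the same.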

Advantages

1) Usually it is easier to detect a true small difference (compare $s_{\bar{d}}^2$ with $s_{\bar{x}_1 - \bar{x}_2}^2$)
2) There is no need to consider whether the variances of the two populations are the same or not

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
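The table method and the p-value method are two views of the same comparison, so for a given test they give the same decision. The sketch below illustrates this for a two-tailed t-test, assuming SciPy; the function `decide` is a hypothetical helper, and the two calls reuse t values and degrees of freedom from the cholesterol and cystic fibrosis examples in this deck.

```python
from scipy import stats

def decide(t_stat, df, alpha=0.05):
    """Return (reject by table method, reject by p-value method), two-tailed."""
    t_crit = stats.t.ppf(1 - alpha / 2, df)        # value a t table would give
    p_value = 2 * stats.t.sf(abs(t_stat), df)      # exact two-tailed p-value
    reject_by_table = bool(abs(t_stat) >= t_crit)
    reject_by_p = bool(p_value < alpha)
    return reject_by_table, reject_by_p

print(decide(1.17, 24))   # cholesterol example: neither rule rejects
print(decide(2.63, 20))   # cystic fibrosis example: both rules reject
```

The advice not to mix the rules up still applies: compare a test statistic with a critical value, or a p-value with α, never one against the other.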

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutor), the defendant (the accused), and the judge. Before hearing a case, the judge forms the hypotheses below.

Hypotheses
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

                                          NULL HYPOTHESIS
                                 TRUE                       FALSE
Reject the null hypothesis       Type I error (α):          CORRECT
                                 rejecting a true
                                 null hypothesis
Fail to reject the null          CORRECT                    Type II error (β):
hypothesis                                                  failing to reject a
                                                            false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
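The relationships among α, β and n can be seen numerically. The sketch below uses a one-sided z-test with known standard deviation as a simple stand-in, which is an assumption made for illustration rather than a test from the slides. The function `type_ii_error` and the effect size of 0.3 standard deviations are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def type_ii_error(alpha, n, effect, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = mu0 vs HA: mu = mu0 + effect."""
    z_crit = norm.ppf(1 - alpha)                    # rejection cutoff under H0
    # Probability the test statistic stays below the cutoff when HA is true
    return norm.cdf(z_crit - effect * sqrt(n) / sigma)

effect = 0.3   # hypothetical true effect, in standard deviation units

# Fixed alpha: a larger sample size gives a smaller beta
for n in (25, 50, 100):
    print(f"alpha=0.05, n={n:3d}: beta = {type_ii_error(0.05, n, effect):.3f}")

# Fixed n: a smaller alpha gives a larger beta
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}, n= 50: beta = {type_ii_error(alpha, 50, effect):.3f}")
```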

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population? There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality
Decision        H0 is true                       H0 is false
Retain H0       Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)    Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
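A rough sample-size calculation of this kind can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The formula choice, the function `n_per_group`, and the example effect of half a standard deviation are assumptions for illustration. Dedicated power-analysis software uses the t distribution and will give slightly larger numbers.

```python
from math import ceil
from scipy.stats import norm

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    z_alpha = norm.ppf(1 - alpha / 2)   # controls the Type I error rate
    z_beta = norm.ppf(1 - beta)         # controls the Type II error rate
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: expected effect of 0.5 SD, alpha 0.05, power 0.80
print(n_per_group(effect=0.5, sigma=1.0))

# Asking for a smaller beta (power 0.90) requires more subjects per group
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))
```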

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables

T test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
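With exactly two independent groups, one-way ANOVA and the pooled-variance t-test are equivalent: the F statistic equals t squared and the p-values coincide. The sketch below illustrates this with SciPy. The anxiety scores are made-up values for illustration only, not data from the slides.

```python
from scipy import stats

# Hypothetical anxiety inventory scores (illustrative data only)
sociology = [52, 61, 47, 58, 66, 55, 49, 63]
political_science = [57, 68, 59, 71, 62, 65, 54, 70]

# Independent-samples t-test (pooled variance is SciPy's default)
t_res = stats.ttest_ind(sociology, political_science)

# One-way ANOVA on the same two groups
f_res = stats.f_oneway(sociology, political_science)

print(f"t = {t_res.statistic:.3f}, t^2 = {t_res.statistic**2:.3f}, p = {t_res.pvalue:.4f}")
print(f"F = {f_res.statistic:.3f}, p = {f_res.pvalue:.4f}")
```

The choice between the two is therefore a matter of convenience for two groups; ANOVA becomes necessary only when three or more means are compared.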

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential statistics is based on a strange and mystical concept called falsification.

You might think the process is simple: write a hypothesis, test it, and hope to prove it.

Inferential statistics actually works this way:

1. Write a hypothesis you believe to be true
2. Write the OPPOSITE of this hypothesis, which is called the null hypothesis
3. Test the null, hoping to reject it
   - If the null is rejected, you have evidence that the hypothesis you believe to be true may be true
   - If you fail to reject the null, reach no conclusion

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:

- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome The p- value is directly related to the null hypothesis

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

The p-value determines whether or not we reject the null hypothesis We use it to estimate whether or not we think the null hypothesis is true

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

If the p-value (lt α- value) is small reject the null hypothesis and accept that the samples are truly different with regard to the outcome

If the p-value (gt α- value) is large fail to reject the null hypothesis and conclude that the treatment or the predictor variable had no effect on the outcome

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsSteps for testing hypotheses Calculate descriptive statistics Calculate an inferential statistic Find its probability (p-value) Based on p-value accept or reject the null hypothesis (H0) Draw conclusion

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Variable N Mean StDev SE Mean 90 CI

GPAs 50 29580 04101 00580 (28608 30552)

Variable N Mean StDev SE Mean 95 CI

GPAs 50 29580 04101 00580 (28414 30746)

Sample mean = 296

- 90 confident that μ (population mean) is between 286 and 306

- 95 confident that μ (population mean) is between 284 and 307

95 CI width gt 90 CI width

The larger confidence coefficient the greater the CI width

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs

Test of μ = 3 vs lt 3 (directional one-tail to the right) 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

Test of μ = 3 vs ne 3 (non-directional two-tail)

Variable N Mean StDev SE Mean 95 CI T PGPAs 50 29580 04101 00580 (28414 30746) -072 0472

Test of μ = 3 vs gt 3 (directional one-tail to the left)

95 LowerVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 28608 -072 0764

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Test of μ = 3 vs lt 3 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution, so no direction for the expected result is specified. The null and alternative hypotheses are as follows:

H0: μ = μ0    HA: μ ≠ μ0

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

\[ t_{24,\,0.05} = 2.064 \]

\[ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17 \]

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
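The table method used in the cholesterol example can be sketched in a few lines of Python. The function name `one_sample_t_stat` is illustrative, and the critical value 2.064 is copied from the slide rather than computed.

```python
import math

def one_sample_t_stat(xbar, s, n, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Two-tailed 5% critical value for t(24), taken from the slide's t table.
T_CRIT_24 = 2.064

t_chol = one_sample_t_stat(xbar=220, s=38.6, n=25, mu0=211)
reject_h0 = abs(t_chol) >= T_CRIT_24

print(f"t = {t_chol:.2f}, reject H0: {reject_h0}")
```

Because the test is two-tailed, the comparison uses the absolute value of t, so a sample mean equally far below 211 would lead to the same decision.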

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution, so no direction for the expected result is specified. The null and alternative hypotheses are as follows:

H0: μ1 = μ2    HA: μ1 ≠ μ2

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

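We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

\[ t_{20,\,0.05} = 2.086 \]

\[ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

\[ |t| > t_{20,\,0.05} \;\Rightarrow\; \text{reject } H_0, \text{ accept } H_A \]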
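The pooled-variance calculation for the healthy versus cystic fibrosis comparison can be reproduced with the short Python sketch below. The function name `pooled_t` is illustrative, and the critical value 2.086 for t(20) is copied from the slide rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t-test assuming a common variance (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Summary statistics from the slide: healthy vs cystic fibrosis.
t_pooled, df_pooled, s2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT_20 = 2.086  # two-tailed 5% critical value for t(20), from the slide
reject_h0 = abs(t_pooled) >= T_CRIT_20

print(f"s^2 = {s2:.2f}, SE = {se:.2f}, t = {t_pooled:.2f}, df = {df_pooled}")
print(f"reject H0: {reject_h0}")
```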

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100,000 to 1,000,000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44   p-Value = 0.660   DF = 44

The estimated difference in mean sale prices is about $26,442, with East lower than West. We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 or 0.10, so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
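The reported DF = 44 is well below n1 + n2 - 2 = 58, which suggests the output comes from the unpooled (Welch) form of the two-sample t-test; that reading is an inference from the numbers, not something the slide states. Under that assumption, the T-value and DF can be approximately reproduced from the summary statistics. The function name `welch_t` is illustrative.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t-test without assuming equal variances.
    Degrees of freedom use the Welch-Satterthwaite approximation."""
    v1 = sd1**2 / n1                   # squared SE of mean 1
    v2 = sd2**2 / n2                   # squared SE of mean 2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

# East vs West sale prices, summary statistics from the output.
t_w, df_w, se_w = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t_w:.2f}, df = {df_w:.2f} (truncated: {math.floor(df_w)}), SE = {se_w:.0f}")
```

Truncating the approximate degrees of freedom to a whole number gives the DF shown in the output.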

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs after treatment; very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

\[ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) \]

\[ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 \]

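Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

\[ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 \]

\[ s_d^2 = \frac{\sum_j d_j^2 - (\sum_j d_j)^2/n}{n - 1} = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31 \]

\[ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 \]

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level.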
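The paired calculation for the tobacco-leaf data can be checked with the Python sketch below, which uses only the standard library. The variable names are illustrative, and the critical value 2.365 for t(7) is copied from the slide rather than computed.

```python
import math
from statistics import mean, stdev

# Counts per plant for the two viruses, from the table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences d_j = X1j - X2j.
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)

d_bar = mean(diffs)                # mean difference
s_d = stdev(diffs)                 # sample SD of the differences (n - 1 divisor)
se_d = s_d / math.sqrt(n)          # standard error of the mean difference
t_paired = d_bar / se_d            # compared against t(n - 1)

T_CRIT_7 = 2.365  # two-tailed 5% critical value for t(7), from the slide
reject_h0 = abs(t_paired) >= T_CRIT_7

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t_paired:.2f}")
print(f"reject H0: {reject_h0}")
```

The paired test reduces to a one-sample t-test on the differences, which is why only one standard deviation appears in the calculation.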

Paired t-test

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

Typically \( s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 \) when the paired measurements are positively correlated.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutor), the defendant (accused) and the judge. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                               FALSE
Reject the null hypothesis            Type I error (α):                  CORRECT
                                      rejecting a true null hypothesis
Fail to reject the null hypothesis    CORRECT                            Type II error (β):
                                                                         failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that the person is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the person is guilty and we conclude guilty, again no error has been made. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the two types of error unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and of a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error were 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
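The trade-off between the two error probabilities and the sample size can be illustrated numerically. The sketch below is a simplification and not a method from the slides: it assumes a one-sided z-test with known σ (H0: μ = μ0 against HA: μ > μ0) and a true mean that exceeds μ0 by a chosen `effect`. The function name `type_ii_error` and the values 0.5 and 1.0 are illustrative.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sigma=1.0):
    """Probability of failing to reject H0 in a one-sided z-test
    when the true mean is mu0 + effect (sigma assumed known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)              # rejection cutoff on the z scale
    shift = effect * math.sqrt(n) / sigma      # how far the true mean moves z
    return z.cdf(z_crit - shift)

# Tabulate the Type II error probability for a few alpha levels and sample sizes.
for alpha in (0.10, 0.05, 0.01):
    row = ", ".join(f"n={n}: {type_ii_error(alpha, n, effect=0.5):.3f}"
                    for n in (10, 25, 50))
    print(f"alpha = {alpha:.2f} -> beta  {row}")
```

Reading across a row shows β falling as n grows with α fixed; reading down a column shows β rising as α is made smaller with n fixed.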

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance alone produced the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality
Decision        H0 is true                          H0 is false
Retain H0       Correct decision                    Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)       Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
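A minimal sketch of such a sample-size calculation is shown below. It is not taken from the slides: it assumes two equal-sized independent groups, a two-sided test, a common known standard deviation, and the normal approximation n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group. The function name `n_per_group` and the example effect size are illustrative. An exact calculation based on the t distribution would give a slightly larger n.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group to detect a mean difference delta
    with a two-sided test at level alpha and power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                  # round up to a whole subject

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.20.
n_needed = n_per_group(delta=0.5, sigma=1.0)
print(f"about {n_needed} subjects per group")
```

Lowering β (asking for more power) or shrinking the expected effect both increase the required sample size.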

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis Testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in a Test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

The number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive statistics summarize data with measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric are two types of statistics, each requiring analysis of assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

The t-test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 15: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential Statistics

Write a hypothesis you believe to be true.

Write the OPPOSITE of this hypothesis, which is called the null hypothesis.

Test the null, hoping to reject it.
- If the null is rejected, you have evidence that the hypothesis you believe to be true may be true.
- If you fail to reject the null, reach no conclusion.

Two major sources of error in research

Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn.

These are:
- Sampling error: chance, random error
- Sample bias: constant error, due to inadequate design

Inferential statistics take sampling error into account. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or predictor variable had an effect on the outcome.

Steps for testing hypotheses

1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
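The two GPA intervals can be approximately reproduced from the summary statistics with the sketch below. The critical values 1.6766 and 2.0096 for t(49) are standard table values supplied here as an assumption; they do not appear on the slide. The names `T_CRIT_49` and `mean_ci` are illustrative.

```python
import math

# Two-sided critical values of t with 49 degrees of freedom
# (assumed table values; valid only for n = 50).
T_CRIT_49 = {90: 1.6766, 95: 2.0096}

def mean_ci(mean, sd, n, level):
    """Confidence interval for a mean: mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    margin = T_CRIT_49[level] * se
    return mean - margin, mean + margin

lo90, hi90 = mean_ci(2.9580, 0.4101, 50, 90)
lo95, hi95 = mean_ci(2.9580, 0.4101, 50, 95)

print(f"90% CI: ({lo90:.4f}, {hi90:.4f}), width {hi90 - lo90:.4f}")
print(f"95% CI: ({lo95:.4f}, {hi95:.4f}), width {hi95 - lo95:.4f}")
```

Only the critical value changes between the two intervals, so the wider 95% interval comes entirely from the larger multiplier on the same standard error.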


Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs

Test of μ = 3 vs lt 3 (directional one-tail to the right) 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

Test of μ = 3 vs ne 3 (non-directional two-tail)

Variable N Mean StDev SE Mean 95 CI T PGPAs 50 29580 04101 00580 (28414 30746) -072 0472

Test of μ = 3 vs gt 3 (directional one-tail to the left)

95 LowerVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 28608 -072 0764

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Test of μ = 3 vs lt 3 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Two sample T-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential Statistics
Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are stated without a direction (for example H0: μ1 = μ2, HA: μ1 ≠ μ2)

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins

Two-sample t-test

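We test the hypothesis of equal means for the two populations, assuming a common variance

H0: μ1 = μ2    HA: μ1 ≠ μ2

           Healthy   Cystic Fibrosis
N          9         13
Mean       18.9      11.9
Std Dev    5.9       6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086$$

$$|t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A$$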
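The pooled calculation above can be checked with a few lines of Python. This is only a sketch: the function name `pooled_t` is made up for illustration, the inputs are the summary statistics from the table, and the critical value 2.086 is the table value quoted on the slide rather than something the code computes.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled (equal-variance) two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, pooled_var, se


# Healthy vs. Cystic Fibrosis summary statistics from the slide
t, df, pooled_var, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRIT = 2.086  # table value t(0.05, 20) quoted on the slide

print(f"s^2 = {pooled_var:.1f}, SE = {se:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Working from summary statistics keeps the sketch aligned with the hand calculation; with raw data one would first compute the means and standard deviations.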

Two-sample t-test
MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ...... (SalePrice)

Subscripts: Select ...... (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis Sale Price, 100000 to 1000000]

Two-sample t-test
Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26441
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices between the east and west markets
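The reported DF = 44 (rather than n1 + n2 - 2 = 58) is consistent with the unpooled, unequal-variance form of the two-sample t-test (often called Welch's test). The sketch below recomputes the T-value and the degrees of freedom from the summary table under that assumption; the function name `welch_t` is illustrative and is not part of MINITAB.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


# East vs. West summary statistics from the output above
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"difference = {312702 - 339143}")
print(f"t = {t:.2f}, df (truncated) = {math.floor(df)}")
```

Truncating the fractional degrees of freedom to a whole number reproduces the DF shown in the output.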

Paired t-test
A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots)

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

Paired t-test
Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5

Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level
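The paired calculation can be reproduced directly from the eight pairs of counts in the table. A minimal sketch, assuming only the data and the table critical value 2.365 quoted above; the function name `paired_t` is illustrative.

```python
import math
import statistics

# Counts per plant, taken from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]


def paired_t(x1, x2):
    """Paired t statistic computed from the per-individual differences."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)   # sample standard deviation (n - 1 denominator)
    se = s_d / math.sqrt(n)     # standard error of the mean difference
    return d_bar / se, n - 1, d_bar, s_d, se


t, df, d_bar, s_d, se = paired_t(virus1, virus2)
T_CRIT = 2.365  # table value t(0.05, 7) quoted on the slide

print(f"mean d = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```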

Paired t-test
Advantages
1) It is usually easier to detect a true small difference, because the standard error of the mean difference is typically smaller than the standard error of the difference between two independent means: $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$
2) There is no need to consider whether the variances of the two populations are the same or not

Hypothesis Testing
Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
1. Specify the 5 required elements of a hypothesis test listed above

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic

3. Use one of three possible decision rules (don't mix them up)
   1. Table method: Compare the calculated value with a table of the critical values of the test statistic
   - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA)
   - If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0)

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up)
   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic
   - Reject the null hypothesis if the test statistic falls in the critical region
   - Fail to reject the null hypothesis if the test statistic does not fall in the critical region

Hypothesis Testing

Rejection region at the α = 10% significance level

Rejection region at the α = 5% significance level

Rejection region at the α = 10% significance level

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up)
   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant

Hypothesis Testing
An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutor), the defendant (the accused) and the judge. Before hearing a case, the judge forms the hypotheses below

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis

To perform a hypothesis test we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as

H0: Defendant is innocent ← Null Hypothesis
HA: Defendant is guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                                 FALSE
Reject the null hypothesis           Type I error (α):                    CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                              Type II error (β):
                                                                          failing to reject a false null hypothesis

Type I and Type II Errors
Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error"

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000

Controlling Type I and Type II Errors
α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other

- For any fixed α, an increase in the sample size n will cause a decrease in β
- For any fixed sample size n, a decrease in α will cause an increase in β
- Conversely, an increase in α will cause a decrease in β
- To decrease both α and β, increase the sample size n
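The trade-off between α, β and n can be made concrete with a small calculation. The sketch below assumes a simplified setting that is not taken from the slides: a one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with known standard deviation `sigma` and a true shift `effect`. The function name `beta` and the numbers used are illustrative only.

```python
from statistics import NormalDist


def beta(alpha, n, effect, sigma=1.0):
    """Type II error probability for a one-sided z-test (assumed setting, see lead-in)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # rejection cut-off on the z scale
    # Under the alternative the z statistic is shifted by effect * sqrt(n) / sigma
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)


# Fixed alpha: increasing n decreases beta
print(f"alpha=0.05, n=25: beta = {beta(0.05, 25, 0.5):.3f}")
print(f"alpha=0.05, n=50: beta = {beta(0.05, 50, 0.5):.3f}")

# Fixed n: decreasing alpha increases beta, increasing alpha decreases beta
print(f"alpha=0.01, n=25: beta = {beta(0.01, 25, 0.5):.3f}")
print(f"alpha=0.10, n=25: beta = {beta(0.10, 25, 0.5):.3f}")
```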

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X

Inferential statistics
- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test

P-value
The probability of the observed outcome (or a more extreme one), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                       H0 is false
Retain H0     Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing
- Power analysis is used to try to minimize Type II errors
- Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero
- Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common)
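As a rough illustration of such a power analysis, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2 ((z_{1-α/2} + z_{1-β}) σ / δ)². The formula choice, the function name `sample_size_per_group` and the example effect sizes are assumptions for illustration and do not come from the slides; exact t-based calculations give slightly larger answers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sigma=1.0, alpha=0.05, beta=0.2):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance cut-off
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2
print(sample_size_per_group(0.5))
# A larger expected effect needs fewer subjects
print(sample_size_per_group(1.0))
```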

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size

If you want an estimate of the actual effect, you need confidence intervals

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest

The most common confidence interval is 95%; it gives an upper and a lower bound on what the effect is likely to be

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right test
Typically one is interested in comparing group means

If the outcome is continuous and there is one independent variable
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed

Otherwise two-tailed (most cases)

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter

df = Sample size - Number of parameters estimated

df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability

Inferential stats determine how likely it is that results based on a sample are the same in the population

Must know the level of measurement of the variables to choose the correct statistic

Parametric and non-parametric: two types of statistics requiring analysis of assumptions

- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables

T test determines if there is a significant difference between 2 group means

ANOVA determines if there is a significant difference between 3 or more means

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores they use either ANOVA or a t-test for independent samples (hand calculation)

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable)

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores they use either ANOVA with replication or a t-test for paired samples (hand calculation)


Inferential Statistics
Two major sources of error in research
Inferential statistics are used to make generalizations from a sample to a population. There are two sources of error (described in the Sampling module) that may result in a sample being different from (not representative of) the population from which it is drawn

These are
- Sampling error: chance, random error
- Sample bias: constant error due to inadequate design

Inferential statistics take into account sampling error. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance)

p-value
The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome)

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis (that they do not differ)

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are compatible with the null hypothesis

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome

If the p-value is large (> α), fail to reject the null hypothesis: there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome

Steps for testing hypotheses
- Calculate descriptive statistics
- Calculate an inferential statistic
- Find its probability (p-value)
- Based on the p-value, reject or fail to reject the null hypothesis (H0)
- Draw conclusion

Inferential Statistics
One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width
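The two intervals can be recomputed from the summary statistics as mean ± t × SE. In the sketch below the critical values 1.6766 and 2.0096 are assumed table values for the t distribution with 49 degrees of freedom (they are not quoted in the output above), and the function name `mean_ci` is illustrative.

```python
import math


def mean_ci(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean, given a t critical value."""
    se = sd / math.sqrt(n)  # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin


T_90 = 1.6766  # assumed table value: upper 5% point of t(49)
T_95 = 2.0096  # assumed table value: upper 2.5% point of t(49)

ci90 = mean_ci(2.9580, 0.4101, 50, T_90)
ci95 = mean_ci(2.9580, 0.4101, 50, T_95)

print(f"90% CI: ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
print("95% interval is wider:", (ci95[1] - ci95[0]) > (ci90[1] - ci90[0]))
```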

Inferential Statistics
One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552             -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608             -0.72   0.764

For the test of μ = 3 vs < 3 (T = -0.72, P = 0.236):

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0.236 > α = 0.05

Then we fail to reject H0

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00
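The T and P columns above can be approximated without statistical software. The sketch below integrates the t density numerically (Simpson's rule) to get the left-tail, two-tail and right-tail p-values for the GPA data. It is an illustration only: the helper names `t_pdf` and `t_cdf` are made up, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) using Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))

p_left = t_cdf(t_stat, n - 1)    # HA: mu < 3
p_right = 1 - p_left             # HA: mu > 3
p_two = 2 * min(p_left, p_right) # HA: mu != 3

print(f"t = {t_stat:.2f}")
print(f"p (less than) = {p_left:.3f}, p (not equal) = {p_two:.3f}, p (greater than) = {p_right:.3f}")
```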

Inferential Statistics
Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins

One-sample t-test
Cholesterol Example: Suppose the population mean μ is 211 mg/ml

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$$

$$t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0

The difference is not statistically significant
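The cholesterol calculation can be verified with a short script. A minimal sketch using only the summary figures given in the example; the function name `one_sample_t` is illustrative and 2.064 is the table value quoted above.

```python
import math


def one_sample_t(sample_mean, mu0, sd, n):
    """One-sample t statistic and its degrees of freedom from summary statistics."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    return (sample_mean - mu0) / se, n - 1


t, df = one_sample_t(220, 211, 38.6, 25)
T_CRIT = 2.064  # table value t(0.05, 24) quoted in the example

print(f"t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```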

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-test
Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval -$126602 < μ (East) - μ (West) < $73719.

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
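The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests that Minitab did not pool the variances here. The sketch below assumes the unequal-variance (Welch) form of the test and recomputes the statistic from the summary table; `welch_t` is a hypothetical helper name, not a Minitab or library function.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic and Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se_diff


# Summary statistics from the Minitab output (East vs. West sale prices)
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t_stat:.2f}, df = {math.floor(df)}")
```

Truncating the Satterthwaite degrees of freedom to an integer matches the DF that the output reports.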

Paired t-test

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

Paired t-test

Example: test infection of virus on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so H0 is rejected at the 5% level.

Paired t-test

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

When the paired measurements are positively correlated, typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$, which is why a small true difference is easier to detect.
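The tobacco-leaf example can be reproduced directly from the eight pairs of counts. The following is a minimal sketch with the standard library; the variable names are illustrative, and the critical value 2.365 is taken from the slide rather than computed.

```python
import math
import statistics

# Number of dead spots per plant under each virus (values from the slide table)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Step 1: per-plant differences d_j = X_1j - X_2j
diffs = [a - b for a, b in zip(virus1, virus2)]

# Step 2: mean difference, its standard deviation (n - 1 divisor) and standard error
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
se_d_bar = s_d / math.sqrt(n)

# Step 3: t statistic on n - 1 degrees of freedom
t_stat = d_bar / se_d_bar

T_CRIT = 2.365  # two-sided 5% critical value for df = 7, taken from the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d_bar:.2f}, t = {t_stat:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```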

Hypothesis Testing
Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing
3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null if the test statistic does not fall in the critical region.

      [Figure: Rejection region at the α = 10% significance level]
      [Figure: Rejection region at the α = 5% significance level]
      [Figure: Rejection region at the α = 10% significance level]

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
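The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only: both function names are hypothetical, and the inputs are the values quoted in the two-sample examples (t = 2.63 against a critical value of 2.086, and p = 0.660 against α = 0.05).

```python
def table_method(t_stat: float, t_crit: float) -> str:
    """Table method: compare |t| with the two-sided critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"


def p_value_method(p_value: float, alpha: float = 0.05) -> str:
    """p-value method: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Healthy vs. cystic fibrosis example: t = 2.63, critical value 2.086
pooled_decision = table_method(2.63, 2.086)

# Real estate example: p = 0.660 at alpha = 0.05
real_estate_decision = p_value_method(0.660, alpha=0.05)

print(pooled_decision)
print(real_estate_decision)
```

Applied to the same test and the same α, the two rules give the same decision; the warning not to mix them up concerns using a critical value from one test with a p-value from another.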

Hypothesis Testing
An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutor), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide which hypothesis holds.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

                                     NULL HYPOTHESIS
Decision                             TRUE                               FALSE
Reject the null hypothesis           Type I error (α):                  CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                            Type II error (β):
                                                                        failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. That risk is too high, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000

Controlling Type I and Type II Errors
α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
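The trade-off between α, β and n can be made concrete with a small calculation. The sketch below assumes a one-sided (upper-tail) z-test with known standard deviation, which is a simplification chosen for illustration and not something the slides specify; the function name `type_ii_error` and the default effect size of 0.5 standard deviations are likewise hypothetical.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float = 0.5, sigma: float = 1.0) -> float:
    """β for a one-sided, upper-tail z-test when the true mean is shifted by `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # rejection threshold under H0
    # Under the alternative, the test statistic is centred at effect * sqrt(n) / sigma
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)


beta_small_n = type_ii_error(alpha=0.05, n=16)
beta_large_n = type_ii_error(alpha=0.05, n=64)   # same α, larger n
beta_strict = type_ii_error(alpha=0.01, n=16)    # same n, smaller α

print(f"alpha=0.05, n=16: beta = {beta_small_n:.3f}")
print(f"alpha=0.05, n=64: beta = {beta_large_n:.3f}")
print(f"alpha=0.01, n=16: beta = {beta_strict:.3f}")
```

Under these assumptions, raising n at fixed α lowers β, while tightening α at fixed n raises β, which is the pattern stated above.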

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value
The probability of an outcome at least as extreme as the observed one, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome or one more extreme?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision \ Reality   H0 is true                      H0 is false
Retain H0            Correct decision                Type II error (β) (0.2, 0.1)
Reject H0            Type I error (α) (0.05, 0.01)   Correct decision

Hypothesis testing
Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
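A sample-size calculation of this kind can be sketched with the normal approximation. The example assumes a one-sample, two-sided z-test with known standard deviation; that choice, the function name `required_n`, and the illustrative effect of 0.5 standard deviations are assumptions made here, not values from the slides.

```python
import math
from statistics import NormalDist


def required_n(effect: float, sigma: float, alpha: float = 0.05, beta: float = 0.2) -> int:
    """Smallest n giving power 1 - beta to detect `effect` (two-sided z-test, known sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the target power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


n_power_80 = required_n(effect=0.5, sigma=1.0)            # beta = 0.2
n_power_90 = required_n(effect=0.5, sigma=1.0, beta=0.1)  # beta = 0.1

print(f"n for 80% power: {n_power_80}")
print(f"n for 90% power: {n_power_90}")
```

Asking for a smaller β (higher power) at the same α and effect size always pushes the required sample size up.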

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95% and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test
Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
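The selection rules for a continuous outcome can be written as a small lookup function. This is only an illustration of the rules as listed; the function `choose_test` and its parameter names are hypothetical.

```python
def choose_test(n_groups: int, n_independent_vars: int = 1,
                has_continuous_predictor: bool = False) -> str:
    """Suggest a test for comparing group means on a continuous outcome."""
    if n_groups < 2 or n_independent_vars < 1:
        raise ValueError("need at least two groups and one independent variable")
    if n_independent_vars == 1:
        # One independent variable: the number of groups decides
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if has_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"


print(choose_test(2))
print(choose_test(4))
print(choose_test(3, n_independent_vars=2, has_continuous_predictor=True))
```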

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ2): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ2 is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics
Inferential statistics take into account sampling error. These statistics do not correct for sample bias; that is a research design issue. Inferential statistics only address random error (chance).

Sampling error: chance, random error
Sample bias: constant error due to inadequate design

p-value
The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are consistent with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion

Inferential Statistics
One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
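Both intervals follow from the same summary statistics and differ only in the t critical value. In the sketch below the critical values for 49 degrees of freedom (1.6766 and 2.0096) are assumptions taken from a standard t table, not from the Minitab output, and `t_interval` is a hypothetical helper name.

```python
import math


def t_interval(mean: float, sd: float, n: int, t_crit: float):
    """Confidence interval mean ± t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# GPA summary statistics from the output: n = 50, mean = 2.9580, StDev = 0.4101
T_90 = 1.6766  # assumed t(0.95; 49) from a t table
T_95 = 2.0096  # assumed t(0.975; 49) from a t table

ci_90 = t_interval(2.9580, 0.4101, 50, T_90)
ci_95 = t_interval(2.9580, 0.4101, 50, T_95)

width_90 = ci_90[1] - ci_90[0]
width_95 = ci_95[1] - ci_95[0]

print(f"90% CI: ({ci_90[0]:.2f}, {ci_90[1]:.2f}), width {width_90:.3f}")
print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f}), width {width_95:.3f}")
```

Only the critical value changes between the two calls, which is why the higher confidence level produces the wider interval.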


Inferential Statistics
One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

For the test of μ = 3 vs < 3:

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

One-sample t-test

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
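The cholesterol calculation takes only a few lines to verify. This is a minimal sketch: `one_sample_t` is a hypothetical helper, and the critical value 2.064 is copied from the slide rather than computed.

```python
import math


def one_sample_t(sample_mean: float, mu0: float, sd: float, n: int) -> float:
    """One-sample t statistic: (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))


# Cholesterol example: sample mean 220, s = 38.6, n = 25, hypothesized mean 211
t_stat = one_sample_t(220, 211, 38.6, 25)

T_CRIT = 2.064  # two-sided 5% critical value for df = 24, taken from the slide
decision = "reject H0" if abs(t_stat) >= T_CRIT else "fail to reject H0"

print(f"t = {t_stat:.2f}: {decision}")
```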

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population


Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

[Figure: rejection region at the α = 10% significance level]
[Figure: rejection region at the α = 5% significance level]
[Figure: rejection region at the α = 10% significance level]

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
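The table method and the p-value method can be sketched as two small functions. This is only an illustration: the function names and the numbers passed to them are hypothetical, and the table rule is written for a two-sided test.

```python
def table_method(test_statistic, critical_value):
    """Two-sided rule: compare |statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def p_value_method(p_value, alpha=0.05):
    """Compare the p-value with the chosen significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical numbers, chosen only to exercise both rules.
print(table_method(test_statistic=-2.31, critical_value=2.064))
print(p_value_method(p_value=0.03))
print(p_value_method(p_value=0.236))
```

Whichever rule is used, it should be applied on its own; both rules give the same decision when they are based on the same test and the same α.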


An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the hypotheses below.
Hypotheses
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors
Alpha and Beta

                                   | NULL HYPOTHESIS TRUE                               | NULL HYPOTHESIS FALSE
Reject the null hypothesis         | Type I error (α): rejecting a true null hypothesis | CORRECT
Fail to reject the null hypothesis | CORRECT                                            | Type II error (β): failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage | Chances of sending an innocent man to jail
20%        | 1 in 5
5%         | 1 in 20
1%         | 1 in 100
0.01%      | 1 in 10,000
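The "1 in N" reading of α can be checked with a small simulation. The sketch below is not from the slides: it assumes a two-sided z-test with known σ = 1, draws every sample from a population where H0 is true, and counts how often H0 is rejected anyway. The function name, sample size, number of trials and seed are all illustrative choices.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha, n=20, trials=5000, seed=1):
    """Fraction of true-null experiments that a two-sided z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: data come from N(0, 1) and we test mu = 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # (xbar - 0) / (sigma / sqrt(n)), sigma = 1
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials


for alpha in (0.20, 0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.3f} of trials "
          f"(about 1 in {1 / alpha:.0f} expected)")
```

The simulated rejection rate should land close to α, which is what "chances of getting it wrong" means for a test run at that level.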

Controlling Type I and Type II Errors
α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
For any fixed α, an increase in the sample size n will cause a decrease in β.
For any fixed sample size n, a decrease in α will cause an increase in β.
Conversely, an increase in α will cause a decrease in β.
To decrease both α and β, increase the sample size n.
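The interplay of α, β and n can be made concrete with a short calculation. The sketch assumes a one-sided z-test of H0: μ = 0 with known σ and a specific alternative μ = `effect`; these settings, the function name and the example values are assumptions for illustration, not part of the slides.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability for a one-sided z-test of H0: mu = 0
    against the specific alternative mu = effect (known sigma)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # reject H0 when z > z_crit
    shift = effect * n ** 0.5 / sigma       # mean of z under the alternative
    return std_normal.cdf(z_crit - shift)   # P(fail to reject | alternative)


for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25), (0.10, 25)]:
    print(f"alpha = {alpha:.2f}, n = {n}: beta = {beta_one_sided_z(alpha, n):.3f}")
```

Comparing the printed rows shows the three statements above: a larger n lowers β at fixed α, a smaller α raises β at fixed n, and a larger α lowers it.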

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
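The three probability sampling strategies can be sketched with Python's standard `random` module. The population of 100 subject IDs, the two strata and the ten clinics are hypothetical, as are the function names.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]


rng = random.Random(2023)  # fixed seed so the illustration is repeatable
subjects = list(range(1, 101))  # hypothetical subject IDs
strata = {"female": subjects[:60], "male": subjects[60:]}
clusters = {f"clinic_{i}": subjects[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random_sample(subjects, 5, rng))
print(stratified_sample(strata, 3, rng))
print(cluster_sample(clusters, 2, rng))
```

In each case the selection probability of a subject is known in advance, which is what makes the result a probability sample.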

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.
All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
In this case, you are trying to disprove that treatment X has no effect on outcome Y.
You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y
Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value
The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?
This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.
How small is small?

Decision \ Reality | H0 is true                     | H0 is false
Retain H0          | Correct Decision               | Type II Error (β) (0.2, 0.1)
Reject H0          | Type I Error (α) (0.05, 0.01)  | Correct Decision

Power analysis is used to try to minimize Type II errors.
Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
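A power calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The formula choice, the function name and the effect and σ values below are assumptions for illustration; a real study would use the method appropriate to its design.

```python
import math
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate subjects per group for a two-sided, two-group comparison
    of means (normal approximation, equal group sizes, common sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = z.inv_cdf(1 - beta)        # guards against Type II error
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)


# Hypothetical expected effect of 5 units with a standard deviation of 10.
print(n_per_group(effect=5.0, sigma=10.0))             # beta = 0.20
print(n_per_group(effect=5.0, sigma=10.0, beta=0.10))  # stricter beta
```

Lowering β (asking for more power) or expecting a smaller effect both push the required sample size up.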

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.
The size of the interval depends on the sample size, the variability of the measure and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test
Typically one is interested in comparing group means.
If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
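The selection rules above for a continuous outcome amount to a small decision function. The function and parameter names are hypothetical; the returned strings simply repeat the test names from the list.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Name the test suggested above for a continuous outcome."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```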

Logic of Hypothesis testing
The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test
If the alternative hypothesis specifies the direction of the test, then one-tailed.
Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom
Number of components that are free to vary about a parameter.
df = sample size - number of parameters estimated
df is n - 1 for a one-sample test of the mean.

Common Inferential Stats
T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test
ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement
Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary
- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric statistics are two types of statistics requiring analysis of assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 18: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS


p-value
The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are consistent with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96
- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width
The larger the confidence coefficient, the greater the CI width.
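The two intervals can be reproduced from the summary statistics with the formula mean ± t × s/√n. In this sketch the critical values 1.6766 and 2.0096 are t-table values for df = 49 that are assumed here, since they do not appear in the output; the function name is illustrative.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval: mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary statistics from the GPA output.
mean, sd, n = 2.9580, 0.4101, 50
# Assumed t-table values for df = n - 1 = 49 (not part of the original output).
t_90, t_95 = 1.6766, 2.0096

for label, t_crit in (("90%", t_90), ("95%", t_95)):
    low, high = t_interval(mean, sd, n, t_crit)
    print(f"{label} CI: ({low:.4f}, {high:.4f}), width = {high - low:.4f}")
```

Because the 95% critical value is larger than the 90% one, the 95% interval is wider for the same data.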


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed, left tail)
Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)
Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed, right tail)
Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3
Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 is from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
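The test statistic and the link between the three reported p-values can be checked by hand. The sketch recomputes t from the summary statistics and takes the left-tail p-value 0.236 from the output as given, rather than computing it from the t distribution.

```python
import math

mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0  # values from the GPA output

se = sd / math.sqrt(n)  # standard error of the mean
t = (mean - mu0) / se   # distance from mu0 in standard errors
print(f"SE = {se:.4f}, t = {t:.2f}")

# Left-tail p-value as reported for the alternative mu < 3.
p_left = 0.236
p_right = 1 - p_left              # alternative mu > 3
p_two = 2 * min(p_left, p_right)  # alternative mu != 3
print(f"p (< 3) = {p_left:.3f}, p (> 3) = {p_right:.3f}, p (two-tail) = {p_two:.3f}")

alpha = 0.05
print("reject H0" if p_left < alpha else "fail to reject H0")
```

The one-tailed values add to 1, and the two-tailed value is twice the smaller tail, which matches the three tests in the output.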

One-sample t-test
Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus, no direction for the expected results is specified. The null and alternative hypotheses are as follows.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

For an α = 0.05 test, we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
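The cholesterol calculation can be written as a short function. The critical value 2.064 is the table value quoted in the example; the function name is illustrative.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """t statistic for H0: mu = mu0, with df = n - 1."""
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1


t, df = one_sample_t(xbar=220, mu0=211, s=38.6, n=25)
t_crit = 2.064  # two-sided critical value at alpha = 0.05 for t(24), from the example

print(f"t = {t:.2f}, df = {df}")
if abs(t) >= t_crit:
    print("reject H0")
else:
    print("fail to reject H0")
```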

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.

Two-sample t-test

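We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2, HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2), \qquad t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$|t| > t_{0.05,\,20}$: reject H0, accept HA.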
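The pooled-variance calculation can be reproduced from the summary table. The function name is illustrative, and the critical value 2.086 is the t(20) table value used in the example.

```python
import math


def pooled_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    s2_pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se_diff = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return (x1 - x2) / se_diff, df, s2_pooled, se_diff


# Healthy group versus cystic fibrosis group, from the summary table.
t, df, s2_pooled, se_diff = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
t_crit = 2.086  # t(20) critical value at alpha = 0.05 (two-sided), from the example

print(f"pooled variance = {s2_pooled:.2f}, SE = {se_diff:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```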

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs, check Boxplots of data, OK
Click Options
Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441.
90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10
Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
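The reported T-value and DF can be approximately reproduced from the summary statistics. The sketch assumes the unpooled (Welch) form of the test with Welch-Satterthwaite degrees of freedom truncated to an integer. That assumption is an inference from the reported DF = 44, not something the output states.

```python
import math


def welch_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic without assuming equal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # squared standard errors of each mean
    se_diff = math.sqrt(v1 + v2)
    t = (x1 - x2) / se_diff
    # Welch-Satterthwaite approximation, truncated to an integer.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, math.floor(df), se_diff


# East versus West sale prices, from the result table.
t, df, se_diff = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(f"SE of difference = {se_diff:.0f}, t = {t:.2f}, df = {df}")
```

With a pooled test the degrees of freedom would instead be 34 + 26 - 2 = 58, so the DF value is a quick way to see which form a piece of software has used.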

Paired t-test
A test to deal with two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method
Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$H_0: \mu_d = 0, \quad H_A: \mu_d \neq 0$

Example: test the infection of two viruses on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$H_0: \mu_d = 0, \quad H_A: \mu_d \neq 0$

$$\bar{d} = 4$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so H0 is rejected at the 0.05 level.
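The paired calculation for the tobacco-leaf data can be reproduced directly from the two columns. The function name is illustrative, and 2.365 is the t(7) table value quoted in the example.

```python
import math


def paired_t(sample_1, sample_2):
    """Paired t statistic computed from the within-pair differences."""
    d = [a - b for a, b in zip(sample_1, sample_2)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    se = s_d / math.sqrt(n)  # standard error of the mean difference
    return d_bar / se, n - 1, d_bar, s_d, se


virus_1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus_2 = [10, 11, 18, 14, 6, 7, 17, 5]

t, df, d_bar, s_d, se = paired_t(virus_1, virus_2)
t_crit = 2.365  # t(7) critical value at alpha = 0.05 (two-sided), from the example

print(f"d_bar = {d_bar:.0f}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```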

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II Errors
Two of the four possible outcomes are correct. If the truth is that the defendant is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the defendant is guilty and we conclude guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when the person is truly guilty but the jury finds them innocent. Many people find the distinction between the two types of error unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error were 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that the defendant is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
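
As a rough illustration of what α means, the sketch below simulates many studies in which the null hypothesis is actually true and counts how often a test rejects it anyway. The function name, the use of a two-sided z-test with known σ = 1, and the simulation settings are assumptions made for this example; they are not part of the original slides.

```python
import random
from statistics import NormalDist


def simulated_type_i_error_rate(alpha=0.05, n=30, trials=10000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis.

    Assumptions: data are N(0, 1), H0: mu = 0 is true, and sigma = 1 is
    known, so a two-sided z-test applies.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean * n ** 0.5  # (mean - 0) / (sigma / sqrt(n)), sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # Type I error: H0 is true but was rejected
    return rejections / trials


if __name__ == "__main__":
    for alpha in (0.20, 0.05, 0.01):
        print(alpha, simulated_type_i_error_rate(alpha))
```

The simulated rejection rate should land close to the chosen α, which is the "chance of getting it wrong" in the table.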

Controlling Type I and Type II Errors
α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.
For any fixed sample size n, a decrease in α will cause an increase in β.
Conversely, an increase in α will cause a decrease in β.
To decrease both α and β, increase the sample size n.
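
The trade-off between α, β and n can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: a one-sided z-test with known σ, and a hypothetical true effect chosen only for illustration. The function name `beta_one_sided_z` is invented for this example.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """Probability of a Type II error for a one-sided z-test.

    Assumptions: H0: mu = mu0 vs HA: mu > mu0, known sigma, and a true
    mean of mu0 + effect (effect > 0).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)      # rejection cutoff on the z scale
    shift = effect * n ** 0.5 / sigma  # how far the true mean shifts z
    return z.cdf(z_crit - shift)       # P(fail to reject | HA is true)


if __name__ == "__main__":
    for n in (10, 25, 50, 100):
        print(n, round(beta_one_sided_z(0.05, n, effect=0.5), 3))
```

Holding α fixed and raising n lowers β; holding n fixed and lowering α raises β.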

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample:
  Simple random sampling
  Stratified sampling
  Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
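
Two of the probability sampling strategies named above can be sketched in a few lines. The helper names, the proportional allocation rule, and the toy "clinic" population are assumptions for illustration only.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Every subset of size k is equally likely to be drawn."""
    return random.Random(seed).sample(population, k)


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        k = round(len(units) * fraction)  # proportional allocation
        sample.extend(rng.sample(units, k))
    return sample


if __name__ == "__main__":
    # Hypothetical population: 80 patients at clinic A, 20 at clinic B.
    patients = [("A", i) for i in range(80)] + [("B", i) for i in range(20)]
    print(simple_random_sample(patients, 10))
    print(stratified_sample(patients, lambda p: p[0], 0.10))
```

Stratifying guarantees that each clinic is represented in proportion to its size, which simple random sampling only achieves on average.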

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
  Chance
  Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
In this case, you are trying to disprove that treatment X has no effect on outcome Y.
You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.
Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing
If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value
The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the claim that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision     Reality: H0 is true               Reality: H0 is false
Retain H0    Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0    Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing
Power analysis is used to try to minimize Type II errors.
Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
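
A sample-size calculation of this kind can be sketched with the usual normal-approximation formula n = ((z_{1-α/2} + z_{1-β}) σ / δ)², where δ is the expected effect. The formula is a standard approximation for a two-sided test of one mean with known σ; the function name and the example numbers are assumptions, not values from the slides.

```python
import math
from statistics import NormalDist


def sample_size_one_mean(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate n for a two-sided z-test of one mean.

    effect: expected difference from the null value (delta)
    sigma:  assumed known standard deviation of the outcome
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error
    z_beta = z.inv_cdf(1 - beta)        # controls the Type II error
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values: effect of half a standard deviation.
    print(sample_size_one_mean(effect=0.5, sigma=1.0))
    print(sample_size_one_mean(effect=0.5, sigma=1.0, beta=0.10))
```

Asking for a smaller β (more power) with the same α and effect always raises the required n.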

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.
The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test
Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
  Two groups: t-test
  Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
  ANOVA if all independent variables are categorical
  ANCOVA or multiple linear regression if some independent variables are continuous
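
The selection rules above amount to a small decision tree. The sketch below encodes only those rules for a continuous outcome; the function name and argument names are hypothetical, and it deliberately ignores other considerations such as assumption checks.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("a single grouping variable needs at least two groups")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"


if __name__ == "__main__":
    print(choose_test(1, n_groups=2))
    print(choose_test(1, n_groups=4))
    print(choose_test(3, any_continuous_predictor=True))
```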

Logic of Hypothesis testing
The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test
If the alternative hypothesis specifies the direction of the test, then one-tailed.
Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom
Number of components that are free to vary about a parameter.
Df = sample size - number of parameters estimated
Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats
T-test: compares means of two groups
  interval/ratio level of measurement
  independent samples t-test
  dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
  interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  nominal/ordinal level of measurement

Summary
Descriptive statistics summarize measures of central tendency and variability.
Inferential statistics determine how likely it is that results based on a sample are the same in the population.
You must know the level of measurement of the variables to choose the correct statistic.
Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.
Pearson r, t-tests and ANOVA are examples of parametric statistics.
Pearson r measures the relationship or association between 2 variables.
The t-test determines if there is a significant difference between 2 group means.
ANOVA determines if there is a significant difference between 3 or more means.
χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA
1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Inferential Statistics

p-value
The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are in fact from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are interested in is that the samples (groups) differ on the outcome. The p-value, however, is calculated under the null hypothesis that they do not differ.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are compatible with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses:
  Calculate descriptive statistics
  Calculate an inferential statistic
  Find its probability (p-value)
  Based on the p-value, reject or fail to reject the null hypothesis (H0)
  Draw a conclusion
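
The idea of "how often we would get the obtained result by chance if the null hypothesis were true" can be shown directly with a permutation test. Note that this is a different technique from the t-tests used in these slides; it is swapped in here only because it makes the definition of a p-value visible. The function name and the toy data are assumptions for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Under H0 the group labels are arbitrary, so we reshuffle them many
    times and count how often the shuffled difference is at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


if __name__ == "__main__":
    treated = [11, 12, 13, 14, 15]   # hypothetical outcome values
    control = [1, 2, 3, 4, 5]
    print(permutation_p_value(treated, control))
```

A small value means that random relabeling rarely produces a difference as large as the observed one, which is exactly what a small p-value claims.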

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96
- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width
The larger the confidence coefficient, the greater the CI width.
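
The two GPA intervals can be reproduced from the summary statistics with the formula mean ± t* × s/√n. In this sketch the critical values for 49 degrees of freedom (1.6766 and 2.0096) are table values supplied as assumptions, since they do not appear in the output above; the function name is also hypothetical.

```python
import math


def t_confidence_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean: mean +/- t_crit * sd/sqrt(n)."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin


# Assumed t-table values for df = n - 1 = 49 (not shown in the output above).
T_CRIT_90_DF49 = 1.6766
T_CRIT_95_DF49 = 2.0096

if __name__ == "__main__":
    for label, t_crit in (("90%", T_CRIT_90_DF49), ("95%", T_CRIT_95_DF49)):
        low, high = t_confidence_interval(2.9580, 0.4101, 50, t_crit)
        print(label, round(low, 4), round(high, 4))
```

Because the 95% critical value is larger than the 90% one, the 95% interval is necessarily wider.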

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)
Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)
Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)
Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

For the test of μ = 3 vs < 3 (T = -0.72, P = 0.236):

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.
The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.
The rejection region consists of all p-values less than α.
p-value = 0.236 > α = 0.05
Then we fail to reject H0.
Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.

One-sample t-test
Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.
The difference is not statistically significant.
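
The one-sample calculation is short enough to check by machine. This sketch recomputes the cholesterol statistic from the summary values given in the example and compares it with the tabled critical value 2.064; the function name is an assumption.

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n)), with df = n - 1."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


if __name__ == "__main__":
    t = one_sample_t(sample_mean=220, mu0=211, s=38.6, n=25)
    t_crit = 2.064  # two-sided critical value for alpha = 0.05, df = 24 (from the example)
    decision = "reject H0" if abs(t) >= t_crit else "fail to reject H0"
    print(round(t, 2), decision)
```

The same function applied to the GPA summary statistics (mean 2.958, μ0 = 3, s = 0.4101, n = 50) gives approximately -0.72, matching the Minitab output.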

Hypothesis Testing
We set a standard beyond which results would be rare (outside the expected sampling error).
We observe a sample and infer information about the population.
If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Degree of Freedom (df)

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

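We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N     Mean    Std Dev
Healthy           9     18.9    5.9
Cystic Fibrosis   13    11.9    6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086; \qquad |t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0, \text{ accept } H_A$$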
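
The pooled-variance calculation for the healthy vs. cystic fibrosis comparison can be verified from the summary statistics alone. This is a minimal sketch; the function name is an assumption, and the critical value 2.086 is the tabled value quoted in the example.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming a common variance.

    Returns (t, df, pooled_variance).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the mean difference
    return (mean1 - mean2) / se_diff, df, pooled_var


if __name__ == "__main__":
    t, df, s2 = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
    t_crit = 2.086  # two-sided critical value for alpha = 0.05, df = 20
    decision = "reject H0" if abs(t) >= t_crit else "fail to reject H0"
    print(round(s2, 2), df, round(t, 2), decision)
```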

Two-sample t-test
MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100,000 to 1,000,000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441.
We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10
Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
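
The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests that this output comes from the unequal-variance (Welch) form of the two-sample t-test; that reading is an inference from the numbers, not something the slides state. Under that assumption, the sketch below recomputes the t-value and the Welch-Satterthwaite degrees of freedom from the summary statistics. The function name is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic without assuming equal variances.

    Returns (t, df, se_diff), where df is the Welch-Satterthwaite
    approximation.
    """
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


if __name__ == "__main__":
    # East vs West summary statistics from the output above.
    t, df, se_diff = welch_t(312702, 188163, 34, 339143, 255571, 26)
    print(round(t, 2), math.floor(df))
```

Truncating the computed degrees of freedom to a whole number reproduces the DF shown in the output.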

Paired t-test
A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, or one individual before vs. after treatment; very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:
Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

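Paired t-test
Test of infection by two viruses on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

Since |t| exceeds the critical value, we reject H0.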
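
The paired calculation for the tobacco-leaf data can be reproduced from the raw counts in the table. This is a sketch with an assumed function name; the critical value 2.365 for 7 degrees of freedom is the tabled value quoted in the example.

```python
import math

# Counts per plant for the two viruses, taken from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]


def paired_t(x1, x2):
    """Paired t-test on the per-subject differences d_j = x1_j - x2_j.

    Returns (t, df, mean difference, sd of differences).
    """
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))
    se = s_d / math.sqrt(n)  # standard error of the mean difference
    return d_bar / se, n - 1, d_bar, s_d


if __name__ == "__main__":
    t, df, d_bar, s_d = paired_t(virus1, virus2)
    t_crit = 2.365  # two-sided critical value for alpha = 0.05, df = 7
    decision = "reject H0" if abs(t) >= t_crit else "fail to reject H0"
    print(d_bar, round(s_d, 2), df, round(t, 2), decision)
```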

Paired t-test
Advantages:
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

22

21 xxd ss


Hypothesis Testing
Hypothesis testing is always a five-step procedure:
  Formulation of the null and the alternative hypotheses
  Specification of the level of significance
  Calculation of the test statistic
  Definition of the region of rejection
  Selection of the appropriate hypothesis

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up):
   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).
   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      Reject the null hypothesis if the test statistic falls in the critical region.
      Fail to reject the null hypothesis if the test statistic does not fall in the critical region.
      [Figure: Rejection region at the α = 10% significance level]
      [Figure: Rejection region at the α = 5% significance level]
      [Figure: Rejection region at the α = 10% significance level]
   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.
      The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.
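
The table method and the p-value method are two ways of stating the same decision, and each reduces to a single comparison. The sketch below applies both to numbers that appear elsewhere in these slides; the function names are assumptions made for illustration.

```python
def decide_by_table(t_stat, t_crit):
    """Table method: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


if __name__ == "__main__":
    # Cholesterol example: t = 1.17, critical value 2.064 (df = 24)
    print(decide_by_table(1.17, 2.064))
    # Healthy vs cystic fibrosis example: t = 2.63, critical value 2.086 (df = 20)
    print(decide_by_table(2.63, 2.086))
    # GPA example: one-sided p-value 0.236 at alpha = 0.05
    print(decide_by_p_value(0.236, alpha=0.05))
```

For a given test and α the two rules must agree, which is why the slides warn against mixing them: a critical value from one test form paired with a p-value from another can give contradictory answers.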

Hypothesis Testing
An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form the hypotheses below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
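The three probability sampling strategies can be sketched with the standard library. This is a minimal illustration, not a survey-design tool: the function names, the toy population and the ward/clinic groupings are all hypothetical.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subject has the same chance of entering the sample."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately inside each stratum.

    strata maps a stratum label to the list of subjects in it.
    """
    return {label: rng.sample(members, k_per_stratum)
            for label, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for label in chosen for subject in clusters[label]]


rng = random.Random(2023)                      # fixed seed so the sketch is repeatable
population = list(range(1, 101))               # hypothetical subject IDs 1..100
strata = {"ward A": population[:60], "ward B": population[60:]}
clusters = {"clinic %d" % i: population[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random_sample(population, 10, rng))
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, 2, rng))
```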

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality
Decision        H0 is true                     H0 is false
Retain H0       Correct decision               Type II error (β) (.2, .1)
Reject H0       Type I error (α) (.05, .01)    Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
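A sample-size calculation of this kind can be sketched as follows. The formula used here is the common normal-approximation for comparing two group means with a two-sided test; it is an assumption brought in for illustration, and the function name `n_per_group` and the standardized effect of 0.5 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, beta=0.20):
    """Approximate subjects needed in each of two groups.

    effect_size is the expected difference in means divided by the
    common standard deviation. Uses the normal approximation
    n = 2 * ((z_(1 - alpha/2) + z_(1 - beta)) / effect_size) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)     # two-sided Type I error
    z_beta = z.inv_cdf(1 - beta)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


print(n_per_group(0.5))              # alpha = .05, beta = .2
print(n_per_group(0.5, beta=0.10))   # asking for more power needs more subjects
```

An exact calculation based on the t distribution gives slightly larger numbers, so this should be read as a planning approximation only.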

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
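The selection rules above for a continuous outcome can be written as a small lookup. The function `choose_test` and its parameter names are hypothetical, and the sketch covers only the cases listed on these slides.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above.

    n_groups is only used when there is a single independent variable;
    all_categorical is only used when there is more than one.
    """
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, all_categorical=False))
```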

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise, two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n-1 for a one sample test of the mean.

Common Inferential Stats

T-Test - compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²) - testing hypothesis about number of cases that fall into various categories
- nominal/ordinal level of measurement

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct statistic.

Summary

Parametric and non-parametric: two types of statistics requiring analysis of assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups - two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion


Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
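The two intervals can be recomputed from the summary statistics as mean ± t × SE. In the sketch below the critical values for 49 degrees of freedom (1.6766 and 2.0096) are read from a t table and are assumptions of this example, not values shown in the output above; the function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean: mean +/- t_crit * sd / sqrt(n)."""
    se = sd / sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


# Tabled critical values for df = 50 - 1 = 49 (assumed, from a t table)
T_CRIT_90 = 1.6766
T_CRIT_95 = 2.0096

lo90, hi90 = t_interval(2.9580, 0.4101, 50, T_CRIT_90)
lo95, hi95 = t_interval(2.9580, 0.4101, 50, T_CRIT_95)

print("90%% CI: (%.4f, %.4f), width %.4f" % (lo90, hi90, hi90 - lo90))
print("95%% CI: (%.4f, %.4f), width %.4f" % (lo95, hi95, hi95 - lo95))
```

The only quantity that changes between the two intervals is the critical value, which is why the 95% interval is the wider one.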


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
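The t statistic and the three p-values for the GPA example can be reproduced from the summary statistics alone. The sketch below avoids external libraries by integrating the t density numerically with Simpson's rule; that integration routine and the names `t_pdf`, `t_cdf` and `one_sample_t` are choices made for this illustration, not the procedure the statistics package uses.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about zero."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def one_sample_t(mean, sd, n, mu0):
    """Return the t statistic and p-values for the three alternatives."""
    t = (mean - mu0) / (sd / sqrt(n))
    lower = t_cdf(t, n - 1)
    p = {"less": lower,
         "greater": 1 - lower,
         "two-sided": 2 * min(lower, 1 - lower)}
    return t, p


t_stat, p_values = one_sample_t(2.9580, 0.4101, 50, 3.0)
print("t = %.2f" % t_stat)
for alternative, p in p_values.items():
    print("%-9s p = %.3f" % (alternative, p))
```

The two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them, which is the pattern visible in the three tests above.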

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0, HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml. For a sample from a town:

x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
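The cholesterol calculation follows the table method and can be checked in a few lines. The critical value 2.064 is the tabled value quoted in the example; the function name `one_sample_t_statistic` is hypothetical.

```python
from math import sqrt


def one_sample_t_statistic(x_bar, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))


t_chol = one_sample_t_statistic(x_bar=220, mu0=211, s=38.6, n=25)
t_table = 2.064                       # two-tailed critical value, df = 24, alpha = 0.05
reject_h0 = abs(t_chol) > t_table     # table method decision rule

print("t = %.2f, critical value = %.3f, reject H0: %s" % (t_chol, t_table, reject_h0))
```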

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2, HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

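We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2, HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df), \qquad df = n_1 + n_2 - 2$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086$$

Since |t| > t_{0.05, 20}, reject H0 and accept HA.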
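The pooled-variance calculation for the healthy and cystic fibrosis groups can be reproduced step by step from the summary table. The function name `pooled_t` is hypothetical, and the critical value 2.086 is the tabled value quoted in the example.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance.

    Returns (t, df, pooled variance, standard error of the difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff


t_cf, df_cf, s2_cf, se_cf = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_table_cf = 2.086                     # two-tailed critical value, df = 20, alpha = 0.05

print("pooled variance = %.1f, SE = %.2f" % (s2_cf, se_cf))
print("t = %.2f on %d df, reject H0: %s" % (t_cf, df_cf, abs(t_cf) > t_table_cf))
```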

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs: Check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
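The T-Value and DF in this output can be recomputed from the summary table. DF = 44 (rather than 34 + 26 - 2 = 58) is consistent with the unpooled form of the test, so the sketch below assumes the Welch formula for the standard error and degrees of freedom; that reading is an inference from the numbers, and the function name `welch_t` is hypothetical.

```python
from math import sqrt


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances.

    Returns (t, df), where df is the Welch-Satterthwaite approximation
    before any rounding or truncation.
    """
    v1 = sd1 ** 2 / n1                 # squared standard error of mean 1
    v2 = sd2 ** 2 / n2                 # squared standard error of mean 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_price, df_price = welch_t(34, 312702, 188163, 26, 339143, 255571)
print("t = %.2f, df = %.2f (truncated: %d)" % (t_price, df_price, int(df_price)))
```

With a common-variance assumption the degrees of freedom would instead be 58, so the choice of formula matters when reading software output.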

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method:

Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate:

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0, HA: μd ≠ 0

Test of virus infection on tobacco leaves: number of dead spots on leaves.

         Virus 1   Virus 2
Plant    X1j       X2j       dj
1        9         10        -1
2        17        11        6
3        31        18        13
4        18        14        4
5        7         6         1
6        8         7         1
7        20        17        3
8        10        5         5
Total    120       88        32
Mean     15        11        4

H0: μd = 0, HA: μd ≠ 0

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$
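The paired calculation for the tobacco leaf data can be reproduced directly from the eight pairs of counts. The function name `paired_t` is hypothetical; the critical value 2.365 is the tabled value quoted in the example.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]


def paired_t(x1, x2):
    """Paired t statistic computed on the within-pair differences.

    Returns (t, df, mean difference, standard deviation of differences).
    """
    d = [a - b for a, b in zip(x1, x2)]
    d_bar = mean(d)
    s_d = stdev(d)                     # sample standard deviation, n - 1 divisor
    se = s_d / sqrt(len(d))
    return d_bar / se, len(d) - 1, d_bar, s_d


t_pair, df_pair, d_bar, s_d = paired_t(virus1, virus2)
t_table_pair = 2.365                   # two-tailed critical value, df = 7, alpha = 0.05

print("mean difference = %s, s_d = %.2f" % (d_bar, s_d))
print("t = %.2f on %d df, reject H0: %s" % (t_pair, df_pair, abs(t_pair) > t_table_pair))
```

Because the test works on the differences only, the variances of the two original samples never enter the calculation.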

Advantages
1) Usually it is easier to find a true small difference
2) Do not need to consider whether the variances of the two populations are the same or not

[Formula comparing the squared standard error of the mean difference with that of the difference between the two sample means]

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null if the test statistic does not fall in the critical region.

[Figures: Rejection region at the α = 10% significance level; rejection region at the α = 5% significance level; rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null if the p-value is greater than α.

The exact p-value can be computed, and if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused) and the judges. The judge will form hypotheses as below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                             Null hypothesis TRUE                                  Null hypothesis FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null hypothesis    CORRECT
Fail to reject the null hypothesis   CORRECT                                               Type II error (β): failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
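The selection rules above can be written as a small lookup function. This is a sketch only; the function name `choose_test` and its parameters are hypothetical, and it covers just the continuous-outcome cases listed on the slides.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Return the test suggested by the rules above for a continuous outcome."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare means")
    if n_independent_vars > 1:
        # Several predictors: the choice depends on whether any is continuous
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```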

Logic of hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine whether to use a one- or two-tailed test

If the alternative hypothesis specifies the direction of the test, then use a one-tailed test.

Otherwise use a two-tailed test (most cases).

5. Determine degrees of freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.

Common inferential stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis.

The p-value determines whether or not we reject the null hypothesis. We use it to judge how compatible the data are with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses:
- Calculate descriptive statistics
- Calculate an inferential statistic
- Find its probability (p-value)
- Based on the p-value, reject or fail to reject the null hypothesis (H0)
- Draw a conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96
- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
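The two intervals can be approximately reproduced from the summary statistics as mean ± t × s/√n. In this sketch the function name `t_confidence_interval` is hypothetical, and the critical values 1.6766 and 2.0096 are assumed table values for 49 degrees of freedom; they are not shown in the output above. Small differences in the last decimal place are expected because the printed mean and standard deviation are rounded.

```python
import math


def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)       # standard error of the mean
    margin = t_crit * se         # half-width of the interval
    return mean - margin, mean + margin


# Assumed critical values for df = 49 (not part of the output above)
T_CRIT_90 = 1.6766   # upper 5% point, gives a 90% two-sided interval
T_CRIT_95 = 2.0096   # upper 2.5% point, gives a 95% two-sided interval

ci90 = t_confidence_interval(2.9580, 0.4101, 50, T_CRIT_90)
ci95 = t_confidence_interval(2.9580, 0.4101, 50, T_CRIT_95)
print(ci90)
print(ci95)
```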


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sample mean) that the sample mean 2.96 is from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05, so we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
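The t statistic and the three p-values in the GPA output can be checked from the summary statistics alone. The sketch below is an illustration, not the way Minitab computes p-values: it integrates the Student t density with Simpson's rule, and the helper names `t_pdf` and `t_cdf` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] plus symmetry about zero."""
    a = abs(t)
    h = a / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # area under the density between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))   # about -0.72
p_lower = t_cdf(t_stat, n - 1)                # HA: mu < 3
p_upper = 1 - p_lower                         # HA: mu > 3
p_two_sided = 2 * min(p_lower, p_upper)       # HA: mu != 3
print(round(t_stat, 2), round(p_lower, 3), round(p_two_sided, 3), round(p_upper, 3))
```

The two-sided p-value is twice the smaller one-sided value, which is why the same t statistic gives three different p-values depending on the alternative hypothesis.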


Two sample T-test

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution: t_{0.05}(24) = 2.064.

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17 \]

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
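The cholesterol calculation can be checked with a few lines of Python using the table method. The function name `one_sample_t` is a hypothetical helper; the critical value 2.064 is the tabled t(24) value quoted in the example.

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for a one-sample test of the mean."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


t_stat = one_sample_t(sample_mean=220, mu0=211, s=38.6, n=25)
T_CRIT = 2.064                         # tabled two-tailed value for df = 24, alpha = 0.05
reject_h0 = abs(t_stat) >= T_CRIT      # table method: compare |t| with the critical value
print(round(t_stat, 2), reject_h0)
```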

Hypothesis testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



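Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N     Mean    Std Dev
Healthy           9     18.9    5.9
Cystic fibrosis   13    11.9    6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(8)(5.9)^2 + (12)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

t_{0.05}(20) = 2.086

|t| > t_{0.05}(20): reject H0, accept HA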
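The pooled-variance calculation for the healthy versus cystic fibrosis example can be reproduced from the summary table. The function name `pooled_t` is a hypothetical helper; the critical value 2.086 is the tabled t(20) value quoted in the example.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df   # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))                  # SE of the mean difference
    t = (mean1 - mean2) / se
    return t, df, sp2, se


t_stat, df, sp2, se = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
T_CRIT = 2.086                        # tabled two-tailed value for df = 20, alpha = 0.05
reject_h0 = abs(t_stat) > T_CRIT
print(round(sp2, 2), round(se, 2), round(t_stat, 2), df, reject_h0)
```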

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26441.90
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = -$26,441.90. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05 (and 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
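The T-Value and DF in this output can be reproduced from the summary table. DF = 44 (rather than 34 + 26 - 2 = 58) suggests that the variances were not pooled, so this sketch assumes the unequal-variance (Welch) form of the test; that is an inference from the numbers, not something the slides state. The function name `welch_t` is hypothetical, and the difference is taken from the rounded means.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and approximate df without assuming equal variances."""
    v1 = sd1 ** 2 / n1                    # squared standard error, group 1
    v2 = sd2 ** 2 / n2                    # squared standard error, group 2
    se = math.sqrt(v1 + v2)               # SE of the mean difference
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


t_stat, df, se = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(round(t_stat, 2), math.floor(df), round(se))
```

The approximate df is rounded down to a whole number, which matches the DF = 44 reported above.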

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: d_j = X_{1j} - X_{2j}

Calculate:

\[ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d^2 = \frac{\sum_j (d_j - \bar{d})^2}{n - 1}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) \]

H0: μ_d = 0    HA: μ_d ≠ 0

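Example: Testing infection by two viruses on tobacco leaves (number of death spots on leaves)

Plant    Virus 1 (X1j)   Virus 2 (X2j)   dj
1        9               10              -1
2        17              11              6
3        31              18              13
4        18              14              4
5        7               6               1
6        8               7               1
7        20              17              3
8        10              5               5
Total    120             88              32
Mean     15              11              4

H0: μ_d = 0    HA: μ_d ≠ 0

Using the equivalent computational form of the variance:

\[ s_d^2 = \frac{\sum_j d_j^2 - (\sum_j d_j)^2/n}{n - 1} = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31 \]

\[ s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365 \]

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level.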
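The paired calculation for the tobacco leaf data can be reproduced with Python's standard library. The variable names are illustrative; the critical value 2.365 is the tabled t(7) value quoted in the example.

```python
import math
from statistics import mean, stdev

# Counts per plant, from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [x1 - x2 for x1, x2 in zip(virus1, virus2)]   # d_j = X1j - X2j
d_bar = mean(diffs)                                   # mean difference
s_d = stdev(diffs)                                    # sample sd (divides by n - 1)
se = s_d / math.sqrt(len(diffs))                      # standard error of the mean difference
t_stat = d_bar / se

T_CRIT = 2.365                     # tabled two-tailed value for df = 7, alpha = 0.05
reject_h0 = abs(t_stat) > T_CRIT
print(d_bar, round(s_d, 2), round(se, 2), round(t_stat, 2), reject_h0)
```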

Advantages:
1) It is usually easier to detect a true small difference
2) There is no need to consider whether the variances of the two populations are the same or not

\[ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 \]


Hypothesis testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 steps to carry out a hypothesis test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Decision rule 1 (table method): Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Decision rule 2 (graph method): Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 0.10 significance level]

[Figure: Rejection region at the α = 0.05 significance level]

[Figure: Rejection region at the α = 0.10 significance level]

Decision rule 3 (p-value method): Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused) and the judges. The judge will form hypotheses as below before hearing a case.

Hypotheses:
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II errors (alpha and beta)

                                      NULL HYPOTHESIS
                                      TRUE                  FALSE
Reject the null hypothesis            Type I error (α)      CORRECT
Fail to reject the null hypothesis    CORRECT               Type II error (β)

Type I error (α): rejecting a true null hypothesis
Type II error (β): failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
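The interplay between α, β and n can be illustrated numerically. The sketch below assumes a two-sided one-sample z-test with known standard deviation (a normal approximation chosen for simplicity, not a method from the slides); the function name `type_ii_error` and the effect size 0.5 are hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha, n, effect, sd):
    """Approximate beta for a two-sided one-sample z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # two-sided critical value
    shift = effect * math.sqrt(n) / sd         # true effect in standard-error units
    power = z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
    return 1 - power                           # beta = 1 - power


# Fixed alpha: a larger n gives a smaller beta
print(type_ii_error(0.05, 25, 0.5, 1.0), type_ii_error(0.05, 50, 0.5, 1.0))
# Fixed n: a smaller alpha gives a larger beta
print(type_ii_error(0.05, 25, 0.5, 1.0), type_ii_error(0.01, 25, 0.5, 1.0))
```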

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
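The three probability sampling strategies can be sketched with Python's random module. Everything here is illustrative: the population of 100 numbered subjects, the two strata, the ten "ward" clusters and the function names are hypothetical.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample within each stratum (name -> list of subjects)."""
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]


rng = random.Random(2023)                       # fixed seed so the draw is repeatable
population = list(range(1, 101))                # 100 hypothetical subject IDs
strata = {"A": population[:50], "B": population[50:]}
clusters = {f"ward{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), {name: len(s) for name, s in strat.items()}, len(clus))
```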

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- Must know the level of measurement of variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- T test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
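As a rough illustration of why either test can be used when there are only two independent groups, the sketch below computes both the pooled-variance t statistic and the one-way ANOVA F statistic and checks that F equals t squared. The anxiety scores and the function names are made up for illustration; they do not come from the study described above.

```python
import math
import statistics

# Hypothetical anxiety-inventory scores (invented for illustration only)
sociology = [42, 38, 51, 45, 47, 40, 44]
political_science = [36, 41, 35, 39, 44, 33]

def pooled_t(a, b):
    """Independent-samples t statistic assuming a common variance."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (statistics.mean(a) - statistics.mean(b)) / se

def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group MS over within-group MS."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

t_stat = pooled_t(sociology, political_science)
f_stat = one_way_f(sociology, political_science)
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
```

With two groups the two statistics carry the same information, so the choice between them is a matter of convenience; ANOVA becomes necessary only with three or more groups.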


Inferential Statistics

p-value

The reason for calculating an inferential statistic is to get a p-value (p = probability). The p-value is the probability of obtaining a result at least as extreme as the one observed if the samples are from the same population with regard to the dependent variable (outcome).

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome. The p-value is directly related to the null hypothesis, which states that they do not differ.

The p-value determines whether or not we reject the null hypothesis. We use it to judge whether the data are compatible with the null hypothesis.

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true.

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw conclusion


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width
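The intervals in the output above can be approximately reproduced from the summary statistics alone. The sketch below is one way to do it, assuming the two-sided critical values of t with 49 degrees of freedom are looked up from a table (they are not part of the original output). Because the printed mean and standard deviation are rounded, the last digit may differ slightly from the software's values.

```python
import math

# Summary statistics from the one-sample GPA output above
n, mean, sd = 50, 2.9580, 0.4101
se = sd / math.sqrt(n)  # standard error of the mean

# Two-sided critical values of t with n - 1 = 49 degrees of freedom.
# Assumption: taken from a t table, not from the original slides.
T_CRIT = {90: 1.6766, 95: 2.0096}

def confidence_interval(level):
    """Return (lower, upper) bounds of the CI for the population mean."""
    margin = T_CRIT[level] * se
    return mean - margin, mean + margin

ci90 = confidence_interval(90)
ci95 = confidence_interval(95)
print(f"SE mean = {se:.4f}")
print(f"90% CI = ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI = ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

The larger critical value for 95% confidence is what makes the 95% interval wider than the 90% interval.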


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
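To show where the T and P columns come from, here is a minimal sketch that recomputes them from the summary statistics. It assumes a hand-rolled t-distribution CDF (Simpson's rule on the density) so that only the Python standard library is needed; the helper names `t_pdf` and `t_cdf` are illustrative, and statistical software would use a library routine instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), by Simpson's rule on [0, |t|] plus symmetry about zero."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

# Summary statistics from the GPA output; mu0 is the hypothesized mean
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                  # HA: mu < 3
p_greater = 1 - p_less                      # HA: mu > 3
p_two_sided = 2 * min(p_less, p_greater)    # HA: mu != 3

print(f"t = {t_stat:.2f}")
print(f"p (less) = {p_less:.3f}, p (two-sided) = {p_two_sided:.3f}, "
      f"p (greater) = {p_greater:.3f}")
```

The same test statistic gives three different p-values depending on the alternative hypothesis, which is why the direction of the test has to be fixed before looking at the data.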

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0, HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t_{0.05,\,24} = 2.064$$

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level,

we fail to reject H0.

The difference is not statistically significant.
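The arithmetic of the cholesterol example can be checked with a few lines. This is only a sketch of the table method; the critical value 2.064 is the tabled value quoted in the example, not something the code derives.

```python
import math

# Values from the cholesterol example
x_bar, s, n = 220.0, 38.6, 25   # sample mean, sample SD, sample size (town)
mu0 = 211.0                     # hypothesized population mean

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
t_crit = 2.064                  # tabled two-sided value for t(24), alpha = 0.05

reject_h0 = abs(t_stat) >= t_crit
print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```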

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2, HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


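We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0,\ \text{accept } H_A$$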
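The pooled-variance calculation for the healthy versus cystic fibrosis comparison can be sketched as follows, working only from the summary table. The variable names are illustrative, and the critical value 2.086 is the tabled t(20) value used in the example.

```python
import math

# Summary statistics from the table above
n1, mean1, sd1 = 9, 18.9, 5.9     # Healthy
n2, mean2, sd2 = 13, 11.9, 6.3    # Cystic Fibrosis

df = n1 + n2 - 2
# Pooled estimate of the common variance
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
# Standard error of the difference between the two sample means
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se_diff

t_crit = 2.086                    # tabled two-sided value for t(20), alpha = 0.05
reject_h0 = abs(t_stat) >= t_crit

print(f"df = {df}, pooled variance = {pooled_var:.1f}")
print(f"SE of difference = {se_diff:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```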

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select (SalePrice)

Subscripts: Select (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10. Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
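The reported DF of 44 is smaller than n1 + n2 - 2 = 58, which suggests the software did not pool the variances. A sketch under that assumption (the unequal-variance, or Welch, form of the test with the Welch-Satterthwaite degrees of freedom) reproduces the T-Value and DF from the summary statistics:

```python
import math

# Summary statistics from the real-estate output above
n_east, mean_east, sd_east = 34, 312702, 188163
n_west, mean_west, sd_west = 26, 339143, 255571

# Squared standard error of each sample mean
v_east = sd_east ** 2 / n_east
v_west = sd_west ** 2 / n_west

se_diff = math.sqrt(v_east + v_west)
t_stat = (mean_east - mean_west) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
# (assumed here; reported truncated to an integer)
df_welch = (v_east + v_west) ** 2 / (
    v_east ** 2 / (n_east - 1) + v_west ** 2 / (n_west - 1)
)

print(f"SE of difference = {se_diff:.0f}")
print(f"t = {t_stat:.2f}, df = {int(df_welch)}")
```

The difference computed from the rounded means is -26441, one unit away from the software's -26442, which was computed from unrounded data.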

Paired t-test

Test to deal with two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the differences between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Example: Test infection of virus on tobacco leaves. Number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$
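The tobacco-leaf example can be recomputed directly from the counts in the table. The sketch below uses only the standard library; the critical value 2.365 is the tabled t(7) value quoted in the example.

```python
import math
import statistics

# Counts from the table: one pair of measurements per plant
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Per-plant differences d_j = X_1j - X_2j
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)

d_bar = statistics.mean(d)       # mean difference
s_d = statistics.stdev(d)        # sample standard deviation of the differences
se_d_bar = s_d / math.sqrt(n)    # standard error of the mean difference
t_stat = d_bar / se_d_bar

t_crit = 2.365                   # tabled two-sided value for t(7), alpha = 0.05
reject_h0 = abs(t_stat) >= t_crit

print(f"differences = {d}, mean difference = {d_bar}")
print(f"s_d = {s_d:.2f}, SE = {se_d_bar:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```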

Advantages
1) Usually it is easier to find a true small difference
2) Do not need to consider whether the variances of the two populations are the same or not

$$s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

      [Figure: Rejection region at the α = 10% significance level]

      [Figure: Rejection region at the α = 5% significance level]

      [Figure: Rejection region at the α = 10% significance level]

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
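The table method and the p-value method can be written as two small functions. This is a sketch only: the function names are made up, the table rule is written for a two-tailed test as stated above, and the inputs are the statistics quoted in this document's worked examples.

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: reject H0 when |test statistic| >= critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 when the p-value is less than alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# Statistics quoted in the worked examples of this document
cholesterol = decide_by_table(1.17, 2.064)        # one-sample test, t(24)
cystic_fibrosis = decide_by_table(2.63, 2.086)    # two-sample test, t(20)
gpa = decide_by_p_value(0.236)                    # one-sample GPA test
real_estate = decide_by_p_value(0.660, alpha=0.10)

for name, decision in [("cholesterol", cholesterol),
                       ("cystic fibrosis", cystic_fibrosis),
                       ("GPA", gpa),
                       ("real estate", real_estate)]:
    print(f"{name}: {decision}")
```

For the same test and the same α, both rules lead to the same decision, which is why it is enough to pick one of them.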

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case, i.e. the plaintiff (prosecutors), the defendant (accused) and the judges. The judge will form the hypotheses below before hearing a case.

Hypothesis
H0: The evidence is not significantly strong enough to prove the defendant guilty
H1: The evidence is significantly strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                      FALSE
Reject the null hypothesis            Type I error (α):         CORRECT
                                      rejecting a true
                                      null hypothesis
Fail to reject the null hypothesis    CORRECT                   Type II error (β):
                                                                failing to reject a
                                                                false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
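The trade-off between α, β and n can be made concrete with a small calculation. The sketch below assumes a simplified setting that is not in the original slides: a one-sided z-test with known standard deviation, a hypothetical true effect of 0.5 and a standard deviation of 1. The function name `type_ii_error` is illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def type_ii_error(alpha, n, effect, sigma):
    """beta for a one-sided z-test of H0: mu = mu0 when the true mean is mu0 + effect."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - effect * math.sqrt(n) / sigma)

EFFECT, SIGMA = 0.5, 1.0   # hypothetical values chosen for illustration

for alpha in (0.10, 0.05, 0.01):
    for n in (10, 20, 40):
        beta = type_ii_error(alpha, n, EFFECT, SIGMA)
        print(f"alpha = {alpha:.2f}, n = {n:2d}, beta = {beta:.3f}")
```

Reading down the output for a fixed α shows β shrinking as n grows; comparing rows with the same n shows β growing as α is made smaller.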

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

              Reality
Decision      H0 is true                       H0 is false
Retain H0     Correct Decision                 Type II Error (β) (.2, .1)
Reject H0     Type I Error (α) (.05, .01)      Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
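A minimal sketch of such a sample-size calculation is shown below. It assumes a setting not specified in the slides: a two-sided one-sample z-test with known standard deviation, using the usual normal-approximation formula n = ((z for α/2 + z for β) · σ / effect)². The function name and the effect of half a standard deviation are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Smallest n keeping the Type II error at or below beta (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: expected effect of half a standard deviation
n_needed = sample_size(effect=0.5, sigma=1.0)
n_needed_high_power = sample_size(effect=0.5, sigma=1.0, beta=0.1)
print(f"n for 80% power: {n_needed}")
print(f"n for 90% power: {n_needed_high_power}")
```

Demanding a smaller β (higher power) or expecting a smaller effect both push the required sample size up.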

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95% and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
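The link between a confidence interval and a two-sided test can be checked against numbers reported earlier in this document: the one-sample GPA output gives a 95% CI of (2.8414, 3.0746) and a two-sided p-value of 0.472 for H0: μ = 3. The sketch below simply confirms that the two views agree; the helper name `ci_contains` is illustrative.

```python
def ci_contains(interval, value):
    """True when the hypothesized value lies inside the confidence interval."""
    lower, upper = interval
    return lower <= value <= upper

# Figures quoted from the one-sample GPA output in this document
ci_95 = (2.8414, 3.0746)
p_two_sided = 0.472
mu0 = 3.0
alpha = 0.05   # matches the 95% confidence level

null_value_inside_ci = ci_contains(ci_95, mu0)
fail_to_reject = p_two_sided >= alpha

print(f"mu0 inside 95% CI: {null_value_inside_ci}")
print(f"fail to reject H0 at alpha = 0.05: {fail_to_reject}")
```

The interval additionally shows which other values of μ are compatible with the data, which the p-value alone does not.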

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous


Page 23: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

p-p-valuevalueThe reason for calculating an inferential statistic is

to get a p-value (p = probability) The p value is the probability that the samples are from the same population with regard to the dependent variable (outcome)

Usually the hypothesis we are testing is that the samples (groups) differ on the outcome The p- value is directly related to the null hypothesis

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

The p-value determines whether or not we reject the null hypothesis We use it to estimate whether or not we think the null hypothesis is true

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

If the p-value (lt α- value) is small reject the null hypothesis and accept that the samples are truly different with regard to the outcome

If the p-value (gt α- value) is large fail to reject the null hypothesis and conclude that the treatment or the predictor variable had no effect on the outcome

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsSteps for testing hypotheses Calculate descriptive statistics Calculate an inferential statistic Find its probability (p-value) Based on p-value accept or reject the null hypothesis (H0) Draw conclusion

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Variable N Mean StDev SE Mean 90 CI

GPAs 50 29580 04101 00580 (28608 30552)

Variable N Mean StDev SE Mean 95 CI

GPAs 50 29580 04101 00580 (28414 30746)

Sample mean = 296

- 90 confident that μ (population mean) is between 286 and 306

- 95 confident that μ (population mean) is between 284 and 307

95 CI width gt 90 CI width

The larger confidence coefficient the greater the CI width

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed, lower tail)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed, upper tail)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

We reject H0 whenever the p-value is less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
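To see where the three p-values (0.236, 0.472, 0.764) come from, the sketch below recomputes them from the summary statistics. It is an illustration under stated assumptions: the helpers `t_pdf` and `t_cdf` are hypothetical names, and the t distribution function is approximated by Simpson's rule rather than by whatever routine Minitab uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density from 0 to t with Simpson's rule.

    The distribution is symmetric, so P(T <= 0) = 0.5; steps must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

# Summary statistics and hypothesized mean from the output above.
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
df = n - 1

p_less = t_cdf(t_stat, df)                 # HA: mu < 3
p_greater = 1 - p_less                     # HA: mu > 3
p_two_sided = 2 * min(p_less, p_greater)   # HA: mu != 3

print(round(t_stat, 2), round(p_less, 3), round(p_two_sided, 3), round(p_greater, 3))
```

The same test statistic yields three different p-values because each alternative hypothesis counts a different tail (or both tails) as "at least as extreme".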

One-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected result is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{24,\,0.05} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
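The arithmetic of the cholesterol example can be checked in a few lines. This is only a sketch of the table method: the critical value 2.064 is copied from the slide rather than computed, and the variable names are chosen here for illustration.

```python
import math

# Values from the cholesterol example.
x_bar, mu0, s, n = 220.0, 211.0, 38.6, 25

# One-sample t statistic: distance of the sample mean from mu0 in standard errors.
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Two-tailed critical value for alpha = 0.05 and df = 24, as given on the slide.
t_crit = 2.064

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```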

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected result is specified. The null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


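We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

Pooled variance:

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7 $$

Standard error of the difference:

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

Test statistic and critical value:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{20,\,0.05} = 2.086 $$

$$ |t| > t_{20,\,0.05} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A $$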
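The pooled-variance calculation above can be written as a small function. This is a sketch of the equal-variance two-sample t-test from summary statistics only; the function name `pooled_t` is hypothetical, and the critical value 2.086 is the one quoted in the example.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic from summary statistics.

    Returns (t, df, pooled_variance, standard_error_of_difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, pooled_var, se

# Healthy vs. cystic fibrosis groups from the table above.
t_stat, df, pooled_var, se = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)

t_crit = 2.086  # two-tailed, alpha = 0.05, df = 20, as given in the example
print(f"s^2 = {pooled_var:.1f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0:", abs(t_stat) > t_crit)
```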

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44   p-Value = 0.660   DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval -$126602 < μ (East) - μ (West) < $73719.

p-value = 0.660 > α = 0.05 (and 0.10), so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ significantly between the east and west markets.
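The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests that this output does not pool the variances. As a hedged illustration, the sketch below applies the unequal-variance (Welch) form of the two-sample t-test to the summary statistics; that this is the exact procedure behind the output is an assumption, and the function name `welch_t` is hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance two-sample t statistic from summary statistics.

    Returns (t, df, standard_error), with df from the
    Welch-Satterthwaite approximation (generally not an integer).
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df, se

# East vs. West sale prices from the output above.
t_stat, df, se = welch_t(312702, 188163, 34, 339143, 255571, 26)

print(f"t = {t_stat:.2f}, df = {df:.2f} (truncated: {math.floor(df)}), SE = {se:.0f}")
```

Truncating the approximate degrees of freedom to a whole number reproduces the DF shown in the output, and the t statistic matches the reported T-Value.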

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

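Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

Since the calculated t exceeds the critical value, H0 is rejected.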
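The paired calculation can be reproduced directly from the eight pairs of counts. This is a minimal sketch of the method described above; the critical value 2.365 is taken from the example, and the variable names are chosen for illustration.

```python
import math

# Counts per plant for the two viruses, from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-pair differences only.
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
se = s_d / math.sqrt(n)
t_stat = d_bar / se

t_crit = 2.365  # two-tailed, alpha = 0.05, df = 7, as given in the example
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("reject H0:", abs(t_stat) > t_crit)
```

Because only the differences enter the calculation, the two variances of the original samples never need to be compared.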

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

$$ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 $$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up): the table method, the graph method, or the p-value method.

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.

If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.

Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
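The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only: the function names are hypothetical, and the inputs are numbers quoted in the worked examples of this deck (t = 2.63 against a critical value of 2.086, and p = 0.660 against α = 0.05).

```python
def table_method(t_stat, t_crit):
    """Decision by comparing |t| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def p_value_method(p_value, alpha):
    """Decision by comparing the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Cystic fibrosis example: calculated t and tabled critical value.
cf_decision = table_method(2.63, 2.086)

# Real-estate example: p-value from the Minitab output.
real_estate_decision = p_value_method(0.660, 0.05)

print(cf_decision)
print(real_estate_decision)
```

For a given test and α the two rules always lead to the same decision, which is why they should not be mixed within one analysis.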

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                              NULL HYPOTHESIS
                              TRUE                               FALSE
Reject the null hypothesis    Type I error (α):                  CORRECT
                              rejecting a true null hypothesis
Fail to reject the null       CORRECT                            Type II error (β):
hypothesis                                                       failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000
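The "chances of getting it wrong" can also be checked by simulation: when the null hypothesis is true, a test at level α should reject in roughly a fraction α of repeated studies. The sketch below is an illustration under simplifying assumptions that are not in the slides: normally distributed data with known standard deviation 1, a two-sided z-test of H0: μ = 0, and a hypothetical function name.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=10000, seed=0):
    """Fraction of simulated studies that reject a TRUE null hypothesis.

    Each study draws n values from N(0, 1), so H0: mu = 0 really holds,
    and applies a two-sided z-test at level alpha.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean * n ** 0.5  # (mean - 0) / (sigma / sqrt(n)) with sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate_20 = type_i_error_rate(0.20)
rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(rate_20, rate_05, rate_01)
```

The simulated rejection rates should land close to the chosen α values, which is the sense in which α is the probability of a Type I error.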

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
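These relationships can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of H0: μ = 0 with known σ and a true mean of 0.5σ; these settings and the function name `beta` are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

Z = NormalDist()

def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test of H0: mu = 0
    when the true mean is `effect` and sigma is known."""
    z_crit = Z.inv_cdf(1 - alpha)
    return Z.cdf(z_crit - effect * n ** 0.5 / sigma)

# Fixed alpha: a larger n lowers beta.
b_small_n = beta(0.05, 10)
b_large_n = beta(0.05, 40)

# Fixed n: a smaller alpha raises beta.
b_strict_alpha = beta(0.01, 10)

# Lower alpha AND lower beta together requires a larger n.
b_strict_alpha_large_n = beta(0.01, 40)

print(round(b_small_n, 3), round(b_large_n, 3),
      round(b_strict_alpha, 3), round(b_strict_alpha_large_n, 3))
```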

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                        H0 is false
Retain H0     Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
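A sample-size calculation of this kind can be sketched with the usual normal approximation for a two-sided test of one mean, n = ((z_{1-α/2} + z_{1-β}) σ / δ)². The formula choice, the effect sizes used below, and the function name `sample_size` are assumptions made for illustration; the slides do not specify a particular method.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n for a two-sided z-test of one mean to detect a true
    difference `effect` with Type I error alpha and Type II error beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical expected effects of 0.5 and 0.25 standard deviations.
n_medium = sample_size(effect=0.5, sigma=1.0)
n_small = sample_size(effect=0.25, sigma=1.0)
print(n_medium, n_small)
```

Halving the expected effect roughly quadruples the required sample size, which is why the expected effect has to be specified before the study is run.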

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
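The link between a confidence interval and a two-sided test can be sketched as a simple check of whether the null value lies inside the interval. The function name is hypothetical; the intervals are the ones reported in the GPA and real-estate examples of this deck, and a (1 - α) interval corresponds to a two-sided test at level α.

```python
def decide_from_interval(lower, upper, null_value=0.0):
    """Two-sided decision implied by a confidence interval:
    fail to reject H0 when the null value lies inside the interval."""
    if lower <= null_value <= upper:
        return "fail to reject H0"
    return "reject H0"

# 90% CI for mu(East) - mu(West); corresponds to a two-sided test at alpha = 0.10.
real_estate = decide_from_interval(-126602, 73719, null_value=0.0)

# 95% CI for the mean GPA; corresponds to a two-sided test of mu = 3 at alpha = 0.05.
gpa = decide_from_interval(2.8414, 3.0746, null_value=3.0)

print(real_estate, "|", gpa)
```

Both decisions agree with the p-values reported for the same examples (0.660 and 0.472), while the intervals additionally show the range of plausible effect sizes.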

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
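The selection rules above amount to a small decision procedure for a continuous outcome. The following sketch simply encodes them; the function name and argument names are hypothetical, and it covers only the cases listed on these slides.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above.

    n_groups is used only when there is a single independent variable;
    all_categorical is used only when there is more than one.
    """
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("a single independent variable needs 2 or more groups")
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=True))
print(choose_test(3, all_categorical=False))
```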

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 testings of means
- interval/ratio level of measurement

Chi-square (χ2): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics requiring analysis of assumptions.

Pearson r, t tests, and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

T test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ2 is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 24: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

The p-value determines whether or not we reject the null hypothesis We use it to estimate whether or not we think the null hypothesis is true

The p-value provides an estimate of how often we would get the obtained result by chance if in fact the null hypothesis were true

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

If the p-value (lt α- value) is small reject the null hypothesis and accept that the samples are truly different with regard to the outcome

If the p-value (gt α- value) is large fail to reject the null hypothesis and conclude that the treatment or the predictor variable had no effect on the outcome

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsSteps for testing hypotheses Calculate descriptive statistics Calculate an inferential statistic Find its probability (p-value) Based on p-value accept or reject the null hypothesis (H0) Draw conclusion

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Variable N Mean StDev SE Mean 90 CI

GPAs 50 29580 04101 00580 (28608 30552)

Variable N Mean StDev SE Mean 95 CI

GPAs 50 29580 04101 00580 (28414 30746)

Sample mean = 296

- 90 confident that μ (population mean) is between 286 and 306

- 95 confident that μ (population mean) is between 284 and 307

95 CI width gt 90 CI width

The larger confidence coefficient the greater the CI width

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs

Test of μ = 3 vs lt 3 (directional one-tail to the right) 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

Test of μ = 3 vs ne 3 (non-directional two-tail)

Variable N Mean StDev SE Mean 95 CI T PGPAs 50 29580 04101 00580 (28414 30746) -072 0472

Test of μ = 3 vs gt 3 (directional one-tail to the left)

95 LowerVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 28608 -072 0764

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Test of μ = 3 vs lt 3 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Two sample T-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

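The test statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

Critical value:

$$t_{0.05,\,20} = 2.086$$

Pooled variance:

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)5.9^2 + (13 - 1)6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.73$$

Standard error of the difference:

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.73\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

Test statistic under H0 (μ1 - μ2 = 0):

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0,\ \text{accept } H_A$$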
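The hand calculation for the Healthy vs Cystic Fibrosis example can be checked with a short Python sketch. The function name `pooled_t_from_summary` is hypothetical, and the critical value is hard-coded from the t table (two-tailed, α = 0.05, df = 20) rather than computed.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances (common-variance assumption)
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return pooled_var, se_diff, t, df

pooled_var, se_diff, t, df = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRITICAL = 2.086  # tabled value, two-tailed alpha = 0.05, df = 20
decision = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t:.2f}, df = {df}: {decision}")
```

Small differences in the last digit against the slide values come from rounding intermediate results by hand.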

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs, check Boxplots of data, click OK

Click Options

Confidence level: Enter 90, click OK, then OK

Ho: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices is about $26,441 (East lower than West). We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 (and 0.10): fail to reject Ho.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
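The reported T-Value and DF can be reproduced from the summary table alone. The sketch below assumes the output used the unpooled (Welch) form of the test, which is an inference from DF = 44 (a pooled test would give DF = 34 + 26 - 2 = 58); the function name is hypothetical.

```python
import math

def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled (Welch) t statistic and approximate degrees of freedom."""
    v1 = sd1**2 / n1  # squared standard error of mean 1
    v2 = sd2**2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

t_stat, df_welch, se = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t_stat:.2f}, df = {math.floor(df_welch)}")
```

Truncating the approximate degrees of freedom to an integer matches the DF shown in the output.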

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual measured before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

Test statistic and hypotheses:

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

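Paired t-test

Example: test of infection by two viruses on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$\bar{d} = 4$$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

Since t exceeds the critical value, H0 is rejected at the 0.05 level.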
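The tobacco-leaf calculation can be reproduced from the raw counts. This is a minimal sketch; `paired_t` is a hypothetical helper name, and the critical value 2.365 is the tabled t value for 7 degrees of freedom quoted on the slide.

```python
import math

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

def paired_t(x1, x2):
    """Paired t statistic computed from the per-subject differences."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((dj - d_bar) ** 2 for dj in d) / (n - 1)  # sample variance of d
    se_d_bar = math.sqrt(var_d / n)                       # standard error of d_bar
    return d_bar, var_d, se_d_bar, d_bar / se_d_bar, n - 1

d_bar, var_d, se_d_bar, t_paired, df_paired = paired_t(virus1, virus2)

T_CRITICAL_7 = 2.365  # tabled value, two-tailed alpha = 0.05, df = 7
print(f"d_bar = {d_bar}, s_d^2 = {var_d:.2f}, SE = {se_d_bar:.2f}, t = {t_paired:.2f}")
print("reject H0" if abs(t_paired) > T_CRITICAL_7 else "fail to reject H0")
```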

Paired t-test

Advantages
1) It is usually easier to detect a true small difference, because the standard error of the mean difference is typically smaller than that of the difference between two independent means:

$$s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$$

2) There is no need to consider whether the variances of the two populations are the same.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test
1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 0.10 significance level]

[Figure: Rejection region at the α = 0.05 significance level]

[Figure: Rejection region at the α = 0.10 significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up):

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
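The table method and the p-value method reduce to one comparison each. A minimal sketch follows; both function names are hypothetical, and the inputs are illustrative values (a test statistic of 2.63 against a tabled value of 2.086, and a p-value of 0.660 against α = 0.05).

```python
def decide_by_table(t_stat, t_critical):
    """Table method: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_critical else "fail to reject H0"

def decide_by_p_value(p_value, alpha):
    """p-value method: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide_by_table(2.63, 2.086))    # illustrative t and critical value
print(decide_by_p_value(0.660, 0.05))  # illustrative p-value and alpha
```

Applied to the same test with the same α, the two rules lead to the same decision, which is why one rule is chosen and used throughout.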

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:
Ho: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and decide which hypothesis holds.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. When the conclusion drawn differs from the truth, we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                             NULL HYPOTHESIS TRUE                                   NULL HYPOTHESIS FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null hypothesis     CORRECT
Fail to reject the null hypothesis   CORRECT                                                Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance: 1 in 5
5%           Chance: 1 in 20
1%           Chance: 1 in 100
0.01%        Chance: 1 in 10000
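The "1 in N" column is simply the reciprocal of the Type I error probability. A small sketch, with a hypothetical helper name:

```python
def one_in_n(alpha):
    """Express a probability alpha as a '1 in N' chance."""
    # round() guards against floating-point noise in the reciprocal
    return round(1 / alpha)

for alpha in (0.20, 0.05, 0.01, 0.0001):
    print(f"{alpha:.2%} -> chance 1 in {one_in_n(alpha)}")
```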

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
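These relationships can be seen numerically in a simplified setting. The sketch below assumes a one-sided z-test with known standard deviation and a fixed true effect, which is an illustrative simplification rather than the t-tests used elsewhere in these slides; the function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def type_ii_error(alpha, n, effect, sigma):
    """beta for a one-sided z-test of H0: mu = mu0 vs HA: mu = mu0 + effect."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)  # rejection cutoff in z units
    # Probability that the test statistic stays below the cutoff when HA is true
    return STD_NORMAL.cdf(z_crit - effect * math.sqrt(n) / sigma)

for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25)]:
    beta = type_ii_error(alpha, n, effect=0.5, sigma=1.0)
    print(f"alpha = {alpha}, n = {n}: beta = {beta:.3f}")
```

With α fixed, doubling n lowers β; with n fixed, tightening α from 0.05 to 0.01 raises β.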

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. Several issues need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
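The three probability sampling strategies can be sketched with Python's standard `random` module. The population, strata and cluster names below are made up for illustration, and the helper functions are hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subject IDs 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"clinic_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), {k: len(v) for k, v in strat.items()}, len(clus))
```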

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (HO): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words: assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
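One way to make "chance alone" concrete is an exact permutation test: if the treatment had no effect, every relabeling of subjects into the two groups would be equally likely. The sketch below uses made-up data and a hypothetical function name, and it is a different technique from the t-tests in these slides; it is shown only to illustrate what a p-value measures.

```python
from itertools import combinations

def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for the difference in means under random relabeling."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    count = 0
    # Enumerate every way of choosing which subjects form group A
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # at least as extreme as what was observed
            extreme += 1
    return extreme / count

treated = [12, 15, 14, 16]  # hypothetical outcomes under treatment X
control = [9, 10, 11, 13]   # hypothetical outcomes under control
p_value = exact_permutation_p_value(treated, control)
print(f"p = {p_value:.4f}")
```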

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision     Reality: HO is true                Reality: HO is false
Retain HO    Correct Decision                   Type II Error (β) (0.2, 0.1)
Reject HO    Type I Error (α) (0.05, 0.01)      Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
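A sample-size calculation of this kind can be sketched for the simplest case. The formula below assumes a two-sided one-sample z-test with known standard deviation, n = ((z_{1-α/2} + z_{1-β}) σ / effect)², which is a textbook approximation and not necessarily the method a given power-analysis package uses; the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_one_sample_z(alpha, beta, effect, sigma):
    """Smallest n keeping the Type II error at or below beta (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2 (power 0.8)
n_needed = sample_size_one_sample_z(alpha=0.05, beta=0.2, effect=0.5, sigma=1.0)
print(n_needed)
```

Asking for a smaller β (higher power) or a smaller expected effect increases the required sample size.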

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
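The selection rules above, for a continuous outcome, can be written as a small lookup. The function and its parameter names are hypothetical and cover only the cases listed on these two slides.

```python
def choose_test(n_groups, n_independent_vars=1, any_continuous_iv=False):
    """Suggest a test for comparing group means on a continuous outcome."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare means")
    # More than one independent variable
    if any_continuous_iv:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(2))
print(choose_test(4))
print(choose_test(2, n_independent_vars=3, any_continuous_iv=True))
```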

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance):
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ2): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample hold in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are two types of statistics; choosing between them requires checking assumptions.

- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

The t test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ2 is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups, i.e. matched on everything but the treatment (level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 25: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential Statistics

If the p-value is small (< α), reject the null hypothesis and accept that the samples are truly different with regard to the outcome.

If the p-value is large (> α), fail to reject the null hypothesis and conclude that there is not sufficient evidence that the treatment or the predictor variable had an effect on the outcome.

Steps for testing hypotheses:
1. Calculate descriptive statistics
2. Calculate an inferential statistic
3. Find its probability (p-value)
4. Based on the p-value, reject or fail to reject the null hypothesis (H0)
5. Draw a conclusion

Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
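Both intervals follow from mean ± t × SD / √n. In the sketch below the critical values 1.6766 and 2.0096 are tabled t values for 49 degrees of freedom, supplied here as assumptions because they do not appear in the output; the function name is hypothetical.

```python
import math

def t_confidence_interval(mean, sd, n, t_critical):
    """Confidence interval for a mean: mean +/- t * sd / sqrt(n)."""
    margin = t_critical * sd / math.sqrt(n)
    return mean - margin, mean + margin

T_CRIT_90 = 1.6766  # assumed tabled value: t(49), 5% in each tail
T_CRIT_95 = 2.0096  # assumed tabled value: t(49), 2.5% in each tail

ci90 = t_confidence_interval(2.9580, 0.4101, 50, T_CRIT_90)
ci95 = t_confidence_interval(2.9580, 0.4101, 50, T_CRIT_95)
print(f"90% CI: ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

The larger critical value is what makes the 95% interval wider than the 90% interval.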


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed, rejection region in the left tail)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552             -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed, rejection region in the right tail)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608             -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound    T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552             -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when Ho: μ = 3 is true.

We reject Ho when the p-value is less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject Ho.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
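The three reported p-values are linked by the symmetry of the t distribution. The sketch below recomputes t from the summary values and derives the two one-tailed p-values from the reported two-tailed one; it takes the two-tailed p-value 0.472 as an input rather than computing it, and both function names are hypothetical.

```python
def t_statistic(mean, mu0, se_mean):
    """Distance of the sample mean from mu0, in standard errors."""
    return (mean - mu0) / se_mean

def one_tailed_from_two_tailed(p_two_tailed, t_stat):
    """Return (p for HA: mu < mu0, p for HA: mu > mu0) using symmetry."""
    half = p_two_tailed / 2
    if t_stat < 0:
        return half, 1 - half
    return 1 - half, half

t_gpa = t_statistic(2.9580, 3.0, 0.0580)
p_lower, p_upper = one_tailed_from_two_tailed(0.472, t_gpa)
print(f"t = {t_gpa:.2f}, p(<) = {p_lower:.3f}, p(>) = {p_upper:.3f}")
```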

Two sample T-test

One-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution, so no direction is specified for the expected result. The alternative hypothesis states only that the parameter differs from the value given in the null hypothesis.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

One-sample t-test

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$$

For an α = 0.05 test, we use the critical value determined from the t(24) distribution:

$$t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
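The cholesterol calculation can be checked in a few lines. The function name is hypothetical, and the critical value 2.064 is taken from the t table as quoted in the example rather than computed.

```python
import math

def one_sample_t(sample_mean, mu0, sd, n):
    """One-sample t statistic and its degrees of freedom."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    return (sample_mean - mu0) / se, n - 1

t_chol, df_chol = one_sample_t(220, 211, 38.6, 25)

T_CRITICAL_24 = 2.064  # tabled value, two-tailed alpha = 0.05, df = 24
decision = "reject H0" if abs(t_chol) > T_CRITICAL_24 else "fail to reject H0"
print(f"t = {t_chol:.2f}, df = {df_chol}: {decision}")
```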

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis Testing

Decision rule 2, graph method: compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

[Figure: rejection region at the α = 0.10 significance level]

[Figure: rejection region at the α = 0.05 significance level]

[Figure: rejection region at the α = 0.10 significance level]

Hypothesis Testing

Decision rule 3, p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, H0 is not rejected and the results are declared not statistically significant.
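The p-value rule can be sketched in code as follows. This is only an illustration under stated assumptions: the slides rely on tables or Minitab, whereas this sketch obtains the t-distribution probability by numerically integrating the density with Simpson's rule. The function names (`t_pdf`, `t_cdf`, `p_value`, `decide`) and the example statistic (t = 2.63 with 7 degrees of freedom) are illustrative choices.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] plus the symmetry of the density."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def p_value(t, df, tail="two-sided"):
    """p-value for HA 'less', 'greater' or 'two-sided'."""
    if tail == "less":
        return t_cdf(t, df)
    if tail == "greater":
        return 1 - t_cdf(t, df)
    return 2 * (1 - t_cdf(abs(t), df))


def decide(p, alpha=0.05):
    """Decision rule 3: reject H0 only when p is below alpha."""
    return "reject H0" if p < alpha else "fail to reject H0"


p = p_value(2.63, df=7)                # illustrative statistic, 7 degrees of freedom
print(round(p, 3), decide(p))
```

The table method and the p-value method always agree for the same α; the p-value simply reports how far into the tail the statistic falls.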

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the hypotheses below.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     Null hypothesis TRUE    Null hypothesis FALSE
Reject the null hypothesis           Type I error (α)        CORRECT
Fail to reject the null hypothesis   CORRECT                 Type II error (β)

Type I error (α): rejecting a true null hypothesis
Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
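The reading of α as the long-run "chance of getting it wrong" when the null hypothesis is true can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than anything from the slides: it draws samples from a population where H0 really holds (mean 0, known SD 1), applies a two-sided z-test, and counts how often H0 is wrongly rejected. The function name and the settings (n = 25, 20,000 trials, fixed seed) are arbitrary choices.

```python
import random
import statistics


def type_i_error_rate(alpha=0.05, n=25, trials=20000, seed=1):
    """Fraction of samples from a true null (mean 0, SD 1) rejected by a two-sided z-test."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = statistics.fmean(sample) * n ** 0.5   # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                       # H0 is true, so this is a Type I error
    return rejections / trials


for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  observed Type I error rate = {type_i_error_rate(alpha):.3f}")
```

The observed rejection rates should land close to the chosen α values, which is the sense in which α is under the analyst's control.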

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
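These trade-offs can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known population SD, a simplification chosen so that β has a closed form; the function name `beta` and the effect size 0.4 are illustrative, not taken from the slides.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta(alpha, n, effect, sigma=1.0):
    """Type II error probability of a one-sided z-test (H0: mu = 0 vs HA: mu > 0)
    when the true mean equals `effect` and the population SD `sigma` is known."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is normal with mean effect*sqrt(n)/sigma.
    return STD_NORMAL.cdf(z_crit - effect * n ** 0.5 / sigma)


for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25), (0.10, 25), (0.01, 100)]:
    print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta(alpha, n, effect=0.4):.3f}")
```

Comparing the printed rows shows each rule in turn: a larger n lowers β at fixed α, a smaller α raises β at fixed n, and only a larger n lets both shrink together.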

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome.
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the claim that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision     Reality: H0 is true              Reality: H0 is false
Retain H0    Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0    Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
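A sample-size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, $n = 2\,[(z_{1-\alpha/2} + z_{1-\beta})\,\sigma/\delta]^2$ per group. The formula choice, the function name and the example numbers (expected effect δ = 5, SD σ = 10) are assumptions made for illustration; dedicated power software typically uses the t distribution and gives slightly larger answers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means
    (normal approximation, equal group sizes, common SD sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)          # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)                # round up to a whole subject


print(sample_size_per_group(effect=5, sigma=10))               # beta = 0.2
print(sample_size_per_group(effect=5, sigma=10, power=0.90))   # beta = 0.1
```

Lowering the tolerated β from 0.2 to 0.1 raises the required sample size, which is the cost of a more powerful study.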

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
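The selection rules above amount to a small decision table, which can be written down as a function. This is only a sketch of the rules as stated for a continuous outcome; the function name and argument names are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, any_continuous_predictor=True))
```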

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, use a one-tailed test
- Otherwise, use a two-tailed test (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size - Number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive statistics summarize measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric are two types of statistics; choosing between them requires checking their assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines whether there is a significant difference between 2 group means.
- ANOVA determines whether there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

Steps for testing hypotheses:
- Calculate descriptive statistics
- Calculate an inferential statistic
- Find its probability (p-value)
- Based on the p-value, reject or fail to reject the null hypothesis (H0)
- Draw a conclusion

Inferential Statistics

One-Sample T: GPAs

Variable  N   Mean    StDev   SE Mean  90% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8608, 3.0552)

Variable  N   Mean    StDev   SE Mean  95% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06
- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
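The two intervals in the output can be reproduced from the summary statistics with the formula mean ± t × SD/√n. In the sketch below, the critical values 1.6766 and 2.0096 (t with 49 degrees of freedom) are standard table values supplied as assumptions, since they are not printed in the output; the function name `t_interval` is illustrative.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval: mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary statistics from the GPA output.
mean, sd, n = 2.9580, 0.4101, 50

# Assumed critical values of t with 49 degrees of freedom (table values).
T_90, T_95 = 1.6766, 2.0096

ci90 = t_interval(mean, sd, n, T_90)
ci95 = t_interval(mean, sd, n, T_95)
print("90% CI:", tuple(round(x, 3) for x in ci90))
print("95% CI:", tuple(round(x, 3) for x in ci95))
```

Only the critical value changes between the two calls, so the wider 95% interval comes entirely from asking for more confidence.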


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable  N   Mean    StDev   SE Mean  95% Upper Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552           -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable  N   Mean    StDev   SE Mean  95% CI            T      P
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)  -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable  N   Mean    StDev   SE Mean  95% Lower Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   2.8608           -0.72  0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable  N   Mean    StDev   SE Mean  95% Upper Bound  T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552           -0.72  0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.


Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified in the alternative hypothesis.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

One-sample t-test

Cholesterol example: suppose the population mean μ is 211 mg/ml

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test, we use the critical value determined from the t(24) distribution: $t_{0.05}(24) = 2.064$

$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
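The cholesterol calculation takes only a few lines to verify. This sketch uses the summary statistics from the example and the table value 2.064 quoted there; the function name `one_sample_t` is an illustrative choice.

```python
import math


def one_sample_t(x_bar, mu0, s, n):
    """t statistic for H0: mu = mu0, given the sample mean, sample SD and n."""
    return (x_bar - mu0) / (s / math.sqrt(n))


t = one_sample_t(x_bar=220, mu0=211, s=38.6, n=25)
T_CRIT_24_DF = 2.064               # two-sided 5% critical value, 24 df (from the example)
decision = "reject H0" if abs(t) >= T_CRIT_24_DF else "fail to reject H0"
print(f"t = {t:.2f}: {decision}")
```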

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



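Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

Two-sample t-test

$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74 \left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05}(20) = 2.086$

$|t| > t_{0.05}(20)$: reject $H_0$, accept $H_A$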
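The pooled-variance calculation for the healthy and cystic fibrosis groups can be reproduced from the summary statistics alone. The sketch below follows the formulas in the example; the function name `pooled_t` is illustrative, and 2.086 is the table value quoted there.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming a common variance.
    Returns (t, df, pooled variance, standard error of the difference)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, pooled_var, se


t, df, pooled_var, se = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
T_CRIT_20_DF = 2.086               # two-sided 5% critical value, 20 df (from the example)
print(f"s^2={pooled_var:.2f}  se={se:.2f}  t={t:.2f}  df={df}  reject H0: {abs(t) > T_CRIT_20_DF}")
```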

Two-sample t-test

MINITAB PROCEDURE – Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs: check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location  N   Mean    StDev   SE Mean
East      34  312702  188163  32270
West      26  339143  255571  50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,442. We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 or 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
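The reported DF = 44, rather than 34 + 26 - 2 = 58, suggests that this output did not pool the variances. Assuming it used the unequal-variance (Welch) form of the test, the T-value and degrees of freedom can be reproduced from the summary table as sketched below; the function name `welch_t` is illustrative, and truncating the degrees of freedom to a whole number is an assumption about how the software reports them.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance two-sample t statistic with Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # squared standard errors of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# East vs. West summary statistics from the output.
t, df = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(f"t = {t:.2f}, df = {math.floor(df)}")
```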

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$\bar{d} = \frac{\sum_{j=1}^{n} d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$

Paired t-test

$t = \frac{\bar{d} - \mu_d}{s_{\bar{d}}}$, which under $H_0$ reduces to $t = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$s_d = \sqrt{\frac{[(-1)^2 + 6^2 + \cdots + 5^2] - (32)^2 / 8}{8 - 1}} = 4.31$

$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$, so $H_0$ is rejected at the 5% level.

Paired t-test

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

When the paired measurements are positively correlated, $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "Beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chance of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
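
The trade-off between α, β and n can be checked numerically. The sketch below is illustrative only: it assumes a one-sided, one-sample z-test with known standard deviation, and the function name `type_ii_error` and the effect size of 0.5 are made up for the example rather than taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, effect, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = mu0 vs HA: mu = mu0 + effect.

    Assumes known sigma and effect > 0 (illustrative assumptions).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                     # cut-off for rejecting H0
    return z.cdf(z_crit - effect * sqrt(n) / sigma)   # P(fail to reject | HA true)


beta_base = type_ii_error(alpha=0.05, n=25, effect=0.5)
beta_more_n = type_ii_error(alpha=0.05, n=50, effect=0.5)         # larger n
beta_smaller_alpha = type_ii_error(alpha=0.01, n=25, effect=0.5)  # stricter alpha

print(round(beta_base, 3), round(beta_more_n, 3), round(beta_smaller_alpha, 3))
```

With α fixed, doubling n lowers β; with n fixed, lowering α raises β.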

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
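
The three probability sampling designs named above can be sketched in a few lines. This is a minimal illustration on a made-up patient list; the field names (`sex`, `clinic`), the helper names and the sample sizes are assumptions for the example, not part of the slides.

```python
import random


def simple_random_sample(units, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(units, k)


def stratified_sample(units, stratum_of, k_per_stratum, rng):
    """Draw a simple random sample of fixed size within each stratum."""
    strata = {}
    for u in units:
        strata.setdefault(stratum_of(u), []).append(u)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], k_per_stratum))
    return sample


def cluster_sample(units, cluster_of, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = {}
    for u in units:
        clusters.setdefault(cluster_of(u), []).append(u)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in clusters[c]]


rng = random.Random(2023)  # fixed seed so the sketch is reproducible
patients = [{"id": i, "sex": "F" if i % 2 else "M", "clinic": i % 5} for i in range(100)]

srs = simple_random_sample(patients, 10, rng)
strat = stratified_sample(patients, lambda p: p["sex"], 5, rng)
clus = cluster_sample(patients, lambda p: p["clinic"], 2, rng)
```

In each design the selection probability of a unit is known in advance, which is what makes the sample a probability sample.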

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:
- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value: the probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?
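
One way to make "the probability of the observed outcome assuming chance alone" concrete is an exact permutation test: if treatment X has no effect, the group labels are arbitrary, so we can count how often a random relabelling gives a difference at least as large as the one observed. The sketch below is a swapped-in illustration, not a method the slides describe, and the data values are invented.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(treated, control):
    """Two-sided p-value for a difference in means under 'chance alone'."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_treated of the pooled values to "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(mean(group_a) - mean(group_b))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        total += 1
    return extreme / total


treated = [12, 14, 15, 17]   # hypothetical outcome Y under treatment X
control = [8, 9, 10, 11]     # hypothetical outcome Y under control
p_value = exact_permutation_p_value(treated, control)
print(round(p_value, 4))
```

Here only 2 of the 70 possible relabellings are as extreme as the observed split, so the p-value is 2/70.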

Hypothesis testing

                Reality ->
Decision        H0 is true                       H0 is false
Retain H0       Correct Decision                 Type II Error (β) (0.2, 0.1)
Reject H0       Type I Error (α) (0.05, 0.01)    Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
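
As a rough sketch of such a calculation, the function below uses the common normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The formula choice, the function name `n_per_group` and the example effect size are assumptions for illustration; an exact t-based calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided Type I error
    z_beta = z.inv_cdf(1 - beta)         # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.20.
n = n_per_group(effect=0.5, sigma=1.0)
print(n)  # 63 per group under this approximation
```

Asking for a smaller β (more power) or a smaller expected effect increases the required n.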

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one tailed
- Otherwise, two tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size - Number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- You must know the level of measurement of the variables to choose the correct statistic

Summary

- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables
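
To illustrate the χ² entry in the summary, here is a minimal sketch of the chi-square test of association for a 2x2 table. The counts are hypothetical, no continuity correction is applied, and the p-value shortcut `erfc(sqrt(stat / 2))` is valid only for 1 degree of freedom, which is the case for a 2x2 table.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (df = 1) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


# Hypothetical counts: rows = exposure (yes/no), columns = outcome (yes/no).
stat, p_value = chi_square_2x2([[30, 20], [20, 30]])
print(round(stat, 2), round(p_value, 4))
```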

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 27: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS


Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
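
The two intervals can be reproduced from the summary statistics with mean ± t × SE. In this sketch the critical values for 49 degrees of freedom are copied from a standard t table and hard-coded (an assumption of the example, since the Python standard library has no t quantile function).

```python
from math import sqrt

n, mean, stdev = 50, 2.9580, 0.4101   # summary statistics from the GPA output
se = stdev / sqrt(n)                  # standard error of the mean, about 0.0580

# Two-sided critical values of t with df = n - 1 = 49, taken from a t table.
T_CRIT_DF49 = {0.90: 1.6766, 0.95: 2.0096}


def confidence_interval(level):
    """Return (lower, upper) bounds of the CI for the population mean."""
    margin = T_CRIT_DF49[level] * se
    return mean - margin, mean + margin


ci90 = confidence_interval(0.90)
ci95 = confidence_interval(0.95)
print([round(x, 4) for x in ci90], [round(x, 4) for x in ci95])
```

The 95% interval is wider because its critical value is larger, which is the point made in the last line of the slide.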


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
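
The t statistic and the three p-values in the output can be recomputed from the summary statistics. The sketch below is not how Minitab does it: it integrates the Student t density numerically with Simpson's rule so that only the standard library is needed, and the helper names `t_pdf` and `t_cdf` are made up for the example.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and the symmetry of t."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


n, x_bar, s, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (x_bar - mu0) / (s / sqrt(n))       # about -0.72
p_left = t_cdf(t_stat, n - 1)                # HA: mu < 3
p_right = 1 - p_left                         # HA: mu > 3
p_two_sided = 2 * min(p_left, p_right)       # HA: mu != 3
print(round(t_stat, 2), round(p_left, 3), round(p_two_sided, 3), round(p_right, 3))
```

Because the sample mean lies below 3, the left-tailed p-value is the smallest of the three, yet it is still well above α = 0.05.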


Two sample T-test

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211.

Sample (town): $\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

$H_0: \mu = 211$ mg/ml

$H_A: \mu \neq 211$ mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t_{24,\,0.05} = 2.064$$

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
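
The same table-method comparison, written as a short sketch. The critical value 2.064 is the one quoted from the t table for 24 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt

x_bar, mu0, s, n = 220.0, 211.0, 38.6, 25   # cholesterol example, mg/ml

t_stat = (x_bar - mu0) / (s / sqrt(n))      # (220 - 211) / (38.6 / 5)
T_CRIT = 2.064                              # two-sided table value, alpha = 0.05, df = 24
reject_h0 = abs(t_stat) > T_CRIT

print(round(t_stat, 2), reject_h0)
```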

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

Degree of Freedom (df)

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2 versus HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


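Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

$$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$$

                  N     Mean    Std Dev
Healthy           9     18.9    5.9
Cystic Fibrosis   13    11.9    6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$t_{20,\,0.05} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{20,\,0.05} \;\Rightarrow\; \text{reject } H_0, \text{ accept } H_A$$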
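
The pooled-variance calculation above can be checked with a short sketch that works from the summary statistics only. The function name `pooled_t` is illustrative, and the critical value is the table value quoted in the example.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming a common population variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the mean difference
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff


# Healthy (n = 9) vs cystic fibrosis (n = 13)
t_stat, df, pooled_var, se_diff = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)

T_CRIT = 2.086                     # two-sided table value, alpha = 0.05, df = 20
reject_h0 = abs(t_stat) > T_CRIT
print(round(pooled_var, 2), round(se_diff, 2), round(t_stat, 2), df, reject_h0)
```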

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs: Check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis: Sale Price, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10

Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
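
The T-Value and DF in this output can be reproduced from the summary table. The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests the unequal-variance form of the test with the Welch-Satterthwaite degrees of freedom; that reading is an inference from the numbers, not something the slides state, and the function name `welch_t` is illustrative.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic without assuming equal variances."""
    v1 = sd1 ** 2 / n1            # squared standard error of mean 1
    v2 = sd2 ** 2 / n2            # squared standard error of mean 2
    t_stat = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# East (n = 34) vs West (n = 26) sale prices
t_stat, df = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(round(t_stat, 2), int(df))
```

The approximate degrees of freedom are truncated to a whole number, which matches the DF shown in the output.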

Paired t-test

Test to deal with two observations with strong comparability (e.g. two treatments on the same individuals, or one individual before vs. after treatment; very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Example: Test infection of virus on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2/8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

so $H_0$ is rejected at the 0.05 level.
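
The virus example can be verified directly from the raw counts in the table. This is a minimal sketch using the standard library; the critical value 2.365 is the table value quoted above for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]   # X1j: spots with virus 1, plants 1..8
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]   # X2j: spots with virus 2, same plants

diffs = [a - b for a, b in zip(virus1, virus2)]   # d_j = X1j - X2j
d_bar = mean(diffs)                               # mean difference
s_d = stdev(diffs)                                # sample SD of the differences (n - 1)
se_d = s_d / sqrt(len(diffs))                     # standard error of the mean difference
t_stat = d_bar / se_d

T_CRIT = 2.365                                    # two-sided table value, alpha = 0.05, df = 7
reject_h0 = abs(t_stat) > T_CRIT
print(d_bar, round(s_d, 2), round(se_d, 2), round(t_stat, 2), reject_h0)
```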

Advantages of the paired t-test:
1) Usually it is easy to find a true small difference
2) There is no need to consider whether the variances of the two populations are the same or not

$$s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$$


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Decision rules, continued (don't mix them up):

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Decision rules, continued (don't mix them up):

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                    NULL HYPOTHESIS
Decision                            TRUE                              FALSE
Reject the null hypothesis          Type I error (α):                 CORRECT
                                    rejecting a true null hypothesis
Fail to reject the null hypothesis  CORRECT                           Type II error (β):
                                                                      failing to reject a false null hypothesis

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true                          Reality: H0 is false
Retain H0     Correct decision                             Type II error (β; common values 0.2, 0.1)
Reject H0     Type I error (α; common values 0.05, 0.01)   Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
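The sample-size calculation described above can be sketched with a normal approximation. This is a minimal illustration, not the method of any particular package: it assumes a two-sided comparison of two group means with equal group sizes, and the function name `n_per_group` and the standardized effect size of 0.5 are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, beta=0.20):
    """Approximate per-group sample size for a two-sided, two-group test of means.

    effect_size is the expected difference divided by the common standard
    deviation (a hypothetical input for this sketch).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error rate
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


n = n_per_group(0.5)            # beta = 0.2, i.e. power of 0.8
n_strict = n_per_group(0.5, beta=0.10)
print(n, n_strict)
```

Lowering β from 0.2 to 0.1 raises the required sample size, which is the trade-off power analysis makes explicit.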

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise, two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test - compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²) - tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- You must know the level of measurement of the variables to choose the correct test
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
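The two intervals above can be reproduced from the summary statistics alone. The sketch below is only an illustration: the critical values in `T_CRIT` are assumed to come from a t table for 49 degrees of freedom, and the helper name `t_interval` is hypothetical.

```python
from math import sqrt

# Summary statistics from the GPA output above.
n, mean, sd = 50, 2.9580, 0.4101

# Assumed two-sided critical values of t with n - 1 = 49 degrees of freedom,
# read from a t table rather than computed here.
T_CRIT = {0.90: 1.6766, 0.95: 2.0096}


def t_interval(mean, sd, n, level):
    """Return (lower, upper) bounds for the mean at the given confidence level."""
    margin = T_CRIT[level] * sd / sqrt(n)  # critical value times the standard error
    return mean - margin, mean + margin


ci90 = t_interval(mean, sd, n, 0.90)
ci95 = t_interval(mean, sd, n, 0.95)
print([round(x, 2) for x in ci90], [round(x, 2) for x in ci95])
```

Only the critical value changes between the two calls, which is why the 95% interval is wider than the 90% interval for the same data.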

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
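To see where the three p-values for the GPA data come from, the sketch below recomputes the test statistic from the summary statistics and integrates the t density numerically. It is a standard-library illustration, not the routine a statistics package uses; the helper names `t_pdf` and `t_cdf` and the Simpson's-rule step count are assumptions of this example.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density from 0 to t with Simpson's rule."""
    h = t / steps  # negative when t < 0, which makes the integral negative
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t_stat = (mean - mu0) / (sd / sqrt(n))
p_lower = t_cdf(t_stat, n - 1)            # HA: mu < 3
p_upper = 1 - p_lower                     # HA: mu > 3
p_two_sided = 2 * min(p_lower, p_upper)   # HA: mu != 3
print(round(t_stat, 2), round(p_lower, 3), round(p_two_sided, 3), round(p_upper, 3))
```

The same t statistic yields three different p-values, depending only on which tail or tails the alternative hypothesis points to.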

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{220 - 211}{38.6/\sqrt{25}} = 1.17$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
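The arithmetic of the cholesterol example can be checked in a few lines. This is a minimal sketch using only the numbers given in the example; the critical value 2.064 is taken from the example's t table rather than computed.

```python
from math import sqrt

# Values from the cholesterol example above.
mu0, xbar, s, n = 211, 220, 38.6, 25
t_crit = 2.064  # tabled two-sided critical value of t(24) at alpha = 0.05

t_stat = (xbar - mu0) / (s / sqrt(n))  # distance from mu0 in standard errors
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 2), reject_h0)
```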

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.

Two-sample t-test

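We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

$s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \dfrac{754.76}{20} \approx 37.7$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{37.7\left(\dfrac{1}{9} + \dfrac{1}{13}\right)} = 2.66$

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \dfrac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05,\,20} = 2.086$

$|t| > t_{0.05,\,20}$: reject $H_0$, accept $H_A$.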
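The pooled-variance calculation above can be written as a short function. This is a sketch that follows the formulas in the example; the function name `pooled_t` is hypothetical and the critical value is the tabled one quoted in the example.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    return (mean1 - mean2) / se_diff, df


# Healthy versus cystic fibrosis groups from the table above.
t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_crit = 2.086  # tabled two-sided critical value of t(20) at alpha = 0.05
print(round(t_stat, 2), df, abs(t_stat) > t_crit)
```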

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval -$126602 < μ (East) - μ (West) < $73719.

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
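The reported DF = 44 is smaller than the pooled value n1 + n2 - 2 = 58, which is consistent with an unpooled (Welch) test. Assuming that is what was run, the sketch below recomputes the t value and the Welch-Satterthwaite degrees of freedom from the summary table; the function name `welch_t` is hypothetical.

```python
from math import sqrt


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# East versus West sale prices from the table above.
t_stat, df = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(round(t_stat, 2), int(df))  # df is truncated to a whole number
```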

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$\bar{d} = \dfrac{\sum d_j}{n}, \qquad s_d = \sqrt{\dfrac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \dfrac{s_d}{\sqrt{n}}$

$t = \dfrac{\bar{d}}{s_{\bar{d}}} = \dfrac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

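Paired t-test

Example: testing virus infection on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$s_d = \sqrt{\dfrac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2/8}{8 - 1}} = \sqrt{18.57} = 4.31$

$s_{\bar{d}} = \dfrac{4.31}{\sqrt{8}} = 1.52$

$t = \dfrac{\bar{d}}{s_{\bar{d}}} = \dfrac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$, so $H_0$ is rejected.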
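The paired calculation for the tobacco-leaf data can be reproduced directly from the two columns of counts. This is a minimal standard-library sketch; the variable names are illustrative, and the critical value is the tabled one quoted in the example.

```python
from math import sqrt
from statistics import mean, stdev

# Counts for the eight plants in the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]  # per-plant differences
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                      # sample standard deviation (n - 1 denominator)
t_stat = d_bar / (s_d / sqrt(n))
t_crit = 2.365                      # tabled two-sided critical value of t(7) at alpha = 0.05
print(d_bar, round(s_d, 2), round(t_stat, 2), t_stat > t_crit)
```

Working on the differences reduces the problem to a one-sample t-test of the mean difference against zero.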

Paired t-test

Advantages

1) It is usually easier to detect a true small difference

2) There is no need to consider whether the variances of the two populations are the same or not

22

21 xxd ss

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic.

- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

- If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic.

- Reject the null hypothesis if the test statistic falls in the critical region.

- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method:

- Reject the null hypothesis if the p-value is less than α.

- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                                                 FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null hypothesis   CORRECT
Fail to reject the null hypothesis   CORRECT                                              Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under-reacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
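The relationship among α, β and n can be made concrete with a simple case. The sketch below assumes a one-sided, one-sample z test with known standard deviation; the function name `beta` and the effect size of 0.3 standard deviations are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def beta(alpha, n, effect_size):
    """Type II error probability of a one-sided, one-sample z test.

    effect_size is the true shift in standard-deviation units (hypothetical).
    """
    z_crit = Z.inv_cdf(1 - alpha)  # rejection threshold under H0
    return Z.cdf(z_crit - effect_size * sqrt(n))


# Tabulate beta for two alpha levels and three sample sizes.
for alpha in (0.05, 0.01):
    for n in (25, 50, 100):
        print(alpha, n, round(beta(alpha, n, 0.3), 3))
```

Reading down the printed rows for a fixed α shows β falling as n grows, and comparing the two α levels at the same n shows β rising when α is lowered.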

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.


Inferential Statistics

One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- We are 90% confident that μ (the population mean) is between 2.86 and 3.06.

- We are 95% confident that μ (the population mean) is between 2.84 and 3.07.

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
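As a rough check of the two intervals above, the sketch below recomputes them from the summary statistics as mean ± t × s/√n. The helper name `t_confidence_interval` is made up for this illustration, and the two critical values for 49 degrees of freedom are assumed table values that do not appear on the slide.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)          # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin

# Summary statistics taken from the GPA output.
mean, sd, n = 2.9580, 0.4101, 50

# Assumed table values for df = n - 1 = 49 (not printed on the slide).
T_CRIT_90 = 1.6766   # upper 5% point of t(49)
T_CRIT_95 = 2.0096   # upper 2.5% point of t(49)

ci90 = t_confidence_interval(mean, sd, n, T_CRIT_90)
ci95 = t_confidence_interval(mean, sd, n, T_CRIT_95)

print("90% CI:", tuple(round(v, 4) for v in ci90))
print("95% CI:", tuple(round(v, 4) for v in ci95))
print("95% wider than 90%:", (ci95[1] - ci95[0]) > (ci90[1] - ci90[0]))
```

Only the critical value changes between the two calls, which is why the 95% interval is necessarily the wider one.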


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one tail to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two tails)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one tail to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

We reject H0 whenever the p-value is less than α.

p-value = 0.236 > α = 0.05

We therefore fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
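The three p-values reported for the GPA data (0.236, 0.472, 0.764) can be approximated without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule. This is an illustrative stand-in, not the method Minitab uses, and `t_pdf` and `t_cdf` are hypothetical helper names.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics and hypothesized mean from the GPA example.
mean, sd, n, mu0 = 2.9580, 0.4101, 50, 3.0
df = n - 1
t_stat = (mean - mu0) / (sd / math.sqrt(n))

p_less = t_cdf(t_stat, df)                    # HA: mu < 3
p_greater = 1 - p_less                        # HA: mu > 3
p_two_sided = 2 * min(p_less, p_greater)      # HA: mu != 3

alpha = 0.05
print(f"t = {t_stat:.2f}")
print(f"p (less) = {p_less:.3f}, p (two-sided) = {p_two_sided:.3f}, p (greater) = {p_greater:.3f}")
print("reject H0 (mu < 3)?", p_less < alpha)
```

The three alternatives share one test statistic; only the tail area that counts as "at least as extreme" changes.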


One-sample t-test

Non-directional (2-tailed test): In this form of the test, a departure from the null value can be observed at either end of the distribution, so no direction for the expected result is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

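We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086$$

Since |t| = 2.63 > t(0.05, 20) = 2.086, we reject H0 and accept HA.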
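The pooled-variance calculation for the healthy vs. cystic fibrosis comparison (n = 9 and 13, means 18.9 and 11.9, standard deviations 5.9 and 6.3) can be reproduced in a few lines. This is a minimal sketch. The function name `pooled_t` is invented for illustration, and the critical value 2.086 is the tabled value quoted on the slide.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    t_stat = (mean1 - mean2) / se_diff
    return t_stat, df, pooled_var, se_diff

t_stat, df, pooled_var, se_diff = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)

T_CRIT = 2.086                      # tabled two-tailed value for alpha = 0.05, df = 20
reject = abs(t_stat) > T_CRIT

print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0?", reject)
```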

Two-sample t-test

MINITAB PROCEDURE – Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The sample means differ by $26,441 (East below West). We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 (and 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
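The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests the output comes from the unequal-variance form of the two-sample test. Assuming that, the sketch below recomputes the T-value, the Welch-Satterthwaite degrees of freedom, and the 90% interval from the printed summary statistics. `welch_t` is a hypothetical helper, and the critical value 1.6802 for 44 degrees of freedom is an assumed table value.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance two-sample t statistic and Welch-Satterthwaite df."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors
    se = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df, se

# East and West summary statistics from the SalePrice output.
t_stat, df, se = welch_t(312702, 188163, 34, 339143, 255571, 26)

T_CRIT_90 = 1.6802                  # assumed upper 5% point of t(44)
diff = 312702 - 339143
lower, upper = diff - T_CRIT_90 * se, diff + T_CRIT_90 * se

print(f"t = {t_stat:.2f}, df = {df:.1f} (truncated: {math.floor(df)})")
print(f"90% CI for the difference: ({lower:.0f}, {upper:.0f})")
```

Small differences from the printed interval are expected because the summary statistics on the slide are rounded.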

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

Paired t-test

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t
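The paired calculation for the tobacco-leaf data can be checked directly from the eight pairs of counts. This is a minimal sketch using only the standard library. The variable names are illustrative, and 2.365 is the tabled critical value quoted on the slide for 7 degrees of freedom.

```python
import math
import statistics

# Counts for plants 1..8 under each virus, as listed in the table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [a - b for a, b in zip(virus1, virus2)]   # d_j = X_1j - X_2j
n = len(diffs)

d_bar = statistics.mean(diffs)        # mean difference
s_d = statistics.stdev(diffs)         # sample SD of the differences (n - 1 divisor)
se = s_d / math.sqrt(n)               # standard error of the mean difference
t_stat = d_bar / se

T_CRIT = 2.365                        # tabled two-tailed value for alpha = 0.05, df = 7
reject = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("reject H0?", reject)
```

Because the test works on the differences alone, the separate variances of the two samples never enter the calculation.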

Paired t-test

Advantages

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

The relevant standard error is $s_{\bar{d}}$ rather than the two-sample $s_{\bar{x}_1 - \bar{x}_2}$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                                   FALSE
Reject the null hypothesis            Type I error (α):                      CORRECT
                                      rejecting a true null hypothesis
Fail to reject the null hypothesis    CORRECT                                Type II error (β):
                                                                             failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that the defendant is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
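The trade-off between α, β and n can be made concrete with a simplified model. The sketch below assumes a one-sided z-test with known standard deviation (H0: μ = μ0 against HA: μ > μ0) and a hypothetical true shift of 0.3 standard deviations. Neither assumption comes from the slides, and `type_ii_error` is an illustrative helper name.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sigma=1.0):
    """beta for a one-sided z-test when the true mean exceeds mu0 by `effect`."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)          # rejection cutoff on the z scale
    # H0 is retained when Z < z_crit; under the alternative Z is shifted by effect*sqrt(n)/sigma.
    return z.cdf(z_crit - effect * n ** 0.5 / sigma)

effect = 0.3   # hypothetical shift, in units of sigma

for alpha in (0.10, 0.05, 0.01):
    for n in (25, 50, 100):
        beta = type_ii_error(alpha, n, effect)
        print(f"alpha = {alpha:<4}  n = {n:<3}  beta = {beta:.3f}")
```

Reading across a fixed α shows β falling as n grows. Reading down a fixed n shows β rising as α is tightened.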

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

- Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
- We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

- At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.
- All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
  - Chance
  - Real effect of treatment X

Planning a study

- The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
- All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
- If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
- Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

- The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
- In this case, you are trying to disprove that treatment X has no effect on outcome Y.
- You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

- If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
- Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

- The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?
- This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

- If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.
- How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                       H0 is false
Retain H0        Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

- Power analysis is used to try to minimize Type II errors.
- Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
- Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
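As one illustration of how a sample size follows from a chosen α, β and expected effect, the sketch below uses the textbook normal-approximation formula for a two-sided one-sample test with known standard deviation, n = ((z for 1 - α/2) + (z for 1 - β))² σ² / δ². The formula choice, the function name `sample_size`, and the example effect of half a standard deviation are assumptions made for this sketch, not something taken from the slides.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Smallest n keeping the Type II error at or below beta (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)     # two-sided significance cutoff
    z_beta = z.inv_cdf(1 - beta)           # quantile matching the desired power 1 - beta
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: expected effect of 0.5 standard deviations.
n_power_80 = sample_size(effect=0.5, sigma=1.0, alpha=0.05, beta=0.2)
n_power_90 = sample_size(effect=0.5, sigma=1.0, alpha=0.05, beta=0.1)

print("n for beta = 0.2:", n_power_80)
print("n for beta = 0.1:", n_power_90)
```

Demanding a smaller β (higher power) for the same effect and α always pushes the required sample size up.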

Point estimation

- Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
- Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
- If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

- Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
- The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.
- The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

- Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
- Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
- So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

- Typically one is interested in comparing group means.
- If the outcome is continuous and there is one independent variable:
  - Two groups: t-test
  - Three or more groups: ANOVA

Choosing the right test

- If the outcome is continuous and there is more than one independent variable:
  - ANOVA if all independent variables are categorical
  - ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed.
- Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- df = sample size - number of parameters estimated
- df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

Common Inferential Stats

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive statistics summarize measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions.
  - Pearson r, t-tests and ANOVA are examples of parametric statistics.
  - Pearson r measures the relationship or association between 2 variables.
- The t-test determines whether there is a significant difference between 2 group means.
- ANOVA determines whether there is a significant difference among 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 30: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Variable N Mean StDev SE Mean 90 CI

GPAs 50 29580 04101 00580 (28608 30552)

Variable N Mean StDev SE Mean 95 CI

GPAs 50 29580 04101 00580 (28414 30746)

Sample mean = 296

- 90 confident that μ (population mean) is between 286 and 306

- 95 confident that μ (population mean) is between 284 and 307

95 CI width gt 90 CI width

The larger confidence coefficient the greater the CI width

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs

Test of μ = 3 vs lt 3 (directional one-tail to the right) 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

Test of μ = 3 vs ne 3 (non-directional two-tail)

Variable N Mean StDev SE Mean 95 CI T PGPAs 50 29580 04101 00580 (28414 30746) -072 0472

Test of μ = 3 vs gt 3 (directional one-tail to the left)

95 LowerVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 28608 -072 0764

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Test of μ = 3 vs lt 3 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Two sample T-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null hypothesis states equality (=) and the alternative hypothesis states only that there is a difference (≠).

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

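We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group            N    Mean   Std Dev
Healthy          9    18.9   5.9
Cystic Fibrosis  13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

$$t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A$$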
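The pooled-variance calculation for the healthy vs. cystic fibrosis example can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not part of the original slides: the helper name `pooled_t_test` is hypothetical, and the critical value 2.086 is copied from the slide's table value rather than computed.

```python
import math

def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Summary statistics from the slide: healthy (n=9) vs. cystic fibrosis (n=13).
t, df, pooled_var, se_diff = pooled_t_test(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # tabled two-tailed critical value t(0.05, 20), taken from the slide
reject_h0 = abs(t) > T_CRIT

print(f"df = {df}, s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Working from unrounded intermediate values gives a t statistic that agrees with the hand calculation to two decimals.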

Two-sample t-test

MINITAB PROCEDURE – Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose "Samples in one column"

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check "Boxplots of data", OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis: Sale Price, from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location  N   Mean    StDev   SE Mean
East      34  312702  188163  32270
West      26  339143  255571  50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05 or 0.10, so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
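The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which suggests that this output comes from the unpooled (Welch) form of the two-sample t-test rather than the pooled form. That is an inference from the numbers, not something the slides state. A minimal Python sketch under that assumption, with a hypothetical helper name `welch_t_test`:

```python
import math

def welch_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df (hypothetical helper)."""
    v1 = sd1**2 / n1  # squared standard error of mean 1
    v2 = sd2**2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

# Summary statistics from the Minitab output: East (n=34) vs. West (n=26).
t, df, se_diff = welch_t_test(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t:.2f}, df (truncated) = {math.floor(df)}")
```

Truncating the approximate df to a whole number reproduces the DF shown in the output, which is consistent with the assumption above.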

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on the leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$
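The paired calculation for the tobacco-leaf data can be checked directly from the raw counts. This is a minimal Python sketch using only the standard library; the variable names are illustrative, and the critical value 2.365 is copied from the slide rather than computed.

```python
import math
import statistics

# Lesion counts per plant from the slide's table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work on the within-plant differences d_j = X1j - X2j.
diffs = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)   # sample standard deviation (n - 1 denominator)
se_d = sd_d / math.sqrt(n)       # standard error of the mean difference
t = mean_d / se_d

T_CRIT = 2.365  # tabled two-tailed critical value t(0.05, 7), taken from the slide
reject_h0 = abs(t) > T_CRIT

print(f"mean d = {mean_d}, s_d = {sd_d:.2f}, SE = {se_d:.2f}, t = {t:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```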

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Step 3, decision rule 1 (table method): Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

Step 3, decision rule 2 (graph method): Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

Step 3, decision rule 3 (p-value method): Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
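The table method and the p-value method are each a single comparison, which a short sketch can make concrete. The function names below are hypothetical, and the inputs are values quoted elsewhere in these slides (t = 2.63 against the tabled 2.086, and the Minitab p-value 0.660 against α = 0.05).

```python
def table_method(t_stat, t_crit):
    """Decision rule 1: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def p_value_method(p_value, alpha):
    """Decision rule 3: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Pooled two-sample example: t = 2.63, tabled t(0.05, 20) = 2.086.
decision_table = table_method(2.63, 2.086)

# Real-estate example: p-value = 0.660 at alpha = 0.05.
decision_p = p_value_method(0.660, 0.05)

print(decision_table)
print(decision_p)
```

Applied to the same test with matching tails and α, the two rules give the same decision, which is why the slides warn against mixing them within one analysis.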

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the following hypotheses before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS TRUE           NULL HYPOTHESIS FALSE
Reject the null hypothesis           Type I error (α):              CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                        Type II error (β):
                                                                    failing to reject a
                                                                    false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.1%         Chance 1 in 1000
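The conversion between a Type I error percentage and "1 in N" odds is just 100 divided by the percentage. A small illustrative Python sketch (the function name `one_in_n` is hypothetical):

```python
def one_in_n(alpha_percent):
    """Convert a Type I error rate given in percent to the N of '1 in N'."""
    return round(100 / alpha_percent)

for pct in (20, 5, 1, 0.1):
    print(f"{pct}% chance = 1 in {one_in_n(pct)}")
```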

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
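The relationships among α, β, and n can be illustrated numerically. The sketch below is not from the slides: it assumes a one-sided (upper-tail) z-test with known σ, and the effect size and σ are hypothetical values chosen only for illustration. It uses `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sigma):
    """beta for a one-sided z-test of H0: mu = mu0 vs HA: mu = mu0 + effect (assumed setup)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff on the z scale
    # Probability that the test statistic stays below the cutoff when HA is true.
    return std_normal.cdf(z_crit - effect * math.sqrt(n) / sigma)

EFFECT, SIGMA = 0.5, 1.0  # hypothetical true effect and known standard deviation

beta_base = type_ii_error(0.05, 25, EFFECT, SIGMA)
beta_larger_n = type_ii_error(0.05, 50, EFFECT, SIGMA)       # same alpha, larger n
beta_smaller_alpha = type_ii_error(0.01, 25, EFFECT, SIGMA)  # same n, smaller alpha

print(f"alpha=0.05, n=25: beta = {beta_base:.3f}")
print(f"alpha=0.05, n=50: beta = {beta_larger_n:.3f}")
print(f"alpha=0.01, n=25: beta = {beta_smaller_alpha:.3f}")
```

Under these assumptions, doubling n lowers β at fixed α, while tightening α at fixed n raises β.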

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
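The three probability sampling strategies named above differ in what is drawn at random. The following Python sketch is illustrative only: the population, the stratum names (`clinic_A`, `clinic_B`), and the cluster names (`ward_0` to `ward_9`) are all hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

rng = random.Random(2023)             # fixed seed so the sketch is repeatable
population = list(range(1, 101))      # hypothetical subject IDs 1..100
strata = {"clinic_A": population[:60], "clinic_B": population[60:]}
clusters = {f"ward_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)

print(len(srs), {name: len(s) for name, s in strat.items()}, len(clus))
```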

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the p-value (the observed significance level) of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Reality ->     H0 is true                        H0 is false
Decision
Retain H0      Correct Decision                  Type II Error (β) (.2, .1)
Reject H0      Type I Error (α) (.05, .01)       Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
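A sample-size calculation of this kind can be sketched with the normal approximation. The code below is an illustration, not the slides' method: it assumes a two-sided one-sample z-test with known σ, and the expected effect and σ passed in are hypothetical. A t-based calculation would give a slightly larger n.

```python
import math
from statistics import NormalDist

def required_n(effect, sigma, alpha=0.05, beta=0.2):
    """Smallest n with Type II error at most beta (normal approximation, assumed setup)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(1 - beta)        # quantile matching the target power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical expected effect of half a standard deviation.
n_power_80 = required_n(effect=0.5, sigma=1.0, alpha=0.05, beta=0.2)
n_power_90 = required_n(effect=0.5, sigma=1.0, alpha=0.05, beta=0.1)

print(f"n for beta = 0.2: {n_power_80}")
print(f"n for beta = 0.1: {n_power_90}")
```

Asking for a smaller β (higher power) at the same α and effect size requires a larger sample.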

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size – Number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

Common Inferential Stats

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

T test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ²: a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or the t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups, i.e., matched on everything but the treatment (the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or the t-test for paired samples (hand calculation).

Page 31: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential Statistics

One-Sample T: GPAs

Variable  N   Mean    StDev   SE Mean  90% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8608, 3.0552)

Variable  N   Mean    StDev   SE Mean  95% CI
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
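Both intervals follow from mean ± t × SE Mean. The Python sketch below reproduces them from the summary statistics; the two critical values for 49 degrees of freedom are standard table values entered by hand (an assumption of this sketch, since the slides do not list them), and the function name `confidence_interval` is hypothetical.

```python
import math

# Summary statistics from the Minitab output for GPAs.
n, mean, sd = 50, 2.9580, 0.4101
se = sd / math.sqrt(n)  # SE Mean

# Two-sided t critical values for df = n - 1 = 49, entered from a t table (assumption).
T_CRIT = {90: 1.6766, 95: 2.0096}

def confidence_interval(level):
    """Return (lower, upper) for the given confidence level in percent."""
    margin = T_CRIT[level] * se
    return mean - margin, mean + margin

ci90 = confidence_interval(90)
ci95 = confidence_interval(95)

print(f"SE Mean = {se:.4f}")
print(f"90% CI = ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI = ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

The 95% interval is wider only because its critical value is larger; the standard error is the same in both.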


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

                                       95% Upper
Variable  N   Mean    StDev   SE Mean  Bound     T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552    -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable  N   Mean    StDev   SE Mean  95% CI            T      P
GPAs      50  2.9580  0.4101  0.0580   (2.8414, 3.0746)  -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

                                       95% Lower
Variable  N   Mean    StDev   SE Mean  Bound     T      P
GPAs      50  2.9580  0.4101  0.0580   2.8608    -0.72  0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

                                       95% Upper
Variable  N   Mean    StDev   SE Mean  Bound     T      P
GPAs      50  2.9580  0.4101  0.0580   3.0552    -0.72  0.236

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sample mean) that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
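The three p-values in the GPA output are tied together by the sign of t. The sketch below recomputes t from the summary statistics and derives the two one-tailed p-values from the two-tailed one; the two-tailed p-value 0.472 is copied from the Minitab output rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics and hypothesized mean from the GPA example.
n, mean, sd, mu0 = 50, 2.9580, 0.4101, 3.0
t = (mean - mu0) / (sd / math.sqrt(n))

P_TWO_TAILED = 0.472  # copied from the Minitab output, not computed here

# For a symmetric distribution, the tail on the same side as t holds half
# of the two-tailed p-value; the opposite tail is its complement.
p_less = P_TWO_TAILED / 2 if t < 0 else 1 - P_TWO_TAILED / 2
p_greater = 1 - p_less

ALPHA = 0.05
decision = "reject H0" if p_less < ALPHA else "fail to reject H0"

print(f"t = {t:.2f}, p(< 3) = {p_less:.3f}, p(> 3) = {p_greater:.3f}")
print(decision)
```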

Inferential Statistics

Two sample T-test

One-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null hypothesis states equality (=) and the alternative hypothesis states only that there is a difference (≠).

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test, we use the critical value determined from the t(24) distribution:

$$t_{0.05,\,24} = 2.064$$

$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
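The cholesterol example is a one-sample t-test that can be checked in a few lines. In this minimal Python sketch the critical value 2.064 is copied from the slide's table value, not computed, and the variable names are illustrative.

```python
import math

# Town sample vs. hypothesized population mean (values from the slide).
n, xbar, s, mu0 = 25, 220.0, 38.6, 211.0

se = s / math.sqrt(n)    # standard error of the sample mean
t = (xbar - mu0) / se    # one-sample t statistic with n - 1 = 24 df

T_CRIT = 2.064  # tabled two-tailed critical value t(0.05, 24), taken from the slide
decision = "reject H0" if abs(t) >= T_CRIT else "fail to reject H0"

print(f"SE = {se:.2f}, t = {t:.2f}")
print(decision)
```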

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation falls outside the standard, we reject the hypothesis that the sample is representative of the population.


Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                     FALSE
Reject the null hypothesis            Type I error (α)         CORRECT
Fail to reject the null hypothesis    CORRECT                  Type II error (β)

Type I error (α): rejecting a true null hypothesis.
Type II error (β): failing to reject a false null hypothesis.

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error has been made. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
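The trade-off between α, β, and n can be checked numerically. The sketch below assumes a one-sided z-test with known population standard deviation, which the slides do not specify; the function name `type_ii_error` and the effect size and standard deviation used are hypothetical.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float, sigma: float) -> float:
    """Beta for a one-sided z-test of H0: mu = mu0 against HA: mu = mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection cutoff in z units
    shift = effect * n ** 0.5 / sigma        # how far HA moves the z statistic
    return std_normal.cdf(z_crit - shift)    # P(fail to reject H0 | HA is true)


# Hypothetical setting: true effect 0.5, population SD 2.
for alpha, n in [(0.05, 25), (0.05, 100), (0.01, 25)]:
    beta = type_ii_error(alpha, n, effect=0.5, sigma=2.0)
    print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta:.3f}")
```

Comparing the first two printed rows shows β falling as n grows with α fixed; comparing the first and third shows β rising when α is lowered with n fixed.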

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                 Reality
Decision         H0 is true                         H0 is false
Retain H0        Correct decision                   Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)      Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
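A sample-size calculation of this kind can be sketched with the normal approximation for a two-sided one-sample test. This is a simplification chosen for illustration, not necessarily the method behind the slides; the function name `required_sample_size` and the effect size and standard deviation in the usage line are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(effect: float, sigma: float,
                         alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate n for a two-sided one-sample z-test (normal approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = std_normal.inv_cdf(1 - beta)         # quantile matching power 1 - beta
    n = ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)                           # round up to a whole subject


# Hypothetical planning values: expected effect 0.5, SD 1.0, alpha 0.05, beta 0.20.
print(required_sample_size(effect=0.5, sigma=1.0))
```

Halving the expected effect roughly quadruples the required n, which is why the expected effect has to be specified before the study rather than after.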

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.
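The dependence of the p-value on sample size is easy to see by holding the observed effect fixed and varying n. The sketch below assumes a one-sample z-test with known standard deviation; the function name `two_sided_p` and the numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(mean_diff: float, sigma: float, n: int) -> float:
    """Two-sided p-value of a one-sample z-test for a given observed difference."""
    z = mean_diff / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same hypothetical effect (0.2 units, SD 1) observed with different sample sizes.
for n in (10, 100, 1000):
    print(f"n={n:4d}  p={two_sided_p(0.2, 1.0, n):.4f}")
```

The effect is identical in all three rows; only n changes, yet the p-value moves from clearly non-significant to highly significant.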

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated

df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, each requiring analysis of assumptions.
- Pearson r, t tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

The t test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


One-Sample T: GPAs

Variable   N    Mean     StDev    SE Mean   90% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8608, 3.0552)

Variable   N    Mean     StDev    SE Mean   95% CI
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)

Sample mean = 2.96

- 90% confident that μ (population mean) is between 2.86 and 3.06

- 95% confident that μ (population mean) is between 2.84 and 3.07

95% CI width > 90% CI width

The larger the confidence coefficient, the greater the CI width.
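The two intervals can be recomputed from the summary statistics in the output (n = 50, mean 2.9580, StDev 0.4101). The sketch below is a standard-library approximation, not Minitab's algorithm: the helper names `t_pdf`, `t_cdf`, `t_quantile`, and `mean_ci` are hypothetical, and the t quantile is obtained by numerical integration and bisection rather than from a table.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p: float, df: int) -> float:
    """Invert t_cdf by bisection (adequate for textbook-sized problems)."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_ci(mean: float, sd: float, n: int, confidence: float):
    """Two-sided t confidence interval for a population mean."""
    se = sd / math.sqrt(n)                                  # standard error of the mean
    t_crit = t_quantile(1 - (1 - confidence) / 2, n - 1)    # df = n - 1
    return mean - t_crit * se, mean + t_crit * se


for level in (0.90, 0.95):
    low, high = mean_ci(2.9580, 0.4101, 50, level)
    print(f"{level:.0%} CI: ({low:.4f}, {high:.4f})")
```

Because only the critical value changes between the two calls, the wider 95% interval comes entirely from the larger t quantile.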


One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI              T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)    -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3; the negative sign means the sample mean is below 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

Under the p-value method, H0 is rejected when the p-value is less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
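The test statistic and the three p-values can be approximated from the summary statistics alone. The sketch below uses only the Python standard library and integrates the t density numerically; the helper names `t_pdf`, `t_cdf`, and `one_sample_t` are hypothetical, and small differences from the Minitab figures are expected because the inputs are rounded.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def one_sample_t(mean: float, sd: float, n: int, mu0: float):
    """Return the t statistic and the p-values for the three alternatives."""
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    lower = t_cdf(t_stat, n - 1)                  # P(T <= t) under H0
    p_values = {
        "less": lower,                            # HA: mu < mu0
        "greater": 1 - lower,                     # HA: mu > mu0
        "two-sided": 2 * min(lower, 1 - lower),   # HA: mu != mu0
    }
    return t_stat, p_values


t_stat, p_values = one_sample_t(2.9580, 0.4101, 50, 3.0)
print(f"t = {t_stat:.2f}")
for alternative, p in p_values.items():
    print(f"{alternative:>9}: p = {p:.3f}")
```

The two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them, which matches the pattern 0.236, 0.764, 0.472 in the output.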

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): x̄ = 220 mg/ml, s = 38.6 mg/ml, n = 25

The null and alternative hypotheses are as follows:

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

$$ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17 $$

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$ t_{0.05,\,24} = 2.064 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
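The cholesterol calculation follows the table method: compute t from the summary statistics, then compare |t| with the tabled critical value. In the sketch below the function names are hypothetical, and the critical value 2.064 is copied from the example rather than computed.

```python
import math


def one_sample_t_statistic(sample_mean: float, sample_sd: float,
                           n: int, mu0: float) -> float:
    """t = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


def table_method(t_stat: float, t_critical: float) -> str:
    """Two-sided table method: reject H0 when |t| reaches the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_critical else "fail to reject H0"


t_stat = one_sample_t_statistic(220, 38.6, 25, 211)
# 2.064 is the two-sided alpha = 0.05 critical value of t(24) quoted in the example.
decision = table_method(t_stat, t_critical=2.064)
print(f"t = {t_stat:.2f}: {decision}")
```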


Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



Two-sample t-test

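We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group              N    Mean   Std Dev
Healthy            9    18.9   5.9
Cystic Fibrosis    13   11.9   6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

Pooled variance:

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 $$

Standard error of the difference:

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.74 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66 $$

Test statistic (under H0, μ1 - μ2 = 0):

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

Critical value:

$$ t_{0.05,\,20} = 2.086 $$

Since |t| = 2.63 > t(0.05, 20) = 2.086, reject H0 and accept HA.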
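The pooled calculation can be reproduced from the summary table. In this sketch the function name `pooled_t` is hypothetical, and the critical value 2.086 is taken from the example rather than computed.

```python
import math


def pooled_t(n1: int, mean1: float, sd1: float,
             n2: int, mean2: float, sd2: float):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    return (mean1 - mean2) / se_diff, df, pooled_var


# Healthy group first, cystic fibrosis group second, as in the table.
t_stat, df, pooled_var = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_critical = 2.086   # two-sided alpha = 0.05 critical value of t(20), from the example
decision = "reject H0" if abs(t_stat) >= t_critical else "fail to reject H0"
print(f"pooled variance = {pooled_var:.2f}, df = {df}, t = {t_stat:.2f}: {decision}")
```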

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

Figure: Boxplot of SalePrice by Location (East, West)

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26441.

90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10

Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
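The T-Value and DF in this output can be checked from the summary rows. The reported DF = 44, rather than 34 + 26 - 2 = 58, suggests the test was run without assuming equal variances; that reading is an assumption, and the sketch below uses the unpooled (Welch) formulas on that basis. The function name `welch_t` is hypothetical.

```python
import math


def welch_t(n1: int, mean1: float, sd1: float,
            n2: int, mean2: float, sd2: float):
    """Unequal-variance (Welch) two-sample t statistic and approximate df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2          # squared standard errors
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# East first, West second, matching "Difference = mu (East) - mu (West)".
t_stat, df = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"T-Value = {t_stat:.2f}, DF = {math.floor(df)}")
```

The approximate df is not a whole number; it is rounded down here so that it can be compared with the integer DF in the output.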

Paired t-test

A test for two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method:

Calculate the difference between the two measurements for each individual:

$$ d_j = X_{1j} - X_{2j} $$

Calculate:

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

Test statistic:

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

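Example: Testing infection by two viruses on tobacco leaves. The measurement is the number of dead spots on the leaves.

Plant    Virus 1 (X1j)   Virus 2 (X2j)   dj
1        9               10              -1
2        17              11              6
3        31              18              13
4        18              14              4
5        7               6               1
6        8               7               1
7        20              17              3
8        10              5               5
Total    120             88              32
Mean     15              11              4

H0: μd = 0    HA: μd ≠ 0

$$ \bar{d} = 4 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level.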
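The paired calculation can be repeated directly from the plant-by-plant counts in the table. The variable names below are hypothetical, and the critical value 2.365 for t(7) is copied from the example rather than computed.

```python
import math
from statistics import mean, stdev

# Counts per plant, copied from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [x1 - x2 for x1, x2 in zip(virus1, virus2)]   # d_j = X1j - X2j
d_bar = mean(diffs)                                   # mean difference
s_d = stdev(diffs)                                    # sample SD of the differences (n - 1)
se_d = s_d / math.sqrt(len(diffs))                    # standard error of the mean difference
t_stat = d_bar / se_d

t_critical = 2.365   # two-sided alpha = 0.05 critical value of t(7), from the example
decision = "reject H0" if abs(t_stat) >= t_critical else "fail to reject H0"
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t_stat:.2f}: {decision}")
```

Working on the differences means only one standard deviation is estimated, so the variances of the two original samples never enter the test.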

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
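These trade-offs can be made concrete with a small calculation. The following is a minimal sketch, assuming a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known standard deviation; the function name `beta_one_sided` and the numbers passed to it are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def beta_one_sided(alpha, n, effect, sigma=1.0):
    """Type II error probability for a one-sided z-test of H0: mu = 0 vs HA: mu > 0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # reject H0 when z > z_crit
    shift = effect * n ** 0.5 / sigma        # mean of z when the true effect is `effect`
    return std_normal.cdf(z_crit - shift)    # P(fail to reject H0 | HA is true)

baseline = beta_one_sided(alpha=0.05, n=25, effect=0.5)
larger_n = beta_one_sided(alpha=0.05, n=50, effect=0.5)
smaller_alpha = beta_one_sided(alpha=0.01, n=25, effect=0.5)

print(f"alpha=0.05, n=25: beta = {baseline:.3f}")
print(f"alpha=0.05, n=50: beta = {larger_n:.3f}  (larger n, smaller beta)")
print(f"alpha=0.01, n=25: beta = {smaller_alpha:.3f}  (smaller alpha, larger beta)")
```

With α fixed, doubling n lowers β; with n fixed, tightening α from 0.05 to 0.01 raises β, which matches the statements above.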

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome like the observed one?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality ->
Decision        H0 is true                        H0 is false
Retain H0       Correct Decision                  Type II Error (β) (0.2, 0.1)
Reject H0       Type I Error (α) (0.05, 0.01)     Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
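A sample-size calculation of this kind can be sketched in a few lines. The version below assumes a two-sided comparison of two group means with a common, known standard deviation and uses the normal approximation; the function name `sample_size_per_group` and the example effect and standard deviation are hypothetical. Dedicated power software that uses the t distribution will typically give a slightly larger n.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate n per group for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power 1 - beta
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)

# Expected effect of 5 units, standard deviation of 10 units (illustrative values).
print(sample_size_per_group(effect=5.0, sigma=10.0))              # beta = 0.20
print(sample_size_per_group(effect=5.0, sigma=10.0, beta=0.10))   # stricter beta needs more subjects
```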

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is the 95% interval, which gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
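The three factors can be seen directly in the half-width of an interval for a mean. This is a rough sketch using the normal approximation (z rather than t critical values); the function name `ci_half_width` and the input values are assumptions made for illustration.

```python
from statistics import NormalDist

def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / n ** 0.5

print(f"sd=10, n=25,  95%: +/- {ci_half_width(10, 25):.2f}")
print(f"sd=10, n=100, 95%: +/- {ci_half_width(10, 100):.2f}  (larger sample, narrower)")
print(f"sd=20, n=25,  95%: +/- {ci_half_width(20, 25):.2f}  (more variability, wider)")
print(f"sd=10, n=25,  99%: +/- {ci_half_width(10, 25, 0.99):.2f}  (more confidence, wider)")
```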

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of plausible values for the effect.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
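The selection rules above can be written as a small lookup. This is only a sketch of the decision logic for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=2, any_continuous_predictor=False):
    """Return the test suggested by the rules above for a continuous outcome."""
    if n_independent_vars < 1 or n_groups < 2:
        raise ValueError("need at least one independent variable and two groups")
    if n_independent_vars == 1:
        # One independent variable: the number of groups decides.
        return "t-test" if n_groups == 2 else "ANOVA"
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, any_continuous_predictor=True))
```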

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated
Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare
interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, which differ in the assumptions that must be checked.

Pearson r, t-tests and ANOVA are examples of parametric statistics.
Pearson r measures the relationship or association between 2 variables.

T-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals, for example college students randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups, i.e. matched on everything but the treatment (level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

                                            95% Upper
Variable   N    Mean     StDev    SE Mean   Bound     T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552    -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI              T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)    -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

                                            95% Lower
Variable   N    Mean     StDev    SE Mean   Bound     T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608    -0.72   0.764

For the test of μ = 3 vs < 3:

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

H0 is rejected when the p-value is less than α.

p-value = 0.236 > α = 0.05

So we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
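The three p-values in the GPA output can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library; since it has no built-in t distribution, the helper `t_cdf` (a name chosen here) integrates the Student t density numerically with Simpson's rule, which is an implementation choice of this example and not how Minitab computes it.

```python
import math

def t_cdf(t, df, steps=2000):
    """CDF of Student's t, via Simpson's rule on the density from 0 to |t|."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # P(0 < T < |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics from the GPA output above.
n, xbar, s, mu0 = 50, 2.9580, 0.4101, 3.0
se = s / math.sqrt(n)                  # standard error of the mean
t = (xbar - mu0) / se
df = n - 1

p_less = t_cdf(t, df)                  # HA: mu < 3 (left tail)
p_greater = 1 - p_less                 # HA: mu > 3 (right tail)
p_two_sided = 2 * min(p_less, p_greater)

print(f"t = {t:.2f}, df = {df}")
print(f"p (mu < 3) = {p_less:.3f}, p (mu != 3) = {p_two_sided:.3f}, p (mu > 3) = {p_greater:.3f}")
```

The left-tail and right-tail p-values sum to 1, and the two-tailed p-value is twice the smaller of the two, which is why the three Minitab runs report 0.236, 0.764 and 0.472 for the same t.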

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17, \qquad t_{0.05,\,24} = 2.064$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
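The cholesterol calculation can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are chosen here, and the critical value 2.064 is taken from the example rather than computed.

```python
import math

# Values from the cholesterol example.
xbar, mu0, s, n = 220.0, 211.0, 38.6, 25
t_crit = 2.064                          # two-tailed t(24) critical value at alpha = 0.05

se = s / math.sqrt(n)                   # standard error of the mean
t = (xbar - mu0) / se
reject_h0 = abs(t) > t_crit

print(f"SE = {se:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```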

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

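We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63, \qquad t_{0.05,\,20} = 2.086$$

Since |t| > t_{0.05, 20}, reject H0 and accept HA.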
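The pooled-variance calculation for the healthy versus cystic fibrosis comparison can be sketched as follows. The function name `pooled_t` is hypothetical, and the critical value 2.086 is taken from the example rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the difference in means
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff

# Healthy: n = 9, mean 18.9, sd 5.9; cystic fibrosis: n = 13, mean 11.9, sd 6.3.
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
t_crit = 2.086                                            # two-tailed t(20) at alpha = 0.05

print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}")
print(f"t = {t:.2f} on {df} df, reject H0: {abs(t) > t_crit}")
```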

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis Sale Price, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The estimated difference in mean sale prices is about -$26442 (East minus West). We are 90% confident that the true mean difference in sale prices lies in the interval -$126602 < μ (East) - μ (West) < $73719.

p-Value = 0.660 > α = 0.05 (and 0.10)
Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
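The reported DF of 44 is smaller than n1 + n2 - 2 = 58, which is consistent with the unpooled (Welch) form of the two-sample t-test. The sketch below recomputes the t value and the Welch-Satterthwaite degrees of freedom from the summary statistics; the function name `welch_t` is hypothetical, and truncating the degrees of freedom to an integer is an assumption made here to match the output.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1                      # squared standard error, group 1
    v2 = sd2 ** 2 / n2                      # squared standard error, group 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff

# East: n = 34, mean 312702, sd 188163; West: n = 26, mean 339143, sd 255571.
t, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"SE of difference = {se_diff:.0f}")
print(f"t = {t:.2f}, df = {df:.2f} (truncated: {int(df)})")
```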

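Paired t-test

Test to deal with two observations with strong comparability (e.g. two treatments on the same individuals, or one individual before vs after treatment; very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d_j^2 - (\sum d_j)^2 / n}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

Example: test infection of two viruses on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, H0: μ_d = 0 is rejected at the α = 0.05 level.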
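The paired calculation for the tobacco leaf data can be reproduced directly from the eight pairs of counts. This is a minimal sketch using the standard library; the variable names are chosen here, and the critical value 2.365 is taken from the example rather than computed.

```python
import math
from statistics import mean, stdev

# Counts per plant for the two viruses, from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # within-plant differences
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                                # sample standard deviation (n - 1 denominator)
se_d = s_d / math.sqrt(n)                     # standard error of the mean difference
t = d_bar / se_d
t_crit = 2.365                                # two-tailed t(7) at alpha = 0.05

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}")
print(f"t = {t:.2f} on {n - 1} df, reject H0: {abs(t) > t_crit}")
```

Only the differences enter the test, so the two samples' own variances never need to be compared.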

Advantages of the paired t-test

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
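The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only: the function names `decide_by_table` and `decide_by_p_value` are hypothetical, and the numbers passed in are example inputs, not results computed here.

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# Illustrative inputs: a t statistic with its tabled critical value, and a p-value.
print(decide_by_table(test_statistic=2.63, critical_value=2.086))
print(decide_by_table(test_statistic=-1.17, critical_value=2.064))
print(decide_by_p_value(p_value=0.236, alpha=0.05))
```

For the same test and the same α, the two rules agree; the advice not to mix them up is about comparing like with like, a statistic against a critical value, or a p-value against α.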

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecution), the defendant (the accused) and the judge. Before hearing a case, the judge forms the hypotheses below.

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent  ← Null Hypothesis
HA: Defendant is guilty    ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

                                      NULL HYPOTHESIS
                                      TRUE                                 FALSE
Reject the null hypothesis            Type I error (α):                    CORRECT
                                      rejecting a true null hypothesis
Fail to reject the null hypothesis    CORRECT                              Type II error (β):
                                                                           failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that the defendant is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the defendant is guilty and we conclude guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds them innocent. Many people find the distinction between the two types of error unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare
- interval/ratio level of measurement

Chi-square ($\chi^2$): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are two types of statistics, each requiring analysis of its assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- $\chi^2$ is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the slide's graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
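The statement that two independent groups can be compared with "either ANOVA or t-test" can be checked numerically: for exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below uses made-up anxiety scores (the deck gives no data for this example), and the helper names are illustrative.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical anxiety-inventory scores for the two classes.
sociology = [42, 38, 51, 45, 40, 47]
political_science = [36, 44, 39, 35, 41, 37]

t_stat = pooled_t(sociology, political_science)
f_stat = one_way_f([sociology, political_science])
print(t_stat ** 2, f_stat)
```

With three or more groups only the F statistic applies, which is why ANOVA is the general tool.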

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

One-Sample T: GPAs, test of μ = 3 vs < 3

Minitab output: N = 50, Mean = 2.9580, StDev = 0.4101, SE Mean = 0.0580, 95% upper bound = 3.0552, T = -0.72, P = 0.236.

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sampling distribution of the mean) that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05, so we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
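The three p-values in the GPA output can be reproduced from the summary statistics alone. The sketch below is one way to do it with only the standard library: it integrates the Student t density numerically (Simpson's rule) instead of calling a statistics package. The helper names `t_pdf` and `t_cdf` are illustrative, and because the inputs are the rounded summary values, results agree with Minitab only to about three decimals.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] combined with symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics from the GPA output.
n, xbar, s, mu0 = 50, 2.9580, 0.4101, 3.0
se = s / math.sqrt(n)                 # standard error of the mean
t_stat = (xbar - mu0) / se
p_lower = t_cdf(t_stat, n - 1)        # H_A: mu < 3
p_upper = 1 - p_lower                 # H_A: mu > 3
p_two = 2 * min(p_lower, p_upper)     # H_A: mu != 3
print(round(t_stat, 2), p_lower, p_two, p_upper)
```

The two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller one, which matches the pattern 0.236, 0.472, 0.764 in the output.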

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are $H_0: \mu = \mu_0$ and $H_A: \mu \neq \mu_0$.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol example: Suppose the population mean μ is 211 mg/ml.

Sample from the town: $\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

$H_0: \mu = 211$ mg/ml

$H_A: \mu \neq 211$ mg/ml

For an $\alpha = 0.05$ test we use the critical value determined from the $t(24)$ distribution: $t_{0.05,\,24} = 2.064$.

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since $|t| = 1.17 < 2.064$ (table $t$) at the $\alpha = 0.05$ level, we fail to reject $H_0$.

The difference is not statistically significant.

Hypothesis Testing

- We set a standard beyond which results would be rare (outside the expected sampling error).
- We observe a sample and infer information about the population.
- If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

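We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2$, $H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic fibrosis   13   11.9   6.3

Test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Critical value: $t_{0.05,\,20} = 2.086$

Pooled variance:

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

Standard error of the difference:

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

Since $|t| = 2.63 > t_{0.05,\,20} = 2.086$, reject $H_0$ and accept $H_A$.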
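The hand calculation above can be retraced in a few lines. This sketch works from the summary statistics on the slide (group sizes, means and standard deviations); the function name `pooled_t_from_summary` is illustrative, and the critical value 2.086 is the tabled value quoted in the deck rather than something computed here.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / se_diff
    return pooled_var, se_diff, t_stat, df

# Healthy vs. cystic fibrosis groups from the slide.
pooled_var, se_diff, t_stat, df = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # t(0.05, 20) from the table quoted in the deck
reject_h0 = abs(t_stat) > T_CRIT
print(round(pooled_var, 2), round(se_diff, 2), round(t_stat, 2), df, reject_h0)
```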

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
- Choose "Samples in one column"
- Samples: select SalePrice
- Subscripts: select Location
- Click Graphs, check "Boxplots of data", click OK
- Click Options
- Confidence level: enter 90, then click OK, OK

$H_0: \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$

$H_A: \mu_1 \neq \mu_2$ or $\mu_1 - \mu_2 \neq 0$

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100,000 to 1,000,000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, P-Value = 0.660, DF = 44

The difference in the (rounded) sample mean sale prices is $26,441, with East lower than West. We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
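The T-value and DF in this output can be approximately reproduced from the summary table. The reported DF = 44 is smaller than $n_1 + n_2 - 2 = 58$, which suggests (this is an inference, not something the slides state) that the unpooled Welch form with Welch-Satterthwaite degrees of freedom, truncated to an integer, was used. The sketch below follows that assumption; the function name is illustrative.

```python
import math

def welch_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled (Welch) t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1          # squared standard error of mean 1
    v2 = sd2 ** 2 / n2          # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se_diff

# East vs. West sale prices from the summary table.
t_stat, df, se_diff = welch_from_summary(34, 312702, 188163, 26, 339143, 255571)
print(round(t_stat, 2), math.floor(df), round(se_diff))
```

Because the inputs are rounded summary values, small differences from the software output in the last digit are expected.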

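Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$

Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method:

1. Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

2. Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

3. Compute the test statistic

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

$H_0: \mu_d = 0$, $H_A: \mu_d \neq 0$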

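Paired t-test: example

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 ($X_{1j}$)   Virus 2 ($X_{2j}$)   $d_j$
1       9                    10                   -1
2       17                   11                   6
3       31                   18                   13
4       18                   14                   4
5       7                    6                    1
6       8                    7                    1
7       20                   17                   3
8       10                   5                    5
Total   120                  88                   32
Mean    15                   11                   4

$H_0: \mu_d = 0$, $H_A: \mu_d \neq 0$

$$\bar{d} = 4$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = \sqrt{18.57} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so $H_0$ is rejected at the 0.05 level.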
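The tobacco-leaf calculation can be checked directly from the eight pairs of counts in the table. This is a minimal sketch using the standard library; the critical value 2.365 is the tabled $t_{0.05(7)}$ quoted on the slide, not computed here.

```python
import math
from statistics import mean, stdev

# Counts per plant from the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]  # per-plant differences
n = len(d)
d_bar = mean(d)               # mean difference
s_d = stdev(d)                # sample standard deviation of the differences
se_d = s_d / math.sqrt(n)     # standard error of the mean difference
t_stat = d_bar / se_d

T_CRIT = 2.365  # t(0.05, 7), two-tailed, from the table quoted in the deck
reject_h0 = abs(t_stat) > T_CRIT
print(d, d_bar, round(s_d, 2), round(se_d, 2), round(t_stat, 2), reject_h0)
```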

Paired t-test: advantages

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

Quantities compared on the slide: $s_{\bar{d}}^2$ and $s_{\bar{x}_1 - \bar{x}_2}^2$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
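The table method and the p-value method can each be expressed as a one-line rule. The sketch below applies them to numbers quoted elsewhere in this deck (the cholesterol, cystic fibrosis and GPA examples); the function names are illustrative.

```python
def table_method(t_stat, t_crit):
    """Reject H0 when |t| reaches the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def p_value_method(p_value, alpha=0.05):
    """Reject H0 when the p-value falls below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Values taken from the worked examples in the deck.
cholesterol = table_method(t_stat=1.17, t_crit=2.064)      # one-sample t, df = 24
cystic_fibrosis = table_method(t_stat=2.63, t_crit=2.086)  # two-sample t, df = 20
gpa = p_value_method(p_value=0.236, alpha=0.05)            # one-sample t, lower tail
print(cholesterol, cystic_fibrosis, gpa)
```

For a given test and α the rules lead to the same decision; they differ only in what is compared (statistic against critical value, or p-value against α).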

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecution), the defendant (the accused) and the judge. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent (Null Hypothesis)
HA: Defendant is Guilty (Alternate Hypothesis)

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                                 FALSE
Reject the null hypothesis           Type I error (α):                    CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                              Type II error (β):
                                                                          failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error corresponds to overreacting and a Type II error corresponds to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chance of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
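The "chances of getting it wrong" can be illustrated by simulation: when the null hypothesis is true, a test run at level α should reject in roughly a fraction α of repeated studies. The sketch below is an illustrative setup, not something from the slides: it draws normal samples with a true mean of 0 and known standard deviation 1, applies a two-sided z-test, and counts false rejections. The sample size, number of trials and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=4000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0 (mean 0) is true
        z = (sum(sample) / n) * math.sqrt(n)              # z statistic, sigma = 1 known
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

rate_20 = type_i_error_rate(alpha=0.20)
rate_05 = type_i_error_rate(alpha=0.05)
print(rate_20, rate_05)
```

The simulated rates should land near 1 in 5 and 1 in 20, up to simulation noise.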

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
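These relationships can be verified numerically for a simple case. The sketch below computes β for a two-sided one-sample z-test with known standard deviation; the true shift `delta`, `sigma` and the specific α and n values are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

def beta(alpha, n, delta=0.5, sigma=1.0):
    """Type II error probability of a two-sided z-test when the true shift is delta."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta * math.sqrt(n) / sigma
    # Power = probability the z statistic lands in either rejection tail.
    power = (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    return 1 - power

b_base = beta(alpha=0.05, n=25)
b_more_n = beta(alpha=0.05, n=50)        # fixed alpha, larger n
b_small_alpha = beta(alpha=0.01, n=25)   # fixed n, smaller alpha
b_large_alpha = beta(alpha=0.10, n=25)   # fixed n, larger alpha
b_both = beta(alpha=0.01, n=100)         # smaller alpha AND larger n
print(b_base, b_more_n, b_small_alpha, b_large_alpha, b_both)
```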

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
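The three probability sampling schemes can be sketched on a toy population. Everything below is hypothetical: the population of 200 subjects, the `ward` and `sex` fields, and the sample sizes are invented for illustration.

```python
import random

rng = random.Random(0)

# Hypothetical population: 200 subjects in 4 wards, two sexes.
population = [
    {"id": i, "ward": f"ward{i % 4}", "sex": "F" if i % 2 else "M"}
    for i in range(200)
]

# Simple random sampling: every subject has the same chance of selection.
simple = rng.sample(population, 20)

def stratified_sample(pop, key, per_stratum, rng):
    """Draw a fixed number of subjects at random from each stratum."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    return [p for group in strata.values() for p in rng.sample(group, per_stratum)]

def cluster_sample(pop, key, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [p for p in pop if p[key] in chosen]

stratified = stratified_sample(population, "sex", 10, rng)
clustered = cluster_sample(population, "ward", 2, rng)
print(len(simple), len(stratified), len(clustered))
```

Stratified sampling guarantees each stratum is represented; cluster sampling trades some precision for the convenience of sampling whole groups.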

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

              Reality
Decision      H0 is true                       H0 is false
Retain H0     Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Common Inferential StatsCommon Inferential Stats

Descriptive stats summarize measures of central tendency and variability1048708

Inferential determine how likely it is that results based on sample are the same in population1048708

Must know level of measurement of variables to choose correct

Parametric and non-parametric two types of statistics requiring analysis of assumptions

1048708Pearson r t tests and ANOVA examples of parametric 1048708Pearson r measure relationship or association between 2

variables

T test determines if there is a significant difference between 2 group means

ANOVA determines if there is a significant difference between 3 or more means

X2 non-parametric statistic to assess relationship between 2 categorical variables

SummarySummary

Two types of ANOVATwo types of ANOVA

1Independent groups - two different sets of individuals In the graphic below college students are randomly assigned to Groups 1 and 2

Two types of ANOVATwo types of ANOVA

Example Researchers are interested in exam anxiety They administer an anxiety inventory to students just before the final exam in a Sociology class They also adminster it before the final exam in a Political Science class To compare the two sets of scores they use

either ANOVA or t-test for independent samples (hand calculation)

Two types of ANOVATwo types of ANOVA

2Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or from matched groups (ie matched on everything but the treatment (level of the Independent variable)

Two types of ANOVATwo types of ANOVA

Example Researchers are interested in exam anxiety They administer an anxiety inventory on the second day of class Then they give it again on the day of the midterm To compare the two sets of scores they use either ANOVA with replication or t-test for paired samples

(hand calculation)

Page 35: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential Statistics

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs

Test of μ = 3 vs lt 3 (directional one-tail to the right) 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

Test of μ = 3 vs ne 3 (non-directional two-tail)

Variable N Mean StDev SE Mean 95 CI T PGPAs 50 29580 04101 00580 (28414 30746) -072 0472

Test of μ = 3 vs gt 3 (directional one-tail to the left)

95 LowerVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 28608 -072 0764

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsOne-Sample T GPAs Test of μ = 3 vs lt 3 95 UpperVariable N Mean StDev SE Mean Bound T PGPAs 50 29580 04101 00580 30552 -072 0236

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Two sample T-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population


Two-sample t-test

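We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Pooled variance:

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

Standard error of the difference:

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

Test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

Critical value: $t_{0.05,\,20} = 2.086$. Since $|t| > t_{0.05,\,20}$, reject H0 and accept HA.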
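The pooled-variance calculation above can be sketched from the summary statistics alone. The function name `pooled_t` is hypothetical, and the critical value 2.086 is the tabled value quoted in the example rather than something the code computes.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common population variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff

# Healthy group vs cystic fibrosis group
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

t_crit = 2.086   # two-tailed critical value of t(20) at alpha = 0.05 (from the table)
print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) >= t_crit else "fail to reject H0")
```

Working from unrounded intermediate values gives a pooled variance slightly different in the second decimal from the hand calculation, but the same t to two decimals.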

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100,000 to 1,000,000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, P-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441 (East lower than West). We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-value = 0.660 > α = 0.05 (and 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
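The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which suggests the output comes from the unequal-variance (Welch) form of the test rather than the pooled form. The sketch below works under that assumption, using only the summary statistics in the output; the function name `welch_t` is illustrative.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances,
    with the Welch-Satterthwaite approximation for the degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors of each mean
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

# East vs West sale prices
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t:.2f}, df = {df:.1f} (truncated: {int(df)})")
```

Truncating the approximate degrees of freedom to a whole number reproduces the DF shown in the output.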

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

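Paired t-test

Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$s_d^2 = \frac{\sum d_j^2 - (\sum d_j)^2/n}{n - 1} = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so H0 is rejected at the α = 0.05 level.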
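The paired calculation for the tobacco-leaf data can be sketched with the standard library alone. The list names are illustrative, and the critical value 2.365 for t(7) is the tabled value quoted in the example.

```python
import math
import statistics

# Number of spots per plant under each virus (plants 1 to 8)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # within-plant differences
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)                     # sample SD (n - 1 in the denominator)
se = s_d / math.sqrt(n)                       # standard error of the mean difference
t = d_bar / se

t_crit = 2.365                                # t(7), two-tailed, alpha = 0.05 (from the table)
print(f"mean d = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t:.2f}")
print("reject H0" if abs(t) >= t_crit else "fail to reject H0")
```

Only the differences enter the test, so the plant-to-plant variation in overall spot counts does not inflate the standard error.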

Paired t-test

Advantages:

1) It is usually easier to detect a true small difference.

2) There is no need to consider whether the variances of the two populations are the same or not.

Because pairing removes between-individual variation, the standard error of the mean difference is typically smaller than the two-sample standard error: $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed; if p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                       FALSE
Reject the null hypothesis           Type I error (α):          CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                    Type II error (β):
                                                                failing to reject a
                                                                false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.1%         1 in 1,000
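The conversion between a Type I error probability and "1 in N" odds is just a reciprocal, as the short sketch below shows (the helper name `one_in` is made up for illustration). Note that a probability of 0.1% corresponds to 1 in 1,000; odds of 1 in 10,000 would correspond to 0.01%.

```python
def one_in(alpha):
    """Express a Type I error probability as odds of the form '1 in N'."""
    return round(1 / alpha)

for alpha in (0.20, 0.05, 0.01, 0.001):
    print(f"{alpha:.1%} chance -> 1 in {one_in(alpha):,}")
```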

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
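The trade-off between α, β, and n can be illustrated numerically. The sketch below is not from the slides: it assumes a one-sided z-test with known standard deviation (a simplification of the t-tests used elsewhere in the deck), and the effect size 0.5 and the sample sizes are arbitrary illustrative values.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def beta(alpha, n, effect, sigma=1.0):
    """Type II error probability of a one-sided z-test of H0: mu = mu0 vs HA: mu > mu0,
    when the true mean exceeds mu0 by `effect` (assumed known sigma)."""
    z_crit = Z.inv_cdf(1 - alpha)                  # rejection cutoff on the z scale
    return Z.cdf(z_crit - effect * sqrt(n) / sigma)

baseline = beta(alpha=0.05, n=25, effect=0.5)
larger_n = beta(alpha=0.05, n=50, effect=0.5)      # same alpha, larger n
smaller_alpha = beta(alpha=0.01, n=25, effect=0.5) # same n, smaller alpha

print(f"baseline beta      = {baseline:.3f}")
print(f"larger n           = {larger_n:.3f}")
print(f"smaller alpha      = {smaller_alpha:.3f}")
```

With α fixed, doubling n lowers β; with n fixed, tightening α raises β, which matches the relationships listed above.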

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population? There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:

- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                       H0 is false
Retain H0     Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)    Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
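A common textbook approximation for this sample-size calculation, which the slides do not spell out, uses normal quantiles for a two-sided comparison of two group means. The sketch below assumes that formula; the function name, the standardized effect size of 0.5, and the defaults α = 0.05 and β = 0.2 are illustrative choices.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-group comparison of means.

    effect_size is the expected difference divided by the common standard deviation.
    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / effect_size)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

print(n_per_group(0.5))              # beta = 0.2, i.e. 80% power
print(n_per_group(0.5, beta=0.10))   # beta = 0.1, i.e. 90% power
```

Because it relies on the normal rather than the t distribution, the result is a slight underestimate for small samples; dedicated power software would be the safer choice for a real study.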

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
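To see how an interval carries the same decision as a test, the sketch below builds a 95% confidence interval from the cholesterol summary statistics used earlier in these slides (x̄ = 220, s = 38.6, n = 25) and checks whether the hypothesized mean 211 falls inside it. The critical value 2.064 for t(24) is the tabled value from that example; the variable names are illustrative.

```python
import math

x_bar, s, n = 220.0, 38.6, 25
t_crit = 2.064                        # t(24), two-sided 95% (from the table)

margin = t_crit * s / math.sqrt(n)    # half-width of the interval
lower, upper = x_bar - margin, x_bar + margin

mu0 = 211.0
inside = lower <= mu0 <= upper
print(f"95% CI: ({lower:.1f}, {upper:.1f}); contains {mu0}: {inside}")
```

The hypothesized value lies inside the interval, which agrees with the two-tailed test failing to reject H0 at α = 0.05, and the interval additionally shows the range of means compatible with the sample.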

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:

- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compare means of two groups

- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance):

- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories

- nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.

T-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

Variable   N    Mean     StDev    SE Mean   95% Upper Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552            -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean   95% CI             T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)   -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

Variable   N    Mean     StDev    SE Mean   95% Lower Bound   T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608            -0.72   0.764

The test statistic t = -072 is the number of std deviations that the sample mean 296 is from the hypothesized mean μ = 3

p-value is the probability that a random sample mean is less than or equal to 296 when Ho μ = 3 is true

The rejection region consists of all p-values less than α

p-value = 0236 gt α = 005

Then we fail to reject Ho

Conclusion There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 300

Inferential StatisticsInferential StatisticsInferential StatisticsInferential Statistics

Two sample T-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis Testing

3 steps to carry out a hypothesis test

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic is ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the absolute (calculated) value of the test statistic is < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 steps to carry out a hypothesis test

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

Hypothesis Testing

3 steps to carry out a hypothesis test

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method:
      - Reject the null hypothesis if the p-value is less than α.
      - Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
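The table method and the p-value method should lead to the same decision for the same test. The sketch below illustrates this for a two-sided t-test with t = 2.625 on 7 degrees of freedom (the unrounded paired t-test result from the tobacco example). The helper `t_cdf` is not a library function: it is an illustrative numerical integration of the Student t density by Simpson's rule, and the critical value 2.365 is copied from the slides.

```python
import math

def t_cdf(t, df, steps=2000):
    """Student t CDF by Simpson's rule on [0, t] (steps must be even)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def two_sided_p(t, df):
    """Probability of a |T| at least as large as |t| when H0 is true."""
    return 2 * (1 - t_cdf(abs(t), df))

t_stat, df, alpha = 2.625, 7, 0.05
T_CRIT = 2.365  # t_{0.05}(7), two-sided, from the table

table_rule = abs(t_stat) >= T_CRIT   # table method
p_value = two_sided_p(t_stat, df)
p_rule = p_value < alpha             # p-value method

print(f"table method rejects H0: {table_rule}")
print(f"p-value = {p_value:.3f}, p-value method rejects H0: {p_rule}")
```

In practice a statistics package would supply the p-value directly; the integration here only keeps the example self-contained.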

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main role of the plaintiff is to keep supplying strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide which hypothesis holds.

Hypothesis Testing

To perform a hypothesis test we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent   ← Null hypothesis
HA: Defendant is guilty     ← Alternative hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors: Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                            FALSE
Reject the null hypothesis           Type I error (α):               CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                         Type II error (β):
                                                                     failing to reject a
                                                                     false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error we put an innocent person in jail. When we commit a Type II error we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason the phrase "Chances of Getting it Wrong" is used instead.

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000

Controlling Type I and Type II Errors

- α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
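The relationships listed above can be made concrete with a small calculation. The sketch below assumes a one-sided z-test with known standard deviation, which is a simplification chosen for illustration and not a case worked in the slides; the effect size of 5 and σ of 10 are hypothetical values.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def beta(alpha, n, effect, sigma):
    """Type II error probability of a one-sided z-test (H_A: mean > mean0).

    effect is the true shift of the mean away from the null value.
    """
    z_crit = Z.inv_cdf(1 - alpha)
    return Z.cdf(z_crit - effect * math.sqrt(n) / sigma)

# Hypothetical planning values: true shift of 5 units, sigma of 10 units.
b_base = beta(alpha=0.05, n=25, effect=5, sigma=10)
b_more_n = beta(alpha=0.05, n=50, effect=5, sigma=10)      # larger n, same alpha
b_less_alpha = beta(alpha=0.01, n=25, effect=5, sigma=10)  # smaller alpha, same n

print(f"alpha=0.05, n=25: beta={b_base:.3f}")
print(f"alpha=0.05, n=50: beta={b_more_n:.3f}")
print(f"alpha=0.01, n=25: beta={b_less_alpha:.3f}")
```

With α fixed, doubling n lowers β; with n fixed, tightening α from 0.05 to 0.01 raises β.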

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

- Probability samples result when subjects have a known probability of entering the sample:
  - Simple random sampling
  - Stratified sampling
  - Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

- Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
- We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

Planning a study

- At the end of this study you observe a difference in outcome Y between the experimental group and the control group.

- All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
  - Chance
  - Real effect of treatment X

Planning a study

- The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
- All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
- If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
- Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

- The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
- In this case you are trying to disprove that treatment X has no effect on outcome Y.
- You start out with two hypotheses.

Hypothesis testing

- Null hypothesis (H0): Treatment X has no effect on outcome Y.
- Alternative hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

- If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
- Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

- The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?
- This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

- If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.
- How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                        H0 is false
Retain H0     Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

- Power analysis is used to try to minimize Type II errors.
- Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
- Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
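A sample-size calculation of the kind described above might look like the following sketch. It uses the normal-approximation formula for a two-sided, one-sample test with known standard deviation, n = ((z_{1-α/2} + z_{1-β}) σ / δ)², which is an assumption made for illustration; the function name `required_n` and the planning values (expected effect δ = 5, σ = 10) are hypothetical.

```python
import math
from statistics import NormalDist

def required_n(alpha, beta, effect, sigma):
    """Smallest n giving power 1 - beta for a two-sided z-test at level alpha."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the target power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: expected effect 5, standard deviation 10.
n_80 = required_n(alpha=0.05, beta=0.2, effect=5, sigma=10)  # 80% power
n_90 = required_n(alpha=0.05, beta=0.1, effect=5, sigma=10)  # 90% power
print(n_80, n_90)
```

Asking for a smaller β (more power) at the same α and effect size increases the required sample size.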

Point estimation

- Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
- Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
- If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

- Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
- The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.
- The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

- Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
- Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
- So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
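As an illustration of how a confidence interval carries the hypothesis-test decision with it, the sketch below builds a 95% interval for the mean of the eight paired differences from the tobacco-virus example earlier in these slides. The function name `mean_ci` is illustrative, and the critical value 2.365 for 7 degrees of freedom is the one quoted in that example.

```python
import math
import statistics

# Paired differences d_j from the tobacco-virus example.
d = [-1, 6, 13, 4, 1, 1, 3, 5]
T_CRIT = 2.365  # t_{0.05}(7), two-sided

def mean_ci(values, t_crit):
    """Confidence interval for the mean: centre +/- t_crit * standard error."""
    n = len(values)
    centre = statistics.mean(values)
    margin = t_crit * statistics.stdev(values) / math.sqrt(n)
    return centre - margin, centre + margin

lower, upper = mean_ci(d, T_CRIT)
contains_zero = lower <= 0 <= upper
print(f"95% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
print("zero is inside the interval" if contains_zero else "zero is outside the interval")
```

Zero falls outside the interval, which matches rejecting H0 at α = 0.05, and the interval also shows the range of plausible effect sizes.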

Choosing the right test

- Typically one is interested in comparing group means.
- If the outcome is continuous and there is one independent variable:
  - Two groups: t-test
  - Three or more groups: ANOVA

Choosing the right test

- If the outcome is continuous and there is more than one independent variable:
  - ANOVA if all independent variables are categorical
  - ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed.
- Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter.
- df = sample size - number of parameters estimated.
- df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

- t-test: compares the means of two groups
  - interval/ratio level of measurement
  - independent samples t-test
  - dependent or paired samples t-test

Common Inferential Stats

- ANOVA (analysis of variance)
  - more than 2 means to compare, or more than 2 tests of means
  - interval/ratio level of measurement
- Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  - nominal/ordinal level of measurement

Summary

- Descriptive statistics summarize measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
  - Pearson r, t-tests and ANOVA are examples of parametric statistics.
  - Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tail to the left)

                                            95% Upper
Variable   N    Mean     StDev    SE Mean   Bound     T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552    -0.72   0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tail)

Variable   N    Mean     StDev    SE Mean   95% CI               T       P
GPAs       50   2.9580   0.4101   0.0580    (2.8414, 3.0746)     -0.72   0.472

Test of μ = 3 vs > 3 (directional, one-tail to the right)

                                            95% Lower
Variable   N    Mean     StDev    SE Mean   Bound     T       P
GPAs       50   2.9580   0.4101   0.0580    2.8608    -0.72   0.764

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

                                            95% Upper
Variable   N    Mean     StDev    SE Mean   Bound     T       P
GPAs       50   2.9580   0.4101   0.0580    3.0552    -0.72   0.236

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sample mean) that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Then we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
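The T and P columns of the output above can be reproduced from the summary statistics alone. The sketch below is an illustration rather than the software's own procedure: `one_sample_t` is a hypothetical helper name, and `t_cdf` is a hand-rolled Simpson's-rule integration of the Student t density, used only to keep the example free of third-party packages.

```python
import math

def t_cdf(t, df, steps=2000):
    """Student t CDF by Simpson's rule on [0, t] (steps must be even)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def one_sample_t(n, mean, sd, mu0):
    """Return (t statistic, degrees of freedom, standard error of the mean)."""
    se = sd / math.sqrt(n)
    return (mean - mu0) / se, n - 1, se

# Summary statistics from the GPAs output above.
t, df, se = one_sample_t(n=50, mean=2.9580, sd=0.4101, mu0=3.0)

p_less = t_cdf(t, df)                      # H_A: mu < 3
p_greater = 1 - p_less                     # H_A: mu > 3
p_two_sided = 2 * min(p_less, p_greater)   # H_A: mu != 3

print(f"t={t:.2f}, SE={se:.4f}, df={df}")
print(f"P(<)={p_less:.3f}, P(!=)={p_two_sided:.3f}, P(>)={p_greater:.3f}")
```

The three p-values differ only in which tail, or tails, of the same t distribution they count, which is why the same T value appears in all three tests.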

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure from the null hypothesis can be observed at either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

One-sample t-test

Cholesterol example: suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$t_{0.05}(24) = 2.064$

$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic fibrosis   13   11.9   6.3

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

Two-sample t-test

$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.7 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05}(20) = 2.086$

$|t| > t_{0.05}(20)$: reject H0, accept HA.
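The pooled-variance calculation above needs only the summary statistics, so it is easy to check in code. This is a minimal sketch with an illustrative function name (`pooled_t`); the critical value 2.086 is the tabled value quoted on the slide.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic from summary statistics.

    Returns (t statistic, degrees of freedom, pooled variance, standard error).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, pooled_var, se

# Healthy vs cystic fibrosis groups from the table above.
t, df, pooled_var, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRIT = 2.086  # t_{0.05}(20), two-sided, as quoted on the slide

print(f"s^2={pooled_var:.2f}, SE={se:.2f}, t={t:.2f}, df={df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```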

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

- Stat > Basic Statistics > 2-Sample t
- Choose: Samples in one column
- Samples: Select … (SalePrice)
- Subscripts: Select … (Location)
- Click Graphs; check Boxplots of data; OK
- Click Options
- Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100,000 to 1,000,000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 (and 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
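The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which suggests that this output comes from the unequal-variance (Welch) form of the two-sample test, not the pooled form. That reading is an inference from the numbers, not something the slides state. The sketch below recomputes T and DF from the summary statistics under that assumption; `welch_t` is an illustrative name.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance two-sample t statistic from summary statistics.

    Returns (t statistic, Welch-Satterthwaite degrees of freedom, standard error).
    """
    v1, v2 = sd1**2 / n1, sd2**2 / n2   # squared standard errors of each mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df, se

# East vs West sale prices from the output above.
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t={t:.2f}, df={df:.2f} (truncated: {math.floor(df)}), SE={se:.0f}")
```

Truncating the fractional degrees of freedom to an integer gives 44, matching the output, and the statistic rounds to -0.44.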

Paired t-test

A test for two sets of observations that are strongly comparable (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

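Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$, $j = 1, \dots, n$.

Calculate the mean difference, its standard deviation, and its standard error:

$\bar{d} = \frac{1}{n} \sum_{j=1}^{n} d_j$

$s_d = \sqrt{\frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n - 1}}$

$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$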

Paired t-test

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

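Paired t-test

Example: testing infection by two viruses on tobacco leaves (number of dead spots on the leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1         9               10              -1
2        17               11               6
3        31               18              13
4        18               14               4
5         7                6               1
6         8                7               1
7        20               17               3
8        10                5               5
Total   120               88              32
Mean     15               11               4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$\bar{d} = 4$

$s_d^2 = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$

$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{4}{1.52} = 2.632 > t_{0.05}(7) = 2.365$

so H0 is rejected at the α = 0.05 level.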

Paired t-test

Advantages

1) It is usually easier to detect a true small difference, because pairing typically gives a smaller standard error: $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
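The trade-off between the two error probabilities and n can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-sided z-test with known standard deviation, and a hypothetical true shift (`effect`) and `sigma` that do not come from the slides.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sigma):
    """Return beta for a one-sided z-test of H0: mu = mu0 vs HA: mu > mu0.

    effect is the assumed true shift (mu - mu0) and sigma the known
    population standard deviation; both are illustrative assumptions.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)          # rejection cutoff on the z scale
    shift = effect / (sigma / n ** 0.5)    # true mean of the z statistic
    return z.cdf(z_crit - shift)           # P(fail to reject | HA true)

for alpha, n in [(0.05, 25), (0.05, 100), (0.01, 25)]:
    beta = type_ii_error(alpha, n, effect=2.0, sigma=10.0)
    print(f"alpha={alpha} n={n} beta={beta:.3f}")
```

With α fixed, the larger sample gives a smaller β; with n fixed, the smaller α gives a larger β.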

Planning a study

Suppose you are interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance alone produced the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality
Decision        H0 is true                       H0 is false
Retain H0       Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)    Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
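A sample-size calculation of this kind can be sketched as follows. This is a rough normal-approximation formula for comparing two group means, not necessarily the method used in the course; `delta` (expected difference) and `sigma` (common standard deviation) are hypothetical planning values.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate subjects per group for a two-sided, two-group comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sigma / delta)^2.
    delta and sigma are planning assumptions supplied by the investigator.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Half-standard-deviation effect, alpha = 0.05, power = 0.80
print(n_per_group(delta=5.0, sigma=10.0))
```

Lowering β (asking for more power) or shrinking the expected effect both push the required sample size up.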

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size − Number of parameters estimated
Df is n − 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

Pearson r, t-tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3 (directional, one-tailed to the left)

                                           95% Upper
Variable   N    Mean     StDev    SE Mean  Bound     T      P
GPAs       50   2.9580   0.4101   0.0580   3.0552    -0.72  0.236

Test of μ = 3 vs ≠ 3 (non-directional, two-tailed)

Variable   N    Mean     StDev    SE Mean  95% CI              T      P
GPAs       50   2.9580   0.4101   0.0580   (2.8414, 3.0746)    -0.72  0.472

Test of μ = 3 vs > 3 (directional, one-tailed to the right)

                                           95% Lower
Variable   N    Mean     StDev    SE Mean  Bound     T      P
GPAs       50   2.9580   0.4101   0.0580   2.8608    -0.72  0.764

Interpreting the test of μ = 3 vs < 3:

The test statistic t = -0.72 is the number of standard errors that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when H0: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject H0.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
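The standard error, t statistic and two-sided interval in the GPA output can be reproduced from the summary statistics alone. The sketch below assumes the tabled two-sided 95% critical value for 49 degrees of freedom (about 2.0096), which is not given in the slides; the p-values themselves need a t-distribution routine and are not recomputed here.

```python
from math import sqrt

def one_sample_t(mean, sd, n, mu0):
    """Return (standard error, t statistic) for a one-sample t-test."""
    se = sd / sqrt(n)
    return se, (mean - mu0) / se

# Summary statistics from the GPA output.
se, t = one_sample_t(mean=2.9580, sd=0.4101, n=50, mu0=3.0)

# Assumed table value: two-sided 95% critical t for df = 49.
T_CRIT_95_DF49 = 2.0096
ci = (2.9580 - T_CRIT_95_DF49 * se, 2.9580 + T_CRIT_95_DF49 * se)

print(f"SE = {se:.4f}, t = {t:.2f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```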


One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus, no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

$H_0: \mu = 211$ mg/ml

$H_A: \mu \neq 211$ mg/ml

$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{220 - 211}{38.6/\sqrt{25}} = 1.17$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05,\,24} = 2.064$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
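The cholesterol calculation follows the table method: compute t from the summary values and compare it with the tabled critical value. A minimal sketch, using only the numbers quoted in the example:

```python
from math import sqrt

# Summary values from the cholesterol example; T_TABLE is the tabled
# two-sided critical value for alpha = 0.05 with 24 degrees of freedom.
X_BAR, MU_0, S, N = 220.0, 211.0, 38.6, 25
T_TABLE = 2.064

t = (X_BAR - MU_0) / (S / sqrt(N))   # distance from mu0 in standard errors
reject_h0 = abs(t) >= T_TABLE

print(f"t = {t:.2f}, df = {N - 1}, reject H0: {reject_h0}")
```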

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

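We test the hypothesis of equal means for the two populations, assuming a common variance.

$H_0: \mu_1 = \mu_2$    $H_A: \mu_1 \neq \mu_2$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

$s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{8(5.9)^2 + 12(6.3)^2}{9 + 13 - 2} = \dfrac{754.76}{20} = 37.74$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{37.74\left(\dfrac{1}{9} + \dfrac{1}{13}\right)} = 2.66$

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \dfrac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05,\,20} = 2.086$

Since $|t| > t_{0.05,\,20}$, reject $H_0$ and accept $H_A$.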
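The pooled-variance calculation can be reproduced from the summary statistics. In this sketch the function name `pooled_t` is illustrative; the critical value 2.086 is the tabled value quoted in the example.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, degrees of freedom) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the difference in means
    return (mean1 - mean2) / se_diff, df

# Healthy vs. cystic fibrosis summary statistics from the example.
t, df = pooled_t(18.9, 5.9, 9, 11.9, 6.3, 13)
T_TABLE = 2.086  # tabled two-sided critical value, alpha = 0.05, df = 20

print(f"t = {t:.2f}, df = {df}, reject H0: {abs(t) > T_TABLE}")
```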

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ............ (SalePrice)

Subscripts: Select ......... (Location)

Click Graphs: Check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 − μ2 = 0

HA: μ1 ≠ μ2 or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) − μ (West) < $73,719).

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
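The reported DF = 44, rather than 34 + 26 − 2 = 58, suggests that this output uses the unpooled (Welch) form of the two-sample t-test; that reading is an inference from the numbers, not something the slides state. Under that assumption, the t-value and degrees of freedom can be recovered from the summary table:

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite df) without assuming equal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2   # squared standard errors of each mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# East vs. West summary statistics from the output.
t, df = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(f"t = {t:.2f}, df = {df:.2f} (truncated: {int(df)})")
```

The fractional degrees of freedom are truncated to a whole number, which matches the DF shown in the output.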

Paired t-test

A test for two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$\bar{d} = \dfrac{\sum d_j}{n}$,   $s_d = \sqrt{\dfrac{\sum (d_j - \bar{d})^2}{n - 1}}$,   $s_{\bar{d}} = \dfrac{s_d}{\sqrt{n}}$

$t = \dfrac{\bar{d}}{s_{\bar{d}}} = \dfrac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0$    $H_A: \mu_d \neq 0$

Example: Test the infection of two viruses on tobacco leaves. Number of dead spots on leaves:

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$H_0: \mu_d = 0$    $H_A: \mu_d \neq 0$

$s_d = \sqrt{\dfrac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$

$s_{\bar{d}} = \dfrac{4.31}{\sqrt{8}} = 1.52$

$t = \dfrac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$

Since t exceeds the critical value, reject H0.
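The paired calculation can be checked directly from the plant counts. This sketch uses only the data and the tabled critical value from the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]  # per-plant differences
d_bar = mean(d)
s_d = stdev(d)                 # sample SD, divisor n - 1
se_d = s_d / sqrt(len(d))      # standard error of the mean difference
t = d_bar / se_d

T_TABLE = 2.365  # tabled two-sided critical value, alpha = 0.05, df = 7
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t:.2f}, reject H0: {abs(t) > T_TABLE}")
```

Working on the differences means only one standard deviation is needed, which is why the equal-variance question does not arise for paired data.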

Paired t-test

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

22

21 xxd ss


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
Reject the null hypothesis if the test statistic falls in the critical region.
Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judge. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                  NULL HYPOTHESIS
                                  TRUE                                 FALSE
Reject the null hypothesis        Type I error (α):                    CORRECT
                                  rejecting a true null hypothesis
Fail to reject the null           CORRECT                              Type II error (β):
hypothesis                                                             failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
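The selection rules above can be written as a small lookup. This is only a sketch of the rules as listed on the slide for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_independent_vars=1, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the slide's rules."""
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"

print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_independent_vars=3, all_categorical=False))
```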

Logic of hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in a test of hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine whether to use a one- or two-tailed test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine degrees of freedom

Degrees of freedom (df): the number of components that are free to vary about a parameter.

df = sample size − number of parameters estimated

df is n − 1 for a one-sample test of the mean.

Common inferential stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric statistics are the two types; choosing between them requires an analysis of assumptions.

- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines whether there is a significant difference between 2 group means.
- ANOVA determines whether there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 39: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Inferential Statistics

One-Sample T: GPAs

Test of μ = 3 vs < 3

Variable | N | Mean | StDev | SE Mean | 95% Upper Bound | T | P
GPAs | 50 | 2.9580 | 0.4101 | 0.0580 | 3.0552 | -0.72 | 0.236

The test statistic t = -0.72 is the number of standard errors (standard deviations of the sample mean) that the sample mean 2.96 lies from the hypothesized mean μ = 3.

The p-value is the probability that a random sample mean is less than or equal to 2.96 when Ho: μ = 3 is true.

The rejection region consists of all p-values less than α.

p-value = 0.236 > α = 0.05

Therefore we fail to reject Ho.

Conclusion: There is not sufficient evidence in the sample to conclude that the true mean GPA μ is less than 3.00.
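The SE Mean, T and upper-bound columns of the output above can be recomputed from the summary statistics. This is a sketch, not the Minitab procedure itself: the one-sided critical value 1.6766 for 49 degrees of freedom is a tabled value assumed here (it does not appear in the output), and the p-value is taken as reported rather than recomputed.

```python
import math

# Summary statistics from the GPAs output
n, mean, sd = 50, 2.9580, 0.4101
mu0 = 3.0

se = sd / math.sqrt(n)   # standard error of the mean
t = (mean - mu0) / se    # test statistic

T_CRIT_ONE_SIDED = 1.6766  # assumption: tabled t for alpha = 0.05, df = 49
upper_bound = mean + T_CRIT_ONE_SIDED * se  # 95% upper bound for the mean

p_reported, alpha = 0.236, 0.05  # p-value as reported in the output
decision = "reject Ho" if p_reported < alpha else "fail to reject Ho"

print(round(se, 4), round(t, 2), round(upper_bound, 4), decision)
```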

One-sample t-test

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are H0: μ = μ0 and HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211 mg/ml.

Sample (town): $\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution, $t_{0.05,\,24} = 2.064$.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
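The cholesterol calculation can be checked with a few lines of code. This is a minimal sketch of the table method; the helper name `one_sample_t` is hypothetical, and the critical value 2.064 is the tabled t(24) value quoted in the example.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a one-sample test of the mean."""
    return (xbar - mu0) / (s / math.sqrt(n))

t_chol = one_sample_t(xbar=220, mu0=211, s=38.6, n=25)

T_CRIT = 2.064  # two-sided critical value for alpha = 0.05, df = 24
significant = abs(t_chol) >= T_CRIT

print(round(t_chol, 2), "reject H0" if significant else "fail to reject H0")
```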

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group | N | Mean | Std Dev
Healthy | 9 | 18.9 | 5.9
Cystic Fibrosis | 13 | 11.9 | 6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Critical value: $t_{0.05,\,20} = 2.086$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

Since $|t| > t_{0.05,\,20}$, reject H0 and accept HA.
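The pooled-variance calculation for the healthy vs. cystic fibrosis example can be reproduced from the summary statistics alone. A minimal sketch, assuming a common variance as the example does; the function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

t_cf, df_cf, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # tabled two-sided value for alpha = 0.05, df = 20
print(round(pooled_var, 1), round(se_diff, 2), round(t_cf, 2), abs(t_cf) > T_CRIT)
```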

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select (SalePrice)

Subscripts: Select (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

Ho: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Two-sample t-test

Result: Two-sample T for SalePrice

Location | N | Mean | StDev | SE Mean
East | 34 | 312702 | 188163 | 32270
West | 26 | 339143 | 255571 | 50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: Fail to reject Ho.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
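The reported T-Value and DF can be approximately recovered from the summary table. The reported DF = 44 is not n1 + n2 − 2 = 58, which appears consistent with an unpooled (Welch) test; the sketch below assumes that interpretation. The function name `welch_t` is hypothetical, and small differences from the printed output are expected because the table values are rounded.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

t_price, df_price, se_price = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(round(t_price, 2), int(df_price))
```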

Paired t-test

A test for two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: $\mu_d = 0$    HA: $\mu_d \neq 0$

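Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant | Virus 1 (X1j) | Virus 2 (X2j) | dj
1 | 9 | 10 | -1
2 | 17 | 11 | 6
3 | 31 | 18 | 13
4 | 18 | 14 | 4
5 | 7 | 6 | 1
6 | 8 | 7 | 1
7 | 20 | 17 | 3
8 | 10 | 5 | 5
Total | 120 | 88 | 32
Mean | 15 | 11 | 4

H0: $\mu_d = 0$    HA: $\mu_d \neq 0$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, H0 is rejected at the α = 0.05 level.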
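The tobacco-leaf example can be reproduced directly from the eight pairs of counts. A minimal sketch using only the standard library; the critical value 2.365 is the tabled t(7) value quoted in the example.

```python
import math
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Per-plant differences d_j = X1j - X2j
d = [a - b for a, b in zip(virus1, virus2)]

d_bar = mean(d)                    # mean difference
s_d = stdev(d)                     # sample SD of the differences (n - 1 divisor)
se_d = s_d / math.sqrt(len(d))     # standard error of the mean difference
t_paired = d_bar / se_d

T_CRIT = 2.365  # tabled two-sided value for alpha = 0.05, df = 7
print(d, d_bar, round(s_d, 2), round(se_d, 2), round(t_paired, 2), t_paired > T_CRIT)
```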

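Paired t-test

Advantages

1) It is usually easier to detect a true small difference, because the standard error of the mean difference $s_{\bar{d}}$ is typically smaller than the two-sample standard error $s_{\bar{x}_1 - \bar{x}_2}$ when the paired measurements are positively correlated.
2) There is no need to consider whether the variances of the two populations are the same.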

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 steps to carry out a hypothesis test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, H0 is not rejected and the results are declared not statistically significant.
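The table method and the p-value method are two views of the same rule, which a short sketch can make concrete. To stay within the standard library it uses a z statistic (standard normal reference distribution) instead of the t-distribution discussed in the slides; that substitution, the function names and the example statistics are all assumptions of this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def decide_by_table(z_stat, alpha=0.05):
    """Table method: compare |z| with the two-sided critical value."""
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(z_stat) >= critical else "fail to reject H0"

def decide_by_p_value(z_stat, alpha=0.05):
    """p-value method: compare the two-sided p-value with alpha."""
    p = 2 * (1 - STD_NORMAL.cdf(abs(z_stat)))
    return "reject H0" if p < alpha else "fail to reject H0"

# Illustrative test statistics; both rules should agree on each one
decisions = [(z, decide_by_table(z), decide_by_p_value(z)) for z in (0.5, 1.17, 2.63, -3.0)]
for row in decisions:
    print(row)
```

Whichever rule is chosen, it should be applied consistently with the same α and the same one- or two-tailed setup.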

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (accused) and the judges. The judge forms a hypothesis as below before hearing a case.

Hypothesis
Ho: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and defendant and make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

Decision | NULL HYPOTHESIS TRUE | NULL HYPOTHESIS FALSE
Reject the null hypothesis | Type I error (α): rejecting a true null hypothesis | CORRECT
Fail to reject the null hypothesis | CORRECT | Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and it is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage | Chances of sending an innocent man to jail
20% | 1 in 5
5% | 1 in 20
1% | 1 in 100
0.01% | 1 in 10000
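The "1 in N" column is just the reciprocal of the Type I error probability. A small sketch of that conversion (the helper name `one_in` is hypothetical):

```python
def one_in(alpha):
    """Express a probability as '1 in N' odds by taking the reciprocal."""
    return round(1 / alpha)

# Percentages from the table, converted to probabilities first
chances = {pct: one_in(pct / 100) for pct in (20, 5, 1, 0.01)}
for pct, n in chances.items():
    print(f"{pct}% chance = 1 in {n}")
```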

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
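These trade-offs can be checked numerically. The sketch below assumes a one-sided z-test of H0: μ = 0 against HA: μ > 0 with a known standard deviation; the function name `type_ii_error`, the true effect of 0.5 and the sample sizes are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sd=1.0):
    """beta for a one-sided z-test when the true mean equals `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    # Probability the test statistic stays below the threshold under HA
    return z.cdf(z_crit - effect * math.sqrt(n) / sd)

baseline = type_ii_error(alpha=0.05, n=20)
larger_n = type_ii_error(alpha=0.05, n=40)        # fixed alpha, larger n
smaller_alpha = type_ii_error(alpha=0.01, n=20)   # fixed n, smaller alpha
larger_alpha = type_ii_error(alpha=0.10, n=20)    # fixed n, larger alpha
both_lower = type_ii_error(alpha=0.01, n=80)      # smaller alpha and larger n
print(baseline, larger_n, smaller_alpha, larger_alpha, both_lower)
```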

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
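The three probability sampling strategies can be sketched with the standard library. Everything here is illustrative: the population of (id, sex, clinic) records, the function names and the sampling fractions are hypothetical.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, fraction, rng):
    """Sample the same fraction independently within each stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = defaultdict(list)
    for unit in population:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

rng = random.Random(2023)
# Hypothetical population: (subject id, sex, clinic number)
population = [(i, "F" if i % 2 else "M", i % 5) for i in range(100)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda u: u[1], 0.1, rng)
clust = cluster_sample(population, lambda u: u[2], 2, rng)
print(len(srs), len(strat), len(clust))
```

Stratifying guarantees that each stratum is represented in proportion, while cluster sampling trades some precision for the convenience of sampling whole groups.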

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
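One way to see what "assuming chance alone" means is a permutation test: if treatment X has no effect, the group labels are arbitrary, so we can reshuffle them many times and count how often a difference at least as large as the observed one appears. This is a sketch of that idea, not a method used in the slides; the function name and the outcome values are hypothetical.

```python
import random

def permutation_p_value(treated, control, n_perm=10_000, seed=0):
    """Two-sided p-value for a difference in means under random relabeling."""
    rng = random.Random(seed)
    n_t = len(treated)
    observed = abs(sum(treated) / n_t - sum(control) / len(control))
    pooled = list(treated) + list(control)
    n_c = len(pooled) - n_t
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # chance alone decides who is "treated"
        diff = abs(sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / n_c)
        if diff >= observed:
            as_extreme += 1
    return as_extreme / n_perm

# Hypothetical outcome Y for clearly separated and for identical groups
p_separated = permutation_p_value([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

A small p-value says the observed difference would rarely arise from relabeling alone; it says nothing about other competing explanations such as confounding.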


Page 40: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS


Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-test

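We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

Two-sample t-test

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.73 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ t_{0.05,\,20} = 2.086 $$

Since |t| > t(0.05, 20), reject H0 and accept HA.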
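The pooled-variance calculation above can be checked with a short script. This is only a sketch: the function name `pooled_t` is made up for illustration, the summary figures are the ones from the Healthy vs. Cystic Fibrosis table, and the critical value 2.086 is copied from the slide rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common (pooled) variance."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Summary statistics from the slide (Healthy, Cystic Fibrosis)
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # tabled two-tailed t(0.05, df = 20), quoted from the slide
reject_h0 = abs(t) > T_CRIT

print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}")
print(f"t = {t:.2f} on {df} df")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Working from summary statistics keeps the sketch independent of the raw data, which the slides do not give.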

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
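The T-Value and DF in the MINITAB output can be reproduced from the summary table alone. The sketch below assumes the output uses the unequal-variance (Welch) form of the test, which is consistent with DF = 44 but is not stated on the slide. The function name `welch_t` is hypothetical.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    v1 = sd1**2 / n1          # squared standard error of mean 1
    v2 = sd2**2 / n2          # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

# East and West rows of the MINITAB summary table
t_stat, welch_df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t_stat:.2f}, df (truncated) = {math.floor(welch_df)}")
```

Truncating the approximate degrees of freedom to a whole number matches the DF reported on the slide.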

Paired t-test

A test for two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d} - \mu_d}{s_{\bar{d}}} = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

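Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

$$ s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57 $$

$$ s_d = \sqrt{18.57} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.632 > t_{0.05}(7) = 2.365 $$

Since t exceeds the critical value, H0 is rejected at the 0.05 level.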
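As a check on the hand calculation, the paired test can be run on the eight plants directly. This is a sketch using only the Python standard library. The variable names are illustrative, and the critical value 2.365 is taken from the slide. At full precision the statistic agrees with the slide to two decimals (the slide divides by the rounded standard error 1.52).

```python
import math
import statistics

# Number of dead spots per plant under each virus (from the table)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Per-plant differences d_j = X1j - X2j
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)

d_bar = statistics.mean(d)        # mean difference
s_d = statistics.stdev(d)         # sample SD of the differences (n - 1 divisor)
se_d_bar = s_d / math.sqrt(n)     # standard error of the mean difference
t_paired = d_bar / se_d_bar       # test statistic under H0: mu_d = 0

T_CRIT_7 = 2.365  # tabled two-tailed t(0.05, df = 7), quoted from the slide
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t_paired:.2f}, df = {n - 1}")
print("reject H0" if abs(t_paired) > T_CRIT_7 else "fail to reject H0")
```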

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference, because the standard error of the mean difference is typically smaller than that of two independent samples: $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$
2) There is no need to consider whether the variances of the two populations are the same.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
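The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only: the function names are made up, and the inputs are figures quoted in the two-sample examples earlier in the deck (t = 2.63 against the tabled 2.086, and the MINITAB p-value of 0.660).

```python
def decide_by_table(t_value, t_critical):
    """Table method: compare |t| with the tabled critical value."""
    return "reject H0" if abs(t_value) >= t_critical else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Healthy vs. Cystic Fibrosis example: t = 2.63, tabled t(0.05, 20) = 2.086
table_decision = decide_by_table(2.63, 2.086)

# Real-estate example: p-value = 0.660 at alpha = 0.05
p_decision = decide_by_p_value(0.660, alpha=0.05)

print(table_decision)
print(p_decision)
```

For the same test and the same α, both rules lead to the same decision. The warning not to mix them up is about comparing like with like (a statistic against a critical value, a p-value against α).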

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not significantly strong enough to prove the defendant guilty.
H1: The evidence is significantly strong enough to prove the defendant guilty.

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                      FALSE
Reject the null hypothesis            Type I error (α):         CORRECT
                                      rejecting a true
                                      null hypothesis
Fail to reject the null hypothesis    CORRECT                   Type II error (β):
                                                                failing to reject a
                                                                false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under-reacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
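The trade-off between α, β and n can be seen numerically. The sketch below is an illustration under assumptions that are not in the slides: a one-sided z-test of H0: μ = 0 against a specific alternative μ = 0.5, with known standard deviation σ = 1. The function name `type_ii_error` is hypothetical.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """beta for a one-sided z-test of H0: mu = 0 vs HA: mu = effect > 0.

    Assumes a known sigma (illustrative setup, not taken from the slides).
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)         # rejection threshold under H0
    shift = effect * n ** 0.5 / sigma     # mean of the z statistic under HA
    return z.cdf(z_crit - shift)          # P(fail to reject | HA true)

for alpha in (0.10, 0.05, 0.01):
    for n in (10, 40):
        beta = type_ii_error(alpha, n)
        print(f"alpha = {alpha:.2f}, n = {n:2d}: beta = {beta:.3f}")
```

Reading the printed rows for a fixed α shows β falling as n grows. Reading them for a fixed n shows β rising as α shrinks.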

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
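The three probability sampling strategies named above can be sketched with the standard library. Everything in this example is hypothetical: the population of 100 numbered subjects, the "east"/"west" strata, the ten "ward" clusters, and the function names are made up for illustration.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of entering the sample."""
    return rng.sample(population, k)

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(2023)  # fixed seed so the sketch is repeatable
population = list(range(100))
strata = {"east": list(range(0, 60)), "west": list(range(60, 100))}
clusters = {f"ward{i}": list(range(i * 10, (i + 1) * 10)) for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), len(strat), len(clus))
```

In the stratified draw each stratum contributes in proportion to its size (6 from "east", 4 from "west"), which is one common allocation rule among several.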

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value: the probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                 Reality ->
Decision         H0 is true                       H0 is false
Retain H0        Correct Decision                 Type II Error (β) (0.2, 0.1)
Reject H0        Type I Error (α) (0.05, 0.01)    Correct Decision

Hypothesis testing

- Power analysis is used to try to minimize Type II errors.
- Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
- Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
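A rough sample-size calculation of the kind described above can be sketched as follows. This uses the common normal-approximation formula for comparing two group means with a two-sided test, n per group = 2(z(1-α/2) + z(1-β))² σ²/δ². The formula is a swapped-in textbook approximation, not something given in the slides, and the function name and example effect sizes are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.2):
    """Approximate subjects per group to detect a mean difference delta.

    Normal approximation for a two-sided, two-group comparison of means
    with common standard deviation sigma (illustrative assumption).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Expected effect equal to one standard deviation, then half of one
n_large_effect = n_per_group(delta=1.0, sigma=1.0)
n_small_effect = n_per_group(delta=0.5, sigma=1.0)
print(n_large_effect, n_small_effect)
```

Halving the expected effect roughly quadruples the required sample size, which is why the expected effect has to be specified before the study is planned.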

Point estimation

- Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
- Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
- If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

- Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
- The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.
- The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

- Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
- Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
- So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
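To illustrate how an interval carries the test result with it, the sketch below builds a 95% confidence interval for the Healthy vs. Cystic Fibrosis difference from the earlier two-sample example. The standard error 2.66 and the critical value 2.086 are the figures quoted there, and the function name is made up.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Interval: estimate plus or minus critical value times standard error."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin

# Difference in means 18.9 - 11.9, SE = 2.66, tabled t(0.05, 20) = 2.086
lower, upper = confidence_interval(18.9 - 11.9, 2.66, 2.086)

# Zero outside the interval corresponds to rejecting H0 at the 5% level
zero_in_interval = lower <= 0.0 <= upper
print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains zero: {zero_in_interval}")
```

The interval excludes zero, which agrees with the earlier decision to reject H0, and it also shows the range of plausible sizes for the difference.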

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
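The selection rules above amount to a small decision function. The sketch below covers only the continuous-outcome cases listed on these slides. The function and parameter names are made up for illustration.

```python
def choose_test(n_groups, n_independent_vars=1, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the slide's rules."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare means")
    if n_independent_vars == 1:
        # One independent variable: the number of groups decides
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if any_continuous_iv:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(2))
print(choose_test(4))
print(choose_test(2, n_independent_vars=2, any_continuous_iv=True))
```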

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one tailed
- Otherwise, two tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size - Number of parameters estimated
- Df is n-1 for one sample test of mean

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypothesis about number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats: summarize measures of central tendency and variability
- Inferential: determine how likely it is that results based on the sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
  - Pearson r, t tests and ANOVA are examples of parametric statistics
  - Pearson r: measures the relationship or association between 2 variables
- T test: determines if there is a significant difference between 2 group means
- ANOVA: determines if there is a significant difference between 3 or more means
- χ²: non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


One-sample t-test

Cholesterol Example: Suppose the population mean μ is 211.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test, we use the critical value determined from the t(24) distribution:

$$ t_{0.05,\,24} = 2.064 $$

$$ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17 $$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
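The cholesterol example can be verified in a few lines. This is a sketch: the function name `one_sample_t` is made up, the inputs are the slide's summary figures (sample mean 220, hypothesized mean 211, s = 38.6, n = 25), and the critical value 2.064 is copied from the slide.

```python
import math

def one_sample_t(sample_mean, mu0, sd, n):
    """One-sample t statistic and its degrees of freedom (n - 1)."""
    std_error = sd / math.sqrt(n)
    return (sample_mean - mu0) / std_error, n - 1

t_value, df = one_sample_t(220, 211, 38.6, 25)

T_CRIT_24 = 2.064  # tabled two-tailed t(0.05, df = 24), quoted from the slide
decision = "reject H0" if abs(t_value) >= T_CRIT_24 else "fail to reject H0"
print(f"t = {t_value:.2f} on {df} df: {decision}")
```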

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.



Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II Errors (Alpha and Beta)

                                     Null hypothesis is TRUE             Null hypothesis is FALSE
Reject the null hypothesis           Type I error (α):                   CORRECT
                                     rejecting a true null hypothesis
Fail to reject the null hypothesis   CORRECT                             Type II error (β):
                                                                         failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong            Chances of sending an
(Probability of Type I Error)          innocent man to jail
20%                                    1 in 5
5%                                     1 in 20
1%                                     1 in 100
0.01%                                  1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
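The trade-off between α, β and n can be checked numerically. The sketch below is only an illustration under assumed conditions that are not part of the slides: a one-sided z-test with known standard deviation, a hypothetical true effect of 0.5 standard deviations, and a helper name (`type_ii_error`) chosen for this example.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 against a true mean of `effect` > 0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # rejection cut-off on the z scale
    shift = effect * n ** 0.5 / sigma    # mean of the z statistic when HA is true
    return z.cdf(z_crit - shift)         # P(fail to reject H0 | HA is true)

# Hypothetical settings used only to show the direction of each change
beta_small_n = type_ii_error(alpha=0.05, n=25, effect=0.5)
beta_large_n = type_ii_error(alpha=0.05, n=50, effect=0.5)   # larger n, same alpha
beta_strict = type_ii_error(alpha=0.01, n=25, effect=0.5)    # smaller alpha, same n

print(f"alpha=0.05, n=25: beta = {beta_small_n:.3f}")
print(f"alpha=0.05, n=50: beta = {beta_large_n:.3f}")
print(f"alpha=0.01, n=25: beta = {beta_strict:.3f}")
```

With these assumed inputs, β falls when n grows at fixed α and rises when α is tightened at fixed n, which is the pattern stated above.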

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. Several issues need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.
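One way to make "assuming chance alone" concrete is an exact permutation test: if the treatment has no effect, every way of splitting the pooled values into two groups is equally likely. The sketch below is an illustration only; the eight outcome values are hypothetical and the permutation approach is swapped in for illustration, it is not a method taken from the slides.

```python
from itertools import combinations

# Hypothetical outcome values, invented for this illustration
treatment = [12, 15, 14, 16]
control = [10, 9, 11, 13]

def mean(values):
    return sum(values) / len(values)

observed = mean(treatment) - mean(control)
pooled = treatment + control
n_treat = len(treatment)

extreme = 0
total = 0
# Enumerate every possible assignment of the pooled values to "treatment"
for chosen in combinations(range(len(pooled)), n_treat):
    chosen_set = set(chosen)
    group_a = [pooled[i] for i in chosen_set]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
    diff = mean(group_a) - mean(group_b)
    total += 1
    # Count splits at least as extreme as the observed one (two-sided)
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / total
print(f"observed difference = {observed}, p-value = {extreme}/{total} = {p_value:.3f}")
```

Here the p-value is simply the fraction of equally likely splits that produce a difference as large as the one observed.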

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision \ Reality   H0 is true                       H0 is false
Retain H0            Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0            Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
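A sample-size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The formula choice, the function name `n_per_group` and the numbers (expected difference 5, standard deviation 10) are assumptions made for this illustration, not values from the slides.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.2):
    """Approximate sample size per group for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the target power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values: expected difference 5, standard deviation 10
n = n_per_group(delta=5.0, sigma=10.0)
print(f"about {n} subjects per group for alpha = 0.05 and beta = 0.2")
```

Because this uses the normal rather than the t distribution, it tends to understate the required n slightly for small studies.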

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
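As a rough illustration of reading a test off a confidence interval, the sketch below builds a large-sample 95% interval as estimate ± z × standard error and checks whether zero lies inside it. The estimate (4.0) and standard error (1.5) are hypothetical, and the normal approximation is an assumption; small samples would call for a t critical value instead.

```python
from statistics import NormalDist

def normal_ci(estimate, std_error, confidence=0.95):
    """Large-sample confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical estimated treatment effect and its standard error
lower, upper = normal_ci(estimate=4.0, std_error=1.5)
contains_zero = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print("zero inside the interval:", contains_zero)
```

If zero is outside the interval, the matching two-sided test at the 5% level would reject the null hypothesis of no effect; the interval additionally shows which effect sizes remain plausible.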

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
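The selection rules above can be written as a small lookup. This is only a sketch of the slide's rules for a continuous outcome; the function name and its parameters are hypothetical and invented for the example.

```python
def choose_test(n_predictors, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_predictors < 1:
        raise ValueError("need at least one independent variable")
    if n_predictors == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```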

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated. Df is n-1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric statistics are the two types; choosing between them requires checking assumptions.

Pearson r, t-tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

One-sample t-test

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null hypothesis states equality (for example H0: μ = μ0) and the alternative hypothesis states inequality (HA: μ ≠ μ0).

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Cholesterol Example: Suppose the population mean μ is 211.

$\bar{x}$ = 220 mg/ml, s = 38.6 mg/ml, n = 25 (town)

H0: μ = 211 mg/ml

HA: μ ≠ 211 mg/ml

For an α = 0.05 test we use the critical value determined from the t(24) distribution:

$$t_{0.05,\,24} = 2.064$$

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{220 - 211}{38.6/\sqrt{25}} = 1.17$$

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
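The cholesterol calculation can be reproduced in a few lines. This is a minimal sketch using only the summary figures quoted in the example; the critical value 2.064 is taken from the text rather than computed, and the variable names are chosen for this illustration.

```python
from math import sqrt

# Summary figures from the cholesterol example
x_bar, mu_0, s, n = 220.0, 211.0, 38.6, 25
t_crit = 2.064                      # two-sided critical value for t(24), alpha = 0.05 (from the text)

std_error = s / sqrt(n)             # standard error of the sample mean
t_stat = (x_bar - mu_0) / std_error
df = n - 1
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```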

Hypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error).

We observe a sample and infer information about the population.

If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.

Two-sample t-test

Degree of Freedom (df)

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

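Two-sample t-test

$$t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

Since |t| = 2.63 > $t_{0.05,\,20}$ = 2.086, we reject H0 and accept HA.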
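The pooled-variance calculation for the healthy and cystic fibrosis groups can be checked with a short script. This is a sketch from the summary statistics only; the critical value 2.086 is copied from the example rather than computed, and the variable names are chosen for this illustration.

```python
from math import sqrt

# Summary statistics from the example (group 1 = healthy, group 2 = cystic fibrosis)
n1, mean1, sd1 = 9, 18.9, 5.9
n2, mean2, sd2 = 13, 11.9, 6.3
t_crit = 2.086                                   # t(20), two-sided alpha = 0.05 (from the text)

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the mean difference
t_stat = (mean1 - mean2) / se_diff
reject_h0 = abs(t_stat) > t_crit

print(f"pooled variance = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```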

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: select SalePrice
Subscripts: select Location
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West)]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26441
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The estimated difference in mean sale prices is -$26,441 (East minus West). We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
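The T-value and DF in the output above are consistent with an unequal-variance (Welch) two-sample t-test; that reading is an inference from the reported numbers, not something the output states. The sketch below recomputes both from the summary table alone; the variable names are chosen for this illustration.

```python
from math import sqrt

# Summary statistics from the output (East, West)
n_east, mean_east, sd_east = 34, 312702.0, 188163.0
n_west, mean_west, sd_west = 26, 339143.0, 255571.0

var_east = sd_east ** 2 / n_east          # squared standard error of each group mean
var_west = sd_west ** 2 / n_west

diff = mean_east - mean_west
se_diff = sqrt(var_east + var_west)
t_stat = diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df = (var_east + var_west) ** 2 / (
    var_east ** 2 / (n_east - 1) + var_west ** 2 / (n_west - 1)
)

print(f"difference = {diff:.0f}, t = {t_stat:.2f}, df = {int(welch_df)}")
```

Truncating the approximate degrees of freedom to a whole number reproduces the DF shown in the output.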

Paired t-test

A test for two sets of observations that are strongly comparable (e.g. two treatments applied to the same individuals, one individual measured before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t(n - 1)$$

H0: $\mu_d = 0$    HA: $\mu_d \neq 0$

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: $\mu_d = 0$    HA: $\mu_d \neq 0$

$$\bar{d} = 4$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365, \qquad P < 0.05$$
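The paired calculation can be verified directly from the eight pairs of counts. This is a minimal sketch; the critical value 2.365 for t(7) is copied from the example rather than computed, and the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Counts per plant from the tobacco leaf example
virus_1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus_2 = [10, 11, 18, 14, 6, 7, 17, 5]
t_crit = 2.365                                # t(7), two-sided alpha = 0.05 (from the text)

diffs = [a - b for a, b in zip(virus_1, virus_2)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                            # sample standard deviation (n - 1 in the denominator)
se_d = s_d / sqrt(n)                          # standard error of the mean difference
t_stat = d_bar / se_d
reject_h0 = abs(t_stat) > t_crit

print(f"mean difference = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t_stat:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```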

Paired t-test

Advantages
1) It is usually easier to detect a small true difference.
2) There is no need to consider whether the variances of the two populations are the same.

22

21 xxd ss

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed: if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
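The table and p-value decision rules can be written as two small functions. The function names are hypothetical, chosen for this sketch; the inputs in the usage lines are the figures reported in the worked examples of this deck.

```python
def decide_by_table(t_stat, t_crit):
    """Table method: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 only when the p-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Figures reported in the worked examples
print(decide_by_table(t_stat=1.17, t_crit=2.064))    # one-sample cholesterol example
print(decide_by_table(t_stat=2.63, t_crit=2.086))    # pooled two-sample example
print(decide_by_p_value(p_value=0.660, alpha=0.05))  # sale price example
```

Used on the same test at the same α, the two rules lead to the same decision, which is why only one of them should be applied.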

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testing

Decision      Reality: H0 is true             Reality: H0 is false
Retain H0     Correct decision                Type II error (β) (.2, .1)
Reject H0     Type I error (α) (.05, .01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
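As a rough illustration of such a calculation, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)^2. The formula, the function name `n_per_group`, and the example effect size and standard deviation are assumptions made for this sketch; the slides do not say which method they have in mind.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample comparison of means.

    delta: expected difference in means (the specified effect)
    sigma: assumed common standard deviation of the outcome
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)        # quantile matching power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values: expected effect of 5 units, SD of 10 units.
print(n_per_group(delta=5, sigma=10))             # beta = .2
print(n_per_group(delta=5, sigma=10, beta=0.10))  # stricter beta needs more subjects
```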

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
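A minimal sketch of a 95% confidence interval for a single mean, using the cholesterol figures that appear in this deck's one-sample example (mean 220, s = 38.6, n = 25, tabled t value 2.064 for 24 degrees of freedom). The function name `mean_confidence_interval` is hypothetical, and the critical value is passed in from a table rather than computed.

```python
from math import sqrt

def mean_confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) for mean +/- t_critical * sd / sqrt(n)."""
    margin = t_critical * sd / sqrt(n)
    return mean - margin, mean + margin

# 2.064 is the tabled two-sided 5% critical value for t(24).
lower, upper = mean_confidence_interval(mean=220, sd=38.6, n=25, t_critical=2.064)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
print("hypothesized mean 211 inside interval:", lower <= 211 <= upper)
```

Because the hypothesized value 211 falls inside the interval, the interval leads to the same decision as the two-sided test at α = 0.05, while also showing the range of plausible means.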

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
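The selection rules above can be written as a small lookup. This is only a sketch of the slide's rules for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules on the slide."""
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, any_continuous_predictor=True))
```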

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise, two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = sample size - number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ2): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct test
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ2 is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the slide's graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for expected results is specified. The null and alternative hypotheses take the form H0: μ = μ0 versus HA: μ ≠ μ0.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

One-sample t-test

Cholesterol example: suppose the population mean μ is 211 mg/ml. A sample from a town gives

$\bar{x} = 220$ mg/ml, $s = 38.6$ mg/ml, $n = 25$

H0: μ = 211 mg/ml
HA: μ ≠ 211 mg/ml

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{220 - 211}{38.6 / \sqrt{25}} = 1.17$

For an α = 0.05 test we use the critical value determined from the t(24) distribution: $t_{0.05}(24) = 2.064$.

Since |t| = 1.17 < 2.064 (table t) at the α = 0.05 level, we fail to reject H0.

The difference is not statistically significant.
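The cholesterol calculation can be reproduced in a few lines. This is a minimal sketch: the function name `one_sample_t` is hypothetical, and the critical value 2.064 is taken from the table as on the slide rather than computed.

```python
from math import sqrt

def one_sample_t(sample_mean, mu0, sd, n):
    """t statistic for H0: mu = mu0, computed from summary statistics."""
    return (sample_mean - mu0) / (sd / sqrt(n))

t_stat = one_sample_t(sample_mean=220, mu0=211, sd=38.6, n=25)
T_CRITICAL = 2.064  # tabled two-sided 5% value for t(24), as quoted on the slide
reject = abs(t_stat) >= T_CRITICAL

print(f"t = {t_stat:.2f}, df = {25 - 1}")
print("reject H0" if reject else "fail to reject H0")
```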

Hypothesis Testing

- We set a standard beyond which results would be rare (outside the expected sampling error).
- We observe a sample and infer information about the population.
- If the observation is outside the standard, we reject the hypothesis that the sample is representative of the population.


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.74$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.74 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05}(20) = 2.086$

Since $|t| > t_{0.05}(20)$, reject H0 and accept HA.
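The pooled-variance calculation for the Healthy versus Cystic Fibrosis comparison can be checked from the summary statistics alone. This is a minimal sketch: the function name `pooled_t` is hypothetical, and the critical value 2.086 is the tabled value quoted on the slide.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean1 - mean2) / se, df, pooled_var, se

t_stat, df, pooled_var, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRITICAL = 2.086  # tabled two-sided 5% value for t(20)

print(f"pooled variance = {pooled_var:.2f}, SE = {se:.2f}")
print(f"t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```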

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose "Samples in one column"

Samples: select SalePrice

Subscripts: select Location

Click Graphs; check "Boxplots of data"; OK

Click Options

Confidence level: enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice. x-axis: Location (West, East); y-axis: Sale Price, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10. Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
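The T-Value and DF in this output can be approximately reproduced from the summary table. The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests the unpooled form of the test with Welch-Satterthwaite degrees of freedom; that reading is an inference from the numbers, not something the slide states. The function name `welch_t` is hypothetical.

```python
from math import sqrt

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    se = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df, se

t_stat, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t_stat:.2f}, df = {int(df)} (truncated), SE of difference = {se:.0f}")
```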

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$\bar{d} = \frac{\sum d_j}{n}, \quad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \quad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

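Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$s_d = \sqrt{\frac{\sum d_j^2 - (\sum d_j)^2 / n}{n - 1}} = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$

$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$, so H0 is rejected.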
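The paired calculation for the tobacco-leaf data can be verified directly from the eight pairs of counts. This is a minimal sketch using Python's standard library; the variable names are illustrative, and the critical value 2.365 is the tabled value for t(7) quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]  # per-plant differences d_j
d_bar = mean(d)
s_d = stdev(d)             # sample standard deviation, divisor n - 1
se = s_d / sqrt(len(d))    # standard error of the mean difference
t_stat = d_bar / se

T_CRITICAL = 2.365  # tabled two-sided 5% value for t(7)
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```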

Paired t-test

Advantages

1) It is usually easier to detect a true small difference, because for positively correlated pairs typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Table method (rule 1): Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Graph method (rule 2): Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

p-value method (rule 3): Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
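The table method and the p-value method can each be written as a one-line rule. The sketch below applies them to numbers quoted elsewhere in this deck; the function names `decide_by_table` and `decide_by_p_value` are hypothetical.

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(test_statistic) >= critical_value else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject when the p-value is below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide_by_table(1.17, 2.064))    # cholesterol example, t(24)
print(decide_by_table(2.63, 2.086))    # healthy vs cystic fibrosis example, t(20)
print(decide_by_p_value(0.660, 0.05))  # sale price example
```

For the same test and the same α, the two rules lead to the same decision; the warning not to mix them up is about not comparing a test statistic with α or a p-value with a critical value.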

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      Null hypothesis TRUE    Null hypothesis FALSE
Reject the null hypothesis            Type I error (α)        CORRECT
Fail to reject the null hypothesis    CORRECT                 Type II error (β)

Type I error (α): rejecting a true null hypothesis
Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
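These trade-offs can be checked numerically. The sketch below assumes a one-sided z-test of H0: μ = μ0 against μ > μ0 with known standard deviation, for which β = Φ(z_{1-α} - δ√n/σ). That test, the function name `type_ii_error`, and the example numbers are assumptions chosen for illustration, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, n, delta, sigma):
    """Beta for a one-sided z-test when the true mean exceeds mu0 by delta."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    return z.cdf(z_crit - delta * sqrt(n) / sigma)

# Hypothetical setting: true shift of 5 units, standard deviation of 20 units.
for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25), (0.01, 100)]:
    print(f"alpha = {alpha}, n = {n}: beta = {type_ii_error(alpha, n, 5, 20):.3f}")
```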

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
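The three probability sampling strategies can be sketched with Python's `random` module. The function names, the patient IDs, the odd/even strata, and the clinic clusters below are all hypothetical and serve only to show how the selection mechanisms differ.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subset of size k is equally likely to be chosen."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Draw a simple random sample of fixed size inside each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return [u for members in strata.values() for u in rng.sample(members, k_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for name in chosen for u in clusters[name]]

rng = random.Random(42)
patients = list(range(1, 21))  # hypothetical patient IDs
srs = simple_random_sample(patients, 5, rng)
strat = stratified_sample(patients, lambda pid: "odd" if pid % 2 else "even", 3, rng)
clinics = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}  # hypothetical clusters
clus = cluster_sample(clinics, 2, rng)
print(srs, strat, clus, sep="\n")
```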

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference.


Cholesterol Example Suppose population mean μ is 211

= 220 mgml s = 386 mgml n = 25 (town)

H0 m = 211 mgml

HA m sup1 211 mgml

For an a = 005 test we use the critical value determined from the t(24) distribution

Since |t| = 117 lt 2064 (table t) at the a = 005 level

We fail to reject H0

The difference is not statistically significant

x

17125638

211220

0

ns

Xt

064224050 t

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

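Two-sample t-test

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05}(20) = 2.086$$

$$|t| > t_{0.05}(20): \text{reject } H_0, \text{ accept } H_A$$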
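The hand calculation above can be checked with a few lines of Python. This is only a sketch working from the summary statistics on the slide; the function name `pooled_t` is made up for illustration, and the critical value 2.086 is copied from the slide rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Healthy vs. cystic fibrosis summary data from the slide
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # two-tailed t_0.05(20), value taken from the slide
print(f"s^2 = {pooled_var:.1f}, SE = {se_diff:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Working from summary statistics keeps the sketch close to the slide; with raw data the same quantities would be computed from the two samples first.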

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval -$126602 < μ (East) - μ (West) < $73719.

p-Value = 0.660 > α = 0.05 and 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
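The reported DF = 44 (rather than 34 + 26 - 2 = 58) suggests the output comes from the unequal-variance (Welch) form of the test. The sketch below reproduces the T-value, DF and interval from the summary table under that assumption. The function name `welch_t` is illustrative, and the critical value 1.680 for a 90% interval with 44 df is a table value supplied here, not a number from the slides.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic and approximate df from summary statistics."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors of each mean
    se = math.sqrt(v1 + v2)                    # standard error of the difference
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

# East vs. West summary data from the output above
t_stat, df_welch, se_welch = welch_t(34, 312702, 188163, 26, 339143, 255571)

T_CRIT_90 = 1.680  # assumed table value: t_0.05 with 44 df (90% two-sided interval)
diff = 312702 - 339143
ci_low = diff - T_CRIT_90 * se_welch
ci_high = diff + T_CRIT_90 * se_welch

print(f"t = {t_stat:.2f}, df = {int(df_welch)}")
print(f"90% CI: ({ci_low:.0f}, {ci_high:.0f})")
```

Small differences from the printed interval are expected, because the table shows rounded means.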

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

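Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, H0 is rejected at the 5% level.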
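The paired calculation can be reproduced directly from the plant counts. A minimal sketch using only the Python standard library; the variable names are illustrative and the critical value 2.365 is copied from the slide.

```python
import math
from statistics import mean, stdev

# Counts per plant for the two viruses, from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Per-plant differences d_j = X1j - X2j
d = [a - b for a, b in zip(virus1, virus2)]
n = len(d)

d_bar = mean(d)             # mean difference
s_d = stdev(d)              # sample standard deviation of the differences (n - 1 divisor)
se_d = s_d / math.sqrt(n)   # standard error of the mean difference
t_paired = d_bar / se_d     # compared against t with n - 1 degrees of freedom

T_CRIT_7 = 2.365  # two-tailed t_0.05(7), value taken from the slide
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t_paired:.2f}")
print("reject H0" if abs(t_paired) > T_CRIT_7 else "fail to reject H0")
```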

Paired t-test

Advantages
1) It is usually easier to detect a true small difference, since typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      Reject the null hypothesis if the test statistic falls in the critical region.
      Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
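The table method and the p-value method lead to the same decision, which is why they should not be mixed. The sketch below illustrates this for a two-sided test, using the standard normal distribution in place of the t-distribution (an assumption made here so that only the Python standard library is needed). The function name `two_sided_decisions` is illustrative.

```python
from statistics import NormalDist

def two_sided_decisions(z, alpha=0.05):
    """Apply the table rule and the p-value rule to a two-sided z statistic."""
    std_normal = NormalDist()
    # Table method: compare |z| with the critical value for alpha/2 in each tail
    critical = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_table = abs(z) >= critical
    # p-value method: probability of a statistic at least as extreme as |z| under H0
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_p = p_value < alpha
    return critical, p_value, reject_by_table, reject_by_p

for z in (2.63, -0.44):
    critical, p_value, by_table, by_p = two_sided_decisions(z)
    print(f"z = {z}: critical = {critical:.2f}, p = {p_value:.3f}, "
          f"table rule rejects: {by_table}, p-value rule rejects: {by_p}")
```

For the small samples in the worked examples the t-distribution gives somewhat larger critical values and p-values than this normal approximation.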


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not significantly strong enough to prove the defendant guilty
H1: The evidence is significantly strong enough to prove the defendant guilty

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                                FALSE
Reject the null hypothesis            Type I error (α):                   CORRECT
                                      rejecting a true null hypothesis
Fail to reject the null hypothesis    CORRECT                             Type II error (β):
                                                                          failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000
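The "chance of getting it wrong" can be seen by simulation: when the null hypothesis is true, a test run at level α should reject in roughly a fraction α of repeated studies. The sketch below assumes a two-sided z-test on normally distributed data with known standard deviation 1. The function name, sample size, number of trials and seed are all illustrative choices.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, trials=4000, seed=1):
    """Estimate how often a two-sided z-test rejects when H0 (mean = 0) is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the null population: mean 0, standard deviation 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)   # z statistic with known sigma = 1
        if abs(z) > critical:
            rejections += 1                    # a Type I error: H0 is true but rejected
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha}: observed rejection rate = {type_i_error_rate(alpha):.3f}")
```

The observed rates fluctuate around α from run to run; they are estimates, not exact values.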

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
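These relationships can be checked numerically. The sketch below assumes a one-sided z-test with known standard deviation and a true effect of 0.5 standard deviations, in which case β = Φ(z_{1-α} - effect·√n). The function name `beta_one_sided` and the effect size are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def beta_one_sided(alpha, n, effect=0.5):
    """Type II error probability of a one-sided z-test.

    effect is the true mean shift in standard-deviation units (assumed value).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value for the chosen alpha
    # Probability that the statistic stays below the critical value although H0 is false
    return std_normal.cdf(z_alpha - effect * math.sqrt(n))

# Fixed alpha: a larger n lowers beta
print(beta_one_sided(0.05, 20), beta_one_sided(0.05, 40))
# Fixed n: a smaller alpha raises beta, a larger alpha lowers it
print(beta_one_sided(0.01, 20), beta_one_sided(0.05, 20), beta_one_sided(0.10, 20))
```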

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
   Simple random sampling
   Stratified sampling
   Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome.
   Chance
   Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality ->
Decision      H0 is true                        H0 is false
Retain H0     Correct Decision                  Type II Error (β) (.2, .1)
Reject H0     Type I Error (α) (.05, .01)       Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
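As a rough illustration of such a power calculation, the sketch below uses the normal-approximation formula n = ((z_{1-α/2} + z_{1-β}) / effect)² for a two-sided, one-sample test, where the effect is expressed in standard-deviation units. The formula choice, the function name `sample_size` and the example effect of 0.5 are assumptions made here, not values from the slides.

```python
import math
from statistics import NormalDist

def sample_size(effect, alpha=0.05, beta=0.2):
    """Approximate n for a two-sided one-sample z-test.

    effect: expected effect in standard-deviation units (assumed by the analyst).
    beta:   acceptable Type II error probability, so power = 1 - beta.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(1 - beta)
    return math.ceil(((z_alpha + z_beta) / effect) ** 2)

print(sample_size(0.5))             # beta = .2 (80% power)
print(sample_size(0.5, beta=0.1))   # beta = .1 (90% power)
```

A t-based calculation, as dedicated power software performs, would give slightly larger sample sizes.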

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95% and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
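The link between an interval and a test can be illustrated with the two worked examples from this deck. The sketch below builds a 95% interval for the paired virus data (mean difference 4, s_d² = 130/7, n = 8, critical value 2.365) and reuses the 90% interval printed in the MINITAB output. The helper name `contains_zero` is illustrative.

```python
import math

def contains_zero(interval):
    """True when zero is a plausible value of the effect, i.e. H0 is not rejected."""
    low, high = interval
    return low <= 0 <= high

# Paired t-test example: d_bar = 4, s_d^2 = 130/7, n = 8, t_0.05(7) = 2.365
d_bar, s_d, n, t_crit = 4, math.sqrt(130 / 7), 8, 2.365
half_width = t_crit * s_d / math.sqrt(n)
paired_ci = (d_bar - half_width, d_bar + half_width)

# Two-sample real-estate example: 90% interval as printed in the output
real_estate_ci = (-126602, 73719)

print(f"paired 95% CI: ({paired_ci[0]:.2f}, {paired_ci[1]:.2f}), "
      f"contains zero: {contains_zero(paired_ci)}")
print(f"real-estate 90% CI: {real_estate_ci}, "
      f"contains zero: {contains_zero(real_estate_ci)}")
```

The paired interval excludes zero, matching the rejection of H0 in that example, while the real-estate interval includes zero, matching the failure to reject.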

Choosing the right test

Typically one is interested in comparing group means.
If the outcome is continuous and there is one independent variable:
   Two groups: t-test
   Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
   ANOVA if all independent variables are categorical
   ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter

df = sample size - number of parameters estimated
df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares the means of two groups
   interval/ratio level of measurement
   independent samples t-test
   dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
   interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
   nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
   Pearson r, t-tests and ANOVA are examples of parametric statistics.
   Pearson r measures the relationship or association between 2 variables.

The t-test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups, i.e. matched on everything but the treatment (the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 45: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing

We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population

0

2

4

6

8

10

109

020

80

307

040

60

505

060

40

703

080

20

901

099

01

0

2

4

6

8

10

0

2

4

6

8

10

One-sample t-testOne-sample t-testOne-sample t-testOne-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           Chance: 1 in 5
5%            Chance: 1 in 20
1%            Chance: 1 in 100
0.01%         Chance: 1 in 10000
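The conversion in the table above is simple arithmetic: a probability α corresponds to odds of 1 in 1/α. A minimal Python sketch (the function name is only illustrative, not part of the course material):

```python
def one_in_n(alpha: float) -> int:
    """Convert a Type I error probability into '1 in N' odds."""
    return round(1 / alpha)


# The four probabilities listed in the table, written as proportions.
for alpha in (0.20, 0.05, 0.01, 0.0001):
    print(f"{alpha:.2%} chance of getting it wrong = 1 in {one_in_n(alpha)}")
```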

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
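These relationships can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-sided z-test of H0: μ = 0 with known standard deviation, and a hypothetical true effect of 0.5. None of these values come from the slides.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability for a one-sided z-test of H0: mu = 0.

    `effect` (the true mean under HA) and `sigma` are hypothetical values
    chosen only to illustrate how alpha, beta, and n move together.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection cutoff on the z scale
    return z.cdf(critical - effect * n ** 0.5 / sigma)


# Fixed alpha: a larger n gives a smaller beta.
print(beta_one_sided_z(0.05, 20), beta_one_sided_z(0.05, 40))
# Fixed n: a smaller alpha gives a larger beta.
print(beta_one_sided_z(0.05, 20), beta_one_sided_z(0.01, 20))
```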

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
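The three probability sampling strategies can be sketched in a few lines of Python. The population of 100 subject IDs and the three "wards" used as strata and clusters are hypothetical, made up purely for illustration.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subject IDs
wards = {"A": list(range(1, 41)),
         "B": list(range(41, 71)),
         "C": list(range(71, 101))}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(wards, 3, rng)
clus = cluster_sample(wards, 2, rng)
```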

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X, and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one at least as extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
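One way to make "assuming chance alone" concrete is an exact permutation test: if treatment had no effect, every way of splitting the pooled outcomes into two groups is equally likely, and the p-value is the share of splits with a difference at least as large as the one observed. The sketch below is an illustration only; the two small data sets are hypothetical, and this is not necessarily the test the course uses.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = treated + control
    n_treated, n_control = len(treated), len(control)
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = count = 0
    # Enumerate every possible relabelling of the pooled outcomes.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / count


treated = [12, 14, 15, 17, 18]  # hypothetical outcomes under treatment X
control = [8, 9, 11, 12, 13]    # hypothetical outcomes under control
p = permutation_p_value(treated, control)
print(f"p-value = {p:.4f}")
```

A small p-value here says only that chance alone rarely produces a difference this large; it says nothing about other competing explanations.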

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                    Reality →
Decision            H0 is true                       H0 is false
Retain H0           Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0           Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
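As a rough sketch of such a calculation, the function below uses the standard normal-approximation formula for comparing two group means, n = 2((z_{1-α/2} + z_{1-β})σ/δ)² per group. The formula choice, the function name, and the example effect size are assumptions for illustration; dedicated power software based on the t distribution gives slightly larger answers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-group comparison.

    effect: expected difference in means (specified by the investigator)
    sigma:  assumed common standard deviation
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: effect of half a standard deviation.
print(n_per_group(effect=0.5, sigma=1.0))             # beta = 0.20
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))  # stricter beta needs more subjects
```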

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
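The three influences on interval size can be seen in the half-width of a normal-approximation interval for a mean, z × sd / √n. This is a sketch under that large-sample assumption (with small samples a t critical value would replace z); the numbers are hypothetical.

```python
from statistics import NormalDist


def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / n ** 0.5


print(ci_half_width(sd=10, n=100))                   # baseline
print(ci_half_width(sd=10, n=400))                   # larger sample: narrower
print(ci_half_width(sd=20, n=100))                   # more variability: wider
print(ci_half_width(sd=10, n=100, confidence=0.99))  # more confidence: wider
```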

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
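As an illustration of the chi-square statistic for two categorical variables, the sketch below computes χ² = Σ (observed - expected)² / expected for a contingency table. The 2×2 counts are hypothetical, and the critical value 3.841 is the standard tabled value for α = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


table = [[30, 20],   # hypothetical counts: group 1, outcome yes / no
         [15, 35]]   # hypothetical counts: group 2, outcome yes / no
statistic, df = chi_square_statistic(table)
CRITICAL_005_DF1 = 3.841  # tabled chi-square value, alpha = 0.05, df = 1
print(statistic, df, statistic > CRITICAL_005_DF1)
```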

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, which is the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


We set a standard beyond which results would be rare (outside the expected sampling error)

We observe a sample and infer information about the population

If the observation is outside the standard we reject the hypothesis that the sample is representative of the population


One-sample t-test

Two-sample t-test

Degrees of Freedom (df)


Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are stated without a direction (for example, H0: μ1 = μ2 against HA: μ1 ≠ μ2).

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.
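The practical difference between the two forms shows up in the p-value. The sketch below assumes a z statistic (standard normal reference distribution) purely for illustration; the value z = 1.8 is hypothetical.

```python
from statistics import NormalDist


def p_values(z):
    """Return (one-tailed upper p-value, two-tailed p-value) for a z statistic."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)                 # rejection region in the upper tail only
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # rejection regions in both tails
    return upper, two_tailed


upper, two_tailed = p_values(1.8)
print(f"one-tailed p = {upper:.4f}, two-tailed p = {two_tailed:.4f}")
```

With the same statistic, the directional test can be significant at α = 0.05 while the non-directional test is not, which is why the direction must be fixed before looking at the data.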


Two-sample t-test

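We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group              N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

Pooled variance:

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 $$

Standard error of the difference:

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

Test statistic:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

Critical value:

$$ t_{0.05,\,20} = 2.086 $$

$$ |t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0, \text{ accept } H_A $$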
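The hand calculation for the healthy vs cystic fibrosis comparison can be reproduced from the summary statistics alone. This is a minimal sketch; the function name is illustrative, and the critical value 2.086 is the tabled two-tailed value for α = 0.05 with 20 degrees of freedom.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Healthy: n = 9, mean 18.9, SD 5.9; cystic fibrosis: n = 13, mean 11.9, SD 6.3
t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRITICAL = 2.086  # tabled value, two-tailed alpha = 0.05, df = 20
print(round(t_stat, 2), df, abs(t_stat) > T_CRITICAL)
```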


Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose "Samples in one column"

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check "Boxplots of data"; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis Sale Price, from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
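The Minitab figures above can be approximately reproduced from the summary table. The reported DF = 44 (rather than 34 + 26 - 2 = 58) is consistent with an unequal-variance (Welch) calculation, which is what this sketch assumes. The critical value 1.6802 is a tabled t value for a 90% interval with 44 degrees of freedom, supplied here by hand, and small differences from the Minitab output come from the rounded means.

```python
from math import sqrt


def welch_from_summary(n1, mean1, sd1, n2, mean2, sd2, t_critical):
    """Unequal-variance t statistic, truncated df, and CI from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se_diff = sqrt(v1 + v2)
    diff = mean1 - mean2
    # Welch-Satterthwaite approximation, truncated to an integer.
    df = int((v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)))
    ci = (diff - t_critical * se_diff, diff + t_critical * se_diff)
    return diff / se_diff, df, ci


T_CRIT_90_DF44 = 1.6802  # assumed tabled value: 90% two-sided, df = 44
t_stat, df, ci = welch_from_summary(34, 312702, 188163,
                                    26, 339143, 255571, T_CRIT_90_DF44)
print(round(t_stat, 2), df, [round(bound) for bound in ci])
```

Because the interval contains zero, it leads to the same conclusion as the p-value: fail to reject H0.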

Test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

Paired t-test

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

$$ \bar{d} = 4 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$
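The tobacco-leaf calculation can be checked directly from the eight pairs of counts. This is a minimal sketch using only the standard library; the critical value 2.365 is the tabled value quoted for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]   # X1j, plants 1 to 8
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]   # X2j, same plants

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]  # within-plant differences
d_bar = mean(d)
s_d = stdev(d)                  # sample SD of the differences (n - 1 divisor)
se_d_bar = s_d / sqrt(len(d))   # standard error of the mean difference
t_stat = d_bar / se_d_bar

T_CRITICAL = 2.365  # tabled value, alpha = 0.05 (two-tailed), df = 7
print(d, d_bar, round(s_d, 2), round(se_d_bar, 2), round(t_stat, 2))
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```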

Paired t-test

Advantages
1) Usually it is easier to find a true small difference
2) There is no need to consider whether the variances of the two populations are the same or not

$$ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 $$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.

If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.

Reject the null hypothesis if the test statistic falls in the critical region.

Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

[Figure: Rejection region at the α = 5% significance level]

Hypothesis Testing

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then we fail to reject H0 and the results are declared not statistically significant.
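The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only (the function names are not from any statistics package); the example inputs are the paired t-test result (t = 2.63 against the tabled 2.365) and the Minitab p-value (0.660) from earlier slides.

```python
def decide_by_critical_value(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide_by_critical_value(2.63, 2.365))  # paired t-test example
print(decide_by_p_value(0.660, alpha=0.05))   # real estate example
```

Applied to the same data with the same α, the two rules agree; mixing them (for example, comparing a p-value with a critical value) is the mistake the slide warns against.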


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the hypotheses below.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test - compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²) - tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric are two types of statistics; both require analysis of assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

Degree of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows:

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.
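The difference between the two forms shows up directly in how the p-value is computed. The sketch below uses a standard normal test statistic for simplicity (the slides work with t statistics), and the observed value 1.96 is hypothetical.

```python
from statistics import NormalDist

def p_two_tailed(z):
    """Non-directional test: extreme values in either tail count as evidence."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def p_upper_tailed(z):
    """Directional test with the rejection region in the upper tail only."""
    return 1 - NormalDist().cdf(z)

z_obs = 1.96  # hypothetical observed statistic
p2 = p_two_tailed(z_obs)
p1 = p_upper_tailed(z_obs)
```

For a statistic in the predicted direction, the one-tailed p-value is half the two-tailed one, which is why the direction has to be fixed before looking at the data.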

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N    Mean   Std Dev
Healthy            9    18.9   5.9
Cystic Fibrosis   13    11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

$$t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

Under H0, μ1 - μ2 = 0, so

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0,\ \text{accept } H_A$$
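The hand calculation for the Healthy and Cystic Fibrosis groups can be checked from the summary statistics alone. This is a sketch: the function name is illustrative, and the critical value 2.086 is copied from the slide rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff

# Healthy: n = 9, mean 18.9, SD 5.9; Cystic Fibrosis: n = 13, mean 11.9, SD 6.3
t, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # t(0.05, 20) as quoted on the slide, not computed here
reject_h0 = abs(t) > T_CRIT
```

Carrying full precision gives t close to 2.63, matching the slide, so the conclusion (reject H0) does not depend on the rounding of intermediate values.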

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs. Check Boxplots of data. OK

Click Options

Confidence level: Enter 90. OK. OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location    N     Mean    StDev   SE Mean
East       34   312702   188163     32270
West       26   339143   255571     50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  P-Value = 0.660  DF = 44

The difference in mean sale prices = $26441.

We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
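The reported T-Value and DF can be reproduced from the summary table. The output shows DF = 44 rather than 34 + 26 - 2 = 58, which is consistent with an unpooled (Welch) test; that reading is an inference from the numbers, and the function below is a sketch under that assumption.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571
t, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)
df_truncated = int(df)  # the output above appears to report the whole-number part
```

Because the two standard deviations differ noticeably, not assuming a common variance is a reasonable choice for this comparison.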

Paired t-test

A test to deal with two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

Paired t-test

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

Paired t-test

Test infection of virus on tobacco leaves: number of death spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1          9              10            -1
2         17              11             6
3         31              18            13
4         18              14             4
5          7               6             1
6          8               7             1
7         20              17             3
8         10               5             5
Total    120              88            32
Mean      15              11             4

H0: μd = 0    HA: μd ≠ 0

$$\bar{d} = 4$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

so H0 is rejected at α = 0.05.
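The tobacco-leaf example can be recomputed from the eight pairs of counts. This is a sketch using only the standard library; the critical value 2.365 is taken from the slide, not computed.

```python
import math
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]  # per-plant differences
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                 # sample standard deviation (n - 1 denominator)
se_d_bar = s_d / math.sqrt(n)
t = d_bar / se_d_bar           # the slide shows 2.63 after rounding intermediates

T_CRIT = 2.365  # t(0.05, 7) as quoted on the slide
reject_h0 = abs(t) > T_CRIT
```

Only the differences enter the calculation, so the variances of the two original samples never need to be compared.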

Paired t-test

Advantages
1) Usually it is easier to detect a true small difference, since typically

$$s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$$

2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

Rejection region at the α = 10% significance level

Hypothesis Testing

Rejection region at the α = 5% significance level

Hypothesis Testing

Rejection region at the α = 10% significance level

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

3. p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.
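The table method and the p-value method are two views of the same comparison, so they should lead to the same decision for a given statistic. The sketch below checks this for a two-tailed test; it uses a standard normal statistic instead of t as a simplifying assumption, and the example values are hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05
STD_NORMAL = NormalDist()

def reject_by_table(z, alpha=ALPHA):
    """Table method: compare |z| with the two-tailed critical value."""
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z) >= critical

def reject_by_p_value(z, alpha=ALPHA):
    """p-value method: compare the two-tailed p-value with alpha."""
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return p < alpha

# Hypothetical test statistics: decisions from both rules, side by side.
decisions = {z: (reject_by_table(z), reject_by_p_value(z)) for z in (0.5, 1.5, 2.5, -3.0)}
```

The graph method is the same rule again, drawn as a shaded critical region rather than looked up in a table.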

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case, i.e., the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form a hypothesis as below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, by contrast, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                    NULL HYPOTHESIS
                                    TRUE                     FALSE
Reject the null hypothesis          Type I error (α):        CORRECT
                                    rejecting a true
                                    null hypothesis
Fail to reject the null hypothesis  CORRECT                  Type II error (β):
                                                             failing to reject a
                                                             false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
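These relationships can be checked numerically. The sketch below computes β for a one-sided, one-sample z-test; the test type, the standardized effect of 0.5 and the sample sizes are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, n, effect_size):
    """beta for a one-sided, one-sample z-test with a standardized true effect."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection threshold under the null
    return z.cdf(critical - effect_size * math.sqrt(n))

EFFECT = 0.5  # hypothetical standardized effect
beta_base = type_ii_error(alpha=0.05, n=25, effect_size=EFFECT)
beta_larger_n = type_ii_error(alpha=0.05, n=100, effect_size=EFFECT)
beta_smaller_alpha = type_ii_error(alpha=0.01, n=25, effect_size=EFFECT)
```

With α fixed, the larger sample gives the smaller β; with n fixed, tightening α from 0.05 to 0.01 raises β.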

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
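The three probability sampling strategies named above can be sketched as follows. The population of 100 subject IDs, the two groups and the function names are hypothetical; real designs usually involve more careful allocation rules.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Every subject has the same known chance, k / len(population), of selection."""
    return random.Random(seed).sample(population, k)

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from each stratum (dict: name -> subjects)."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, max(1, round(fraction * len(members))))
        for name, members in strata.items()
    }

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every subject in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

# Hypothetical population of 100 subject IDs, split into two groups.
population = list(range(100))
groups = {"clinic_a": population[:60], "clinic_b": population[60:]}

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_sample(groups, 0.1, seed=1)
clust = cluster_sample(groups, 1, seed=1)
```

In each case the selection probability of a subject is known in advance, which is what separates these designs from non-probability samples.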

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?
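One way to make "assuming chance alone" concrete is an exact permutation test: if treatment had no effect, every relabelling of the subjects into two groups would be equally likely. This is a sketch of that idea, not the method used on the slides, and the outcome values are made up.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means under random relabelling."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    total = 0
    as_extreme = 0
    # Enumerate every way the pooled values could have been split into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total

# Made-up outcome values for a treated and a control group.
treated = [5, 6, 7, 8]
control = [1, 2, 3, 4]
p_value = exact_permutation_p_value(treated, control)
```

Here only 2 of the 70 possible splits give a difference as large as the observed one, so the p-value is 2/70, about 0.029. Enumeration is only practical for small groups; larger studies would sample relabellings at random instead.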

Hypothesis testing

                Reality
Decision        H0 is true              H0 is false
Retain H0       Correct decision        Type II error (β)
                                        (0.2, 0.1)
Reject H0       Type I error (α)        Correct decision
                (0.05, 0.01)


Page 48: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1) Table method: Compare the calculated value with a table of the critical values of the test statistic.

If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

2) Graph method: Compare the calculated value with the t-distribution graph of the test statistic.

Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 0.10 significance level]

[Figure: rejection region at the α = 0.05 significance level]

[Figure: rejection region at the α = 0.10 significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

3) p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
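The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only: the function names are assumptions, and the numbers passed in are the values quoted in this deck's worked examples (t = 2.632 against a critical value of 2.365, and a p-value of 0.660).

```python
def table_method(t_calculated, t_critical):
    """Table method: compare |t| with the tabled critical value."""
    return "reject H0" if abs(t_calculated) >= t_critical else "fail to reject H0"


def p_value_method(p_value, alpha=0.05):
    """p-value method: reject only when p is below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Paired t-test example from the deck: t = 2.632, critical value 2.365.
print(table_method(2.632, 2.365))

# Two-sample Minitab example from the deck: p = 0.660 at alpha = 0.05.
print(p_value_method(0.660, alpha=0.05))
```

Applied to the same test at the same α, both rules lead to the same decision; the warning not to mix them up concerns comparing a test statistic with α, or a p-value with a critical value.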

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main role of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                               NULL HYPOTHESIS
                               TRUE                          FALSE
Reject the null hypothesis     Type I error (α):             CORRECT
                               rejecting a true
                               null hypothesis
Fail to reject the null        CORRECT                       Type II error (β):
hypothesis                                                   failing to reject a
                                                             false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10000
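The link between α and the long-run "chance of getting it wrong" can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original slides: it draws samples from a population where the null hypothesis is true (mean 0, known standard deviation 1), applies a two-sided z-test, and counts how often H0 is wrongly rejected. The function name, sample size, trial count and seed are arbitrary choices.

```python
import random
from statistics import NormalDist


def simulated_type_i_rate(alpha, n=25, trials=4000, seed=1):
    """Share of true-null experiments in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, sigma 1.
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean * n ** 0.5
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials


for alpha in (0.20, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  simulated Type I rate={simulated_type_i_rate(alpha):.3f}")
```

The simulated rates should land close to the chosen α, which is what "a 1 in 20 chance" means for α = 5%.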

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
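These trade-offs can be made concrete for one simple case. The sketch below assumes a one-sided z-test with known standard deviation and a fixed true effect; under those assumptions β = Φ(z_{1-α} - effect·√n / σ). The function name and the default effect of 0.5 standard deviations are hypothetical choices for illustration.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test for a given true effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(z_crit - effect * n ** 0.5 / sigma)


print("fixed alpha, larger n :", beta_one_sided_z(0.05, 20), "->", beta_one_sided_z(0.05, 40))
print("fixed n, smaller alpha:", beta_one_sided_z(0.05, 20), "->", beta_one_sided_z(0.01, 20))
print("fixed n, larger alpha :", beta_one_sided_z(0.05, 20), "->", beta_one_sided_z(0.10, 20))
print("smaller alpha, larger n:", beta_one_sided_z(0.01, 80))
```

Each printed pair moves in the direction stated above, and the last line shows that both error probabilities can be lowered together only by raising n.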

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
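The three probability sampling schemes named above can be sketched with the standard library. Everything here is hypothetical: the population of 90 patients in 3 wards, the function names, and the choice to stratify and cluster by ward are illustrative assumptions only.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Every subset of size k is equally likely."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(population, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    clusters = defaultdict(list)
    for unit in population:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


# Hypothetical population: (patient id, ward) pairs, 30 patients per ward.
population = [(i, "ABC"[i % 3]) for i in range(90)]
rng = random.Random(0)
ward = lambda unit: unit[1]

srs = simple_random_sample(population, 9, rng)
strat = stratified_sample(population, ward, 0.10, rng)
clus = cluster_sample(population, ward, 1, rng)
print(len(srs), len(strat), len(clus))
```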

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                Reality
Decision        H0 is true                        H0 is false
Retain H0       Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
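A sample size calculation of this kind can be sketched with the usual normal approximation for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². This formula is a common textbook approximation assumed here for illustration, not something stated in the slides; the function name and example effect size are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate per-group n for a two-sided comparison of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(1 - beta)         # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2.
print(n_per_group(effect=0.5, sigma=1.0))
# Asking for beta = 0.1 (90% power) requires more subjects.
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))
```

An exact calculation based on the t-distribution gives slightly larger numbers, so the result is best read as a planning estimate.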

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
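As an illustration, a 95% confidence interval for the mean paired difference can be computed from the tobacco-leaf differences used in the paired t-test example of this deck. This is a minimal sketch: the helper name `mean_ci` is an assumption, and the critical value 2.365 (two-sided 5%, 7 degrees of freedom) is the tabled value quoted in that example rather than one computed here.

```python
import math
import statistics

# Paired differences d_j from the tobacco-leaf example.
differences = [-1, 6, 13, 4, 1, 1, 3, 5]
T_CRIT_95_DF7 = 2.365  # tabled value quoted in the deck


def mean_ci(values, t_crit):
    """Confidence interval for a mean: centre +/- t_crit * s / sqrt(n)."""
    n = len(values)
    centre = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return centre - half_width, centre + half_width


lower, upper = mean_ci(differences, T_CRIT_95_DF7)
contains_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains zero: {contains_zero}")
```

The interval excludes zero, which matches the paired t-test rejecting H0 at the 5% level, and it also shows the range of plausible effect sizes.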

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compare means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance)
more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Common Inferential Stats

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric: two types of statistics requiring analysis of assumptions.

Pearson r, t tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference among 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

Degree of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

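We test the hypothesis of equal means for the two populations, assuming a common variance.

\[ H_0: \mu_1 = \mu_2, \qquad H_A: \mu_1 \neq \mu_2 \]

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]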

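Two-sample t-test

\[ s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1) 5.9^2 + (13 - 1) 6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.7 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

\[ t_{0.05}(20) = 2.086 \]

\[ |t| > t_{0.05}: \text{ reject } H_0, \text{ accept } H_A \]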
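The pooled-variance calculation can be reproduced from the summary statistics alone. The sketch below assumes, as the slide does, a common variance in the two populations; the function name `pooled_t` is illustrative, and the critical value 2.086 is the tabled value quoted on the slide.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the mean difference
    return pooled_var, se, (mean1 - mean2) / se, df


# Healthy (n=9) vs. cystic fibrosis (n=13), values from the slide's table.
pooled_var, se, t, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRIT = 2.086  # two-sided 5% critical value for 20 df, as quoted on the slide

print(f"s^2={pooled_var:.2f}, se={se:.2f}, t={t:.2f}, df={df}")
print("reject H0" if abs(t) >= T_CRIT else "fail to reject H0")
```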

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  P-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices between the east and west markets.
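The T-value and DF in this output are consistent with an unpooled (Welch) two-sample test, which does not assume equal variances. That reading is an inference from the numbers, not something the slide states. A minimal sketch from the summary table, with the function name `welch_t` chosen for illustration:

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # squared standard errors
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df


# East (n=34) vs. West (n=26), values from the output table.
t, df = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t={t:.2f}, df={math.floor(df)}")
```

Rounding the degrees of freedom down to a whole number mirrors the DF shown in the output.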

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X, and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the p-value (or observed significance level) of the test.

P-value

The probability of the observed outcome, or one more extreme, assuming that chance alone was involved in creating the outcome. In other words: assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.
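One way to make "the probability of the outcome assuming chance alone" concrete is an exact permutation test: if treatment X had no effect, every regrouping of the pooled outcomes into "treated" and "control" would be equally likely. The sketch below uses made-up outcome values and swaps in this permutation approach for illustration; it is not the t-test developed later in the slides.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p_value(treated, control):
    """Two-sided p-value: share of all regroupings whose difference in
    means is at least as extreme as the observed difference."""
    observed = abs(mean(treated) - mean(control))
    pooled = treated + control
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # small tolerance guards against floating-point ties
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

treated = [12, 14, 15, 17]   # hypothetical outcome Y under treatment X
control = [8, 9, 10, 11]     # hypothetical outcome Y under control
p_value = exact_permutation_p_value(treated, control)   # 2 of 70 regroupings
```

Only the observed split and its mirror image are this extreme, so chance alone would produce such a difference in 2 of the 70 possible regroupings.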

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision     Reality: H0 is true               Reality: H0 is false
Retain H0    Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0    Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
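A rough sketch of such a sample-size calculation follows. It assumes a two-sided comparison of two group means with equal group sizes and uses the common normal approximation n = 2(z_{1−α/2} + z_{1−β})² σ² / δ²; the effect of 5 units and standard deviation of 10 units are hypothetical inputs, and an exact t-based calculation would give a slightly larger n.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect, sd, alpha=0.05, beta=0.2):
    """Approximate subjects needed per group (normal approximation)."""
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)           # quantile matching power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

n_80 = sample_size_per_group(effect=5.0, sd=10.0)             # power 0.80
n_90 = sample_size_per_group(effect=5.0, sd=10.0, beta=0.1)   # power 0.90
```

Lowering β from 0.2 to 0.1 with the same expected effect raises the required sample size, which is the trade-off power analysis is meant to expose.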

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of plausible values of the effect.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
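The link between an interval and a test can be sketched as "estimate ± critical value × standard error", followed by a check of whether zero lies inside. The estimate and standard errors below are hypothetical; 2.086 is the tabled two-sided 5% critical value of the t distribution with 20 degrees of freedom.

```python
def confidence_interval(estimate, standard_error, critical_value):
    """Return (lower, upper) bounds: estimate +/- critical_value * SE."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin

def could_be_zero(interval):
    """True when zero is one of the plausible effect values."""
    low, high = interval
    return low <= 0.0 <= high

# Same estimated effect, two hypothetical levels of precision.
wide = confidence_interval(estimate=7.0, standard_error=4.0, critical_value=2.086)
narrow = confidence_interval(estimate=7.0, standard_error=2.66, critical_value=2.086)
```

The wide interval includes zero, so a two-sided test at the 5% level would not reject the null hypothesis; the narrow one excludes zero and also shows how large the effect plausibly is.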

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
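The selection rules above amount to a small decision tree for a continuous outcome. The function below is a hypothetical helper written only to restate those rules; its name and parameters are not part of the slides.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("a single grouping variable needs at least 2 groups")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

two_groups = choose_test(1, n_groups=2)
three_groups = choose_test(1, n_groups=3)
mixed = choose_test(2, any_continuous_predictor=True)
```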

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size − number of parameters estimated

df is n − 1 for a one-sample test of a mean.

Common Inferential Stats

t-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric: two types of statistics; choosing between them requires checking their assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Two-sample t-test

Degree of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2, HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.


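Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

\[ H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2 \]

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

\[ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

\[ t_{0.05,\,20} = 2.086 \]

\[ |t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0, \text{ accept } H_A \]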
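The hand calculation above can be checked with a few lines of code. This is a minimal sketch working from the summary statistics only; the function name is an assumption, and the critical value 2.086 is taken from a t table (two-sided, α = 0.05, 20 degrees of freedom) rather than computed.

```python
from math import sqrt

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    t = (mean1 - mean2) / se_diff
    return pooled_var, se_diff, t, df

# Healthy: n = 9, mean 18.9, SD 5.9; Cystic Fibrosis: n = 13, mean 11.9, SD 6.3
pooled_var, se_diff, t, df = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086                 # tabled value, two-sided alpha = 0.05, df = 20
reject_h0 = abs(t) > T_CRIT    # table method decision
```

With these inputs the pooled variance is about 37.74, the standard error about 2.66 and t about 2.63, so the null hypothesis of equal means is rejected.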

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2, or μ1 − μ2 = 0

HA: μ1 ≠ μ2, or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) − μ (West)
Estimate for difference: −26442
90% CI for difference: (−126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = −0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (−$126,602 < μ (East) − μ (West) < $73,719).

p-Value = 0.660 > α = 0.05 or 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
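The T-Value and DF in this output can be reproduced from the summary table. The DF of 44 (rather than 34 + 26 − 2 = 58) suggests that the unpooled (Welch) form of the test was used, so the sketch below assumes that form with the Welch-Satterthwaite degrees of freedom; the function name is hypothetical.

```python
from math import sqrt

def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1            # squared standard error of mean 1
    v2 = sd2 ** 2 / n2            # squared standard error of mean 2
    se_diff = sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff

# East: n = 34, mean 312702, SD 188163; West: n = 26, mean 339143, SD 255571
t, df, se_diff = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)
df_reported = int(df)   # truncated to a whole number, matching the reported DF
```

The statistic comes out near −0.44 with about 44.3 degrees of freedom, in line with the reported T-Value and DF.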

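Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: d_j = X_1j − X_2j

Calculate

\[ \bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) \]

\[ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 \]

Paired t-test

Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              −1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

\[ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 \]

\[ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = \sqrt{18.57} = 4.31 \]

\[ s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 \]

Since t exceeds the critical value, H0 is rejected.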
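The tobacco-leaf example can be recomputed directly from the eight pairs of counts. This is a minimal sketch using only the standard library; 2.365 is the tabled two-sided critical value for α = 0.05 with 7 degrees of freedom, entered by hand rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]   # counts on leaves treated with virus 1
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]   # counts on leaves treated with virus 2

d = [a - b for a, b in zip(virus1, virus2)]   # per-plant differences d_j
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # sample SD of the differences
se_d_bar = s_d / sqrt(len(d))                 # standard error of d_bar
t = d_bar / se_d_bar

T_CRIT = 2.365                 # tabled value, two-sided alpha = 0.05, df = 7
reject_h0 = abs(t) > T_CRIT
```

Working on the differences means only one standard deviation is estimated, which is why no assumption about equal variances of the two samples is needed.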

Paired t-test

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

\[ s_{\bar{d}}^2 \ \text{compared with}\ s_{\bar{x}_1 - \bar{x}_2}^2 \]

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figures: rejection region at the α = 0.10 and α = 0.05 significance levels]

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
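The table method and the p-value method are two views of the same comparison, so they should lead to the same decision. The sketch below illustrates this with a standard normal (z) reference distribution, which is an assumption made so the example runs on the Python standard library alone; for a t-test the critical value and p-value would come from the t distribution instead.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal reference distribution (assumed here)

def decide_by_table(z_stat, alpha):
    """Table method: compare |statistic| with the two-sided critical value."""
    critical = Z.inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(z_stat) >= critical else "fail to reject H0"

def decide_by_p_value(z_stat, alpha):
    """p-value method: compare the two-sided p-value with alpha."""
    p = 2 * (1 - Z.cdf(abs(z_stat)))
    return "reject H0" if p < alpha else "fail to reject H0"

# The two rules can differ only if the statistic lands exactly on the
# critical value, which does not happen for these illustrative inputs.
decisions = [
    (decide_by_table(z, a), decide_by_p_value(z, a))
    for z, a in [(2.5, 0.05), (1.0, 0.05), (2.5, 0.01)]
]
```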


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms hypotheses as below before hearing a case.

Hypotheses
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypotheses.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                             Null hypothesis TRUE                                  Null hypothesis FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null hypothesis    CORRECT
Fail to reject the null hypothesis   CORRECT                                               Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
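These relationships can be checked numerically. The sketch below assumes a two-sided comparison of two group means under a normal approximation, with a hypothetical standardized effect of 0.5 standard deviations; the function name and default are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, n_per_group, effect=0.5):
    """Approximate beta for a two-sided, two-group comparison of means.

    `effect` is the true difference in means divided by the common SD.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n_per_group / 2)   # expected value of the z statistic
    power = (1 - z.cdf(critical - shift)) + z.cdf(-critical - shift)
    return 1 - power

beta_small_n = type_ii_error(alpha=0.05, n_per_group=32)
beta_large_n = type_ii_error(alpha=0.05, n_per_group=64)    # larger n, same alpha
beta_strict = type_ii_error(alpha=0.01, n_per_group=64)     # smaller alpha, same n
```

Doubling n at fixed α lowers β, while tightening α from 0.05 to 0.01 at fixed n raises β, which matches the statements above.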

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.


Page 51: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-test

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, ticks from 100,000 to 1,000,000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
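The T-Value and DF in this output can be reproduced from the summary table alone. The sketch below assumes the output comes from the unequal-variance (Welch) form of the test; the slides do not say so, but DF = 44 rather than 34 + 26 - 2 = 58 points that way. The function name `welch_t` is hypothetical.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic and approximate degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors of each mean
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se_diff

# East vs West sale prices, from the result table
t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t_stat:.2f}, df = {math.floor(df)}, SE of difference = {se_diff:.0f}")
```

The degrees of freedom are rounded down to a whole number here, which matches the DF reported in the output.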

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: d_j = X_{1j} - X_{2j}

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

H0: μ_d = 0    HA: μ_d ≠ 0

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t
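The paired calculation for the tobacco-leaf data can be reproduced with the Python standard library. This is an illustrative sketch; the variable names are assumptions, and the critical value 2.365 is the table value for 7 degrees of freedom quoted in the example.

```python
import math
import statistics

# Number of dead spots per plant under each virus (from the data table)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

diffs = [a - b for a, b in zip(virus1, virus2)]   # d_j = X_1j - X_2j
n = len(diffs)
d_bar = statistics.mean(diffs)                    # mean difference
s_d = statistics.stdev(diffs)                     # sample SD of differences (divisor n - 1)
se_d = s_d / math.sqrt(n)                         # standard error of the mean difference
t_stat = d_bar / se_d

T_CRIT = 2.365  # two-tailed table value, alpha = 0.05, df = n - 1 = 7 (from the example)
reject_h0 = abs(t_stat) > T_CRIT
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```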

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference, because variation between individuals is removed from the comparison.
2) There is no need to consider whether the variances of the two populations are the same.

When the paired measurements are positively correlated,

$$s_{\bar{d}} < \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute value of the test statistic is < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, H0 is not rejected and the results are declared not statistically significant.
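The table method and the p-value method can each be written as a one-line rule. The sketch below is only an illustration with hypothetical function names; the numbers passed in are the ones reported in the two-sample examples of this deck.

```python
def decide_by_table(t_stat, t_crit):
    """Table method: compare |test statistic| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: compare the p-value with the chosen significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Cystic fibrosis example: t = 2.63 against the table value 2.086
print(decide_by_table(2.63, 2.086))

# Real estate example: p-value = 0.660 at alpha = 0.05
print(decide_by_p_value(0.660, alpha=0.05))
```

Applied to the same data and the same α, the rules lead to the same decision, which is why the slides warn against mixing them within one analysis.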

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                       FALSE
Reject the null hypothesis            Type I error (α)           CORRECT
Fail to reject the null hypothesis    CORRECT                    Type II error (β)

Type I error (α): rejecting a true null hypothesis.
Type II error (β): failing to reject a false null hypothesis.

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. That is too high a risk, so on such evidence we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
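The idea that α is the long-run "chance of getting it wrong" when the null hypothesis is true can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the slides: it draws samples from a population where H0 really holds (mean 0, known standard deviation 1), uses a two-sided z-test instead of a t-test so that only the standard library is needed, and fixes a seed so the run is repeatable.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of simulated studies that wrongly reject a true H0: mu = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 is true here
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / n ** 0.5)                 # known sigma = 1
        if abs(z) > z_crit:
            rejections += 1                                # a Type I error
    return rejections / trials

rate = type_i_error_rate(alpha=0.05)
print(f"Observed Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

With α = 0.05 the observed rejection rate should land near 1 in 20, matching the second row of the table.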

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome.
Chance
Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision \ Reality   H0 is true                        H0 is false
Retain H0            Correct Decision                  Type II Error (β) (0.2, 0.1)
Reject H0            Type I Error (α) (0.05, 0.01)     Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
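A rough version of that sample-size calculation for comparing two group means is sketched below. It rests on assumptions that are not in the slides: a two-sided test, equal group sizes, a known common standard deviation, and the normal approximation n = 2(z_{1-α/2} + z_{1-β})²σ²/δ² per group. Exact t-based calculations give slightly larger numbers. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sd, alpha=0.05, beta=0.2):
    """Approximate sample size per group for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # controls the Type I error rate
    z_beta = z.inv_cdf(1 - beta)         # controls the Type II error rate (power = 1 - beta)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, power = 0.80
print(n_per_group(effect=0.5, sd=1.0, alpha=0.05, beta=0.2))

# Asking for a smaller beta (power = 0.90) requires more subjects
print(n_per_group(effect=0.5, sd=1.0, alpha=0.05, beta=0.1))
```

The second call illustrates the trade-off described earlier: at a fixed α, lowering β is only possible by increasing n.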

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
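To make the link between a confidence interval and a hypothesis test concrete, the sketch below builds a 95% interval for the mean paired difference in the tobacco-leaf example from earlier in the deck. The eight differences and the critical value 2.365 (7 degrees of freedom) are taken from that example; the variable names are illustrative assumptions.

```python
import math
import statistics

# Paired differences d_j from the tobacco-leaf example
diffs = [-1, 6, 13, 4, 1, 1, 3, 5]

T_CRIT = 2.365  # two-tailed table value for 95% confidence, df = 7
d_bar = statistics.mean(diffs)
se = statistics.stdev(diffs) / math.sqrt(len(diffs))   # standard error of the mean difference

lower = d_bar - T_CRIT * se
upper = d_bar + T_CRIT * se
zero_inside = lower <= 0 <= upper

print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
print(f"Zero inside the interval: {zero_inside}")
```

Zero lies outside the interval, which agrees with rejecting H0 at α = 0.05, and the interval also shows the range of plausible effect sizes.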

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
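The selection rules above can be written down as a small lookup. This is only a sketch of the slide's rules for a continuous outcome; the function name and its parameters are hypothetical, and it does not cover categorical outcomes or paired designs.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(1, n_groups=2))                        # two groups
print(choose_test(1, n_groups=4))                        # three or more groups
print(choose_test(2, any_continuous_predictor=True))     # mixed predictors
```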

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise, two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-Test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 testing of means
interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct test.

Parametric and non-parametric: two types of statistics requiring analysis of assumptions.

Pearson r, t tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

T test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 52: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
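The trade-off between α, β and n can be made concrete with a small numerical sketch. The slides do not name a test here, so this assumes a one-sided, one-sample z-test with known standard deviation; the function name `type_ii_error` and the values `effect=0.5` and `sigma=1.0` are illustrative choices, not part of the original material.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float, sigma: float) -> float:
    """Beta for a one-sided z-test of H0: mu = mu0 against a true mean of mu0 + effect.

    Assumes a known population standard deviation `sigma` (illustrative setting).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # rejection threshold on the z scale
    shift = effect * sqrt(n) / sigma     # how far the true effect moves the statistic
    return z.cdf(z_crit - shift)         # P(fail to reject H0 | effect is real)


# Hypothetical settings: true effect of 0.5 units, sigma of 1.0.
for alpha in (0.10, 0.05, 0.01):
    for n in (10, 20, 40):
        beta = type_ii_error(alpha, n, effect=0.5, sigma=1.0)
        print(f"alpha={alpha:<5} n={n:<3} beta={beta:.3f}")
```

Reading the printed grid across a row (fixed α, growing n) shows β falling; reading down a column (fixed n, shrinking α) shows β rising.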

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population. Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
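As a rough illustration of two of the probability sampling schemes named above, the sketch below draws a simple random sample and a proportionally allocated stratified sample. The patient records, the `sex` stratum and the 10% sampling fraction are all hypothetical, and the helper names are made up for this example.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Every unit has the same known chance k / len(population) of selection."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, fraction, rng):
    """Sample the same fraction independently within each stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(2023)  # fixed seed so the draw is reproducible
# Hypothetical population: 90 patients, 60 female and 30 male.
patients = [{"id": i, "sex": "F" if i % 3 else "M"} for i in range(90)]

srs = simple_random_sample(patients, 9, rng)
strat = stratified_sample(patients, lambda p: p["sex"], 0.10, rng)
print(len(srs), sum(p["sex"] == "M" for p in srs))
print(len(strat), sum(p["sex"] == "M" for p in strat))
```

The stratified draw always reproduces the population's 2:1 split, whereas the simple random draw only does so on average.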

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
- Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:
- Null hypothesis (H0): Treatment X has no effect on outcome Y.
- Alternative hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value: the probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true               Reality: H0 is false
Retain H0     Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)     Correct decision

Power analysis is used to try to minimize Type II errors. Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
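A power calculation of this kind can be sketched in a few lines. The slides do not give a formula, so this uses the common normal-approximation formula for comparing two group means with equal group sizes and a two-sided test; the function name `n_per_group` and the example effect and standard deviation are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect: float, sigma: float,
                alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate sample size per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sigma / effect)^2.
    A t-based calculation would give a slightly larger n.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # guards against Type I error
    z_beta = z.inv_cdf(1 - beta)         # guards against Type II error
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected difference of 5 units, sigma of 10 units.
print(n_per_group(effect=5.0, sigma=10.0))              # beta = 0.2 (power 80%)
print(n_per_group(effect=5.0, sigma=10.0, beta=0.10))   # beta = 0.1 (power 90%)
```

Lowering β from 0.2 to 0.1 raises the required sample size, which is the α, β, n trade-off seen from the planning side.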

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is the 95% interval, which gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
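The dependence of interval width on sample size, variability and confidence level can be seen in a minimal sketch. It uses a normal (z) critical value for simplicity, which is an assumption; for small samples a t critical value would give a somewhat wider interval. The data values and the function name `mean_ci` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Large-sample confidence interval for a mean (z-based approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)      # e.g. about 1.96 for 95%
    half_width = z * stdev(values) / sqrt(len(values))  # shrinks as n grows
    m = mean(values)
    return m - half_width, m + half_width


# Hypothetical treatment effects measured on eight subjects.
effects = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4]

low95, high95 = mean_ci(effects, 0.95)
low99, high99 = mean_ci(effects, 0.99)
print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
print("zero inside 95% CI:", low95 <= 0 <= high95)
```

The last line is the hypothesis-test reading of the interval: if zero lies outside it, the effect differs from zero at the matching significance level.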

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One or Two Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size - Number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance):
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals, or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. For a comparison of two means, the null and alternative hypotheses take the form H0: μ1 = μ2 versus HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group              N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

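$$ t_{0.05,\,20} = 2.086 $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.73\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

Since |t| = 2.63 > t_{0.05, 20} = 2.086, reject H0 and accept HA.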
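The hand calculation above can be checked from the summary statistics alone. The sketch below is one way to do that in plain Python; the function name `pooled_t` is made up for this example, and the critical value 2.086 is simply the tabled value quoted in the slides rather than something the code computes.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    s2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df   # pooled variance
    se = sqrt(s2 * (1 / n1 + 1 / n2))                   # SE of the mean difference
    t = (mean1 - mean2) / se                            # under H0: mu1 - mu2 = 0
    return t, df, s2, se


# Healthy vs. cystic fibrosis summary values from the table in the text.
t, df, s2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
print(f"pooled variance = {s2:.3f}, SE = {se:.3f}, t = {t:.3f}, df = {df}")

T_CRIT = 2.086  # tabled t(0.05, 20) as quoted in the slides
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Carrying full precision gives a pooled variance slightly above the 37.73 shown in the slides, but the standard error and t statistic agree to two decimals.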

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

1. Stat > Basic Statistics > 2-Sample t
2. Choose "Samples in one column"
3. Samples: select SalePrice
4. Subscripts: select Location
5. Click Graphs, check "Boxplots of data", OK
6. Click Options
7. Confidence level: enter 90, OK, OK

H0: μ1 = μ2, or μ1 - μ2 = 0
HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We have 90% confidence that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
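The reported T-Value and DF can be reproduced from the summary table. The DF of 44 (rather than 34 + 26 - 2 = 58) suggests the output used the unequal-variance (Welch) form of the test; that is an inference from the numbers, not something the slides state. The function name `welch_t` is illustrative.

```python
from math import sqrt


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic and Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors of each mean
    se = sqrt(v1 + v2)                         # SE of the difference in means
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se


# East vs. West summary values from the output table.
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t:.2f}, df = {df:.2f} (truncated: {int(df)}), SE = {se:.0f}")
```

Truncating the fractional degrees of freedom to a whole number matches the DF shown in the output.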

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves).

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

so H0 is rejected at the 0.05 level.
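The paired calculation can be reproduced directly from the eight pairs of counts in the table. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 2.365 is the tabled value quoted in the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Dead-spot counts per plant, copied from the table in the text.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # per-plant differences d_j
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # sample SD of the differences (n - 1)
se = s_d / sqrt(len(d))                       # standard error of the mean difference
t = d_bar / se                                # test statistic under H0: mu_d = 0

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {len(d) - 1}")

T_CRIT = 2.365  # tabled t(0.05, 7) as quoted in the slides
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Working at full precision gives a t value a little below the hand-rounded 4 / 1.52, but it still rounds to 2.63 and leads to the same decision.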

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

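Typically $s_{\bar{d}}^{2} < s_{\bar{x}_1 - \bar{x}_2}^{2}$ when the paired measurements are positively correlated.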

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up).

Decision rule 1 (table method): compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2 (graph method): compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

Decision rule 3 (p-value method): reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed; if p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
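The table method and the p-value method are two views of the same comparison, so they should lead to the same decision. The sketch below illustrates this for a two-tailed test. It assumes a z statistic (standard normal reference distribution) because Python's standard library has no t distribution; with a t statistic the logic is identical but the critical value and p-value come from the t distribution. The function name `z_test_decision` is made up for this example.

```python
from statistics import NormalDist


def z_test_decision(z_stat: float, alpha: float = 0.05) -> dict:
    """Apply the table rule and the p-value rule to a two-tailed z statistic."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)          # two-tailed critical value
    p_value = 2 * (1 - z.cdf(abs(z_stat)))       # two-tailed p-value
    return {
        "critical": critical,
        "p_value": p_value,
        "reject_by_table": abs(z_stat) >= critical,   # decision rule 1
        "reject_by_p": p_value < alpha,               # decision rule 3
    }


# Two illustrative statistics: one far from zero, one close to zero.
for stat in (2.63, -0.44):
    result = z_test_decision(stat, alpha=0.05)
    print(stat, round(result["p_value"], 4),
          result["reject_by_table"], result["reject_by_p"])
```

Both rules agree for each statistic; they can only differ when the statistic sits exactly on the critical value, which is why the advice is to pick one rule and not mix them.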

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form hypotheses as below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
Decision                              TRUE                                              FALSE
Reject the null hypothesis            Type I error (α): rejecting a true null           CORRECT
                                      hypothesis
Fail to reject the null hypothesis    CORRECT                                           Type II error (β): failing to reject a
                                                                                        false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Page 54: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-test

Degree of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, a departure can be observed at either end of the distribution, so no direction is specified for the expected result. The null and alternative hypotheses are H0: μ1 = μ2 and HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

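We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

\[ t_{0.05,\,20} = 2.086 \]

\[ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.74 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

Since |t| = 2.63 > t(0.05, 20) = 2.086, reject H0 and accept HA.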
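The hand calculation above can be checked with a short script. This is a minimal sketch that uses only the summary statistics from the table. The function name `pooled_t` is illustrative, and the critical value 2.086 is copied from the slide rather than computed.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return pooled_var, se_diff, t, df

# Healthy: n = 9, mean = 18.9, sd = 5.9; Cystic Fibrosis: n = 13, mean = 11.9, sd = 6.3
pooled_var, se_diff, t_stat, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # two-tailed critical value t(0.05, 20), taken from the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"s^2 = {pooled_var:.3f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The script reproduces the standard error (2.66) and the t statistic (2.63) shown on the slide, so the decision to reject H0 does not depend on rounding in the intermediate steps.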

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

1. Stat > Basic Statistics > 2-Sample t
2. Choose "Samples in one column"
3. Samples: select SalePrice
4. Subscripts: select Location
5. Click Graphs, check "Boxplots of data", OK
6. Click Options
7. Confidence level: enter 90, then OK, OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The estimated difference in mean sale prices is about $26,441, with East lower than West. We are 90% confident that the true mean difference in sale prices lies in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 or 0.10, so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ between the east and west markets.
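The T-Value and DF in the output above can be reproduced from the summary table alone. The sketch below assumes the unequal-variance (Welch) form of the test, since DF = 44 matches that form while a pooled test would give 34 + 26 - 2 = 58. The function name `welch_t` is illustrative and is not a MINITAB command.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) two-sample t statistic from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff

# East: n = 34, mean = 312702, sd = 188163; West: n = 26, mean = 339143, sd = 255571
t_stat, df_welch, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t_stat:.2f}, df = {df_welch:.2f} (reported as {math.floor(df_welch)})")
```

The approximate degrees of freedom come out as a non-integer just above 44, which is truncated to the DF = 44 shown in the output.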

Paired t-test

A test for two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method: Calculate the difference between the two measurements for each individual, d_j = X_1j - X_2j, then calculate

\[ \bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) \]

H0: μd = 0    HA: μd ≠ 0

Example: testing infection by two viruses on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

\[ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31, \qquad s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{4}{1.52} = 2.632 > t_{0.05(7)} = 2.365 \]

Since t exceeds the critical value, p < 0.05 and H0 is rejected.
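The paired calculation can be reproduced directly from the eight pairs of counts in the table. This is a minimal sketch; the variable names are illustrative and the critical value 2.365 is copied from the slide rather than computed.

```python
import math

# Dead-spot counts per plant for the two viruses, from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Work with the within-plant differences only
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)
d_bar = sum(diffs) / n
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
se_d_bar = s_d / math.sqrt(n)
t_stat = d_bar / se_d_bar

T_CRIT = 2.365  # two-tailed critical value t(0.05, 7), taken from the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d_bar:.2f}, t = {t_stat:.2f}")
```

Without intermediate rounding the statistic is about 2.63. The slide's 2.632 comes from dividing by the rounded standard error 1.52. Either way the value exceeds 2.365.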

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference, because typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
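The p-value rule is simple enough to write as a one-line function. The sketch below is illustrative only: the name `p_value_decision` is hypothetical, and the example p-value 0.660 is the one reported in the MINITAB output for the sale-price comparison.

```python
def p_value_decision(p_value, alpha=0.05):
    """Apply the p-value decision rule at significance level alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# p-value from the sale-price example, checked at both levels quoted there
for alpha in (0.05, 0.10):
    print(f"alpha = {alpha}: {p_value_decision(0.660, alpha)}")
```

The significance level should be fixed before the data are examined, which is why it is passed in as a parameter and not chosen after seeing the p-value.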

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutor), the defendant (the accused), and the judge. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main role of the plaintiff is to keep supplying strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide which hypothesis holds.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors: Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                                               FALSE
Reject the null hypothesis           Type I error (α): rejecting a true null hypothesis   CORRECT
Fail to reject the null hypothesis   CORRECT                                            Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chance of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
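These trade-offs can be seen numerically. The sketch below is an illustration under stated assumptions, not part of the original slides: it uses a one-sided z-test with known σ, and the effect size (0.5σ) and the sample sizes are hypothetical values chosen only to show the direction of change.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """beta for a one-sided z-test of H0: mu = 0 against a true mean of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff on the z scale
    shift = effect * n ** 0.5 / sigma       # mean of the z statistic under HA
    # beta = P(z statistic falls below the cutoff when HA is true)
    return std_normal.cdf(z_crit - shift)

# Fixed alpha, growing n: beta shrinks
betas_by_n = [type_ii_error(0.05, n) for n in (10, 20, 40)]
# Fixed n, shrinking alpha: beta grows
betas_by_alpha = [type_ii_error(a, 20) for a in (0.10, 0.05, 0.01)]

print([round(b, 3) for b in betas_by_n])
print([round(b, 3) for b in betas_by_alpha])
```

The first list decreases as n grows and the second increases as α is tightened, which matches the statements above.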

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value: The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the claim that chance alone produced the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision \ Reality   H0 is true                      H0 is false
Retain H0            Correct decision                Type II error (β) (0.2, 0.1)
Reject H0            Type I error (α) (0.05, 0.01)   Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
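A sample-size calculation of this kind can be sketched with the usual normal approximation for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². This formula is a standard approximation and is an assumption here, not something stated on the slides. The effect size δ = 0.5 and σ = 1.0 are hypothetical inputs.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_beta = std_normal.inv_cdf(1 - beta)        # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

n_needed = n_per_group(effect=0.5, sigma=1.0)               # beta = 0.2, power 80%
n_stricter = n_per_group(effect=0.5, sigma=1.0, beta=0.1)   # beta = 0.1, power 90%

print(n_needed, n_stricter)
```

Lowering the allowed β from 0.2 to 0.1 raises the required sample size. An exact t-based calculation would give slightly larger numbers than this approximation.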

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
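The link between a confidence interval and a hypothesis test can be illustrated with the paired virus example from earlier in the deck. The sketch below builds a 95% interval for the mean difference from the eight differences given there. The tabled value 2.365 is taken from that example, and the variable names are illustrative.

```python
import math

# Within-plant differences (Virus 1 minus Virus 2) from the paired t-test example
diffs = [-1, 6, 13, 4, 1, 1, 3, 5]
n = len(diffs)
d_bar = sum(diffs) / n
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
se_d_bar = s_d / math.sqrt(n)

T_CRIT = 2.365  # two-tailed t(0.05, 7), as tabled in the paired example
lower = d_bar - T_CRIT * se_d_bar
upper = d_bar + T_CRIT * se_d_bar
zero_inside = lower <= 0 <= upper

print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
print("zero inside interval:", zero_inside)
```

The interval excludes zero, which agrees with rejecting H0 at the 5% level in that example. It also shows the range of plausible effect sizes, which the test alone does not give.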

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
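The selection rules above can be written as a small lookup function. This is only a sketch of the rules as listed, for a continuous outcome. The function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_groups=None, n_independent_vars=1, any_continuous_iv=False):
    """Suggest a test for comparing means of a continuous outcome."""
    if n_independent_vars == 1:
        # One independent variable: the number of groups decides
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare means")
    # More than one independent variable: the type of the variables decides
    if any_continuous_iv:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_independent_vars=2, any_continuous_iv=True))
```

The function covers only the cases named on the slide. Paired designs and categorical outcomes, such as those handled by the chi-square test, need separate rules.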

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric are two types of statistics; choosing between them requires an analysis of assumptions.
- Pearson r, t tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 55: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Degree of Freedom (df)Degree of Freedom (df)

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds them innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage | Chances of sending an innocent man to jail
20% | 1 in 5
5% | 1 in 20
1% | 1 in 100
0.01% | 1 in 10,000

Controlling Type I and Type II Errors

- α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
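The relationships listed above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a one-sided z-test with a known standard deviation, and the effect size (2 units) and standard deviation (10 units) are hypothetical values that do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def type_ii_error(alpha: float, n: int, effect: float, sigma: float) -> float:
    """Return beta for a one-sided z-test of H0: mu = mu0 vs HA: mu = mu0 + effect."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    shift = effect * sqrt(n) / sigma         # centre of the test statistic when HA is true
    return STD_NORMAL.cdf(z_crit - shift)    # P(fail to reject H0 | HA is true)


# Hypothetical numbers: true effect of 2 units, population sd of 10 units.
for alpha in (0.10, 0.05, 0.01):
    for n in (25, 100, 400):
        beta = type_ii_error(alpha, n, effect=2.0, sigma=10.0)
        print(f"alpha={alpha:<5} n={n:<4} beta={beta:.3f}")
```

Reading across the printed rows, β falls as n grows for a fixed α, and β rises as α is made smaller for a fixed n.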

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome, or one more extreme, assuming that chance alone was involved in creating the outcome. In other words: assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.
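One way to make "chance alone" concrete is an exact permutation test: if treatment X has no effect, the group labels are arbitrary, so every relabelling of the subjects is equally likely. The sketch below is an illustration only. The eight outcome values are hypothetical, and the permutation approach is a swapped-in technique chosen for transparency, not a method named in the slides.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome Y for 4 treated and 4 control subjects.
treated = [12, 14, 15, 16]
control = [8, 9, 10, 11]


def permutation_p_value(group_a, group_b):
    """Two-sided exact p-value for the difference in means under chance alone."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    extreme = total = 0
    # Enumerate every way of choosing which subjects are labelled "group a".
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p = permutation_p_value(treated, control)
print(f"p-value = {p:.4f}")
```

With these numbers only 2 of the 70 possible relabellings are as extreme as the observed split, so the p-value is 2/70.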

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision | Reality: H0 is true | Reality: H0 is false
Retain H0 | Correct decision | Type II error (β) (0.2, 0.1)
Reject H0 | Type I error (α) (0.05, 0.01) | Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
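A rough sketch of such a sample-size calculation follows. It assumes a two-sided comparison of two group means with equal group sizes and a known common standard deviation, and it uses the normal approximation, which slightly understates the sample size an exact t-based calculation would give. The function name and the planning values (effect 5, standard deviation 10) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect: float, sigma: float, alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate subjects per group for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected effect of 5 units, sd of 10 units.
print(n_per_group(effect=5.0, sigma=10.0))             # alpha = 0.05, power = 0.8
print(n_per_group(effect=5.0, sigma=10.0, beta=0.1))   # power = 0.9 needs more subjects
```

Lowering β from 0.2 to 0.1 raises the required sample size, which matches the trade-off between α, β and n described earlier.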

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
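To see how the size of an interval responds to sample size and confidence level, here is a minimal sketch. It assumes a normal-approximation interval for a single mean (small samples would use a t critical value instead), and the standard deviation of 10 is a hypothetical value.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


# Hypothetical measure with standard deviation 10.
for n in (25, 100):
    for confidence in (0.90, 0.95, 0.99):
        width = ci_half_width(10.0, n, confidence)
        print(f"n={n:<4} confidence={confidence:.2f} half-width={width:.2f}")
```

The interval narrows as n grows and widens as the requested confidence increases.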

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- df = sample size − number of parameters estimated
- df is n − 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares means of two groups
- Interval/ratio level of measurement
- Independent samples t-test
- Dependent or paired samples t-test

ANOVA (analysis of variance)
- More than 2 means to compare, or more than 2 tests of means
- Interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- Nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of the variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t-tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T-test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
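The "either ANOVA or t-test" choice is not arbitrary: with exactly two independent groups and a pooled variance, the one-way ANOVA F statistic equals the square of the t statistic. The sketch below checks this numerically. The anxiety scores and function names are hypothetical and serve only to illustrate the identity.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical anxiety scores for two independent classes.
sociology = [34, 41, 38, 45, 39, 42]
political_science = [30, 36, 33, 40, 35, 31]


def pooled_t(a, b):
    """Independent-samples t statistic assuming a common variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    grand = mean([x for g in groups for x in g])
    k, n = len(groups), sum(len(g) for g in groups)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


t = pooled_t(sociology, political_science)
f = one_way_f(sociology, political_science)
print(f"t = {t:.3f}, t^2 = {t * t:.3f}, F = {f:.3f}")
```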

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Degree of Freedom (df)

Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows: H0: μ1 = μ2, HA: μ1 ≠ μ2.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.
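The practical consequence of choosing a tail can be shown with a short sketch. It assumes a test statistic that is standard normal under the null hypothesis, and the value z = 1.80 is hypothetical.

```python
from statistics import NormalDist

z = 1.80  # hypothetical standardized test statistic
std_normal = NormalDist()

# Directional: HA says the effect is positive, so only the upper tail counts.
p_upper = 1 - std_normal.cdf(z)
# Non-directional: a departure in either tail counts as evidence.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-tailed p = {p_upper:.4f}")
print(f"two-tailed p = {p_two_sided:.4f}")
```

Here the same statistic falls below α = 0.05 in the one-tailed test but not in the two-tailed test, which is why the direction has to be fixed before any analysis begins.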

Two-sample t-test

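We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group | N | Mean | Std Dev
Healthy | 9 | 18.9 | 5.9
Cystic Fibrosis | 13 | 11.9 | 6.3

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

\[ t_{0.05,\,20} = 2.086 \]

\[ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

\[ |t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A \]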
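The hand calculation above can be reproduced from the summary statistics alone. This is a minimal sketch using only the standard library. The function name is hypothetical, and the critical value 2.086 is taken from the slide rather than computed.

```python
from math import sqrt


def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df, pooled variance, standard error) for an equal-variance t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df, pooled_var, std_err


# Healthy (n = 9) vs cystic fibrosis (n = 13) summary statistics from the slide.
t, df, pooled_var, std_err = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRIT = 2.086  # tabled t(0.05, 20) quoted on the slide
print(f"s^2 = {pooled_var:.2f}, SE = {std_err:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```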

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 − μ2 = 0

HA: μ1 ≠ μ2 or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis: Sale Price, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location | N | Mean | StDev | SE Mean
East | 34 | 312702 | 188163 | 32270
West | 26 | 339143 | 255571 | 50122

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) − μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
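The reported DF of 44, rather than 34 + 26 − 2 = 58, suggests that the output comes from the unpooled (Welch) form of the test, which does not assume equal variances. That reading is an assumption, and the sketch below only checks it against the summary statistics. The function name is hypothetical.

```python
from math import sqrt


def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df) for a two-sample t-test that does not assume equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2   # squared standard errors of each mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# East (n = 34) vs West (n = 26) summary statistics from the output above.
t, df = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)
# The df is truncated to an integer here for comparison with the reported DF.
print(f"t = {t:.2f}, df = {int(df)}")
```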

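Paired t-test

A test for two observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:
- Calculate the difference between the two measurements for each individual: dj = X1j − X2j
- Calculate

\[ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) \]

\[ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 \]

Example: testing infection by two viruses on tobacco leaves (number of dead spots on the leaves)

Plant | Virus 1 (X1j) | Virus 2 (X2j) | dj
1 | 9 | 10 | -1
2 | 17 | 11 | 6
3 | 31 | 18 | 13
4 | 18 | 14 | 4
5 | 7 | 6 | 1
6 | 8 | 7 | 1
7 | 20 | 17 | 3
8 | 10 | 5 | 5
Total | 120 | 88 | 32
Mean | 15 | 11 | 4

\[ \bar{d} = 4 \]

\[ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31 \]

\[ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{4}{1.52} = 2.632 > t_{0.05(7)} = 2.365 \]

Since t exceeds the critical value, H0: μd = 0 is rejected in favour of HA: μd ≠ 0.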
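The paired calculation can be reproduced directly from the plant data. This is a minimal sketch with the standard library. The critical value 2.365 is the tabled value quoted on the slide, not one computed here.

```python
from math import sqrt
from statistics import mean, stdev

# Counts per plant for the two viruses, as given in the table above.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]  # per-plant differences d_j
n = len(d)
d_bar = mean(d)
s_d = stdev(d)          # sample standard deviation, n - 1 in the denominator
se = s_d / sqrt(n)      # standard error of the mean difference
t = d_bar / se

T_CRIT = 2.365  # tabled t(0.05, 7) quoted on the slide
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {n - 1}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Computed without intermediate rounding, t comes out slightly below the slide's 2.632, which was obtained after rounding the standard error to 1.52. The decision is the same.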

Advantages
1) It is usually easier to detect a true small difference
2) There is no need to consider whether the variances of the two populations are the same or not

When the paired measurements are positively correlated:

\[ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 \]

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
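The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only. The function names are hypothetical, and the inputs are the values quoted in the worked examples of this deck.

```python
def decide_by_table(t_stat: float, t_crit: float) -> str:
    """Table method: compare |t| with the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"


def decide_by_p_value(p_value: float, alpha: float = 0.05) -> str:
    """p-value method: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Values quoted in the worked examples of this deck.
print(decide_by_table(t_stat=2.63, t_crit=2.086))    # healthy vs cystic fibrosis
print(decide_by_p_value(p_value=0.660, alpha=0.05))  # east vs west sale prices
```

Applied to the same test at the same α, the two rules give the same decision, so only one of them needs to be reported.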

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form hypotheses as below before hearing a case.

Hypotheses
H0: The evidence is not significantly strong enough to prove the defendant guilty
H1: The evidence is significantly strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypotheses.

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats
- Descriptive statistics summarize measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics, each requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
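With only two independent groups the two choices lead to the same conclusion, because the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below checks this numerically; the anxiety scores are hypothetical, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic assuming a common variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def one_way_anova_f(*groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical anxiety scores for the two classes (not from the slides).
sociology = [1, 2, 3]
political_science = [2, 4, 6]

t_stat = pooled_t(sociology, political_science)
f_stat = one_way_anova_f(sociology, political_science)
```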

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 57: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-test

Inferential Statistics
Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus no direction for the expected results is specified. The null and alternative hypotheses are as follows.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                  N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) \]

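Two-sample t-test

\[ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)5.9^2 + (13 - 1)6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.73 \]

\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{37.73 \left( \frac{1}{9} + \frac{1}{13} \right)} = 2.66 \]

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 \]

\[ t_{0.05}(20) = 2.086 \]

\[ |t| > t_{0.05}(20) \Rightarrow \text{reject } H_0, \text{ accept } H_A \]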
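The hand calculation for the Healthy vs. Cystic Fibrosis comparison can be reproduced from the summary statistics alone. The sketch below assumes the pooled-variance formula stated on the slide and hardcodes the tabled critical value t0.05(20) = 2.086 quoted there; the function name is hypothetical, and the 95% confidence interval at the end is an extra step not shown on the slide.

```python
from math import sqrt


def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled two-sample t statistic, its degrees of freedom, and the SE."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, se_diff


# Summary statistics from the slide: Healthy (n=9) vs. Cystic Fibrosis (n=13).
t_stat, df, se_diff = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRITICAL = 2.086  # two-tailed t_0.05 with 20 df, as quoted on the slide
reject_h0 = abs(t_stat) > T_CRITICAL

# Added illustration: 95% confidence interval for mu1 - mu2.
diff = 18.9 - 11.9
ci_low, ci_high = diff - T_CRITICAL * se_diff, diff + T_CRITICAL * se_diff
```

Because the interval excludes zero, it agrees with the decision to reject H0 at the 5% level.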

Two-sample t-test
MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select ... (SalePrice)
Subscripts: Select ... (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0
HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Two-sample t-test
Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26,441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) - μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices between the East and West markets.
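The T-Value and DF in this output can be checked from the printed summary statistics. The sketch below assumes the unpooled (Welch) standard error with Welch-Satterthwaite degrees of freedom; that choice is an assumption rather than something the slide states, although it reproduces the DF = 44 shown. The function name is hypothetical, and the means are the rounded values printed in the table.

```python
from math import sqrt


def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t_stat = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# East vs. West sale prices, using the rounded values printed in the output.
t_stat, welch_df = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)

# Truncate the fractional df to an integer, matching the DF printed above.
reported_df = int(welch_df)
```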

Paired t-test
A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method:
Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate:

\[ \bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]

H0: μd = 0    HA: μd ≠ 0

\[ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) \]

Paired t-test
Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

H0: μd = 0    HA: μd ≠ 0

\[ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31 \]

\[ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 \]

\[ t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365 \]

Since |t| exceeds the critical value, H0 is rejected.
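The paired calculation for the tobacco-leaf data can be reproduced directly from the table. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value t0.05(7) = 2.365 is the tabled value quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev

# Counts per plant from the slide's table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Step 1: per-plant differences d_j = X_1j - X_2j.
differences = [a - b for a, b in zip(virus1, virus2)]
n = len(differences)

# Step 2: mean difference, its standard deviation, and its standard error.
d_bar = mean(differences)
s_d = stdev(differences)
se_d_bar = s_d / sqrt(n)

# Step 3: test statistic with n - 1 degrees of freedom.
t_stat = d_bar / se_d_bar

T_CRITICAL = 2.365  # two-tailed t_0.05 with 7 df, as quoted on the slide
reject_h0 = abs(t_stat) > T_CRITICAL
```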

Paired t-test

Advantages:
1) It is usually easier to detect a small true difference.
2) There is no need to consider whether the variances of the two populations are the same.

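\[ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 \] (typically, when the paired measurements are positively correlated)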


Hypothesis Testing
Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up):
1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing
3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up):
2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing
[Figure: Rejection region at the α = 10% significance level]
[Figure: Rejection region at the α = 5% significance level]
[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing
3 Steps to carry out a Hypothesis Test
3. Use one of three possible decision rules (don't mix them up):
3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
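The table and p-value decision rules reduce to one comparison each. The sketch below is illustrative: the function names are hypothetical, and the inputs are the values from the worked two-sample examples earlier in the deck (t = 2.63 against t0.05(20) = 2.086, and p = 0.660 against α = 0.10).

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


cystic_fibrosis_decision = decide_by_table(2.63, 2.086)
real_estate_decision = decide_by_p_value(0.660, alpha=0.10)
```

Both rules give the same answer for the same test and α, which is why the slide warns against mixing them within one analysis.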


Hypothesis Testing
An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the following hypotheses before hearing a case.
Hypotheses:
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing
The main role of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The role of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing
To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing
Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors
Alpha and Beta

                                     NULL HYPOTHESIS
Decision                             TRUE                               FALSE
Reject the null hypothesis           Type I error (α): rejecting a      CORRECT
                                     true null hypothesis
Fail to reject the null hypothesis   CORRECT                            Type II error (β): failing to
                                                                        reject a false null hypothesis

Type I and Type II Errors
Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors
A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors
However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors
The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors
In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors
A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors
The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors
Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10,000

Controlling Type I and Type II Errors
α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
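The trade-off between α, β, and n can be made concrete with a simple case where β has a closed form. The sketch below assumes a one-sided, one-sample z-test with known standard deviation; the function name, the effect size of 0.5, and the sample sizes are all hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 against a true mean of `effect`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # reject H0 when the z statistic exceeds this
    # Under the alternative, the z statistic is centred at effect * sqrt(n) / sigma.
    return z.cdf(critical - effect * sqrt(n) / sigma)


beta_baseline = type_ii_error(alpha=0.05, n=10)
beta_larger_n = type_ii_error(alpha=0.05, n=40)      # fixed alpha, larger n
beta_smaller_alpha = type_ii_error(alpha=0.01, n=10)  # fixed n, smaller alpha
```

Holding α fixed, the larger sample gives a smaller β; holding n fixed, the stricter α gives a larger β.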

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study
- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies
- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study
Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study
At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study
The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing
The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing
Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing
If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the significance level or p-value of the test.

Hypothesis testing
P-value

The probability of the observed outcome (or one at least as extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
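One way to see a p-value as "the probability of the observed outcome assuming chance alone" is an exact permutation test: if the treatment has no effect, every way of splitting the pooled observations into two groups is equally likely. The sketch below is a swapped-in illustration, not the t-based method used elsewhere in the deck; the data values and function name are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_difference(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_difference(sum(group_a)))
    as_extreme = 0
    n_splits = 0
    # Enumerate every relabelling of the pooled data into groups of the same sizes.
    for indices in combinations(range(len(pooled)), n_a):
        n_splits += 1
        sum_a = sum(pooled[i] for i in indices)
        # Small tolerance guards against floating-point ties.
        if abs(mean_difference(sum_a)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


# Hypothetical outcomes for a treated and a control group.
treated = [12, 15, 14]
control = [8, 9, 10]
p_value = permutation_p_value(treated, control)
```

With three subjects per group there are only 20 possible splits, so the smallest attainable two-sided p-value is 2/20; this is one reason small studies rarely reach conventional significance levels.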

Hypothesis testing
If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision \ Reality   H0 is true                       H0 is false
Retain H0            Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0            Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing
Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
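A common textbook approximation for this calculation, for comparing two group means with a two-sided test, is n per group = 2 (z_{1-α/2} + z_{1-β})² σ² / δ². The sketch below applies it under a normal approximation; the function name and the example effect (δ = 5) and standard deviation (σ = 10) are hypothetical, and dedicated power software based on the t distribution will give slightly larger answers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(1 - beta)        # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of 5 units with a standard deviation of 10 (illustrative values).
n_for_80_percent_power = sample_size_per_group(effect=5.0, sigma=10.0)
n_for_90_percent_power = sample_size_per_group(effect=5.0, sigma=10.0, beta=0.1)
```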

Point estimation
Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation
Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.


Page 58: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Inferential StatisticsInferential StatisticsInferential StatisticsInferential StatisticsNon-directional (2-tailed test) In this form of the test departure can

be observed from either end of the distribution Thus no direction for expected results are specified The null and alternative hypotheses are as follows

Directional (1-tailed test) In this form of the test the rejection region lies at only one end of the distribution The direction is specified before any analysis begins

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below shows all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                  NULL HYPOTHESIS
                                  TRUE                       FALSE
Reject the null hypothesis        Type I error (α):          CORRECT
                                  rejecting a true
                                  null hypothesis
Fail to reject the null           CORRECT                    Type II error (β):
hypothesis                                                   failing to reject a
                                                             false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
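The trade-off between α, β and n can be checked numerically. The sketch below assumes a setting that is not in the slides: a one-sided z-test of H0: μ = 0 with known standard deviation, where the true mean is 0.5. The function name `type_ii_error` and all numeric values are illustrative only.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float = 0.5, sigma: float = 1.0) -> float:
    """Return beta for a one-sided z-test of H0: mu = 0 when the true mean is `effect` > 0.

    Assumes a known standard deviation `sigma` (illustrative setup only).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # rejection cutoff on the z scale
    shift = effect * n ** 0.5 / sigma          # mean of the z statistic when HA is true
    return std_normal.cdf(z_crit - shift)      # P(fail to reject H0 | HA is true)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.01):
        for n in (10, 25, 50):
            print(f"alpha={alpha:<5} n={n:<3} beta={type_ii_error(alpha, n):.3f}")
```

Reading the printed grid across a row shows β falling as n grows for fixed α; reading down a column shows β rising as α shrinks for fixed n.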

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
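The three probability sampling strategies named above can be sketched with Python's standard `random` module. All function names, the toy population and the ward names are hypothetical; a real study would draw from an actual sampling frame.

```python
import random


def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection; draws are without replacement."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Split the population into strata, then draw a simple random sample in each."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, k_per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit inside the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


if __name__ == "__main__":
    rng = random.Random(2023)  # fixed seed so the illustration is repeatable
    people = [(i, "F" if i % 2 else "M") for i in range(20)]
    wards = {"ward A": [1, 2, 3], "ward B": [4, 5], "ward C": [6]}
    print(simple_random_sample(people, 5, rng))
    print(stratified_sample(people, lambda unit: unit[1], 3, rng))
    print(cluster_sample(wards, 2, rng))
```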

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value: the probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome like the observed one?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality →
Decision      H0 is true                          H0 is false
Retain H0     Correct decision                    Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)       Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually, one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
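As a rough illustration of such a power calculation, the sketch below uses the usual normal-approximation formula for comparing two group means with a two-sided test, n per group = 2((z for 1 − α/2 + z for 1 − β)·σ/δ)². The formula choice, the function name `n_per_group`, and the effect size and standard deviation in the demo are assumptions for illustration, not values from the slides.

```python
import math
from statistics import NormalDist


def n_per_group(effect: float, sigma: float, alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate sample size per group for a two-sided comparison of two means.

    Normal approximation with a common, known standard deviation `sigma`;
    `effect` is the expected difference in means (must be non-zero).
    """
    if effect == 0:
        raise ValueError("effect must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = z.inv_cdf(1.0 - beta)          # quantile matching the desired power
    return math.ceil(2.0 * ((z_alpha + z_beta) * sigma / effect) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values: expected effect of half a standard deviation.
    for beta in (0.2, 0.1):
        print(f"beta={beta}: n per group = {n_per_group(0.5, 1.0, beta=beta)}")
```

Lowering β (asking for more power) or expecting a smaller effect both push the required sample size up.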

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically, one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
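The selection rules above amount to a small decision tree. The sketch below encodes them for a continuous outcome only; the function name `choose_test` and its parameters are hypothetical, and it covers none of the other considerations (assumptions, pairing, sample size) that affect the choice in practice.

```python
def choose_test(n_independent_vars: int, n_groups: int = 2, all_categorical: bool = True) -> str:
    """Suggest a test for a continuous outcome, following the rules in the slide."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups < 2:
            raise ValueError("need at least two groups to compare")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"


if __name__ == "__main__":
    print(choose_test(1, n_groups=2))
    print(choose_test(1, n_groups=4))
    print(choose_test(2, all_categorical=True))
    print(choose_test(3, all_categorical=False))
```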

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size − number of parameters estimated. Df is n − 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are the two types of statistics; choosing between them requires checking assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the accompanying graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Inferential Statistics

Non-directional (2-tailed test): In this form of the test, departure can be observed from either end of the distribution. Thus, no direction for the expected results is specified. The null and alternative hypotheses are as follows.

Directional (1-tailed test): In this form of the test, the rejection region lies at only one end of the distribution. The direction is specified before any analysis begins.

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                    N     Mean    Std Dev
Healthy             9     18.9    5.9
Cystic Fibrosis    13     11.9    6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ t_{0.05,\,20} = 2.086 $$

|t| > t_{0.05, 20}: reject H0, accept HA.
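The pooled-variance calculation for the healthy vs. cystic fibrosis example can be reproduced from the summary statistics alone. The sketch below is a minimal illustration: the function name `pooled_t` is hypothetical, and the critical value 2.086 is taken from the slide rather than computed.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se_diff, df


if __name__ == "__main__":
    # Summary statistics from the slide: healthy vs. cystic fibrosis.
    t, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
    t_crit = 2.086  # two-sided critical value at alpha = 0.05, df = 20 (from the slide)
    print(f"t = {t:.2f}, df = {df}")
    print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```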

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose "Samples in one column"

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check "Boxplots of data", OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2, or μ1 − μ2 = 0

HA: μ1 ≠ μ2, or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis: Sale Price, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) − μ (West)
Estimate for difference: −26442
90% CI for difference: (−126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = −0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We are 90% confident that the true mean difference in sale prices is in the interval (−$126,602 < μ (East) − μ (West) < $73,719).

p-value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
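The T-value and DF in this output can be checked from the summary statistics. The reported DF of 44, rather than n1 + n2 − 2 = 58, suggests the equal-variance assumption was not used here, so the sketch below assumes the unequal-variance (Welch) form with Satterthwaite's approximation for the degrees of freedom. The function name `welch_t` is hypothetical.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance two-sample t statistic; returns (t, approximate df, SE)."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


if __name__ == "__main__":
    # Summary statistics from the output: East vs. West sale prices.
    t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)
    print(f"t = {t:.2f}, df = {math.floor(df)}, SE of difference = {se:.0f}")
```

Truncating the approximate degrees of freedom to a whole number reproduces the DF shown in the output.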

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: d_j = X_1j − X_2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1          9               10              −1
2         17               11               6
3         31               18              13
4         18               14               4
5          7                6               1
6          8                7               1
7         20               17               3
8         10                5               5
Total    120               88              32
Mean      15               11               4

H0: μd = 0    HA: μd ≠ 0

$$ \bar{d} = 4 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1}} = 4.3 $$

$$ s_{\bar{d}} = \frac{4.3}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$

Since t exceeds the critical value, we reject H0.
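The paired calculation for the eight tobacco plants can be reproduced with Python's `statistics` module. This is a minimal sketch: the function name `paired_t` is hypothetical, and the critical value 2.365 is the one quoted in the slide rather than computed.

```python
import math
from statistics import mean, stdev

# Counts per plant, as listed in the slide's table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]


def paired_t(x1, x2):
    """Paired t statistic on the within-pair differences; returns (t, df)."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    se_mean_diff = stdev(d) / math.sqrt(n)  # stdev uses the n - 1 divisor
    return mean(d) / se_mean_diff, n - 1


if __name__ == "__main__":
    t, df = paired_t(virus1, virus2)
    t_crit = 2.365  # two-sided critical value at alpha = 0.05, df = 7 (from the slide)
    print(f"t = {t:.2f}, df = {df}")
    print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```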

Paired t-test

Advantages:
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

22

21 xxd ss


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to (≥) the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic is less than (<) the critical value from the table, fail to reject the null hypothesis (H0).

Rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
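The selection rules above can be written down as a small lookup function. This is only a sketch of the rules as stated for a continuous outcome; the function name and its parameters are hypothetical and it does not check any test assumptions.

```python
def choose_test(n_groups=None, n_predictors=1, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_predictors == 1:
        # One independent variable: the choice depends on the number of groups.
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare means")
    # More than one independent variable.
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"


print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_predictors=2, any_continuous_predictor=True))
```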

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise, two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size − Number of parameters estimated
- Df is n − 1 for a one-sample test of the mean

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct test.
- Parametric and non-parametric: two types of statistics, both requiring analysis of assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

| Group | N | Mean | Std Dev |
|---|---|---|---|
| Healthy | 9 | 18.9 | 5.9 |
| Cystic Fibrosis | 13 | 11.9 | 6.3 |

$$ t = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{s_{\bar x_1 - \bar x_2}} \sim t(df = n_1 + n_2 - 2) $$

Two-sample t-test

Critical value: $t_{0.05,\,20} = 2.086$

Pooled variance:

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.7 $$

Standard error of the difference:

$$ s_{\bar x_1 - \bar x_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

Test statistic:

$$ t = \frac{\bar x_1 - \bar x_2}{s_{\bar x_1 - \bar x_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

Since $|t| > t_{0.05,\,20}$, reject H0 and accept HA.
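The hand calculation for the healthy and cystic fibrosis groups can be checked with a few lines of Python. This is a minimal sketch working from the summary statistics only; the function name is an assumption, and the critical value 2.086 is the tabled value quoted in the slides rather than something computed here.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    t = (mean1 - mean2) / se
    return t, df, pooled_var, se


# Summary statistics from the slide: healthy vs. cystic fibrosis.
t, df, pooled_var, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086                 # tabled two-sided critical value for alpha = 0.05, df = 20
reject_h0 = abs(t) > T_CRIT    # table method: compare |t| with the critical value
print(round(pooled_var, 1), round(se, 2), round(t, 2), df, reject_h0)
```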

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

- Stat > Basic Statistics > 2-Sample t
- Choose "Samples in one column"
- Samples: select SalePrice
- Subscripts: select Location
- Click Graphs, check "Boxplots of data", OK
- Click Options
- Confidence level: enter 90, OK, OK

Ho: μ1 = μ2 or μ1 − μ2 = 0

HA: μ1 ≠ μ2 or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

| Location | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| East | 34 | 312702 | 188163 | 32270 |
| West | 26 | 339143 | 255571 | 50122 |

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices is $26,441 (from the rounded means above, East lower than West). We have 90% confidence that the true mean difference in sale prices is in the interval -$126,602 < μ (East) − μ (West) < $73,719.

p-Value = 0.660 > α = 0.05, 0.10: fail to reject Ho.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
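The Minitab output can be approximately reproduced from the summary table. The reported DF = 44 is consistent with an unpooled (Welch) test rather than the pooled test, which would give 58; that reading is an inference from the numbers, not something the slides state. The sketch below therefore uses the Welch formulas. The function name is hypothetical, and the critical value 1.680 is an approximate tabled value for a 90% two-sided interval with 44 degrees of freedom, supplied here as an assumption.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1                      # squared standard error of mean 1
    v2 = sd2 ** 2 / n2                      # squared standard error of mean 2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


# Summary statistics from the output: East vs. West sale prices.
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)

T_CRIT_90 = 1.680                           # assumed tabled value, 90% two-sided, 44 df
diff = 312702 - 339143
ci_low, ci_high = diff - T_CRIT_90 * se, diff + T_CRIT_90 * se
print(round(t, 2), int(df), round(ci_low), round(ci_high))
```

Because the inputs are rounded summary values, the interval endpoints differ from the Minitab figures by a few dollars; the t value and the truncated degrees of freedom agree with the output.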

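Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$$ \bar d = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar d)^2}{n - 1}}, \qquad s_{\bar d} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar d}{s_{\bar d}} = \frac{\bar d}{s_d / \sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Paired t-test

Example: test infection of virus on tobacco leaves (number of dead spots on leaves).

| Plant | Virus 1 (X1j) | Virus 2 (X2j) | dj |
|---|---|---|---|
| 1 | 9 | 10 | -1 |
| 2 | 17 | 11 | 6 |
| 3 | 31 | 18 | 13 |
| 4 | 18 | 14 | 4 |
| 5 | 7 | 6 | 1 |
| 6 | 8 | 7 | 1 |
| 7 | 20 | 17 | 3 |
| 8 | 10 | 5 | 5 |
| Total | 120 | 88 | 32 |
| Mean | 15 | 11 | 4 |

H0: μd = 0    HA: μd ≠ 0

$$ \bar d = 4, \qquad s_d = \sqrt{\frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2/8}{8 - 1}} = 4.31, \qquad s_{\bar d} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.632 > t_{0.05}(7) = 2.365 $$

Since $t > t_{0.05}(7)$, reject H0 at the 0.05 level.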
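The tobacco leaf example can be recomputed directly from the eight pairs of counts. This is a minimal sketch using only the standard library; the variable names are assumptions, and the critical value 2.365 is the tabled value quoted in the slides. Without intermediate rounding the statistic comes out at about 2.63, in line with the hand calculation.

```python
import math
from statistics import mean, stdev

# Counts on the same plants under the two viruses (data from the slide).
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # per-plant differences d_j
n = len(d)
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # sample SD of the differences (n - 1 divisor)
se_d_bar = s_d / math.sqrt(n)                 # standard error of the mean difference
t = d_bar / se_d_bar                          # paired t statistic with n - 1 df

T_CRIT = 2.365                                # tabled t_{0.05}(7) from the slide
reject_h0 = abs(t) > T_CRIT
print(d, d_bar, round(s_d, 2), round(se_d_bar, 2), round(t, 2), reject_h0)
```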

Paired t-test

Advantages:

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

The paired test uses $s_{\bar d}^2$ in place of $s_{\bar x_1 - \bar x_2}^2$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, Table method: Compare the calculated value with a table of the critical values of the test statistic.

- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 2, Graph method: Compare the calculated value with the t-distribution graph of the test statistic.

- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 3, p-value method:

- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
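The table method and the p-value method reduce to two one-line comparisons, sketched below. The function names are hypothetical. The inputs in the usage lines are values quoted elsewhere in these slides (t = 2.63 against the critical value 2.086, and p = 0.660 from the sale price comparison).

```python
def decide_by_critical_value(test_statistic, critical_value):
    """Table method: compare the absolute test statistic with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Values quoted in the worked examples of this deck.
table_decision = decide_by_critical_value(2.63, 2.086)
p_decision = decide_by_p_value(0.660, alpha=0.05)
print(table_decision, "|", p_decision)
```

For the same test and the same α the two rules lead to the same decision; the warning not to mix them up is about not comparing a p-value against a critical value or a test statistic against α.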

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case, i.e. the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the hypotheses below before hearing a case.

Ho: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

| Decision | NULL HYPOTHESIS TRUE | NULL HYPOTHESIS FALSE |
|---|---|---|
| Reject the null hypothesis | Type I error (α): rejecting a true null hypothesis | CORRECT |
| Fail to reject the null hypothesis | CORRECT | Type II error (β): failing to reject a false null hypothesis |

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

| Percentage | Chances of sending an innocent man to jail |
|---|---|
| 20% chance | 1 in 5 |
| 5% chance | 1 in 20 |
| 1% chance | 1 in 100 |
| 0.01% chance | 1 in 10,000 |

Controlling Type I and Type II Errors

- α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
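The relationships between α, β and n can be seen numerically in a simple setting. The sketch below assumes a one-sided, one-sample z-test with known standard deviation, for which β = Φ(z_{1-α} − δ√n/σ); the test type, the function name and the effect size of 0.5 standard deviations are all illustrative assumptions, not taken from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta_one_sided_z(alpha, n, effect, sd):
    """Type II error probability of a one-sided one-sample z-test."""
    z_crit = Z.inv_cdf(1 - alpha)                     # rejection threshold under H0
    return Z.cdf(z_crit - effect * n ** 0.5 / sd)     # chance of missing the true effect


# Hypothetical true effect of 0.5 standard deviations.
beta_base = beta_one_sided_z(alpha=0.05, n=25, effect=0.5, sd=1.0)
beta_bigger_n = beta_one_sided_z(alpha=0.05, n=50, effect=0.5, sd=1.0)       # fixed alpha, larger n
beta_smaller_alpha = beta_one_sided_z(alpha=0.01, n=25, effect=0.5, sd=1.0)  # fixed n, smaller alpha
beta_both = beta_one_sided_z(alpha=0.01, n=100, effect=0.5, sd=1.0)          # smaller alpha, much larger n
print(beta_base, beta_bigger_n, beta_smaller_alpha, beta_both)
```

The last line shows the only way to push both error probabilities down at once: a stricter α combined with a larger sample.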

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
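The three probability sampling strategies can be sketched with the standard library's random module. Everything below is illustrative: the function names, the made-up population of 100 subjects labelled by sex, and the ward clusters are assumptions, and the stratified version uses proportionate allocation, which is only one of several possible allocation schemes.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Every subject has the same chance of entering the sample."""
    return random.Random(seed).sample(population, k)


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from each stratum (proportionate allocation)."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every subject inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


# Hypothetical population: 100 subjects, 25 labelled "M" and 75 labelled "F".
population = [(i, "M" if i % 4 == 0 else "F") for i in range(100)]
# Hypothetical clusters: four wards of 25 subjects each.
wards = {f"ward{w}": population[w * 25:(w + 1) * 25] for w in range(4)}

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_sample(population, lambda unit: unit[1], 0.2, seed=1)
clus = cluster_sample(wards, 2, seed=1)
print(len(srs), len(strat), len(clus))
```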

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:

- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

- Null Hypothesis (HO): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the significance level or p-value of the test.

P-value

The probability of the observed outcome, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
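One way to see what "assuming chance alone" means is an exact permutation test: if treatment has no effect, the group labels are arbitrary, so every way of splitting the pooled outcomes into two groups is equally likely. The sketch below counts how many splits give a difference in means at least as extreme as the one observed. The two small groups of outcome values are made up for illustration, and the function name is hypothetical; this is a different procedure from the t-test used elsewhere in the slides.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided p-value for a difference in means under random relabelling."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    extreme = 0
    n_splits = 0
    # Enumerate every way of choosing which pooled values get the "A" label.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        n_splits += 1
        if abs(diff) >= abs(observed) - 1e-12:   # tolerance guards against float noise
            extreme += 1
    return extreme / n_splits


# Hypothetical outcome Y for a treated and a control group.
treated = [12, 15, 14, 16]
control = [9, 10, 11, 13]
p_value = exact_permutation_p(treated, control)
print(p_value)
```

With these made-up values, 4 of the 70 possible splits are at least as extreme as the observed one, so the p-value is 4/70, or about 0.057.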


Page 61: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors: Alpha and Beta

                                 NULL HYPOTHESIS
Decision                         TRUE                      FALSE
Reject the null hypothesis       Type I error (α):         CORRECT
                                 rejecting a true
                                 null hypothesis
Fail to reject the null          CORRECT                   Type II error (β):
hypothesis                                                 failing to reject a
                                                           false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
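The trade-off between the two error probabilities and n can be checked numerically. The sketch below assumes a one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with known standard deviation, a setting chosen here only because β then has a closed form. The function name `type_ii_error` and the effect size of 0.5 standard deviations are illustrative assumptions.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float, sigma: float = 1.0) -> float:
    """Beta for a one-sided z-test when the true mean exceeds mu0 by `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)       # critical value under H0
    shift = effect * n ** 0.5 / sigma   # how far the alternative moves the statistic
    return z.cdf(z_crit - shift)        # P(fail to reject H0 | HA is true)


# Same effect size throughout; vary alpha and n one at a time.
for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25)]:
    print(alpha, n, round(type_ii_error(alpha, n, effect=0.5), 3))
```

Under these assumptions, doubling n at fixed α lowers β, while tightening α at fixed n raises it.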

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the p-value (the observed significance level) of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                 Reality ->
Decision         H0 is true                        H0 is false
Retain H0        Correct Decision                  Type II Error (β) (0.2, 0.1)
Reject H0        Type I Error (α) (0.05, 0.01)     Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
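A rough sketch of such a sample-size calculation follows. It assumes a two-sided comparison of two group means with equal group sizes, a common known standard deviation, and the normal approximation n = 2((z_{1-α/2} + z_{1-β})σ/δ)² per group. The function name and the example effect of half a standard deviation are assumptions for illustration; dedicated power software uses the t-distribution and usually returns slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect: float, sigma: float,
                alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate per-group sample size for a two-sided two-sample z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: detect a difference of 0.5 SD with 80% power.
print(n_per_group(effect=0.5, sigma=1.0))
```

Lowering the tolerated β (asking for more power) or shrinking the expected effect both push the required sample size up.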

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
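The link between an interval and a test can be sketched in a few lines. The example assumes a large-sample (normal) interval, estimate ± z × standard error; the function name `normal_ci` and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate: float, std_error: float, confidence: float = 0.95):
    """Large-sample confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical effect estimate and standard error.
low, high = normal_ci(estimate=2.0, std_error=1.5)
contains_zero = low <= 0.0 <= high
print(round(low, 2), round(high, 2), contains_zero)
```

When zero lies inside the 95% interval, the matching two-sided test at α = 0.05 fails to reject H0, and the interval additionally shows which effect sizes remain plausible.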

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
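These selection rules can be summarised as a small lookup function. This is only a restatement of the rules for a continuous outcome; the function name and argument names are assumptions made for the sketch.

```python
def choose_test(n_groups: int, n_independent_vars: int = 1,
                all_categorical: bool = True) -> str:
    """Suggest a test for comparing group means on a continuous outcome."""
    if n_groups < 2 or n_independent_vars < 1:
        raise ValueError("need at least two groups and one independent variable")
    if n_independent_vars == 1:
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable: the choice depends on their type.
    return "ANOVA" if all_categorical else "ANCOVA or multiple linear regression"


print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_groups=3, n_independent_vars=2, all_categorical=False))
```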

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated
Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Chi-square (χ2): tests hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric: two types of statistics requiring analysis of assumptions

Pearson r, t-tests and ANOVA are examples of parametric statistics.
Pearson r measures the relationship or association between 2 variables.

The t-test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference between 3 or more means.

χ2 is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

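$$ t_{0.05,\,20} = 2.086 $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.73\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ |t| > t_{0.05,\,20}: \text{reject } H_0, \text{ accept } H_A $$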
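The hand calculation for the Healthy and Cystic Fibrosis groups can be reproduced from the summary statistics alone. The sketch below is one way to do it in plain Python; the function name `pooled_t` is an assumption, and the critical value 2.086 is the tabled value quoted in the slides rather than something the code computes.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common (pooled) variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    return (mean1 - mean2) / se, df, pooled_var, se


# Summary statistics from the slide: Healthy (n = 9), Cystic Fibrosis (n = 13).
t, df, s2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRIT = 2.086  # tabled two-sided critical value for alpha = 0.05, df = 20
print(round(t, 2), df, "reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Small differences in the last digit compared with the slide come from rounding the intermediate values by hand.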

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select ... (SalePrice)
Subscripts: Select ... (Location)
Click Graphs: Check Boxplots of data, OK
Click Options
Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (West, East); vertical axis Sale Price, from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: Fail to reject H0

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
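The T-Value and DF in this output can be checked from the summary table. The slides do not say which variant of the test Minitab applied; the sketch below assumes the unpooled (Welch) form, because it gives about 44 degrees of freedom rather than the 34 + 26 - 2 = 58 a pooled test would use. The function name `welch_t` is hypothetical.

```python
from math import sqrt


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics from the output above: East, then West.
t, df = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(round(t, 2), int(df))  # df truncated to a whole number
```

Computing the p-value itself needs the t-distribution, which the Python standard library does not provide, so it is left to the statistical package here.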

Paired t-test

Test to deal with two observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate:

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

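Test infection of virus on tobacco leaves: number of dead spots on leaves

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$$ H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0 $$

$$ \bar{d} = 4 $$

$$ s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2/8}{8 - 1} = 18.57, \qquad s_d = \sqrt{18.57} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.632 > t_{0.05}(7) = 2.365, \text{ so reject } H_0 $$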
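The tobacco-leaf calculation can be repeated directly from the eight pairs of counts. This is a minimal sketch using only the Python standard library; the variable names are assumptions, and 2.365 is the tabled critical value from the slides, not a computed quantity.

```python
from math import sqrt
from statistics import mean, stdev

# Counts per plant for the two viruses, as listed in the data table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]  # paired differences d_j
d_bar = mean(d)                              # mean difference
s_d = stdev(d)                               # sample SD of the differences (n - 1)
se = s_d / sqrt(len(d))                      # standard error of the mean difference
t = d_bar / se

T_CRIT = 2.365  # tabled two-sided critical value for alpha = 0.05, df = 7
print(round(t, 2), "reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

The statistic computed without intermediate rounding is slightly smaller than the hand-calculated 2.632, but the decision is the same.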

Advantages
1) It is usually easier to detect a small true difference
2) There is no need to consider whether the variances of the two populations are the same

$$ s_{\bar{d}}^2 \quad \text{vs.} \quad s_{\bar{x}_1 - \bar{x}_2}^2 $$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
Formulation of the null and the alternative hypotheses
Specification of the level of significance
Calculation of the test statistic
Definition of the region of rejection
Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic

3. Use one of three possible decision rules (don't mix them up)

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right test
Typically one is interested in comparing group means.
If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
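The selection rules above can be written down as a small lookup. The function below is a hypothetical helper for a continuous outcome only; it simply restates the slide and is not a substitute for checking each test's assumptions.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, n_groups must be >= 2")
    if n_independent_vars > 1:
        if any_continuous_iv:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, any_continuous_iv=True))
```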

Logic of hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine whether to use a one- or two-tailed test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise two-tailed (most cases)

5. Determine degrees of freedom

- Number of components that are free to vary about a parameter
- Df = sample size - number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats
- T-test: compares the means of two groups
  - interval/ratio level of measurement
  - independent samples t-test
  - dependent or paired samples t-test
- ANOVA (analysis of variance)
  - more than 2 means to compare, or more than 2 tests of means
  - interval/ratio level of measurement
- Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  - nominal/ordinal level of measurement

Summary
- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric are two types of statistics, both requiring analysis of assumptions.
  - Pearson r, t tests and ANOVA are examples of parametric statistics.
  - Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
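For the chi-square statistic mentioned above, a short Python sketch shows how observed counts in categories are compared with the counts expected if the two variables were unrelated. The 2 x 2 table is invented for illustration; 3.841 is the standard tabled 5% critical value for 1 degree of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = group, columns = outcome category.
counts = [[30, 10],
          [20, 20]]
chi2, df = chi_square_statistic(counts)
CRITICAL_5_PERCENT_DF1 = 3.841  # tabled value for df = 1
print(chi2, df, chi2 > CRITICAL_5_PERCENT_DF1)
```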

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test


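We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$

$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)5.9^2 + (13 - 1)6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73$

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$

$t_{0.05,\,20} = 2.086$

$|t| > t_{0.05,\,20}$: reject H0, accept HA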
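The pooled-variance calculation for the Healthy versus Cystic Fibrosis comparison can be checked with a short Python sketch. The summary statistics (n = 9 and 13, means 18.9 and 11.9, standard deviations 5.9 and 6.3) and the tabled critical value 2.086 are read from the slide; the function name is an assumption made for illustration.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance (from summary statistics)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_error
    return t, df, pooled_var, std_error


t, df, pooled_var, std_error = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRITICAL = 2.086  # tabled two-sided 5% value for df = 20, as quoted on the slide

print(f"s^2 = {pooled_var:.2f}, SE = {std_error:.2f}, t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```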

Two-sample t-test
MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: select SalePrice
Subscripts: select Location
Click Graphs, check Boxplots of data, OK
Click Options
Confidence level: enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); SalePrice axis from 100000 to 1000000]

Two-sample t-test
Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
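The T-Value and DF in this output can be reproduced from the summary statistics alone. The sketch below assumes the unpooled (Welch) form of the test, which is what DF = 44 rather than 34 + 26 - 2 = 58 suggests; the function name is hypothetical.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    std_error = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / std_error
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, std_error


# East vs West summary statistics from the output above.
t, df, std_error = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t:.2f}, df = {int(df)}, SE of difference = {std_error:.0f}")
```

Truncating the approximate degrees of freedom to an integer gives the DF shown in the output.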

Paired t-test
A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X_11, X_12, ..., X_1n
Sample 2: X_21, X_22, ..., X_2n

Method:
Calculate the difference between the two measurements for each individual: d_j = X_1j - X_2j

Calculate

$\bar{d} = \frac{\sum d_j}{n}, \quad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \quad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

H0: μ_d = 0    HA: μ_d ≠ 0

Paired t-test
Test of infection by two viruses on tobacco leaves: number of dead spots on leaves

Plant    Virus 1 (X_1j)    Virus 2 (X_2j)    d_j
1        9                 10                -1
2        17                11                6
3        31                18                13
4        18                14                4
5        7                 6                 1
6        8                 7                 1
7        20                17                3
8        10                5                 5
Total    120               88                32
Mean     15                11                4

H0: μ_d = 0    HA: μ_d ≠ 0

$\bar{d} = 4$

$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$

$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{4}{1.52} = 2.632 > t_{0.05}$

$t_{0.05}(7) = 2.365$
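The tobacco leaf example can be recomputed in a few lines of Python. The counts are those of the eight plants in the table; the critical value 2.365 for 7 degrees of freedom is the tabled value quoted on the slide. Carrying full precision gives t of about 2.63, matching the hand calculation up to rounding.

```python
import math
from statistics import mean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# One difference per plant: d_j = X_1j - X_2j
differences = [a - b for a, b in zip(virus1, virus2)]

d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation (n - 1 denominator)
std_error = s_d / math.sqrt(len(differences))
t = d_bar / std_error

T_CRITICAL = 2.365  # tabled two-sided 5% value for df = 7
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {std_error:.2f}, t = {t:.2f}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```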

Paired t-test

Advantages
1) Usually it is easier to find a true small difference
2) No need to consider whether the variances of the two populations are the same or not

(compare $s_{\bar{d}}^2$ with $s_{\bar{x}_1 - \bar{x}_2}^2$)

Hypothesis Testing
Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing
3 steps to carry out a hypothesis test
1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up):
   1. Table method: compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing
3 steps to carry out a hypothesis test
3. Use one of three possible decision rules (don't mix them up):
   2. Graph method: compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing
3 steps to carry out a hypothesis test
3. Use one of three possible decision rules (don't mix them up):
   3. p-value method: reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
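The table method and the p-value method are two views of the same decision. The sketch below illustrates this for a two-sided z test (a normal test statistic is assumed here instead of t so that only the Python standard library is needed); the function names and the example statistics are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def reject_by_table(z, alpha=0.05):
    """Table method: compare |z| with the two-sided critical value."""
    critical = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z) >= critical


def two_sided_p_value(z):
    """Probability of a statistic at least as extreme as z when H0 is true."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def reject_by_p_value(z, alpha=0.05):
    """p-value method: reject when the p-value falls below alpha."""
    return two_sided_p_value(z) < alpha


for z in (0.5, 1.5, 2.5, -3.0):
    print(z, round(two_sided_p_value(z), 4), reject_by_table(z), reject_by_p_value(z))
```

Both rules give the same answer for every statistic, so it is enough to pick one and apply it consistently.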

Hypothesis Testing
An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. Before hearing a case, the judge forms the following hypotheses:
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

Hypothesis Testing
To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors: Alpha and Beta

                                NULL HYPOTHESIS
                                TRUE                                  FALSE
Reject the null hypothesis      Type I error (α):                     CORRECT
                                rejecting a true null hypothesis
Fail to reject the null         CORRECT                               Type II error (β):
hypothesis                                                            failing to reject a false null hypothesis

Type I and Type II Errors
Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

Type I and Type II Errors
In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10000

Controlling Type I and Type II Errors
- α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.
- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
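These trade-offs can be made concrete with a small calculation. The sketch below assumes a one-sided, one-sample z test of H0: μ = 0 against a true mean of delta with known standard deviation sigma; the function name and the numbers are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist


def type_ii_error(alpha, n, delta=0.5, sigma=1.0):
    """Beta for a one-sided z test when the true effect is delta."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)
    # Probability that the test statistic stays below the critical value
    # even though the true mean is delta.
    return z.cdf(critical - delta * n ** 0.5 / sigma)


print("fixed alpha, larger n :", type_ii_error(0.05, 25), type_ii_error(0.05, 50))
print("fixed n, smaller alpha:", type_ii_error(0.05, 25), type_ii_error(0.01, 25))
```

With α held fixed, the larger sample gives the smaller β; with n held fixed, the stricter α gives the larger β.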

Planning a study
Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.
- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies
- Probability samples result when subjects have a known probability of entering the sample:
  - Simple random sampling
  - Stratified sampling
  - Cluster sampling
- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
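The three probability sampling strategies can be sketched with Python's random module. The population, strata and clusters below are invented for illustration, and a fixed seed is used only so that the example is repeatable.

```python
import random


def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], k_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(0)
# Hypothetical population: (subject id, clinic), 60 subjects in clinic A and 40 in B.
population = [(i, "A" if i < 60 else "B") for i in range(100)]
clusters = {"ward1": [1, 2, 3], "ward2": [4, 5], "ward3": [6, 7, 8, 9]}

print(simple_random_sample(population, 5, rng))
print(stratified_sample(population, lambda unit: unit[1], 3, rng))
print(cluster_sample(clusters, 2, rng))
```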

Planning a study
- Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
- We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.
- At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.
- All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
  - Chance
  - Real effect of treatment X
- The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
- All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
- If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics
- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing
- The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
- In this case, you are trying to disprove that treatment X has no effect on outcome Y.
- You start out with two hypotheses:
  - Null Hypothesis (H0): Treatment X has no effect on outcome Y
  - Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing
- If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
- Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value
- The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?
- This is only meaningful if chance is the only competing plausible explanation.
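One way to see what "chance alone" means is an exact permutation test: if treatment X had no effect, the group labels would be arbitrary, so every way of splitting the pooled outcomes into two groups would be equally likely. The sketch below uses a swapped-in technique for illustration (the slides do not describe it) and invented outcome values.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Share of relabelings whose mean difference is at least as large as observed."""
    pooled = treated + control
    observed = abs(mean(treated) - mean(control))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical outcome Y for 4 treated and 4 control subjects.
treated = [12, 15, 14, 16]
control = [10, 9, 11, 13]
print(permutation_p_value(treated, control))
```

A small value says that relabelings by chance rarely produce a difference as large as the one observed, which is exactly the reasoning behind rejecting H0.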


Page 64: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II Errors (Alpha and Beta)

                                     NULL HYPOTHESIS
                                     TRUE                      FALSE
Reject the null hypothesis           Type I error (α):         CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                   Type II error (β):
                                                               failing to reject a
                                                               false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the two types of error unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chance of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
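The "chance of getting it wrong" can be checked by simulation. The sketch below is an illustration, not part of the original slides: it assumes a two-sided z-test on normally distributed data with a known standard deviation of 1, and the function name `simulate_type_i_rate` and its default values are made up for the example. It repeatedly draws samples for which the null hypothesis is true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Fraction of experiments with a true null hypothesis that get rejected."""
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)  # z statistic with known sd = 1
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: rejecting a true null
    return rejections / trials


rate = simulate_type_i_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

With α = 0.05 the observed rate should land close to 1 in 20, up to simulation noise.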

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
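The trade-off between α, β and n can be made concrete with a small calculation. This is a minimal sketch under assumptions that are not in the slides: a two-sided one-sample z-test with known standard deviation, and hypothetical function names `z_test_power` and `type_ii_error`.

```python
from statistics import NormalDist


def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sd  # true effect in standard-error units
    # Probability that the z statistic falls in either rejection tail.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


def type_ii_error(effect, sd, n, alpha=0.05):
    """beta = 1 - power."""
    return 1 - z_test_power(effect, sd, n, alpha)


for n in (10, 20, 40, 80):
    print(n, round(type_ii_error(0.5, 1.0, n), 3))
```

Printing β for growing n at fixed α shows it shrinking, and rerunning with a smaller α at fixed n shows it growing, which is the pattern described on the slide.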

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This probability is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

               Reality
Decision       H0 is true                        H0 is false
Retain H0      Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0      Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
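A sample-size calculation of this kind can be sketched in a few lines. The formula used here, n per group = 2(z_{1−α/2} + z_{1−β})²σ²/δ², is the standard normal-approximation formula for comparing two group means; it is an assumption of this example rather than something stated on the slides, and the function name `sample_size_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, beta=0.20):
    """Approximate n per group for a two-sided comparison of two means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls the Type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # controls the Type II error
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return ceil(n)  # round up to a whole subject


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2.
n_needed = sample_size_per_group(effect=0.5, sd=1.0)
print(n_needed)
```

Because the normal approximation ignores the extra uncertainty from estimating the standard deviation, a t-based calculation would give a slightly larger n.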

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
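The link between a confidence interval and a hypothesis test can be illustrated as follows. This is a sketch with made-up numbers (an estimated effect of 4.0 with a standard error of 1.5) and it assumes a normal approximation; with small samples a t critical value would be used instead. The function name `normal_ci` is hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate, std_error, confidence=0.95):
    """Two-sided confidence interval using a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical effect estimate and standard error, for illustration only.
lower, upper = normal_ci(estimate=4.0, std_error=1.5)
zero_is_plausible = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print("effect could be zero" if zero_is_plausible else "zero is outside the interval")
```

If zero lies outside the 95% interval, the matching two-sided test at α = 0.05 rejects the null hypothesis; the interval additionally shows the range of effect sizes compatible with the data.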

Choosing the right test

Typically one is interested in comparing group means.
If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
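The selection rules on these two slides can be written down as a small lookup. The function `choose_test` and its parameter names are hypothetical, and the sketch covers only the continuous-outcome cases listed above.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the slide's rules."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs at least two groups")
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"  # all independent variables are categorical
    raise ValueError("need at least one independent variable")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, any_continuous_predictor=True))
```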

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.
Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.
Df = sample size − number of parameters estimated
Df is n − 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

Common Inferential Stats

ANOVA (analysis of variance)
more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.
Inferential stats determine how likely it is that results based on a sample are the same in the population.
You must know the level of measurement of the variables to choose the correct test.
Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
Pearson r, t-tests and ANOVA are examples of parametric statistics.
Pearson r measures the relationship or association between 2 variables.
The t-test determines whether there is a significant difference between 2 group means.
ANOVA determines whether there is a significant difference between 3 or more means.
χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

$$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \neq \mu_2$$

Group              N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

Two-sample t-test

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.73$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,\,20} = 2.086$$

$$|t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A$$
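The hand calculation for the healthy vs. cystic fibrosis comparison can be reproduced from the summary statistics alone. This is a minimal sketch of the pooled-variance formulas on the slide; the function name `pooled_t` is hypothetical, and the critical value 2.086 is taken from the slide's t table rather than computed.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means.
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff


t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
T_CRITICAL = 2.086  # tabled two-sided value for alpha = 0.05, df = 20

print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```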

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose "Samples in one column"
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs, check "Boxplots of data", OK
Click Options
Confidence level: Enter 90, OK, OK

H0: μ1 = μ2, or μ1 − μ2 = 0
HA: μ1 ≠ μ2, or μ1 − μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) − μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26,441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126,602 < μ (East) − μ (West) < $73,719).

p-Value = 0.660 > α = 0.05, 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
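The T-value and DF in this output can be checked from the summary table. The reported DF of 44 (rather than 34 + 26 − 2 = 58) suggests that the unpooled (Welch) form of the test was used, so the sketch below assumes that; the function name `welch_t` is hypothetical. Because the table shows rounded means, the difference computed here can differ from the printed estimate by about a dollar.

```python
from math import sqrt


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se_diff = sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


t_stat, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t_stat:.2f}, df = {int(df)} (truncated), SE = {se_diff:.0f}")
```

Truncating the approximate degrees of freedom to an integer reproduces the DF shown in the output.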

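Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

Paired t-test

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$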

Paired t-test

Test of virus infection on tobacco leaves: number of dead spots on leaves.

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$$

$$\bar{d} = 4$$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365, \qquad P < 0.05$$
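The tobacco-leaf example can be recomputed directly from the eight pairs of counts. This is a short sketch of the paired formulas using only the Python standard library; the variable names are illustrative, and the critical value 2.365 is the tabled value quoted on the slide.

```python
from math import sqrt
from statistics import fmean, stdev

virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Per-plant differences d_j = X_1j - X_2j.
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)

d_bar = fmean(diffs)   # mean difference
s_d = stdev(diffs)     # sample standard deviation (n - 1 in the denominator)
se_d = s_d / sqrt(n)   # standard error of the mean difference
t_stat = d_bar / se_d

T_CRITICAL = 2.365  # tabled two-sided value for alpha = 0.05, df = 7

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t_stat:.2f}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```

Carrying full precision gives a t value a little below the slide's 4 / 1.52, but both round to 2.63 and lead to the same decision.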

Paired t-test

Advantages
1) It is usually easier to detect a true small difference: when the paired measurements are positively correlated, $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

1. Table method: compare the calculated value with a table of the critical values of the test statistic.

If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

2. Graph method: compare the calculated value with the t-distribution graph of the test statistic.

Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

[Figure: rejection region at the α = 10% significance level]

Hypothesis Testing

[Figure: rejection region at the α = 5% significance level]

Hypothesis Testing

[Figure: rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

3. p-value method: reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
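The table method and the p-value method can each be written as a one-line rule. The two functions below are a hypothetical sketch of those rules, applied to numbers that appear earlier in these slides: the pooled t-test (t = 2.63 against a tabled 2.086) and the Minitab output (p = 0.660).

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide_by_table(2.63, 2.086))           # healthy vs. cystic fibrosis
print(decide_by_p_value(0.660, alpha=0.05))   # east vs. west sale prices
print(decide_by_p_value(0.660, alpha=0.10))
```

Both rules answer the same question for a given test, which is why the slides warn against mixing them: pick one rule and apply it consistently.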


Two types of ANOVATwo types of ANOVA

Example Researchers are interested in exam anxiety They administer an anxiety inventory on the second day of class Then they give it again on the day of the midterm To compare the two sets of scores they use either ANOVA with replication or t-test for paired samples

(hand calculation)

Page 66: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group             N     Mean    Std Dev
Healthy           9     18.9    5.9
Cystic Fibrosis   13    11.9    6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.74 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

Under H0, μ1 - μ2 = 0, so

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ t_{0.05,\,20} = 2.086, \qquad |t| > t_{0.05,\,20} \;\Rightarrow\; \text{reject } H_0,\ \text{accept } H_A $$
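The hand calculation for the Healthy vs. Cystic Fibrosis example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pooled_t_from_summary` is made up for illustration, and the critical value 2.086 is copied from the slide (t table, two-tailed α = 0.05, df = 20) rather than computed.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Standard error of the difference between the two sample means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Test statistic under H0: mu1 - mu2 = 0
    t_stat = (mean1 - mean2) / se_diff
    df = n1 + n2 - 2
    return pooled_var, se_diff, t_stat, df

# Summary statistics from the slide: Healthy (n=9) vs. Cystic Fibrosis (n=13)
pooled_var, se_diff, t_stat, df = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # taken from the slide, not computed here
reject_h0 = abs(t_stat) > T_CRIT

print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Working from summary statistics keeps the sketch aligned with the slide, which reports only N, mean and standard deviation for each group.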

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose "Samples in one column"
Samples: Select SalePrice
Subscripts: Select Location
Click Graphs, check "Boxplots of data", OK
Click Options
Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, from 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N     Mean      StDev     SE Mean
East       34    312702    188163    32270
West       26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05, 0.10: Fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
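The Minitab output reports DF = 44 rather than n1 + n2 - 2 = 58, which appears consistent with a test that does not assume equal variances (Welch's t-test with Welch-Satterthwaite degrees of freedom). The sketch below recomputes the T-value and degrees of freedom from the reported summary statistics under that assumption; the function name is hypothetical, and truncating the degrees of freedom to an integer is an assumption made to match the output.

```python
import math

def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic from summary statistics."""
    v1 = sd1**2 / n1  # squared standard error of mean 1
    v2 = sd2**2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se_diff, t_stat, df

# East vs. West summary statistics as printed in the Minitab output
se_diff, t_stat, df = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)

print(f"SE of difference = {se_diff:.0f}")
print(f"T-Value = {t_stat:.2f}, DF = {math.floor(df)}")
```

The p-value and confidence interval are not recomputed here because they need the t distribution, which the Python standard library does not provide.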

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Example: test of virus infection on tobacco leaves (number of dead spots on leaves)

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$$ \bar{d} = 4, \qquad s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = \sqrt{18.57} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52, \qquad t = \frac{4}{1.52} = 2.63 $$

$$ t_{0.05(7)} = 2.365, \qquad t > t_{0.05(7)} \;\Rightarrow\; \text{reject } H_0 $$

Advantages

1) It is usually easier to detect a true small difference, because pairing typically (when the paired measurements are positively correlated) makes the standard error smaller:

$$ s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2 $$

2) There is no need to consider whether the variances of the two populations are the same or not.
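The paired t-test on the tobacco-leaf data can be reproduced with the Python standard library. This is a small sketch, not the slide author's procedure; the critical value 2.365 is copied from the slide (t table, two-tailed α = 0.05, df = 7) rather than computed.

```python
import math
import statistics

# Number of dead spots per plant for each virus, as listed on the slide
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Step 1: difference for each plant, d_j = X1j - X2j
diffs = [a - b for a, b in zip(virus1, virus2)]
n = len(diffs)

# Step 2: mean, standard deviation, and standard error of the differences
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)   # uses the n - 1 denominator
se_d = sd_d / math.sqrt(n)

# Step 3: test statistic with n - 1 degrees of freedom
t_stat = mean_d / se_d
T_CRIT = 2.365  # taken from the slide, not computed here

print(f"mean d = {mean_d}, s_d = {sd_d:.2f}, SE = {se_d:.2f}, t = {t_stat:.2f}")
print("Reject H0" if abs(t_stat) > T_CRIT else "Fail to reject H0")
```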

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

Table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not significantly strong enough to prove the defendant guilty
H1: The evidence is significantly strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                       FALSE
Reject the null hypothesis            Type I error (α):          CORRECT
                                      rejecting a true
                                      null hypothesis
Fail to reject the null hypothesis    CORRECT                    Type II error (β):
                                                                 failing to reject a
                                                                 false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% Chance      1 in 5
5% Chance       1 in 20
1% Chance       1 in 100
0.01% Chance    1 in 10000
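The idea that α is the long-run "chance of getting it wrong" when the null hypothesis is actually true can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: it repeatedly draws samples from a population where H0 is true (mean 0, known standard deviation 1), applies a two-tailed z-test, and counts how often H0 is wrongly rejected. The sample size, number of simulations and seed are arbitrary choices.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, n_sims=4000, seed=0):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    # Two-tailed critical value of the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # H0 is true here: the population mean really is 0 (sigma = 1, known)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # z = (mean - 0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims

for alpha in (0.20, 0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.1%} of simulated studies")
```

The observed rejection rates should land close to the chosen α values, which is what the table expresses as "1 in 5", "1 in 20" and "1 in 100".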

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
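The relationships among α, β and n can be checked numerically. The sketch below assumes a simple setting that the slides do not specify: a one-sided z-test of H0: μ = 0 against a true mean of `effect`, with known standard deviation `sigma`. Under those assumptions, β = Φ(z_{1-α} - effect·√n / σ), where Φ is the standard normal CDF. The function name and default values are illustrative only.

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability for a one-sided z-test (known sigma)."""
    std_normal = NormalDist()
    # Critical value: reject H0 when z exceeds this
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How far the true mean shifts the z statistic away from zero
    shift = effect * n ** 0.5 / sigma
    # beta = P(fail to reject | true mean = effect)
    return std_normal.cdf(z_crit - shift)

print("Fixed alpha = 0.05, increasing n:")
for n in (10, 25, 50):
    print(f"  n = {n:3d}: beta = {beta_one_sided_z(0.05, n):.3f}")

print("Fixed n = 25, changing alpha:")
for alpha in (0.10, 0.05, 0.01):
    print(f"  alpha = {alpha:.2f}: beta = {beta_one_sided_z(alpha, 25):.3f}")
```

With α fixed, β falls as n grows; with n fixed, lowering α raises β, matching the statements above.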

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:

Chance
Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                  Reality ->
Decision          H0 is true                  H0 is false
Retain H0         Correct Decision            Type II Error (β)
                                              (0.2, 0.1)
Reject H0         Type I Error (α)            Correct Decision
                  (0.05, 0.01)

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
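A sample-size calculation of this kind can be sketched with the common normal-approximation formula for comparing two group means, n per group = 2·((z_{1-α/2} + z_{1-β})·σ / δ)², where δ is the expected difference and σ the common standard deviation. The formula choice, the function name and the example effect size are assumptions for illustration; the slides do not say which method is used, and exact t-based calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample comparison."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = std_normal.inv_cdf(1 - beta)        # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)  # round up to a whole subject

# Hypothetical planning values: expected difference of half a standard deviation
print(n_per_group(effect=0.5, sigma=1.0, beta=0.20))  # power 0.80
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))  # power 0.90
```

Lowering the acceptable β (asking for more power) raises the required sample size, which is the trade-off power analysis makes explicit.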

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
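To make the link between a confidence interval and a hypothesis test concrete, the sketch below builds a 95% confidence interval for the difference in means using the Healthy vs. Cystic Fibrosis summary statistics from the two-sample t-test slides. It assumes the pooled-variance method used there, and the critical value 2.086 (t table, df = 20) is copied from those slides rather than computed.

```python
import math

# Summary statistics from the two-sample t-test example
n1, mean1, sd1 = 9, 18.9, 5.9    # Healthy
n2, mean2, sd2 = 13, 11.9, 6.3   # Cystic Fibrosis
T_CRIT_95 = 2.086                # assumed: t table value for df = 20, two-tailed 5%

# Pooled variance and standard error of the difference in means
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Confidence interval: estimate plus or minus critical value times standard error
diff = mean1 - mean2
lower = diff - T_CRIT_95 * se_diff
upper = diff + T_CRIT_95 * se_diff
contains_zero = lower <= 0 <= upper

print(f"Difference = {diff:.1f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("Zero is inside the interval" if contains_zero else "Zero is outside the interval")
```

Because zero lies outside the interval, the interval agrees with the test's decision to reject H0, and it also shows the range of plausible effect sizes.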

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:

ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
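The decision rules above for a continuous outcome can be written as a small lookup function. This is only a sketch of the slide's rules; the function name and parameters are made up for illustration, and it does not cover paired designs or categorical outcomes.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a CONTINUOUS outcome, following the rules above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        # One independent variable: the number of groups decides
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable: their type decides
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=True))
print(choose_test(3, all_categorical=False))
```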

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-Test: compares the means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Chi-square (χ2): testing hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

Must know the level of measurement of variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

Pearson r, t tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

T test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ2 is a non-parametric statistic to assess the relationship between 2 categorical variables.
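The slides give worked examples for the t-tests but not for chi-square, so here is a minimal sketch of how the χ2 statistic for two categorical variables is computed. The counts are hypothetical and chosen only for illustration; the critical value 3.841 is the standard table value for α = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = treatment groups, columns = improved / not improved
observed = [[20, 30],
            [30, 20]]
chi2, df = chi_square_statistic(observed)

CHI2_CRIT_05_DF1 = 3.841  # table value for alpha = 0.05, df = 1
print(f"chi-square = {chi2:.2f}, df = {df}")
print("Reject H0" if chi2 > CHI2_CRIT_05_DF1 else "Fail to reject H0")
```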

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 67: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. That risk is too large to count as sufficient evidence, so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           Chance 1 in 5
5%            Chance 1 in 20
1%            Chance 1 in 100
0.01%         Chance 1 in 10000
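The "1 in N" reading of α can be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes a two-sided z-test on normally distributed data with known standard deviation 1, and the names `one_in` and `simulated_type_i_rate` are illustrative. It repeatedly draws samples for which the null hypothesis is actually true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist


def one_in(alpha):
    """Express a Type I error probability as '1 in N' (N rounded)."""
    return round(1 / alpha)


def simulated_type_i_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Fraction of true-null samples that a two-sided z-test rejects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is 0 (sigma = 1)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # every rejection is a Type I error
    return rejections / trials


if __name__ == "__main__":
    for a in (0.20, 0.05, 0.01, 0.0001):
        print(f"alpha = {a}: 1 in {one_in(a)}")
    print("simulated rate at alpha = 0.05:", simulated_type_i_rate())
```

The simulated rejection rate should land close to the chosen α, which is what "chances of getting it wrong" means in practice.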

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
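These relationships can be seen numerically. The sketch below is an illustration under stated assumptions, not the slides' own method: it uses an upper-tailed z-test of H0: μ = 0 with known σ, and the function name `beta_upper_tailed_z` and the effect size 0.5 are hypothetical choices.

```python
from statistics import NormalDist


def beta_upper_tailed_z(alpha, n, effect, sigma=1.0):
    """P(Type II error) for an upper-tailed z-test of H0: mu = 0
    when the true mean is `effect` (> 0) and sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centered at effect*sqrt(n)/sigma
    shift = effect * n ** 0.5 / sigma
    return std_normal.cdf(z_crit - shift)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.01):
        for n in (10, 25, 50):
            b = beta_upper_tailed_z(alpha, n, effect=0.5)
            print(f"alpha={alpha:<5} n={n:<3} beta={b:.3f}")
```

Reading across the printed rows, β falls as n grows for a fixed α, and rises as α is made smaller for a fixed n.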

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y
Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X, and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision \ Reality →    H0 is true                          H0 is false
Retain H0               Correct decision                    Type II error (β) (0.2, 0.1)
Reject H0               Type I error (α) (0.05, 0.01)       Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
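As an illustration of such a calculation, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². This formula and the function name `n_per_group` are assumptions brought in for the example; the slides do not say which method they have in mind, and exact t-based software will give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(1 - beta)        # quantile for power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


if __name__ == "__main__":
    # Expected effect of half a standard deviation, alpha = 0.05, beta = 0.20
    print(n_per_group(effect=0.5, sigma=1.0))
    # Asking for a smaller beta (more power) requires more subjects
    print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))
```

Halving the expected effect roughly quadruples the required sample size, which is why the expected effect has to be specified before the study starts.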

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
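The link between an interval and a test can be sketched in a few lines. This is an illustration only: it assumes a large-sample (normal) interval, estimate ± z × standard error, and the estimates and standard errors used are made-up numbers, not data from these slides.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Large-sample confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error


def contains_zero(interval):
    """True if an effect of zero is among the plausible values."""
    low, high = interval
    return low <= 0.0 <= high


if __name__ == "__main__":
    # Hypothetical effects with the same standard error
    for effect in (2.0, 5.0):
        ci = confidence_interval(effect, std_error=1.5)
        verdict = "zero is plausible" if contains_zero(ci) else "zero is excluded"
        print(f"effect={effect}: 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}) -> {verdict}")
```

When zero lies inside the 95% interval, the corresponding two-sided test at α = 0.05 would not reject the null hypothesis; the interval additionally shows how large the effect could plausibly be.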

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.
Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.
Df = sample size - number of parameters estimated
Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ2): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics, chosen after checking their assumptions.
- Pearson r, t-tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ2 is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups, i.e., matched on everything but the treatment (the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

Group              N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

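$$t_{0.05,\,20} = 2.086$$

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.74$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$|t| > t_{0.05,\,20} \Rightarrow \text{reject } H_0, \text{ accept } H_A$$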
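The hand calculation can be reproduced from the summary statistics alone. The following is a minimal sketch; the function name `pooled_t_from_summary` is illustrative, and the critical value 2.086 is the tabled value quoted in the slides rather than something the code computes.

```python
from math import sqrt


def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, pooled_var, se_diff


if __name__ == "__main__":
    # Healthy vs. cystic fibrosis groups from the table in the slides
    t, df, pooled_var, se_diff = pooled_t_from_summary(9, 18.9, 5.9, 13, 11.9, 6.3)
    t_crit = 2.086  # tabled two-sided critical value for df = 20, alpha = 0.05
    print(f"s^2 = {pooled_var:.2f}, SE = {se_diff:.2f}, t = {t:.2f}, df = {df}")
    print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```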

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t
Choose: Samples in one column
Samples: Select … (SalePrice)
Subscripts: Select … (Location)
Click Graphs; check Boxplots of data; OK
Click Options
Confidence level: Enter 90; OK; OK

H0: μ1 = μ2, or μ1 - μ2 = 0
HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location    N     Mean      StDev     SE Mean
East        34    312702    188163    32270
West        26    339143    255571    50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, P-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-value = 0.660 > α = 0.05 (or 0.10): fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
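The T-Value and DF in this output can be reproduced from the summary rows. The sketch below assumes the unequal-variance (Welch) form of the two-sample t-test, which is consistent with the reported DF = 44 but is an inference from the numbers, not something the slides state; the function name `welch_t_from_summary` is illustrative. The p-value is not recomputed because the Python standard library has no t-distribution.

```python
from math import sqrt


def welch_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance t statistic and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se_diff = sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # East vs. West rows from the output table
    t, df = welch_t_from_summary(34, 312702, 188163, 26, 339143, 255571)
    # The reported DF is the computed value truncated to an integer
    print(f"t = {t:.2f}, df = {int(df)}")
```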

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

H0: μd = 0    HA: μd ≠ 0

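Example: test of virus infection on tobacco leaves (number of dead spots on leaves).

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

H0: μd = 0    HA: μd ≠ 0

$$\bar{d} = 4$$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$$

Since t exceeds the critical value, H0 is rejected.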
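The paired calculation can be checked directly from the plant data. This is a minimal sketch using only the standard library; the function name `paired_t` is illustrative, and the critical value 2.365 is the tabled value quoted in the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Dead-spot counts for the eight plants in the table
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]


def paired_t(x1, x2):
    """Paired t statistic computed from the per-individual differences."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)          # sample standard deviation (divides by n - 1)
    se = s_d / sqrt(n)      # standard error of the mean difference
    return d_bar / se, n - 1, d_bar, s_d


if __name__ == "__main__":
    t, df, d_bar, s_d = paired_t(virus1, virus2)
    t_crit = 2.365  # tabled two-sided critical value for df = 7, alpha = 0.05
    print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t = {t:.2f}, df = {df}")
    print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```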

Advantages of the paired t-test

1) Usually it is easier to detect a true small difference, because typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same or not.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.
2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.
3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]
[Figure: rejection region at the α = 5% significance level]
[Figure: rejection region at the α = 10% significance level]

Decision rule 3, p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
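The table method and the p-value method can each be written as a one-line rule. The sketch below is illustrative only; the function names are hypothetical, and the example inputs are values quoted elsewhere in these slides (t = 2.63 against a critical value of 2.086, and p = 0.660 against α = 0.05).

```python
def table_method(t_stat, t_crit):
    """Reject H0 when |test statistic| reaches the tabled critical value."""
    return "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"


def p_value_method(p_value, alpha=0.05):
    """Reject H0 when the p-value is below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


if __name__ == "__main__":
    print(table_method(2.63, 2.086))   # cystic fibrosis example
    print(p_value_method(0.660))       # sale price example
```

For the same test and the same α, the two rules lead to the same decision, which is why the slides warn against mixing them within one analysis.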

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, by contrast, is to defend himself by refuting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and decide on the validity of the hypothesis.



Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

$$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \ne \mu_2$$

Group             N    Mean   Std Dev
Healthy           9    18.9   5.9
Cystic Fibrosis   13   11.9   6.3

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2)$$

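$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)\,5.9^2 + (13 - 1)\,6.3^2}{9 + 13 - 2} = \frac{754.76}{20} = 37.73$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.73\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63$$

$$t_{0.05,(20)} = 2.086, \qquad |t| > t_{0.05,(20)} \;\Rightarrow\; \text{reject } H_0, \text{ accept } H_A$$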
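The hand calculation above can be checked with a few lines of Python. This is only a minimal sketch working from the summary statistics in the table (N, mean, standard deviation); the helper name `pooled_t` is an illustrative choice, and the critical value 2.086 is the tabled value quoted on the slide rather than something the code computes.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance (illustrative helper)."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    s2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two sample means
    se = math.sqrt(s2 * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, s2, se

# Summary statistics from the slide: Healthy vs. Cystic Fibrosis
t, df, s2, se = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT = 2.086  # tabled two-tailed t at alpha = 0.05, df = 20 (taken from the slide)
reject = abs(t) > T_CRIT
print(f"s^2 = {s2:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {df}, reject H0: {reject}")
```

Small differences in the last digit relative to the slide can arise because the code does not round intermediate values.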

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select … (SalePrice)

Subscripts: Select … (Location)

Click Graphs, check Boxplots of data, OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2, or μ1 - μ2 = 0

HA: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (West, East); vertical axis SalePrice, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval -$126602 < μ (East) - μ (West) < $73719.

p-Value = 0.660 > α = 0.05 (or 0.10), so we fail to reject H0.

We do not have sufficient evidence in the sample to conclude that mean sale prices differ significantly between the east and west markets.
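The reported DF = 44 is smaller than n1 + n2 - 2 = 58, which suggests (this is an inference from the output, not something the slide states) that the test was run without assuming equal variances. A minimal sketch of that unpooled (Welch) calculation from the summary statistics, with an illustrative helper name:

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1 = sd1**2 / n1  # squared standard error of the first mean
    v2 = sd2**2 / n2  # squared standard error of the second mean
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Approximate degrees of freedom; software typically truncates to an integer
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

# East vs. West summary statistics from the output above
t, df, se = welch_t(34, 312702, 188163, 26, 339143, 255571)
print(f"t = {t:.2f}, df = {math.floor(df)}, SE of difference = {se:.0f}")
```

The t value and truncated degrees of freedom computed this way agree with the T-Value and DF in the output.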

Paired t-test

A test for two sets of observations with strong comparability (e.g., two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X_{11}, X_{12}, …, X_{1n}
Sample 2: X_{21}, X_{22}, …, X_{2n}

Method

Calculate the difference between the two measurements for each individual:

$$d_j = X_{1j} - X_{2j}$$

Calculate

$$\bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$$

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$$

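Example: test of virus infection on tobacco leaves (number of dead spots on leaves)

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$$

$$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57, \qquad s_d = 4.31$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05,(7)} = 2.365$$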
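The paired calculation can be reproduced directly from the eight pairs of counts in the table. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical value 2.365 is the tabled value quoted on the slide.

```python
import math
import statistics

# Counts per plant, taken from the table above
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Step 1: difference for each plant
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)

# Step 2: mean difference, its sample SD (divisor n - 1), and its standard error
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)
se_d = s_d / math.sqrt(n)

# Step 3: test statistic with n - 1 degrees of freedom
t = d_bar / se_d

T_CRIT = 2.365  # tabled two-tailed t at alpha = 0.05, df = 7 (taken from the slide)
reject = abs(t) > T_CRIT
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t:.2f}, reject H0: {reject}")
```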

Advantages

1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

$$s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
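The table method and the p-value method are two views of the same comparison, so they should lead to the same decision. A minimal sketch, assuming a two-tailed z statistic (standard normal reference distribution) instead of t so that only the Python standard library is needed; the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def table_method(z, alpha=0.05):
    """Reject H0 when |z| reaches the two-tailed critical value."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z) >= z_crit

def p_value_method(z, alpha=0.05):
    """Reject H0 when the two-tailed p-value is below alpha."""
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return p < alpha

# Both rules give the same reject / fail-to-reject decision for each statistic
for z in (0.8, 1.7, 2.5):
    print(z, table_method(z), p_value_method(z))
```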

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                               NULL HYPOTHESIS
                               TRUE                     FALSE
Reject the null hypothesis     Type I error (α):        CORRECT
                               rejecting a true
                               null hypothesis
Fail to reject the null        CORRECT                  Type II error (β):
hypothesis                                              failing to reject a
                                                        false null hypothesis

Two of the four possible outcomes are correct. If the truth is that the person is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the person is guilty and we conclude guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds them innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          Chance 1 in 5
5%           Chance 1 in 20
1%           Chance 1 in 100
0.01%        Chance 1 in 10000
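The percentages in the table can be read as long-run frequencies: if the null hypothesis is true and we test at level α, we wrongly reject it in about a fraction α of studies. A minimal simulation sketch of that idea, assuming normally distributed data with known σ = 1 and a two-tailed z-test; the function name, sample size, trial count, and seed are arbitrary illustrative choices.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=4000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: data come from N(0, 1) and we test mu = 0
        sample = [rng.gauss(0, 1) for _ in range(n)]
        # Known sigma = 1, so z = mean / (1 / sqrt(n)) = sum / sqrt(n)
        z = sum(sample) / math.sqrt(n)
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: observed Type I error rate = {type_i_error_rate(alpha):.3f}")
```

The observed rates should land close to the chosen α values, up to simulation noise.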

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
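These trade-offs can be made concrete for one simple case. The sketch below assumes a one-sided z-test of H0: μ = 0 with known σ, and a true mean equal to a chosen `effect`; under those assumptions β = Φ(z_{1-α} - effect·√n/σ). The function name and the default values are illustrative, not taken from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def beta(alpha, n, effect=0.5, sigma=1.0):
    """P(Type II error) for a one-sided z-test of H0: mu = 0 when the true mean is `effect`."""
    z_crit = Z.inv_cdf(1 - alpha)  # rejection cutoff on the z scale
    # Probability that the test statistic falls below the cutoff although H0 is false
    return Z.cdf(z_crit - effect * n ** 0.5 / sigma)

# Fixed alpha: a larger n gives a smaller beta
print(beta(0.05, n=20), beta(0.05, n=40))
# Fixed n: a smaller alpha gives a larger beta
print(beta(0.05, n=20), beta(0.01, n=20))
```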

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                Reality
Decision        H0 is true                      H0 is false
Retain H0       Correct decision                Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)   Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
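As a rough illustration of that sample-size calculation, the sketch below uses the standard normal approximation n ≈ ((z_{1-α/2} + z_{1-β})·σ/effect)² for a two-sided test on a single mean (or a mean paired difference). The function name and the example effect size are assumptions made for illustration; dedicated power software may give slightly larger answers because it works with the t distribution.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate n so that a two-sided z-test has power 1 - beta at the given effect."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cutoff for the chosen significance level
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2
print(sample_size(effect=0.5, sigma=1.0))
# Demanding a smaller beta (more power) requires a larger sample
print(sample_size(effect=0.5, sigma=1.0, beta=0.10))
```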

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
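To illustrate how an interval carries the test result with it, the sketch below builds a 95% confidence interval for the mean paired difference in the tobacco-leaf virus example from the paired t-test slides. It is only a sketch: the differences are copied from that table, and 2.365 is the tabled t value for df = 7 quoted there.

```python
import math
import statistics

d = [-1, 6, 13, 4, 1, 1, 3, 5]  # paired differences from the virus example
T_CRIT = 2.365                  # tabled t for 95% confidence, df = 7 (from the slides)

n = len(d)
d_bar = statistics.mean(d)
se = statistics.stdev(d) / math.sqrt(n)  # standard error of the mean difference

lower = d_bar - T_CRIT * se
upper = d_bar + T_CRIT * se
zero_inside = lower <= 0 <= upper

print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
print("zero inside interval:", zero_inside)
```

Zero falls outside the interval, which matches the rejection of H0 at α = 0.05 in the paired t-test, and the interval also shows the range of effect sizes that remain plausible.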

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Page 70: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II Errors

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when the person is truly guilty but the jury finds them innocent. At first, many people find the distinction between the two types unnecessary: why not simply label them both as errors and get on with it?

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. This is why the burden of proof in criminal cases is set at "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error were 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "probability of Type I error" is not reader-friendly. For this reason, the phrase "chances of getting it wrong" is used instead.

Most people would agree that putting an innocent person in jail is "getting it wrong", and the phrase is easier to relate to. To give a better sense of what this means, the table below shows some possible values for the chances of getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
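The relationship between α, β, and n can be checked numerically. The sketch below is only an illustration under assumptions that are not in the slides: a one-sided z-test of H0: mean = 0 with known standard deviation, and a hypothetical true effect of 0.5 standard deviations. The function name `beta_one_sided_z` is made up for this example.

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability for a one-sided z-test of H0: mean = 0
    against the specific alternative mean = effect (sigma assumed known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # rejection cutoff on the z scale
    shift = effect * n ** 0.5 / sigma    # mean of the z statistic when HA is true
    return z.cdf(z_crit - shift)         # P(fail to reject H0 | HA is true)

# Print beta for a few combinations of alpha and n
for alpha in (0.10, 0.05, 0.01):
    for n in (10, 20, 40):
        print(f"alpha={alpha:.2f}  n={n:2d}  beta={beta_one_sided_z(alpha, n):.3f}")
```

Reading down a fixed α, β shrinks as n grows; reading across a fixed n, β grows as α is made smaller.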

Planning a study

Suppose you are interested in determining whether treatment X has an effect on outcome Y. Several issues need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
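The three probability sampling strategies named above can be sketched in a few lines. Everything here is hypothetical: the sampling frame of 12 subjects, the three clinics used as strata and clusters, and the sample sizes are invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 12 subjects spread evenly over 3 clinics
frame = [{"id": i, "clinic": "ABC"[i % 3]} for i in range(12)]

# Simple random sampling: every subject has the same chance of selection
simple = random.sample(frame, 4)

# Stratified sampling: draw a fixed number of subjects from every clinic
by_clinic = {}
for subject in frame:
    by_clinic.setdefault(subject["clinic"], []).append(subject)
stratified = [s for group in by_clinic.values() for s in random.sample(group, 2)]

# Cluster sampling: pick whole clinics at random and keep all of their subjects
chosen = random.sample(sorted(by_clinic), 2)
cluster = [s for c in chosen for s in by_clinic[c]]

print(len(simple), len(stratified), len(cluster))
```

In each case the probability that a given subject enters the sample is known in advance, which is what makes these probability samples.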

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y
Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome or a more extreme one?

This is only meaningful if chance is the only competing plausible explanation.
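A small exact calculation may make the definition concrete. The trial below is hypothetical and not from the slides: 9 of 10 patients improve on treatment X, and "chance alone" is taken to mean that each patient is equally likely to improve or not. The helper name `binomial_p_value` is made up for this sketch.

```python
from math import comb

def binomial_p_value(successes, n, p_null=0.5):
    """One-sided p-value: P(X >= successes) when X ~ Binomial(n, p_null)."""
    return sum(comb(n, k) * p_null**k * (1 - p_null)**(n - k)
               for k in range(successes, n + 1))

# Hypothetical trial: 9 of 10 patients improve on treatment X
p = binomial_p_value(9, 10)
print(f"p-value = {p:.4f}")   # 11/1024, about 0.0107
```

Only 11 of the 1024 equally likely outcomes under the null hypothesis show 9 or more improvements, so such a result would be unlikely if chance were the only explanation.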

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

                 Reality
Decision         H0 is true                        H0 is false
Retain H0        Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)     Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
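As a rough illustration of such a sample size calculation, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2((z for 1 - α/2) + (z for 1 - β))² σ² / δ². The formula choice, the function name `n_per_group`, and the example values (expected difference 5, standard deviation 10) are assumptions for this example, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff for the two-sided test
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values: expected difference 5, standard deviation 10
print(n_per_group(delta=5.0, sigma=10.0))   # 63 per group for beta = 0.2
```

Asking for a smaller β (more power) or a smaller α pushes the required sample size up, which matches the trade-off between α, β, and n.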

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of values the effect could plausibly take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = sample size - number of parameters estimated
- Df is n - 1 for a one-sample test of the mean

Common Inferential Stats

T-test - compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ2) - tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive statistics summarize measures of central tendency and variability.
- Inferential statistics determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct test.
- Parametric and non-parametric are two types of statistics; choosing between them requires an analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines whether there is a significant difference between 2 group means.
- ANOVA determines whether there is a significant difference between 3 or more means.
- χ2 is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the accompanying graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

We test the hypothesis of equal means for the two populations, assuming a common variance.

H0: μ1 = μ2    HA: μ1 ≠ μ2

                   N     Mean    Std Dev
Healthy            9     18.9    5.9
Cystic Fibrosis    13    11.9    6.3

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \sim t(df = n_1 + n_2 - 2) $$

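$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(5.9)^2 + (13 - 1)(6.3)^2}{9 + 13 - 2} = \frac{754.76}{20} \approx 37.7 $$

$$ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{37.7\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66 $$

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{18.9 - 11.9}{2.66} = 2.63 $$

$$ t_{0.05,\,20} = 2.086 $$

Since |t| > t_{0.05, 20}, reject H0 and accept HA.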
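The hand calculation above can be checked from the summary statistics alone. This is a minimal sketch: the function name `pooled_t` is made up for the example, and the critical value 2.086 is simply the tabled value quoted in the slides rather than something the code computes.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming a common variance (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))   # SE of the mean difference
    return (mean1 - mean2) / se_diff, df, pooled_var, se_diff

# Healthy: n=9, mean 18.9, SD 5.9; Cystic Fibrosis: n=13, mean 11.9, SD 6.3
t_stat, df, pooled_var, se_diff = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)

T_CRIT_05_DF20 = 2.086   # tabled two-sided critical value quoted in the slides
reject_h0 = abs(t_stat) > T_CRIT_05_DF20

print(f"s^2={pooled_var:.2f}  SE={se_diff:.2f}  t={t_stat:.2f}  df={df}  reject H0: {reject_h0}")
```

The statistic comes out at about 2.63 on 20 degrees of freedom, which exceeds the tabled value, so the table method rejects H0.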

MINITAB PROCEDURE - Comparing Two Population Means (RealEstate)

- Stat > Basic Statistics > 2-Sample t
- Choose "Samples in one column"
- Samples: Select ... (SalePrice)
- Subscripts: Select ... (Location)
- Click Graphs, check "Boxplots of data", OK
- Click Options
- Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0
HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44  p-Value = 0.660  DF = 44

The difference in mean sale prices = $26441. We have 90% confidence that the true mean difference in sale prices is in the interval (-$126602 < μ (East) - μ (West) < $73719).

p-Value = 0.660 > α = 0.05 and 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
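The Minitab figures can be approximately reproduced from the summary table. The reported DF of 44 (rather than 34 + 26 - 2 = 58) suggests the unpooled form of the test with the Welch-Satterthwaite degrees of freedom, so that is what this sketch assumes. The function name `welch_summary` is made up, and the critical value 1.680 is an assumed tabled value for a 90% interval with 44 degrees of freedom.

```python
from math import sqrt

def welch_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Difference in means, its standard error, and Welch-Satterthwaite df."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2      # squared standard errors per group
    se_diff = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return mean1 - mean2, se_diff, df

# East: n=34, mean 312702, SD 188163; West: n=26, mean 339143, SD 255571
diff, se_diff, df = welch_summary(34, 312702, 188163, 26, 339143, 255571)
t_stat = diff / se_diff

T_CRIT_90_DF44 = 1.680   # assumed tabled t value for a 90% interval, 44 df
ci = (diff - T_CRIT_90_DF44 * se_diff, diff + T_CRIT_90_DF44 * se_diff)

print(f"diff={diff}  t={t_stat:.2f}  df={int(df)}  90% CI=({ci[0]:.0f}, {ci[1]:.0f})")
```

Small differences from the Minitab interval are expected, because the summary table is rounded and the critical value is taken from a table. The interval contains zero, which agrees with failing to reject H0.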

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate

$$ \bar{d} = \frac{\sum_j d_j}{n}, \qquad s_d = \sqrt{\frac{\sum_j (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$

$$ t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1) $$

H0: μd = 0    HA: μd ≠ 0

Test of virus infection on tobacco leaves: number of death spots on leaves

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

H0: μd = 0    HA: μd ≠ 0

$$ \bar{d} = 4 $$

$$ s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1}} = 4.31 $$

$$ s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52 $$

$$ t = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365 $$
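The tobacco-leaf example can be recomputed directly from the eight pairs of counts. This sketch uses only the data shown in the slides; the critical value 2.365 is the tabled value quoted there, not something the code derives.

```python
from math import sqrt
from statistics import mean, stdev

# Lesion counts per plant for the two viruses (from the table in the slides)
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # within-plant differences
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # sample SD of the differences (n - 1)
se = s_d / sqrt(len(d))                       # standard error of the mean difference
t_stat = d_bar / se

T_CRIT_05_DF7 = 2.365   # tabled two-sided critical value quoted in the slides
print(f"d_bar={d_bar}  s_d={s_d:.2f}  SE={se:.2f}  t={t_stat:.2f}  "
      f"reject H0: {abs(t_stat) > T_CRIT_05_DF7}")
```

Working on the differences means only one standard deviation is needed, which is why the equal-variance question does not arise for the paired test.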

Advantages
1) It is usually easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same.

$s_{\bar{d}}^2$ compared with $s_{\bar{x}_1 - \bar{x}_2}^2$


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

p-value method:
- Reject the null hypothesis if the p-value is less than α.
- Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
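The table method and the p-value method lead to the same decision when applied to the same statistic. The sketch below shows this for a two-sided z-test, which is an assumption made so that the critical value and p-value can be computed with the standard library alone; the slides work with t-tests and tabled values. The function name `z_test_decision` is made up for the example.

```python
from statistics import NormalDist

def z_test_decision(z_stat, alpha=0.05):
    """Two-sided z-test decided by the table method and by the p-value method."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)         # cutoff a z table would give
    p_value = 2 * (1 - z.cdf(abs(z_stat)))      # two-sided p-value
    reject_by_table = abs(z_stat) >= critical   # table method
    reject_by_p = p_value < alpha               # p-value method
    return reject_by_table, reject_by_p, p_value

for z_stat in (2.5, 1.0):
    by_table, by_p, p = z_test_decision(z_stat)
    print(f"z={z_stat}  p={p:.4f}  reject (table)={by_table}  reject (p-value)={by_p}")
```

Whichever rule is chosen, it should be fixed before looking at the data, which is the point of the "don't mix them up" warning.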

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main role of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The role of the defendant, in turn, is to defend himself or herself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and decide on the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent    <- Null hypothesis
HA: Defendant is guilty      <- Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                    NULL HYPOTHESIS
                                    TRUE                                FALSE
Reject the null hypothesis          Type I error (α):                   CORRECT
                                    rejecting a true null hypothesis
Fail to reject the null hypothesis  CORRECT                             Type II error (β):
                                                                        failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.


Page 72: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

We test the hypothesis of equal means for the two populations assuming a common variance

H0 1 = 2 HA 1 2

N Mean Std Dev

Healthy Cystic Fibrosis

9 13

189 119

59 63

)2(~)()(

212121

21

nndfts

xxt

xx

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose: Samples in one column

Samples: Select SalePrice

Subscripts: Select Location

Click Graphs; check Boxplots of data; OK

Click Options

Confidence level: Enter 90; OK; OK

$H_0$: μ1 = μ2, or μ1 - μ2 = 0

$H_A$: μ1 ≠ μ2, or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (East, West); vertical axis SalePrice, 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

| Location | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| East | 34 | 312702 | 188163 | 32270 |
| West | 26 | 339143 | 255571 | 50122 |

Difference = μ(East) - μ(West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in mean sale prices = $26441. We are 90% confident that the true mean difference in sale prices is in the interval (-$126602 < μ(East) - μ(West) < $73719).

p-Value = 0.660 > α = 0.05 or 0.10: fail to reject $H_0$.

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west.
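The T-Value and DF in the Minitab output can be reproduced from the summary table alone. The sketch below is written in Python and assumes the unpooled (Welch) form of the two-sample t-test, which is consistent with the reported DF = 44 (a pooled test would give 34 + 26 - 2 = 58); the function name `welch_t` is illustrative. The p-value and confidence interval are not recomputed here because they need the t distribution, which the standard library does not provide.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff

# Summary statistics from the Minitab output: East (group 1), West (group 2)
t, df, se_diff = welch_t(34, 312702, 188163, 26, 339143, 255571)

print(f"t = {t:.2f}, df (truncated) = {math.floor(df)}, SE of difference = {se_diff:.0f}")
```

Truncating the approximate degrees of freedom to a whole number mirrors what the output shows; the untruncated value is slightly above 44.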

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, or one individual before vs. after treatment, or very close plots).

Sample 1: $X_{11}, X_{12}, \ldots, X_{1n}$
Sample 2: $X_{21}, X_{22}, \ldots, X_{2n}$

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$$\bar{d} = \frac{\sum d_j}{n}, \qquad s_d = \sqrt{\frac{\sum (d_j - \bar{d})^2}{n - 1}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} \sim t(n - 1)$$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$

Example: test of virus infection on tobacco leaves (number of dead spots on leaves).

| Plant | Virus 1 ($X_{1j}$) | Virus 2 ($X_{2j}$) | $d_j$ |
|---|---|---|---|
| 1 | 9 | 10 | -1 |
| 2 | 17 | 11 | 6 |
| 3 | 31 | 18 | 13 |
| 4 | 18 | 14 | 4 |
| 5 | 7 | 6 | 1 |
| 6 | 8 | 7 | 1 |
| 7 | 20 | 17 | 3 |
| 8 | 10 | 5 | 5 |
| Total | 120 | 88 | 32 |
| Mean | 15 | 11 | 4 |

$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$

$$s_d = \sqrt{\frac{(-1)^2 + 6^2 + \cdots + 5^2 - 32^2/8}{8 - 1}} = 4.31$$

$$s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$$

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05(7)} = 2.365$$

Since $t > t_{0.05(7)}$, reject $H_0$ at the 0.05 level.

Advantages

1) It is usually easier to detect a true small difference, because typically $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.
2) There is no need to consider whether the variances of the two populations are the same or not.
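The tobacco-leaf calculation can be reproduced directly from the eight pairs of counts. This is a minimal sketch in Python using the standard library; the variable names are illustrative, and the critical value 2.365 is the tabled value quoted on the slide.

```python
import math
import statistics

# Counts per plant from the slide
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Step 1: per-plant differences d_j = X_1j - X_2j
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)

# Step 2: mean difference, sample SD of the differences (n - 1 divisor),
# and standard error of the mean difference
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)
se_d = s_d / math.sqrt(n)

# Step 3: test statistic with n - 1 degrees of freedom
t = d_bar / se_d

T_CRIT = 2.365  # tabled two-sided critical value for alpha = 0.05, df = 7 (from the slide)
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, SE = {se_d:.2f}, t = {t:.2f}, df = {n - 1}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Note that the test uses only the differences, which is why the variances of the two original samples never enter the calculation.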

Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic. If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA). If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic. Reject the null hypothesis if the test statistic falls in the critical region. Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, H0 is not rejected and the results are declared not statistically significant.
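The table method and the p-value method are two views of the same comparison, so they lead to the same decision. The sketch below illustrates this in Python. As a simplifying assumption it uses a two-sided test with a standard normal (z) statistic instead of the t distribution used in the slides, because the standard library provides the normal distribution but not the t distribution; the function name and the example statistics are illustrative.

```python
from statistics import NormalDist

def two_sided_z_decision(z, alpha=0.05):
    """Apply the table method and the p-value method to a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Table method: compare |z| with the two-sided critical value
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_table = abs(z) >= z_crit
    # p-value method: probability of a statistic at least this extreme under H0
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_p = p_value < alpha
    return z_crit, p_value, reject_by_table, reject_by_p

for z in (2.63, 1.00):
    z_crit, p_value, by_table, by_p = two_sided_z_decision(z)
    print(f"z = {z:.2f}: critical value = {z_crit:.3f}, p = {p_value:.4f}, "
          f"table method rejects: {by_table}, p-value method rejects: {by_p}")
```

Either rule can be used on its own; the warning not to mix them up is about comparing like with like (statistic against critical value, p-value against α).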

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (the prosecutor), the defendant (the accused) and the judge. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide which hypothesis holds.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null Hypothesis
HA: Defendant is guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

| Decision | Null hypothesis is TRUE | Null hypothesis is FALSE |
|---|---|---|
| Reject the null hypothesis | Type I error (α): rejecting a true null hypothesis | CORRECT |
| Fail to reject the null hypothesis | CORRECT | Type II error (β): failing to reject a false null hypothesis |

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

| Percentage | Chance of sending an innocent man to jail |
|---|---|
| 20% | 1 in 5 |
| 5% | 1 in 20 |
| 1% | 1 in 100 |
| 0.01% | 1 in 10000 |

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
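The trade-off between α, β and n can be made concrete with a small calculation. The following Python sketch assumes a one-sided z-test of H0: μ = 0 against HA: μ > 0 with known standard deviation, for which β has a closed form; the function name, the effect size of 0.5 and the sample sizes are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """Type II error probability for a one-sided z-test when the true mean is `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    # Under the alternative the z statistic is centred at effect * sqrt(n) / sigma;
    # beta is the probability that it still falls below the threshold.
    return std_normal.cdf(z_crit - effect * sqrt(n) / sigma)

# Fixed alpha: a larger n gives a smaller beta
for n in (10, 20, 50):
    print(f"alpha = 0.05, n = {n:2d}: beta = {beta_one_sided_z(0.05, n, 0.5):.3f}")

# Fixed n: a smaller alpha gives a larger beta
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}, n = 20: beta = {beta_one_sided_z(alpha, 20, 0.5):.3f}")
```

The same qualitative pattern holds for t-tests, although the exact values of β then come from the noncentral t distribution rather than this normal formula.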

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
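The three probability sampling strategies named above differ in what is selected at random. The sketch below shows one possible way to draw each kind of sample in Python; the population of 200 subjects, the field names `stratum` and `clinic`, the sampling fractions and the random seed are all hypothetical and serve only to make the example runnable.

```python
import random

rng = random.Random(2023)  # fixed seed so the illustration is reproducible

# Hypothetical population: 200 subjects, 2 strata (50 in A, 150 in B), 10 clinics of 20
population = [
    {"id": i, "stratum": "A" if i % 4 == 0 else "B", "clinic": i % 10}
    for i in range(200)
]

# Simple random sampling: every subject has the same chance of selection
simple_random = rng.sample(population, 20)

# Stratified sampling: draw 10% at random within each stratum
stratified = []
for label in ("A", "B"):
    members = [p for p in population if p["stratum"] == label]
    stratified.extend(rng.sample(members, len(members) // 10))

# Cluster sampling: pick whole clinics at random and take everyone in them
chosen_clinics = rng.sample(range(10), 2)
cluster = [p for p in population if p["clinic"] in chosen_clinics]

print(len(simple_random), len(stratified), len(cluster))
```

In each case the selection probability of every subject is known in advance, which is what makes these probability samples.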

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome like the one observed?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance alone produced the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

| Decision | Reality: H0 is true | Reality: H0 is false |
|---|---|---|
| Retain H0 | Correct decision | Type II error (β) (0.2, 0.1) |
| Reject H0 | Type I error (α) (0.05, 0.01) | Correct decision |

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
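A sample-size calculation of this kind can be sketched in a few lines. The Python example below uses the common normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test, n per group = 2((z for 1 - α/2 + z for 1 - β) σ / δ)², which is an assumption of this sketch and not a formula taken from the slides; the function name and the effect size of half a standard deviation are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate subjects per group for a two-sided, two-group comparison of means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls the Type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # controls the Type II error
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05
print("beta = 0.2:", n_per_group(effect=0.5, sigma=1.0, beta=0.2), "per group")
print("beta = 0.1:", n_per_group(effect=0.5, sigma=1.0, beta=0.1), "per group")
```

Because the formula relies on the normal distribution, a t-based calculation would give a slightly larger sample size; dedicated power software is the usual choice for a real study.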

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
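The dependence of a confidence interval on sample size, variability and confidence level can be seen in a short calculation. The Python sketch below assumes a normal-approximation interval, estimate ± z · sd / √n, rather than a t-based interval; the function name and the numbers (an estimated effect of 4.0 with a standard deviation of 10) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def normal_ci(estimate, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean effect."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    half_width = z * sd / sqrt(n)
    return estimate - half_width, estimate + half_width

for n, level in ((16, 0.95), (25, 0.95), (25, 0.99)):
    low, high = normal_ci(4.0, 10.0, n, level)
    contains_zero = low <= 0.0 <= high
    print(f"n = {n}, level = {level:.0%}: ({low:.2f}, {high:.2f}), "
          f"zero inside interval: {contains_zero}")
```

Checking whether zero lies inside the interval answers the same question as the two-sided hypothesis test at the matching α, while the endpoints additionally show which effect sizes remain plausible.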

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

The number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent-samples t-test
- dependent (paired) samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

Pearson r, t-tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the accompanying graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).

Page 73: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

086220050 t

733720

76754

2139

361139519

2

11

22

21

222

2112

nn

snsns

662)13

1

9

1(737)

11(

21

2

21 nn

ss xx

632

662

911918

21

21

xxs

xxt

|| 020050 AHacceptHrejecttt

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-test

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testMINITAB PROCEDURE ndash Comparing Two Population Means (RealEstate)

Stat Basic Statistics 2-Sample t

Choose Samples in one column

Samples Select helliphelliphelliphellip(SalePrice)

Subscripts Select helliphelliphellip(Location)

Click Graphs Check Boxplots of data OK

Click Options

Confidence level Enter 90 OK OK

Ho μ1 = μ2 or μ1 - μ2 = 0

HA μ1 ne μ2 or μ1 - μ2 ne 0

WestEast

1000000

900000

800000

700000

600000

500000

400000

300000

200000

100000

Location

Sale

Price

Boxplot of SalePrice

Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:

- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

- Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This probability is known as the p-value (the observed significance level) of the test.

Hypothesis testing

P-value: the probability of an outcome at least as extreme as the observed one, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the claim that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                        H0 is false
Retain H0        Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
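A sample-size calculation of this kind can be sketched as follows. This is a minimal illustration, assuming a two-sided one-sample z-test with known standard deviation and the usual normal-approximation formula n = ((z_{1-α/2} + z_{1-β}) · σ / effect)². The effect size and σ are made-up numbers, and sample_size is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size(alpha, beta, effect, sigma):
    """Approximate n so that a two-sided z-test detects `effect` with power 1 - beta.

    Assumes known sigma and the normal approximation (illustrative only).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff for the chosen Type I error rate
    z_beta = z.inv_cdf(1 - beta)         # quantile for the chosen Type II error rate
    return ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Hypothetical planning values: detect a 5-unit effect when sigma is 10.
n_needed = sample_size(alpha=0.05, beta=0.20, effect=5.0, sigma=10.0)
print(n_needed)  # 32
```

Halving the expected effect roughly quadruples the required sample size, which is why the expected effect has to be specified before the study starts.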

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
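The three influences on interval width can be seen in a small sketch. It assumes a large-sample normal approximation for the mean (a t-based interval would be used for small samples), and the mean, standard deviation, and sample sizes are invented for illustration. The helper names ci_for_mean and width are hypothetical.

```python
from statistics import NormalDist

def ci_for_mean(mean, sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a mean (illustrative assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    half_width = z * sd / n ** 0.5
    return mean - half_width, mean + half_width

def width(interval):
    low, high = interval
    return high - low

# Made-up summary statistics.
base = ci_for_mean(100.0, 15.0, n=100)                        # reference interval
smaller_n = ci_for_mean(100.0, 15.0, n=25)                    # fewer subjects
more_variable = ci_for_mean(100.0, 30.0, n=100)               # noisier measure
more_confident = ci_for_mean(100.0, 15.0, n=100, confidence=0.99)

for label, ci in [("base", base), ("smaller n", smaller_n),
                  ("more variable", more_variable), ("99% level", more_confident)]:
    print(label, round(width(ci), 2))
```

Each change (smaller n, larger variability, higher confidence) widens the interval relative to the reference case.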

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:

- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric: two types of statistics, each requiring analysis of assumptions.

- Pearson r, t tests, and ANOVA are examples of parametric statistics.

- Pearson r measures the relationship or association between 2 variables.

- The t test determines if there is a significant difference between 2 group means.

- ANOVA determines if there is a significant difference between 3 or more means.

- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Two-sample t-test

MINITAB PROCEDURE: Comparing Two Population Means (RealEstate)

Stat > Basic Statistics > 2-Sample t

Choose "Samples in one column"

Samples: Select ... (SalePrice)

Subscripts: Select ... (Location)

Click Graphs, check "Boxplots of data", OK

Click Options

Confidence level: Enter 90, OK, OK

H0: μ1 = μ2 or μ1 - μ2 = 0

HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0

[Figure: Boxplot of SalePrice by Location (West, East); vertical axis SalePrice from 100000 to 1000000]

Two-sample t-test

Result: Two-sample T for SalePrice

Location   N    Mean     StDev    SE Mean
East       34   312702   188163   32270
West       26   339143   255571   50122

Difference = μ (East) - μ (West)
Estimate for difference: -26442
90% CI for difference: (-126602, 73719)
T-Test of difference = 0 (vs not =): T-Value = -0.44, p-Value = 0.660, DF = 44

The difference in the reported mean sale prices is $26,441 (East below West). We are 90% confident that the true mean difference in sale prices is in the interval -$126,602 < μ (East) - μ (West) < $73,719.

p-Value = 0.660 > α = 0.05 and 0.10: fail to reject H0.

We do not have sufficient evidence in the sample to conclude that there is a significant difference in mean sale prices between the east and west markets.
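The T-Value and DF in this output can be reproduced from the summary statistics alone. The sketch below assumes the unpooled (Welch) form of the two-sample t-test with the Welch-Satterthwaite degrees of freedom truncated to an integer; that reading is an assumption, though it is consistent with the reported DF = 44 (a pooled test would give 34 + 26 - 2 = 58). The function name welch_t is hypothetical, and the p-value is not recomputed here because it needs a t-distribution routine outside the standard library.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic, its standard error, and Welch df."""
    v1 = sd1 ** 2 / n1                     # squared standard error, group 1
    v2 = sd2 ** 2 / n2                     # squared standard error, group 2
    se = sqrt(v1 + v2)                     # standard error of the difference
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

# Summary statistics as reported for East and West.
t_stat, df, se = welch_t(312702, 188163, 34, 339143, 255571, 26)
print(round(t_stat, 2), int(df))  # -0.44 44
```

Because the means in the table are rounded, the difference computed here is -26441 rather than the -26442 in the output; the t statistic agrees to two decimals.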

Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, ..., X1n
Sample 2: X21, X22, ..., X2n

Method:

Calculate the difference between the two measurements for each individual: dj = X1j - X2j

Calculate:

$\bar{d} = \frac{\sum_j d_j}{n}$, $s_d^2 = \frac{\sum_j (d_j - \bar{d})^2}{n - 1}$, $s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$

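Paired t-test

Example: test of infection by two viruses on tobacco leaves (number of dead spots on leaves).

Plant   Virus 1 (X1j)   Virus 2 (X2j)   dj
1       9               10              -1
2       17              11              6
3       31              18              13
4       18              14              4
5       7               6               1
6       8               7               1
7       20              17              3
8       10              5               5
Total   120             88              32
Mean    15              11              4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \ne 0$

$s_d^2 = \frac{(-1)^2 + 6^2 + \dots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57$, $s_d = 4.31$, $s_{\bar{d}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$

Since the calculated t exceeds the tabled value, H0 is rejected at the 0.05 level.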
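The paired calculation can be reproduced with a few lines of code. This is a sketch using only the Python standard library and the eight pairs of counts from the tobacco-leaf data; the critical value 2.365 is the tabled value quoted in the slides rather than something computed here, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Counts per plant for the two viruses (from the data table).
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [a - b for a, b in zip(virus1, virus2)]   # per-plant differences d_j
n = len(d)
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # sample SD of differences (n - 1 divisor)
se_d_bar = s_d / sqrt(n)                      # standard error of the mean difference
t_stat = d_bar / se_d_bar                     # paired t statistic with n - 1 df

t_crit = 2.365                                # tabled t_0.05(7), as quoted in the slides
reject_h0 = abs(t_stat) > t_crit

print(d_bar, round(s_d, 2), round(se_d_bar, 2), round(t_stat, 2), reject_h0)
```

Carrying full precision gives t of about 2.625, slightly below the hand value obtained from the rounded standard error 1.52; the decision is the same.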

Paired t-test

Advantages:

1) It is usually easier to detect a true small difference.

2) There is no need to consider whether the variances of the two populations are the same or not.

When the paired measurements are positively correlated, $s_{\bar{d}}^2 < s_{\bar{x}_1 - \bar{x}_2}^2$.


Hypothesis Testing

Hypothesis testing is always a five-step procedure:

1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up).

Decision rule 1, table method: Compare the calculated value with a table of the critical values of the test statistic.

- If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

- If the absolute (calculated) value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 2, graph method: Compare the calculated value with the t-distribution graph of the test statistic.

- Reject the null hypothesis if the test statistic falls in the critical region.

- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up).

Decision rule 3, p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
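The table method and the p-value method can each be written as a one-line rule. The sketch below is only an illustration: the function names are hypothetical, and the inputs are the values from the two worked examples in these slides (a paired t of about 2.63 against a tabled 2.365, and a two-sample p-value of 0.660).

```python
def decide_by_table(test_statistic, critical_value):
    """Table method: compare |test statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 only when p is below the chosen alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

paired_decision = decide_by_table(2.63, 2.365)             # paired t-test example
two_sample_decision = decide_by_p_value(0.660, alpha=0.05)  # two-sample t-test example

print(paired_decision)      # reject H0
print(two_sample_decision)  # fail to reject H0
```

Keeping the two rules as separate functions mirrors the advice not to mix them up: a test statistic is compared with a critical value, a p-value is compared with α.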


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the hypotheses below before hearing a case.

Hypothesis

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                   NULL HYPOTHESIS
                                   TRUE                              FALSE
Reject the null hypothesis         Type I error (α):                 CORRECT
                                   rejecting a true null hypothesis
Fail to reject the null hypothesis CORRECT                           Type II error (β):
                                                                     failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage   Chances of sending an innocent man to jail
20%          1 in 5
5%           1 in 20
1%           1 in 100
0.01%        1 in 10,000
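The "chances of getting it wrong" can be checked by simulation: if the null hypothesis is true and we test at level α, we should wrongly reject in about a fraction α of repeated studies. The sketch below is an illustration under assumptions not in the slides: a two-sided z-test on normal data with known standard deviation 1, a fixed random seed, and a hypothetical function name false_positive_rate.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha, n=20, trials=5000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)                        # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        # Data generated under H0: true mean 0, known sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5             # z statistic for H0: mu = 0
        if abs(z) > z_crit:
            rejections += 1                          # a Type I error
    return rejections / trials

rate_at_20 = false_positive_rate(0.20)
rate_at_5 = false_positive_rate(0.05)
print(rate_at_20, rate_at_5)
```

The simulated rates land close to 1 in 5 and 1 in 20, in line with the first two rows of the table.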


Two-sample t-testTwo-sample t-testTwo-sample t-testTwo-sample t-testResult Two-sample T for SalePrice

Location N Mean StDev SE MeanEast 34 312702 188163 32270West 26 339143 255571 50122

Difference = μ (East) - μ (West)Estimate for difference -2644290 CI for difference (-126602 73719)T-Test of difference = 0 (vs not =) T-Value = -044 p-Value = 0660 DF = 44

The difference in mean sale prices = $2644190 Confidence that the true mean difference in sale prices is in the interval (-$126602 lt μ (East) - μ (West) lt $73719)

p-Value = 0660 gt α = 005010Fail to reject Ho

We do not have sufficient evidence in the sample to conclude that there are significant differences in mean sale prices in markets east and west

Test to deal with two observations with strong comparability (eg two treatments on the same individuals or one individual Before vs After treatment very close plots)

Sample 1 X11 X12 hellip X1n Sample 2 X21 X22 hellip X2n Method

Calculate differences between two measurements for each individual di = Xi1 ndash Xi2

Calculate

n

ss

n

dds

n

dd d

d

jd

j

1

)(

2

Paired t-testPaired t-testPaired t-testPaired t-test

)1(~

ntns

d

s

dt

dd

d

000 Ad HH

Paired t-testPaired t-testPaired t-testPaired t-test

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judges' role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors
Alpha and Beta

                                NULL HYPOTHESIS
                                TRUE                                 FALSE
Reject the null hypothesis      Type I error (α):                    CORRECT
                                rejecting a true null hypothesis
Fail to reject the null         CORRECT                              Type II error (β):
hypothesis                                                           failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error has been made. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chance of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
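One way to make the "chances of getting it wrong" concrete is to simulate many studies in which the null hypothesis is true and count how often it is rejected anyway. The sketch below is only an illustration under stated assumptions: data are drawn from a normal distribution with known standard deviation 1, a two-sided z-test is used, and the function name `type_i_error_rate`, the sample size, and the number of trials are invented for this example.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of true-null experiments in which H0 is wrongly rejected."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: data come from N(0, 1) and H0 says mu = 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # known sigma = 1, so z = mean / (1 / sqrt(n))
        if abs(z) >= critical:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, which is the sense in which α is the long-run chance of rejecting a true null hypothesis.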

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
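The trade-off between α, β, and n can be checked numerically. The following is a rough sketch, assuming a one-sided z-test of H0: μ = 0 against a specific alternative μ = `effect` with known standard deviation `sigma`; the function name `beta` and the chosen effect size are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test (H0: mu = 0, HA: mu = effect)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)      # reject H0 when the z statistic exceeds this
    shift = effect * n ** 0.5 / sigma    # mean of the z statistic when HA is true
    return z.cdf(critical - shift)       # chance the statistic stays below the cutoff

# Each row fixes alpha and increases n; each column fixes n and decreases alpha.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, [round(beta(alpha, n), 3) for n in (10, 20, 40)])
```

Reading across a row shows β falling as n grows for fixed α; reading down a column shows β rising as α is made smaller for fixed n.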

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
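The three probability sampling strategies named above can be sketched as follows. This is a simplified illustration: the population of 120 subjects grouped into three hypothetical wards, and the helper names `simple_random_sample`, `stratified_sample`, and `cluster_sample`, are assumptions made for the example.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of being selected."""
    return rng.sample(population, k)

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction from each stratum (a dict of name -> list of subjects)."""
    chosen = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, k))
    return chosen

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    picked = rng.sample(list(clusters), n_clusters)
    return [subject for name in picked for subject in clusters[name]]

rng = random.Random(0)
# Hypothetical population: subject IDs grouped by ward.
wards = {"A": list(range(0, 40)), "B": list(range(40, 100)), "C": list(range(100, 120))}
everyone = [s for members in wards.values() for s in members]

print(len(simple_random_sample(everyone, 12, rng)))
print(len(stratified_sample(wards, 0.10, rng)))
print(len(cluster_sample(wards, 1, rng)))
```

In each case the probability that a given subject enters the sample is known in advance, which is what makes these probability samples.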

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision       Reality: H0 is true                  Reality: H0 is false
Retain H0      Correct decision                     Type II error (β) (0.2, 0.1)
Reject H0      Type I error (α) (0.05, 0.01)        Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
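A sample-size calculation of this kind can be sketched with the usual normal-approximation formula, n = ((z_{1−α/2} + z_{1−β}) · σ / δ)², where δ is the expected effect. This is an approximation for a two-sided one-sample z-test, not a method taken from the slides; the function name `sample_size` and the example effect size are assumptions.

```python
import math
from statistics import NormalDist

def sample_size(effect, sigma, alpha=0.05, beta=0.20):
    """Smallest n for a two-sided one-sample z-test to reach power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Assumed example: expected effect of half a standard deviation.
print(sample_size(effect=0.5, sigma=1.0))
# Halving the expected effect roughly quadruples the required sample size.
print(sample_size(effect=0.25, sigma=1.0))
```

Smaller expected effects, smaller α, or smaller β all push the required sample size up.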

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
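How the three factors drive the interval width can be seen in a short sketch. It assumes a normal-approximation interval for a mean, mean ± z · sd / √n, which is reasonable for larger samples; small samples would use a t critical value instead. The function names and the numbers passed in are illustrative only.

```python
from statistics import NormalDist

def half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / n ** 0.5

def interval(mean, sd, n, confidence=0.95):
    """Lower and upper bounds of the interval around an observed mean."""
    h = half_width(sd, n, confidence)
    return mean - h, mean + h

# Assumed example values: observed mean effect 4.0, standard deviation 4.3.
print(interval(mean=4.0, sd=4.3, n=50))
print(half_width(4.3, 200), half_width(4.3, 50, confidence=0.99))
```

A larger sample narrows the interval, while more variability or a higher confidence level widens it.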

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
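The selection rules above can be written as a small lookup function. This is only a restatement of the listed rules for a continuous outcome; the function name `choose_test` and its parameters are hypothetical and are not part of any statistics library.

```python
def choose_test(n_groups=2, n_independent_vars=1, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_independent_vars == 1:
        # One independent variable: the choice depends on the number of groups.
        return "t-test" if n_groups == 2 else "ANOVA"
    if any_continuous_iv:
        # Several independent variables, at least one of them continuous.
        return "ANCOVA or multiple linear regression"
    # Several independent variables, all categorical.
    return "ANOVA"

print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_independent_vars=3, any_continuous_iv=True))
```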

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size − number of parameters estimated

Df is n − 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares the means of two groups
  interval/ratio level of measurement
  independent samples t-test
  dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
  interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.

Pearson r, t-tests, and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): scores come either from the same individuals or from matched groups, i.e. groups matched on everything but the treatment (the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Paired t-test

A test for two sets of observations with strong comparability (e.g. two treatments on the same individuals, one individual before vs. after treatment, or very close plots).

Sample 1: X11, X12, …, X1n
Sample 2: X21, X22, …, X2n

Method

Calculate the difference between the two measurements for each individual: $d_j = X_{1j} - X_{2j}$

Calculate:

$\bar{d} = \frac{\sum_{j=1}^{n} d_j}{n}$

$s_d = \sqrt{\frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n - 1}}$

$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$

Paired t-test

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t(n - 1)$

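Paired t-test

Example: test the infection of two viruses on tobacco leaves (number of dead spots on leaves).

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1        9                10               -1
2        17               11               6
3        31               18               13
4        18               14               4
5        7                6                1
6        8                7                1
7        20               17               3
8        10               5                5
Total    120              88               32
Mean     15               11               4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$s_d^2 = \frac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2 / 8}{8 - 1} = 18.57, \qquad s_d = 4.31$

$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.31}{\sqrt{8}} = 1.52$

$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{4}{1.52} = 2.63 > t_{0.05}(7) = 2.365$

Because the calculated t exceeds the critical value, H0 is rejected at the 5% level.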
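The hand calculation for the tobacco-leaf example can be reproduced in a few lines. This sketch uses only the Python standard library, so the critical value 2.365 (two-sided 5% level, 7 degrees of freedom) is entered as a constant from a t table rather than computed; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Dead-spot counts per plant for the two viruses (from the example table).
virus_1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus_2 = [10, 11, 18, 14, 6, 7, 17, 5]

d = [x1 - x2 for x1, x2 in zip(virus_1, virus_2)]  # paired differences d_j
n = len(d)
d_bar = mean(d)
s_d = stdev(d)           # sample standard deviation, divisor n - 1
se = s_d / sqrt(n)       # standard error of the mean difference
t = d_bar / se

critical = 2.365         # two-sided 5% critical value for 7 df, from a t table
print(f"d_bar={d_bar}, s_d={s_d:.2f}, se={se:.2f}, t={t:.2f}")
print("reject H0" if abs(t) >= critical else "fail to reject H0")
```

Carrying full precision gives a t value slightly below the hand-rounded 2.63, but the decision against H0 is the same.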

Paired t-test

Advantages
1) It is usually easier to detect a small true difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

The comparison involves the variance of the mean difference, $s_{\bar{d}}^2$, and the variance of the difference between the two sample means, $s_{\bar{x}_1 - \bar{x}_2}^2$.

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct test.
- Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the slide's graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, which is the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).


Paired t-test

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$t = \dfrac{\bar d}{s_{\bar d}} = \dfrac{\bar d}{s_d / \sqrt{n}} \sim t(n-1)$

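Paired t-test

Test of the infection of two viruses on tobacco leaves (number of dead spots on leaves)

Plant    Virus 1 (X1j)    Virus 2 (X2j)    dj
1          9               10              -1
2         17               11               6
3         31               18              13
4         18               14               4
5          7                6               1
6          8                7               1
7         20               17               3
8         10                5               5
Total    120               88              32
Mean      15               11               4

$H_0: \mu_d = 0 \qquad H_A: \mu_d \neq 0$

$s_d^2 = \dfrac{(-1)^2 + 6^2 + \cdots + 5^2 - (32)^2/8}{8 - 1} = 18.57$

$s_d = \sqrt{18.57} = 4.31$

$s_{\bar d} = \dfrac{s_d}{\sqrt{n}} = \dfrac{4.31}{\sqrt{8}} = 1.52$

$t = \dfrac{\bar d}{s_{\bar d}} = \dfrac{4}{1.52} = 2.632 > t_{0.05(7)} = 2.365, \quad P < 0.05$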
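The hand calculation for the tobacco-leaf data can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The critical value 2.365 is the tabled two-tailed value t_0.05(7) quoted on the slide, not something the code computes. Note that the slide's t = 2.632 comes from rounding the standard error to 1.52 before dividing; without intermediate rounding the statistic is about 2.625, which leads to the same decision.

```python
from math import sqrt
from statistics import mean, stdev

# Number of dead spots per plant, copied from the slide's table.
virus1 = [9, 17, 31, 18, 7, 8, 20, 10]
virus2 = [10, 11, 18, 14, 6, 7, 17, 5]

# Paired differences d_j = X1j - X2j.
d = [x1 - x2 for x1, x2 in zip(virus1, virus2)]
n = len(d)

d_bar = mean(d)            # mean difference
s_d = stdev(d)             # sample standard deviation (n - 1 denominator)
se = s_d / sqrt(n)         # standard error of the mean difference
t_stat = d_bar / se        # paired t statistic with n - 1 degrees of freedom

T_CRIT = 2.365             # tabled t_0.05(7), two-tailed, from the slide

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, se = {se:.2f}, t = {t_stat:.3f}")
print("reject H0" if abs(t_stat) >= T_CRIT else "fail to reject H0")
```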

Paired t-test

Advantages:
1) Usually it is easier to detect a true small difference.
2) There is no need to consider whether the variances of the two populations are the same or not.

Usually $s_{\bar d}^{2} < s_{\bar x_1 - \bar x_2}^{2}$

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
- If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
- If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
- Reject the null hypothesis if the test statistic falls in the critical region.
- Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

Figures: rejection regions at the α = 10% and α = 5% significance levels.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, then H0 is not rejected and the results are declared not statistically significant.
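The table method and the p-value method are two views of the same decision. The sketch below shows this for a two-sided z-test, chosen here only because the standard normal distribution is available in Python's standard library; the slides themselves use the t-distribution. The function name `two_sided_z_decision` and the example z values are hypothetical.

```python
from statistics import NormalDist


def two_sided_z_decision(z: float, alpha: float = 0.05):
    """Apply the table method and the p-value method to one z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Table method: compare |z| with the two-sided critical value.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_table = abs(z) >= critical
    # p-value method: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_p = p_value < alpha
    return critical, p_value, reject_by_table, reject_by_p


for z in (2.0, 1.5):
    critical, p_value, by_table, by_p = two_sided_z_decision(z)
    print(f"z = {z}: critical = {critical:.3f}, p = {p_value:.4f}, "
          f"table method rejects: {by_table}, p-value method rejects: {by_p}")
```

Both rules give the same answer for a given α, which is why the slide warns against mixing them: for example, comparing a p-value against a critical value from the table would be meaningless.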

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null hypothesis
HA: Defendant is guilty ← Alternate hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                              Null hypothesis TRUE                                  Null hypothesis FALSE
Reject the null hypothesis            Type I error (α): rejecting a true null hypothesis    CORRECT
Fail to reject the null hypothesis    CORRECT                                               Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10,000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
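The trade-off between α, β, and n can be made concrete with a small numerical sketch. It assumes a one-sided z-test of H0: μ = 0 against a true mean of 0.5 with known standard deviation 1. That setting, the effect size, and the function name `beta` are illustrative assumptions, not taken from the slides. Under those assumptions β = Φ(z_{1-α} - effect·√n / σ).

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta(alpha: float, n: int, effect: float = 0.5, sigma: float = 1.0) -> float:
    """Type II error probability of a one-sided z-test (assumed setting)."""
    z_crit = Z.inv_cdf(1 - alpha)          # rejection cutoff on the z scale
    shift = effect * sqrt(n) / sigma       # how far the true mean moves z
    return Z.cdf(z_crit - shift)           # chance of staying below the cutoff


# Each row changes alpha or n relative to the first row.
for alpha, n in [(0.05, 20), (0.05, 40), (0.01, 20), (0.10, 20), (0.01, 80)]:
    print(f"alpha = {alpha:.2f}  n = {n:3d}  beta = {beta(alpha, n):.3f}")
```

Reading the rows against the first one shows each bullet in turn: a larger n lowers β, a smaller α raises β, a larger α lowers β, and only a larger n lets both α and β fall together.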

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
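The three probability sampling strategies listed above can be sketched with Python's `random` module. The population of 100 patient IDs, the two strata, the five clinics, and all function names are hypothetical and serve only to show how the three designs differ.

```python
import random


def simple_random_sample(units, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(units, k)


def stratified_sample(strata, fraction, rng):
    """Draw the same fraction independently within each stratum."""
    chosen = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        chosen.extend(rng.sample(units, k))
    return chosen


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in picked for unit in clusters[name]]


rng = random.Random(2023)            # fixed seed so the sketch is repeatable
patients = list(range(100))          # hypothetical patient IDs
strata = {"female": patients[:60], "male": patients[60:]}
clusters = {f"clinic_{i}": patients[i * 20:(i + 1) * 20] for i in range(5)}

srs = simple_random_sample(patients, 10, rng)
strat = stratified_sample(strata, 0.10, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), len(strat), len(clus))
```

In each design the selection probability of every subject is known in advance, which is what makes these probability samples.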

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the p-value (the observed significance level) of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

              Reality
Decision      H0 is true                         H0 is false
Retain H0     Correct decision                   Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)      Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
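A power analysis of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2·((z_{1-α/2} + z_{1-β})·σ / effect)². The formula choice, the function name `n_per_group`, and the example numbers (an expected difference of 5 units with a standard deviation of 10) are assumptions made for illustration; dedicated power software that uses the t-distribution will give slightly larger answers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect: float, sigma: float,
                alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate sample size per group for a two-sided, two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff controlling the Type I error
    z_beta = z.inv_cdf(1 - beta)         # cutoff controlling the Type II error
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected difference 5, standard deviation 10.
print(n_per_group(effect=5.0, sigma=10.0))             # beta = 0.2 (power 80%)
print(n_per_group(effect=5.0, sigma=10.0, beta=0.1))   # beta = 0.1 (power 90%)
```

Asking for a smaller β, or expecting a smaller effect, increases the required sample size.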

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
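As a small worked case, the sketch below builds a 95% confidence interval for the mean paired difference in the tobacco-leaf example from the paired t-test slides (the eight differences d_j). It uses the tabled critical value t_0.05(7) = 2.365 quoted there rather than computing it. The helper name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Paired differences (Virus 1 minus Virus 2) from the tobacco-leaf example.
d = [-1, 6, 13, 4, 1, 1, 3, 5]
T_CRIT_95_DF7 = 2.365  # tabled two-tailed t value, alpha = 0.05, df = 7


def mean_ci(values, t_crit):
    """Confidence interval for a mean: estimate +/- t_crit * standard error."""
    m = mean(values)
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return m - half_width, m + half_width


low, high = mean_ci(d, T_CRIT_95_DF7)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
print("zero inside interval:", low <= 0 <= high)
```

The interval excludes zero, which matches the rejection of H0 in the paired t-test, and it also shows the range of plausible effect sizes, which the test alone does not.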

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.


Page 78: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

5218

314

31418

8)32(56)1(

4

2222

d

d

s

s

d

0506322521

4tt

Virus 1 Virus 2Plant X1j X2j dj

1 9 10 -12 17 11 63 31 18 134 18 14 45 7 6 16 8 7 17 20 17 38 10 5 5

Total 120 88 32Mean 15 11 4

Test infection of virus on tobacco leaves Number of death pots on leaves

000 Ad HH

3652050 )7(050 t

Paired t-testPaired t-testPaired t-testPaired t-test

Advantages1) Usually it is easy to find

a true small difference2) Do not need to consider if the

variances of two populations are same or not

22

21 xxd ss

Paired t-testPaired t-testPaired t-testPaired t-test

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:
- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one at least as extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
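One way to make the p-value definition concrete is a permutation test: shuffle the group labels many times, as if chance alone had assigned them, and count how often the shuffled difference is at least as large as the observed one. The sketch below is only an illustration. The function name, the data values, and the choice of a permutation test are assumptions, not part of the slides.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)            # relabel subjects as if chance alone decided
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical outcome values, made up for illustration only.
treated = [5.1, 6.3, 5.8, 6.9, 6.1, 5.7]
control = [4.2, 4.9, 5.0, 4.4, 5.3, 4.7]
p = permutation_p_value(treated, control)
print(f"estimated p-value: {p:.4f}")
```

A small value says the observed difference would rarely arise from relabeling alone. It says nothing about other explanations such as confounding.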

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                Reality ->
Decision        H0 is true                       H0 is false
Retain H0       Correct decision                 Type II error (β) (.2, .1)
Reject H0       Type I error (α) (.05, .01)      Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
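The sample-size calculation described above can be sketched for the simplest case, a two-sided comparison of two group means, using the normal approximation. The function name and the idea of expressing the expected effect in standard-deviation units are assumptions made for this example. Dedicated power software would normally use the t distribution and give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, beta=0.2):
    """Approximate n per group for a two-sided, two-group comparison of means.

    effect_size is the expected difference divided by the common standard
    deviation (a standardized effect). Normal approximation only.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the Type I error rate
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

n_medium = sample_size_per_group(0.5)    # expected effect of half a standard deviation
n_small = sample_size_per_group(0.2)     # smaller expected effect needs more subjects
print(n_medium, n_small)
```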

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
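As a rough illustration of these points, the sketch below computes a confidence interval for a mean effect and checks whether zero lies inside it. It assumes a large-sample normal approximation, and the data and function name are invented for the example. With a sample this small a t quantile would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(sample) / sqrt(n)   # shrinks as n grows, grows with variability
    center = mean(sample)
    return center - half_width, center + half_width

# Hypothetical per-subject effects (treatment minus control), for illustration only.
effects = [1.2, 0.4, 2.1, 1.7, 0.9, 1.5, 0.3, 1.8, 1.1, 1.4]
ci95 = mean_confidence_interval(effects, 0.95)
ci99 = mean_confidence_interval(effects, 0.99)
zero_is_plausible = ci95[0] <= 0 <= ci95[1]
print(ci95, ci99, zero_is_plausible)
```

Asking for more confidence (99% instead of 95%) widens the interval, which matches the third dependency listed above.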

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
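The rules above can be written down as a small lookup function. This is only a restatement of the slide for a continuous outcome. The function and parameter names are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs at least two groups")
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(2, any_continuous_predictor=True))
```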

Logic of hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine whether to use a one- or two-tailed test

- If the alternative hypothesis specifies the direction of the test, use a one-tailed test.
- Otherwise use a two-tailed test (most cases).
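The effect of the one-tailed versus two-tailed choice can be seen by computing both p-values for the same test statistic. The sketch assumes a z statistic (standard normal reference distribution) and a one-tailed alternative that predicts a positive effect. Both are assumptions for the example.

```python
from statistics import NormalDist

def p_value_from_z(z, tails=2):
    """p-value for a z statistic; the one-tailed version assumes HA predicts z > 0."""
    if tails == 1:
        return 1 - NormalDist().cdf(z)
    return 2 * (1 - NormalDist().cdf(abs(z)))

z = 1.8                                   # hypothetical observed test statistic
one_tailed = p_value_from_z(z, tails=1)
two_tailed = p_value_from_z(z, tails=2)
print(f"one-tailed: {one_tailed:.4f}, two-tailed: {two_tailed:.4f}")
```

The same observed value can be significant at α = .05 under a one-tailed test and not under a two-tailed test, which is why the choice has to be made before looking at the data.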

5. Determine degrees of freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test - compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²) - tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
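For the chi-square case, the test statistic compares observed category counts with the counts expected if the two variables were unrelated. A minimal sketch with invented counts follows. The function name is hypothetical, and the p-value lookup is left to a table.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total   # count expected under H0
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = treatment / control, columns = improved / not improved.
counts = [[30, 20],
          [18, 32]]
chi2, df = chi_square_statistic(counts)
print(f"chi-square = {chi2:.3f} with {df} degree(s) of freedom")
```

With 1 degree of freedom the tabled critical value at α = .05 is 3.841, so a statistic above that would lead to rejecting H0.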

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.

Summary

- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
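The two hand calculations differ in what they treat as the unit of analysis. The sketch below computes both t statistics on the same invented anxiety scores so the difference is visible. The function names and data are assumptions, and the independent-samples version uses the pooled-variance formula.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Pooled-variance t statistic and df for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def paired_t(before, after):
    """Paired t statistic and df, computed on the within-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical anxiety scores for the same five students on two occasions.
day_two = [20, 25, 30, 35, 40]
midterm = [22, 28, 31, 38, 41]

t_ind, df_ind = independent_t(midterm, day_two)   # ignores the pairing
t_pair, df_pair = paired_t(day_two, midterm)      # uses the pairing
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```

In this made-up data every student's score rises by a similar amount, so the paired statistic is much larger: differences between students cancel out of the paired calculation but inflate the standard error of the independent one.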


Advantages
1) Usually it is easy to find a true small difference
2) Do not need to consider whether the variances of the two populations are the same or not

22

21 xxd ss

Paired t-test

Hypothesis Testing

Hypothesis testing is always a five-step procedure:
1. Formulation of the null and the alternative hypotheses
2. Specification of the level of significance
3. Calculation of the test statistic
4. Definition of the region of rejection
5. Selection of the appropriate hypothesis

Hypothesis Testing

3 steps to carry out a hypothesis test:

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

3. Use one of three possible decision rules (don't mix them up):

   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic is greater than or equal to the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the calculated value of the test statistic is less than the critical value from the table, fail to reject the null hypothesis (H0).

Hypothesis Testing

3 steps to carry out a hypothesis test:

3. Use one of three possible decision rules (don't mix them up):

   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: Rejection region at the α = 10% significance level]

[Figure: Rejection region at the α = 5% significance level]

[Figure: Rejection region at the α = 10% significance level]

Hypothesis Testing

3 steps to carry out a hypothesis test:

3. Use one of three possible decision rules (don't mix them up):

   3. p-value method:
      - Reject the null hypothesis if the p-value is less than α.
      - Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed. If p < 0.05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
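The table method and the p-value method are two views of the same decision. The sketch below applies both to a two-sided z test so they can be compared side by side. The slides use the t distribution, and the normal distribution is substituted here only because it is available in the Python standard library. The function name is hypothetical.

```python
from statistics import NormalDist

def decide(z, alpha=0.05):
    """Apply the table method and the p-value method to a two-sided z test."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # the value a z table would give
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_by_table = abs(z) >= critical             # table method
    reject_by_p = p_value < alpha                    # p-value method
    return reject_by_table, reject_by_p, p_value

for z in (2.5, 1.0):                                 # hypothetical test statistics
    by_table, by_p, p = decide(z)
    print(f"z = {z}: table method rejects = {by_table}, "
          f"p-value method rejects = {by_p}, p = {p:.4f}")
```

Away from the exact boundary the two rules always agree, which is why the slides warn against mixing them rather than suggesting that they give different answers.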

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused), and the judges. The judge forms the following hypotheses before hearing a case:

H0: The evidence is not significantly strong enough to prove the defendant guilty.
H1: The evidence is significantly strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                                FALSE
Reject the null hypothesis            Type I error (α):                   CORRECT
                                      rejecting a true null hypothesis
Fail to reject the null hypothesis    CORRECT                             Type II error (β):
                                                                          failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage     Chances of sending an innocent man to jail
20%            Chance 1 in 5
5%             Chance 1 in 20
1%             Chance 1 in 100
0.01%          Chance 1 in 10,000
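The conversion behind the table is simple arithmetic: a Type I error probability of p percent corresponds to a chance of 1 in 100/p. A minimal check follows. The helper name is made up for this sketch.

```python
def one_in_n(percent):
    """Convert a probability given in percent to the N of '1 in N'."""
    return round(100 / percent)

for percent in (20, 5, 1, 0.01):
    print(f"{percent}% -> chance 1 in {one_in_n(percent):,}")
```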

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
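These relationships can be checked numerically. The sketch below approximates β for a two-sided, two-group comparison of means with the normal distribution. The function name, the standardized effect size of 0.5, and the group sizes are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(effect_size, n_per_group, alpha=0.05):
    """Approximate beta for a two-sided, two-group z test of means."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)   # expected z statistic under HA
    power = (1 - z.cdf(critical - shift)) + z.cdf(-critical - shift)
    return 1 - power

base = type_ii_error(0.5, 50, alpha=0.05)
larger_n = type_ii_error(0.5, 100, alpha=0.05)       # fixed alpha, larger n
stricter_alpha = type_ii_error(0.5, 50, alpha=0.01)  # fixed n, smaller alpha
looser_alpha = type_ii_error(0.5, 50, alpha=0.10)    # fixed n, larger alpha
print(f"{base:.3f} {larger_n:.3f} {stricter_alpha:.3f} {looser_alpha:.3f}")
```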

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
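The three probability sampling strategies named above can be sketched with Python's random module. The population of 100 subject IDs, the age strata, and the clinic clusters are all invented for the example, and the function names are hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of entering the sample."""
    return rng.sample(population, k)

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)                       # fixed seed for reproducibility
population = list(range(100))                 # hypothetical subject IDs
strata = {"younger": population[:60], "older": population[60:]}
clusters = {f"clinic_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
print(sorted(srs), sorted(strat), sorted(clus), sep="\n")
```

Stratified sampling guarantees each stratum its share of the sample (here 6 younger and 4 older subjects), while cluster sampling trades some representativeness for convenience by taking entire clinics.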


Page 80: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Paired t-testPaired t-test

Hypothesis TestingHypothesis TestingHypothesis testing is always a five-

step procedure Formulation of the null and the

alternative hypotheses Specification of the level of significance Calculation of the test statistic Definition of the region of rejection Selection of the appropriate hypothesis

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
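The conversion between a Type I error probability and "1 in N" odds is simple arithmetic. A minimal sketch, assuming a hypothetical helper named `one_in_n` (not part of the original slides):

```python
def one_in_n(probability):
    """Express a probability such as 0.05 as the N in "1 in N"."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    # round() absorbs tiny floating-point error in the division
    return round(1 / probability)


# The four rows of the table: 20%, 5%, 1% and 0.01%
for alpha in (0.20, 0.05, 0.01, 0.0001):
    print(f"{alpha:.2%} chance of a Type I error = 1 in {one_in_n(alpha):,}")
```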

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
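These relationships can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-sided, one-sample z-test of H0: mean = 0 against HA: mean > 0 with a known standard deviation. The function name `type_ii_error` and the parameters `effect` and `sigma` are hypothetical and not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, effect, sigma=1.0):
    """Beta for a one-sided, one-sample z-test of H0: mean = 0 vs HA: mean > 0.

    `effect` is the assumed true mean under HA and `sigma` the known
    population standard deviation (both are illustrative assumptions).
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)      # reject H0 when the z statistic exceeds this
    shift = effect * sqrt(n) / sigma     # mean of the z statistic when HA is true
    return z.cdf(critical - shift)       # P(fail to reject H0 | HA is true)


# Fixed alpha: a larger n gives a smaller beta
for n in (10, 25, 50, 100):
    print(f"alpha=0.05 n={n:3d} beta={type_ii_error(0.05, n, effect=0.3):.3f}")

# Fixed n: a smaller alpha gives a larger beta
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f} n= 25 beta={type_ii_error(alpha, 25, effect=0.3):.3f}")
```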

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
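The three probability sampling strategies named above can be sketched with Python's standard `random` module. The function names, the toy patient list and the ward dictionary are hypothetical, chosen only for illustration.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Every unit has the same chance of entering the sample."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction independently from every stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


# Toy population: (patient id, sex); odd ids are "F", even ids are "M"
patients = [(i, "F" if i % 2 else "M") for i in range(100)]
wards = {"ward_a": [1, 2, 3], "ward_b": [4, 5], "ward_c": [6]}

print(simple_random_sample(patients, 5, seed=1))
print(stratified_sample(patients, lambda p: p[1], 0.1, seed=1))
print(cluster_sample(wards, 2, seed=1))
```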

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value: the probability of the observed outcome (or one at least as extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
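One way to make "assuming chance alone" concrete is a permutation test: shuffle the group labels many times and count how often chance produces a difference at least as large as the one observed. This is a swapped-in illustration, not a method named in the slides, and the function name and the toy data are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Two-sided p-value for a difference in means under 'chance alone'."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Invented outcome Y values for a treatment X group and a control group
treated = [12, 14, 15, 16, 18, 19, 20, 21]
control = [1, 2, 3, 4, 5, 6, 7, 8]
print(permutation_p_value(treated, control, n_shuffles=2_000))
```

A small result means that shuffled labels rarely reproduce the observed gap, which is exactly the sense in which chance is "ruled out".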

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                Reality
Decision        H0 is true                       H0 is false
Retain H0       Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0       Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
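A sample size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z for 1 - α/2) + (z for 1 - β))² σ² / δ². The formula choice, the function name and the parameter names are assumptions made for illustration; dedicated power-analysis software typically uses the t distribution and gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Normal-approximation sample size per group for a two-sided,
    two-group comparison of means.

    `effect` is the expected difference in means and `sigma` the assumed
    common standard deviation (both must be supplied by the investigator).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(1 - beta)         # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2
print(sample_size_per_group(effect=0.5, sigma=1.0))
```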

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95% and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
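The dependence of the interval on sample size, variability and confidence level is visible in a short sketch. It assumes a large-sample normal approximation (for small samples a t-based interval would be wider); the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)            # variability and sample size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # degree of confidence
    margin = z * standard_error
    return centre - margin, centre + margin


# Invented effect measurements
effects = [2.1, 3.4, 1.8, 2.9, 3.1, 2.5, 2.2, 3.0]
print(mean_confidence_interval(effects, 0.95))
print(mean_confidence_interval(effects, 0.99))  # more confidence, wider interval
```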

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
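The selection rules on these two slides amount to a small decision function. The sketch below only encodes the rules as stated for a continuous outcome; the function name and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Return the test the slides suggest for a continuous outcome."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("one independent variable needs n_groups >= 2")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, all_categorical=False))
```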

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise two-tailed (most cases)

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.
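Steps 4 and 5 can be illustrated for a one-sample test of the mean, where one parameter (the mean) is estimated and so df = n - 1. The function name and the measurements are invented for the example; the resulting t statistic would then be compared against a tabled critical value for that df.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    df = n - 1                                   # one parameter (the mean) is estimated
    standard_error = stdev(values) / sqrt(n)
    t = (mean(values) - hypothesized_mean) / standard_error
    return t, df


measurements = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]    # invented data
t_stat, df = one_sample_t(measurements, hypothesized_mean=5.0)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```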

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

Common Inferential Stats

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- You must know the level of measurement of the variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t-tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- The t-test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
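The two hand calculations differ in how the standard error is formed. The sketch below shows the t-test versions only (not ANOVA), using a pooled-variance statistic for independent groups and a one-sample statistic on the differences for paired samples. The function names and the anxiety scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def paired_t(before, after):
    """t statistic and df for paired samples (one-sample test on differences)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Invented anxiety scores for the same five students on two occasions
second_day = [20, 22, 19, 24, 25]
midterm_day = [23, 25, 20, 27, 29]
print(paired_t(second_day, midterm_day))
print(independent_t(midterm_day, second_day))  # treats the scores as unrelated groups
```

With these numbers the paired statistic is larger, because pairing removes the between-student variability from the standard error.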


Hypothesis Testing

Hypothesis testing is always a five-step procedure:
- Formulation of the null and the alternative hypotheses
- Specification of the level of significance
- Calculation of the test statistic
- Definition of the region of rejection
- Selection of the appropriate hypothesis

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

1. Specify the 5 required elements of a hypothesis test listed above.

2. Using the sample data, compute either the value of the test statistic or the p-value associated with the calculated test statistic.

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   1. Table method: Compare the calculated value with a table of the critical values of the test statistic.
      - If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).
      - If the calculated value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.
      - Reject the null hypothesis if the test statistic falls in the critical region.
      - Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

Hypothesis Testing

Rejection region at the α = 10% significance level

Hypothesis Testing

Rejection region at the α = 5% significance level

Hypothesis Testing

Rejection region at the α = 10% significance level

Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):
   3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
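The table method and the p-value method lead to the same decision when they are applied to the same test. A minimal sketch, assuming a two-sided z-test so that the standard library's normal distribution can stand in for the printed table; all function names are hypothetical.

```python
from statistics import NormalDist


def decide_by_table(test_statistic, critical_value):
    """Table method: compare |statistic| with the tabled critical value."""
    if abs(test_statistic) >= critical_value:
        return "reject H0"
    return "fail to reject H0"


def decide_by_p_value(p_value, alpha=0.05):
    """p-value method: reject H0 when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


def two_sided_z_p_value(z_statistic):
    """p-value of a two-sided z-test (normally distributed test statistic)."""
    return 2 * (1 - NormalDist().cdf(abs(z_statistic)))


alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)   # the value a z table would give
for z in (0.5, 1.9, 2.5):
    print(z,
          decide_by_table(z, critical),
          decide_by_p_value(two_sided_z_p_value(z), alpha))
```

Each rule is complete on its own; mixing them, for example comparing a p-value with a critical value, is the mistake the slide warns about.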


Hypothesis Testing

An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case, i.e. the plaintiff (prosecutors), the defendant (the accused) and the judges. The judge will form a hypothesis as below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty
H1: The evidence is strong enough to prove the defendant guilty

Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis
HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                  NULL HYPOTHESIS
Decision                          TRUE                       FALSE
Reject the null hypothesis        Type I error (α):          CORRECT
                                  rejecting a true
                                  null hypothesis
Fail to reject the null           CORRECT                    Type II error (β):
hypothesis                                                   failing to reject a
                                                             false null hypothesis
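The meaning of α as the probability of rejecting a true null hypothesis can be checked by simulation: when H0 really is true, a test run at level α should reject in roughly a fraction α of repeated studies. The sketch assumes a two-sided z-test on normally distributed data with known standard deviation 1; the function name and all settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulated_type_i_rate(alpha=0.05, n=30, trials=5_000, seed=0):
    """Fraction of simulated studies that reject H0 although H0 is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0 is true: mean is 0
        z = (sum(sample) / n) * sqrt(n)                   # sigma = 1 is known
        if abs(z) >= critical:
            rejections += 1                               # a Type I error
    return rejections / trials


print(simulated_type_i_rate(alpha=0.05))
print(simulated_type_i_rate(alpha=0.20))
```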

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?



Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test1 Specify the 5 required elements of a

Hypothesis test listed above

2 Using the sample data compute either the value of the test statistic or the p-value associated with the calculated test statistic

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules (dont mix

them up) 1 Table method Compare the calculated value with a

table of the critical values of the test statistic If the absolute (calculated) value of the test statistic

to the critical value from the table reject the null hypothesis (HO) and accept the alternative hypothesis (HA)

If the calculated value of the test statistic lt the critical value from the table fail to reject the null hypothesis (H0)

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error has been made. However, the other two possibilities result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the two types of error unnecessary at first: perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage | Chances of sending an innocent man to jail
20%        | 1 in 5
5%         | 1 in 20
1%         | 1 in 100
0.01%      | 1 in 10,000
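The "chances of getting it wrong" can be checked by simulation. The sketch below is only an illustration under stated assumptions: the data are drawn from a standard normal population (so the null hypothesis "mean = 0" really is true), a two-sided z-test with known standard deviation is used, and the function name `type_i_error_rate` and its defaults are hypothetical.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha, n=10, trials=5000, seed=0):
    """Fraction of two-sided z-tests that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Sample from N(0, 1): the null hypothesis (mean = 0) is true here.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # z statistic with known sigma = 1
        if abs(z) >= z_crit:
            rejections += 1  # an "innocent" null was "convicted"
    return rejections / trials

if __name__ == "__main__":
    for alpha in (0.20, 0.05, 0.01):
        print(alpha, type_i_error_rate(alpha))
```

The observed rejection rate should land close to the chosen α, which is exactly the sense in which α is the probability of a Type I error.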

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
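The relationship between the two error probabilities and the sample size can be made concrete with a small calculation. This is a sketch under assumptions not stated on the slide: a one-sided z-test of H0: mean = 0 with known standard deviation, and a hypothetical true effect of 0.5 standard deviations. The function name `beta` is illustrative.

```python
from statistics import NormalDist

def beta(alpha, n, effect=0.5, sigma=1.0):
    """Type II error probability of a one-sided z-test of H0: mean = 0
    when the true mean is `effect` (an assumed value, for illustration)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # reject H0 when z >= z_crit
    shift = effect * n ** 0.5 / sigma        # mean of z when H0 is false
    return std_normal.cdf(z_crit - shift)    # P(fail to reject | H0 false)

if __name__ == "__main__":
    print("fixed alpha, larger n :", beta(0.05, 20), beta(0.05, 40))
    print("fixed n, smaller alpha:", beta(0.05, 20), beta(0.01, 20))
    print("both smaller, larger n:", beta(0.01, 80))
```

Each printed comparison corresponds to one of the statements about how the error probabilities and n trade off.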

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
Simple random sampling
Stratified sampling
Cluster sampling
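The three probability sampling schemes can be sketched with Python's standard `random` module. Everything about the data here is hypothetical: the population is a list of `(id, sex, clinic)` tuples invented for the illustration, and the helper names are not from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subset of size k is equally likely."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = {}
    for unit in population:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

# Hypothetical population: (id, sex, clinic) for 100 subjects in 5 clinics.
population = [(i, "F" if i % 2 else "M", i % 5) for i in range(100)]
rng = random.Random(42)
srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda u: u[1], 5, rng)
clus = cluster_sample(population, lambda u: u[2], 2, rng)
```

In each scheme the probability that a given subject enters the sample is known in advance, which is what makes these probability samples.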

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
Chance
Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the p-value (the observed significance level) of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
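One way to see what "assuming chance alone" means is an exact permutation test: if treatment X had no effect, the group labels would be arbitrary, so every way of splitting the pooled outcomes into two groups is equally likely. The sketch below is an illustration, not the method used in the slides; the function name and the tiny data set are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = treated + control
    observed = abs(mean(treated) - mean(control))
    indices = range(len(pooled))
    total = 0
    as_extreme = 0
    # Enumerate every relabelling of the pooled outcomes into two groups.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count relabellings at least as extreme as what was observed
        # (small tolerance guards against floating-point round-off).
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total

if __name__ == "__main__":
    print(permutation_p_value([10, 11, 12, 13], [1, 2, 3, 4]))
```

With four subjects per group there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; small samples limit how strong the evidence against chance can be.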

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision \ Reality → | H0 is true                     | H0 is false
Retain H0            | Correct Decision               | Type II Error (β) (0.2, 0.1)
Reject H0            | Type I Error (α) (0.05, 0.01)  | Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
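A sample-size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{1-β})σ/δ)². The assumptions are mine rather than the slides': a two-sided test, equal group sizes, a known common standard deviation σ, and an expected difference δ. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate subjects needed per group to detect a mean difference
    `effect` with Type I error alpha and Type II error beta (power 1 - beta)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided critical value
    z_beta = z(1 - beta)        # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

if __name__ == "__main__":
    # Expected effect of half a standard deviation, 80% power.
    print(n_per_group(effect=0.5, sigma=1.0))
```

Halving the expected effect roughly quadruples the required sample size, which is why the expected effect has to be specified before the study starts.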

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
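The three ingredients of the interval width (sample size, variability, confidence level) show up directly in the formula mean ± z · s/√n. The sketch below uses the normal approximation for simplicity; for small samples a t quantile would normally replace z, so treat this as an assumption-laden illustration. The function name and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    centre = mean(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * stdev(sample) / n ** 0.5  # grows with s, shrinks with n
    return centre - half_width, centre + half_width

# Hypothetical outcome measurements.
data = [4.1, 5.0, 5.6, 4.8, 5.3, 4.4, 5.9, 5.1]
low95, high95 = mean_confidence_interval(data, 0.95)
low99, high99 = mean_confidence_interval(data, 0.99)
```

Asking for more confidence widens the interval, while collecting more data with the same variability narrows it.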

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
Two groups: t-test
Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
ANOVA if all independent variables are categorical
ANCOVA or multiple linear regression if some independent variables are continuous
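The selection rules for a continuous outcome can be written down as a small lookup function. This is only a restatement of the rules above in code; the function name and parameter names are hypothetical, and it does not cover non-continuous outcomes.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for comparing means of a continuous outcome."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"
```

For example, `choose_test(1, n_groups=3)` suggests ANOVA, while `choose_test(2, any_continuous_predictor=True)` suggests ANCOVA or multiple linear regression.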

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value
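The six steps can be traced in a short worked example for a one-sample t-test. The data and the hypothesized mean are invented for illustration; the critical value 2.262 is the standard tabled two-tailed value for α = 0.05 with 9 degrees of freedom.

```python
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    return t, n - 1

# Step 1: appropriate test = one-sample t-test (hypothetical data).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.2, 5.9, 5.5]
# Step 2: level of significance.
alpha = 0.05
# Step 3: two-tailed, because HA (mean != 5.0) has no direction.
# Steps 4 and 5: test statistic and degrees of freedom.
t_stat, df = one_sample_t(sample, mu0=5.0)
# Step 6: compare with the tabled two-tailed value for alpha = 0.05, df = 9.
T_CRITICAL_DF9 = 2.262
reject_h0 = abs(t_stat) >= T_CRITICAL_DF9
```

Only the tabled value depends on α, the number of tails and the degrees of freedom; the statistic itself is computed from the data alone.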

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then use a one-tailed test.

Otherwise use a two-tailed test (most cases).
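The practical consequence of this choice is visible in the p-value: for the same test statistic, a one-tailed p-value in the predicted direction is half the two-tailed one. The sketch assumes a z statistic with a standard normal reference distribution; the function name and its parameters are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tails=2, direction="greater"):
    """p-value of a z statistic for a one- or two-tailed test."""
    upper = 1 - NormalDist().cdf(z)  # P(Z >= z) under H0
    if tails == 2:
        return 2 * min(upper, 1 - upper)
    # One-tailed: HA says the effect is "greater" or "less" than zero.
    return upper if direction == "greater" else 1 - upper

if __name__ == "__main__":
    print(p_value(1.8, tails=2), p_value(1.8, tails=1))
```

A statistic of 1.8 would be significant at α = 0.05 in a one-tailed test but not in a two-tailed test, which is why the direction must be fixed before looking at the data.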

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.

Common Inferential Stats

T-test: compares means of two groups
interval/ratio level of measurement
independent samples t-test
dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
nominal/ordinal level of measurement

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are two types of statistics, both requiring analysis of assumptions.

Pearson r, t-tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
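As an illustration of the chi-square statistic for two categorical variables, the sketch below computes it by hand from a table of counts: expected counts come from the row and column totals, and the statistic sums (observed - expected)² / expected over all cells. The function name and the 2 x 2 table are hypothetical.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows = group, columns = outcome category.
stat, df = chi_square([[10, 20], [20, 10]])
```

The statistic is then compared with a tabled critical value for the given degrees of freedom (3.841 for df = 1 at α = 0.05).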

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups, i.e. matched on everything but the treatment (the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
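The two hand calculations differ in what they treat as the unit of analysis. The sketch below contrasts a pooled-variance t statistic for independent samples with a paired t statistic computed on within-person differences. The anxiety scores are invented, and equal variances are assumed for the independent-samples version.

```python
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Pooled-variance t statistic and df for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

def paired_t(before, after):
    """t statistic and df computed on the within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1

# Hypothetical anxiety scores for the same five students on two occasions.
second_day = [20, 22, 19, 24, 21]
midterm_day = [23, 24, 22, 26, 25]
t_paired, df_paired = paired_t(second_day, midterm_day)
# Treating the same numbers as two unrelated groups, for comparison only.
t_indep, df_indep = independent_t(midterm_day, second_day)
```

On these numbers the paired statistic is much larger, because pairing removes the between-student variability that the independent-samples calculation has to treat as noise.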


Hypothesis Testing

3 Steps to carry out a Hypothesis Test

3. Use one of three possible decision rules (don't mix them up):

1. Table method: Compare the calculated value with a table of the critical values of the test statistic.

If the absolute (calculated) value of the test statistic ≥ the critical value from the table, reject the null hypothesis (H0) and accept the alternative hypothesis (HA).

If the absolute (calculated) value of the test statistic < the critical value from the table, fail to reject the null hypothesis (H0).


2. Graph method: Compare the calculated value with the t-distribution graph of the test statistic.

Reject the null hypothesis if the test statistic falls in the critical region.

Fail to reject the null hypothesis if the test statistic does not fall in the critical region.

[Figure: rejection region at the α = 10% significance level]

[Figure: rejection region at the α = 5% significance level]

[Figure: rejection region at the α = 10% significance level]

3. p-value method: Reject the null hypothesis if the p-value is less than α. Fail to reject the null hypothesis if the p-value is greater than or equal to α.

The exact p-value can be computed, and if p < 0.05 then H0 is rejected and the results are declared statistically significant. Otherwise, if p ≥ 0.05, we fail to reject H0 and the results are declared not statistically significant.
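The three decision rules are different views of the same comparison, so on the same data they lead to the same conclusion. The sketch below checks this for a two-sided z-test; using the standard normal instead of the t-distribution is a simplifying assumption, and the function name is hypothetical.

```python
from statistics import NormalDist

def decide(z, alpha=0.05):
    """Apply the table, graph and p-value rules to a two-sided z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return {
        # Table method: |statistic| against the tabled critical value.
        "table": abs(z) >= z_crit,
        # Graph method: does the statistic fall in either rejection tail?
        "graph": z <= -z_crit or z >= z_crit,
        # p-value method: p against the significance level.
        "p_value": p < alpha,
    }

if __name__ == "__main__":
    print(decide(2.5))
    print(decide(1.0))
```

Each entry is True when that rule rejects the null hypothesis; the advice not to mix the rules up is about bookkeeping, since the rules themselves do not disagree.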


An example of hypothesis testing can be found in the jury system. There are three parties involved in a court case: the plaintiff (prosecutors), the defendant (the accused) and the judge. The judge forms the hypotheses below before hearing a case.

Hypothesis
H0: The evidence is not strong enough to prove the defendant guilty.
H1: The evidence is strong enough to prove the defendant guilty.

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in turn, is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and to decide on the validity of the hypothesis.


Page 84: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null Hypothesis
HA: Defendant is guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                    NULL HYPOTHESIS
                                    TRUE                          FALSE
Reject the null hypothesis          Type I error (α):             CORRECT
                                    rejecting a true
                                    null hypothesis
Fail to reject the null hypothesis  CORRECT                       Type II error (β):
                                                                  failing to reject a
                                                                  false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chance of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
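The trade-off between α, β, and n can be checked numerically. The sketch below is an illustration under stated assumptions: an upper-tailed one-sample z-test with known standard deviation, and a hypothetical true mean shift of 0.5. The function name `type_ii_error` and its default values are not from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for an upper-tailed one-sample z-test when the true mean shift is `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # cutoff fixed by alpha
    shift = effect * sqrt(n) / sigma     # true shift measured in standard errors
    return z.cdf(z_crit - shift)         # P(fail to reject H0 | H0 is false)

for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25), (0.10, 25)]:
    print(f"alpha = {alpha:.2f}, n = {n}: beta = {type_ii_error(alpha, n):.3f}")
```

Comparing the printed rows shows each statement above: a larger n lowers β at fixed α, and a smaller α raises β at fixed n.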

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
  Simple random sampling
  Stratified sampling
  Cluster sampling
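The three probability sampling strategies can be sketched with the standard library. Everything here is illustrative: the patient IDs, the age strata, and the clinic clusters are hypothetical, and the function names are not from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample of fixed size from within every stratum."""
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every subject in the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
patients = list(range(1, 101))            # hypothetical patient IDs 1..100
strata = {"under_50": patients[:60], "50_plus": patients[60:]}
clinics = {f"clinic_{i}": patients[i * 20:(i + 1) * 20] for i in range(5)}

print(simple_random_sample(patients, 5, rng))
print(stratified_sample(strata, 3, rng))
print(list(cluster_sample(clinics, 2, rng)))
```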

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
  Chance
  Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.
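One way to see what "chance alone" means is a permutation test: shuffle the group labels many times and count how often the shuffled difference is at least as large as the observed one. The sketch below is a minimal illustration, not a method named on the slides; the outcome values and the function name `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Estimate how often chance alone gives a mean difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical values of outcome Y in the experimental and control groups.
treated = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7]
control = [10.2, 11.0, 10.9, 11.5, 10.1, 11.3]
print(permutation_p_value(treated, control))
```

A small result says only that chance is an unlikely explanation; it says nothing about other competing explanations, which is why the study design matters.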

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                        H0 is false
Retain H0        Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
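A sample size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z₁₋α/₂ + z₁₋β)σ/δ)². The sketch assumes a two-sided test, equal group sizes, and a known common standard deviation; the function name `sample_size_per_group` and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Subjects needed per group to detect a mean difference `effect` with power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance cutoff
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of 5 units on outcome Y, standard deviation of 10 units.
print(sample_size_per_group(effect=5, sigma=10))             # power 0.80
print(sample_size_per_group(effect=5, sigma=10, beta=0.10))  # power 0.90
```

Asking for a smaller β (more power) or expecting a smaller effect both raise the required sample size.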

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
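As a rough illustration, a confidence interval for the difference between two group means can be computed as follows. The sketch assumes the large-sample normal approximation (for samples this small a t quantile would normally replace the z quantile); the data and the function name `mean_difference_ci` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Normal-approximation confidence interval for mean(group_a) - mean(group_b)."""
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent means.
    se = sqrt(stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * se, diff + z * se

treated = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7]
control = [10.2, 11.0, 10.9, 11.5, 10.1, 11.3]
low, high = mean_difference_ci(treated, control)
print(f"95% CI for the effect: ({low:.2f}, {high:.2f})")
print("zero inside the interval:", low <= 0 <= high)
```

If zero lies outside the interval, the matching two-sided test would reject H0, and the interval also shows how large the effect plausibly is.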

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
  Two groups: t-test
  Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
  ANOVA if all independent variables are categorical
  ANCOVA or multiple linear regression if some independent variables are continuous
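The rules of thumb above can be written as a small lookup function. This is only a restatement of the slide content for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules of thumb above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs at least two groups")
    if n_independent_vars > 1:
        if any_continuous_predictor:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, any_continuous_predictor=True))
```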

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).
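The choice of tails changes the p-value for the same test statistic. The sketch below assumes a z statistic for simplicity; the function name `z_test_p_value` is hypothetical.

```python
from statistics import NormalDist

def z_test_p_value(z, tail="two"):
    """p-value of a z statistic; tail is 'two', 'upper', or 'lower'."""
    cdf = NormalDist().cdf
    if tail == "two":      # HA: effect differs from the null value
        return 2 * (1 - cdf(abs(z)))
    if tail == "upper":    # HA: effect is greater than the null value
        return 1 - cdf(z)
    if tail == "lower":    # HA: effect is less than the null value
        return cdf(z)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

z = 1.8
print(f"two-tailed p = {z_test_p_value(z, 'two'):.4f}")
print(f"upper-tailed p = {z_test_p_value(z, 'upper'):.4f}")
```

With z = 1.8 and α = 0.05, the upper-tailed test rejects H0 while the two-tailed test does not, which is why the direction must be fixed before looking at the data.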

5. Determine Degrees of Freedom

The number of components that are free to vary about a parameter.

df = sample size − number of parameters estimated

df is n − 1 for a one-sample test of the mean.
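For a one-sample test of the mean, the test statistic and its degrees of freedom can be calculated by hand as sketched below. The sample values, the null mean, and the function name `one_sample_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean equals mu0."""
    n = len(sample)
    df = n - 1  # one parameter (the mean) is estimated from the sample
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - mu0) / standard_error
    return t, df

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 5.4, 5.5]
t, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The computed t is then compared against the tabled critical value for that number of degrees of freedom.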

Common Inferential Stats

t-test: compares the means of two groups
  interval/ratio level of measurement
  independent-samples t-test
  dependent (paired-samples) t-test

Common Inferential Stats

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
  interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  nominal/ordinal level of measurement

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric statistics are two types of statistics; choosing between them requires checking assumptions.

Pearson r, t-tests, and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines whether there is a significant difference between 2 group means.

ANOVA determines whether there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, which is the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
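The two hand calculations differ in what they treat as the unit of analysis. The sketch below computes both t statistics on the same hypothetical anxiety scores; the independent version assumes equal variances (pooled estimate), and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Pooled-variance t statistic for two independent groups; returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df

def paired_t(before, after):
    """t statistic on within-subject differences (after - before); returns (t, df)."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical anxiety scores for the same five students on two occasions.
second_day = [20, 25, 30, 35, 40]
midterm = [23, 27, 34, 38, 44]

print("paired:", paired_t(second_day, midterm))
print("treated as independent:", independent_t(second_day, midterm))
```

Because each student serves as his or her own control, the paired statistic removes between-student variability and is much larger in magnitude here than the independent-groups statistic on the same numbers.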


Page 86: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 2 Graph method Compare the calculated value

with the t-distribution graph of the test statistic Reject the NULL hypothesis if the test statistic

falls in the critical region Fail to reject the NULL if the test statistic does not fall in the critical region

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis Testing

The main role of the plaintiff in a case is to keep supplying strong evidence to prove the defendant guilty.

The role of the defendant is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and decide which hypothesis holds.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is Innocent ← Null Hypothesis

HA: Defendant is Guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

Decision                              NULL HYPOTHESIS is TRUE                               NULL HYPOTHESIS is FALSE
Reject the null hypothesis            Type I error (α): rejecting a true null hypothesis    CORRECT
Fail to reject the null hypothesis    CORRECT                                               Type II error (β): failing to reject a false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha"), and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
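A small simulation can make the "chances of getting it wrong" concrete. The sketch below is an illustration, not part of the original slides: it assumes a two-sided z-test with known standard deviation 1, draws data for which the null hypothesis really is true, and counts how often the test rejects anyway. The function name, sample size, trial count, and seed are all arbitrary choices.

```python
import random
from statistics import NormalDist, fmean


def simulated_type_i_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the data come from N(0, 1) and H0 says mean = 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z_stat = fmean(sample) * n ** 0.5  # known sigma = 1
        if abs(z_stat) > critical_value:
            rejections += 1  # a Type I error
    return rejections / trials
```

With `alpha=0.05` the returned fraction should land near 0.05 (about 1 in 20), and with `alpha=0.20` near 0.20 (about 1 in 5), which mirrors the rows of the table.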

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
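The trade-off between the two error probabilities and the sample size can be checked numerically. The sketch below is a simplified illustration that assumes a one-sided z-test of H0: mean = 0 with known standard deviation and a specified true effect; the function name `beta_one_sided_z` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """Type II error probability for an upper-tailed z-test of H0: mean = 0.

    Assumptions: known sigma, true mean equal to `effect` (> 0).
    """
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centered at effect*sqrt(n)/sigma.
    shift = effect * n ** 0.5 / sigma
    # Beta is the chance the statistic still falls below the critical value.
    return std_normal.cdf(critical_value - shift)
```

Holding the significance level at 0.05 and the effect at 0.5, moving from n = 25 to n = 50 lowers the result; holding n = 25, tightening the significance level from 0.05 to 0.01 raises it.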

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population. It may be difficult to convince someone that the sample results apply to any larger population.
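The three probability sampling strategies can be sketched with Python's standard `random` module. This is an illustrative sketch only; the function names and the data layout (a list of subjects, a dict of strata, a list of clusters) are assumptions made for the example.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subject has the same chance of entering the sample."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample inside each stratum.

    `strata` maps a stratum label to the list of subjects in that stratum.
    """
    return {label: rng.sample(members, k_per_stratum)
            for label, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every subject in them."""
    chosen = rng.sample(clusters, n_clusters)
    return [subject for cluster in chosen for subject in cluster]
```

Passing a seeded generator such as `random.Random(0)` as `rng` keeps a draw reproducible, which helps when the sampling step of a study has to be documented.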

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
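One way to see what "assuming chance alone" means is a permutation test: shuffle the group labels many times and count how often chance produces a difference at least as large as the observed one. The sketch below is an illustration under that assumption, not a method taken from the slides; the function name, the number of permutations, and the seed are arbitrary.

```python
import random
from statistics import fmean


def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    Under H0 the group labels are interchangeable, so relabelling the
    pooled outcomes at random shows what chance alone can produce.
    """
    rng = random.Random(seed)
    observed = abs(fmean(treated) - fmean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_treated]) - fmean(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)
```

Two clearly separated groups give a small estimate, while two identical groups give an estimate of 1, because every relabelling is at least as extreme as a difference of zero.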

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true                Reality: H0 is false
Retain H0     Correct decision                   Type II error (β) (0.2, 0.1)
Reject H0     Type I error (α) (0.05, 0.01)      Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
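A sample-size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z for α/2 + z for β) · σ / effect)². The sketch assumes equal group sizes, a common known standard deviation, and a two-sided test; the function name `n_per_group` is hypothetical, and exact t-based software will typically give a slightly larger answer.

```python
import math
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate subjects needed per group for a two-sided two-sample z-test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # guards against Type II error
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)  # round up to a whole subject
```

For an expected effect of half a standard deviation, `n_per_group(0.5, 1.0)` returns 63 under these assumptions; asking for a smaller Type II error probability increases the required sample size.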

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%; it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
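The link between a confidence interval and a hypothesis test can be sketched as follows. This is a large-sample illustration that uses a normal critical value rather than a t value, so it will be somewhat too narrow for small samples; the function name and the example data are hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Approximate confidence interval for mean(group_a) - mean(group_b)."""
    diff = fmean(group_a) - fmean(group_b)
    # Standard error of the difference between two independent means.
    se = (stdev(group_a) ** 2 / len(group_a)
          + stdev(group_b) ** 2 / len(group_b)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se


# Hypothetical outcome scores for a treated and a control group.
treated = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3, 5.7]
control = [4.2, 4.5, 4.1, 4.8, 4.4, 4.6, 4.3, 4.7]
low, high = mean_difference_ci(treated, control)
effect_could_be_zero = low <= 0 <= high
```

If `effect_could_be_zero` is false, the matching two-sided test at the 5% level would reject the null hypothesis, and the interval additionally shows the range of plausible effect sizes.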

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
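The selection rules above can be written down as a small lookup function. This is only a sketch of the rules as stated for a continuous outcome; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, n_groups must be 2 or more")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("at least one independent variable is required")
```

For instance, `choose_test(1, n_groups=3)` suggests ANOVA, and `choose_test(2, all_categorical=False)` suggests ANCOVA or multiple linear regression.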

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the test, then use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

The number of components that are free to vary about a parameter.

Df = Sample size − Number of parameters estimated

Df is n − 1 for a one-sample test of the mean.
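The last three steps (calculate the test statistic, determine the degrees of freedom, compare against a tabled value) can be sketched for a one-sample test of the mean. The data below are hypothetical, the function name `one_sample_t` is an assumption, and 2.262 is the standard tabled two-tailed critical value of t for 9 degrees of freedom at the 5% level.

```python
from statistics import fmean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / n ** 0.5
    t_stat = (fmean(sample) - mu0) / standard_error
    df = n - 1  # one parameter (the mean) is estimated from the sample
    return t_stat, df


# Hypothetical measurements; H0 says the population mean is 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3, 5.7, 5.4, 5.2]
t_stat, df = one_sample_t(sample, mu0=5.0)
TABLED_T_DF9_ALPHA05 = 2.262  # two-tailed, alpha = 0.05, df = 9
reject_h0 = abs(t_stat) > TABLED_T_DF9_ALPHA05
```

With ten observations the degrees of freedom are 9, and the null hypothesis is rejected only if the absolute t statistic exceeds the tabled value.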

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
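For the chi-square test, the statistic compares the observed count in each category cell with the count expected if the two variables were unrelated. The sketch below computes the Pearson chi-square statistic for a table of counts; the function name and the example counts are hypothetical, and 3.841 is the standard tabled critical value for 1 degree of freedom at the 5% level.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    `table` is a list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table of counts.
chi2, df = chi_square_statistic([[10, 20], [20, 10]])
significant = chi2 > 3.841  # tabled value for df = 1, alpha = 0.05
```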

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.

Pearson r, t-tests, and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
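The statement that either ANOVA or an independent-samples t-test can be used for two groups can be checked by hand-style calculation: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below assumes equal population variances; the function names and the anxiety scores are hypothetical.

```python
from statistics import fmean


def _sum_of_squares(group):
    """Sum of squared deviations of a group from its own mean."""
    m = fmean(group)
    return sum((x - m) ** 2 for x in group)


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(_sum_of_squares(g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (_sum_of_squares(group_a) + _sum_of_squares(group_b)) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (fmean(group_a) - fmean(group_b)) / standard_error, df


# Hypothetical anxiety scores from two different classes.
sociology = [22, 25, 19, 30, 27, 24]
political_science = [28, 31, 26, 35, 29, 33]
f_stat, df_between, df_within = one_way_anova_f(sociology, political_science)
t_stat, df = independent_t(sociology, political_science)
```

Here `f_stat` matches `t_stat ** 2` up to rounding, and the within-groups degrees of freedom equal the t-test degrees of freedom, so the two procedures lead to the same decision.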

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples.

(hand calculation)
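A hand-style calculation for the paired case works on the within-person differences rather than on the two sets of scores separately. The sketch below is an illustration with hypothetical anxiety scores; the function name `paired_t` is an assumption.

```python
from statistics import fmean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Each subject contributes one difference score.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / n ** 0.5
    return fmean(diffs) / standard_error, n - 1


# Hypothetical scores for the same six students on two occasions.
second_day = [20, 22, 19, 24, 25, 21]
midterm_day = [23, 25, 20, 27, 29, 22]
t_stat, df = paired_t(second_day, midterm_day)
```

Because each student serves as their own control, the degrees of freedom are the number of pairs minus one (here 5), not the total number of scores minus two.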

Page 87: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Common Inferential StatsCommon Inferential Stats

Descriptive stats summarize measures of central tendency and variability1048708

Inferential determine how likely it is that results based on sample are the same in population1048708

Must know level of measurement of variables to choose correct

Parametric and non-parametric two types of statistics requiring analysis of assumptions

1048708Pearson r t tests and ANOVA examples of parametric 1048708Pearson r measure relationship or association between 2

variables

T test determines if there is a significant difference between 2 group means

ANOVA determines if there is a significant difference between 3 or more means

X2 non-parametric statistic to assess relationship between 2 categorical variables

SummarySummary

Two types of ANOVATwo types of ANOVA

1Independent groups - two different sets of individuals In the graphic below college students are randomly assigned to Groups 1 and 2

Two types of ANOVATwo types of ANOVA

Example Researchers are interested in exam anxiety They administer an anxiety inventory to students just before the final exam in a Sociology class They also adminster it before the final exam in a Political Science class To compare the two sets of scores they use

either ANOVA or t-test for independent samples (hand calculation)

Two types of ANOVATwo types of ANOVA

2Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or from matched groups (ie matched on everything but the treatment (level of the Independent variable)

Two types of ANOVATwo types of ANOVA

Example Researchers are interested in exam anxiety They administer an anxiety inventory on the second day of class Then they give it again on the day of the midterm To compare the two sets of scores they use either ANOVA with replication or t-test for paired samples

(hand calculation)

Page 88: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing

Rejection region at the α=5 significance level

Hypothesis TestingHypothesis Testing

Rejection region at the α=10 significance level

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis Testing

An example of hypothesis testing can be found in the jury system. Three parties are involved in a court case: the plaintiff (the prosecution), the defendant (the accused) and the judge. Before hearing a case, the judge forms the following hypotheses:

H0: The evidence is not sufficiently strong to prove the defendant guilty.
H1: The evidence is sufficiently strong to prove the defendant guilty.

The plaintiff's main function in a case is to keep supplying strong evidence to prove the defendant guilty.

The defendant's function is to defend himself by rebutting the evidence provided by the plaintiff.

The judge's role is to weigh the information supplied by the plaintiff and the defendant and to decide which hypothesis holds.

Hypothesis Testing

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null Hypothesis
HA: Defendant is guilty ← Alternate Hypothesis

Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can differ from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                     NULL HYPOTHESIS
                                     TRUE                     FALSE
Reject the null hypothesis           Type I error (α):        CORRECT
                                     rejecting a true
                                     null hypothesis
Fail to reject the null hypothesis   CORRECT                  Type II error (β):
                                                              failing to reject a
                                                              false null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is under-reacting.

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage      Chances of sending an innocent man to jail
20% chance      1 in 5
5% chance       1 in 20
1% chance       1 in 100
0.01% chance    1 in 10,000
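The conversion between a percentage and a "1 in N" chance is simple arithmetic, sketched below. The helper name `one_in_n` is hypothetical; exact fractions are used so that the division is not affected by floating-point rounding.

```python
from fractions import Fraction


def one_in_n(percent):
    """Convert a Type I error percentage (given as a string) to N in '1 in N'."""
    probability = Fraction(percent) / 100  # e.g. "5" -> 1/20
    return 1 / probability                 # e.g. 1/20 -> 20


for percent in ["20", "5", "1", "0.01"]:
    print(f"{percent}% chance = 1 in {one_in_n(percent)}")
```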

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
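These relationships can be checked numerically. The sketch below assumes a one-sided z-test of H0: mean = 0 against a specific alternative mean (`effect`) with known standard deviation; the function name and the default values are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test when the true mean equals `effect`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection cutoff on the z scale
    # Probability that the test statistic falls short of the cutoff
    # even though the true mean is shifted by `effect`.
    return z.cdf(critical - effect * sqrt(n) / sigma)


print("alpha=0.05, n=20:", round(type_ii_error(0.05, 20), 3))
print("alpha=0.05, n=40:", round(type_ii_error(0.05, 40), 3))  # larger n, smaller beta
print("alpha=0.01, n=20:", round(type_ii_error(0.01, 20), 3))  # smaller alpha, larger beta
```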

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
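Two of the probability sampling schemes named above can be sketched with Python's standard `random` module. The function names, the `seed` argument and the ward labels are hypothetical and chosen only for illustration; proportional allocation is one of several possible ways to stratify.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Every subject has the same chance of entering the sample."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 100 patient IDs split across two wards.
strata = {"ward_a": list(range(0, 60)), "ward_b": list(range(60, 100))}
print(simple_random_sample(list(range(100)), 10))
print(stratified_sample(strata, 0.1))  # 6 IDs from ward_a, 4 from ward_b
```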

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the p-value (or observed significance level) of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

               Reality
Decision       H0 is true                       H0 is false
Retain H0      Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0      Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
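A rough sample-size calculation of this kind can be sketched as follows. It assumes a two-sided, one-sample z-test with known standard deviation and uses the common normal-approximation formula n = ((z(1-α/2) + z(1-β)) · σ / effect)²; the function name and the example numbers are illustrative assumptions, and real studies would normally use dedicated power-analysis software.

```python
from math import ceil
from statistics import NormalDist


def sample_size(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n for a two-sided one-sample z-test with power 1 - beta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cutoff for the significance level
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power
    return ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical planning values: expected effect of 0.5 with sigma = 1.
print(sample_size(effect=0.5, sigma=1.0))            # beta = 0.2 (80% power)
print(sample_size(effect=0.5, sigma=1.0, beta=0.1))  # beta = 0.1 needs more subjects
```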

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
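As a minimal sketch of the idea, the function below builds a confidence interval for a mean effect using the normal approximation (mean ± z · s / √n). The function name and the effect values are made up for illustration; for a sample this small a t quantile would normally replace the z quantile, which the standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - margin, centre + margin


# Hypothetical per-subject treatment effects.
effects = [1.2, 0.4, -0.3, 0.9, 1.5, 0.7, 0.2, 1.1]
low, high = mean_confidence_interval(effects)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print("zero inside interval:", low <= 0 <= high)
```

If zero lies outside the interval, the matching two-sided test at the same level would reject the hypothesis of no effect, and the interval also shows how large the effect plausibly is.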

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
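The selection rules above, which apply to a continuous outcome, can be written as a small lookup function. The function and parameter names are hypothetical; the returned labels simply restate the slide content.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, n_groups must be >= 2")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("at least one independent variable is required")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, all_categorical=False))
```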

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.
Otherwise two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.
Df = Sample size - Number of parameters estimated
Df is n - 1 for a one-sample test of the mean.
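Steps 4 to 6 can be illustrated with a one-sample t-test computed by hand. The sample values and the null mean are invented for the example; 2.262 is the standard tabled two-tailed critical value of t for α = 0.05 with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1  # one parameter (the mean) is estimated, so df = n - 1


# Tabled two-tailed critical values for alpha = 0.05, keyed by df.
T_CRITICAL_05 = {9: 2.262}

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.9, 5.4, 5.5]  # hypothetical data
t_stat, df = one_sample_t(sample, mu0=5.0)
reject = abs(t_stat) > T_CRITICAL_05[df]
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```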

Common Inferential Stats

T-Test - compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

Common Inferential Stats

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²) - testing hypothesis about number of cases that fall into various categories
- nominal/ordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables
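To make the chi-square entry concrete, here is a small hand calculation for a contingency table of counts. The function name and the 2 x 2 counts are made up; 3.841 is the standard tabled critical value of χ² for α = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = treatment / control, columns = improved / not improved.
chi2, df = chi_square_statistic([[30, 10], [20, 40]])
print(f"chi-square = {chi2:.2f}, df = {df}, significant at 0.05: {chi2 > 3.841}")
```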

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): the scores come either from the same individuals or from matched groups, i.e. groups matched on everything but the treatment (the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class, then give it again on the day of the midterm. To compare the two sets of scores they use either ANOVA with replication or a t-test for paired samples (hand calculation).
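The difference between the two designs shows up directly in the hand calculation. The sketch below computes a pooled-variance t statistic for independent groups and a paired t statistic on the within-person differences; the anxiety scores are invented, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    t = (mean(group_b) - mean(group_a)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def paired_t(first, second):
    """t statistic and df for paired scores from the same individuals."""
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical anxiety scores for five students.
second_day = [20, 22, 19, 24, 25]
midterm = [23, 25, 21, 27, 27]
print("treated as independent groups:", independent_t(second_day, midterm))
print("treated as paired samples:    ", paired_t(second_day, midterm))
```

With these numbers the paired statistic is much larger, because pairing removes the between-person variability; applying the independent-samples formula to paired data would ignore that structure.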


Page 90: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis Testing3 Steps to carry out a Hypothesis Test3 Use one of three possible decision rules

(dont mix them up) 3 p-value method Reject the NULL hypothesis

if the p-value is less that α Fail to reject the NULL if the p-value is greater than α

The exact p-value can be computed and if p lt 005 then H0 is rejected and the results are declared statistically significant Otherwise if p 005 then H0 is failed to reject and the results are declared not statistically significant

Hypothesis TestingHypothesis Testing

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

To perform a hypothesis test, we start with two mutually exclusive hypotheses. Here's an example: when someone is accused of a crime, we put them on trial to determine their innocence or guilt. In this classic case, the two possibilities are that the defendant is not guilty (innocent of the crime) or that the defendant is guilty. This is classically written as:

H0: Defendant is innocent ← Null Hypothesis
HA: Defendant is guilty ← Alternate Hypothesis

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors (Alpha and Beta)

Decision | NULL HYPOTHESIS TRUE | NULL HYPOTHESIS FALSE
Reject the null hypothesis | Type I error (α): rejecting a true null hypothesis | CORRECT
Fail to reject the null hypothesis | CORRECT | Type II error (β): failing to reject a false null hypothesis

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again there is no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage | Chances of sending an innocent man to jail
20% | 1 in 5
5% | 1 in 20
1% | 1 in 100
0.01% | 1 in 10,000
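The conversion in the table is simple arithmetic: a chance of p percent corresponds to 1 in 100/p. A short sketch, in which the helper name `one_in_n` is made up for illustration:

```python
def one_in_n(percent: float) -> int:
    """Convert a percentage chance into the N of a '1 in N' statement."""
    return round(100.0 / percent)


for pct in (20, 5, 1, 0.01):
    print(f"{pct}% chance = 1 in {one_in_n(pct):,}")
```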

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
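These relationships can be checked numerically. The sketch below assumes a setting the slides do not specify: a one-sided z-test of H0: mean = 0 against HA: mean > 0 with known standard deviation, where the true mean equals a hypothetical `effect`. The function name and default values are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def beta(alpha: float, n: int, effect: float = 0.5, sigma: float = 1.0) -> float:
    """Type II error rate of a one-sided z-test when the true mean is `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)      # rejection cutoff on the z scale
    shift = effect * sqrt(n) / sigma   # how far the true mean moves the test statistic
    return z.cdf(z_crit - shift)       # P(fail to reject H0 | H0 is false)


# Fixed alpha: a larger n gives a smaller beta
print(beta(0.05, 10), beta(0.05, 40))
# Fixed n: a smaller alpha gives a larger beta
print(beta(0.05, 20), beta(0.01, 20))
# Both alpha and beta can shrink only if n grows
print(beta(0.01, 80))
```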

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
  Simple random sampling
  Stratified sampling
  Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
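Two of the probability sampling strategies named above can be sketched with Python's standard `random` module. The patient records, the 10% sampling fraction and the helper names are all hypothetical and serve only to show the mechanics.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subset of size k is equally likely to be chosen."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction at random from each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(42)  # fixed seed so the draw is reproducible
# Hypothetical population: 150 women and 50 men
patients = [{"id": i, "sex": "F" if i % 4 else "M"} for i in range(200)]

srs = simple_random_sample(patients, 20, rng)
strat = stratified_sample(patients, lambda p: p["sex"], 0.10, rng)
print(len(srs), len(strat), sum(p["sex"] == "M" for p in strat))
```

The stratified draw fixes the number of men and women in the sample in advance, whereas the simple random sample leaves that split to chance.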

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
  Chance
  Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.
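One way to make "the probability of the observed outcome assuming chance alone" concrete is a permutation test: shuffle the group labels many times and count how often chance produces a difference at least as large as the observed one. This technique is swapped in here for illustration and is not named in the slides; the outcome values and the function name are hypothetical.

```python
import random


def permutation_p_value(treated, control, n_shuffles=5000, seed=0):
    """Estimate P(|difference in means| >= observed) when labels are assigned by chance."""
    rng = random.Random(seed)
    n_treated = len(treated)

    def mean_diff(values):
        first, second = values[:n_treated], values[n_treated:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(treated) + list(control)
    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)              # reassign group labels at random
        if mean_diff(pooled) >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical outcome Y in the experimental and control groups
treated = [8.1, 7.9, 9.2, 8.8, 9.5, 8.4]
control = [6.0, 6.8, 7.1, 5.9, 6.5, 7.0]
print(permutation_p_value(treated, control))
```

A small result means chance alone rarely produces a difference this large, which is only informative if the design has removed the other competing explanations.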

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of an outcome at least as extreme as the one observed, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome or something more extreme?

This is only meaningful if chance is the only competing plausible explanation.

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision | Reality: H0 is true | Reality: H0 is false
Retain H0 | Correct decision | Type II error (β) (0.2, 0.1)
Reject H0 | Type I error (α) (0.05, 0.01) | Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1 − β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
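As a rough sketch of such a calculation, the function below uses the common normal-approximation formula for comparing two group means, n = 2((z₁₋α/₂ + z₁₋β)σ/δ)² per group. The formula choice, the function name and the example effect size are assumptions for illustration; dedicated power software based on the t distribution will give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect: float, sigma: float, alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate sample size per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance cutoff
    z_beta = z.inv_cdf(1 - beta)         # quantile matching the desired power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)


# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.20
print(n_per_group(effect=0.5, sigma=1.0))
# Demanding beta = 0.10 requires more subjects
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))
```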

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
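A minimal sketch of a confidence interval for a mean effect follows. It assumes the large-sample normal approximation (a t quantile would be more appropriate for a sample this small), and the effect values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


# Hypothetical observed effects of treatment X on outcome Y
effects = [1.2, 0.4, 2.1, 1.7, 0.9, 1.5, 0.3, 1.8, 1.1, 1.4]
low, high = mean_ci(effects)
print(f"95% CI: ({low:.2f}, {high:.2f}); contains zero: {low <= 0 <= high}")

# Asking for more confidence widens the interval
print(mean_ci(effects, confidence=0.99))
```

Checking whether zero lies inside the interval reproduces the hypothesis test, while the endpoints show how large the effect could plausibly be.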

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
  Two groups: t-test
  Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
  ANOVA if all independent variables are categorical
  ANCOVA or multiple linear regression if some independent variables are continuous
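The selection rules above are a small decision tree, which can be written down directly. The function and parameter names are hypothetical, and the sketch covers only the continuous-outcome cases listed on the slide.

```python
def choose_test(n_independent_vars: int, n_groups: int = 2,
                any_continuous_predictor: bool = False) -> str:
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups < 2:
            raise ValueError("need at least two groups to compare")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, any_continuous_predictor=True))
```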

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.
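This logic can be illustrated by converting a distance from the expected mean into a two-sided p-value. The sketch assumes the expected distribution is normal with a known standard error, which the slide does not state; the function name is illustrative.

```python
from statistics import NormalDist


def two_sided_p(observed: float, expected_mean: float, standard_error: float) -> float:
    """Two-sided p-value for an observed value under a normal expected distribution."""
    z = (observed - expected_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The p-value shrinks as the observed value moves away from the expected mean
for value in (1.0, 1.96, 3.0):
    print(value, two_sided_p(value, expected_mean=0.0, standard_error=1.0))
```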

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then use a one-tailed test.

Otherwise use a two-tailed test (most cases).

5. Determine Degrees of Freedom

The number of components that are free to vary about a parameter.

df = sample size − number of parameters estimated. df is n − 1 for a one-sample test of the mean.
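The six steps listed earlier can be walked through for a one-sample test of the mean. The data and the hypothesized mean below are invented for illustration; the critical value 2.262 is the standard tabled two-tailed t value for α = 0.05 with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements and hypothesized population mean
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.2, 5.9, 5.5]
mu0 = 5.0

# Step 1: appropriate test = one-sample t-test of the mean
# Step 2: level of significance
alpha = 0.05
# Step 3: HA does not specify a direction, so the test is two-tailed
# Step 4: calculate the test statistic
t_stat = (mean(sample) - mu0) / (stdev(sample) / sqrt(len(sample)))
# Step 5: degrees of freedom = n - 1 (one parameter, the mean, is estimated)
df = len(sample) - 1
# Step 6: compare with the tabled value for alpha = 0.05, two-tailed, df = 9
T_CRITICAL = 2.262
reject = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```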

Common Inferential Stats

T-test: compares the means of two groups
  interval/ratio level of measurement
  independent samples t-test
  dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
  interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
  nominal/ordinal level of measurement

Common Inferential Stats: Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.

Pearson r, t-tests and ANOVA are examples of parametric statistics.

Pearson r measures the relationship or association between 2 variables.

The t-test determines if there is a significant difference between 2 group means.

ANOVA determines if there is a significant difference between 3 or more means.

χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
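As a sketch of how the χ² statistic assesses the relationship between two categorical variables, the code below computes it by hand for a contingency table. The counts are hypothetical; 3.841 is the standard tabled critical value for α = 0.05 with 1 degree of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows = treatment group, columns = outcome category
counts = [[20, 30],
          [30, 20]]
stat, df = chi_square(counts)
CRITICAL_DF1_ALPHA_05 = 3.841
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {stat > CRITICAL_DF1_ALPHA_05}")
```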

Two types of ANOVA

1. Independent groups: two different sets of individuals. For example, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class, then give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
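The difference between the two designs shows up in the hand calculation. The sketch below computes a paired t statistic and a pooled-variance independent-samples t statistic on the same hypothetical anxiety scores; the scores and function names are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(first, second):
    """Paired-samples t statistic: works on within-person differences."""
    diffs = [b - a for a, b in zip(first, second)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


def independent_t(group1, group2):
    """Independent-samples t statistic with a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    t = (mean(group2) - mean(group1)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical anxiety scores
second_day = [10, 12, 9, 14, 11]
midterm = [12, 15, 11, 15, 14]

print("paired:      t = %.2f, df = %d" % paired_t(second_day, midterm))
print("independent: t = %.2f, df = %d" % independent_t(second_day, midterm))
```

When the same students are measured twice, the paired calculation removes between-person variability, so on these numbers it yields a much larger t statistic than treating the scores as two unrelated groups.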

Page 92: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis TestingExample of hypothesis testing can be found in the jury system There are three party involved in a court case ie plaintiff (prosecutors) defendant (accuse) and the judges The judge will form a hypothesis as below before hearing a caseHypothesisHo The evidences are not significantly strong enough to proof the defendant guiltyH1 The evidences are significantly strong to proof defendant guilty

Hypothesis TestingHypothesis TestingThe main function of the plaintiff play in a case is to continuously supply strong evidence to proof the defendant guilty

Whereas the function of the defendant is to defend himself by rejecting the evident provide by the plaintiff

The judges role is to collect information supplied by the plaintiff and defendant and make decision about the hypothesis validity

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him or her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.

In statistics, we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage       Chances of sending an innocent man to jail
20% chance       1 in 5
5% chance        1 in 20
1% chance        1 in 100
0.01% chance     1 in 10,000
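The "chances of getting it wrong" can be checked by simulation. The sketch below is illustrative only: it assumes a two-sided z-test on samples drawn from a population where the null hypothesis really is true (mean 0, standard deviation 1), and the function name, sample size, and number of trials are choices made for this example rather than anything taken from the slides.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=25, trials=10000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        # Draw a sample for which H0 (mean = 0, sd = 1) really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # z = (xbar - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

for alpha in (0.20, 0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha:.2f}: rejected a true H0 in {rate:.3f} of trials "
          f"(about 1 in {round(1 / alpha)})")
```

The simulated rejection rate should land close to the chosen α, which is what "a 1 in 20 chance of getting it wrong" means for α = 5%.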

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β.
- Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
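These trade-offs can be seen numerically. The following is a minimal sketch under stated assumptions: a one-sided z-test of H0: mean = 0 against a specific alternative, with a known standard deviation. The function name `beta_one_sided_z` and the effect size of 0.5 are hypothetical choices for illustration.

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """P(Type II error) for a one-sided z-test of H0: mu = 0 vs HA: mu = effect > 0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    shift = effect * n ** 0.5 / sigma        # where z is centered when HA is true
    return std_normal.cdf(z_crit - shift)    # chance z falls short of the cutoff

# Fixed alpha, larger n: beta goes down.
print(beta_one_sided_z(alpha=0.05, n=25, effect=0.5))
print(beta_one_sided_z(alpha=0.05, n=50, effect=0.5))
# Fixed n, smaller alpha: beta goes up.
print(beta_one_sided_z(alpha=0.01, n=25, effect=0.5))
```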

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
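The three probability sampling strategies can be sketched in a few lines. Everything below is hypothetical: the sampling frame of 100 patients, the "clinic" and "sex" fields, and the function names are invented for illustration, and a real study would draw from an actual sampling frame.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 patients spread across 5 clinics.
population = [{"id": i, "clinic": i % 5, "sex": "F" if i % 2 else "M"}
              for i in range(100)]

def simple_random_sample(frame, k):
    """Every subject has the same chance k / len(frame) of being selected."""
    return rng.sample(frame, k)

def stratified_sample(frame, key, k_per_stratum):
    """Draw a simple random sample separately inside each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    return [u for group in strata.values() for u in rng.sample(group, k_per_stratum)]

def cluster_sample(frame, key, n_clusters):
    """Randomly pick whole clusters and keep every subject inside them."""
    clusters = sorted({unit[key] for unit in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in frame if u[key] in chosen]

srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "sex", 5)
clus = cluster_sample(population, "clinic", 2)
print(len(srs), len(strat), len(clus))
```

In each case the selection probability of every subject is known in advance, which is what makes these probability samples.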

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:

- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y
- Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe. In this case, you are trying to disprove that treatment X has no effect on outcome Y. You start out with two hypotheses:

- Null Hypothesis (H0): Treatment X has no effect on outcome Y
- Alternative Hypothesis (HA): Treatment X has an effect on outcome Y

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
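One way to make "the probability of the observed outcome assuming chance alone" concrete is a permutation test, sketched below. This is a swapped-in illustration, not a method named in the slides: it assumes that under the null hypothesis the group labels are interchangeable, and the outcome values and the function name `permutation_p_value` are made up for the example.

```python
import random

def permutation_p_value(treated, control, n_permutations=10000, seed=0):
    """Two-sided p-value for a difference in means, assuming chance alone (H0)."""
    rng = random.Random(seed)
    n_treated = len(treated)
    observed = abs(sum(treated) / n_treated - sum(control) / len(control))
    pooled = list(treated) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        new_treated, new_control = pooled[:n_treated], pooled[n_treated:]
        diff = abs(sum(new_treated) / n_treated - sum(new_control) / len(new_control))
        if diff >= observed - 1e-12:  # small tolerance for floating-point noise
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Made-up outcome Y values, for illustration only.
treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.2, 11.8, 12.9, 11.5, 12.4]
print(permutation_p_value(treated, control))
```

The returned fraction is the share of random relabelings that produce a difference at least as large as the one observed, which is the p-value under the chance-alone assumption.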

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                  Reality ->
Decision          H0 is true                        H0 is false
Retain H0         Correct Decision                  Type II Error (β) (.2, .1)
Reject H0         Type I Error (α) (.05, .01)       Correct Decision

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
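A sample size calculation of this kind can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2((z for α/2 + z for β) × σ / effect)². The assumptions are loud ones: equal group sizes, a known common standard deviation, and a two-sided test. The function name and the example effect and σ are hypothetical, and exact t-based calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate subjects per group for a two-sided, two-sample comparison of means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # guards against Type II error
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)  # round up to whole subjects

# Expected effect of 5 units, standard deviation of 10 units (made-up values).
print(n_per_group(effect=5.0, sigma=10.0))             # beta = .2, power = .8
print(n_per_group(effect=5.0, sigma=10.0, beta=0.10))  # beta = .1, power = .9
```

Asking for a smaller β (more power) at the same α and effect size raises the required sample size.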

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
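As a rough illustration of how sample size, variability, and confidence level enter the interval, here is a minimal sketch using the large-sample normal approximation (mean ± z × standard error). The data are made up, the function name is hypothetical, and for small samples a t-based critical value would normally replace z.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Large-sample (normal approximation) confidence interval for a mean."""
    n = len(values)
    center = mean(values)
    standard_error = stdev(values) / n ** 0.5       # shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return center - z * standard_error, center + z * standard_error

# Made-up treatment-minus-control differences, for illustration only.
effects = [2.1, 0.4, 3.3, 1.8, 2.7, 0.9, 1.5, 2.2, 3.0, 1.1,
           2.4, 1.9, 0.7, 2.8, 1.6, 2.0, 3.1, 1.2, 2.5, 1.4]
low95, high95 = mean_confidence_interval(effects)
low99, high99 = mean_confidence_interval(effects, confidence=0.99)
print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

Demanding more confidence widens the interval, while more data or less variable data narrows it.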

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:

- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one tailed
- Otherwise two tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = Sample size - Number of parameters estimated
- Df is n - 1 for a one-sample test of the mean
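The six steps can be walked through on a small example. The sketch below assumes a one-sample t-test of the mean; the measurements and the hypothesized mean of 120 are made up for illustration, and the critical value 2.262 is the standard tabled two-tailed value for α = 0.05 with 9 degrees of freedom.

```python
from statistics import mean, stdev

# Made-up measurements, for illustration only; H0 says the population mean is 120.
sample = [128, 131, 119, 125, 134, 122, 127, 130, 124, 129]
mu_0 = 120

# Steps 1-3: one-sample t-test, alpha = 0.05, two-tailed (no direction in HA).
alpha = 0.05

# Step 4: calculate the test statistic t = (xbar - mu_0) / (s / sqrt(n)).
n = len(sample)
t_stat = (mean(sample) - mu_0) / (stdev(sample) / n ** 0.5)

# Step 5: one parameter (the mean) is estimated, so df = n - 1.
df = n - 1

# Step 6: compare against the tabled two-tailed critical value for
# alpha = 0.05 and df = 9, which standard t tables give as 2.262.
t_critical = 2.262
reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

For a one-tailed test, only the tabled value in step 6 would change, since the whole of α then sits in one tail.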

Common Inferential Stats

T-test - compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA - analysis of variance
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²) - testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
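For the chi-square case, the statistic compares the observed counts in each category with the counts expected if the two variables were unrelated. The following sketch computes the Pearson chi-square statistic by hand; the 2 x 2 table of counts and the function name are hypothetical, and 3.841 is the standard tabled critical value for α = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a table of counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Made-up counts: rows are treatment groups, columns are improved / not improved.
table = [[30, 20],
         [18, 32]]
statistic, df = chi_square_statistic(table)
# Tabled critical value for alpha = 0.05 and df = 1 is 3.841.
print(f"chi-square = {statistic:.2f}, df = {df}, significant: {statistic > 3.841}")
```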

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct test
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Summary

Two types of ANOVA

1. Independent groups - two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication) - either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
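The difference between the two designs shows up in the hand calculation itself. The sketch below computes both t statistics on the same made-up anxiety scores: once treating the two sets of scores as independent groups (pooled variance), and once as paired measurements on the same students. The function names and the data are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean, stdev, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two different sets of individuals."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, n_a + n_b - 2

def paired_t(before, after):
    """t statistic on within-person differences for the same individuals."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5), n - 1

# Made-up anxiety scores, for illustration only.
second_day = [20, 25, 31, 18, 27, 22, 35, 29]
midterm    = [23, 27, 35, 20, 30, 26, 37, 33]

t_ind, df_ind = independent_t(midterm, second_day)  # treats scores as unrelated groups
t_pair, df_pair = paired_t(second_day, midterm)     # each student is their own control
print(f"independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.2f}, df = {df_pair}")
```

With these numbers the paired statistic is much larger, because pairing removes the student-to-student variability from the comparison; which calculation is appropriate depends on how the data were collected, not on which gives the larger t.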


Hypothesis Testing

The main function of the plaintiff in a case is to continuously supply strong evidence to prove the defendant guilty.

The function of the defendant, in contrast, is to defend himself by rejecting the evidence provided by the plaintiff.

The judge's role is to collect the information supplied by the plaintiff and the defendant and make a decision about the validity of the hypothesis.


Page 94: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis TestingHypothesis TestingTo perform a hypothesis test we start with two mutually exclusive hypotheses Herersquos an example when someone is accused of a crime we put them on trial to determine their innocence or guilt In this classic case the two possibilities are the defendant is not guilty (innocent of the crime) or the defendant is guilty This is classically written as

H0 Defendant is Innocent larr Null HypothesisHA Defendant is Guilty larr Alternate Hypothesis

Hypothesis TestingHypothesis TestingHypothesis Testing Unfortunately our justice systems are not perfect At times we let the guilty go free and put the innocent in jail The conclusion drawn can be different from the truth and in these cases we have made an error The table below has all four possibilities Note that the columns represent the ldquoTrue State of Naturerdquo and reflect if the person is truly innocent or guilty The rows represent the conclusion drawn by the judge or jury

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:

- Simple random sampling
- Stratified sampling
- Cluster sampling
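The three probability sampling schemes differ only in how the random draw is organised. The sketch below is illustrative: the population of 12 patients in 3 clinics and all function names are hypothetical, and it uses only Python's standard `random` module.

```python
import random

rng = random.Random(42)

# Hypothetical population: 12 patients spread evenly over clinics A, B and C.
population = [{"id": i, "clinic": "ABC"[i % 3]} for i in range(12)]

def simple_random_sample(units, k):
    """Every unit has the same chance of being selected."""
    return rng.sample(units, k)

def stratified_sample(units, key, k_per_stratum):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    return [u for members in strata.values()
            for u in rng.sample(members, k_per_stratum)]

def cluster_sample(units, key, n_clusters):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in units if u[key] in chosen]

print(simple_random_sample(population, 4))
print(stratified_sample(population, "clinic", 2))
print(cluster_sample(population, "clinic", 1))
```

In each scheme the probability that a given patient enters the sample is known in advance, which is what makes the result a probability sample.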

Sampling strategies

- Probability samples can be made to be representative of a population.
- Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

- Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.
- We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.
- At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.
- All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
  - Chance
  - Real effect of treatment X
- The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.
- All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.
- If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

- Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.
- Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

- The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.
- In this case, you are trying to disprove that treatment X has no effect on outcome Y.
- You start out with two hypotheses:
  - Null Hypothesis (H0): Treatment X has no effect on outcome Y.
  - Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.
- If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.
- Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that a difference at least as large as the observed one would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

- The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?
- This is only meaningful if chance is the only competing plausible explanation.
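One way to see what "assuming chance alone" means is a permutation test: if treatment X has no effect, the group labels are arbitrary, so shuffling them shows how large a difference chance produces by itself. This is a minimal sketch, not the method used in the slides; the outcome values and the function name `permutation_p_value` are made up for illustration.

```python
import random
from statistics import fmean

def permutation_p_value(treated, control, n_perm=5000, seed=1):
    """Two-sided p-value for a difference in means, by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(fmean(treated) - fmean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Under H0 any split of the pooled outcomes is equally likely.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:k]) - fmean(pooled[k:]))
        if diff >= observed:
            extreme += 1
    # Count the observed labelling itself so the estimate is never zero.
    return (extreme + 1) / (n_perm + 1)

treated = [5.1, 6.0, 5.8, 6.4, 5.9, 6.2]   # hypothetical outcome Y with X
control = [4.2, 4.9, 5.0, 4.4, 4.7, 5.2]   # hypothetical outcome Y without X
print(f"p-value: {permutation_p_value(treated, control):.4f}")
```

A small p-value here says only that chance alone rarely produces a difference this large. It says nothing about other explanations, which is why the design of the study has to rule those out first.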

Hypothesis testing

- If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.
- How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                       H0 is false
Retain H0        Correct Decision                 Type II Error (β) (.2, .1)
Reject H0        Type I Error (α) (.05, .01)      Correct Decision

Hypothesis testing

- Power analysis is used to try to minimize Type II errors.
- Power (1-β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.
- Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
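A sample size calculation of this kind can be sketched with the standard normal-approximation formula for comparing two group means. The formula choice, the function name `n_per_group` and the example effect size are assumptions for illustration, not values taken from the slides; dedicated power-analysis software would normally refine the result with the t distribution.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate subjects needed per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z(1 - beta)        # quantile matching the desired power 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = .05, beta = .2.
print(n_per_group(effect=0.5, sigma=1.0))             # 63 per group
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))  # more subjects for more power
```

Halving the expected effect roughly quadruples the required sample, so the effect specified up front drives the whole calculation.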

Point estimation

- Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.
- Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.
- If you want an estimate of the actual effect, you need confidence intervals.
- Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.
- The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.
- The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
- Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.
- Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.
- So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
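As a rough illustration of how such an interval is built, the sketch below computes a confidence interval for a difference in group means. It assumes a large-sample normal approximation, which is a simplification: with samples this small a t-based interval would be somewhat wider. The data and the function name `mean_difference_ci` are hypothetical.

```python
from statistics import NormalDist, fmean, stdev

def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Normal-approximation confidence interval for mean(a) - mean(b)."""
    diff = fmean(group_a) - fmean(group_b)
    # Standard error of the difference between two independent means.
    se = (stdev(group_a) ** 2 / len(group_a)
          + stdev(group_b) ** 2 / len(group_b)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se

treated = [5.1, 6.0, 5.8, 6.4, 5.9, 6.2]   # hypothetical outcome Y with X
control = [4.2, 4.9, 5.0, 4.4, 4.7, 5.2]   # hypothetical outcome Y without X

low, high = mean_difference_ci(treated, control)
print(f"95% CI for the effect: ({low:.2f}, {high:.2f})")
print("zero inside interval:", low <= 0 <= high)
```

The interval answers the hypothesis-test question (is zero a plausible effect?) and also shows the range of effect sizes the data support.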

Choosing the right test

- Typically one is interested in comparing group means.
- If the outcome is continuous and there is one independent variable:
  - Two groups: t-test
  - Three or more groups: ANOVA
- If the outcome is continuous and there is more than one independent variable:
  - ANOVA if all independent variables are categorical
  - ANCOVA or multiple linear regression if some independent variables are continuous
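The selection rules above amount to a small decision table. The helper below simply encodes them for a continuous outcome; the function name `suggest_test` and its parameters are hypothetical, and it covers only the cases listed here.

```python
def suggest_test(n_independent_vars, n_groups=None, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("one independent variable needs at least two groups")
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if any_continuous_iv:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(suggest_test(1, n_groups=2))               # t-test
print(suggest_test(1, n_groups=4))               # ANOVA
print(suggest_test(2))                           # ANOVA
print(suggest_test(3, any_continuous_iv=True))   # ANCOVA or multiple linear regression
```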

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

- If the alternative hypothesis specifies the direction of the test, then one-tailed
- Otherwise, two-tailed (most cases)

5. Determine Degrees of Freedom

- Number of components that are free to vary about a parameter
- Df = sample size - number of parameters estimated
- Df is n-1 for a one-sample test of the mean
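The six steps can be walked through on a small one-sample t-test. This is a sketch with invented data: the scores, the hypothesised mean of 50 and the function name `one_sample_t` are assumptions for illustration. The critical value 2.262 is the standard tabled two-tailed value for α = .05 with 9 degrees of freedom.

```python
from statistics import fmean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    t_stat = (fmean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    return t_stat, n - 1  # one parameter (the mean) is estimated, so df = n - 1

# Step 1: one-sample t-test.  Step 2: alpha = .05.  Step 3: two-tailed.
scores = [52, 55, 49, 58, 54, 51, 57, 53, 56, 55]  # hypothetical data

# Steps 4 and 5: test statistic and degrees of freedom.
t_stat, df = one_sample_t(scores, mu0=50)

# Step 6: compare against the tabled value for alpha = .05, two-tailed, df = 9.
T_CRITICAL_DF9 = 2.262
reject = abs(t_stat) > T_CRITICAL_DF9
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

Statistical software replaces the table lookup in step 6 with an exact p-value, but the comparison it performs is the same.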

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
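For counts in categories, the chi-square statistic compares each observed count with the count expected if the two variables were unrelated. The sketch below is illustrative: the 2 x 2 table is invented, the function name `chi_square` is hypothetical, and no continuity correction is applied. The value 3.841 is the standard tabled critical value for α = .05 with 1 degree of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows = treatment / control, columns = improved / not improved.
counts = [[30, 10],
          [20, 40]]
stat, df = chi_square(counts)
print(f"chi-square = {stat:.3f}, df = {df}, exceeds 3.841: {stat > 3.841}")
```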

Summary

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics, chosen by checking their assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
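The difference between the two designs shows up directly in the hand calculation. The sketch below computes both t statistics on the same invented anxiety scores; the data and function names are hypothetical. Treating paired scores as independent ignores the fact that each student serves as his or her own control.

```python
from statistics import fmean, stdev

def independent_t(a, b):
    """Pooled-variance t statistic and df for two independent groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2
                  + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (fmean(a) - fmean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

def paired_t(before, after):
    """t statistic and df for paired samples, based on within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = fmean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1

# Hypothetical anxiety scores for the same five students on two occasions.
second_day = [20, 25, 30, 35, 40]
midterm = [23, 27, 34, 38, 43]

t_ind, df_ind = independent_t(midterm, second_day)
t_pair, df_pair = paired_t(second_day, midterm)
print(f"independent samples: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired samples:      t = {t_pair:.3f}, df = {df_pair}")
```

With these numbers the paired analysis gives a much larger t statistic, because differences between students are removed before the comparison is made.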


Hypothesis Testing

Unfortunately, our justice systems are not perfect. At times we let the guilty go free and put the innocent in jail. The conclusion drawn can be different from the truth, and in these cases we have made an error. The table below has all four possibilities. Note that the columns represent the "True State of Nature" and reflect whether the person is truly innocent or guilty. The rows represent the conclusion drawn by the judge or jury.

Type I and Type II Errors

Alpha and Beta

                                      NULL HYPOTHESIS
                                      TRUE                   FALSE
Reject the null hypothesis            Type I error (α)       CORRECT
Fail to reject the null hypothesis    CORRECT                Type II error (β)

Type I error (α): rejecting a true null hypothesis.
Type II error (β): failing to reject a false null hypothesis.

Type I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that they are innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that they are guilty and we conclude they are guilty, again no error. However, the other two possibilities result in an error.

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error amounts to overreacting and a Type II error amounts to underreacting.



Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II Errors

Alpha and BetaAlpha and Beta

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
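As a rough illustration of how sample size, variability, and confidence level drive the width of an interval, the sketch below computes a confidence interval for a difference in two group means. It uses a normal critical value for simplicity (a t critical value would be more appropriate for samples this small); the function name and the data are assumptions made for this example.

```python
import math
from statistics import NormalDist, mean, variance

def mean_difference_ci(treated, control, confidence=0.95):
    """Approximate confidence interval for mean(treated) - mean(control).

    Hypothetical helper; uses a normal critical value, not a t value.
    """
    diff = mean(treated) - mean(control)
    # The standard error shrinks as the sample sizes grow and grows with variability.
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se

# Made-up outcome scores for illustration only.
treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
low, high = mean_difference_ci(treated, control)
print(f"95% CI for the effect: ({low:.2f}, {high:.2f})")
```

If zero lies outside the interval, the corresponding two-sided test at the matching significance level would reject the null hypothesis.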

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of values the effect could plausibly take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
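The selection rules on the two slides above can be written down as a small lookup. The sketch below only restates those rules for a continuous outcome; the function name and parameters are hypothetical.

```python
def choose_test(n_groups, n_independent_vars=1, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above.

    Hypothetical helper; it does not check any statistical assumptions.
    """
    if n_groups < 2 or n_independent_vars < 1:
        raise ValueError("need at least two groups and one independent variable")
    if n_independent_vars == 1:
        return "t-test" if n_groups == 2 else "ANOVA"
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(n_groups=2))
print(choose_test(n_groups=4))
print(choose_test(n_groups=3, n_independent_vars=2, any_continuous_predictor=True))
```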

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one-tail or two-tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed

Otherwise, two tailed (most cases)

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size − Number of parameters estimated

Df is n − 1 for a one-sample test of the mean
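The six steps can be traced with a one-sample t-test of a mean. In the sketch below, the data, the hypothesized mean of 5.0, and the helper name are made up for illustration; the critical value 2.262 is the standard tabled two-tailed value for α = .05 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    t = (mean(sample) - mu0) / standard_error   # step 4: test statistic
    return t, n - 1                             # step 5: df = n - 1

# Steps 1-3: one-sample t-test, alpha = .05, two-tailed (illustrative choices).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5, 5.9, 5.2]
t_stat, df = one_sample_t(sample, mu0=5.0)

# Step 6: compare against the tabled two-tailed value for alpha = .05, df = 9.
T_CRITICAL = 2.262
reject_null = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_null}")
```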

Common Inferential Stats

T-test: compares means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
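To show what the chi-square test computes from category counts, here is a minimal sketch of the Pearson chi-square statistic for a contingency table. The function name and the counts are assumptions for illustration; 3.841 is the standard tabled critical value for α = .05 with 1 degree of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Made-up counts: rows are treatment groups, columns are improved / not improved.
counts = [[30, 10],
          [20, 20]]
chi2, df = chi_square_statistic(counts)
print(f"chi-square = {chi2:.3f}, df = {df}, exceeds 3.841: {chi2 > 3.841}")
```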

Common Inferential Stats

Descriptive stats summarize measures of central tendency and variability

Inferential stats determine how likely it is that results based on a sample are the same in the population

Must know the level of measurement of variables to choose the correct statistic

Parametric and non-parametric: two types of statistics requiring analysis of assumptions

Pearson r, t-tests and ANOVA are examples of parametric statistics

Pearson r measures the relationship or association between 2 variables

T-test determines if there is a significant difference between 2 group means

ANOVA determines if there is a significant difference between 3 or more means

χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
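The difference between the two designs shows up in how the t statistic is computed. The sketch below contrasts a pooled-variance independent-samples t statistic with a paired t statistic on the same made-up anxiety scores; the function names and data are assumptions for illustration, not the hand calculation from the slides.

```python
import math
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Pooled-variance t statistic and df for two independent groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def paired_t(first, second):
    """Paired t statistic and df: a one-sample test on within-person differences."""
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Made-up anxiety scores for the same five students on two occasions.
second_day = [20, 25, 30, 35, 40]
midterm = [23, 27, 34, 38, 45]

t_ind, df_ind = independent_t(midterm, second_day)   # ignores the pairing
t_pair, df_pair = paired_t(second_day, midterm)      # uses the pairing
print(f"independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.2f}, df = {df_pair}")
```

With these numbers the paired statistic is much larger because differences between students are removed from the error term; treating paired data as independent discards that information.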


Type I and Type II Errors

Alpha and Beta

Type I and Type II Errors

                                     NULL HYPOTHESIS TRUE             NULL HYPOTHESIS FALSE
Reject the null hypothesis           Type I error (α):                CORRECT
                                     rejecting a true null
                                     hypothesis
Fail to reject the null hypothesis   CORRECT                          Type II error (β):
                                                                      failing to reject a false
                                                                      null hypothesis

Type I and Type II Errors

Two of the four possible outcomes are correct. In a jury trial, if the truth is that the defendant is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the defendant is guilty and we conclude guilty, again there is no error. However, the other two possibilities result in an error.

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
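The "chances of getting it wrong" can be checked by simulation: when the null hypothesis is true, a test run at significance level α should reject in roughly a fraction α of repeated studies. The sketch below assumes a two-sided z-test on normally distributed data with known standard deviation 1; the function name and settings are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, n_trials=5000, seed=1):
    """Fraction of simulated studies that wrongly reject a true null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    false_rejections = 0
    for _ in range(n_trials):
        # Data generated with true mean 0, so the null hypothesis is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)   # z statistic with known sd = 1
        if abs(z) > z_crit:
            false_rejections += 1
    return false_rejections / n_trials

rate = type_i_error_rate(alpha=0.05)
print(f"observed Type I error rate: {rate:.3f} (about 1 in {round(1 / rate)})")
```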

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
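These relationships can be checked numerically. The sketch below computes β for a two-sided one-sample z-test under an assumed true effect; the function name, the effect size of 0.5, and the known standard deviation are assumptions chosen only to illustrate the direction of each change.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, n, effect, sd=1.0):
    """Beta for a two-sided z-test when the true mean differs from H0 by `effect`."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    shift = effect * math.sqrt(n) / sd   # where the test statistic is centred under HA
    # Beta is the probability that the statistic stays inside the acceptance region.
    return standard_normal.cdf(z_crit - shift) - standard_normal.cdf(-z_crit - shift)

for alpha, n in [(0.05, 20), (0.05, 50), (0.01, 20), (0.01, 50)]:
    beta = type_ii_error(alpha, n, effect=0.5)
    print(f"alpha = {alpha:.2f}, n = {n:2d} -> beta = {beta:.3f}")
```

Reading down the output: at a fixed α, the larger n gives the smaller β; at a fixed n, the smaller α gives the larger β.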

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
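The three probability sampling strategies named above can be sketched as follows. The population of 100 subjects, the strata (sex) and the clusters (clinic) are made up for illustration, and the helper names are hypothetical.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly choose whole clusters and keep every subject in them."""
    clusters = defaultdict(list)
    for unit in population:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Made-up population: (subject id, sex, clinic).
population = [(i, "F" if i % 2 else "M", f"clinic{i % 5}") for i in range(100)]
rng = random.Random(42)

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda u: u[1], 5, rng)
clust = cluster_sample(population, lambda u: u[2], 2, rng)
print(len(srs), len(strat), len(clust))
```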

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y

Point estimation: determining what the likely effect of treatment X is on outcome Y

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Hypothesis testing

Null Hypothesis (HO): Treatment X has no effect on outcome Y

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y



Page 99: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Type I and Type II ErrorsType I and Type II Errors

Reject the null

hypothesis

Fail to reject the

null hypothesis

TRUE FALSEType I error

αRejecting a true

null hypothesis

Type II errorβ

Failing to reject a

false null hypothesis

CORRECT

NULL HYPOTHESIS

CORRECT

Type I and Type II ErrorsType I and Type II Errors

Two of the four possible outcomes are correct. If the truth is that the defendant is innocent and the conclusion drawn is innocent, then no error has been made. If the truth is that the defendant is guilty and we conclude guilty, again no error has been made. However, the other two possibilities each result in an error.

A Type I (read "Type one") error occurs when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error occurs when the person is truly guilty but the jury finds them innocent. Many people find the distinction between the two types of error unnecessary at first; perhaps we should just label them both as errors and get on with it.

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

In statistics, we want to quantify the probability of a Type I and of a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error were 20%?

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he or she is guilty.

The formal calculation of the probability of a Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
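To make "chances of getting it wrong" concrete, the sketch below simulates many studies in which the null hypothesis is true by construction and counts how often a test at α = 0.05 rejects it anyway. It is an illustration only: the function name, the two-sided z-test with known σ = 1, the sample size, and the seed are all assumptions, not part of the slides.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects a null hypothesis that is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The null is true here: data come from N(0, 1) and we test "mean = 0".
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # sigma is known to be 1, so z = mean / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: rejecting a true null
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

With α = 0.05 the observed rate should land near 1 in 20, which is the second row of the table.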

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.
For any fixed sample size n, a decrease in α will cause an increase in β.
Conversely, an increase in α will cause a decrease in β.
To decrease both α and β, increase the sample size n.
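The trade-off between α, β, and n can be checked numerically. The sketch below assumes a one-sided z-test of "mean = 0" against a specific positive effect with known σ; the function name and the example numbers are hypothetical. Under these assumptions, β = Φ(z₁₋α − effect·√n / σ).

```python
from statistics import NormalDist

def beta_one_sided_z(alpha, n, effect, sigma=1.0):
    """Type II error probability for a one-sided z-test (assumed design, known sigma)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # reject H0 when the z statistic exceeds this
    shift = effect * n ** 0.5 / sigma       # mean of the z statistic when the effect is real
    return std_normal.cdf(z_crit - shift)   # chance the statistic still falls below z_crit

for alpha, n in [(0.05, 25), (0.05, 50), (0.01, 25)]:
    beta = beta_one_sided_z(alpha, n, effect=0.3)
    print(f"alpha={alpha}, n={n}: beta={beta:.3f}")
```

Comparing the printed rows shows the pattern stated above: a larger n lowers β at fixed α, and a smaller α raises β at fixed n.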

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?
How will you select a sample that is representative of that population?
There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.
Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
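The three probability sampling strategies named above can be sketched with Python's standard `random` module. The subject IDs, the two "clinic" strata, and the clusters of ten are made-up illustration data, and the helper names are hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum (dict: name -> subjects)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep everyone in the chosen clusters."""
    chosen = rng.sample(clusters, n_clusters)
    return [subject for cluster in chosen for subject in cluster]

rng = random.Random(42)
population = list(range(100))  # hypothetical subject IDs
strata = {"clinic_a": population[:60], "clinic_b": population[60:]}
clusters = [population[i:i + 10] for i in range(0, 100, 10)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), len(strat), len(clus))
```

In each case the selection probability of a subject is known in advance, which is what makes these probability samples.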

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.
Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
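One way to see what "assuming chance alone" means is an exact permutation test: if the treatment did nothing, the group labels are arbitrary, so every way of splitting the pooled values into two groups was equally likely. The sketch below is an illustration under that assumption, not the test used in the slides; the function name and the tiny data set are hypothetical.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means under random relabelling."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so that ties with the observed difference are counted.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total

treated = [4, 5, 6]   # hypothetical outcome values
control = [1, 2, 3]
p = exact_permutation_p_value(treated, control)
print(p)  # 2 of the 20 possible splits are this extreme, so p = 0.1
```

Even with perfectly separated groups, three subjects per group cannot give a p-value below 0.1, which is one reason sample size matters.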

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true             Reality: H0 is false
Retain H0     Correct decision                Type II error (β) (.2, .1)
Reject H0     Type I error (α) (.05, .01)     Correct decision

Power analysis is used to try to minimize Type II errors.

Power (1-β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
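A power calculation of this kind can be sketched with the normal approximation for comparing two group means: n per group = 2·((z₁₋α/₂ + z₁₋β)·σ / effect)². The formula choice, the function name, and the example effect size are assumptions for illustration; dedicated software typically uses the t distribution and gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate sample size per group for a two-sided, two-group comparison of means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls the Type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # controls the Type II error
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = .05, beta = .2 (power = .8).
print(n_per_group(effect=0.5, sigma=1.0))  # 63
```

Asking for a smaller β (more power) or expecting a smaller effect both push the required sample size up.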

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
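The three ingredients of interval width (sample size, variability, confidence level) all appear in the usual formula mean ± z·s/√n. The sketch below uses the normal approximation rather than the t distribution, which is an assumption made to stay within the standard library; the function name and the data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for a mean."""
    n = len(data)
    center = mean(data)
    standard_error = stdev(data) / n ** 0.5             # variability and sample size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # degree of confidence
    return center - z * standard_error, center + z * standard_error

effects = [2.1, 3.4, 1.8, 2.9, 3.1, 2.5, 2.2, 3.0, 2.7, 2.6]  # made-up effect values
low_95, high_95 = mean_confidence_interval(effects)
low_99, high_99 = mean_confidence_interval(effects, confidence=0.99)
print(f"95% CI: ({low_95:.2f}, {high_95:.2f})")
print(f"99% CI: ({low_99:.2f}, {high_99:.2f})")
```

Asking for more confidence widens the interval, while more data or less variable data narrows it. For small samples, a t-based multiplier would be somewhat larger than z.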

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
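The decision rules above are simple enough to write as a lookup. The sketch below only encodes the rules as listed for a continuous outcome; the function name and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules listed above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, n_groups must be 2 or more")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))               # t-test
print(choose_test(1, n_groups=4))               # ANOVA
print(choose_test(3, all_categorical=False))    # ANCOVA or multiple linear regression
```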

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the effect, then one-tailed.
Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.
Df = Sample size - Number of parameters estimated
Df is n-1 for a one-sample test of the mean.
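The six steps can be walked through for a one-sample t-test of a mean. The sketch below is a worked illustration with made-up scores and a hypothesized mean of 5.0; the small lookup table holds standard two-tailed critical values of t at α = .05, and the names are hypothetical.

```python
from statistics import mean, stdev

# Step 2 and 3: alpha = .05, two-tailed. Tabled critical values of t for a few df.
T_CRITICAL_05_TWO_TAILED = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}

def one_sample_t(sample, mu0):
    """Steps 4 and 5: compute the t statistic and its degrees of freedom."""
    n = len(sample)
    df = n - 1  # one parameter (the mean) is estimated from the sample
    t = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    return t, df

scores = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 5.2, 5.9]  # hypothetical data
t_stat, df = one_sample_t(scores, mu0=5.0)

# Step 6: compare the computed statistic against the tabled value.
reject = abs(t_stat) > T_CRITICAL_05_TWO_TAILED[df]
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```

Here the sample mean is 5.5, the statistic is about 4.33 on 9 degrees of freedom, and it exceeds the tabled 2.262, so the null hypothesis is rejected.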

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
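For the chi-square test, the statistic compares observed counts in each category with the counts expected if the two variables were unrelated. The sketch below computes it by hand for a contingency table; the function name and the 2×2 counts are hypothetical, and 3.841 is the standard tabled critical value for α = .05 with 1 degree of freedom.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

counts = [[10, 20],
          [20, 10]]  # made-up counts: rows = group, columns = outcome category
chi2, df = chi_square_statistic(counts)
CHI2_CRITICAL_05_DF1 = 3.841
print(f"chi-square = {chi2:.2f}, df = {df}, significant: {chi2 > CHI2_CRITICAL_05_DF1}")
```

Every expected count here is 15, so the statistic is 4 × 25/15, about 6.67, which exceeds the tabled value.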

Summary

Descriptive stats summarize measures of central tendency and variability.
Inferential stats determine how likely it is that results based on a sample are the same in the population.
Must know the level of measurement of variables to choose the correct statistic.
Parametric and non-parametric: two types of statistics requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
The t-test determines if there is a significant difference between 2 group means.
ANOVA determines if there is a significant difference between 3 or more means.
χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
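The difference between the two designs shows up directly in the hand calculation. The sketch below computes both t statistics on the same made-up anxiety scores: the paired version works on each student's change, while the independent version (pooled variance, an assumption here) treats the two sets of scores as unrelated groups. All names and numbers are hypothetical.

```python
from statistics import mean, stdev, variance

def paired_t(first, second):
    """t statistic and df for paired samples, based on within-person differences."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5), n - 1

def independent_t(group_1, group_2):
    """t statistic and df for independent samples, using a pooled variance."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    standard_error = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (mean(group_2) - mean(group_1)) / standard_error, n1 + n2 - 2

second_day = [60, 62, 65, 70, 58]  # made-up anxiety scores, same five students
midterm = [64, 65, 70, 72, 63]

t_paired, df_paired = paired_t(second_day, midterm)
t_indep, df_indep = independent_t(second_day, midterm)
print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}, df = {df_indep}")
```

On these numbers the paired statistic is about 6.52 on 4 degrees of freedom, while the independent statistic is only about 1.38 on 8, because pairing removes the large differences between students from the error term. Using the independent-samples formula on paired data would therefore understate the evidence in a case like this one.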


Page 101: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Type I and Type II ErrorsType I and Type II ErrorsTwo of the four possible outcomes are correct If the truth is they are innocent and the conclusion drawn is innocent then no error has been made If the truth is they are guilty and we conclude they are guilty again no error However the other two possibilities result in an error

Type I and Type II ErrorsType I and Type II ErrorsA Type I (read ldquoType onerdquo) error is when the person is truly innocent but the jury finds them guilty A Type II (read ldquoType twordquo) error is when a person is truly guilty but the jury finds himher innocent Many people find the distinction between the types of errors as unnecessary at first perhaps we should just label them both as errors and get on with it

Type I and Type II ErrorsType I and Type II ErrorsHowever the distinction between the two types is extremely important When we commit a Type I error we put an innocent person in jail When we commit a Type II error we let a guilty person go free Which error is worse

Type I and Type II ErrorsType I and Type II ErrorsThe generally accepted position of society is that a Type I Error or putting an innocent person in jail is far worse than a Type II error or letting a guilty person go free In fact the burden of proof in criminal cases is established as ldquoBeyond reasonable doubtrdquo Another way to look at Type I vs Type II errors is that a Type I error is the probability of overreacting and a Type II error is the probability of under reacting

Type I and Type II ErrorsType I and Type II ErrorsIn statistics we want to quantify the probability of a Type I and Type II error The probability of a Type I Error is α (Greek letter ldquoalphardquo) and the probability of a Type II error is β (Greek letter ldquobetardquo) Without slipping too far into the world of theoretical statistics and Greek letters letrsquos simplify this a bit What if the probability of committing a Type I error was 20

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testing

P-value: the probability of the observed outcome, or one more extreme, assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen an outcome at least as extreme as the one observed?

This is only meaningful if chance is the only competing plausible explanation.
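One way to make the p-value definition concrete is a permutation test: if chance alone created the difference, the group labels are interchangeable, so we can reshuffle them many times and count how often a difference at least as large as the observed one appears. The sketch below is only an illustration; the function name, the data values and the number of permutations are assumptions, not part of the original slides.

```python
import random

def permutation_p_value(treated, control, n_permutations=10000, seed=0):
    """Estimate a two-sided p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(sum(treated) / len(treated) - sum(control) / len(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels carry no information,
        # so any reshuffling of the pooled values is equally likely.
        rng.shuffle(pooled)
        mean_a = sum(pooled[:n_treated]) / n_treated
        mean_b = sum(pooled[n_treated:]) / (len(pooled) - n_treated)
        # Small tolerance guards against floating-point rounding.
        if abs(mean_a - mean_b) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical outcome Y for a treated group and a control group.
treated = [12.1, 13.4, 11.8, 14.2, 13.9, 12.7]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
p = permutation_p_value(treated, control)
print(f"estimated p-value: {p:.4f}")
```

A small value of p says only that chance is an unlikely explanation; it supports a treatment effect only when the design has removed the other competing explanations.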

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision       Reality: H0 is true              Reality: H0 is false
Retain H0      Correct decision                 Type II error (β) (0.2, 0.1)
Reject H0      Type I error (α) (0.05, 0.01)    Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
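As a rough illustration of such a power calculation, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2 * ((z for α/2 + z for β) * sd / effect)^2. The formula choice, the function name and the example numbers are assumptions made for illustration; dedicated power software based on the t distribution will usually give a slightly larger n.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect, sd, alpha=0.05, beta=0.20):
    """Approximate n per group for a two-sided, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(1 - beta)        # quantile matching power 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)  # round up: a fraction of a subject is not possible

# Hypothetical planning values: expected effect of 5 units, standard deviation 10.
n = sample_size_per_group(effect=5.0, sd=10.0)
print(n)  # 63 per group under these assumptions
```

Asking for a smaller β (more power) or expecting a smaller effect both push the required sample size up.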

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
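The three ingredients named above (sample size, variability, degree of confidence) all appear in the usual interval formula. The following sketch computes an approximate interval for a difference in two group means; it uses a normal (z) approximation, which is an assumption that is only reasonable for larger samples, and the function name and data are hypothetical.

```python
from statistics import NormalDist, mean, variance

def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Approximate confidence interval for mean(group_a) - mean(group_b)."""
    diff = mean(group_a) - mean(group_b)
    # Standard error shrinks as sample size grows and grows with variability.
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    # A higher degree of confidence gives a larger multiplier, hence a wider interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se

# Hypothetical outcome Y for a treated group and a control group.
treated = [12.1, 13.4, 11.8, 14.2, 13.9, 12.7]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
low95, high95 = mean_difference_ci(treated, control, confidence=0.95)
low99, high99 = mean_difference_ci(treated, control, confidence=0.99)
print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

If zero lies outside the interval, the corresponding two-sided test would reject the null hypothesis at the matching significance level.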

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
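The selection rules above for a continuous outcome can be written down as a small lookup. This is only a sketch of the rules as stated on these slides; the function name and its parameters are hypothetical, and real test selection also depends on checking each test's assumptions.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs at least two groups")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```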

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = Sample size - Number of parameters estimated

Df is n - 1 for a one-sample test of the mean.
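The six steps can be walked through in a few lines for a one-sample t-test of a mean. The sketch below is a worked illustration only: the sample values and the hypothesized mean are made up, and the tabled value 2.262 is the standard two-tailed critical t for 9 degrees of freedom at α = 0.05.

```python
from statistics import mean, stdev

# Hypothetical sample of 10 measurements; H0 says the population mean is 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.4, 5.9, 5.5]
mu0 = 5.0

# Step 1: one sample, continuous outcome -> one-sample t-test.
# Step 2: level of significance.
alpha = 0.05
# Step 3: HA gives no direction (mean differs from 5.0) -> two-tailed test.
# Step 4: test statistic = (sample mean - hypothesized mean) / standard error.
n = len(sample)
t_stat = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
# Step 5: one parameter (the mean) is estimated, so df = n - 1.
df = n - 1
# Step 6: compare with the tabled two-tailed value for df = 9, alpha = 0.05.
t_critical = 2.262
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_null}")
```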

Common Inferential Stats

T-Test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
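For counts of cases in categories, the chi-square statistic compares each observed count with the count expected if the two categorical variables were unrelated. A minimal sketch follows; the function name and the 2 x 2 table of counts are hypothetical, and 3.841 is the standard tabled critical value for 1 degree of freedom at α = 0.05.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = treated / control, columns = improved / not improved.
table = [[30, 10],
         [20, 40]]
chi2, df = chi_square_statistic(table)
print(f"chi-square = {chi2:.2f}, df = {df}, exceeds 3.841: {chi2 > 3.841}")
```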

Summary

Descriptive stats summarize measures of central tendency and variability.

Inferential stats determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct statistic.

Parametric and non-parametric are two types of statistics; choosing between them requires analysis of assumptions.
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
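The difference between the two designs shows up directly in the hand calculation. The sketch below computes both t statistics on the same made-up anxiety scores: the independent-samples version (written here in the unequal-variance form, which is an assumption) compares the two group means against the between-person spread, while the paired version works on each person's change. All names and numbers are hypothetical.

```python
from statistics import mean, stdev, variance

# Hypothetical anxiety scores for the same five students on two occasions.
second_day = [20, 22, 19, 24, 25]
midterm_day = [23, 25, 21, 27, 29]

def independent_t(a, b):
    """t statistic treating a and b as two unrelated groups."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(b) - mean(a)) / se

def paired_t(before, after):
    """t statistic on the within-person differences (df = n - 1)."""
    diffs = [y - x for x, y in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / len(diffs) ** 0.5)

t_independent = independent_t(second_day, midterm_day)
t_paired = paired_t(second_day, midterm_day)
print(f"independent samples t = {t_independent:.2f}")
print(f"paired samples t = {t_paired:.2f}")
```

With these numbers the paired statistic is much larger, because pairing removes the person-to-person variability from the comparison. Applying the independent-samples test to paired data would waste that information.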

Type I and Type II Errors

A Type I (read "Type one") error is when the person is truly innocent but the jury finds them guilty. A Type II (read "Type two") error is when a person is truly guilty but the jury finds him/her innocent. Many people find the distinction between the types of errors unnecessary at first; perhaps we should just label them both as errors and get on with it.

Type I and Type II Errors

However, the distinction between the two types is extremely important. When we commit a Type I error, we put an innocent person in jail. When we commit a Type II error, we let a guilty person go free. Which error is worse?

Type I and Type II Errors

The generally accepted position of society is that a Type I error, or putting an innocent person in jail, is far worse than a Type II error, or letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics, we want to quantify the probability of a Type I and Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20%, we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage       Chances of sending an innocent man to jail
20% chance       1 in 5
5% chance        1 in 20
1% chance        1 in 100
0.01% chance     1 in 10,000

Controlling Type I and Type II Errors

α, β and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
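These relationships can be checked numerically. The sketch below computes β for a one-sided z-test of a mean with known standard deviation; that particular test, the function name and the effect size of 0.5 standard deviations are all assumptions chosen to keep the arithmetic simple.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sd=1.0):
    """Beta for a one-sided z-test when the true mean is `effect` above H0."""
    z_critical = NormalDist().inv_cdf(1 - alpha)  # rejection threshold under H0
    shift = effect * n ** 0.5 / sd                # where the statistic centers under HA
    # Beta is the chance the statistic stays below the threshold under HA.
    return NormalDist().cdf(z_critical - shift)

beta_base = type_ii_error(alpha=0.05, n=25)
beta_more_n = type_ii_error(alpha=0.05, n=50)       # fixed alpha, larger n
beta_small_alpha = type_ii_error(alpha=0.01, n=25)  # fixed n, smaller alpha

print(f"alpha=0.05, n=25: beta = {beta_base:.3f}")
print(f"alpha=0.05, n=50: beta = {beta_more_n:.3f}")
print(f"alpha=0.01, n=25: beta = {beta_small_alpha:.3f}")
```

Holding α fixed, the larger sample gives a smaller β; holding n fixed, the stricter α gives a larger β.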

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling
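The three probability sampling strategies listed above differ only in what gets drawn at random. A minimal sketch on a made-up population of 40 subjects in four hospital wards follows; the population, the ward labels and the sample sizes are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the draws are reproducible
wards = ["A", "B", "C", "D"]
population = [{"id": i, "ward": wards[i % 4]} for i in range(40)]

# Simple random sampling: every subject has the same chance of selection.
simple = rng.sample(population, 8)

# Stratified sampling: draw separately within each ward (stratum),
# so every ward is guaranteed to be represented.
stratified = []
for ward in wards:
    stratum = [p for p in population if p["ward"] == ward]
    stratified.extend(rng.sample(stratum, 2))

# Cluster sampling: draw whole wards at random, then take everyone in them.
chosen_wards = rng.sample(wards, 2)
cluster = [p for p in population if p["ward"] in chosen_wards]

print(len(simple), len(stratified), len(cluster))
```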

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

Type I and Type II Errors

The generally accepted position of society is that a Type I error, putting an innocent person in jail, is far worse than a Type II error, letting a guilty person go free. In fact, the burden of proof in criminal cases is established as "beyond reasonable doubt". Another way to look at Type I vs. Type II errors is that a Type I error is overreacting and a Type II error is underreacting.

Type I and Type II Errors

In statistics we want to quantify the probability of a Type I and a Type II error. The probability of a Type I error is α (Greek letter "alpha") and the probability of a Type II error is β (Greek letter "beta"). Without slipping too far into the world of theoretical statistics and Greek letters, let's simplify this a bit. What if the probability of committing a Type I error was 20%?

Type I and Type II Errors

A more common way to express this would be that we stand a 20% chance of putting an innocent man in jail. Would this meet your requirement for "beyond reasonable doubt"? At 20% we stand a 1 in 5 chance of committing an error. This is not sufficient evidence, and so we cannot conclude that he/she is guilty.

Type I and Type II Errors

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead.

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and the phrase is easier for us to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
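The "1 in N" column is simply 100 divided by the percentage. A small sketch of that arithmetic follows; the helper name one_in is hypothetical.

```python
def one_in(percent):
    """Convert a percentage chance into the N of '1 in N'."""
    return round(100 / percent)

for pct in (20, 5, 1, 0.01):
    print(f"{pct}% chance = 1 in {one_in(pct):,}")
```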

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

- For any fixed α, an increase in the sample size n will cause a decrease in β.
- For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.
- To decrease both α and β, increase the sample size n.
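The trade-off between α, β, and n can be checked numerically. The sketch below assumes a one-sided z-test of a mean with known standard deviation, which is a simplification chosen for illustration; the function name type_ii_error and the default effect size of 0.5 standard deviations are assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test when the true mean shift is `effect`."""
    z_crit = Z.inv_cdf(1 - alpha)           # rejection cutoff under H0
    shift = effect * math.sqrt(n) / sigma   # how far H1 moves the test statistic
    return Z.cdf(z_crit - shift)            # chance of missing the cutoff

# Fixed alpha: a larger n gives a smaller beta.
print(type_ii_error(0.05, 20), type_ii_error(0.05, 50))
# Fixed n: a smaller alpha gives a larger beta.
print(type_ii_error(0.05, 20), type_ii_error(0.01, 20))
```

Raising n is the only change here that lowers β without raising α, which matches the last point on the slide.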

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

- What is the population?
- How will you select a sample that is representative of that population?
- There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y, and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value: the probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
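One way to make "assuming chance alone" concrete is an exact permutation test: if treatment X has no effect, the group labels are arbitrary, so every way of splitting the pooled scores into two groups is equally likely. The sketch below is one illustration of that idea rather than the test the slides have in mind; the function name permutation_p_value and the data are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        # Difference in means implied by the sum of the first group.
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    as_extreme = 0
    n_splits = 0
    # Enumerate every possible relabelling of the pooled scores.
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        if abs(mean_diff(sum(pooled[i] for i in idx))) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits

# Hypothetical outcome scores, for illustration only.
treatment = [14, 15, 16, 17]
control = [10, 11, 12, 13]
print(permutation_p_value(treatment, control))
```

The returned value is the fraction of relabellings that produce a difference at least as large as the observed one, which is exactly the p-value definition given above. Full enumeration is only practical for small groups.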

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely under chance alone, we will reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                       H0 is false
Retain H0        Correct Decision                 Type II Error (β) (0.2, 0.1)
Reject H0        Type I Error (α) (0.05, 0.01)    Correct Decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
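A rough version of that sample-size calculation is sketched below. It assumes a two-sided comparison of two group means with equal group sizes, a common known standard deviation, and the normal approximation n = 2((z_{1-α/2} + z_{1-β})σ/δ)² per group; a t-based calculation would give a slightly larger n. The function name sample_size_per_group and the example effect size are assumptions.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect, sigma, alpha=0.05, beta=0.2):
    """Approximate n per group to detect a mean difference of `effect`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_beta = z.inv_cdf(1 - beta)        # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation, alpha = 0.05, beta = 0.2.
print(sample_size_per_group(effect=0.5, sigma=1.0))
# Asking for a smaller beta (more power) requires more subjects.
print(sample_size_per_group(effect=0.5, sigma=1.0, beta=0.1))
```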

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
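The three ingredients named above (sample size, variability, confidence level) all appear in the interval formula. The sketch below uses a normal approximation for the difference between two group means; with samples this small a t critical value would normally be used instead, so treat it as an illustration only. The function name mean_difference_ci and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def mean_difference_ci(a, b, confidence=0.95):
    """Normal-approximation confidence interval for mean(a) - mean(b)."""
    diff = mean(a) - mean(b)
    # Standard error grows with variability and shrinks with sample size.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Higher confidence means a larger critical value and a wider interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se

# Hypothetical outcome scores, for illustration only.
treated = [12, 15, 11, 14, 13]
control = [10, 9, 12, 11, 8]
print(mean_difference_ci(treated, control))
print(mean_difference_ci(treated, control, confidence=0.99))
```

If the printed interval excludes zero, the corresponding two-sided test at the same level would reject the hypothesis of no effect.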

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA
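For the three-or-more-groups case, the one-way ANOVA F statistic can be computed by hand as the between-group mean square divided by the within-group mean square. A minimal sketch follows; the function name one_way_anova_f and the data are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups, for illustration only.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]]))
```

The F value is then compared against a tabled F value for the two degrees of freedom at the chosen significance level.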

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
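The selection rules on these two slides can be written as a small lookup. This sketch only encodes the rules stated above for a continuous outcome; the function name choose_test and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=2, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the slide rules."""
    if n_independent_vars == 1:
        # One independent variable: the choice depends on the number of groups.
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if any_continuous_iv:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, any_continuous_iv=True))
```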

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one-tailed.

Otherwise, two-tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated

df is n - 1 for a one-sample test of the mean.
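The six steps can be traced through a one-sample test of a mean, where df = n - 1. This is a sketch under stated assumptions: the scores and the hypothesized mean are made up, the function name one_sample_t is hypothetical, and the tabled critical value (2.776 for df = 4, two-tailed, α = 0.05) is typed in by hand from a standard t table.

```python
import math
from statistics import mean, stdev

def one_sample_t(scores, mu0):
    """Steps 4 and 5: test statistic and degrees of freedom."""
    n = len(scores)
    t_stat = (mean(scores) - mu0) / (stdev(scores) / math.sqrt(n))
    return t_stat, n - 1  # one parameter (the mean) is estimated

# Step 1: one-sample t-test. Step 2: alpha = 0.05. Step 3: two-tailed.
scores = [13, 14, 12, 15, 14]  # hypothetical data
mu0 = 12                       # hypothetical null-hypothesis mean
t_stat, df = one_sample_t(scores, mu0)

# Step 6: compare against the tabled value for this df.
T_CRITICAL_05_TWO_TAILED = {4: 2.776}
reject_null = abs(t_stat) > T_CRITICAL_05_TWO_TAILED[df]
print(t_stat, df, reject_null)
```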

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
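For the chi-square case, the statistic compares the observed count in each category with the count expected if the two classifications were unrelated. A minimal sketch for a contingency table follows; the function name chi_square and the counts are hypothetical, and the critical value 3.841 (df = 1, α = 0.05) is taken from a standard chi-square table.

```python
def chi_square(table):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories are independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical 2 x 2 table of counts, for illustration only.
stat, df = chi_square([[20, 10], [10, 20]])
print(stat, df, stat > 3.841)
```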

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct statistic.
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables.

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Page 106: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Type I and Type II ErrorsType I and Type II ErrorsA more common way to express this would be that we stand a 20 chance of putting an innocent man in jail Would this meet your requirement for ldquobeyond reasonable doubtrdquo At 20 we stand a 1 in 5 chance of committing an error This is not sufficient evidence and so cannot conclude that heshe is guilty

Type I and Type II ErrorsType I and Type II ErrorsThe formal calculation of the probability of Type I error is critical in the field of probability and statistics However the term Probability of Type I Error is not reader-friendly For this reason the phrase Chances of Getting it Wrong is used instead of Probability of Type I Error

Type I and Type II ErrorsType I and Type II ErrorsMost people would agree that putting an innocent person in jail is Getting it Wrong as well as being easier for us to relate to To help you get a better understanding of what this means the table below shows some possible values for getting it wrong

Type I and Type II ErrorsType I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage Chances of sending an innocent man to jail

20 Chance 1 in 5

5 Chance 1 in 20

1 Chance 1 in 100

01 Chance 1 in 10000

Controlling Type I and Type II ErrorsControlling Type I and Type II Errors and n are interrelated If one is kept constant

then an increase in one of the remaining two will cause a decrease in the other

For any fixed an increase in the sample size n will cause a in

For any fixed sample size n a decrease in will cause a in

Conversely an increase in will cause a in

To decrease both and the sample size n

Planning a studyPlanning a studySuppose you were interested in

determining whether treatment X has an effect on outcome Ymdashthere are several issues that need to be addressed so that a sound inference can be made from the study result

Planning a studyPlanning a studyWhat is the populationHow will you select a sample that is

representative of that population There are many ways to produce a sample but

not all of them will lead to sound inference

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Common Inferential StatsCommon Inferential Stats

Descriptive stats summarize measures of central tendency and variability1048708

Inferential determine how likely it is that results based on sample are the same in population1048708

Must know level of measurement of variables to choose correct

Parametric and non-parametric two types of statistics requiring analysis of assumptions

1048708Pearson r t tests and ANOVA examples of parametric 1048708Pearson r measure relationship or association between 2

variables

T test determines if there is a significant difference between 2 group means

ANOVA determines if there is a significant difference between 3 or more means

X2 non-parametric statistic to assess relationship between 2 categorical variables

SummarySummary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
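
The statement that either ANOVA or an independent-samples t-test can be used for two groups can be checked numerically: with two groups and a pooled variance, the ANOVA F statistic equals the square of the t statistic. The sketch below is a hand calculation under that equal-variance assumption; the anxiety scores are invented for illustration.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance, and its df."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


def one_way_f(*groups):
    """One-way ANOVA F statistic with (between, within) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical anxiety scores for the two classes.
sociology = [32, 28, 35, 30, 27, 33]
political_science = [25, 29, 24, 31, 26, 27]

t_stat, t_df = pooled_t(sociology, political_science)
f_stat, df_between, df_within = one_way_f(sociology, political_science)
print(f"t = {t_stat:.4f} (df = {t_df}), t squared = {t_stat ** 2:.4f}")
print(f"F = {f_stat:.4f} (df = {df_between}, {df_within})")
```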

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
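
For the paired design, the hand calculation works on the within-person differences rather than on the two sets of scores separately. The sketch below shows that calculation; the scores are hypothetical and the function name is an illustrative choice.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom (number of pairs - 1)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Each person contributes one difference: second score minus first score.
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical anxiety scores for the same five students on two occasions.
second_day = [22, 25, 19, 30, 27]
midterm_day = [28, 27, 24, 33, 31]

t_stat, df = paired_t(second_day, midterm_day)
print(f"t = {t_stat:.3f} with df = {df}")
```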


Type I and Type II Errors

The formal calculation of the probability of Type I error is critical in the field of probability and statistics. However, the term "Probability of Type I Error" is not reader-friendly. For this reason, the phrase "Chances of Getting it Wrong" is used instead of "Probability of Type I Error".

Type I and Type II Errors

Most people would agree that putting an innocent person in jail is "Getting it Wrong", and it is an example that is easy to relate to. To help you get a better understanding of what this means, the table below shows some possible values for getting it wrong.

Type I and Type II Errors

Chances of Getting it Wrong (Probability of Type I Error)

Percentage    Chances of sending an innocent man to jail
20%           1 in 5
5%            1 in 20
1%            1 in 100
0.01%         1 in 10,000
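
The conversion behind the table is simply the reciprocal of the probability. The sketch below reproduces the rows using exact fractions; it assumes the last row is 0.01%, which is the only percentage consistent with "1 in 10,000".

```python
from fractions import Fraction


def one_in_n(percent: str) -> Fraction:
    """Convert a percentage written as text (e.g. '5' or '0.01') to N in '1 in N'."""
    probability = Fraction(percent) / 100
    return 1 / probability


for pct in ["20", "5", "1", "0.01"]:
    print(f"{pct}% chance = 1 in {one_in_n(pct)}")
```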

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
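
These relationships can be checked numerically. The sketch below computes β for a one-sided z-test, which assumes a known standard deviation and a specified true effect; both values, like the function name, are illustrative assumptions rather than anything given in the slides.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, effect: float, sigma: float) -> float:
    """Beta for a one-sided z-test of 'effect = 0' against 'effect > 0'."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    shift = effect * math.sqrt(n) / sigma   # where the statistic is centred if the effect is real
    power = 1 - std_normal.cdf(z_crit - shift)
    return 1 - power


# Hypothetical true effect of 0.5 units with a standard deviation of 2 units.
for alpha in (0.05, 0.01):
    for n in (25, 50, 100):
        beta = type_ii_error(alpha, n, effect=0.5, sigma=2.0)
        print(f"alpha = {alpha:.2f}, n = {n:3d} -> beta = {beta:.3f}")
```

Reading down the printed rows for a fixed α shows β falling as n grows, and comparing the two α blocks at the same n shows β rising when α is made smaller.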

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

Planning a study

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
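
The three probability sampling strategies named above can be sketched with Python's standard `random` module. The population of 100 numbered subjects, the odd/even strata, and the ten "wards" used as clusters are all hypothetical, chosen only to keep the example small.

```python
import random


def simple_random_sample(population, n, rng):
    """Every subject has the same chance of selection."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample randomly within each one."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every subject inside them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(2023)            # fixed seed so the example is repeatable
population = list(range(1, 101))     # hypothetical subject IDs 1..100
clusters = {f"ward_{k}": population[k * 10:(k + 1) * 10] for k in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda i: "odd" if i % 2 else "even", 5, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(srs), len(strat), len(clus))
```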

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

The reason you want only these two explanations is that, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses.

Hypothesis testing

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

Hypothesis testing

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the observed significance level, or p-value, of the test.

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
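
One way to see what "assuming chance alone" means is a permutation test: if treatment X has no effect, the group labels are arbitrary, so shuffling them shows how often chance produces a difference at least as large as the one observed. This is a swapped-in illustration rather than a method described in the slides, and the outcome values are invented.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Count the observed arrangement itself so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical outcome Y for the experimental and control groups.
treated = [12, 14, 15, 16, 18, 19]
control = [8, 9, 10, 11, 11, 13]
p_value = permutation_p_value(treated, control)
print(f"estimated p-value: {p_value:.4f}")
```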

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the idea that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

                 Reality
Decision         H0 is true                        H0 is false
Retain H0        Correct decision                  Type II error (β) (0.2, 0.1)
Reject H0        Type I error (α) (0.05, 0.01)     Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
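
A rough version of that sample-size calculation, for comparing two group means, is sketched below. It uses the normal approximation n = 2((z for 1 - α/2) + (z for 1 - β))² σ² / δ² per group, which assumes a known common standard deviation σ and an expected difference δ; exact t-based calculations give slightly larger numbers. The function name and inputs are illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(effect: float, sigma: float, alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate subjects needed per group for a two-sided, two-sample comparison."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = std_normal.inv_cdf(1 - beta)        # quantile matching the desired power
    n = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)  # round up to whole subjects


# Hypothetical expected effect of half a standard deviation.
print(n_per_group(effect=0.5, sigma=1.0))            # alpha = 0.05, beta = 0.2
print(n_per_group(effect=0.5, sigma=1.0, beta=0.1))  # asking for more power
```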

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95% and gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
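
The three ingredients named above (sample size, variability, and degree of confidence) all appear in the usual formula, mean ± z × s / √n. The sketch below uses the normal quantile for z, which is an approximation suited to larger samples; for small samples a t quantile would give a somewhat wider interval. The data are made up.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for a mean using the normal quantile."""
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)    # shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical measurements of the effect in eight subjects.
sample = [4.8, 5.1, 5.3, 4.9, 5.6, 5.2, 5.0, 5.4]
low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print(f"95% CI: ({low95:.3f}, {high95:.3f})")
print(f"99% CI: ({low99:.3f}, {high99:.3f})")
```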

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
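
The decision rules on these two slides can be written down as a small lookup function. This is only a restatement of the rules for a continuous outcome; the function and parameter names are hypothetical, and real test selection also depends on assumptions the slides discuss elsewhere.

```python
def choose_test(n_independent_vars: int, n_groups: int = 0, all_categorical: bool = True) -> str:
    """Suggest a test for a continuous outcome, following the slide's rules."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups >= 3:
            return "ANOVA"
        raise ValueError("with one independent variable, at least two groups are needed")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("at least one independent variable is needed")


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```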

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.




Page 110: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Controlling Type I and Type II Errors

α, β, and n are interrelated. If one is kept constant, then an increase in one of the remaining two will cause a decrease in the other.

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β.

Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size n.
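The trade-off between the Type I error rate, the Type II error rate, and n can be checked numerically. The sketch below assumes a two-sided one-sample z-test with known standard deviation, which is a simplification chosen for illustration; the function names and the numbers are hypothetical.

```python
from statistics import NormalDist

def power_two_sided_z(effect, sigma, n, alpha):
    """Power of a two-sided one-sample z-test for a true mean shift `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect * (n ** 0.5) / sigma  # true shift in standard-error units
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def type_ii_error(effect, sigma, n, alpha):
    """beta = 1 - power."""
    return 1 - power_two_sided_z(effect, sigma, n, alpha)

# Fixed alpha: a larger n gives a smaller beta.
print(type_ii_error(0.5, 1.0, n=20, alpha=0.05))
print(type_ii_error(0.5, 1.0, n=40, alpha=0.05))

# Fixed n: a smaller alpha gives a larger beta.
print(type_ii_error(0.5, 1.0, n=20, alpha=0.01))
```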

Planning a study

Suppose you were interested in determining whether treatment X has an effect on outcome Y. There are several issues that need to be addressed so that a sound inference can be made from the study result.

What is the population?

How will you select a sample that is representative of that population?

There are many ways to produce a sample, but not all of them will lead to sound inference.

Sampling strategies

Probability samples result when subjects have a known probability of entering the sample:
- Simple random sampling
- Stratified sampling
- Cluster sampling

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.
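The three probability sampling strategies named above can be sketched with Python's standard `random` module. The population, the strata ("ward_a", "ward_b"), and the clusters ("clinic_0" and so on) are invented for illustration, and the function names are hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, fraction, rng):
    """Sample the same fraction independently within each stratum."""
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = list(range(100))
strata = {"ward_a": list(range(0, 60)), "ward_b": list(range(60, 100))}
clusters = {f"clinic_{i}": list(range(i * 10, (i + 1) * 10)) for i in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
```

In each case the selection probability of a unit is known in advance, which is what makes these probability samples.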

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

The reason you want only these two explanations is that if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.

Hypothesis testing

The goal of hypothesis testing is somewhat twisted: it is to disprove something you don't believe.

In this case, you are trying to disprove that treatment X has no effect on outcome Y.

You start out with two hypotheses:

Null Hypothesis (H0): Treatment X has no effect on outcome Y.

Alternative Hypothesis (HA): Treatment X has an effect on outcome Y.

If the trial has been carefully controlled, there are only two explanations for a difference between treatment groups: efficacy of X and chance.

Assuming that the null hypothesis is correct, we can use a statistical test to calculate the probability that the observed difference would have occurred. This is known as the p-value (the observed significance level) of the test.

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
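One way to see what "assuming chance alone" means is a permutation test, sketched below. This technique is swapped in for illustration and is not named in the slides: group labels are shuffled many times, and the p-value is estimated as the share of shuffles whose difference in means is at least as large as the observed one. The data and the function name are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels by chance alone
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical outcome scores, invented for illustration only.
p_separated = permutation_p_value([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
p_identical = permutation_p_value([1, 2, 3, 4], [1, 2, 3, 4], n_permutations=200)
```

When the two groups are clearly separated, few shuffles reproduce the observed difference and the p-value is small; when the groups are identical, every shuffle does and the p-value is 1.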

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true              Reality: H0 is false
Retain H0     Correct decision                 Type II error (β): .2, .1
Reject H0     Type I error (α): .05, .01       Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
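A sample-size calculation of this kind can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2((z for α/2 + z for β) × σ / effect)². The formula choice, the function name, and the default values are assumptions for illustration; a t-based calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I error rate
    z_beta = z.inv_cdf(1 - beta)        # controls the Type II error rate
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Expected effect of half a standard deviation.
print(n_per_group(effect=0.5, sigma=1.0))             # beta = .2 (power .8)
print(n_per_group(effect=0.5, sigma=1.0, beta=0.10))  # beta = .1 (power .9)
```

Asking for a smaller Type II error rate, or for a smaller detectable effect, increases the required sample size.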

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
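A confidence interval for a difference in group means can be sketched as below. This uses the large-sample normal quantile, which is an assumption made to keep the example in the standard library; with samples this small a t quantile would normally be used and would give a somewhat wider interval. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def mean_difference_ci(treated, control, confidence=0.95):
    """Normal-approximation confidence interval for mean(treated) - mean(control)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * se, diff + z * se

# Hypothetical outcome Y in each group, invented for illustration only.
treated = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.7]
control = [4.2, 4.0, 4.6, 4.4, 4.1, 4.5, 4.3, 4.7]

lo95, hi95 = mean_difference_ci(treated, control)
lo99, hi99 = mean_difference_ci(treated, control, confidence=0.99)
print(f"95% CI: ({lo95:.2f}, {hi95:.2f})")
print(f"99% CI: ({lo99:.2f}, {hi99:.2f})")
```

If zero lies outside the interval, the corresponding two-sided test rejects the null hypothesis at the matching level; asking for more confidence widens the interval.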

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
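The selection rules above can be written as a small lookup function. This is only a restatement of the slide's rules for a continuous outcome; the function name and its parameters are hypothetical.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("need at least two groups to compare means")
    if n_independent_vars > 1:
        if all_categorical:
            return "ANOVA"
        return "ANCOVA or multiple linear regression"
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=False))
```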

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise two tailed (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated

df is n - 1 for a one-sample test of the mean.
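The six steps can be walked through for a one-sample t test of a mean. The measurements and the hypothesized mean are invented for illustration; the tabled value 2.262 is the standard two-tailed critical value of t for df = 9 at α = .05.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements and null value, invented for illustration only.
sample = [5.2, 4.8, 5.5, 5.1, 4.9, 5.6, 5.3, 5.0, 5.4, 5.2]
mu_0 = 5.0

# Step 1: appropriate test -> one-sample t test of the mean.
# Step 2: level of significance.
alpha = 0.05
# Step 3: no direction is specified in the alternative, so two tailed.
# Step 4: calculate the test statistic.
n = len(sample)
t_stat = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
# Step 5: degrees of freedom (one parameter, the mean, is estimated).
df = n - 1
# Step 6: compare against the tabled value (two tailed, alpha = .05).
T_CRITICAL = {9: 2.262}
reject_null = abs(t_stat) > T_CRITICAL[df]

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```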

Common Inferential Stats

T-Test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
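A chi-square test on counts in categories can be sketched by hand as follows. The 2 x 2 table of counts is invented for illustration, the function name is hypothetical, and no continuity correction is applied; 3.841 is the standard tabled critical value for df = 1 at α = .05.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are two groups, columns are two outcome categories.
observed = [[30, 10], [20, 40]]
chi2, df = chi_square_statistic(observed)
CRITICAL_DF1_ALPHA_05 = 3.841
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {chi2 > CRITICAL_DF1_ALPHA_05}")
```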

Page 113: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Sampling strategies Sampling strategies Probability samplesmdashresult when subjects

have a known probability of entering the sample Simple random sampling Stratified sampling Cluster sampling

Sampling strategiesSampling strategiesProbability samples can be made to be

representative of a populationNon-probability samples may or may not

be representative of a populationmdashit may be difficult to convince someone that the sample results apply to any larger population

Planning a studyPlanning a studyClinical trials are generally designed to be

efficacy trialsmdashhighly controlled situations that maximize internal validity

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X

Planning a studyPlanning a study At the end of this study you observe a difference

in outcome Y between the experimental group and the control group

All of the effort in designing the study with strict control is for one reasonmdashat the end of the study you want only two plausible explanations for the observed outcome Chance Real effect of treatment X

Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
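The three influences on interval width can be seen in a minimal sketch of a confidence interval for a mean. The function name and data are hypothetical, and the interval uses a normal quantile for simplicity; with small samples a t quantile would give a somewhat wider interval.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    centre = mean(values)
    # Width grows with variability and confidence, shrinks with sample size.
    half_width = z * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width


# Hypothetical per-subject effects, invented for illustration.
values = [2.1, 3.4, 1.8, 2.9, 3.1, 2.5, 2.7, 3.0]
low, high = mean_confidence_interval(values)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

If zero lies outside the interval, the corresponding two-sided test would reject the null hypothesis at the matching significance level.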

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
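The selection rules on these two slides can be written as a small lookup, shown here as a sketch for a continuous outcome only. The function and parameter names are hypothetical; real test selection also depends on assumptions such as independence and distribution shape, which this helper ignores.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the slide rules."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare")
        # One independent variable: the number of groups decides.
        return "t-test" if n_groups == 2 else "ANOVA"
    # Several independent variables: their type decides.
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"


print(choose_test(1, n_groups=2))               # t-test
print(choose_test(1, n_groups=4))               # ANOVA
print(choose_test(3, all_categorical=False))    # ANCOVA or multiple linear regression
```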

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use a One- or Two-Tailed Test

If the alternative hypothesis specifies the direction of the effect, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

Degrees of freedom: the number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.
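Steps 4 to 6 can be sketched for a one-sample test of the mean. The data and the hypothesized mean are invented for illustration, and the function name is hypothetical. The value 2.776 is the standard tabled two-tailed critical value of t for 4 degrees of freedom at α = 0.05.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, hypothesized_mean):
    """Return the t statistic and degrees of freedom for a one-sample test."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    t = (mean(values) - hypothesized_mean) / standard_error
    return t, n - 1  # one parameter (the mean) is estimated, so df = n - 1


# Hypothetical sample; the null hypothesis is that the population mean is 5.
values = [5, 6, 7, 8, 9]
t, df = one_sample_t(values, hypothesized_mean=5)

T_CRITICAL_DF4 = 2.776  # tabled value: two-tailed, alpha = 0.05, df = 4
reject_null = abs(t) > T_CRITICAL_DF4
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_null}")
```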

Common Inferential Stats

T-test: compares the means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance): more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): tests hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement
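As a sketch of how the chi-square statistic works on category counts, the snippet below compares observed counts in a contingency table with the counts expected if the two categorical variables were unrelated. The table values and function name are hypothetical; 3.841 is the standard tabled critical value for 1 degree of freedom at α = 0.05.

```python
def chi_square_statistic(table):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 2 table: rows are groups, columns are outcome categories.
table = [[10, 20],
         [20, 10]]
chi2, df = chi_square_statistic(table)

CHI2_CRITICAL_DF1 = 3.841  # tabled value: alpha = 0.05, df = 1
print(f"chi-square = {chi2:.3f}, df = {df}, significant: {chi2 > CHI2_CRITICAL_DF1}")
```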

Summary

Descriptive statistics summarize measures of central tendency and variability.

Inferential statistics determine how likely it is that results based on a sample are the same in the population.

You must know the level of measurement of the variables to choose the correct test.

Parametric and non-parametric statistics are two types of statistics; choosing between them requires analysis of assumptions.

- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- Chi-square (χ²) is a non-parametric statistic used to assess the relationship between 2 categorical variables.

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the accompanying graphic, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
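The difference between the two designs shows up directly in the hand calculation. The sketch below computes both t statistics on the same invented anxiety scores; the function names and data are hypothetical, and the independent-samples version assumes equal variances (pooled). The paired test works on within-person differences, so stable differences between individuals no longer inflate the error term.

```python
import math
from statistics import mean, stdev, variance


def paired_t(first, second):
    """t statistic and df for paired samples (same individuals measured twice)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def independent_t(group1, group2):
    """Pooled-variance t statistic and df for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, n1 + n2 - 2


# Hypothetical anxiety scores for five students, invented for illustration.
second_day = [20, 25, 30, 35, 40]
midterm = [23, 27, 34, 38, 45]

t_paired, df_paired = paired_t(second_day, midterm)
t_indep, df_indep = independent_t(midterm, second_day)
print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}, df = {df_indep}")
```

Treating paired data as independent groups would be the wrong analysis for this design; the second call is shown only to contrast the two error terms.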


Sampling strategies

Probability samples can be made to be representative of a population.

Non-probability samples may or may not be representative of a population; it may be difficult to convince someone that the sample results apply to any larger population.

Planning a study

Clinical trials are generally designed to be efficacy trials: highly controlled situations that maximize internal validity.

We want to design a study to test the effect of treatment X on outcome Y and try to make sure that any difference in Y is due to X.

Planning a study

At the end of this study, you observe a difference in outcome Y between the experimental group and the control group.

All of the effort in designing the study with strict control is for one reason: at the end of the study, you want only two plausible explanations for the observed outcome:
- Chance
- Real effect of treatment X

Planning a study

You want only these two explanations because, if you can rule out chance, you can conclude that treatment X must have been the reason for the difference in outcome Y.

All inferential statistical tests are used to estimate the probability of the observed outcome, assuming chance alone is the reason for the difference.

If there are multiple competing explanations for the observed result, then ruling out chance offers little information about the effectiveness of treatment X.

Inferential statistics

Hypothesis testing: answering the question of whether or not treatment X may have no effect on outcome Y.

Point estimation: determining what the likely effect of treatment X is on outcome Y.




Planning a studyPlanning a study The reason you want only these two

explanations is because if you can rule out chance you can conclude that treatment X must have been the reason for the difference in outcome Y

All inferential statistical tests are used to estimate the probability of the observed outcome assuming chance alone is the reason for the difference

If there are multiple competing explanations for the observed result then ruling out chance offers little information about the effectiveness of treatment X

Inferential statisticsInferential statisticsHypothesis testingmdashanswering the

question of whether or not treatment X may have no effect on outcome Y

Point estimationmdashdetermining what the likely effect of treatment X is on outcome Y

Hypothesis testingHypothesis testingThe goal of hypothesis testing is

somewhat twisted mdash it is to disprove something you donrsquot believe

In this case you are trying to disprove that treatment X has no effect on outcome Y

You start out with two hypotheses

Hypothesis testingHypothesis testingNull Hypothesis (HO)

Treatment X has no effect on outcome Y

Alternative Hypothesis (HA) Treatment X has an effect on outcome Y

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testing

P-value: the probability of the observed outcome (or one at least as extreme), assuming that chance alone was involved in creating the outcome. In other words, assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
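One way to make "the probability of the observed outcome assuming chance alone" concrete is an exact permutation test. The sketch below is only an illustration: the two groups of scores are made up, and the function name `exact_permutation_p` is hypothetical. It enumerates every way the pooled scores could have been split into two groups by chance and counts how often the difference in means is at least as large as the one observed.

```python
from itertools import combinations

def exact_permutation_p(group_x, group_control):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = group_x + group_control
    n_x = len(group_x)
    n_c = len(pooled) - n_x
    total = sum(pooled)

    def mean_diff(sum_x):
        # Difference in means implied by a given sum for the "treatment" group.
        return sum_x / n_x - (total - sum_x) / n_c

    observed = abs(mean_diff(sum(group_x)))
    splits = 0
    extreme = 0
    # Under the null hypothesis every split of the pooled scores is equally likely.
    for idx in combinations(range(len(pooled)), n_x):
        splits += 1
        split_sum = sum(pooled[i] for i in idx)
        if abs(mean_diff(split_sum)) >= observed - 1e-12:
            extreme += 1
    return extreme / splits

# Hypothetical outcome scores, for illustration only.
p_value = exact_permutation_p([12, 14, 15, 16], [8, 9, 10, 11])
print(round(p_value, 4))
```

With these made-up scores only 2 of the 70 possible splits are as extreme as the observed one, so chance alone would rarely produce this difference.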

Hypothesis testing

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the claim that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Hypothesis testing

Decision      Reality: H0 is true               Reality: H0 is false
Retain H0     Correct decision                  Type II error (β: 0.2, 0.1)
Reject H0     Type I error (α: 0.05, 0.01)      Correct decision

Hypothesis testing

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (0.2 is common).
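As a rough sketch of that sample-size calculation, the snippet below uses the common normal-approximation formula for comparing two group means, n per group ≈ 2((z for 1 - α/2) + (z for 1 - β))² / d², where d is the expected effect in standard-deviation units. The function name `n_per_group` and the chosen effect size are assumptions made for illustration; dedicated power software based on the t distribution will usually give a slightly larger n.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, beta=0.2):
    """Approximate per-group sample size for a two-sided, two-group comparison.

    Assumption: normal approximation; effect_size is the expected difference
    in means divided by the common standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error rate
    z_beta = z.inv_cdf(1 - beta)        # value corresponding to power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Hypothetical expected effect of half a standard deviation.
n = n_per_group(0.5)                  # alpha = 0.05, beta = 0.2
n_stricter = n_per_group(0.5, beta=0.1)
print(n, n_stricter)
```

Lowering β (asking for more power) or expecting a smaller effect both push the required sample size up.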

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.

Point estimation

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
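The three ingredients named above (sample size, variability, and confidence level) can be seen directly in a small calculation. This is a minimal sketch that assumes a large-sample normal approximation for a single mean; the function name `mean_ci` and the numbers are hypothetical, and for small samples a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean (assumes large n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical summary statistics, for illustration only.
lo, hi = mean_ci(mean=10.0, sd=2.0, n=100)
small_lo, small_hi = mean_ci(mean=10.0, sd=2.0, n=25)        # smaller sample
conf_lo, conf_hi = mean_ci(mean=10.0, sd=2.0, n=100, level=0.99)  # more confidence
print((round(lo, 3), round(hi, 3)))
```

The interval widens when the sample shrinks, when the measure is more variable, or when more confidence is demanded.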

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
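The rules on these two slides can be written down as a small lookup. The sketch below only restates those rules for a continuous outcome; the function name `choose_test` and its parameters are hypothetical and not part of any statistics package.

```python
def choose_test(n_independent_vars, n_groups=2, any_continuous_predictor=False):
    """Suggest a test for a continuous outcome, following the rules above."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        # One independent variable: the number of groups decides.
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable: the type of the predictors decides.
    if any_continuous_predictor:
        return "ANCOVA or multiple linear regression"
    return "ANOVA"

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(3, any_continuous_predictor=True))
```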

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance, α
3. Determine whether to use a one-tailed or two-tailed test
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled value

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the effect, use a one-tailed test.

Otherwise, use a two-tailed test (most cases).

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

df = sample size - number of parameters estimated. df is n - 1 for a one-sample test of the mean.
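The six steps can be walked through for a one-sample test of a mean. The sketch below is only an illustration: the ten scores and the hypothesized mean `mu0` are made up, the test is a two-tailed one-sample t-test at α = 0.05, and 2.262 is the tabled two-tailed critical value of t for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and null-hypothesis mean, for illustration only.
scores = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5, 5.9, 5.2]
mu0 = 5.0

# Steps 1-3: one-sample t-test, alpha = 0.05, two-tailed.
T_CRITICAL = 2.262  # tabled value for df = 9, two-tailed, alpha = 0.05

# Step 4: calculate the test statistic.
n = len(scores)
standard_error = stdev(scores) / sqrt(n)
t_stat = (mean(scores) - mu0) / standard_error

# Step 5: degrees of freedom (one parameter, the mean, is estimated).
df = n - 1

# Step 6: compare the computed statistic against the tabled value.
reject_h0 = abs(t_stat) > T_CRITICAL
print(round(t_stat, 3), df, reject_h0)
```

Here the sample mean is 5.5, the computed t is well beyond the tabled value, and the null hypothesis would be rejected.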

Common Inferential Stats

T-test: compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples t-test

ANOVA (analysis of variance)
- more than 2 means to compare, or more than 2 tests of means
- interval/ratio level of measurement

Chi-square (χ²): testing hypotheses about the number of cases that fall into various categories
- nominal/ordinal level of measurement

Common Inferential Stats

- Descriptive stats summarize measures of central tendency and variability.
- Inferential stats determine how likely it is that results based on a sample are the same in the population.
- You must know the level of measurement of the variables to choose the correct test.
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions.
- Pearson r, t-tests, and ANOVA are examples of parametric statistics.
- Pearson r measures the relationship or association between 2 variables.
- The t-test determines if there is a significant difference between 2 group means.
- ANOVA determines if there is a significant difference between 3 or more means.
- χ² is a non-parametric statistic used to assess the relationship between 2 categorical variables.
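To show how χ² works on counts of cases in categories, here is a minimal sketch for a 2 x 2 table. The counts are hypothetical, the function name `chi_square` is an assumption, and 3.841 is the tabled critical value of χ² for 1 degree of freedom at α = 0.05.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are two groups, columns are two outcome categories.
counts = [[30, 10],
          [20, 20]]
chi2, df = chi_square(counts)
CHI2_CRITICAL = 3.841  # tabled value for df = 1, alpha = 0.05
significant = chi2 > CHI2_CRITICAL
print(round(chi2, 3), df, significant)
```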

Summary

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).

Two types of ANOVA

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e., matched on everything but the treatment, the level of the independent variable).

Two types of ANOVA

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
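The difference between the two designs shows up clearly when the same numbers are analyzed both ways. The sketch below mimics the hand calculation with made-up anxiety scores for five students; the variable names are hypothetical. The paired t-test works on each student's change, while the independent-samples t-test (pooled variance) treats the two sets of scores as coming from different people.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical anxiety scores for the same five students, for illustration only.
second_day = [20, 22, 19, 24, 25]
midterm = [23, 25, 21, 27, 29]
n = len(second_day)

# Paired samples: analyze the within-student differences.
diffs = [after - before for before, after in zip(second_day, midterm)]
t_paired = mean(diffs) / sqrt(variance(diffs) / n)
df_paired = n - 1

# Independent samples: pooled-variance t-test, ignoring the pairing.
pooled_var = ((n - 1) * variance(second_day) + (n - 1) * variance(midterm)) / (2 * n - 2)
t_independent = (mean(midterm) - mean(second_day)) / sqrt(pooled_var * (2 / n))
df_independent = 2 * n - 2

print(round(t_paired, 3), df_paired)
print(round(t_independent, 3), df_independent)
```

With these scores the paired analysis yields a much larger t than the independent one, because pairing removes the student-to-student variability from the comparison. That is why the design, not just the data, determines which test to use.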

Page 121: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis testingHypothesis testing If the trial has been carefully controlled there

are only two explanations for a difference between treatment groupsmdashefficacy of X and chance

Assuming that the null hypothesis is correct we can use a statistical test to calculate that the observed difference would have occurred This is known as the significance level or p-value of the test

Hypothesis testingHypothesis testingP-value

The probability of the observed outcome assuming that chance alone was involved in creating the outcome In other words assuming the null hypothesis is correct what is the probability that we would have seen the observed outcome

This is only meaningful if chance is the only competing plausible explanation

Hypothesis testingHypothesis testingIf the p-value is small meaning the

observed outcome would have been unlikely we will reject that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y

How small is small

Hypothesis testingHypothesis testing

Reality -gt

Decision

HO is true HO is false

Retain HO Correct Decision

Type II Error ()

(2 1)

Reject HO Type I Error ()

(05 01)

Correct Decision

Hypothesis testingHypothesis testing Power analysis is used to try to minimize Type II

errors Power (1-) is the probability of rejecting the null

hypothesis when the effect of X is some specified value other than zero

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep below some value (2 is common)

Point estimationPoint estimation Hypothesis testing can only tell you whether or

not the effect of X is zero it does not tell you how large or small the effect is

Important mdash a p-value is not an indication of the size of an effect it depends greatly on sample size

If you want an estimate of the actual effect you need confidence intervals

Point estimationPoint estimation Confidence intervals give you an idea of what

the actual effect is likely to be in the population of interest

The most common confidence interval is 95 and gives an upper and lower bound on what the effect is likely to be

The size of the interval depends on the sample size variability of the measure and the degree of confidence you want that the interval contains the true effect

Point estimationPoint estimationMany people prefer confidence intervals to

hypothesis testing because confidence intervals contain more information

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values) but you also have the entire range of possible values the effect could be

So a confidence interval gives you all the information of a hypothesis test and a whole lot more

Choosing the right testChoosing the right testTypically one is interested in comparing

group meansIf the outcome is continuous and one

independent variable Two groups mdash t-test Three or more groups -- ANOVA

Choosing the right testChoosing the right testIf the outcome is continuous and there is

more than one independent variable ANOVA if all independent variables are

categorical ANCOVA or multiple linear regression if some

independent variables are continuous

Logic of Hypothesis testingLogic of Hypothesis testing

The further the observed value is from the mean of the expected distribution the more significant the difference

Steps in test of Hypothesis Steps in test of Hypothesis

1 Determine the appropriate test 2 Establish the level of significanceα3 Determine whether to use a one tail or

two tail test4 Calculate the test statistic5 Determine the degree of freedom6 Compare computed test statistic against

a tabled value

3 Determine Whether to Use One 3 Determine Whether to Use One or Two Tailed Testor Two Tailed Test

If the alternative hypothesis specifies direction of the test then one tailed

Otherwise two tailed Most cases

5 Determine Degrees of Freedom5 Determine Degrees of Freedom

Number of components that are free to vary about a parameter

Df = Sample size ndash Number of parameters estimated Df is n-1 for one sample test of mean

T-Test -compare means of two groupsintervalratio level of measurementindependent samples t-testdependent or paired samples

Common Inferential StatsCommon Inferential Stats

ANOVA - analysis of variance1048708more than 2 means to compare or more

than 2 testing of means1048708intervalratio level of measurement

Chi-square (x2) -testing hypothesis about number of cases that fall into various categories

nominalordinal level of measurement

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics, requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables
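Of the statistics summarized here, Pearson r is the only one that measures association rather than a difference between groups. A minimal sketch of its calculation follows; the paired values are hypothetical and the function name is an assumption for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their own variation."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements on five subjects.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```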

Two types of ANOVA

1. Independent groups: two different sets of individuals. In the graphic below, college students are randomly assigned to Groups 1 and 2.

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory to students just before the final exam in a Sociology class. They also administer it before the final exam in a Political Science class. To compare the two sets of scores, they use either ANOVA or a t-test for independent samples (hand calculation).
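The hand calculation for the independent-samples case can be sketched as follows. The anxiety scores are invented for illustration, the test assumes equal variances (pooled-variance t-test), and 2.306 is the tabled two-tailed critical value of t for 8 degrees of freedom at α = 0.05.

```python
import math
import statistics

# Hypothetical anxiety scores from two different sets of students.
sociology = [22, 25, 27, 30, 26]
political_science = [28, 31, 33, 35, 33]

n1, n2 = len(sociology), len(political_science)
mean1, mean2 = statistics.mean(sociology), statistics.mean(political_science)
var1, var2 = statistics.variance(sociology), statistics.variance(political_science)

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se
df = n1 + n2 - 2

t_critical = 2.306  # tabled two-tailed value for df = 8 at alpha = 0.05
print(f"t = {t_stat:.3f}, df = {df}, significant: {abs(t_stat) > t_critical}")
```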

2. Paired samples (sometimes referred to as Repeated Measures or With Replication): either the same individuals or individuals from matched groups (i.e. matched on everything but the treatment, the level of the independent variable).

Example: Researchers are interested in exam anxiety. They administer an anxiety inventory on the second day of class. Then they give it again on the day of the midterm. To compare the two sets of scores, they use either ANOVA with replication or a t-test for paired samples (hand calculation).
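For the paired case the calculation works on the within-person differences rather than on the two sets of scores separately. The sketch below uses invented scores for six students measured twice; 2.571 is the tabled two-tailed critical value of t for 5 degrees of freedom at α = 0.05.

```python
import math
import statistics

# Hypothetical anxiety scores for the same six students on two occasions.
second_day = [20, 22, 19, 24, 25, 21]
midterm_day = [24, 25, 22, 27, 26, 25]

# A paired t-test is a one-sample t-test on the differences.
diffs = [after - before for before, after in zip(second_day, midterm_day)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se = statistics.stdev(diffs) / math.sqrt(n)
t_stat = mean_diff / se
df = n - 1

t_critical = 2.571  # tabled two-tailed value for df = 5 at alpha = 0.05
print(f"t = {t_stat:.3f}, df = {df}, significant: {abs(t_stat) > t_critical}")
```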

Page 122: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Hypothesis testing

P-value

The probability of the observed outcome (or one more extreme), assuming that chance alone was involved in creating the outcome. In other words: assuming the null hypothesis is correct, what is the probability that we would have seen the observed outcome?

This is only meaningful if chance is the only competing plausible explanation.
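A p-value can be computed exactly in simple cases. The sketch below uses a hypothetical coin-flip experiment (9 heads in 10 flips) with the null hypothesis that the coin is fair, and sums the binomial probabilities of every outcome at least as extreme as the one observed.

```python
from math import comb

def binomial_prob(k, n, p=0.5):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_flips, heads = 10, 9  # hypothetical observed outcome

# One-sided: outcomes with at least as many heads as observed.
p_one_sided = sum(binomial_prob(k, n_flips) for k in range(heads, n_flips + 1))

# Two-sided: also count the equally extreme outcomes in the other tail.
p_two_sided = p_one_sided + sum(
    binomial_prob(k, n_flips) for k in range(0, n_flips - heads + 1)
)

print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```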

If the p-value is small, meaning the observed outcome would have been unlikely, we reject the claim that chance played the only role in the observed difference between groups and conclude that treatment X does in fact have an effect on outcome Y.

How small is small?

Decision      Reality: H0 is true              Reality: H0 is false
Retain H0     Correct decision                 Type II error (β) (.2, .1)
Reject H0     Type I error (α) (.05, .01)      Correct decision
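The Type I error cell of this table can be checked by simulation: when the null hypothesis really is true, a test run at α = .05 should reject it in roughly 5% of samples. The sketch below assumes a two-tailed z-test on samples of 25 from a population with mean 0 and known standard deviation 1; all of these settings are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
n, trials = 25, 4000

rejections = 0
for _ in range(trials):
    # H0 is true here: every sample comes from a population with mean 0, SD 1.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / n ** 0.5)
    if abs(z) > z_critical:
        rejections += 1  # rejecting a true H0 is a Type I error

type1_rate = rejections / trials
print(f"observed Type I error rate: {type1_rate:.3f}")
```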

Power analysis is used to try to minimize Type II errors.

Power (1 - β) is the probability of rejecting the null hypothesis when the effect of X is some specified value other than zero.

Usually one specifies an expected effect and uses power analysis to calculate the sample size needed to keep β below some value (.2 is common).
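A rough sketch of such a sample-size calculation follows. It assumes a two-tailed comparison of two group means and uses the normal approximation n = 2((z₁₋α/₂ + z₁₋β) / d)² per group, where d is the expected effect in standard deviation units; this formula and the function name are assumptions for the example, and exact t-based calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, beta=0.2):
    """Approximate per-group sample size for comparing two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed significance level
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Expected effects of 0.5 and 0.2 standard deviations (hypothetical).
print(n_per_group(0.5))
print(n_per_group(0.2))
```

Smaller expected effects require much larger samples to keep β at the same level.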

Point estimation

Hypothesis testing can only tell you whether or not the effect of X is zero; it does not tell you how large or small the effect is.

Important: a p-value is not an indication of the size of an effect; it depends greatly on sample size.

If you want an estimate of the actual effect, you need confidence intervals.
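The dependence of the p-value on sample size can be seen by holding the effect fixed and changing only n. The sketch below assumes a two-group z-test with a known standard deviation of 1 and a hypothetical mean difference of 0.2; the function name is an assumption for this example.

```python
import math
from statistics import NormalDist

def z_test_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means (known SD)."""
    se = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same effect (0.2 SD), two different sample sizes.
p_small_n = z_test_p(0.2, 1.0, 25)
p_large_n = z_test_p(0.2, 1.0, 400)

print(f"n = 25 per group:  p = {p_small_n:.4f}")
print(f"n = 400 per group: p = {p_large_n:.4f}")
```

The effect is identical in both cases; only the larger sample makes it statistically significant.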

Confidence intervals give you an idea of what the actual effect is likely to be in the population of interest.

The most common confidence interval is 95%, and it gives an upper and lower bound on what the effect is likely to be.

The size of the interval depends on the sample size, the variability of the measure, and the degree of confidence you want that the interval contains the true effect.
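A minimal sketch of a confidence interval for a mean, using ten hypothetical measurements and tabled t values for 9 degrees of freedom (2.262 for 95%, 3.250 for 99%). The function name is an assumption for this example.

```python
import math
import statistics

def mean_ci(sample, t_critical):
    """Confidence interval for the mean: mean +/- t * (s / sqrt(n))."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - t_critical * se, mean + t_critical * se

# Hypothetical measurements (n = 10, so df = 9).
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.4, 5.5]

ci95 = mean_ci(sample, 2.262)  # tabled t for 95% confidence, df = 9
ci99 = mean_ci(sample, 3.250)  # tabled t for 99% confidence, df = 9

print(f"95% CI: ({ci95[0]:.3f}, {ci95[1]:.3f})")
print(f"99% CI: ({ci99[0]:.3f}, {ci99[1]:.3f})")
```

Asking for more confidence widens the interval; a larger or less variable sample would narrow it.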

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could be.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:

- Two groups: t-test
- Three or more groups: ANOVA
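The selection rules from the slides for a continuous outcome can be written as a small lookup function. The function and parameter names below are hypothetical; the sketch only encodes the rules as stated (one independent variable with two or more groups, or several independent variables) and ignores assumption checks.

```python
def choose_test(n_independent_vars, n_groups=None, any_continuous_iv=False):
    """Suggest a test for a continuous outcome, following the rules in the slides."""
    if n_independent_vars == 1:
        if n_groups == 2:
            return "t-test"
        if n_groups is not None and n_groups >= 3:
            return "ANOVA"
        raise ValueError("one independent variable needs at least 2 groups")
    if n_independent_vars > 1:
        if any_continuous_iv:
            return "ANCOVA or multiple linear regression"
        return "ANOVA"  # all independent variables are categorical
    raise ValueError("need at least one independent variable")

print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2))
print(choose_test(3, any_continuous_iv=True))
```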

Page 128: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Point estimation

Many people prefer confidence intervals to hypothesis testing because confidence intervals contain more information.

Not only can you tell whether the effect could be zero (is zero contained in the interval of possible effect values?), but you also have the entire range of possible values the effect could take.

So a confidence interval gives you all the information of a hypothesis test and a whole lot more.
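As a minimal sketch of this idea, the Python below computes a 95% confidence interval for a mean effect and checks whether zero lies inside it. The data values and the helper name `mean_confidence_interval` are hypothetical and not from the slides; the critical value 2.262 is the tabled two-tailed t value for alpha = 0.05 with 9 degrees of freedom.

```python
import math
import statistics


def mean_confidence_interval(values, t_critical):
    """Return (lower, upper) for the mean, given a tabled t critical value."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(values) / math.sqrt(n)
    margin = t_critical * std_err
    return mean - margin, mean + margin


# Hypothetical effects (e.g. change in a measurement) for 10 subjects.
effects = [1.2, 0.4, 2.1, -0.3, 1.8, 0.9, 1.5, 0.2, 1.1, 0.7]

# Tabled two-tailed critical value for alpha = 0.05, df = 10 - 1 = 9.
T_CRIT_DF9 = 2.262

lower, upper = mean_confidence_interval(effects, T_CRIT_DF9)
# The hypothesis-test question: could the effect be zero?
zero_is_plausible = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print("Effect could be zero" if zero_is_plausible else "Zero is outside the interval")
```

The interval answers the yes/no question of the test and also shows how large or small the effect could plausibly be.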


Page 129: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Choosing the right test

Typically one is interested in comparing group means.

If the outcome is continuous and there is one independent variable:
- Two groups: t-test
- Three or more groups: ANOVA


Page 130: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Choosing the right test

If the outcome is continuous and there is more than one independent variable:
- ANOVA if all independent variables are categorical
- ANCOVA or multiple linear regression if some independent variables are continuous
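The rules on the "Choosing the right test" slides can be written down as a small decision function. This is only a sketch: the function name `choose_test` and its parameters are hypothetical, and it covers just the continuous-outcome cases listed on the slides.

```python
def choose_test(n_independent_vars, n_groups=None, all_categorical=True):
    """Suggest a test for a continuous outcome, following the slide rules."""
    if n_independent_vars < 1:
        raise ValueError("need at least one independent variable")
    if n_independent_vars == 1:
        if n_groups is None or n_groups < 2:
            raise ValueError("need at least two groups to compare means")
        # One independent variable: two groups vs. three or more groups.
        return "t-test" if n_groups == 2 else "ANOVA"
    # More than one independent variable.
    if all_categorical:
        return "ANOVA"
    return "ANCOVA or multiple linear regression"


print(choose_test(1, n_groups=2))
print(choose_test(1, n_groups=4))
print(choose_test(2, all_categorical=True))
print(choose_test(3, all_categorical=False))
```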


Page 131: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Logic of Hypothesis testing

The further the observed value is from the mean of the expected distribution, the more significant the difference.
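One way to make "how far from the mean" concrete is to express the distance in standard-error units. The sketch below assumes a normal expected distribution with a known mean and standard error (both hypothetical numbers) and uses 1.96, the tabled two-tailed critical value for alpha = 0.05 under the normal distribution. The helper name `standardized_distance` is made up for this illustration.

```python
def standardized_distance(observed, expected_mean, standard_error):
    """Distance of the observed value from the expected mean, in standard errors."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    return (observed - expected_mean) / standard_error


# Hypothetical expected distribution: mean 100, standard error 5.
EXPECTED_MEAN = 100.0
STANDARD_ERROR = 5.0
Z_CRIT_TWO_TAILED = 1.96  # tabled normal value for alpha = 0.05

results = {}
for observed in (102.0, 108.0, 115.0):
    z = standardized_distance(observed, EXPECTED_MEAN, STANDARD_ERROR)
    results[observed] = (z, abs(z) > Z_CRIT_TWO_TAILED)
    print(f"observed={observed}: z={z:.2f}, significant={abs(z) > Z_CRIT_TWO_TAILED}")
```

Only the value furthest from the expected mean crosses the critical value here.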


Page 132: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Steps in test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled value
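The six steps above can be followed in code for a simple case. The sketch below assumes a one-sample t-test; the sample values and the null-hypothesis mean `mu_0` are hypothetical, and 2.262 is the tabled two-tailed t value for alpha = 0.05 with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample and null-hypothesis mean.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.4, 5.0]
mu_0 = 5.0

# Step 1: appropriate test -> one-sample t-test (continuous outcome, one group).
# Step 2: level of significance.
alpha = 0.05
# Step 3: two-tailed, because the alternative hypothesis gives no direction.
# Step 4: test statistic t = (mean - mu_0) / (s / sqrt(n)).
n = len(sample)
std_err = statistics.stdev(sample) / math.sqrt(n)
t_stat = (statistics.mean(sample) - mu_0) / std_err
# Step 5: degrees of freedom for a one-sample test of the mean.
df = n - 1
# Step 6: compare with the tabled value (two-tailed, alpha = 0.05, df = 9).
t_tabled = 2.262
reject_null = abs(t_stat) > t_tabled

print(f"t = {t_stat:.3f}, df = {df}, tabled value = {t_tabled}")
print("Reject H0" if reject_null else "Fail to reject H0")
```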


Page 133: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

3. Determine Whether to Use One or Two Tailed Test

If the alternative hypothesis specifies the direction of the test, then one tailed.

Otherwise two tailed (most cases).
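The practical consequence of this choice is a different tabled critical value. The sketch below fixes alpha = 0.05 and df = 9 and uses the tabled t values 1.833 (one-tailed) and 2.262 (two-tailed); the function name `is_significant` and the test statistic 2.0 are hypothetical.

```python
# Tabled t critical values for alpha = 0.05 and df = 9.
ONE_TAILED_CRIT = 1.833
TWO_TAILED_CRIT = 2.262


def is_significant(t_stat, tails, direction=None):
    """Decide significance at alpha = 0.05, df = 9.

    direction must be "greater" or "less" when tails == 1.
    """
    if tails == 2:
        # No direction specified: extreme values in either tail count.
        return abs(t_stat) > TWO_TAILED_CRIT
    if tails == 1:
        if direction == "greater":
            return t_stat > ONE_TAILED_CRIT
        if direction == "less":
            return t_stat < -ONE_TAILED_CRIT
        raise ValueError("one-tailed test needs direction 'greater' or 'less'")
    raise ValueError("tails must be 1 or 2")


t_stat = 2.0  # hypothetical computed test statistic
print("one-tailed:", is_significant(t_stat, tails=1, direction="greater"))
print("two-tailed:", is_significant(t_stat, tails=2))
```

The same statistic can be significant one-tailed and not significant two-tailed, which is why the direction has to be fixed by the alternative hypothesis before looking at the data.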


Page 134: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

5. Determine Degrees of Freedom

Number of components that are free to vary about a parameter.

Df = sample size - number of parameters estimated. Df is n - 1 for a one-sample test of the mean.
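The rule can be sketched for the t-tests mentioned in this deck. Only the one-sample case (n - 1) is stated on the slide; the two-sample and paired cases below are the standard textbook formulas, added here as an illustration, and the function names are hypothetical.

```python
def df_one_sample(n):
    """One mean is estimated from n observations."""
    return n - 1


def df_independent_samples(n1, n2):
    """Two group means are estimated (pooled-variance t-test)."""
    return n1 + n2 - 2


def df_paired_samples(n_pairs):
    """One mean difference is estimated from n_pairs differences."""
    return n_pairs - 1


print(df_one_sample(10))
print(df_independent_samples(12, 15))
print(df_paired_samples(20))
```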


Page 135: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Common Inferential Stats

T-Test - compare means of two groups
- interval/ratio level of measurement
- independent samples t-test
- dependent or paired samples
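A minimal sketch of the two t-test variants, computed by hand with the Python standard library. The scores are hypothetical, the function names are made up for this example, and the independent-samples version assumes equal variances (pooled variance).

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_stat, n_a + n_b - 2


def paired_t(before, after):
    """t statistic and df for paired (dependent) samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Work on the within-pair differences.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical scores for two different groups of people.
t_ind, df_ind = independent_t([12, 15, 11, 14, 13], [16, 18, 15, 17, 19])
# Hypothetical scores for the same five people measured twice.
t_pair, df_pair = paired_t([12, 15, 11, 14, 13], [14, 18, 12, 17, 14])

print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```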


Page 136: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Common Inferential Stats

ANOVA - analysis of variance
- more than 2 means to compare or more than 2 testing of means
- interval/ratio level of measurement

Chi-square (χ²) - testing hypothesis about number of cases that fall into various categories
- nominal/ordinal level of measurement
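To illustrate what ANOVA computes when there are more than two means, here is a sketch of the one-way F statistic worked out by hand. The three groups are hypothetical data, `one_way_anova_f` is a made-up helper name, and 5.14 is the tabled F value for alpha = 0.05 with (2, 6) degrees of freedom.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(groups)

F_TABLED = 5.14  # alpha = 0.05, df = (2, 6)
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w})")
print("Means differ" if f_stat > F_TABLED else "No significant difference")
```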


Page 137: SAA 2023 COMPUTATIONALTECHNIQUE FOR BIOSTATISTICS

Summary

- Descriptive stats summarize measures of central tendency and variability
- Inferential stats determine how likely it is that results based on a sample are the same in the population
- Must know the level of measurement of variables to choose the correct statistic
- Parametric and non-parametric: two types of statistics requiring analysis of assumptions
- Pearson r, t tests and ANOVA are examples of parametric statistics
- Pearson r measures the relationship or association between 2 variables
- T test determines if there is a significant difference between 2 group means
- ANOVA determines if there is a significant difference between 3 or more means
- χ² is a non-parametric statistic to assess the relationship between 2 categorical variables
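As an illustration of the last point, the sketch below computes the chi-square statistic for a test of independence between two categorical variables. The 2 x 2 table of counts is hypothetical, `chi_square_independence` is a made-up helper name, and 3.841 is the tabled chi-square value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical counts: rows are two groups, columns are two outcome categories.
table = [[30, 10], [20, 40]]
chi2, df = chi_square_independence(table)

CHI2_TABLED = 3.841  # alpha = 0.05, df = 1
print(f"chi-square = {chi2:.3f}, df = {df}")
print("Variables are related" if chi2 > CHI2_TABLED else "No significant relationship")
```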

