Lecture 4 T-tests


  • 7/29/2019 Lecture 4 T-tests


    Introduction to hypothesis testing: t-tests


    In Lecture 3, we discussed the use of the normal distribution in descriptive statistics (bell-shaped traits)

    The second use of the normal distribution is in statistical inference

    This is possible because there are other quantities showing a normal pattern:

    o For example, the means of repeated samples (sample means)!

    Understanding this idea is key to statistical inference and testing

    Confused? That's normal


    If these are bullet holes on a target, where do you guess the bull's eye is?


    Somewhere around here?


    Yes, but why?


    Because we assume that error was randomly distributed around the bull's eye

    o if you're trying to hit the target, it is unlikely that you are going to miss more often in one direction

    As we've seen, the normal distribution describes such random and symmetrical deviations around a central value

    This common-sense observation captures the essence of the Central Limit Theorem, the analytical foundation of inferential statistics


    Back to statistics: let's say we take a small sample from a population, and calculate the proportion of atheists, or the mean height, in the sample

    o This sample may not represent the true proportion of atheists or the average height in the population: it carries an error

    But let's say I take many samples, and calculate mean height in each one; what would their distribution look like?

    Answer: the same as the bullet holes in the target

    o errors are random

    o best guess for position of true mean height is the centre of the distribution of sample means

    o i.e. the best guess is the mean of the samples (the mean of sample means)


    Example: lottery draw

    o We have 100 balls numbered 1 to 100

    o (in this case, mean=50.5, sd=29)

    o Let's say we take samples of 5 balls, and calculate their mean (one sample at a time)


    What happens as the number of samples increases?

    o the distribution of the sample means looks more and more like a normal distribution

    o the mean of means approaches the true mean of the population of 100 balls!

    [Figure: histograms of sample means for N = 10, N = 30, N = 100 and N = 200 samples of 5 balls]
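
The lottery experiment can be tried directly in R. What follows is a minimal sketch, not the code used to produce the slides: the seed, the helper name `draw_means` and the decision to draw without replacement (as in a real lottery) are assumptions made for illustration.

```r
# Lottery draw: 100 balls numbered 1 to 100 (true mean = 50.5)
set.seed(42)          # assumed seed, only so that a run can be repeated
balls <- 1:100

# Hypothetical helper: draw `n_samples` samples of `size` balls
# (without replacement) and return the mean of each sample
draw_means <- function(n_samples, size = 5) {
  replicate(n_samples, mean(sample(balls, size)))
}

# Mean of the sample means for a growing number of samples
for (k in c(10, 30, 100, 200)) {
  m <- draw_means(k)
  cat("N =", k, "samples of 5: mean of sample means =", round(mean(m), 2), "\n")
}

# hist(draw_means(200)) draws the histogram of 200 sample means
```

With few samples the mean of means can be some way off 50.5; with more samples it tends to settle close to it, and the histogram of the sample means looks increasingly bell-shaped.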


    So when we are trying to identify the true mean of a variable in a population, the best guess is based on sample means

    If we have many samples, their mean is the best guess

    o But this is very rarely the case!

    In most cases, we have only one sample; the sample mean is your best (and only!) estimator of the true mean


    The sample has a mean and a standard deviation; so what is the probability that it identifies a certain value (the true mean we want to find)?

    We can say that the true mean is the sample mean plus or minus x (the margin of error)

    o That's why we calculate standard deviations and confidence intervals

    Example: take lifespan variable

    > mean(lifespan, na.rm=TRUE)

    [1] 69.71495

    > sd(lifespan, na.rm=TRUE)

    [1] 9.644646

    So what is the true mean?

    Conventionally, we take the 95% interval around the sample mean as the confidence interval: we can be 95% confident that the true mean is inside it (intervals built this way contain the true mean in about 95% of samples)

    To calculate 95% confidence interval around sample mean, type

    > t.test(lifespan)$conf.int

    [1] 68.34922 71.08068

    attr(,"conf.level")

    [1] 0.95

    Conclusion: if the sample mean is 69.71 and the standard deviation is 9.64, we can be 95% confident that the true mean is between 68.35 and 71.08

    [Figure: target with the 95% confidence interval marked] If my estimator has the mean and standard deviation shown, is the true mean (the bull's eye) inside my 95% confidence interval? In this example, yes!


    A difference: instead of the standard deviation, we calculate a standard error of the mean, or sem (because the deviation of the sample mean is an error relative to the true mean)

    Sem is like the standard deviation, but divided by the square root of the sample size (the "minus one" technicality that avoids bias is already built into the sample standard deviation s)

    $$\mathrm{sem} = \frac{s}{\sqrt{n}}$$

    Interpretation:

    (i) sem is proportional to the standard deviation in the population (σ, estimated by s)

    o when the sampled population (balls in a bag, subjects in my study) shows more variation, samples are more variable and the error (deviation between sample mean and true mean) is larger

    (ii) sem is inversely proportional to the square root of the sample size (n)

    o A random sample of 20 gives a better estimate of the true mean than a sample of 5
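
To see how sample size affects the standard error, here is a small sketch. It assumes the usual definition sem = s / sqrt(n), takes the standard deviation of 29 from the lottery example, and uses sample sizes picked purely for illustration.

```r
s   <- 29             # standard deviation of the 100 lottery balls (from the lecture)
n   <- c(5, 20, 80)   # assumed sample sizes, chosen for illustration
sem <- s / sqrt(n)    # standard error of the mean for each sample size

print(data.frame(n = n, sem = round(sem, 2)))
```

Each time the sample size is multiplied by four, sem is halved: larger samples help, but with diminishing returns.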


    Last thing: to calculate confidence intervals (margin of error), we have to use Student's t-distribution

    o it is similar to the normal, but used when the standard deviation in the population is unknown (remember: we only know the standard deviation of the sample)

    o works better than the normal with small sample sizes

    o approaches the normal when n is large

    This is why tests comparing means are called t-tests
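
The confidence interval for lifespan can be rebuilt by hand from the sample mean, the standard deviation and the t-distribution. This is a sketch under one assumption: the sample size n = 194 is not stated in the lecture and is inferred here from the df = 193 that R reports in the t-test output for lifespan.

```r
n    <- 194        # assumed sample size, inferred from df = 193
xbar <- 69.71495   # sample mean of lifespan (from the lecture)
s    <- 9.644646   # sample standard deviation of lifespan (from the lecture)

sem    <- s / sqrt(n)               # standard error of the mean
t_crit <- qt(0.975, df = n - 1)     # t value leaving 2.5% in each tail
margin <- t_crit * sem              # margin of error

ci <- c(lower = xbar - margin, upper = xbar + margin)
print(round(ci, 2))
```

The result agrees with the interval returned by `t.test(lifespan)$conf.int` (68.35 to 71.08).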


    We are ready to test hypotheses about means

    o is a sample mean representative of the true mean? (one-sample t-test)

    o are European countries richer than sub-Saharan countries? (two-sample t-test)

    o does a new drug increase survival of patients? (paired t-test)

    t-tests provide such group comparisons; they are important to validate statements about social indicators, income, fairness, justice, historical processes etc.

    o does European colonisation affect country income?

    o does gender affect income?


    A test of the significance of a difference has to take two things into account:

    (i) the sample sizes

    o If my sample sizes are very large, even a very small difference in means will be statistically significant

    o example: the difference in colon cancer incidence between people who eat more than 600 g of red meat per week and those who don't is 13%, and is only identifiable with large samples (~100,000 people)

    (ii) the measured difference in means

    o If the difference is very large, it will be significant even if the sample size is small

    o Example: if I am comparing average size in mice and elephants, a sample of 1 mouse and 1 elephant is good enough!


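    t-tests simply calculate whether the difference between two means/values is real = statistically significant = different from zero

    o if the difference is zero, they are not different!

    In order to use probability distributions, we must standardise variables; so the difference is standardised by dividing it by its standard error

    $$t = \frac{\text{difference}}{\text{standard error of the difference}}$$

    (for a sample mean $\bar{x}$ compared with a test value $\mu$, the difference is $\bar{x} - \mu$ and the standard error is sem)

    So what we want to know is whether t (the standardised difference) is too different from zero (i.e. not similar)

    What is "too different"? Conventionally, we calculate 95% confidence intervals; if a value is inside it, it is not significantly different from the test value and there is no significant difference (we'll see how it works)

    [Figure: t-distribution centred on 0, with the central 95% between t = -1.96 and t = 1.96 and 2.5% in each tail]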
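
The cut-off of 1.96 is the value for a very large sample; with fewer degrees of freedom the t-distribution has heavier tails and the cut-off is larger. A short sketch makes this visible. The degrees of freedom below are assumptions chosen for illustration, except 193, which is the value R reports for the lifespan sample.

```r
# Degrees of freedom to compare; Inf corresponds to the normal distribution
df <- c(4, 10, 30, 193, Inf)

# t value that leaves 2.5% in the upper tail (so 95% lies between -t and +t)
t_crit <- qt(0.975, df)

print(data.frame(df = df, t_crit = round(t_crit, 3)))
```

The cut-off shrinks towards 1.96 as the degrees of freedom grow, which is the sense in which the t-distribution approaches the normal when n is large.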


    Basic rule: what we need to know is the P-value (probability value) of a t-test

    In a t-test, the null hypothesis (= status quo, conservative hypothesis) is always that there is no difference between the two compared values

    o i.e. if you want to show that two groups differ, you must reject the null hypothesis

    The P-value of a test is the probability of obtaining a difference at least as large as the one observed if the null hypothesis were true (i.e. if the groups were not different)

    o conventionally, we only reject the null hypothesis if the P-value is less than 5% (P < 0.05)


    Example: is life expectancy in the world different from 70 years?

    o one-sample t-test: we are comparing a group to a value (= the test value; a hypothetical true value of 70 years)

    How to do it in R? Just specify the test value as mu=70

    > t.test(lifespan, mu=70)

    One Sample t-test

    data: lifespan

    t = -0.4117, df = 193, p-value = 0.681

    alternative hypothesis: true mean is not equal to 70

    95 percent confidence interval:

    68.34922 71.08068

    sample estimates:

    mean of x

    69.71495

    Sample mean = 69.71

    o that doesn't seem to be very different from 70

    t = -0.41

    o the t statistic, the standardised difference between sample mean and test value, is close to zero

    95% CI: [68.35, 71.08]

    o this is the confidence interval of lifespan

    o my sample suggests that life expectancy in the world is between 68.35 and 71.08 years; and this includes 70 years


    P-value = 0.681 = 68%

    o This is the probability of getting a difference at least this large if the null hypothesis (= life expectancy is not different from 70 years) is true

    P is high

    o Therefore, we cannot reject the null hypothesis

    Conclusion: based on our sample, life expectancy in the world is not significantly different from (or shorter than) 70 years


    But is life expectancy in the world different from 75 years?

    o now we set mu=75

    > t.test(lifespan, mu=75)

    One Sample t-test

    data: lifespan

    t = -7.6324, df = 193, p-value = 1.033e-12

    alternative hypothesis: true mean is not equal to 75

    95 percent confidence interval:

    68.34922 71.08068

    sample estimates:

    mean of x

    69.71495

    So how likely is a sample mean like ours if the average lifespan across countries were 75 years?

    o P = 1.033 × 10^(-12) = 0.000000000001033 = 0.0000000001033%

    o This is very low! We must reject the null hypothesis and accept the alternative hypothesis

    o t = -7.63; that's significantly different from 0

    o 75 years is outside the 95% CI

    o Therefore, life expectancy is below 75 years
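
Both one-sample tests can be checked by hand from the summary statistics, which shows where t and the P-value come from. The sketch below assumes n = 194 (inferred from df = 193); the helper name `one_sample_t` is made up for this illustration.

```r
n    <- 194        # assumed sample size, inferred from df = 193
xbar <- 69.71495   # sample mean of lifespan (from the lecture)
s    <- 9.644646   # sample standard deviation of lifespan (from the lecture)
sem  <- s / sqrt(n)

# Hypothetical helper: t statistic and two-sided P-value for a test value mu
one_sample_t <- function(mu) {
  t <- (xbar - mu) / sem              # standardised difference
  p <- 2 * pt(-abs(t), df = n - 1)    # area in both tails beyond |t|
  c(t = t, p = p)
}

res70 <- one_sample_t(70)
res75 <- one_sample_t(75)
print(signif(res70, 4))
print(signif(res75, 4))
```

The values agree with the `t.test` output: t is about -0.41 with P about 0.68 for mu=70, and t is about -7.63 with a P-value of the order of 10^(-12) for mu=75.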


    You may also want to test whether two samples are significantly different in some respect

    o for example, are South and Southeast Asian countries richer than Latin American countries?

    o i.e. do differences or similarities in economic models in recent decades cause differences in average income between the two areas?

    Procedure is similar: but the t-statistic is now the standardised difference between the means of the two compared groups

    $$t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{sedm}}$$

    sedm (standard error of the difference of means) is automatically calculated by R


    In file HDR2011, variable continent is "seasia" for South and Southeast Asian countries and "latin" for Latin American countries; others are NA (not available)

    > t.test(GNI ~ continent)

    Welch Two Sample t-test

    data: GNI by continent

    t = -1.1455, df = 20.327, p-value = 0.2653

    alternative hypothesis: true difference in means is not equal to 0

    95 percent confidence interval:

    -13340.319 3876.397

    sample estimates:

    mean in group latin mean in group seasia

    9054.355 13786.316

    Conclusion:

    We may think that the difference of ~US$4,700 between the areas was large enough to prove a significant difference

    But it isn't: there is too much variation in income in the two areas

    Welch test is the name of this t-test

    P-value = 0.265 ≈ 27%

    We cannot reject the null hypothesis: the areas do not differ significantly by income

    Notice that the 95% CI of the difference in income between the areas includes zero; i.e. you cannot exclude zero difference in income from your confidence interval
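
To see what R calculates in a Welch two-sample test, the statistic can be rebuilt by hand. The HDR2011 file is not reproduced here, so this sketch uses simulated, hypothetical incomes (they are not the real GNI values); the group sizes, means and standard deviations are assumptions chosen only so that the code runs.

```r
set.seed(1)   # assumed seed, for repeatability only

# Hypothetical incomes, NOT the HDR2011 data
latin  <- rnorm(20, mean = 9000,  sd = 5000)
seasia <- rnorm(15, mean = 14000, sd = 15000)

# Squared standard error of each group mean
v1 <- var(latin)  / length(latin)
v2 <- var(seasia) / length(seasia)

sedm     <- sqrt(v1 + v2)                       # standard error of the difference of means
t_manual <- (mean(latin) - mean(seasia)) / sedm # standardised difference

# Welch's approximation to the degrees of freedom (usually not a whole number)
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(latin) - 1) + v2^2 / (length(seasia) - 1))
p_manual <- 2 * pt(-abs(t_manual), df = df_welch)

# R's default two-sample t-test is the Welch test (var.equal = FALSE)
res <- t.test(latin, seasia)
print(c(t_manual = t_manual, t_R = unname(res$statistic)))
print(c(p_manual = p_manual, p_R = res$p.value))
```

The Welch version does not assume that the two groups have the same variance, which is why the degrees of freedom in the lecture output (20.327) are not a whole number.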


    A paired test should be used when the two compared measurements are linked, i.e. the subjects/cases are not independent

    For example, the two group means may be two measurements from the same individual

    o In the case of a trial of a new drug for blood pressure, blood pressure before and after drug administration in the same patients

    Run library ISwR

    > library(ISwR)

    > attach(intake) # this is a data set in the library ISwR

    > intake # what does it look like? or try head(intake)

    The data set intake has data on pre- and post-menstrual energy intake (in kJ) in 11 women;

    o Question: is there a difference in energy intake before and after menstruation?


    So now let's try a paired t-test:

    > t.test(pre, post, paired=T)

    Paired t-test

    data: pre and post

    t = 11.9414, df = 10, p-value = 3.059e-07

    alternative hypothesis: true difference in means is not equal to 0

    95 percent confidence interval:

    1074.072 1566.838

    sample estimates:

    mean of the differences

    1320.455

    P-value: very low!

    We must reject the null hypothesis (no difference)

    Confidence interval: we can be 95% confident that the difference in intake is between 1074 and 1567 kJ

    Conclusion: there is a clear difference between energy intake pre and post
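
A paired t-test is the same thing as a one-sample t-test on the within-subject differences, tested against zero. The sketch below illustrates this. To keep it self-contained, the values are typed in rather than loaded from the package; they are assumed to match the `intake` data set in ISwR and should be checked against it.

```r
# Assumed to mirror ISwR::intake (typed in by hand; verify against the package)
pre  <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335)

d <- pre - post                      # one difference per woman

paired  <- t.test(pre, post, paired = TRUE)   # paired t-test
one_smp <- t.test(d, mu = 0)                  # one-sample test on the differences

print(c(paired = unname(paired$statistic), one_sample = unname(one_smp$statistic)))
print(mean(d))                       # mean of the differences
```

Both calls give the same t statistic, and the mean of the differences matches the 1320.455 reported in the output above. Pairing removes the variation between women, which is why such a small sample can give such a low P-value.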


    1) One-sample t-test

    Is income per capita (GNI) in the world significantly less than US$20000?

    2) Two-sample t-test

    Let us compare schooling years in Southeast Asia and Latin America

    o What is the average schooling of children in the two regions?

    o Does schooling significantly differ between the two areas? What is the P-value of the test?

    3) Paired t-test

    Give two examples of studies that could require paired t-tests


    Confidence intervals and all t-tests assume a normal distribution, even when the sample is small

    o And they are based on a theory of means of various samples, which in practice we don't have

    o That's why you do not prove differences; you compare groups and give an estimate of the probability that they are different or similar

    Remember: the null hypothesis is always that means are not different

    Current trend is to provide confidence intervals rather than P-values when reporting results of tests in general (not just t-tests), so get used to calculating and interpreting them