T-Tests “Stories change people while statistics give them something to argue about.” Bernie Siegel (US Writer & retired pediatric surgeon)

T-Tests - University of Alberta · lkgray/uploads/7/3/6/2/... · 2018. 12. 2.


  • T-Tests

    “Stories change people while statistics give them something to argue about.”

    Bernie Siegel (US Writer & retired pediatric surgeon)

  • The t-distribution (a.k.a The Student t-distribution)

    t-distribution (sampling distribution)

    Normal distribution

    William Sealy Gosset (1876-1937)

    • Has fatter tails than the normal distribution
    • Degrees of freedom: df = n − 1
    • As sample size increases, it approaches the normal distribution
    • Properties: bell-shaped; mean = median = mode = 0; variance > 1

    Image source: http://en.wikipedia.org/wiki/William_Sealy_Gosset
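The properties listed for the t-distribution can be checked numerically. The following is a minimal sketch in base R, assuming only the standard functions qt(), qnorm(), pt(), pnorm() and rt(); the degrees of freedom and the random seed are arbitrary illustration values, not taken from the slides.

```r
# Degrees of freedom to compare (arbitrary illustration values)
df <- c(2, 5, 10, 30, 100, 1000)

# 97.5% quantiles: the t quantile is larger than the normal one
# and shrinks toward it as df (i.e. sample size) grows
t_q <- qt(0.975, df)
z_q <- qnorm(0.975)
print(data.frame(df = df, t_quantile = t_q, normal_quantile = z_q))

# Fatter tails: more probability lies below -2 under t (df = 5) than under the normal
tail_t <- pt(-2, df = 5)
tail_z <- pnorm(-2)

# Variance > 1: simulated t values with df = 10 (theory gives df / (df - 2) = 1.25)
set.seed(1)
sim_var <- var(rt(1e5, df = 10))
print(c(tail_t = tail_t, tail_z = tail_z, sim_var = sim_var))
```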

  • One-sample T-test

    Q(a): What is the probability that the true population mean falls above/below a given cutoff (a)?

    Practical examples:
    • Crop yields with a new fertilizer, where we want crops to achieve a certain yield.
    • Compare actual 911 response times to an ideal response time of 10 min or less.

    One-tailed example: $H_0: \mu \le a$, $H_a: \mu > a$

    $t_{actual} = \dfrac{a - \bar{x}}{s / \sqrt{n}}$

    P-value (in R): pt(t,df)

    [Figure: sample t-distribution centered on $\bar{x}$, with $+s$ and $-s$ marked and a = cutoff; the shaded area is the probability that the population mean falls below the cutoff value. Axes: T-value, P-value, original units.]

    T-test in R:

    t.test(sampleData,mu=a,alternative="greater")
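As a rough illustration of the one-tailed test, the sketch below computes the t-value by hand and compares it with t.test(). The yields in `sampleData` and the cutoff `a` are hypothetical values invented for this example. Note that t.test() reports (mean − a) / se, so its statistic has the opposite sign to the slide's (a − mean) / se, while the tail area is the same.

```r
# Hypothetical crop yields (kg/ha) and a hypothetical target yield; not from the slides
sampleData <- c(512, 498, 530, 521, 507, 515, 526, 503, 519, 511)
a <- 505

n  <- length(sampleData)
df <- n - 1
se <- sd(sampleData) / sqrt(n)   # standard error of the mean

# Slide convention: t = (a - mean) / (s / sqrt(n))
t_actual <- (a - mean(sampleData)) / se

# Lower-tail area under the t-distribution, as in pt(t, df) on the slide
p_manual <- pt(t_actual, df)

# Built-in test of H0: mu <= a against Ha: mu > a
res <- t.test(sampleData, mu = a, alternative = "greater")

# t.test() reports (mean - a) / se, so its statistic has the opposite sign,
# but the p-value is the same tail area
print(c(t_slide = t_actual, t_R = unname(res$statistic),
        p_manual = p_manual, p_R = res$p.value))
```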

  • One-sample T-test

    Q(a): What is the probability that the true population mean falls above/below a given cutoff (a)?

    One-tailed example: $H_0: \mu \le a$, $H_a: \mu > a$

    $t_{actual} = \dfrac{a - \bar{x}}{s / \sqrt{n}} = \dfrac{signal}{noise}$

    Signal: the numerator, $a - \bar{x}$. Noise: the denominator, $s / \sqrt{n}$.

    • If the signal is small, the ratio is small.
    • If the signal is large and the noise is small, the ratio is a large (positive) value.

    P-value = probability that “I would get the observed signal-to-noise ratio by random chance”.
    • A very high ratio is unlikely; therefore it is very unlikely that the observation is due to random chance (i.e. something happened).
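To make the signal-to-noise idea concrete, here is a small sketch with made-up numbers: the signal and the sample standard deviation are held fixed while the sample size grows, so the noise (standard error) shrinks and the ratio rises. The values of `signal`, `s` and `n` are hypothetical, and using the upper tail for "a ratio at least this large" is an assumption of this example.

```r
# Hypothetical values: a fixed signal and sample standard deviation
signal <- 5     # difference between cutoff and sample mean
s      <- 10    # sample standard deviation
n      <- c(4, 25, 100)

noise <- s / sqrt(n)      # standard error shrinks as n grows: 5, 2, 1
ratio <- signal / noise   # signal-to-noise ratio (t-value): 1, 2.5, 5

# Probability of a ratio at least this large by random chance (upper tail)
p <- pt(ratio, df = n - 1, lower.tail = FALSE)
print(data.frame(n = n, noise = noise, ratio = ratio, p = p))
```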

  • One-sample T-test

    Q(b): What is the probability that the new value (b) belongs to the same population as the sample?

    We test the difference between b and the sample mean.

    Two-tailed example: $H_0: \bar{x} = b$, $H_a: \bar{x} \ne b$

    Practical example:
    • A size measurement of a suspected new beetle species compared to an existing species.

    $t_{actual} = \dfrac{b - \bar{x}}{s / \sqrt{n}}$

    P-value (in R): pt(t,df)

    Critical t-value: $t_{critical}$ = qt(α/2,df=n-1)

    T-test in R:

    t.test(sampleData,mu=b,alternative="two.sided",conf.level=0.95)

    How to answer this question: Let’s say pt(t,df) gives p = 0.02, where b = new value. Then b is higher than only 2% of scores, which puts it outside the 95% confidence interval (p < 0.025, the area in each tail). Therefore we can reject the null hypothesis that $\bar{x} = b$ at the α=0.05 level.

    [Figure: sample t-distribution centered on $\bar{x}$, with $+s$ and $-s$ marked; purple area = 95% C.I., leaving 2.5% (p=0.025) on each tail; b = new value. Axes: T-value, P-value, original units.]
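The two-tailed decision can be reached three equivalent ways: from the critical t-values, from the two-sided p-value, or by checking whether b lies outside the 95% confidence interval. The sketch below shows that they agree; the beetle lengths in `sampleData` and the new value `b` are hypothetical numbers made up for illustration.

```r
# Hypothetical body lengths (mm) of the existing species and a new specimen; not from the slides
sampleData <- c(12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.0)
b     <- 12.9
alpha <- 0.05

n  <- length(sampleData)
df <- n - 1
t_actual <- (b - mean(sampleData)) / (sd(sampleData) / sqrt(n))

# Two-tailed critical values: alpha/2 in each tail
t_lower <- qt(alpha / 2, df)
t_upper <- qt(1 - alpha / 2, df)
reject_by_t <- t_actual < t_lower || t_actual > t_upper

# Same decision from the two-sided p-value and from the 95% confidence interval
res <- t.test(sampleData, mu = b, alternative = "two.sided", conf.level = 1 - alpha)
reject_by_p  <- res$p.value < alpha
reject_by_ci <- b < res$conf.int[1] || b > res$conf.int[2]
print(c(by_t = reject_by_t, by_p = reject_by_p, by_ci = reject_by_ci))
```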

  • One-sample T-test

    Q(b): What is the probability that the new value (b) belongs to the same population as the sample?

    We test the difference between b and the sample mean.

    One-tailed example: $H_0: \bar{x} \le b$, $H_a: \bar{x} > b$

    Practical example:
    • Is my 2 km outrigger canoe time trial too slow to be competitive at the national level?

    $t_{actual} = \dfrac{b - \bar{x}}{s / \sqrt{n}}$

    P-value (in R): pt(t,df)

    Critical t-value: $t_{critical}$ = qt(α,df=n-1)

    T-test in R:

    t.test(sampleData,mu=b,alternative="less",conf.level=0.95)

    How to answer this question: Let’s say pt(t,df) gives p = 0.02, where b = new value. Then b is higher than only 2% of scores, which puts it outside the 95% confidence interval (p < 0.05, the area in the lower tail). Therefore we can reject the null hypothesis that $\bar{x} \le b$ at the α=0.05 level.

    [Figure: sample t-distribution centered on $\bar{x}$, with $+s$ and $-s$ marked; orange area = 95% C.I., leaving 5% on the lower tail; b = new value. Axes: T-value, P-value, original units.]
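For the one-tailed case, the sketch below compares the slide's lower-tail calculation with t.test(). In R, `alternative` describes the true mean relative to `mu`: "greater" tests Ha: mean > b and "less" tests Ha: mean < b, so the direction has to be chosen to match the question being asked. The times in `sampleData` and the value `b` are hypothetical.

```r
# Hypothetical 2 km time-trial results (minutes) and a hypothetical personal time b
sampleData <- c(10.9, 11.4, 11.1, 10.7, 11.6, 11.0, 11.3, 10.8, 11.2, 11.5)
b     <- 10.6
alpha <- 0.05

n  <- length(sampleData)
df <- n - 1
t_actual <- (b - mean(sampleData)) / (sd(sampleData) / sqrt(n))

# One-tailed critical value: all of alpha in the lower tail
t_critical <- qt(alpha, df)
p_lower    <- pt(t_actual, df)
reject     <- t_actual < t_critical   # same decision as p_lower < alpha

# R's `alternative` describes the true mean relative to mu:
# "greater" tests Ha: mean > b, "less" tests Ha: mean < b
p_greater <- t.test(sampleData, mu = b, alternative = "greater")$p.value
p_less    <- t.test(sampleData, mu = b, alternative = "less")$p.value
print(c(p_lower = p_lower, p_greater = p_greater, p_less = p_less))
```

With the slide's sign convention, t = (b − mean) / se, the lower-tail area pt(t,df) matches the "greater" p-value, and the two one-sided p-values sum to 1.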

  • Two-sample T-test

    Q: Are the means of populations A and B the same?

    Q: Do samples A and B come from the same population?

    SAME QUESTION!

    [Figure: Population A and Population B, with Sample A and Sample B drawn from them.]

  • Two-sample T-test

    Q(x1, x2): Do samples A (x1) and B (x2) come from the same population?

    We test the difference between the x1 and x2 sample means.

    $H_0: \bar{x}_1 = \bar{x}_2$, $H_a: \bar{x}_1 \ne \bar{x}_2$
    or
    $H_0: \bar{x}_1 - \bar{x}_2 = 0$, $H_a: \bar{x}_1 - \bar{x}_2 \ne 0$

    Practical example:
    • Is there a difference between fertilized and control plots?

    $t_{actual} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{signal}{noise}$

    Signal (numerator): compare means. This is meaningless unless we also compare variance.

    Noise (denominator): “pooled standard error”, or variation within samples.

    [Figure: distributions of Sample A (x1) and Sample B (x2).]
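The two-sample formula can be reproduced directly. This is a minimal sketch with hypothetical yields for `sampleDataA` and `sampleDataB`; it relies on the fact that t.test() defaults to the Welch test, whose statistic is the difference in means divided by the square root of s1²/n1 + s2²/n2.

```r
# Hypothetical yields for fertilized (A) and control (B) plots; not from the slides
sampleDataA <- c(5.8, 6.1, 5.5, 6.4, 6.0, 5.9, 6.3, 5.7, 6.2, 6.0)
sampleDataB <- c(5.2, 5.6, 5.1, 5.9, 5.4, 5.3, 5.8, 5.0, 5.7, 5.5)

n1 <- length(sampleDataA)
n2 <- length(sampleDataB)

signal <- mean(sampleDataA) - mean(sampleDataB)                # difference in means
noise  <- sqrt(var(sampleDataA) / n1 + var(sampleDataB) / n2)  # standard error of the difference
t_actual <- signal / noise

# t.test() defaults to the Welch test, whose statistic uses the same formula
res <- t.test(sampleDataA, sampleDataB, mu = 0, alternative = "two.sided", conf.level = 0.95)
print(c(t_manual = t_actual, t_R = unname(res$statistic)))
```

The Welch test adjusts the degrees of freedom rather than using n1 + n2 − 2; passing var.equal = TRUE to t.test() gives the equal-variance version with df = n1 + n2 − 2.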


  • Two-sample T-test

    $H_0: \bar{x}_1 = \bar{x}_2$, $H_a: \bar{x}_1 \ne \bar{x}_2$
    or
    $H_0: \bar{x}_1 - \bar{x}_2 = 0$, $H_a: \bar{x}_1 - \bar{x}_2 \ne 0$

    Compare means: A n1=10, B n2=10

    N1+2 = 20
    df1+2 = n1 + n2 - 2 = 18

    $t_{actual} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

    tactual = 2.35

    Critical t-value: qt(α/2,df=n1+n2-2)

    > qt(p,df)
    > qt(.975,18)
    = 2.1009
    > qt(.025,18)
    = -2.1009

    Actual p-value: pt(t,df)

    If tactual = 2.35:
    > pt(2.35,18)
    = 0.985
    > pt(-2.35,18)
    = 0.0152

    pactual = 0.015, which is below α/2 = 0.025 (equivalently, tactual = 2.35 exceeds the critical value 2.1009).

    Reject $H_0$: $\bar{x}_1 \ne \bar{x}_2$

    T-test in R:

    t.test(sampleDataA,sampleDataB,mu=0,alternative="two.sided",conf.level=0.95)

    [Figure: sample t-distribution of $\bar{x}_1 - \bar{x}_2$; red area = 95% C.I., α=0.05; tactual = 2.35 marked. Axes: T-value, P-value.]
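The numbers in this worked example can be reproduced with qt() and pt() alone. The sketch below uses only the values stated on the slide (n1 = n2 = 10, tactual = 2.35, α = 0.05); the two-sided p-value, twice the one-tail area, is added here for comparison with α.

```r
n1 <- 10; n2 <- 10
df <- n1 + n2 - 2          # 18, as on the slide
alpha    <- 0.05
t_actual <- 2.35           # value given on the slide

# Critical values for a two-tailed test (slide reports -2.1009 and 2.1009)
t_crit <- qt(c(alpha / 2, 1 - alpha / 2), df)

# One-tail area beyond t_actual (slide reports 0.0152) and the two-sided p-value
p_one_tail  <- pt(-t_actual, df)
p_two_sided <- 2 * p_one_tail

# Decision: reject H0 when |t_actual| exceeds the upper critical value
reject <- abs(t_actual) > t_crit[2]
print(list(t_crit = t_crit, p_one_tail = p_one_tail,
           p_two_sided = p_two_sided, reject = reject))
```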

  • Paired T-test

    Q(x1, x2): Do samples A (x1) and B (x2) come from the same population? BUT x1 and x2 are the same individuals before and after a treatment is applied.

    We test the difference between the x1 and x2 sample means.

    $H_0: \bar{x}_1 = \bar{x}_2$, $H_a: \bar{x}_1 \ne \bar{x}_2$
    or
    $H_0: \bar{x}_1 - \bar{x}_2 = 0$, $H_a: \bar{x}_1 - \bar{x}_2 \ne 0$

    Practical example:
    • How does soil nutrition across forest plots change from pre- to post-harvest?

    Compare means: pre-treatment n=10, post-treatment n=10

    $t_{actual} = \dfrac{\bar{X}_D - \mu_0}{s_D / \sqrt{n}}$

    $s_D = \sqrt{\dfrac{\sum_{i=1}^{n} (D_i - \bar{X}_D)^2}{n - 1}}$

    • D: difference for pairs
    • $\bar{X}_D$: the average of the differences
    • $\mu_0$: hypothesized mean difference under $H_0$ (0 here, matching mu=0 in the R call)
    • $s_D$: standard deviation of the differences
    • n: number of observation pairs

    T-test in R:

    t.test(dataBefore,dataAfter,mu=0,alternative="two.sided",paired=TRUE)

    Denominator of $t_{actual}$ = standard error of the difference.

    This is a more powerful test because you know how much variation (i.e. error) to expect within the samples, making it easier to isolate the signal from the treatment.

    You only need to account for the error in the sample once, because the sample is composed of the same units.
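A paired t-test is equivalent to a one-sample t-test on the pair differences, which the sketch below illustrates by computing the standard deviation of the differences and the t-value by hand. The measurements in `dataBefore` and `dataAfter` are hypothetical, and `mu0 = 0` is the hypothesized mean difference assumed here.

```r
# Hypothetical soil nutrient measurements on the same 10 plots; not from the slides
dataBefore <- c(4.1, 3.8, 4.5, 4.0, 3.6, 4.3, 4.2, 3.9, 4.4, 4.0)
dataAfter  <- c(3.7, 3.6, 4.0, 3.9, 3.1, 4.1, 3.6, 3.5, 4.2, 3.4)

D   <- dataBefore - dataAfter    # difference for each pair
n   <- length(D)                 # number of observation pairs
mu0 <- 0                         # hypothesized mean difference (assumption of this example)

s_D <- sqrt(sum((D - mean(D))^2) / (n - 1))   # standard deviation of the differences
t_actual <- (mean(D) - mu0) / (s_D / sqrt(n))

paired_res <- t.test(dataBefore, dataAfter, mu = mu0, alternative = "two.sided", paired = TRUE)
diff_res   <- t.test(D, mu = mu0, alternative = "two.sided")   # one-sample test on the differences
print(c(t_manual = t_actual, t_paired = unname(paired_res$statistic),
        t_diff = unname(diff_res$statistic)))
```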

  • All these comparisons only work if we assume:

    1. Each observation of the dependent variable is independent of other observations

    2. The experimental errors of your data are normally distributed

    3. Equal variances among groups

    Remember…

    For a paired t-test we only require that the pair-differences (Ai-Bi) be independent from each other

    Skew and kurtosis will limit your ability to make meaningful comparisons
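Two of these assumptions can be screened in R before running a t-test. The sketch below uses the Shapiro-Wilk test for normality and the F test for equal variances; these are common choices, not methods named in the slides, and the data are simulated purely for illustration. Independence (assumption 1) depends on the study design and cannot be checked this way.

```r
# Simulated groups for illustration only (not from the slides)
set.seed(42)
groupA <- rnorm(10, mean = 50, sd = 5)
groupB <- rnorm(10, mean = 55, sd = 5)

# Assumption 2: normality of each group (Shapiro-Wilk; H0 = data are normal)
normA <- shapiro.test(groupA)
normB <- shapiro.test(groupB)

# Assumption 3: equal variances (F test; H0 = ratio of variances is 1)
eq_var <- var.test(groupA, groupB)

# For paired data, check the pair differences instead
norm_diff <- shapiro.test(groupA - groupB)

p_values <- c(normA = normA$p.value, normB = normB$p.value,
              equal_var = eq_var$p.value, norm_diff = norm_diff$p.value)
print(p_values)
```

A small p-value from either check suggests the corresponding assumption may not hold for that data set.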

  • Lentil Challenge!

    Let’s say you are a geneticist in the agriculture field and you create a new GMO lentil (use Variety A from the class). Initially it shows increased yields, so you go to your boss with this finding (hoping for some praise and compensation). Your boss sees the promise in your variety, but she also knows it is more expensive to produce and there could be a public backlash. She says she can only take the risk and move forward with production if you can be 90% sure that there will be at least a 30% productivity gain (the current average yield is 500 kg/ha). How do you prove this to your boss (a.k.a. show me the math)?
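One possible way to set up the challenge is as a one-sample, one-tailed t-test against the target yield. The sketch below is only an outline under stated assumptions: the yields in `varietyA` are placeholder numbers (the class data set is not reproduced here), and "90% sure" is read as a one-tailed test at α = 0.10.

```r
# Placeholder yields (kg/ha): replace with the Variety A data from the class
varietyA <- c(672, 655, 690, 661, 648, 701, 683, 666, 659, 677)

current_yield <- 500
target <- current_yield * 1.30    # at least a 30% gain: 650 kg/ha
alpha  <- 0.10                    # "90% sure", read here as a one-tailed 90% confidence level

# H0: mu <= target, Ha: mu > target
res <- t.test(varietyA, mu = target, alternative = "greater", conf.level = 1 - alpha)

# Convinced if the p-value is below alpha, i.e. the lower 90% bound exceeds the target
convinced <- res$p.value < alpha
print(res)
```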