Hypothesis Testing Comparing One Sample to its Population

Page 1: Hypothesis Testing

Hypothesis Testing

Comparing One Sample to its Population

Page 2: Hypothesis Testing

Hypothesis Testing w/ One Sample

If the population mean (μ) and standard deviation (σ) are known:

We test whether our sample mean (X̄) is significantly different from the mean of our sampling distribution of the mean

This is similar to testing how different an individual score is from the other scores in the sample

What is this test called?

Page 3: Hypothesis Testing

Hypothesis Testing w/ One Sample

z-score formula [for an individual score (x)]:

$$z = \frac{x - \mu}{\sigma}$$

z-score formula [for means (X̄)]:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$

Page 4: Hypothesis Testing

Hypothesis Testing w/ One Sample

Testing a score against the standard deviation of a distribution of scores:

$$z = \frac{x - \mu}{\sigma}$$

Testing a mean against the standard deviation of the distribution of sample means, i.e. the standard error:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$
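The two versions of the z formula can be sketched in a few lines of Python. The function names and the numbers in the usage lines are hypothetical, chosen only for illustration; they do not come from the slides.

```python
import math

def z_for_score(x, mu, sigma):
    """z for an individual score: distance from mu in units of sigma."""
    return (x - mu) / sigma

def z_for_mean(sample_mean, mu, sigma, n):
    """z for a sample mean: distance from mu in units of the standard error."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

# Hypothetical values: mu = 100, sigma = 15
print(z_for_score(115, 100, 15))      # 1.0
print(z_for_mean(115, 100, 15, 25))   # 5.0 (standard error = 15 / 5 = 3)
```

The same 15-point difference gives a much larger z for a mean of 25 scores than for a single score, because the standard error is smaller than σ.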

Page 5: Hypothesis Testing

Hypothesis Testing w/ One Sample

Two implications of this formula:

1. Because we are dividing σ by √N, the standard error shrinks as N grows. With the same data (same sample mean, population mean, and σ) but a larger sample size, z will be larger and our p-value will be smaller (i.e. more likely to be significant)

All statistical tests that produce p-values are sensitive to sample size in this way, i.e. with enough people, even a trivially small difference is significant at p < .05
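A minimal sketch of this sample-size effect, assuming a one-tailed z-test and hypothetical values (sample mean 10.5, μ = 10, σ = 5) that do not appear in the slides. The upper-tail probability is computed from the standard library's complementary error function instead of a z-table.

```python
import math

def one_tailed_p(z):
    """Upper-tail probability P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_for_mean(sample_mean, mu, sigma, n):
    """One-tailed p-value for a sample mean when mu and sigma are known."""
    z = (sample_mean - mu) / (sigma / math.sqrt(n))
    return one_tailed_p(z)

# Same means and sigma each time; only the sample size changes
for n in (10, 100, 1000):
    print(n, p_for_mean(10.5, 10, 5, n))
```

With these inputs the half-point difference is not significant at N = 10 or N = 100, but it falls below .05 at N = 1000.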

Page 6: Hypothesis Testing

Hypothesis Testing w/ One Sample

Two implications of this formula:

2. If you recall, this formula was derived from the formula for the normal distribution

This means that your data must be normally distributed to use this test validly

However, this test is robust to violations of this assumption, i.e. it still holds when your sample data are not normal, provided (a) you have a large enough sample or (b) your population data is normally distributed

Why?

Page 7: Hypothesis Testing

Hypothesis Testing w/ One Sample

The Central Limit Theorem:

Given a population with mean μ and variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ (i.e., $\mu_{\bar{X}} = \mu$) and a variance ($\sigma^2_{\bar{X}}$) equal to σ²/N (and standard deviation, $\sigma_{\bar{X}} = \sigma/\sqrt{N}$). The distribution will approach the normal distribution as N, the sample size, increases.
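The theorem can be checked informally with a small simulation. The sketch below assumes a skewed exponential population with μ = 1 and σ = 1; that choice, the function name, and the number of draws are illustrative assumptions, not part of the slides.

```python
import random
import statistics

def sample_means(n, draws=5000, seed=0):
    """Means of `draws` samples of size n from an exponential population (mu = 1, sigma = 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]

for n in (2, 30):
    means = sample_means(n)
    # Mean of the sample means, their standard deviation, and the predicted sigma / sqrt(N)
    print(n, statistics.fmean(means), statistics.stdev(means), 1 / n ** 0.5)
```

The mean of the sample means should sit close to 1 for both sample sizes, while their standard deviation should track 1/√N. A histogram of `means` would also look more nearly normal at N = 30 than at N = 2.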

Page 8: Hypothesis Testing

Hypothesis Testing w/ One Sample

Example #1:

You want to test the hypothesis that the current crop of Kent State freshmen are more depressed than Kent State undergraduates in general.

What is your sample and what is your population?

What are your H0 and your H1?

Are you using a one- or two-tailed test?

Assume that the mean depression score for current Kent State freshmen is 15, while the mean for all previous Kent State undergrads (N = 100,000) is 10, with a standard deviation of 5

Page 9: Hypothesis Testing

Hypothesis Testing w/ One Sample

$$z = \frac{15 - 10}{5 / \sqrt{100{,}000}} = \frac{5}{.01581} = 316.23$$

Look up the p associated with this z-score in the z-table: p < .0001

Since this is less than .05 (or .025 if we were using a two-tailed test), we can conclude that the current batch of freshmen are significantly more depressed than previous undergrads

Also notice the effect that our large N had on our p-value
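The arithmetic for this example can be reproduced as follows. This is a sketch that plugs in the slide's numbers, including N = 100,000 as the slide uses it; the variable names are illustrative. Without rounding the standard error to .0158 before dividing, z comes out at about 316.23.

```python
import math

sample_mean, mu, sigma, n = 15, 10, 5, 100_000

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu) / standard_error

# One-tailed p from the normal upper tail; for a z this large it underflows to 0.0
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))

print(round(standard_error, 4))  # 0.0158
print(round(z, 2))               # 316.23
print(p_one_tailed < 0.05)       # True
```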

Page 10: Hypothesis Testing

Hypothesis Testing w/ One Sample

Most often, however, we don’t know μ and σ, because these are what we’re trying to estimate with our sample in the first place

The formula for the t statistic handles the unknown σ by substituting s² for σ² (i.e. s for σ) in the formula for the z statistic

Because of this substitution, we have a different statistic, which requires that we use a different table than the z-table

Don’t worry too much about why it’s different

Page 11: Hypothesis Testing

Hypothesis Testing w/ One Sample

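Testing a mean against the standard deviation of the distribution of sample means, i.e. the standard error:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$

Testing a mean against the standard error estimated from the standard deviation of the sample:

$$t = \frac{\bar{X} - \mu}{s / \sqrt{N}}$$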

Page 12: Hypothesis Testing

Hypothesis Testing w/ One Sample

After computing our t statistic, we need to compare it with the t-table (called the Student’s t-table)

“Student” is a pseudonym for William Gosset

Gosset worked for the Guinness Brewing Company, but they wouldn’t let him publish under his own name

First, we will need to become familiar with the concept of degrees of freedom, or df

df = N – 1

This represents the number of individual subjects’ data points that are free to vary once you already know the mean or s

Page 13: Hypothesis Testing

Hypothesis Testing w/ One Sample

For example:

Suppose we already know that a particular set of data has a mean of 5 and 10 scores in total (n = 10)

Once we have nine of those scores, we can calculate the tenth. However, if we have only eight scores, we cannot know what the other two scores are

We can solve x + 5 = 10, but not x + y = 10, because in the latter we have more than one unknown (x and y)

x and y could be 5 and 5, 8 and 2, 4 and 6, 7 and 3, etc.

Therefore, nine scores are free to vary, and then the tenth is fixed
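A small sketch of this idea: once the mean and n are known, the last score follows from the others. The nine scores below are made up for illustration, and `last_score` is a hypothetical helper name.

```python
def last_score(known_scores, mean, n):
    """Given n - 1 scores and the mean of all n scores, return the one remaining score."""
    if len(known_scores) != n - 1:
        raise ValueError("need exactly n - 1 known scores")
    # The total of all n scores is fixed at mean * n
    return mean * n - sum(known_scores)

nine = [3, 7, 5, 4, 6, 5, 8, 2, 5]      # hypothetical scores, sum = 45
print(last_score(nine, mean=5, n=10))   # 5
```

With only eight known scores the function refuses to answer, which mirrors the point that two unknowns cannot be pinned down by one equation.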

Page 14: Hypothesis Testing

Hypothesis Testing w/ One Sample

Factors that influence the z and t statistics:

The difference between the sample mean and population mean: greater differences = greater t and z values

The magnitude of s (or s²): since we’re dividing by s, smaller values of s result in larger values of t or z [i.e. we want to decrease variability in our sample (error)]

The sample size: the bigger the N, the bigger t and z

The significance level (α): the smaller the α, the higher the critical t needed to reject H0. Raising α lowers the critical t, but it also raises our Type I error rate, so we probably won’t want to do this without good reason

Whether the test is one- or two-tailed: two-tailed tests split α into two tails of p < .025 each, instead of one tail at p < .05
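The last two factors act on the critical value, not on the obtained statistic. As a rough illustration, the sketch below computes critical z values with the standard library's `NormalDist`. It uses z instead of t because the standard library has no t distribution; critical t values follow the same pattern but are somewhat larger at small df. The helper name `critical_z` is an assumption for this example.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical z bounding the upper rejection region for a one- or two-tailed test."""
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    one_tailed = critical_z(alpha, tails=1)
    two_tailed = critical_z(alpha, tails=2)
    print(alpha, round(one_tailed, 3), round(two_tailed, 3))
```

A smaller α or a two-tailed test both push the critical value further out, so a larger obtained statistic is needed to reject H0.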

Page 15: Hypothesis Testing

General Approach to Hypothesis Testing

1. Identify H0 and H1

2. Calculate df and identify the critical test statistic

3. Determine whether to use a one- or two-tailed test, determine what value of α to use (usually .05), and identify the rejection region(s) bounded by the critical statistic

4. Calculate your obtained test statistic

5. Compare the value of your obtained test statistic to your rejection region to determine whether or not to reject H0
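The steps above can be sketched for the simplest case, a z-test with known σ. This is an illustrative outline under stated assumptions, not a general-purpose routine: the function name `z_test` and its arguments are hypothetical, df does not apply because z is used in place of t, and the one-tailed branch assumes H1 predicts a mean above μ.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tails=1):
    """One-sample z-test of H0: mean = mu0 against an upper-tailed or two-tailed H1."""
    # Steps 2-3: critical value that bounds the rejection region(s)
    critical = NormalDist().inv_cdf(1 - alpha / tails)
    # Step 4: obtained test statistic
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Step 5: compare the obtained statistic with the rejection region
    reject = z > critical if tails == 1 else abs(z) > critical
    return z, critical, reject

# Hypothetical numbers: sample mean 10.5, mu0 = 10, sigma = 5, N = 400
print(z_test(10.5, 10, 5, 400))
print(z_test(10.5, 10, 5, 400, tails=2))
```

Step 1, stating H0 and H1, happens before any computation and determines which value of `tails` is appropriate.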

Page 16: Hypothesis Testing

Hypothesis Testing w/ One Sample

Example #1:

You’ve administered a therapy for people with anorexia that will supposedly assist them in gaining weight. The following data are the amounts of weight gained, in pounds, over your 16-session therapy for 29 participants. Does this represent a significantly greater weight gain than the average weight gained without treatment (-.45 lbs.)?

What are H0 and H1?

Will you be using a one- or two-tailed test? Why?

Based on this, what is your df?

Page 17: Hypothesis Testing

Hypothesis Testing w/ One Sample

Example #1:

Weight gained (lbs.):

 1.7    -9.1
  .7     2.1
 -.1    -1.4
 -.7     1.4
-3.5     -.3
14.9    -3.7
 3.5     -.8
17.1     2.4
-7.6    12.6
 1.6     1.9
11.7     3.9
 6.1      .1
 1.1    15.4
-4.0     -.7
20.9

Page 18: Hypothesis Testing

Hypothesis Testing w/ One Sample

Example #1:

Sample Mean = 87.2/29 = 3.0069

s² = (1757.8 – [(87.2)²/29])/28 = 53.41

s = 7.3085

t = (3.0069 – (–.45))/(7.3085/√29) = 2.5472, p < .05

Our obtained t exceeds the critical t (1.701 for df = 28, one-tailed, α = .05) and falls in our rejection region, which is above the population mean (since we’re only interested in people gaining weight). We therefore reject H0 and conclude that our treatment is more effective than no treatment at all
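The calculation for this example can be reproduced from the 29 weight-gain scores. The sketch below uses only the standard library. The critical value 1.701 is taken from a t-table (df = 28, one-tailed, α = .05) instead of being computed, and the variable names are illustrative.

```python
import math
import statistics

gains = [1.7, 0.7, -0.1, -0.7, -3.5, 14.9, 3.5, 17.1, -7.6, 1.6,
         11.7, 6.1, 1.1, -4.0, 20.9, -9.1, 2.1, -1.4, 1.4, -0.3,
         -3.7, -0.8, 2.4, 12.6, 1.9, 3.9, 0.1, 15.4, -0.7]
mu0 = -0.45                      # average weight gain without treatment

n = len(gains)
df = n - 1
mean = statistics.fmean(gains)
s = statistics.stdev(gains)      # sample standard deviation (N - 1 denominator)
t = (mean - mu0) / (s / math.sqrt(n))

critical_t = 1.701               # from a t-table: df = 28, one-tailed, alpha = .05

print(df, round(mean, 4), round(s, 4), round(t, 4))
print("reject H0:", t > critical_t)
```

The printed mean, s, and t should agree with the hand calculation to four decimal places.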

Page 19: Hypothesis Testing

Hypothesis Testing w/ One Sample

Often, if we’re reporting the results of our experiments to the public, or the results of an assessment (psychological or otherwise) to a client, we want to emphasize to them that our measurements are made with error, or that our samples include sampling error

We can do this by including intervals around the scores we report, indicating that the “true” score measured without error lies in this interval

Page 20: Hypothesis Testing

Hypothesis Testing w/ One Sample

This is what is known as a Confidence Interval

In keeping with the p < .05 tradition, we are often looking for the 95% confidence interval, or the range of scores within which 95% of our distribution lies, but we can do this for any interval

They are calculated just like z-scores, where we plug the critical t values into the formula and work backwards

Page 21: Hypothesis Testing

Hypothesis Testing w/ One Sample

For a 95% CI:

Given your df (we’ll assume df = 9 for this example) and type of test (assume a two-tailed test for now), look up your critical values of t from a t-table (t = ± 2.262)

Plug these into your formula with your Sample Mean and s (which we’ll assume are 1.463 and .341, respectively), and solve for μ

Page 22: Hypothesis Testing

Hypothesis Testing w/ One Sample

For a 95% CI:

± 2.262 = (1.463 – μ)/(.341/√10)

μ = ±2.262(.108) + 1.463 = ±.244 + 1.463

.244 + 1.463 = 1.707

-.244 + 1.463 = 1.219

CI.95 = 1.219 ≤ μ ≤ 1.707
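The same interval can be computed directly. This is a minimal sketch in which the critical t (2.262 for df = 9, two-tailed) is passed in from the t-table instead of being computed, and `t_confidence_interval` is a hypothetical helper name.

```python
import math

def t_confidence_interval(sample_mean, s, n, critical_t):
    """Confidence interval for mu: sample mean plus or minus critical_t standard errors."""
    margin = critical_t * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = t_confidence_interval(1.463, 0.341, 10, 2.262)
print(round(low, 3), round(high, 3))  # 1.219 1.707
```

Passing a different critical t, for example the table value for a 99% interval, widens or narrows the interval without changing anything else.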