Chapter 7: Inference for Distributions
§7.1: Inference for the Mean of a Population
Statistical inference involves using data collected in a sample to make statements about some population parameter.
Involves:
1. Estimation (confidence intervals)
2. Tests of significance
Assumptions
1. Simple random sample
2. Either (1) the population is normal or
(2) the sample size is large enough for the central limit theorem to apply
3. The population standard deviation σ is known.
Then the sampling distribution of $\bar{X}$ is
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
and the Z-score transformation gives
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
General Concepts
Now suppose that the population standard deviation σ is unknown.
We still assume that we have a simple random sample and that the population is (at least approximately) normally distributed.
When σ is unknown, the Z procedures that we discussed in Chapter 6 are not valid.
If the population standard deviation σ is unknown, then instead of using the standard normal distribution Z, we use a similar distribution called the Student’s t-distribution.
t-Distribution
Assumptions for t-distribution:
1. Simple random sample
2. Near normal population
i. If the sample size is very small (less than 15), then the distribution must be very close to normal.
ii. If the sample size is at least 15, the t procedures can be used except in cases in which the distribution is heavily skewed.
iii. For large samples (n > 40) the t-distribution is valid even for heavily skewed distributions (central limit theorem)
Properties of t-Distribution
• The t distribution is a family of similar probability distributions.
• A specific t distribution depends on a parameter known as the degrees of freedom.
• A t distribution with more degrees of freedom has less dispersion.
• The mean of the t distribution is zero.
• As the number of degrees of freedom increases, the difference between the t distribution and the standard normal probability distribution becomes smaller and smaller.
t-Distribution
The t-statistic is
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
The sampling distribution of this statistic is called the Student’s t-distribution with n − 1 degrees of freedom.
t-Distribution
The difference between the Z statistic and the t statistic is knowledge of the population standard deviation σ. If σ is known, then so too is the standard deviation of $\bar{X}$, namely $\sigma/\sqrt{n}$, and hence use of the Z statistic is possible.
If σ is unknown, then so is the standard deviation of $\bar{X}$. Hence this standard deviation must be estimated.
When the standard deviation of a statistic is estimated from sample data, the estimate is called the standard error of the statistic. In this case the standard error is
$$SE(\bar{X}) = \frac{s}{\sqrt{n}}$$
Example
a. n = 12 p = .05
df = n - 1 = 12 - 1 = 11
t11 = 1.796
b. n = 28 p = .01 t = ???
df = n - 1 = 28 - 1 = 27
t27 = 2.473
c. n = 68 p = .05 t= ???
df = n - 1 = 68 - 1 = 67 (Use df = 60 or df = 80)
t60 = 1.671 t80 = 1.664
Confidence Interval for µ
Situation: The population standard deviation σ is unknown
Assumptions: 1. Simple random sample
2. The population is near normal or the sample size is large enough for the central limit theorem to apply
A 100C% confidence interval is given by
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
where t* = the t value providing an area of (1 − C)/2 in the upper tail of a t distribution with n − 1 degrees of freedom
s = the sample standard deviation
Example: Apartment Rents
A reporter for a student newspaper is writing an article on the cost of off-campus housing. A sample of 10 one-bedroom units within a half-mile of campus resulted in a sample mean of $550 per month and a sample standard deviation of $60.
Let us provide a 95% confidence interval estimate of the mean rent per month for the population of one-bedroom units within a half-mile of campus. We’ll assume this population to be normally distributed.
t value: df = n − 1 = 10 − 1 = 9, t* = 2.262.
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 550 \pm 2.262 \frac{60}{\sqrt{10}} = 550 \pm 42.92$$
or $507.08 to $592.92
We are 95% confident that the mean rent per month for the population of one-bedroom units within a half-mile of campus is between $507.08 and $592.92.
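The interval arithmetic in the apartment rents example can be checked with a few lines of Python. This is a minimal sketch: the function name `t_confidence_interval` is made up for illustration, and t* = 2.262 is taken from the t table rather than computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """Return (lower, upper) for the interval xbar ± t* · s/√n."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Apartment rents: n = 10, sample mean 550, sample sd 60.
# t* = 2.262 is the table value for df = 9 and 95% confidence.
low, high = t_confidence_interval(550, 60, 10, 2.262)
print(f"{low:.2f} to {high:.2f}")
```

Passing t* in as an argument keeps the sketch free of any statistics library; only the table lookup has to be done by hand.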
Exercises
1) Nine workers, chosen at random from a work force in a factory, have a mean wage of $125 a week with a standard deviation of $12. Assuming that the distribution of all workers' wages is normal, find a 98% confidence interval for the mean wages of all workers.
2) The monthly incomes (in $1,000) from a random sample of faculty at a university are shown below.
3.0 4.0 6.0 3.0 5.0 5.0 6.0 8.0
Compute a 90% confidence interval for the mean of the population. Give your answer in dollars.
Test for µ
Situation: The population standard deviation σ is unknown
1. Hypotheses
Null hypothesis: H0: µ = µ0
Alternative hypothesis: Ha: µ > µ0, Ha: µ < µ0, or Ha: µ ≠ µ0
2. Test Statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
which follows a t-distribution with n − 1 degrees of freedom.
3. P-Value
1. Upper one-sided test: Ha: µ > µ0, p-value = P ( T > t )
2. Lower one-sided test: Ha: µ < µ0, p-value = P ( T < t )
3. Two-sided test: Ha: µ ≠ µ0, p-value = 2 P ( T > | t | )
4. Conclusion
Reject H0 if the P-value < α
Example
a. Upper one-sided test, df = 8, t = 2.000
p-value = P( T8 > 2.000)
For p = .05, t8 = 1.860
For p = .025, t8 = 2.306
So .025 < p-value < .05
NOTE: Writing .05 < p-value < .025 is WRONG
Or, tcdf(2, 1E99, 8)=0.0403
Example
b. Lower one-sided test, df = 38, t = -1.53
df = 38: use df = 40
p-value = P( T40 < -1.53) = P( T40 > 1.53)
For p = .10, t40 = 1.303
For p = .05, t40 = 1.684
So .05 < p-value < .10
Or, tcdf(1.53, 1E99, 40)=0.0669
Example
c. Two-sided test, df = 15, t = -2.680
p-value = 2P( T15 > |-2.680|) = 2P( T15 > 2.680)
For p = .01, t15 = 2.602
For p = .005, t15 = 2.947
So .005 < P( T15 > 2.680) < .01
So .01 < p-value < .02
Or, 2*tcdf(2.680, 1E99, 15)=0.0171
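The `tcdf` values in these examples come from a calculator. As a rough cross-check, the upper-tail area of a t distribution can be approximated with the standard library alone by integrating the t density numerically. This is a sketch under stated assumptions: `t_pdf` and `t_upper_tail` are illustrative names, and Simpson's rule with 2000 steps is an arbitrary choice, not the method the calculator uses.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] and symmetry."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

p_a = t_upper_tail(2.000, 8)        # example a: upper one-sided
p_b = t_upper_tail(1.53, 40)        # example b: lower one-sided, by symmetry
p_c = 2 * t_upper_tail(2.680, 15)   # example c: two-sided
print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Each result should fall inside the bracket obtained from the table, which is a quick way to catch a reversed inequality such as the one flagged in the NOTE.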
Example: Metro EMS
A major west coast city provides one of the most comprehensive emergency medical services in the world. Operating in a multiple hospital system with approximately 20 mobile medical units, the service goal is to respond to medical emergencies with a mean time of less than 4.8 minutes.
The director of medical services wants to test whether or not the service goal of less than 4.8 minutes is being achieved. He selected a sample of 20 emergency response times and found that the mean time is 4.6 minutes and the standard deviation 1.2 minutes. Test the claim at the 0.05 level.
Example
Parameter of interest: µ = mean time before ambulance arrives
1. H0: µ = 4.8 minutes
Ha: µ < 4.8 minutes
2. Test Statistic: n = 20, $\bar{x}$ = 4.6, s = 1.2
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{4.6 - 4.8}{1.2/\sqrt{20}} = -0.745$$
3. P-value: P(T19 < -0.745) = P(T19 > 0.745)
.20 < p-value < .25 Exact=0.2327
4. Decision: Since p-value > .05, we fail to reject H0
There is insufficient evidence to conclude that the mean time before an ambulance arrives is less than 4.8 minutes.
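The same decision can be reached by comparing the test statistic with a critical value instead of bracketing the p-value. The sketch below assumes the table value t = 1.729 for df = 19 and upper-tail area .05, which is not quoted in the example itself; `one_sample_t` is an illustrative name.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Metro EMS: n = 20, sample mean 4.6, sample sd 1.2, H0: mu = 4.8, Ha: mu < 4.8
t = one_sample_t(4.6, 4.8, 1.2, 20)

# Assumed table value for df = 19, upper-tail area 0.05.
t_crit = 1.729
reject_h0 = t < -t_crit  # lower one-sided test rejects only for large negative t

print(round(t, 3), reject_h0)
```

Rejecting when t < −t_crit is equivalent to rejecting when the lower-tail p-value is below α = .05.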
Example: Banana Prices
The average retail price for bananas in 1998 was 51¢ per pound, as reported by the U.S. Department of Agriculture in Food Cost Review. Recently, a random sample of 15 markets gave the following prices for bananas in cents per pound.
56 53 55 53 50 57 58 54 48 47 57 57 51 55 50
At the 0.05 level, can you conclude that the current mean retail price for bananas is different from the 1998 mean of 51¢ per pound?
Example
Parameter of interest: µ = mean retail price for bananas
1. H0: µ = 51¢
Ha: µ ≠ 51¢
2. Test Statistic: n = 15, $\bar{x}$ = 53.4, s = 3.5, df = 14
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{53.4 - 51}{3.5/\sqrt{15}} = 2.66$$
3. P-value = 2P(T14 > 2.66)
0.005 < P(T14 > 2.66) < 0.01
0.01 < p-value < 0.02 Exact=0.0187
4. Decision: Since p-value < .05, we reject H0
There is sufficient evidence to conclude that the mean retail price for bananas is not 51¢.
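The summary statistics in the banana example can be recomputed directly from the 15 listed prices. The following is a small sketch using Python's `statistics` module; the variable names are illustrative.

```python
import math
import statistics

# Prices in cents per pound from the sample of 15 markets.
prices = [56, 53, 55, 53, 50, 57, 58, 54, 48, 47, 57, 57, 51, 55, 50]
mu0 = 51  # 1998 mean price, the value under H0

n = len(prices)
xbar = statistics.mean(prices)
s = statistics.stdev(prices)          # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / math.sqrt(n))

print(n, round(xbar, 1), round(s, 1), round(t, 2))
```

Carrying the unrounded s through the calculation gives a t statistic that agrees with the slide value of 2.66 to within rounding.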
Exercises
1. An inspector from the Department of Weights and Measures weighs 10 one-pound samples of peanut butter; he finds their mean weight is 15.8 oz. with a standard deviation of 0.4 oz. Do the weights of packages of peanut butter sold by the shop from which these samples were taken differ from the announced weight?
Exercises
2. In the past the average age of employees of a large corporation has been 40 years. Recently, the company has been hiring older individuals. In order to determine whether there has been an increase in the average age of all the employees, a sample of 25 employees was selected. The average age in the sample was 45 years with a standard deviation of 5 years. Assume the distribution of the population is normal. At α = .05, test to determine whether or not the mean age of all employees is significantly more than 40 years.
Matched Pairs t Procedures
• With a matched-sample design each sampled item provides a pair of data values.
• To compare the responses to the two treatments in a matched pairs design, work with the difference between the two responses within each pair.
• The parameter µ is the mean difference in the two responses.
• The one-sample t procedure is applied to the observed differences.
Example: Express Deliveries
A Chicago-based firm has documents that must be quickly distributed to district offices throughout the U.S. The firm must decide between two delivery services, UPX (United Parcel Express) and INTEX (International Express), to transport its documents. In testing the delivery times of the two services, the firm sent two reports to a random sample of ten district offices with one report carried by UPX and the other report carried by INTEX.
Do the data that follow indicate a difference in mean delivery times for the two services?
Delivery Time (Hours)
District Office   UPX   INTEX   Difference
Seattle            32     25        7
Los Angeles        30     24        6
Boston             19     15        4
Cleveland          16     15        1
New York           15     13        2
Houston            18     15        3
Atlanta            14     15       -1
St. Louis          10      8        2
Milwaukee           7      9       -2
Denver             16     11        5
Let µ = the mean of the difference values for the two delivery services for the population of district offices.
1. Hypotheses H0: µ = 0, Ha: µ ≠ 0
2. Test Statistic: $\bar{x}$ = 2.7, s = 2.9, n = 10, df = 9
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{2.7 - 0}{2.9/\sqrt{10}} = 2.94$$
3. P-value = 2P(T9 > 2.94)
0.005 < P(T9 > 2.94) < 0.01
0.01 < P-value < 0.02 Exact=0.0165
4. Conclusion
P-value < 0.05, so reject H0.
There is a significant difference between the mean delivery times for the two services.
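Because the matched pairs procedure is just a one-sample t procedure on the differences, the whole calculation for the express deliveries data fits in a short script. This is a sketch with illustrative variable names; the delivery times are those listed in the table.

```python
import math
import statistics

# Delivery times in hours, one pair per district office (Seattle ... Denver).
upx   = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
intex = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

# Step 1: reduce each pair to a single difference.
diffs = [u - i for u, i in zip(upx, intex)]

# Step 2: apply the one-sample t statistic to the differences (H0: mu = 0).
n = len(diffs)
dbar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
t = (dbar - 0) / (s_d / math.sqrt(n))

print(diffs, round(dbar, 1), round(s_d, 1), round(t, 2))
```

Pairing matters here: treating the two columns as independent samples would ignore the large office-to-office variation that the differences cancel out.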
Exercise
The following data present the number of computer units sold per day by a sample of 6 salespersons before and after a bonus plan was implemented. At the 0.05 level of significance, test to see if the bonus plan was effective. That is, did the bonus plan actually increase sales?
Salesperson   1   2   3   4   5   6
Before        3   7   6   8   7   9
After         4   8   5   7   8  12
Using the t Procedures:
Robustness: A confidence interval or hypothesis test is said to be robust if the confidence level or P-value does not change very much when the assumptions of the procedure are violated.
Except in small samples, the assumption of an SRS from the population is more important than the assumption that the population distribution is normal.
Sample size is less than 15: use t procedures if the data are close to normal.
Sample size is at least 15: the t procedures can be used except in the presence of strong outliers or strong skewness.
Sample size is large (n > 40): the t procedures can be used even for clearly skewed distributions.
§7.2: Comparing Two Means
Now suppose we have two independent populations, and of interest is to make statistical inferences about the difference between the two population means: µ1 − µ2
Example: Suppose one population consists of all male students, and the second population consists of all female students.
We could be interested in making inferences about the difference between the mean IQ of male students and the mean IQ of female students.
Point Estimate
We take a simple random sample of n1 subjects from the first population (in our example a sample of n1 males) and an independent simple random sample of n2 subjects from the second population (in our example a sample of n2 females).
A point estimate of µ1 − µ2 is the difference in the sample means:
$$\bar{x}_1 - \bar{x}_2$$
Sampling Distribution
Assumptions:
(1) Suppose we have two independent simple random samples.
(2) (i) Either both populations are normally distributed: X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2)
or (ii) The populations are possibly nonnormal but both sample sizes are large enough such that the central limit theorem applies
This assumes σ1 and σ2 are both known.
The sampling distribution for the difference in the sample means, $\bar{x}_1 - \bar{x}_2$, is approximately normal with mean µ1 − µ2 and standard deviation
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$$
Z-Score Transformation
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Confidence Interval for µ1 − µ2
Suppose we want to estimate the difference between the population means: µ1 − µ2
If the assumptions are satisfied, then a 100C% confidence interval for µ1 − µ2 is given by
$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Example
A random sample of 50 non-smokers has a mean lifetime of 76 years and a random sample of 65 smokers has a mean lifetime of 68 years. Find a 95% confidence interval for the difference of mean lifetime of non-smokers and smokers. Assume that the standard deviations for non-smokers and smokers are 8 and 9 years, respectively.
µ1 = mean lifetime of non-smokers and
µ2 = mean lifetime of smokers
n1 = 50, $\bar{x}_1$ = 76, σ1 = 8
n2 = 65, $\bar{x}_2$ = 68, σ2 = 9
$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (76 - 68) \pm 1.96 \sqrt{\frac{8^2}{50} + \frac{9^2}{65}} = 8 \pm 3.12$$
(4.88, 11.12) years
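The interval for the lifetime example can be reproduced as follows. This is a minimal sketch: `two_sample_z_interval` is a hypothetical helper name, and z* = 1.96 is the usual 95% value passed in by hand.

```python
import math

def two_sample_z_interval(x1, sigma1, n1, x2, sigma2, n2, z_star):
    """(x1 - x2) ± z* · sqrt(sigma1²/n1 + sigma2²/n2), with known sigmas."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    margin = z_star * se
    diff = x1 - x2
    return diff - margin, diff + margin

# Non-smokers: n1 = 50, mean 76, sigma1 = 8; smokers: n2 = 65, mean 68, sigma2 = 9.
low, high = two_sample_z_interval(76, 8, 50, 68, 9, 65, 1.96)
print(f"({low:.2f}, {high:.2f}) years")
```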
Test for µ1 − µ2
1. Hypotheses
We hypothesize that the difference between the population means equals some specified value µ0 and we use data from our samples to test whether this value is reasonable or whether the mean difference is actually greater than µ0, less than µ0, or not equal to µ0.
Null Hypothesis: H0: µ1 − µ2 = µ0
Alternative Hypothesis: Ha: µ1 − µ2 > µ0
Ha: µ1 − µ2 < µ0
Ha: µ1 − µ2 ≠ µ0
In many problems µ0 = 0, and hence the hypotheses are often written as follows:
H0: µ1 − µ2 = 0 H0: µ1 = µ2
Ha: µ1 − µ2 > 0 Ha: µ1 > µ2
Ha: µ1 − µ2 < 0 Ha: µ1 < µ2
Ha: µ1 − µ2 ≠ 0 Ha: µ1 ≠ µ2
Test for µ1 − µ2
Assumptions:
(1) Suppose we have two independent simple random samples.
(2) (i) Either both populations are normally distributed: X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2)
or (ii) The populations are possibly nonnormal but both sample sizes are large enough such that the central limit theorem applies
(3) The population standard deviations σ1 and σ2 are known.
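2. Test Statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
When µ0 = 0 the numerator is simply $\bar{x}_1 - \bar{x}_2$.
3. p-value
The p-value is calculated as we’ve seen in Ch 6
4. Conclusion
Reject H0 if p-value < α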
Example
The U.S. National Center for Health Statistics compiles data on the length of stay by patients in short-term hospitals and publishes its findings in Vital and Health Statistics. Independent samples of 39 male patients and 35 female patients gave sample means of 7.9 and 7.11 days respectively. At the 5% significance level, do the data provide sufficient evidence to conclude that, on the average, the lengths of stay in short-term hospitals by males and females differ? Assume that σ1 = 5.4 and σ2 = 4.6 days.
µ1 = mean length of stay in short-term hospitals by males
µ2 = mean length of stay in short-term hospitals by females
$\bar{x}_1$ = 7.90, n1 = 39, σ1 = 5.4
$\bar{x}_2$ = 7.11, n2 = 35, σ2 = 4.6
(1) H0: µ1 = µ2 (H0: µ1 − µ2 = 0)
Ha: µ1 ≠ µ2 (Ha: µ1 − µ2 ≠ 0)
(2)
$$z = \frac{7.9 - 7.11}{\sqrt{\dfrac{5.4^2}{39} + \dfrac{4.6^2}{35}}} = 0.68$$
(3) P-value = 2P(Z > |z|) = 2P(Z > 0.68) = 2(0.2483) = 0.4966
(4) Conclusion: Since the p-value > .05, we do not reject H0.
There is not sufficient evidence to conclude that the mean lengths of stay in short-term hospitals by males and females differ.
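For the hospital-stay example, both the z statistic and the two-sided p-value can be computed with the standard library, since `statistics.NormalDist` provides the standard normal CDF. The sketch below uses illustrative variable names; its p-value differs slightly from 0.4966 because z is not rounded to 0.68 before the lookup.

```python
import math
from statistics import NormalDist

# Males: n1 = 39, mean 7.9, sigma1 = 5.4; females: n2 = 35, mean 7.11, sigma2 = 4.6
x1, n1, sigma1 = 7.9, 39, 5.4
x2, n2, sigma2 = 7.11, 35, 4.6

se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
z = (x1 - x2) / se                           # H0: mu1 - mu2 = 0

# Two-sided p-value: 2 · P(Z > |z|)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(p_value, 4))
```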
Exercise
The registrar at AU is comparing the GPA of married and unmarried students. He finds that 100 married students have a mean GPA of 2.85, while a random sample of 100 unmarried students has a mean GPA of 2.73. At the 0.01 level of significance, do married students have a higher GPA? Assume that σmarried = 0.4 and σunmarried = 0.3.
Comparing Two Means
Small Samples and σ’s unknown
Suppose we have two independent populations with population means µ1 and µ2, respectively. We are interested in making statistical inferences (confidence interval and significance tests) about the difference in the population means: µ1 − µ2.
Earlier we assumed that the population standard deviations σ1 and σ2 were known. Now we will discuss what to do when the population standard deviations are unknown.
If the population standard deviations are unknown, we calculate the sample standard deviations and use a t distribution instead of Z.
Confidence Interval for µ1 − µ2
When σ1 and σ2 are unknown, the Z statistic that results from the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is not appropriate:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Instead we must use a t-statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
which follows a t distribution with degrees of freedom equal to the smaller of n1 − 1 and n2 − 1 (or approximated using the formula on page 536).
Confidence Interval for µ1 − µ2
If the assumptions are satisfied, then a 100C% confidence interval for µ1 − µ2 can be determined using:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where t* is the value for the t(k) density curve with area C between −t* and t*. The value of the degrees of freedom k is approximated by software or taken as the smaller of n1 − 1 and n2 − 1.
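The slides do not show how software approximates k. A common choice is the Welch-Satterthwaite formula, and the sketch below assumes that is what is meant; this is an assumption, not something stated in the source. The function names and the input values are illustrative only.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom (assumed formula)."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def conservative_df(n1, n2):
    """The simpler rule from the slides: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

# Hypothetical inputs: s1 = 15 with n1 = 12, s2 = 12 with n2 = 10.
k_welch = welch_df(15, 12, 12, 10)
k_simple = conservative_df(12, 10)
print(round(k_welch, 2), k_simple)
```

The approximation always lies between min(n1 − 1, n2 − 1) and n1 + n2 − 2, so the simpler rule gives a wider, more conservative interval.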
Example
In a sampling study conducted by the Clearview National Bank, two independent samples of checking account balances for customers at two Clearview branch banks yielded the following results:
Bank Branch     Number of Checking Accounts   Sample Mean   Sample Standard Deviation
Cherry Grove    12                            $1000         $15
Beechmont       10                            $920          $12
Develop a 90% confidence interval for the difference between the mean checking account balances at the two branch banks.
Example
Parameter of interest: µ1 − µ2
µ1 = average account balance at Cherry Grove branch
µ2 = average account balance at Beechmont branch
n1 = 12, $\bar{x}_1$ = 1000, s1 = 15
n2 = 10, $\bar{x}_2$ = 920, s2 = 12
df = min(12 − 1, 10 − 1) = min(11, 9) = 9
So t* = 1.833
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (1000 - 920) \pm 1.833 \sqrt{\frac{15^2}{12} + \frac{12^2}{10}} = 80 \pm 10.55$$
($69.45, $90.55)
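A short check of the Clearview interval, written as a sketch: `two_sample_t_interval` is a made-up helper name and t* = 1.833 is the table value for df = 9 quoted in the example.

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """(x1 - x2) ± t* · sqrt(s1²/n1 + s2²/n2), the unpooled interval."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    margin = t_star * se
    diff = x1 - x2
    return diff - margin, diff + margin

# Cherry Grove: n1 = 12, mean 1000, s1 = 15; Beechmont: n2 = 10, mean 920, s2 = 12.
df = min(12 - 1, 10 - 1)   # conservative degrees of freedom
low, high = two_sample_t_interval(1000, 15, 12, 920, 12, 10, 1.833)
print(df, f"({low:.2f}, {high:.2f})")
```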
Exercise
In a packing plant, a machine packs cartons with jars. Supposedly, a new machine will pack faster on the average than the machine currently used. The times it takes each machine to pack 10 cartons are recorded. The results in seconds are shown below:
Machine   Sample size   Sample mean   Sample Standard deviation
New       10            42.13         0.685
Present   10            43.23         0.750
Construct a 90% confidence interval for the difference between the mean time it takes the new machine to pack 10 cartons and the mean time it takes the present machine to pack 10 cartons.
Sampling Distribution
When σ1 and σ2 are unknown and assumed equal then the statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
follows a t distribution with degrees of freedom (df) equal to n1 + n2 − 2, where the pooled variance $s_p^2$ is given by
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Confidence Interval of µ1 − µ2: Pooled procedure
The confidence interval with σ’s unknown and equal is
$$(\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where t* is the value for the t(n1 + n2 − 2) density curve with area C between −t* and t*, and
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Example: Specific Motors
Specific Motors of Detroit has developed a new automobile known as the M car. 12 M cars and 8 J cars (from Japan) were road tested to compare miles-per-gallon (mpg) performance. The sample statistics are:
                Sample #1 (M Cars)   Sample #2 (J Cars)
Sample Size     n1 = 12 cars         n2 = 8 cars
Mean            x̄1 = 29.8 mpg        x̄2 = 27.3 mpg
Standard Dev.   s1 = 2.56 mpg        s2 = 1.81 mpg
Construct a 95% confidence interval for the difference in miles-per-gallon (mpg) performance assuming that the distributions of the populations are normal with equal variances.
Let µ1 = mean MPG for the population of M cars
µ2 = mean MPG for the population of J cars
Point estimate of µ1 − µ2 = x̄1 − x̄2 = 29.8 − 27.3 = 2.5 mpg.
We will make the following assumptions:
– The miles per gallon rating must be normally distributed for both the M car and the J car.
– The variance in the miles per gallon rating must be the same for both the M car and the J car.
Using the t distribution with n1 + n2 − 2 = 18 degrees of freedom, the appropriate t value is t* = 2.101.
• 95% Confidence Interval for the Difference Between Two Population Means:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{11(2.56)^2 + 7(1.81)^2}{12 + 8 - 2} = 5.28$$
so $s_p = \sqrt{5.28} = 2.298$, and
$$(\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.5 \pm 2.101 \cdot 2.298 \sqrt{\frac{1}{12} + \frac{1}{8}}$$
= 2.5 ± 2.2, or 0.3 to 4.7 miles per gallon.
We are 95% confident that the difference between the mean mpg ratings of the two car types is from 0.3 to 4.7 mpg (with the M car having the higher mpg).
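The pooled calculation for the Specific Motors example can be sketched in Python as below. The helper names `pooled_variance` and `pooled_t_interval` are illustrative, and t* = 2.101 is the table value for 18 degrees of freedom given in the example.

```python
import math

def pooled_variance(s1, n1, s2, n2):
    """s_p² = ((n1 - 1)s1² + (n2 - 1)s2²) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """(x1 - x2) ± t* · s_p · sqrt(1/n1 + 1/n2), assuming equal variances."""
    sp = math.sqrt(pooled_variance(s1, n1, s2, n2))
    margin = t_star * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# M cars: n1 = 12, mean 29.8, s1 = 2.56; J cars: n2 = 8, mean 27.3, s2 = 1.81.
sp2 = pooled_variance(2.56, 12, 1.81, 8)
low, high = pooled_t_interval(29.8, 2.56, 12, 27.3, 1.81, 8, 2.101)
print(round(sp2, 2), round(low, 1), round(high, 1))
```

Because the interval does not contain 0, it agrees with the statement that the M car has the higher mean mpg.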
Significance Test on µ1 − µ2
Assumptions:
(1) Suppose we have two independent simple random samples.
(2) (i) Either both populations are normally distributed: X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2)
or (ii) The populations are possibly nonnormal but both sample sizes are large enough such that the central limit theorem applies
(3) The population standard deviations σ1 and σ2 are unknown.
1. Hypotheses
Suppose the population standard deviations σ1 and σ2 are unknown.
We hypothesize that the difference between the population means equals some specified value µ0 (= 0 in most cases) and want to test whether this value is reasonable or whether the mean difference is actually greater than µ0, less than µ0, or not equal to µ0.
Null Hypothesis: H0: µ1 − µ2 = µ0 (H0: µ1 = µ2)
Alternative Hypothesis: Ha: µ1 − µ2 > µ0 (Ha: µ1 > µ2)
Ha: µ1 − µ2 < µ0 (Ha: µ1 < µ2)
Ha: µ1 − µ2 ≠ µ0 (Ha: µ1 ≠ µ2)
The forms in parentheses apply when µ0 = 0.
2. Test Statistic
If the assumptions are satisfied then the test statistic is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
which follows a t distribution with df = min(n1 − 1, n2 − 1). When µ0 = 0 the numerator is simply $\bar{x}_1 - \bar{x}_2$.
3. P-value
The p-value of the test depends on the alternative hypothesis and is based on the t distribution.
4. Conclusion
The decision and conclusion are determined the same as for all other tests.
Significance Test on µ1 − µ2: Pooled Procedure
If the population standard deviations σ1 and σ2 are unknown and assumed equal (σ1 = σ2), then the test statistic will be
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
which follows a t distribution with df = n1 + n2 − 2, where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Example: Faculty Salaries
Independent random samples of 25 faculty members in public institutions and 30 faculty members in private institutions yielded the statistics in the following table
           Sample size   Sample Mean   Standard deviation
Public     25            57.5          20
Private    30            66.4          18
At the 5% significance level, do the data provide sufficient evidence to conclude that mean salaries for faculty in public and private institutions differ?
Example
µ1 = mean salaries for faculty in public institutions
µ2 = mean salaries for faculty in private institutions
(1) H0: µ1 = µ2 (H0: µ1 − µ2 = 0)
Ha: µ1 ≠ µ2 (Ha: µ1 − µ2 ≠ 0)
(2) n1 = 25, $\bar{x}_1$ = 57.5, s1 = 20; n2 = 30, $\bar{x}_2$ = 66.4, s2 = 18
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{57.5 - 66.4}{\sqrt{\dfrac{20^2}{25} + \dfrac{18^2}{30}}} = -1.719$$
df = min(25 − 1, 30 − 1) = min(24, 29) = 24
(3) P-value = 2 P(T24 > |-1.719|) = 2 P(T24 > 1.719)
0.025 < P(T24 > 1.719) < 0.05
0.05 < P-value < 0.1 Exact=0.0985
(4) Conclusion: Since the p-value > .05, we do not reject H0.
There is not sufficient evidence to conclude that mean salaries for faculty in public and private institutions differ.
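The unpooled test statistic for the faculty salaries example can be verified with the sketch below. The function name `two_sample_t` is illustrative; the p-value itself still has to be bracketed from the t table or obtained from software.

```python
import math

def two_sample_t(x1, s1, n1, x2, s2, n2, mu0=0.0):
    """Unpooled two-sample t statistic and the conservative df = min(n1-1, n2-1)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = ((x1 - x2) - mu0) / se
    df = min(n1 - 1, n2 - 1)
    return t, df

# Public: n1 = 25, mean 57.5, s1 = 20; private: n2 = 30, mean 66.4, s2 = 18.
t, df = two_sample_t(57.5, 20, 25, 66.4, 18, 30)
print(round(t, 3), df)
```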
Example
A researcher was interested in comparing the amount of time spent watching television by women and by men. Independent samples of 14 women and 17 men were selected and each person was asked how many hours he or she had watched television during the previous week. The summary statistics are as follows
         Sample size   Sample Mean   Standard deviation
Women    14            11.3          4.4
Men      17            16.9          4.7
Do the data provide sufficient evidence to conclude that mean time for women is less than mean time for men? Perform a pooled t-test at the 5% significance level.
Example
µ1 = mean time spent watching television by women
µ2 = mean time spent watching television by men
(1) H0: µ1 = µ2 (H0: µ1 − µ2 = 0)
Ha: µ1 < µ2 (Ha: µ1 − µ2 < 0)
(2) n1 = 14, $\bar{x}_1$ = 11.3, s1 = 4.4; n2 = 17, $\bar{x}_2$ = 16.9, s2 = 4.7
$$s_p^2 = \frac{(14 - 1)(4.4)^2 + (17 - 1)(4.7)^2}{14 + 17 - 2} = 20.8662 \Rightarrow s_p = 4.568$$
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{11.3 - 16.9}{4.568 \sqrt{\dfrac{1}{14} + \dfrac{1}{17}}} = -3.3968$$
df = 14 + 17 − 2 = 29
(3) P-value = P(T29 < -3.3968) = P(T29 > 3.3968)
0.0005 < P-value < 0.001
(4) Conclusion: Since the p-value < .05, we reject H0.
There is sufficient evidence to conclude that mean time for women is less than mean time for men (on average, women spend less time watching television than men).
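The pooled test for the television example follows the same two steps, pooled variance and then the t statistic. The sketch below uses a hypothetical helper `pooled_t` and reproduces the intermediate values from the summary statistics.

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled two-sample t statistic; returns (s_p², t, df)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (x1 - x2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return sp2, t, df

# Women: n1 = 14, mean 11.3, s1 = 4.4; men: n2 = 17, mean 16.9, s2 = 4.7.
sp2, t, df = pooled_t(11.3, 4.4, 14, 16.9, 4.7, 17)
print(round(sp2, 4), round(math.sqrt(sp2), 3), round(t, 4), df)
```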
Exercises
1. A random sample of 17 third graders who read poorly has a mean IQ of 98 with a standard deviation of 10; a random sample of 10 third graders who read well has a mean IQ of 101 with a standard deviation of 9. At the 0.05 level of significance, is the mean IQ of good readers higher than the mean IQ of poor readers? Assume that the IQ scores for both groups are normally distributed with equal variances.
2. A high school teachers’ group is investigating summer work patterns. It finds that the mean monthly income of 20 randomly selected teachers who teach in the summer is $600 with a standard deviation of $100, while the mean monthly income of a random sample of 10 teachers who sell real estate during the summer is $700 with a standard deviation of $50. The teachers’ group believes the pay for both kinds is normally distributed with different variances. At the 0.05 level of significance, is there any difference in the earnings of the two groups?
3. Recently, a local newspaper reported that part-time students are older than full-time students. In order to test the validity of its statement, two independent samples of students were selected. The following shows the ages of the students in the two samples. Using the following data, test to determine whether or not the average age of part-time students is significantly more than that of full-time students. Use an alpha of 0.05. Assume the populations are normally distributed and have equal variances.
Full-time   19 18 17 22 18 19 20
Part-time   21 17 25 19 20 18