nikhilsingh
Sampling theory in maths (notes)
Sampling Theory
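F-test (test for equality of population variances using the F-distribution)

Suppose we want to test whether two independent samples $x_i\ (i = 1, 2, \dots, n_1)$ and $y_j\ (j = 1, 2, \dots, n_2)$ have been drawn from normal populations with the same variance $\sigma^2$. Let

$$F = \frac{S_1^2}{S_2^2},$$

where

$$S_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x})^2 \quad\text{and}\quad S_2^2 = \frac{1}{n_2 - 1}\sum_{j=1}^{n_2}(y_j - \bar{y})^2.$$

The test statistic $F$ defined above follows the F-distribution with $(n_1 - 1,\ n_2 - 1)$ degrees of freedom, with the greater of $S_1^2$ and $S_2^2$ always taken in the numerator of $F$.

Note: The table value for $(\nu_1, \nu_2)$ degrees of freedom is different from the value for $(\nu_2, \nu_1)$ degrees of freedom.

Working procedure
1. Set the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$.
2. Compute the test statistic $F = S_1^2 / S_2^2$.
3. Compare the calculated value of $F$ for the two given samples with the tabulated value of $F$ for $(n_1 - 1,\ n_2 - 1)$ degrees of freedom at the required level of significance.

Problems
1. Test the equality of standard deviations for the data given below at 5% level of significance: [data table missing in source]

Soln: Let $F_{\text{cal}}$ denote the calculated value of $F$ and $F_{\text{tab}}$ the tabulated value of $F$ for $(n_1 - 1,\ n_2 - 1)$ degrees of freedom at 5% level of significance.
As $F_{\text{cal}} < F_{\text{tab}}$, $H_0$ is accepted. Therefore, the difference is not significant.

Note: The numerator value (2.5) is more than the denominator value (1.55), and hence $F$ was computed for $(n_1 - 1,\ n_2 - 1)$ degrees of freedom. Otherwise, we have to take the test statistic as $F = S_2^2 / S_1^2$, and then $F_{\text{tab}}$ has to be determined for $(n_2 - 1,\ n_1 - 1)$ degrees of freedom.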
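2. In one sample of 8 observations, the sum of the squares of deviations of the sample values from the sample mean was 84.4, and in another sample of 10 observations it was 102.6. Test whether the difference in variance is significant at 5% level using the F-test.

Soln: Given $n_1 = 8$, $\sum (x_i - \bar{x})^2 = 84.4$, $n_2 = 10$, $\sum (y_j - \bar{y})^2 = 102.6$.

$$S_1^2 = \frac{84.4}{7} = 12.057, \qquad S_2^2 = \frac{102.6}{9} = 11.4.$$

Under the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$,

$$F_{\text{cal}} = \frac{S_1^2}{S_2^2} = \frac{12.057}{11.4} = 1.058.$$

The table value for (7, 9) degrees of freedom (d.f.) at 5% level of significance is $F_{\text{tab}} = 3.29$.
As $F_{\text{cal}} < F_{\text{tab}}$, $H_0$ is accepted. Therefore, the difference is not significant at 5% level of significance.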
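The arithmetic in Problem 2 can be checked with a short Python sketch. The function name `f_statistic` is illustrative and not part of the notes; it divides each sum of squared deviations by $n - 1$ and places the larger variance in the numerator, as the working procedure requires. The table value still has to be looked up separately.

```python
def f_statistic(ss1, n1, ss2, n2):
    """Return (F, df_num, df_den) from two sums of squared deviations.

    The larger unbiased variance estimate goes in the numerator,
    so the returned F is always >= 1.
    """
    var1 = ss1 / (n1 - 1)  # S1^2
    var2 = ss2 / (n2 - 1)  # S2^2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


# Problem 2: sums of squared deviations 84.4 (n = 8) and 102.6 (n = 10)
f_value, df_num, df_den = f_statistic(84.4, 8, 102.6, 10)
print(round(f_value, 3), (df_num, df_den))
```

The degrees of freedom are returned together with $F$ so that the correct table entry, $(7, 9)$ here, is looked up even when the two samples are swapped.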
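3. Two random samples drawn from 2 normal populations are given below. Examine whether the samples have been drawn from normal populations having the same variance.

| Sample A | 28 | 30 | 32 | 33 | 31 | 29 | 34 |
|---|---|---|---|---|---|---|---|
| Sample B | 29 | 30 | 30 | 24 | 27 | 28 | - |

Soln: Given $n_1 = 7$, $\bar{x} = 217/7 = 31$, $\sum (x_i - \bar{x})^2 = 28$, so $S_1^2 = 28/6 = 4.667$.
Similarly, $n_2 = 6$, $\bar{y} = 168/6 = 28$, $\sum (y_j - \bar{y})^2 = 26$, so $S_2^2 = 26/5 = 5.2$.

If we take $F = S_1^2 / S_2^2$, then the numerator value is less than the denominator, and hence we have to take

$$F_{\text{cal}} = \frac{S_2^2}{S_1^2} = \frac{5.2}{4.667} = 1.114.$$

The table value of $F$ at 5% level of significance for $(n_2 - 1,\ n_1 - 1) = (5, 6)$ degrees of freedom is $F_{\text{tab}} = 4.39$.
As $F_{\text{cal}} < F_{\text{tab}}$, $H_0$ is accepted.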
Hence the two samples could have been drawn from populations having the same variance.

4. The daily wages in rupees of skilled workers in two cities are as follows.

| City | Size of sample of workers | S.D. of wages in the sample |
|---|---|---|
| City A | 16 | 25 |
| City B | 13 | 32 |
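Test at 5% level the equality of variances of the wage distribution in the two cities.

Soln: Given $n_1 = 16$, $s_1 = 25$, $n_2 = 13$, $s_2 = 32$. Taking the sample S.D. to be computed with divisor $n$,

$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{16 \times 625}{15} = 666.67, \qquad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{13 \times 1024}{12} = 1109.33.$$

Since $S_2^2 > S_1^2$, we have to take $F$ as

$$F_{\text{cal}} = \frac{S_2^2}{S_1^2} = \frac{1109.33}{666.67} = 1.664.$$

The table value for $(n_2 - 1,\ n_1 - 1) = (12, 15)$ degrees of freedom at 5% level of significance is $F_{\text{tab}} = 2.48$.
As $F_{\text{cal}} < F_{\text{tab}}$, $H_0$ is accepted.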
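Problems 3 and 4 start from different inputs: raw observations in one case, and sample sizes with standard deviations in the other. The following sketch handles both. The helper names are illustrative, and for Problem 4 it assumes the quoted S.D. was computed with divisor $n$, so that $S^2 = n s^2 / (n - 1)$.

```python
from statistics import variance  # sample variance with divisor n - 1


def f_from_variances(var1, n1, var2, n2):
    """Return (F, (df_num, df_den)) with the larger variance in the numerator."""
    if var1 >= var2:
        return var1 / var2, (n1 - 1, n2 - 1)
    return var2 / var1, (n2 - 1, n1 - 1)


def unbiased_variance_from_sd(sd, n):
    """Convert an S.D. computed with divisor n (assumption) to S^2 with divisor n - 1."""
    return n * sd ** 2 / (n - 1)


# Problem 3: raw samples
sample_a = [28, 30, 32, 33, 31, 29, 34]
sample_b = [29, 30, 30, 24, 27, 28]
f3, df3 = f_from_variances(variance(sample_a), len(sample_a),
                           variance(sample_b), len(sample_b))

# Problem 4: sample sizes and standard deviations
f4, df4 = f_from_variances(unbiased_variance_from_sd(25, 16), 16,
                           unbiased_variance_from_sd(32, 13), 13)

print(round(f3, 3), df3)
print(round(f4, 3), df4)
```

In both problems the second sample has the larger variance, so the degrees of freedom come out reversed, $(5, 6)$ and $(12, 15)$, and the table must be read in that order.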
Similar problems for practice
1. For two samples of sizes 8 and 12, the observed variances are 0.064 and 0.024. Test the hypothesis that the samples came from normal populations with equal variances.
2. In a sample of 8 observations the sum of the squared deviations of items from the mean was 94.5. In another sample of 10 observations the value was found to be 101.7. Test whether the difference is significant.
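3. Two random samples drawn from two normal populations are

| A | 63 | 65 | 68 | 69 | 71 | 72 | - | - | - | - |
|---|---|---|---|---|---|---|---|---|---|---|
| B | 63 | 62 | 65 | 66 | 69 | 69 | 70 | 71 | 72 | 73 |

Test whether the two populations have the same variance.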
Examples on Fitting theoretical distribution to a given collection of observed data:
Example (1): Suppose five unbiased coins are tossed and the number of heads is noted. The experiment is repeated 64 times and the following distribution is obtained.

| No. of heads | 0 | 1 | 2 | 3 | 4 | 5 | Total |
|---|---|---|---|---|---|---|---|
| Frequencies | 3 | 6 | 24 | 26 | 4 | 1 | N = 64 |
Let us try to fit a binomial distribution to this data.
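Here $n = 5$. As the coins are unbiased, we have $p = q = \tfrac{1}{2}$, so that

$$P(x) = \binom{5}{x}\left(\tfrac{1}{2}\right)^{x}\left(\tfrac{1}{2}\right)^{5-x} = \frac{1}{32}\binom{5}{x}, \qquad x = 0, 1, \dots, 5.$$

| $x$ | $P(x)$ | Expected frequencies $64\,P(x)$ | Observed frequencies |
|---|---|---|---|
| 0 | 1/32 | 2 | 3 |
| 1 | 5/32 | 10 | 6 |
| 2 | 10/32 | 20 | 24 |
| 3 | 10/32 | 20 | 26 |
| 4 | 5/32 | 10 | 4 |
| 5 | 1/32 | 2 | 1 |
| Total | 1 | 64 | 64 |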
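As a check on Example (1), the expected binomial frequencies $N \binom{n}{x} p^x q^{n-x}$ can be generated directly. This is a minimal sketch; the function name `binomial_expected` is illustrative.

```python
from math import comb


def binomial_expected(n_trials, p, total):
    """Expected frequencies total * C(n, x) * p^x * (1 - p)^(n - x) for x = 0..n."""
    return [total * comb(n_trials, x) * p ** x * (1 - p) ** (n_trials - x)
            for x in range(n_trials + 1)]


# Example (1): five unbiased coins, experiment repeated 64 times
observed = [3, 6, 24, 26, 4, 1]
expected = binomial_expected(5, 0.5, sum(observed))

for heads, (obs, exp_freq) in enumerate(zip(observed, expected)):
    print(heads, obs, exp_freq)
```

For these inputs the expected frequencies are 2, 10, 20, 20, 10 and 2, which add up to 64.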
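Example (2): Let us fit a Poisson distribution to the following data.

| $x$ | 0 | 1 | 2 | 3 | 4 | Total |
|---|---|---|---|---|---|---|
| $f$ | 123 | 59 | 14 | 3 | 1 | 200 |

Mean: $\lambda = \bar{x} = \dfrac{\sum f x}{N} = \dfrac{0 + 59 + 28 + 9 + 4}{200} = \dfrac{100}{200} = 0.5$, so that $P(x) = \dfrac{e^{-0.5}(0.5)^x}{x!}$.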
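| $x$ | Expected frequencies $200\,P(x)$ | Observed frequencies |
|---|---|---|
| 0 | 121.31 | 123 |
| 1 | 60.65 | 59 |
| 2 | 15.16 | 14 |
| 3 | 2.53 | 3 |
| 4 | 0.32 | 1 |
| Total | $\approx N$ | N = 200 |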
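The Poisson fit in Example (2) follows the same pattern: estimate $\lambda$ by the sample mean, then multiply $P(x) = e^{-\lambda}\lambda^x / x!$ by $N$. A minimal sketch, with an illustrative function name:

```python
from math import exp, factorial


def poisson_fit(values, freqs):
    """Return (mean, expected frequencies) for a Poisson fit to grouped data."""
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / total  # estimate of lambda
    expected = [total * exp(-mean) * mean ** x / factorial(x) for x in values]
    return mean, expected


# Example (2): x = 0..4 with frequencies 123, 59, 14, 3, 1
mean, expected = poisson_fit([0, 1, 2, 3, 4], [123, 59, 14, 3, 1])
print(mean)
print([round(e, 2) for e in expected])
```

Because the Poisson distribution has no upper limit, the expected frequencies over $x = 0, \dots, 4$ add up to slightly less than 200.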
$\chi^2$-test to test the goodness of fit

Let $O_i\ (i = 1, 2, \dots, n)$ be the observed frequencies and $E_i\ (i = 1, 2, \dots, n)$ be the corresponding expected frequencies such that $\sum O_i = \sum E_i = N$, where $N$ is the number of members in the population.

Suppose we intend to test the null hypothesis
$H_0$: The theoretical frequency distribution is a good fit to the observed frequency distribution
against the alternative hypothesis
$H_1$: The theoretical frequency distribution is not a good fit to the observed frequency distribution.

To test $H_0$ against $H_1$, the chi-square test of goodness of fit is applied. Here, the test statistic is

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}.$$

Under $H_0$ this is a chi-square variate with $(n - c)$ degrees of freedom, where $n$ is the number of terms in the $\chi^2$ sum (after pooling the frequencies which are less than 5 with the adjacent ones; refer to the examples) and $c$ is the number of constraints. The theoretical frequencies are computed such that $\sum E_i = \sum O_i$. This is one constraint. Apart from this, if any parameter is estimated from the observed distribution, every such estimation is a further constraint. Thus, the value of $c$ is one more than the number of parameters estimated from the observed distribution.

Note (1): The $\chi^2$-test is one-tailed (right-tailed), i.e., if $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$ then $H_0$ is accepted; otherwise $H_0$ is rejected.

Note (2): The chi-square test of goodness of fit is applicable subject to the following conditions.
1. The observations should be independent.
2. The total frequency $N$ should be large.
3. The theoretical frequencies should be 5 or more. If any $E_i$ is less than 5, it should be pooled with the adjacent frequency.
4. If any parameter is estimated from the observed distribution, one degree of freedom should be subtracted for every such estimation.

Problems
1. The following data relate to the number of mistakes on each page of a book containing 180 pages.

| No. of mistakes per page | 0 | 1 | 2 | 3 | 4 | 5 or more | Total |
|---|---|---|---|---|---|---|---|
| No. of pages | 130 | 32 | 15 | 2 | 1 | 0 | 180 |
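Test whether the Poisson distribution is a good fit to this observed distribution.

Soln:
$H_0$: Poisson distribution is a good fit to the observed distribution.
The alternative hypothesis is $H_1$: Poisson distribution is not a good fit to the observed distribution.

To test $H_0$, we fit a Poisson distribution to the data. Here the parameter $\lambda$ is estimated by finding the mean from the observed distribution:

$$\lambda = \bar{x} = \frac{\sum f x}{N} = \frac{0 + 32 + 30 + 6 + 4}{180} = \frac{72}{180} = 0.4, \qquad P(x) = \frac{e^{-0.4}(0.4)^x}{x!}.$$

| $x$ | $P(x)$ | $E_i = 180\,P(x)$ (rounded) | $O_i$ |
|---|---|---|---|
| 0 | 0.6703 | 121 | 130 |
| 1 | 0.2681 | 48 | 32 |
| 2 | 0.0536 | 10 | 15 |
| 3 | 0.0072 | 1 | 2 |
| 4 | 0.0007 | 0 | 1 |
| 5 or more | 0.0001 | 0 | 0 |
| Total | 1 | 180 | 180 |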
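Here, the last three theoretical (expected) frequencies are less than 5. Therefore, they are pooled with the adjacent ones such that, finally, all the frequencies are 5 or more. After pooling, we have

| $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|
| 130 | 121 | 81 | 0.6694 |
| 32 | 48 | 256 | 5.3333 |
| 18 | 11 | 49 | 4.4545 |
| Total | | | 10.4572 |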
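The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 10.4572$.
Here $n = 3$ (after pooling) and $c = 2$ (1 for $\sum E_i = \sum O_i$, which is common in all cases, and 1 for estimating the parameter $\lambda$ from the observed distribution).
Thus, $\chi^2$ is a chi-square variate with $n - c = 3 - 2 = 1$ degree of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 1 d.f. at 5% level of significance is $\chi^2_{\text{tab}} = 3.84$.
As $\chi^2_{\text{cal}} > \chi^2_{\text{tab}}$, $H_0$ is rejected.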
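Conclusion: Poisson distribution is not a good fit to the observed distribution.

2. To an observed frequency distribution, a binomial distribution is fitted after estimating $p$ from the observed data. The observed and theoretical frequencies are given below.

| $x$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Tot |
|---|---|---|---|---|---|---|---|---|---|
| $O_i$ | 3 | 3 | 17 | 31 | 28 | 11 | 1 | 2 | 96 |
| $E_i$ | 1 | 7 | 19 | 27 | 24 | 13 | 4 | 1 | 96 |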
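Test whether the binomial distribution is a good fit.

Soln:
$H_0$: Binomial distribution is a good fit.

| $x$ | $O_i$ | $E_i$ | Pooled $O_i$ | Pooled $E_i$ |
|---|---|---|---|---|
| 0 | 3 | 1 | 6 | 8 |
| 1 | 3 | 7 | | |
| 2 | 17 | 19 | 17 | 19 |
| 3 | 31 | 27 | 31 | 27 |
| 4 | 28 | 24 | 28 | 24 |
| 5 | 11 | 13 | 11 | 13 |
| 6 | 1 | 4 | 3 | 5 |
| 7 | 2 | 1 | | |
| Tot | 96 | 96 | 96 | 96 |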
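The frequencies are pooled in such a way that none of the theoretical frequencies is less than 5. However, observed frequencies may be less than 5.
After pooling, we have

| $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|
| 6 | 8 | 4 | 0.5 |
| 17 | 19 | 4 | 0.2105 |
| 31 | 27 | 16 | 0.5926 |
| 28 | 24 | 16 | 0.6667 |
| 11 | 13 | 4 | 0.3077 |
| 3 | 5 | 4 | 0.8 |
| Total | | | 3.0775 |

The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 3.0775$.
Here $n = 6$ (after pooling) and $c = 2$ (1 for $\sum E_i = \sum O_i$, which is common in all cases, and 1 for estimating the parameter $p$ from the observed distribution).
Thus, $\chi^2$ is a chi-square variate with $n - c = 6 - 2 = 4$ degrees of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 4 d.f. at 5% level of significance is $\chi^2_{\text{tab}} = 9.49$.
As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.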
Conclusion: Binomial distribution is a good fit.
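The pooling step used in Problems 1 and 2 can be sketched as follows. This is one possible rule, not necessarily the one intended by the notes: a class whose expected frequency is below 5 is merged into its right-hand neighbour, and a small final class is merged into its left-hand neighbour. The function names are illustrative. With the Problem 2 data, the last pooled class contributes $(3 - 5)^2 / 5 = 0.8$ to the statistic.

```python
def pool_small_classes(observed, expected, minimum=5):
    """Merge adjacent classes until every expected frequency is >= minimum."""
    obs, exp_freq = list(observed), list(expected)
    i = 0
    # Forward pass: merge a small class into the class to its right.
    while i < len(exp_freq) - 1:
        if exp_freq[i] < minimum:
            exp_freq[i + 1] += exp_freq[i]
            obs[i + 1] += obs[i]
            del exp_freq[i], obs[i]
        else:
            i += 1
    # A small final class has no right neighbour, so merge it to the left.
    while len(exp_freq) > 1 and exp_freq[-1] < minimum:
        exp_freq[-2] += exp_freq[-1]
        obs[-2] += obs[-1]
        del exp_freq[-1], obs[-1]
    return obs, exp_freq


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Problem 2: observed and fitted binomial frequencies for x = 0..7
pooled_obs, pooled_exp = pool_small_classes([3, 3, 17, 31, 28, 11, 1, 2],
                                            [1, 7, 19, 27, 24, 13, 4, 1])
statistic = chi_square(pooled_obs, pooled_exp)
dof = len(pooled_exp) - 2  # one constraint for the total, one for estimating p

print(pooled_obs, pooled_exp)
print(round(statistic, 4), dof)
```

The same function reproduces the pooling in Problem 1: expected frequencies 121, 48, 10, 1, 0, 0 collapse to 121, 48, 11, with observed frequencies 130, 32, 18.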
3. 10000 digits are randomly chosen from a telephone directory and the following data are obtained.

| Digit | Frequency |
|---|---|
| 0 | 926 |
| 1 | 1207 |
| 2 | 1097 |
| 3 | 1066 |
| 4 | 1275 |
| 5 | 833 |
| 6 | 1007 |
| 7 | 872 |
| 8 | 864 |
| 9 | 853 |
| Total | 10000 |
Test whether there is equi-distribution in the telephone directory at 1% level of significance.

Soln:
$H_0$: The digits are equi-distributed in the telephone directory.
The expected (theoretical) frequencies corresponding to each of the digits should be equal, i.e., $E_i = 10000/10 = 1000$.
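| $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|
| 926 | 1000 | 5476 | 5.476 |
| 1207 | 1000 | 42849 | 42.849 |
| 1097 | 1000 | 9409 | 9.409 |
| 1066 | 1000 | 4356 | 4.356 |
| 1275 | 1000 | 75625 | 75.625 |
| 833 | 1000 | 27889 | 27.889 |
| 1007 | 1000 | 49 | 0.049 |
| 872 | 1000 | 16384 | 16.384 |
| 864 | 1000 | 18496 | 18.496 |
| 853 | 1000 | 21609 | 21.609 |
| Total | | | 222.142 |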
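The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 222.142$.
Here $n = 10$ and $c = 1$ (1 for $\sum E_i = \sum O_i$).
Thus, $\chi^2$ is a chi-square variate with $n - c = 10 - 1 = 9$ degrees of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 9 d.f. at 1% level of significance is $\chi^2_{\text{tab}} = 21.67$.
As $\chi^2_{\text{cal}} > \chi^2_{\text{tab}}$, $H_0$ is rejected.
Conclusion: In the telephone directory, the digits are not equi-distributed.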
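When all classes are expected to be equally frequent, as in the digits problem, the expected frequency is just $N$ divided by the number of classes. A minimal sketch of the computation, with illustrative variable names:

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Problem 3: frequencies of the digits 0..9 among 10000 digits
digit_counts = [926, 1207, 1097, 1066, 1275, 833, 1007, 872, 864, 853]
n_total = sum(digit_counts)
expected = [n_total / len(digit_counts)] * len(digit_counts)  # 1000 each

statistic = chi_square(digit_counts, expected)
dof = len(digit_counts) - 1  # only the total is constrained

print(round(statistic, 3), dof)
```

No parameter is estimated from the data here, so only one degree of freedom is lost.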
4. According to a theory in genetics, the proportions of beans of four types A, B, C and D in a generation should be 9:3:3:1. In an experiment, among 1600 beans, the frequencies of beans of the four types were 882, 313, 287 and 118 respectively. Does the result support the theory?
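Soln:
$H_0$: The result of the experiment supports the theory.
Under $H_0$, the expected frequencies should be in the ratio 9:3:3:1, i.e., $1600 \times \tfrac{9}{16} = 900$, $1600 \times \tfrac{3}{16} = 300$, $300$ and $1600 \times \tfrac{1}{16} = 100$.

| $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|
| 882 | 900 | 324 | 0.36 |
| 313 | 300 | 169 | 0.56 |
| 287 | 300 | 169 | 0.56 |
| 118 | 100 | 324 | 3.24 |
| Total | | | 4.72 |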
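The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 4.72$.
Here $n = 4$ and $c = 1$ (1 for $\sum E_i = \sum O_i$).
Thus, $\chi^2$ is a chi-square variate with $n - c = 4 - 1 = 3$ degrees of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 3 d.f. at 5% level of significance is $\chi^2_{\text{tab}} = 7.81$.
As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.
Conclusion: The result of the experiment supports the theory.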
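Expected frequencies from a theoretical ratio such as 9:3:3:1 are obtained by splitting the total in that proportion. The sketch below uses an illustrative helper, `expected_from_ratio`. Without rounding each term to two decimals, the statistic comes to about 4.73 rather than 4.72, which does not affect the conclusion.

```python
def expected_from_ratio(ratio, total):
    """Split `total` in the proportions given by `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


# Problem 4: beans of types A, B, C, D, expected in the ratio 9:3:3:1
observed = [882, 313, 287, 118]
expected = expected_from_ratio([9, 3, 3, 1], sum(observed))

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

print(expected)
print(round(statistic, 4), dof)
```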
5. In order to test whether a die is biased, it is thrown 72 times and the results are tabulated as follows:

| Result of throw | 1 | 2 | 3 | 4 | 5 | 6 | Tot |
|---|---|---|---|---|---|---|---|
| Number of throws | 8 | 14 | 15 | 9 | 13 | 13 | 72 |

What is your conclusion?
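Soln:
$H_0$: The die is unbiased.
Under $H_0$, all the sides of the die are equiprobable. Therefore, their frequencies should be equal. So, the theoretical frequencies are $E_i = 72/6 = 12$ each.

| Result of throw | $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|---|
| 1 | 8 | 12 | 16 | 1.3333 |
| 2 | 14 | 12 | 4 | 0.3333 |
| 3 | 15 | 12 | 9 | 0.75 |
| 4 | 9 | 12 | 9 | 0.75 |
| 5 | 13 | 12 | 1 | 0.0833 |
| 6 | 13 | 12 | 1 | 0.0833 |
| Total | | | | 3.33 |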
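The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 3.33$.
Here $n = 6$ and $c = 1$ (1 for $\sum E_i = \sum O_i$).
Thus, $\chi^2$ is a chi-square variate with $n - c = 6 - 1 = 5$ degrees of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 5 d.f. at 5% level of significance is $\chi^2_{\text{tab}} = 11.07$.
As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.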
Conclusion: The die is unbiased.
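Instead of reading a critical value from the table, one can compute the right-tail probability $P(\chi^2_\nu > x)$ and compare it with the level of significance. This is an alternative to the table lookup used in these notes, not part of them. The sketch below uses only the standard library and relies on the standard recurrence for the regularized upper incomplete gamma function, $Q(a + 1, y) = Q(a, y) + y^a e^{-y} / \Gamma(a + 1)$, starting from $Q(1, y) = e^{-y}$ or $Q(\tfrac{1}{2}, y) = \operatorname{erfc}(\sqrt{y})$. The function name `chi_square_sf` is illustrative.

```python
from math import erfc, exp, gamma, sqrt


def chi_square_sf(x, dof):
    """Right-tail probability P(chi2 > x) for a positive integer dof."""
    y = x / 2.0
    if dof % 2 == 0:
        a, q = 1.0, exp(-y)          # Q(1, y)
    else:
        a, q = 0.5, erfc(sqrt(y))    # Q(1/2, y)
    # Step a up to dof / 2 using Q(a + 1, y) = Q(a, y) + y^a e^(-y) / Gamma(a + 1).
    while a < dof / 2.0:
        q += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q


# Problem 5: die thrown 72 times, expected frequency 12 per face
observed = [8, 14, 15, 9, 13, 13]
expected = [12] * 6
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_sf(statistic, dof=5)

print(round(statistic, 4), p_value > 0.05)
```

A tail probability above 0.05 corresponds to $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$ at the 5% level, so $H_0$ is accepted in the sense used above.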
6. A survey of 64 families with 3 children each is conducted and the number of male children in each family is noted. The results are tabulated as follows:

| Male children | 0 | 1 | 2 | 3 | Total |
|---|---|---|---|---|---|
| Families | 6 | 19 | 29 | 10 | 64 |

Apply the chi-square test of goodness of fit to test whether male and female children are equiprobable.
Soln:
$H_0$: Male and female children are equiprobable. (Probability of a male child is 0.5.)
Under $H_0$, a binomial distribution can be fitted to the given data ($m = 3$ and $p = 0.5$).
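| $x$ | $P(x) = \binom{3}{x}\left(\tfrac{1}{2}\right)^3$ | Expected frequencies $64\,P(x)$ |
|---|---|---|
| 0 | 1/8 | 8 |
| 1 | 3/8 | 24 |
| 2 | 3/8 | 24 |
| 3 | 1/8 | 8 |
| Total | 1 | 64 |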
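| $O_i$ | $E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
|---|---|---|---|
| 6 | 8 | 4 | 0.5 |
| 19 | 24 | 25 | 1.042 |
| 29 | 24 | 25 | 1.042 |
| 10 | 8 | 4 | 0.5 |
| Total | | | 3.084 |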
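The test statistic is $\chi^2_{\text{cal}} = \sum \dfrac{(O_i - E_i)^2}{E_i} = 3.084$.
Here $n = 4$ and $c = 1$ (1 for $\sum E_i = \sum O_i$; note that neither $m$ nor $p$ is estimated).
Thus, $\chi^2$ is a chi-square variate with $n - c = 4 - 1 = 3$ degrees of freedom.
Now, from the chi-square distribution table, the value of $\chi^2$ for 3 d.f. at 5% level of significance is $\chi^2_{\text{tab}} = 7.81$.
As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.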
Conclusion: Male and female children are equiprobable.
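Problem 6 differs from the earlier binomial problem in that $p$ is specified by the hypothesis rather than estimated, which changes the degrees of freedom. The sketch below makes that bookkeeping explicit through an `estimated_params` argument. The function name and argument are illustrative, and the sketch does no pooling, so it assumes every expected frequency is already 5 or more.

```python
from math import comb


def binomial_goodness_of_fit(observed, p, estimated_params=0):
    """Return (expected, chi-square statistic, degrees of freedom).

    `observed` holds the frequencies for x = 0..m successes. No pooling is
    done, so every expected frequency is assumed to be at least 5.
    """
    m = len(observed) - 1
    total = sum(observed)
    expected = [total * comb(m, x) * p ** x * (1 - p) ** (m - x)
                for x in range(m + 1)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1 - estimated_params  # n - c
    return expected, statistic, dof


# Problem 6: 64 families with 3 children, p = 0.5 fixed by the hypothesis
expected, statistic, dof = binomial_goodness_of_fit([6, 19, 29, 10], 0.5)
print(expected, round(statistic, 4), dof)
```

Had $p$ been estimated from the data, passing `estimated_params=1` would reduce the degrees of freedom from 3 to 2.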
Similar problems for practice
1. Among 64 offspring of a certain cross between guinea pigs, 34 were red, 10 were black and 20 were white. According to the genetic model these numbers should be in the ratio 9:3:4. Are the data consistent with the model at 5% level?
Hint: $H_0$: Data are consistent with the model. $E_i$: 36, 12, 16; $\chi^2_{\text{cal}} = 1.444$; $\chi^2_{\text{tab}} = 5.99$ at 2 d.f. As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.
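2. The following table gives the number of train accidents in a country that occurred during the various days of the week. Find whether the accidents are uniformly distributed over the week. [data table missing in source]
Hint: $H_0$: Accidents are uniformly distributed over the week. If the accidents are to be uniformly distributed, it is expected that $N/7$ accidents happen per day, where $N$ is the total number of accidents, i.e., $E_i = N/7$ for all the days of the week. The number of degrees of freedom is $7 - 1 = 6$. As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.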
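3. Five coins are tossed 320 times. The number of heads observed is given below. Examine whether the coin is unbiased. [data table missing in source]
Hint: $H_0$: The coin is unbiased ($p = 1/2$). With $n = 5$ and $p = q = 1/2$, the expected frequencies are $E_i = 320\binom{5}{x}\left(\tfrac{1}{2}\right)^5$, i.e., 10, 50, 100, 100, 50, 10. The number of degrees of freedom is $6 - 1 = 5$. As $\chi^2_{\text{cal}} > \chi^2_{\text{tab}}$, $H_0$ is rejected.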
4. A survey of 320 families with 5 children each revealed the following information.

| No. of boys | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|
| No. of girls | 0 | 1 | 2 | 3 | 4 | 5 |
| No. of families | 14 | 56 | 110 | 88 | 40 | 12 |

Is the result consistent with the hypothesis that male and female births are equally probable?
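Hint: $H_0$: Male and female births are equiprobable. (Probability of a male child is 0.5.) With $n = 5$ and $p = q = 0.5$, $E_i = 320\binom{5}{x}\left(\tfrac{1}{2}\right)^5$.

| No. of male births | $O_i$ | $E_i$ |
|---|---|---|
| 0 | 12 | 10 |
| 1 | 40 | 50 |
| 2 | 88 | 100 |
| 3 | 110 | 100 |
| 4 | 56 | 50 |
| 5 | 14 | 10 |

$\chi^2_{\text{cal}} = 7.16$; $\chi^2_{\text{tab}} = 11.07$ at 5 d.f. (5% level). As $\chi^2_{\text{cal}} < \chi^2_{\text{tab}}$, $H_0$ is accepted.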
5. Fit a Poisson distribution to the following data and test the goodness of fit.

| $x$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Tot |
|---|---|---|---|---|---|---|---|---|
| $f$ | 273 | 70 | 30 | 7 | 7 | 2 | 1 | 390 |
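Hint: $H_0$: Poisson distribution is a good fit. The mean is $\lambda = \bar{x} = 195/390 = 0.5$.

| $x$ | $O_i$ | $E_i$ |
|---|---|---|
| 0 | 273 | 236.4 |
| 1 | 70 | 118.2 |
| 2 | 30 | 29.5 |
| 3 | 7 | 4.9 |
| 4 | 7 | 0.6 |
| 5 | 2 | 0.1 |
| 6 | 1 | 0 |
| Total | 390 | 389.7 |

The last four classes are pooled: the observed frequency is $7 + 7 + 2 + 1 = 17$ and the expected frequency is taken as 5.9, after adding 0.3 so that the total becomes 390.
After pooling, we have

| $O_i$ | $E_i$ |
|---|---|
| 273 | 236.4 |
| 70 | 118.2 |
| 30 | 29.5 |
| 17 | 5.9 |

$\chi^2_{\text{cal}} = 46.21$; $\chi^2_{\text{tab}} = 5.99$ at $n - c = 4 - 2 = 2$ d.f. (5% level), where $c = 2$ (1 for $\sum E_i = \sum O_i$, which is common in all cases, and 1 for estimating the parameter $\lambda$ from the observed distribution).
As $\chi^2_{\text{cal}} > \chi^2_{\text{tab}}$, $H_0$ is rejected.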