
Statistics and Probability - Solved Assignments - Semester Spring 2010


Statistics and Probability

Solved Assignments

Semester Spring 2010

Assignment 1 Question 1: (Marks: 2+2+2+4=10)

(a) Answer the following:

• For a series, mean is 5 and mode is 2, find median of the series

Given that

Mean =5 and mode = 2

Now we find the median by using the empirical relationship among the three measures, i.e.

$$\text{mode} = 3\,\text{median} - 2\,\text{mean}$$

$$\text{median} = \frac{1}{3}\left(\text{mode} + 2\,\text{mean}\right)$$

$$\text{median} = \frac{1}{3}(2 + 2 \times 5) = \frac{12}{3} = 4$$
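As a quick check of the arithmetic, the empirical relation can be sketched in a few lines of Python. The function name `median_from_mean_and_mode` is only an illustrative choice, and the relation itself is an approximation that holds for moderately skewed distributions.

```python
def median_from_mean_and_mode(mean: float, mode: float) -> float:
    """Estimate the median from the empirical relation mode = 3*median - 2*mean."""
    return (mode + 2 * mean) / 3


# Values from the question: mean = 5, mode = 2.
median = median_from_mean_and_mode(mean=5, mode=2)
print(median)  # 4.0
```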

• What is the aim of collecting numerical data for a statistical study?

The main purpose of a statistical study is to make inferences about a population on the basis of sample data. To obtain descriptive information from a sample, we need data, and the collection of numerical data provides the basis for the analysis carried out in the further steps.

• Write down the functions of statistics.

1. Statistics assists in summarizing the larger set of data.

2. Statistics assists in the efficient design of laboratory and field experiments as well as surveys.


3. Statistics assists in a sound and effective planning in any field of inquiry.

(b)

A paint retailer has had numerous complaints from customers about under-filled paint cans. As a result, the retailer started to inspect the incoming shipments. A recent shipment contained 2,440 gallon-size cans. The retailer sampled 50 cans and weighed each on a scale capable of measuring weight to four decimal places. Properly filled cans weigh 10 pounds.

Now for this problem

1. Describe the population
2. Describe the variable of interest
3. Describe the data type of the variable
4. Describe the sample

Sol:

Reading the question statement, we know that

a) The population is the set of units of interest to the retailer, which is the shipment of 2,440 cans of paint.

b) The weight of the paint cans is the variable that the retailer wishes to evaluate.

c) In this case the retailer has to measure the weight, and weight is a continuous quantitative variable.

d) The sample is a subset of the population. In this case, it is the 50 cans of paint selected by the retailer.

Question 2: Marks: 2+2+6=10

(a) How is the collection of data performed with the help of enumerators?

Under this method, the information is gathered by employing trained enumerators who assist the informants in making the entries in the schedules or questionnaires correctly. This method gives the most reliable information if the enumerator is well-trained, experienced and tactful.

(b) Average height of the students in a school is 5.2 inches. A sample of 12 students showed the following heights in inches.

5.0, 5.3, 5.2, 4.9, 4.11, 5.0, 5.5, 5.4, 5.1, 5.0, 5.2, 4.10

Calculate the sampling error.

Sol:


As µ = 5.2 and the sample mean of the data is

$$\bar{x} = \frac{\sum x}{n} = \frac{59.81}{12} = 4.98$$

the sampling error is

$$\bar{x} - \mu = 4.98 - 5.2 = -0.22$$
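The same calculation can be reproduced with a short Python sketch. The variable names are illustrative; the data and the population mean are taken from the question as stated.

```python
# Heights of the 12 sampled students, as given in the question.
heights = [5.0, 5.3, 5.2, 4.9, 4.11, 5.0, 5.5, 5.4, 5.1, 5.0, 5.2, 4.10]
population_mean = 5.2

sample_mean = sum(heights) / len(heights)
# Sampling error = sample mean minus population mean.
sampling_error = sample_mean - population_mean

print(round(sample_mean, 2), round(sampling_error, 2))  # 4.98 -0.22
```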

(c) Find the missing frequencies and complete the following table.

x     f     c.f    Relative cumulative frequency
2                  2/15
4     1
6           7
8     3
10          15     1

Sol:

The relative cumulative frequency of a class is its cumulative frequency divided by the total, and the total is 15 (the last cumulative frequency). For the first class the relative cumulative frequency is 2/15, so the first cumulative frequency is 2. The first cumulative frequency equals the first class frequency, so the first class frequency is also 2.

Adding the first two class frequencies, 2 + 1 = 3, gives the second cumulative frequency.

The difference between 7 and 3 is 4, so 4 is the 3rd class frequency.

Adding 7 and 3 gives 10, which is the 4th cumulative frequency.

The last cumulative frequency is the total of all the frequencies, so the difference between 15 and 10 gives 5, which is the last class frequency.

By dividing each cumulative frequency by the total (15), we obtain the relative cumulative frequencies.

x     f     c.f    Relative cumulative frequency
2     2     2      2/15
4     1     3      3/15
6     4     7      7/15
8     3     10     10/15
10    5     15     15/15

Question 3: Marks: 2+8=10

a) Can we find the median from the following data? If yes, give the reason (no need to calculate the median).

Wages of workers in a factory

Monthly Income (Rs.)    No. of Workers
Less than 2000/-        100
2000-2999/-             300
3000-3999/-             250
4000-4999/-             50
5000 & above            1200

Sol:

Yes, we can find the median from these data, because the median is the most appropriate measure of average when the data have open-ended class intervals.

(b) Compute the mean, median and mode from the following data.

No. of students (x)    1     2     3     5     6
f                      15    10    5     15    5

Sol:

No. of students (x)    f     fx     c.f
1                      15    15     15
2                      10    20     25
3                      5     15     30
5                      15    75     45
6                      5     30     50
Total                  50    155

Mean

$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{155}{50} = 3.1$$

Median

Since n/2 = 50/2 = 25 is an integer, the median is the average of the (n/2)th value and the {(n+2)/2}th value:

$$\text{median} = \frac{\left(\frac{n}{2}\right)\text{th value} + \left(\frac{n+2}{2}\right)\text{th value}}{2} = \frac{\left(\frac{50}{2}\right)\text{th value} + \left(\frac{52}{2}\right)\text{th value}}{2} = \frac{25\text{th value} + 26\text{th value}}{2}$$

Checking the cumulative frequency column, the 25th value corresponds to x = 2 and the 26th value corresponds to x = 3. So

Median = (2+3)/2 = 2.5

Mode

As the data are discrete, the mode is the value that occurs the maximum number of times in the data set. Here we have two modes, 1 and 5, as both occur an equal number of times in the data set, i.e. 15 times.
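The three measures for Question 3 (b) can be cross-checked with Python's standard `statistics` module. This is a minimal sketch that expands the frequency table into raw observations. The names `values`, `freqs` and `data` are illustrative, and `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Frequency table from Question 3 (b): x = number of students, f = frequency.
values = [1, 2, 3, 5, 6]
freqs = [15, 10, 5, 15, 5]

# Expand the table into the 50 individual observations.
data = [x for x, f in zip(values, freqs) for _ in range(f)]

mean_value = mean(data)      # sum(f*x) / sum(f) = 155 / 50
median_value = median(data)  # average of the 25th and 26th ordered values
modes = multimode(data)      # every value that attains the highest frequency

print(mean_value, median_value, modes)  # 3.1 2.5 [1, 5]
```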


Assignment 2 Question 1: (Marks: 4x2=8)

Answer the following short questions.

a) Why is the quartile deviation better than the range?

The range is only the difference between the minimum and maximum values. It gives no information about the distribution between the two ends of the series, and it is affected by outliers (highly extreme values). Hence it can give a misleading picture of the observations.

The quartile deviation is superior to the range as it is not affected by extremely large or small observations. It covers the central 50% of the values. It is also used in situations where extreme observations are thought to be unrepresentative.

b) How standard deviation is better than mean absolute deviation?

Both are used to measure the dispersion of the data set, and both involve each and every data value in their computation. But in the mean deviation, by using absolute values we neglect the fact that some deviations are negative and some are positive. This introduces a kind of artificiality into the mean deviation, and because of that its further theoretical development and application are difficult.

This problem is overcome by computing the standard deviation, in which we square the deviations rather than taking their absolute values.

That is why the standard deviation is the preferred and most widely used measure of dispersion.

c) What is the limitation of Chebyshev's Theorem?

A limitation of Chebyshev's theorem is that it gives no information at all about the probability of observing a value within one standard deviation of the mean, that is, when the value of the constant "k" is one. Although a large amount of the data falls within $\mu \pm \sigma$, this cannot be explained by the theorem.

d) If the coefficient of skewness = 0, what would you say about the skewness of the distribution?

If the coefficient of skewness = 0, then the distribution is symmetrical. That is, the mean, median and mode of the distribution are equal.

Question 2: (Marks: 4+8=12)

a) Show that the range is greatly affected by the extreme values; interpret the result.

996 999 9 997 995 1000 1014 1002 1001

Solution:

Given that

996 999 9 997 995 1000 1014 1002 1001

Then, with $X_m$ the largest and $X_0$ the smallest value,

$$\text{Range} = X_m - X_0 = 1014 - 9 = 1005$$

Interpretation:

Observing the values closely, we find that value ‘9’ is significantly smaller than the rest of values in the data set. And since range depends on this value too, this single value has caused the range of the data set to be wider and it is presenting a misleading picture about the whole data.
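To make the effect of the extreme value concrete, the short Python sketch below compares the range with and without the observation 9. The helper name `value_range` is illustrative only.

```python
data = [996, 999, 9, 997, 995, 1000, 1014, 1002, 1001]


def value_range(xs):
    """Range = largest value minus smallest value."""
    return max(xs) - min(xs)


full_range = value_range(data)  # 1014 - 9

# Drop the single extreme observation and recompute.
without_outlier = [x for x in data if x != 9]
trimmed_range = value_range(without_outlier)  # 1014 - 995

print(full_range, trimmed_range)  # 1005 19
```

Removing one observation shrinks the range from 1005 to 19, which is the sensitivity the interpretation describes.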

b) The mean and the standard deviation of a set of values are 50 and 10 respectively. Compute $\bar{X} \pm 2S$ and $\bar{X} \pm 3S$. Interpret the results in the light of (i) the empirical rule and (ii) Chebyshev's inequality.

Solution:

From the given information

$$\bar{X} \pm 2S = 50 \pm 2(10) = (30, 70)$$

$$\bar{X} \pm 3S = 50 \pm 3(10) = (20, 80)$$
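A small Python sketch of these intervals, together with the Chebyshev lower bound $1 - 1/k^2$ used in part (ii), is given here as a cross-check. The function names are illustrative and not part of the original solution.

```python
def k_sigma_interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


for k in (2, 3):
    interval = k_sigma_interval(50, 10, k)
    bound = chebyshev_lower_bound(k)
    print(k, interval, round(100 * bound, 2))
```

For k = 2 the bound is 75% and for k = 3 it is about 88.89%, both lower than the empirical-rule figures because Chebyshev's inequality makes no assumption of normality.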

(i) Empirical Rule:

• According to the empirical rule, in a normal distribution the interval $\bar{X} \pm 2S$ contains 95.45% of the values. So here we can say that 95.45% of the data lie in the interval (30, 70).

• According to the empirical rule, in a normal distribution the interval $\bar{X} \pm 3S$ contains 99.73% of the values. So we can say that 99.73% of the values lie within the interval (20, 80).

(ii) Chebyshev's inequality:

• According to Chebyshev's inequality, the interval $\bar{X} \pm 2S$ contains at least

$$1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = \frac{3}{4} = 75\%$$

of the observations. So by this rule at least 75% of the values of the given data lie in the interval (30, 70).

• According to Chebyshev's inequality, the interval $\bar{X} \pm 3S$ contains at least

$$1 - \frac{1}{k^2} = 1 - \frac{1}{3^2} = \frac{8}{9} = 88.89\%$$

of the observations. So by this rule at least 88.89% of the values of the given data lie in the interval (20, 80).

Question 3: (Marks: 5+5=10)

a. Find the first two moments about mean from the following data.

X= 34, 70, 42, 54, 40, 68, 56, 38, 36, 72

Solution:

To find the moments about mean we have to find the mean of the data.

Mean:

$$\bar{X} = \frac{\sum X}{n} = \frac{510}{10} = 51$$

X      X − X̄     (X − X̄)²
34     -17       289
36     -15       225
38     -13       169
40     -11       121
42     -9        81
54     3         9
56     5         25
68     17        289
70     19        361
72     21        441
Total  0         2010

The first moment about the mean is given by

$$m_1 = \frac{\sum (x_i - \bar{x})}{n} = \frac{0}{10} = 0$$

The second moment about the mean is given by

$$m_2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{2010}{10} = 201$$

b) Calculate Bowley's coefficient of skewness from the following information.

$Q_1$ = 34.087156

$Q_3$ = 44.962963

$\tilde{X}$ = 39.606382

Solution:

Bowley's coefficient of skewness:

$$S_k = \frac{Q_1 + Q_3 - 2\,\text{Median}}{Q_3 - Q_1}$$

$$S_k = \frac{34.087156 + 44.962963 - 2(39.606382)}{44.962963 - 34.087156}$$

$$S_k = \frac{-0.162645}{10.875807}$$

$$S_k = -0.014954752$$
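Bowley's coefficient is simple enough to verify numerically. The following is a minimal Python sketch; the function name `bowley_skewness` is an illustrative choice.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's quartile coefficient of skewness: (Q1 + Q3 - 2*median) / (Q3 - Q1)."""
    return (q1 + q3 - 2 * median) / (q3 - q1)


sk = bowley_skewness(q1=34.087156, median=39.606382, q3=44.962963)
print(round(sk, 6))
```

The result is slightly negative, close to -0.015, so the distribution is very nearly symmetrical with a slight negative skew.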


Assignment 3 Question 1: Marks: 3+3+4=10

a) For a particular data set with five pairs of values:

$$\sum Y^2 = 26, \quad \sum Y = 10, \quad \sum XY = 37$$

The fitted line is y = -1 + 0.5x

Find the standard error of estimate ($s_{yx}$).

Solution:

$$s_{yx} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}$$

$$= \sqrt{\frac{26 - (-1)(10) - (0.5)(37)}{5 - 2}}$$

$$= \sqrt{\frac{26 + 10 - 18.5}{3}}$$

$$= \sqrt{\frac{17.5}{3}} = \sqrt{5.833} = 2.415$$

b) Two equations of the least squares regression lines are given by

Y = 2.64 + 10.83 X

and

X = -1.91 + 6.18 Y

Are these lines possible for any data set? Explain your answer.

Solution:

These lines are possible only if "r", the square root of the product of the two slopes, lies between -1 and +1. The correlation coefficient "r" in this case is given below.

$$r = \sqrt{b_{yx} \times b_{xy}}$$

$$r = \sqrt{10.83 \times 6.18}$$

$$r = \sqrt{66.93} = 8.18 > 1$$

So these lines are not possible for any data set.
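Both parts of this question reduce to one-line formulas, sketched below in Python as a check. The function name `standard_error_of_estimate` and its parameter names are illustrative; `a` and `b` are the intercept and slope of the fitted line.

```python
import math


def standard_error_of_estimate(sum_y2, sum_y, sum_xy, a, b, n):
    """s_yx = sqrt((sum(Y^2) - a*sum(Y) - b*sum(XY)) / (n - 2))."""
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))


# Part (a): fitted line y = -1 + 0.5x with five pairs of values.
s_yx = standard_error_of_estimate(sum_y2=26, sum_y=10, sum_xy=37, a=-1, b=0.5, n=5)

# Part (b): r is the square root of the product of the two regression slopes.
r = math.sqrt(10.83 * 6.18)
lines_possible = -1 <= r <= 1

print(round(s_yx, 3), round(r, 2), lines_possible)
```

Since the computed r is well above 1, `lines_possible` is `False`, in agreement with the conclusion above.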

c) Two dice are rolled. Make a sample space and find the probability that

i. The sum of the outcomes is equal to 10.
ii. The sum of the outcomes is equal to 7.
iii. The sum of the outcomes is equal to 1.

Solution:

S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}

n(S) = 36

Let A be the event that the sum of the outcomes is equal to 10.

A = {(4, 6), (5, 5), (6, 4)}

$$P(A) = \frac{n(\text{sum} = 10)}{n(S)} = \frac{3}{36} = 0.0833$$

Let B be the event that the sum of the outcomes is equal to 7.

B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}

$$P(B) = \frac{n(\text{sum} = 7)}{n(S)} = \frac{6}{36} = 0.167$$

Let C be the event that the sum of the outcomes is equal to 1.

C = ∅

$$P(C) = \frac{n(\text{sum} = 1)}{n(S)} = \frac{0}{36} = 0$$
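The sample space and the three probabilities can be enumerated directly. The sketch below uses `itertools.product` and exact fractions; the helper name `prob_sum` is illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))


def prob_sum(target):
    """Probability that the two faces add up to `target`."""
    favourable = [outcome for outcome in sample_space if sum(outcome) == target]
    return Fraction(len(favourable), len(sample_space))


print(prob_sum(10), prob_sum(7), prob_sum(1))  # 1/12 1/6 0
```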

Question 2: Marks: 4+6=10


a) If S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3, 4} and B = {3, 4, 5, 6}, verify whether A and B are independent.

Solution:

As S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3, 4} and B = {3, 4, 5, 6}, and for independent events

$$P(A \cap B) = P(A) \times P(B)$$

we check this condition.

A ∩ B = {3, 4}

P(A ∩ B) = 2/6

P(A) = 4/6

P(B) = 4/6

Since P(A) × P(B) = 4/6 × 4/6 = 4/9, we have

P(A) × P(B) ≠ P(A ∩ B)

Hence A and B are not independent.
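The independence check can be written out with Python sets and exact fractions. This is a sketch assuming equally likely outcomes in S, which is what the solution above uses implicitly; the helper name `prob` is illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}


def prob(event):
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))


p_joint = prob(A & B)         # P(A ∩ B) = 2/6
p_product = prob(A) * prob(B)  # P(A) × P(B) = 4/9
independent = p_joint == p_product

print(p_joint, p_product, independent)  # 1/3 4/9 False
```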

b) Indicate whether the following statement is true or false for three mutually exclusive events A, B and C. Justify your answer.

$$P(A) = \frac{1}{6}, \quad \frac{2}{3}\,P(B) = \frac{1}{6} \quad \text{and} \quad \frac{1}{4}\,P(C) = \frac{1}{6}$$

Solution:

Given that

$$P(A) = \frac{1}{6}$$

And

$$\frac{2}{3}\,P(B) = \frac{1}{6} \;\Rightarrow\; P(B) = \frac{1}{6} \times \frac{3}{2} = \frac{3}{12}$$

Now

$$\frac{1}{4}\,P(C) = \frac{1}{6} \;\Rightarrow\; P(C) = \frac{1}{6} \times \frac{4}{1} = \frac{4}{6}$$

For three mutually exclusive events, $P(A \cup B \cup C) = P(A) + P(B) + P(C)$, so the sum of their probabilities cannot exceed one. Here

$$P(A) + P(B) + P(C) = \frac{1}{6} + \frac{3}{12} + \frac{4}{6} = \frac{13}{12} > 1$$

Hence we can say that the given statement is not true.

Question 3: Marks: 2+8=10

a) If we draw a card from an ordinary deck of 52 playing cards, can "king" and "diamond" be mutually exclusive events? Give a reason to support your answer.

Solution: The two events cannot be mutually exclusive, because a card drawn from an ordinary deck of 52 playing cards can be both a king and a diamond (the king of diamonds). So they are not mutually exclusive events.

b) A marble is drawn at random from a box containing 10 red, 30 white, 20 blue and 15 orange marbles.

Find the probability that the drawn marble is

i. orange or red
ii. not 'red or blue'
iii. not blue
iv. red, white or blue.

Solution:

Red marbles    White marbles    Blue marbles    Orange marbles    Total
10             30               20              15                75

Total number of possible ways to draw a marble = $\binom{75}{1} = 75$

i. P(marble is orange or red) = $\frac{15 + 10}{75} = \frac{1}{3} = 0.33$

ii. P(marble is not 'red or blue') = $\frac{30 + 15}{75} = \frac{45}{75} = \frac{3}{5} = 0.60$

iii. P(marble is not blue) = $\frac{10 + 30 + 15}{75} = \frac{55}{75} = \frac{11}{15} = 0.73$

iv. P(marble is red, white or blue) = $\frac{10 + 30 + 20}{75} = \frac{60}{75} = \frac{4}{5} = 0.80$
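The four marble probabilities can be confirmed with exact fractions. In this sketch the dictionary `counts` and the helper `prob` are illustrative names; the counts come from the question.

```python
from fractions import Fraction

counts = {"red": 10, "white": 30, "blue": 20, "orange": 15}
total = sum(counts.values())  # 75 marbles in the box


def prob(*colours):
    """Probability that a randomly drawn marble has one of the given colours."""
    return Fraction(sum(counts[c] for c in colours), total)


p_orange_or_red = prob("orange", "red")
p_not_red_or_blue = 1 - prob("red", "blue")  # complement of 'red or blue'
p_not_blue = 1 - prob("blue")
p_red_white_or_blue = prob("red", "white", "blue")

print(p_orange_or_red, p_not_red_or_blue, p_not_blue, p_red_white_or_blue)
# 1/3 3/5 11/15 4/5
```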


Assignment 4 Question 1: Marks: 3+7=10

a)

Find the mean from the following probability distribution.

No. of Petals X    P(X)
x1 = 3             0.05
x2 = 4             0.10
x3 = 5             0.20
x4 = 6             0.30
x5 = 7             0.25
x6 = 8             0.075
x7 = 9             0.025
Total              1

Sol:

No. of Petals X    P(X)     XP(X)
x1 = 3             0.05     0.15
x2 = 4             0.10     0.4
x3 = 5             0.20     1
x4 = 6             0.30     1.8
x5 = 7             0.25     1.75
x6 = 8             0.075    0.6
x7 = 9             0.025    0.225
Total              1        5.925

The mean of this distribution is:

µ = E(X) = ∑XP(X) = 5.925 ≅ 5.9.
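The expected value is a weighted sum and can be verified in a couple of lines of Python. The dictionary name `petal_dist` is illustrative; the probabilities are those in the table.

```python
# Probability distribution of the number of petals: {x: P(x)}.
petal_dist = {3: 0.05, 4: 0.10, 5: 0.20, 6: 0.30, 7: 0.25, 8: 0.075, 9: 0.025}

total_probability = sum(petal_dist.values())
expected_value = sum(x * p for x, p in petal_dist.items())  # E(X) = sum of x*P(x)

print(round(total_probability, 3), round(expected_value, 3))  # 1.0 5.925
```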


b) A random variable X has the following probability distribution:

X      P(X)
-2     0.1
-1     k
0      0.2
1      2k
2      0.3
3      3k

Find (i) k (ii) P(X<2) (iii) P(X≥2).

Sol

(i)

$$\sum P(X) = 0.6 + 6k$$

As $\sum P(X) = 1$,

$$6k = 1 - 0.6 = 0.4$$

$$k = 0.4 / 6 = 0.0667$$

X      P(X)       P(X)
-2     0.1        0.1000
-1     k          0.0667
0      0.2        0.2000
1      2k         0.1333
2      0.3        0.3000
3      3k         0.2000
Total  0.6+6k     1.000

(ii) P(X<2) = P(X=-2) + P(X=-1) + P(X=0) + P(X=1)

P(X<2) = 0.100 + 0.0667 + 0.2 + 0.1333

P(X<2) = 0.5000

(iii) P(X≥2) = 1 - P(X<2) = 1 - 0.5 = 0.5
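The value of k follows from the requirement that the probabilities sum to one. The sketch below solves for it with exact fractions. Splitting the table into `const_terms` and `k_multiples` is just one illustrative way to encode it.

```python
from fractions import Fraction

# Probabilities that are plain numbers, and those that are multiples of k.
const_terms = {-2: Fraction(1, 10), 0: Fraction(2, 10), 2: Fraction(3, 10)}
k_multiples = {-1: 1, 1: 2, 3: 3}

# 0.6 + 6k = 1  =>  k = (1 - 0.6) / 6
k = (1 - sum(const_terms.values())) / sum(k_multiples.values())

pmf = dict(const_terms)
pmf.update({x: m * k for x, m in k_multiples.items()})

p_less_than_2 = sum(p for x, p in pmf.items() if x < 2)
p_at_least_2 = 1 - p_less_than_2

print(k, float(k), p_less_than_2, p_at_least_2)
```

Here k comes out as exactly 1/15, which rounds to the 0.0667 used in the table.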

Question 2: Marks: 2+2+6=10

a) If E(x) =4, and E(y) =1, then Find E (2x+5y).

Sol

E (2x+5y) = 2 E(x) + 5 E(y)

= 2 (4) + 5 (1)

= 8 + 5

= 13

b) From the following table of x and y, find h(0).

Joint Probability Distribution

                        Y
X                       0       1       2       P(X = xi) = g(x)
0                       3/28    6/28    1/28
1                       9/28    6/28    0
2                       3/28    0       0
P(Y = yj) = h(y)

Sol

$$h(0) = \sum_{x=0}^{2} f(x, 0)$$

$$h(0) = \frac{3}{28} + \frac{9}{28} + \frac{3}{28}$$

$$h(0) = \frac{15}{28} = 0.5357$$

c) Let X and Y be two discrete r.v.'s with the following joint probability distribution:

         x
y        1       2
1        0.10    0.15
2        0.20    0.30
3        0.10    0.15

Find E(X), E(Y).

Sol

         x
y        1       2       h(y)
1        0.10    0.15    0.25
2        0.20    0.30    0.50
3        0.10    0.15    0.25
g(x)     0.4     0.60    1

$$E(x) = \sum x\,g(x) = 1 \times 0.4 + 2 \times 0.60 = 1.6$$

$$E(y) = \sum y\,h(y) = 1 \times 0.25 + 2 \times 0.50 + 3 \times 0.25 = 2$$
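The marginal distributions and the two expectations can be derived from the joint table programmatically. This is a minimal sketch. The dictionary `joint` is keyed by (x, y) pairs, an encoding chosen here for illustration.

```python
# Joint probabilities f(x, y), keyed by (x, y).
joint = {
    (1, 1): 0.10, (2, 1): 0.15,
    (1, 2): 0.20, (2, 2): 0.30,
    (1, 3): 0.10, (2, 3): 0.15,
}

g = {}  # marginal distribution of X
h = {}  # marginal distribution of Y
for (x, y), p in joint.items():
    g[x] = g.get(x, 0.0) + p
    h[y] = h.get(y, 0.0) + p

e_x = sum(x * p for x, p in g.items())  # E(X) = sum of x*g(x)
e_y = sum(y * p for y, p in h.items())  # E(Y) = sum of y*h(y)

print(round(e_x, 2), round(e_y, 2))  # 1.6 2.0
```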

Question 3: Marks: 10

Let x and y have the joint probability distribution given by

$$f(x, y) = \frac{xy}{66}, \quad x = 2, 4, 5; \; y = 1, 2, 3$$

Find

(i) Joint probability distribution table
(ii) Marginal probability functions of X and Y
(iii) Are X and Y independent?

Solution:

(i)

Joint probability distribution table

         y
x        1        2        3
2        2/66     4/66     6/66
4        4/66     8/66     12/66
5        5/66     10/66    15/66

(ii)

Marginal probability function of x:

$$g(x) = \sum_{y} f(x, y) = \sum_{y=1}^{3} \frac{xy}{66} = \frac{x}{66} + \frac{2x}{66} + \frac{3x}{66} = \frac{x}{11}, \quad \text{for } x = 2, 4, 5$$

Marginal probability function of y:

$$h(y) = \sum_{x} f(x, y) = \sum_{x \in \{2, 4, 5\}} \frac{xy}{66} = \frac{2y}{66} + \frac{4y}{66} + \frac{5y}{66} = \frac{y}{6}, \quad \text{for } y = 1, 2, 3$$

(iii)

For independence, f(x, y) = g(x)·h(y). Now

$$g(x) \cdot h(y) = \frac{x}{11} \times \frac{y}{6} = \frac{xy}{66} = f(x, y)$$

So x and y are independent.
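The table, the marginals and the independence condition can all be checked cell by cell with exact fractions. The sketch below uses illustrative names (`f`, `g`, `h`, `independent`) for the quantities defined in the solution.

```python
from fractions import Fraction

xs = (2, 4, 5)
ys = (1, 2, 3)


def f(x, y):
    """Joint probability function f(x, y) = xy / 66."""
    return Fraction(x * y, 66)


# Marginals obtained by summing the joint function over the other variable.
g = {x: sum(f(x, y) for y in ys) for x in xs}  # expected to equal x/11
h = {y: sum(f(x, y) for x in xs) for y in ys}  # expected to equal y/6

# X and Y are independent if f(x, y) = g(x) * h(y) in every cell.
independent = all(f(x, y) == g[x] * h[y] for x in xs for y in ys)

print(g, h, independent)
```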


Assignment 5 Question 1: Marks: 2+3+5=10

a) When do you consider the Poisson distribution as the limiting form of the binomial distribution?

Solution: It is a limiting approximation to the binomial distribution when p, the probability of success, is very small but n, the number of trials, is so large that the product np = µ is of a moderate size.

b) The mean and standard deviation of the population are 30 and 5 respectively. The probability distribution of the parent population is unknown. Find the mean and standard error of the sampling distribution of $\bar{X}$ when n = 50.

Solution: Given that

$$\mu = 30, \quad \sigma = 5 \quad \text{and} \quad n = 50$$

As we know that

$$\mu_{\bar{X}} = \mu \;\Rightarrow\; \mu_{\bar{X}} = 30$$

And the standard error is given by

$$S.E.(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Putting in the values,

$$\sigma_{\bar{X}} = \frac{5}{\sqrt{50}} = \frac{5}{7.07} = 0.707$$

c) Ten vegetable cans, all of the same size, have lost their labels. It is known that 5 contain tomatoes and 5 contain corn. If 5 are selected at random, what is the probability that all contain tomatoes? What is the probability that 3 or more contain tomatoes?

Solution:

The given data can be arranged as

Tomato cans    Total cans    Corn cans    Selected cans
K = 5          N = 10        N - K = 5    n = 5

Let X denote the number of tomato cans selected. Then the hypergeometric distribution is given by

$$P(X = x) = \frac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}$$

Probability that ALL contain tomatoes:

$$P(X = 5) = \frac{\binom{5}{5}\binom{5}{0}}{\binom{10}{5}} = \frac{1}{252} = 0.00397$$

Probability that 3 or more contain tomatoes:

$$P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5)$$

$$= \frac{\binom{5}{3}\binom{5}{2}}{\binom{10}{5}} + \frac{\binom{5}{4}\binom{5}{1}}{\binom{10}{5}} + \frac{\binom{5}{5}\binom{5}{0}}{\binom{10}{5}}$$

$$= \frac{100}{252} + \frac{25}{252} + \frac{1}{252}$$

$$= \frac{126}{252} = \frac{1}{2}$$

Question 2: Marks: 3+7=10

a) Define sampling with replacement and sampling without replacement.

Solution:

Sampling with replacement: Sampling is said to be with replacement when a sampling unit is drawn from a population, observed and then returned to the population before another unit is drawn.

In sampling with replacement, an element can be selected more than once.

Sampling without replacement: Sampling is said to be without replacement when a sampling unit is drawn from a population and not returned to the population before another unit is drawn.

In sampling without replacement, an element can be selected only once.
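Sampling without replacement is exactly the setting of the hypergeometric calculation in Question 1 (c). A minimal Python sketch of that calculation, using `math.comb` (Python 3.8+) and an illustrative function name `hypergeom_pmf`, is:

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, K, n):
    """P(X = x) when n items are drawn without replacement from N, of which K are 'successes'."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))


# 10 cans, 5 of them tomatoes, 5 cans selected.
p_all_tomatoes = hypergeom_pmf(5, N=10, K=5, n=5)
p_three_or_more = sum(hypergeom_pmf(x, N=10, K=5, n=5) for x in range(3, 6))

print(p_all_tomatoes, p_three_or_more)  # 1/252 1/2
```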

b) A finite population consists of the values 6, 6, 9, 15 and 18. Calculate the sample means for all possible random samples of size n = 3 that can be drawn from this population without replacement. Make the sampling distribution of the sample mean and find the mean and variance of this distribution.

Solution:

Given data is

Population: 6, 6, 9, 15 and 18.

N = 5, n = 3

Number of possible samples (without replacement) = $\binom{N}{n} = \binom{5}{3} = 10$

Now the samples and their means are as below

No    Samples     $\bar{x} = \sum x / n$
1     6,6,9       7
2     6,6,15      9
3     6,6,18      10
4     6,9,15      10
5     6,9,18      11
6     6,15,18     13
7     6,9,15      10
8     6,9,18      11
9     6,15,18     13
10    9,15,18     14

Now for the sampling distribution of $\bar{x}$

x̄        f     f(x̄)    x̄ f(x̄)    x̄² f(x̄)
7        1     1/10    7/10      49/10
9        1     1/10    9/10      81/10
10       3     3/10    30/10     300/10
11       2     2/10    22/10     242/10
13       2     2/10    26/10     338/10
14       1     1/10    14/10     196/10
Total    10    1       108/10    1206/10

$$\mu_{\bar{x}} = \sum \bar{x} f(\bar{x}) = 108/10 = 10.8$$

$$\sigma_{\bar{x}}^2 = \sum \bar{x}^2 f(\bar{x}) - \left(\sum \bar{x} f(\bar{x})\right)^2 = 1206/10 - (10.8)^2 = 3.96$$
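The enumeration of all samples and the resulting sampling distribution can be reproduced with `itertools.combinations`, which treats the two 6s as distinct units, as the table above does. The variable names in this sketch are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [6, 6, 9, 15, 18]

# All samples of size 3 drawn without replacement (positions are distinct units).
samples = list(combinations(population, 3))
means = [Fraction(sum(s), 3) for s in samples]

# Sampling distribution of the sample mean: value -> frequency.
distribution = Counter(means)

n_samples = len(samples)
mu_xbar = sum(means) / n_samples
var_xbar = sum(m * m for m in means) / n_samples - mu_xbar**2

print(n_samples, float(mu_xbar), float(var_xbar))  # 10 10.8 3.96
```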

Question 3: Marks: 2+2+6=10

a) Find the value of the maximum ordinate of the standard normal curve correct to four decimal places.

Solution:

Since the standard normal probability density function is symmetric about zero, its maximum ordinate is at Z = 0:

$$f(0) = \frac{1}{\sqrt{2\pi}}\,e^{-(0)^2/2} = \frac{1}{2.5066} = 0.3989$$

b) If Z is a standard normal variable with mean 0 and variance 1, find the lower quartile.

Solution:

$$P(Z < Q_1) = 0.25$$

$$\phi(Q_1) = 0.25$$

$$Q_1 = \phi^{-1}(0.25)$$

Using the table,

$$Q_1 = -0.6745$$

-OR- Alternatively, we can find it this way:

As we know that

$$Q_1 = \mu - 0.6745\,\sigma$$

Putting in the values,

$$Q_1 = 0 - 0.6745(1)$$

$$Q_1 = -0.6745$$

c) Let $X_1, X_2, X_3$ be a random sample of size 3 from a population with mean $\mu$ and variance $\sigma^2$.

Consider the following two estimators of the mean

$$T_1 = \frac{X_1 + X_2 + X_3}{3}$$

$$T_2 = \frac{X_1 + 2X_2 + X_3}{4}$$

Which estimator should be preferred?

Solution:

First we examine which of T1 and T2 is unbiased. If ONLY one of them is unbiased, we prefer it as the better estimator. If both of them are unbiased, then we have to compare their variances. The estimator with the smaller variance will be preferred.

So let's first check unbiasedness:

T1 is the sample mean $\bar{X}$, which we know is unbiased.

And for T2

$$E(T_2) = E\left(\frac{X_1 + 2X_2 + X_3}{4}\right)$$

$$E(T_2) = \frac{1}{4}(\mu + 2\mu + \mu) = \frac{4\mu}{4} = \mu$$

So T2 is also unbiased.

Since both estimators are unbiased, we now have to check their variances.

$$Var(T_1) = Var\left(\frac{X_1 + X_2 + X_3}{3}\right) = \frac{1}{9}\left[Var(X_1) + Var(X_2) + Var(X_3)\right]$$

$$= \frac{1}{9}\left[\sigma^2 + \sigma^2 + \sigma^2\right] = \frac{3\sigma^2}{9} = \frac{\sigma^2}{3}$$

$$Var(T_2) = Var\left(\frac{X_1 + 2X_2 + X_3}{4}\right) = \frac{1}{16}\left[Var(X_1) + 4\,Var(X_2) + Var(X_3)\right]$$

$$= \frac{1}{16}\left[\sigma^2 + 4\sigma^2 + \sigma^2\right] = \frac{6\sigma^2}{16} = \frac{3\sigma^2}{8}$$

Comparing both variances,

$$\frac{1}{3} < \frac{3}{8}$$

so

$$Var(T_1) < Var(T_2)$$

Hence we conclude that, since T1 is unbiased and also has the lower variance, the estimator T1 is better than T2.
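For a linear estimator $T = \sum a_i X_i$ built from independent observations with common mean $\mu$ and variance $\sigma^2$, we have $E(T) = \mu \sum a_i$ and $Var(T) = \sigma^2 \sum a_i^2$. The sketch below applies these two facts to the weights of T1 and T2; the function name `weight_summary` is illustrative, and independence of the observations is assumed, as in the solution above.

```python
from fractions import Fraction


def weight_summary(weights):
    """Return (sum of weights, sum of squared weights) for a linear estimator.

    E(T) = mu * sum(weights) and Var(T) = sigma^2 * sum(w^2), assuming
    independent observations with common mean mu and variance sigma^2.
    """
    return sum(weights), sum(w * w for w in weights)


t1_weights = [Fraction(1, 3)] * 3
t2_weights = [Fraction(1, 4), Fraction(2, 4), Fraction(1, 4)]

t1_mean_factor, t1_var_factor = weight_summary(t1_weights)
t2_mean_factor, t2_var_factor = weight_summary(t2_weights)

print(t1_mean_factor, t1_var_factor)  # 1 1/3
print(t2_mean_factor, t2_var_factor)  # 1 3/8
```

Both weight sums equal 1, so both estimators are unbiased, and 1/3 < 3/8 confirms that T1 has the smaller variance.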


Assignment 6 Question 1: Marks: 5x2=10

Answer the following short questions.

1) How can we determine the three possible locations of the rejection region?

Solution:

(i) If $H_1: \theta < \theta_0$, then the test is a left-tailed test, and the rejection region is located in the left tail of the distribution.

(ii) If $H_1: \theta > \theta_0$, then the test is a right-tailed test, and the rejection region is located in the right tail of the distribution.

(iii) If $H_1: \theta \ne \theta_0$, then the test is a two-tailed test, and the rejection region is located equally in both tails of the distribution.

2) If α = 0.10, how many intervals would be expected to contain µ?

Solution:

We would expect about 90% of all such confidence intervals to contain µ and 10% to miss µ in repeated sampling.

3) What role does the sample mean play in a two-sided confidence interval for µ based on a random sample from a normal distribution?

Solution:

The sample mean is the midpoint of the confidence interval but has no effect on the length of the confidence interval.

4) In which situation may we replace $\sigma^2$ by $S^2$?

Solution:

In the case of a large sample drawn from a population with unknown population variance $\sigma^2$, we may replace $\sigma^2$ by $S^2$.

5) If an automobile is driven on average no more than 16000 km per year, formulate the null and alternative hypotheses.

Solution:

$$H_0: \mu \le 16000 \text{ km}$$

$$H_1: \mu > 16000 \text{ km}$$

Question 2: Marks: 2+2+6=10

a) The average yield of corn of variety A exceeds the average yield of variety B by at least 200 kg per acre. Formulate the null and alternative hypotheses.

Solution:

$$H_0: \mu_A - \mu_B \ge 200 \text{ kg}$$

$$H_1: \mu_A - \mu_B < 200 \text{ kg}$$

b) When do we use a one-sided test and when a two-sided test?

Solution:

If the alternative hypothesis does not specify a direction (i.e. $H_1: \mu \ne \mu_0$), we use a two-tailed test. If the alternative hypothesis specifies a direction ($H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$), then we use a one-sided test.

c) In a poll of college students in a large university, 300 of 400 students living in students' residences (hostels) approved a certain course of action, whereas 200 of 300 students not living in students' residences approved it. Compute the 90% confidence interval for the difference of proportions.

Solution:

From the data, the sample proportions are:

$$\hat{p}_1 = \frac{300}{400} = 0.75$$

$$\hat{p}_2 = \frac{200}{300} = 0.67$$

90% C.I. for p1 - p2:

$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

$$(\hat{p}_1 - \hat{p}_2) \pm 1.645\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

or $0.08 \pm 1.645\sqrt{\frac{(0.75)(0.25)}{400} + \frac{(0.67)(0.33)}{300}}$

or 0.08 ± (1.645)(0.0347)

or 0.08 ± 0.057

or 0.023 to 0.137

Question 3: Marks: 5+5=10

a) The Punjab Highway Department is studying the traffic pattern on the G.T. Road near Lahore. As part of the study, the department needs to estimate the average number of vehicles that pass the Ravi Bridge each day. A random sample of 65 days gives $\bar{x}$ = 5010 and s = 650. Find the 90 percent confidence interval estimate for µ, the average number of vehicles per day.

Solution:

$\bar{x}$ = 5010, s = 650, n = 65 and $Z_{0.05}$ = 1.645, since

$$1 - \alpha = 0.90$$

$$\alpha = 0.1$$

$$\alpha/2 = 0.05$$

$$Z_{\alpha/2} = Z_{0.05} = 1.645$$

The 90% confidence interval for µ is

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

$$5010 \pm 1.645\,\frac{650}{\sqrt{65}}$$

or 5010 ± 132.62

or 4877.38 to 5142.62

or, rounding the above two figures to the nearest whole number, we have:

4877 to 5143
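The large-sample interval for the mean in part (a) can be checked with a few lines of Python. The function name `z_interval_mean` is illustrative, and the critical value 1.645 is taken from the solution rather than computed.

```python
import math


def z_interval_mean(xbar, s, n, z):
    """Large-sample confidence interval: xbar ± z * s / sqrt(n)."""
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Sample of 65 days: mean 5010 vehicles, standard deviation 650; z for 90% is 1.645.
lower, upper = z_interval_mean(xbar=5010, s=650, n=65, z=1.645)

print(round(lower, 2), round(upper, 2))  # 4877.38 5142.62
```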

b) Mr. Ali wants to run in the election for City Government. After a strong election campaign, Mr. Ali's staff conduct their own poll over the weekend prior to the election. The results show that, in a random sample of 500 voters, 290 will vote for Mr. Ali. Develop a 95 percent confidence interval for the population proportion who will vote for Mr. Ali, using $\alpha = 0.05$.

Solution:

From the data, the sample proportion is 290/500 = 0.58.

The 95% confidence interval for p is:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

$$= 0.58 \pm 1.96\sqrt{\frac{0.58(1 - 0.58)}{500}}$$

$$= 0.58 \pm 0.043$$

$$= (0.537, 0.623)$$

The end points of the confidence interval are 0.537 and 0.623. The lower end point of the confidence interval is greater than 0.50. So we conclude that the proportion of voters in the population supporting Mr. Ali is greater than 50 percent. Based on the polling results, he will win the election.
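The interval for the proportion can be reproduced as follows. This is a minimal sketch of the large-sample formula used above; the function name `proportion_interval` is illustrative, and the critical value 1.96 is taken from the solution.

```python
import math


def proportion_interval(successes, n, z):
    """Large-sample confidence interval: p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 290 of 500 sampled voters support Mr. Ali; z for 95% confidence is 1.96.
lower, upper = proportion_interval(successes=290, n=500, z=1.96)
majority_supported = lower > 0.50

print(round(lower, 3), round(upper, 3), majority_supported)  # 0.537 0.623 True
```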