Problem set 2 sol (1)

  • 7/23/2019 Problem set 2 sol (1)

    PS II 2015 Problem Set 2 : sampling distribution Solutions

    1. Discuss whether normal tables can be of use in each of the following situations:

    a) Weights of a group of adults are approximately normally distributed with mean 70 kg and standard deviation 25 kg. We want to know the probability that the average weight of 10

    randomly selected people is more than 100 kgs.

    b) Salaries at a large corporation have a mean of 40,000 and a standard deviation of 20,000.

    We want to know the probability that a randomly selected employee makes more than

    50,000.

    c) A club has 50 members, 10 of whom want the president to be deposed. We want to compute

    the probability that if we select 20 members of the club at random, 20% or more in our sample

    would want the president to be deposed.

    The answer is a) only: for an approximately normal population, the sample mean is also

    approximately normal regardless of the sample size, and hence the normal table can be used.

    For b), the shape of the population distribution may not be normal and the sample size is 1

    (only one person is selected), so the CLT does not apply and hence the normal table can't be used.

    For c), p = 10/50 = 0.2 and n = 20; hence np = 4 < 10 [...] np = 260*.1 = 26 > 10 and n(1-p) = 260*.9 = 234 > 10, so the CLT holds and hence X can be

    approximated by a Normal distribution. Hence, 5X-260 can also be approximated by a normal

    distribution.

    b) What is the approximate probability that your total profit over 52 weeks is more than 0?

    As mentioned above, the sampling distribution of 5X-260 can be approximated by a normal

    with mean 5*26-260 = -130 and variance 25*23.4 = 585, i.e. 5X-260 ~ N(-130, 585) [the mean

    and variance of X being 260*.1 = 26 and 260*.1*.9 = 23.4 respectively]. Thus, the required

    probability of it being greater than 0 is approximately

    P(5X-260 > 0) = P(Z > 130/sqrt(585)) = P(Z > 5.37) ≈ 0, where Z is the N(0,1) variable.
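
    The arithmetic in this answer can be checked numerically. The following is a minimal Python sketch, assuming X ~ Bin(260, 0.1) as stated in the bracketed note and using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 260, 0.1                      # trials and success probability of X
mean_x = n * p                       # E(X) = 26
var_x = n * p * (1 - p)              # Var(X) = 23.4

# Total profit is the linear transform 5X - 260.
mean_profit = 5 * mean_x - 260       # -130
var_profit = 5 ** 2 * var_x          # 585

# Standardize the threshold 0 and take the upper tail of N(0, 1).
z = (0 - mean_profit) / sqrt(var_profit)
prob_positive = 1 - NormalDist().cdf(z)

print(round(z, 2))                   # 5.37
print(prob_positive)                 # a tiny positive number, effectively 0
```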

    c) What is the probability that in a week your profit is more than 0? What is the exact sampling distribution of the number of weeks your profit is more than 0? Can the sampling distribution

    of the number of weeks your profit is more than 0 be approximated by a normal

    distribution?

    The number of successes in a week (say S) ~ Bin(5, 0.1). Hence, the weekly profit will be 5S-5,

    which will be positive only if the number of successes is > 1 (i.e. at least 2). The probability of

    this happening is P(S ≥ 2) = 1 - P(S < 2) = 1 - P(S = 0) - P(S = 1) = .08146 [...] P(W > 0) = 1 - P(W = 0) = 1 - P(profit is never more than 0) = 1 - (1 - .08146)^52 = .9879
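
    The two figures in this answer (.08146 for a single week and .9879 over 52 weeks) can be reproduced exactly from the binomial formula. This is a hedged sketch: it assumes W denotes the number of weeks, out of 52 independent weeks, with positive profit, and the helper name binom_pmf is made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Weekly successes S ~ Bin(5, 0.1); profit 5S - 5 > 0 requires S >= 2.
p_week = 1 - binom_pmf(0, 5, 0.1) - binom_pmf(1, 5, 0.1)

# Assumption: W ~ Bin(52, p_week), so P(W > 0) = 1 - P(W = 0).
p_some_week = 1 - (1 - p_week) ** 52

print(round(p_week, 5))      # 0.08146
print(p_some_week)           # close to the .9879 quoted in the text
```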

    3. A waiter believes that his tips from various customers have a right skewed distribution with

    a mean of 100 and standard deviation of 25.

    a) With the above information, can we obtain the exact probability, or a reasonable

    approximation, that the average tip from 15 customers will be at least 130? Why or why not?

    Since the population distribution is not Normal and the sample size is less than 30, we cannot

    apply the CLT (and hence the normal approximation to the sampling distribution of the sample mean X̄).

    Hence, we cannot solve this problem, at least with the tools we have.

    b) What is the approximate probability that the average tip from 35 customers will be at least

    130?

    Since n ≥ 30, we can apply the CLT as follows: P(X̄ ≥ 130) = P(Z ≥ (130-100)/(25/sqrt(35))) = P(Z ≥ 7.1) ≈ 0

    c) What is the approximate probability that the average tip from 35 customers will be between

    90 and 150?

    This will be P(90 ≤ X̄ ≤ 150) = P((90-100)/(25/sqrt(35)) ≤ Z ≤ (150-100)/(25/sqrt(35)))

    = P(-2.37 ≤ Z ≤ 11.83) = P(Z ≤ 11.83) - P(Z ≤ -2.37) = 1 - 0.0089 = 0.99
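
    Parts b) and c) both standardize the sample mean with the standard error 25/sqrt(35). A small Python sketch of that step, assuming the CLT approximation is acceptable for n = 35 (as the solution does) and using illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 25, 35
se = sigma / sqrt(n)                 # standard error of the sample mean
xbar = NormalDist(mu, se)            # CLT approximation for the mean of 35 tips

p_at_least_130 = 1 - xbar.cdf(130)               # part b)
p_between = xbar.cdf(150) - xbar.cdf(90)         # part c)

print(round((130 - mu) / se, 1))     # 7.1
print(round((90 - mu) / se, 2))      # -2.37
print(round(p_between, 2))           # 0.99
```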

    4. Suppose the probability that Barry Bonds, a famous baseball player, gets a hit at bat is 0.3.

    a) If Barry has 400 at-bats in a single season, what are the mean and standard error of the sampling

    distribution of the sample proportion of hits at bat?

    The mean of the sample proportion will be the population proportion, i.e. 0.3, while the standard

    error will be sqrt(.3*.7/400) = .023. As 400*.3 and 400*.7 are both greater than 10, the

    distribution of p̂ can be approximated by a normal.

    b) What is the probability that Barry will have at least 35% hits at bat?

    This will be P(p̂ ≥ .35) = P(Z ≥ (.35-.30)/.023) = P(Z ≥ 2.17) = 0.015

    c) What is the probability that Barry will have at most 65% hits at bat?

    This will be P(p̂ ≤ .65) = P(Z ≤ (.65-.30)/.023) = P(Z ≤ 15.22) ≈ 1
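
    The standard error and the two tail probabilities for the sample proportion can be sketched in Python as follows. This assumes the normal approximation used in the solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.3, 400
se = sqrt(p * (1 - p) / n)           # standard error of the sample proportion
p_hat = NormalDist(p, se)            # normal approximation (np and n(1-p) exceed 10)

p_at_least_35 = 1 - p_hat.cdf(0.35)  # part b)
p_at_most_65 = p_hat.cdf(0.65)       # part c)

print(round(se, 3))                  # 0.023
print(p_at_least_35, p_at_most_65)
```

    Because the code keeps the unrounded standard error, its z-score for part b) is about 2.18 rather than 2.17, so the tail probability comes out slightly below the tabled 0.015.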

    d) What is the probability that Barry will have between 40% and 70% hits at bat?

    This will be P(.40 < p̂ [...]

    Suppose I go to Falafel on S days out of 80 days. Then, S ~ Bin(80, .25). Each time I go to

    Falafal, I spend Rs 30. Hence, my total expenditure in S days will be 30S. Hence my average

    daily expenditure in 80 days will be 30S/80 = T, say. So,

    E(T) = 30/80*E(S) = 30/80*(80*.25) = 30*.25 = 7.5

    Similarly, Var(T) = (30/80)(30/80)(80*.25*.75) = 2.109

    f) What is the sampling distribution of my average daily expenditure at Falafal over 80 days?

    As mentioned above, the exact sampling distribution of my daily average expenditure over 80

    days will be (30/80)S where S ~ Bin(80, .25).

    However, since np and n(1-p) are both greater than 10, CLT holds and hence the above quantity

    approximately follows a Normal distribution with mean 7.5 and standard error 1.452.

    g) What is the probability that my total expenditure at Falafal over 80 days is more than Rs 880?

    Total expenditure at Falafal = 80T.

    So, P(80T > 880) = P(T>11) = P(Z > (11-7.5)/1.452) = P(Z > 2.41) = .008
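
    The chain from S ~ Bin(80, .25) to the final probability can be traced in a few lines of Python. This is a sketch under the same normal approximation as the solution, with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

n, p, cost = 80, 0.25, 30
scale = cost / n                      # T = (30/80) * S

mean_t = scale * n * p                # E(T) = 7.5
var_t = scale ** 2 * n * p * (1 - p)  # Var(T), about 2.109
se_t = sqrt(var_t)                    # about 1.452

# P(80T > 880) = P(T > 11) under the normal approximation.
p_over_880 = 1 - NormalDist(mean_t, se_t).cdf(880 / n)

print(mean_t, round(var_t, 3), round(se_t, 3))   # 7.5 2.109 1.452
print(round(p_over_880, 3))                      # 0.008
```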

    6. A popular news magazine wanted to write an article on how much Indians know about

    geography. They devised a test that lists 100 cities in India, all of them mentioned in the news

    magazine in the last year. Each respondent was supposed to tell the state in which the city

    can be found. Some examples were Bhopal, Siliguri, Madurai, etc. Each correct answer

    earned one point, for a maximum of 100. The scores of a random sample of 5000 people

    were normally distributed with mean 62 and standard deviation 12.

    a) The central ninety-five percent of the people in this sample can identify how many states

    correctly?

    a) 38-86 b) 50-86 c) 50-74 d) 26-98

    By the Empirical rule, approximately 95% of the observations will fall within 2 standard

    deviations of the mean, i.e. between (62-2*12, 62+2*12) = (38, 86).

    b) What percentage of those sampled scored between 50 and 74 points?

    a) 68.5% b) 95% c) 90% d) 82%

    (50, 74) corresponds to the 1 standard deviation interval about the mean because 62-12 = 50 and 62+12 = 74. By the Empirical rule, approximately 68% of the observations lie within this

    interval. The closest answer is 68.5%.

    As an alternative approach, you can compute P(50 < X < 74) = P((50-62)/12 < Z < (74-62)/12)

    = P(-1 < Z < 1). The area between -1 and 1 under the Z curve is 0.8413-0.1587 =

    0.6826 = 68.3%, which is close to 68.5%. This is NOT a question about the sample mean

    score. So, we use the standard deviation, NOT the standard error.
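
    The Empirical-rule answers to parts a) and b) can be compared with the exact normal areas. A brief Python sketch, assuming scores follow N(62, 12) as stated in the question:

```python
from statistics import NormalDist

scores = NormalDist(62, 12)          # individual scores, so sigma is the SD, not an SE

low, high = 62 - 2 * 12, 62 + 2 * 12
within_2sd = scores.cdf(high) - scores.cdf(low)
within_1sd = scores.cdf(74) - scores.cdf(50)

print(low, high)                     # 38 86
print(round(within_2sd, 4))          # 0.9545
print(round(within_1sd, 4))          # 0.6827
```

    The last figure differs from 0.6826 in the fourth decimal only because the hand calculation subtracts table values that were already rounded to four places.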

    c) What kinds of scores will the top 5% of people achieve?

    a) 78 or better b) 81.74 or better c) 90.25 or better d) 98 or better

    This is the score X with 0.05 area to its right under the normal curve or with area 0.95 to its

    left. The Z score with area 0.05 to its right is 1.645. So, the required X score will be 62 +

    12*1.645 = 81.74.
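
    Finding a cutoff from a tail area is an inverse-CDF lookup. As a rough check of the 1.645 and 81.74 figures, assuming the same N(62, 12) model for scores:

```python
from statistics import NormalDist

# Z value with 0.95 of the area to its left (0.05 to its right).
z_95 = NormalDist().inv_cdf(0.95)

# The same percentile on the score scale: 62 + 12 * z_95.
cutoff = NormalDist(62, 12).inv_cdf(0.95)

print(round(z_95, 3))                # 1.645
print(round(cutoff, 2))              # 81.74
```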

    d) Correctly matching 45 of 100 cities to states is considered a poor performance. What

    percentage of respondents in this sample scored this low?

    a) 9.93% b) 7.78% c) 6.55% d) 5%

    We want P(X < 45) = P(Z < (45-62)/12) = P(Z < -1.42). The area to the left of -1.42 under the Z

    curve is 0.0778 = 7.78%.
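
    The lower-tail area for a score of 45 can also be computed directly rather than read from a table. A short sketch, again assuming scores follow N(62, 12):

```python
from statistics import NormalDist

scores = NormalDist(62, 12)

z = (45 - 62) / 12                   # standardized score
p_low = scores.cdf(45)               # area to the left of 45

print(round(z, 2))                   # -1.42
print(p_low)                         # close to the tabled 0.0778
```

    The code uses the unrounded z of about -1.417, so its area is marginally larger than the table value for -1.42; option b) remains the closest choice.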