Claire Leavitt
P.O. 841: Quantitative Methods
Midterm #2 Review
11/3/2013

1.) What is statistical power (in English)? What are the three factors that affect it?

Statistical power is the probability that a hypothesis test detects significance when significance actually exists; in other words, it is the probability of rejecting the null and being right. Power can be expressed as: Power = P(Reject H0 | H1 is true), or Power = 1 - β, where β is the probability of a Type II error.

Power is affected by three things: sample size (to increase your power, you should increase your n, because a larger n means a smaller standard error for the sample mean); significance level (to increase your power, you can also increase your significance level, at the cost of more Type I errors); and the distance of the alternative hypothesis from the null (the further the alternative is from the null, the easier it is to detect significance).

A power under .5 is terrible, since this means that you are more likely than not to make a Type II error (failing to reject the null when the alternative is in fact true), but once you pass that threshold, the higher your power, the better. (The book maintains that anything under .8 is not an adequate level of power, but there is no "accepted standard" the way there is re: significance levels.)

2.) The LSAT is a standardized test for admission to US law schools. The mean national score, on a scale of 120-180, is 151, with a standard deviation of 9.6 points. An especially paranoid researcher/aspiring law student took the October 2013 LSAT and thought it to be, relatively speaking, extremely difficult. He hypothesized that the people who took the October 2013 LSAT would do less well as a result.

• What are the researcher's null and alternative hypotheses?

H0: µ = 151
H1: µ < 151

• At what value should the researcher reject the null hypothesis at the .01 level? The .1 level?

Our critical value will be a z-score, because we know the standard deviation of the population of LSAT test takers.

For the .01 level: qnorm(.01) = -2.326 (the critical value is negative because we're conducting a one-sided test hypothesizing that the mean will be less than 151).
For the .1 level: qnorm(.1) = -1.282
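The two critical values can be reproduced in one call. This is only a minimal sketch: the variable names `alpha` and `z_crit` are illustrative and not from the course materials.

```r
# Lower-tail critical z-values for the one-sided test H1: mu < 151.
alpha <- c(0.01, 0.10)

# qnorm(p) returns the z-score with probability p to its left,
# so for a "less than" alternative the critical value is qnorm(alpha).
z_crit <- qnorm(alpha)

print(round(z_crit, 3))
```

For a "greater than" alternative the same cutoffs would simply change sign, i.e. `qnorm(1 - alpha)`.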


• The October results come in and the researcher decides to collect a random sample of 1,000 October test takers. In order for the researcher to find significance at the .05 level, what would be the highest value this sample's mean score could take?

qnorm(.05) = -1.645

$$-1.645 \ge \frac{\bar{X} - 151}{9.6/\sqrt{1{,}000}} \;\rightarrow\; \bar{X} - 151 \le -1.645\left(\frac{9.6}{\sqrt{1{,}000}}\right) \;\rightarrow\; \bar{X} \le -.5 + 151 \;\rightarrow\; \bar{X} \le 150.5$$

In order to find significance at the .05 level, the mean score for the researcher's sample of October test-takers must be at least half a point below 151.

• Is this test sensitive enough to detect a decrease of 1 point in the mean score? Why or why not?

$$\text{Power} = P(\bar{X} \le 150.5 \mid \mu = 150)$$

$$\frac{\bar{X} - 150}{9.6/\sqrt{1{,}000}} \le \frac{150.5 - 150}{9.6/\sqrt{1{,}000}} \;\rightarrow\; z \le 1.647$$

pnorm(1.647) = .95

The probability of correctly rejecting the null if indeed there is a 1-point drop in the mean score is 95%, a very high level of power, so the test is sensitive enough.

• What would happen to the researcher's power if he decided to test his hypothesis at the .1 significance level? What about if he was only able to collect a random sample of 500 people? What if he decided to hypothesize that the October mean score decreased by just half a point?

If the researcher decided to test his hypothesis at the .1 level, his power would increase (because it would be easier to detect significance if the standard you set is less stringent).

If the researcher was only able to collect a 500-person sample, he would see his power decrease (because the standard error of his sample mean would be larger: 9.6/√500 ≈ .43 rather than 9.6/√1,000 ≈ .30).
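The cutoff and power calculations above, along with the three variations the question asks about, can be sketched in R. The helper `power_lower_z()` is a hypothetical function written for this illustration (it is not part of base R or the course materials), and it assumes the one-sided z-test with known σ = 9.6 used throughout this problem.

```r
# Power of the one-sided z-test H0: mu = 151 vs. H1: mu < 151, sigma known.
# power_lower_z() is a hypothetical helper written for this review sheet.
power_lower_z <- function(mu_alt, n, alpha, mu0 = 151, sigma = 9.6) {
  se <- sigma / sqrt(n)                # standard error of the sample mean
  x_crit <- mu0 + qnorm(alpha) * se    # reject H0 when the sample mean <= x_crit
  pnorm((x_crit - mu_alt) / se)        # P(reject H0 | true mean is mu_alt)
}

# Highest sample mean that is still significant at the .05 level (n = 1,000).
x_crit <- 151 + qnorm(0.05) * 9.6 / sqrt(1000)

base_power <- power_lower_z(mu_alt = 150,   n = 1000, alpha = 0.05)
alpha_10   <- power_lower_z(mu_alt = 150,   n = 1000, alpha = 0.10)
n_500      <- power_lower_z(mu_alt = 150,   n = 500,  alpha = 0.05)
half_point <- power_lower_z(mu_alt = 150.5, n = 1000, alpha = 0.05)

print(round(x_crit, 2))
print(round(c(base = base_power, alpha_10 = alpha_10,
              n_500 = n_500, half_point = half_point), 3))
```

Comparing each variation against `base_power` shows the direction of the change: a looser significance level raises power, while a smaller sample or an alternative closer to the null lowers it.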


If the researcher hypothesized that the mean score had decreased by just half a point, his power would decrease (because an alternative is harder to detect the closer that alternative is to the null).

• The researcher doesn't want to wreak mayhem on the LSAT message boards, so he considers telling his friends that he doesn't have enough evidence to say that the October LSAT was any harder than previous versions. But let's assume that the October mean score really did drop by 1 point; what is the probability of the researcher genuinely coming to the aforementioned conclusion?

This question asks for the probability of a Type II error, or β (i.e., that the researcher fails to reject the null if indeed there has been a 1-point drop in the mean score for October test-takers).

β = 1 - power = 1 - .95 = .05

There is only a 5% chance of the researcher genuinely thinking, based on the results of his test, that there is no difference in the exams if a difference does in fact exist. In other words, there's only a 5% chance that if the difference exists, the researcher wouldn't be able to find it.

3.) Suppose H0: µ = 100. H1: µ ≠ 100. N = 75.

• After performing a hypothesis test, the researcher finds enough evidence to reject H0 with a p-value of .034. Does the 99% confidence interval contain the value 100? What about the 95% confidence interval? 90%?

The 99% confidence interval DOES contain 100, since .034 > .01 and thus we cannot reject the null hypothesis that µ = 100.
The 95% confidence interval DOES NOT contain 100, since .034 < .05 and thus we can reject the null hypothesis that µ = 100.
The 90% confidence interval DOES NOT contain 100, since .034 < .1 and thus we can reject the null hypothesis that µ = 100.
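The rule used in these answers (for a two-sided test, the (1 - α) confidence interval contains the null value exactly when the p-value exceeds α) can be written as a one-line check. `ci_contains_null()` is a hypothetical helper named for this sketch, and it assumes the interval and the test are built from the same statistic.

```r
# For a two-sided test, the (1 - alpha) confidence interval contains the
# null value exactly when the p-value is larger than alpha.
# ci_contains_null() is a hypothetical helper, not a base R function.
ci_contains_null <- function(p_value, conf_level) {
  alpha <- 1 - conf_level
  p_value > alpha
}

conf_levels <- c(0.99, 0.95, 0.90)
result <- ci_contains_null(p_value = 0.034, conf_level = conf_levels)
names(result) <- paste0(conf_levels * 100, "% CI contains 100")
print(result)
```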


• Another researcher performs another test and finds a p-value of .105. Does the value of 100 lie within the 95% confidence interval? What about the 90% confidence interval?

The 95% confidence interval DOES contain the value 100, since .105 > .05 and thus we cannot reject the null that µ = 100.
The 90% confidence interval DOES contain the value 100, since .105 > .1 and thus, despite how close the value lies to the limit of the C.I., we cannot reject the null that µ = 100.

4.) What kind of test would you perform in the following scenarios?

• You're back in 2012, and you think that Mitt Romney's infamous "47 percent" comment has had a very significant negative effect on how people perceive the candidate. You've already surveyed 250 randomly-selected Americans for a political science project, asking each respondent to rank Romney on a classic Feeling Thermometer scale. After his comment, you call those same respondents and ask them the same question again.

Here you would use a paired t-test, since you're conducting a panel study on the same sample of people (where the "treatment" falls between the two waves). The before and after scores are therefore not independent samples, and you test whether the mean of each respondent's before-to-after difference is zero.

• You think that if another presidential election were held tomorrow, Obama would lose (assume term limits don't apply!) and you survey 1000 randomly-selected people to formally test this.

Here you would use a standard z-test for a proportion, since (because of 2012 election data) you know the proportion of people who chose Obama over any other candidate in the population of voters.

• You feel pretty sure, after Romney's infamous "binders full of women" comment, that there will be a significant disparity between how the candidate is viewed by women and how he's viewed by men. Three days after the debate during which Romney made the aforementioned remark, you collect a random sample of 250 men and 250 women and ask them to rank the candidate on a classic FT scale.

Here you would have to perform a Welch's t-test; that is, you cannot pool the variances of your two samples to conduct a difference-of-means t-test. There is good reason to believe that the variance of male opinions toward Romney differs from the variance of female opinions toward Romney. Romney holds conservative positions on classically "women's issues" like abortion and the availability of contraception, so it would be fair to assume that the male population might have a larger variance of feelings toward the candidate than the female population would.

• You have good reason to believe there's a relationship between having an income over $100,000 and positive perception of Mitt Romney. You collect a random sample of 1000 people and ask, among other questions: What is your annual income? and What score would you give Mitt Romney on a classic FT scale?

Here you would use a chi-squared test, after sorting respondents into income and FT categories, to determine if there is evidence for a significant relationship between income and opinion toward Romney.

• The percentage of A grades a political science professor gave out in the Fall 2012 semester in a large lecture course was paltry, with only 9% of students scoring in the A range. But you think that the professor has mellowed over the past year due to positive developments in his personal life. You wait until all the Fall 2013 grades have been reported and then survey a random sample of 75 students from the Blackboard class roster, asking them about their grades.

Here you could plausibly conduct a difference-of-means t-test since, while you're analyzing different samples of students, there's no real reason to believe that this year's crop of students differs substantially from last year's. (Unless, of course, you make the claim that word had gotten around about how tough this particular professor is and, as a result, his class attracted only the best, most motivated students for Fall 2013!)

Note: While the "9%" statistic might make it seem like you'd perform a z-test, there's no reason to think you couldn't just convert the students' grades to a standard GPA 1-4 scale and go from there. (If there were a question like this on the midterm, of course, Prof. Boas would provide the mean scores and the standard deviations of each sample!)

• You're conducting an experiment to find out whether a certain medication actually works in reducing blood pressure. You take a random sample of 1,000 Americans and survey their blood pressure levels. You then divide the people into treatment and control groups and, 6 months after the treatment group has started taking the medication, you test that group's blood pressure levels again.

If you wanted to compare the blood pressure levels of people in the treatment group versus the control group, you'd perform a difference-in-means t-test. You have no reason to believe the variances of the samples differ, since the nature of a treatment (in this case, use of the medication) is that it's randomly applied across the group of participants. The beauty of experiments, of course, lies in their ability to control for all other possible causes for a difference in means between the samples, and thus identify the treatment condition as either effective or non-effective, ceteris paribus.

If you wanted to compare the people in what would be the treatment group pre-medication to the treatment group post-medication, you could conduct a paired t-test.

• You want to see if there's a difference in how well men and women do on the LSAT, so you collect a random sample of 400 male test-takers from the July 2013 cycle and a random sample of 300 women from the same cycle.

Here you could also conduct a difference-of-means t-test, since you have two samples but no real reason to believe that the variance of LSAT performance differs between the male and female populations.

5.) Recall that last scenario. Suppose you find that the mean LSAT score in your sample of 400 men was 150 (with a standard deviation of 9.65). The mean LSAT score in your sample of 300 women was 150.4 (with a standard deviation of 9.7). Can we reasonably conclude that men and women differ in LSAT test-taking skills?

Women: N1 = 300; x̄1 = 150.4; s1 = 9.7
Men: N2 = 400; x̄2 = 150; s2 = 9.65

H0: µ1 = µ2
H1: µ1 ≠ µ2

First, pool the variances of both samples:

Women: 9.7² = 94.09
Men: 9.65² = 93.1225

$$S_p^2 = \frac{(300 - 1)(94.09) + (400 - 1)(93.1225)}{300 + 400 - 2} = 93.54$$

Solve for the t-statistic:

$$t_{698} = \frac{(150.4 - 150) - (0)}{\sqrt{S_p^2\left(\frac{1}{300} + \frac{1}{400}\right)}} = .5415$$
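The pooled-variance test can be checked directly from the summary statistics given in the problem. This is a sketch with illustrative variable names; base R's `t.test()` needs the raw scores, which the problem does not supply, so the statistic is assembled by hand, including the two-sided p-value.

```r
# Pooled-variance (equal-variance) two-sample t-test from summary statistics.
n1 <- 300; xbar1 <- 150.4; s1 <- 9.7    # women
n2 <- 400; xbar2 <- 150.0; s2 <- 9.65   # men

df <- n1 + n2 - 2
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df   # pooled variance
se_diff <- sqrt(sp2 * (1 / n1 + 1 / n2))          # SE of the difference in means
t_stat <- ((xbar1 - xbar2) - 0) / se_diff         # null difference is 0
p_value <- 2 * pt(-abs(t_stat), df)               # two-sided p-value

print(round(c(sp2 = sp2, t = t_stat, df = df, p = p_value), 4))
```

Using `-abs(t_stat)` keeps the two-sided p-value correct whichever group is listed first.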

Find the corresponding p-value, keeping in mind that this is a 2-sided test:

2 * pt(-.5415, 698) = .588

This is a very large p-value; we do not have enough evidence to say that men and women differ in their LSAT test-taking abilities, at any conventional level of significance.

6.) What are the parameters of a chi-squared distribution? What would you expect to happen to the variance if degrees of freedom increased?

The chi-squared distribution has just one parameter, degrees of freedom (ν). Its mean is ν and its variance is 2ν, so if the degrees of freedom increased, the variance of the distribution would increase.

7.) Recall the example about perception of Mitt Romney and income. Suppose you gather data on 375 randomly-selected Americans and get the following results:

                             Income < $100,000/year    Income > $100,000/year
Positive Romney FT (> 50)              54                       136
Negative Romney FT (< 50)             102                        83

• Convert this table to proportions and analyze it. Does there appear to be a relationship between perception of Romney and income level? What would your null and alternative hypotheses be?

N = 375; thus, each cell's proportion may be calculated by the count / the total (e.g., 54/375 = .144):

                             Income < $100,000/year    Income > $100,000/year
Positive Romney FT (> 50)             .144                      .363
Negative Romney FT (< 50)             .272                      .221

* Proportions may not all sum to 1 because of rounding

This table tells us that 14.4% of your sample both has an income under $100,000 and has a positive opinion of Mitt Romney; etc. To allow for categorical comparisons, simply calculate each cell's count / its row or column total (i.e., the margin argument of prop.table() in R). E.g., dividing by the row totals, 54/190 = .284:

                             Income < $100,000/year    Income > $100,000/year
Positive Romney FT (> 50)             .284                      .716
Negative Romney FT (< 50)             .55                       .45

This table tells us that of the people who have a positive view of Romney, 71.6% make over $100,000; etc. Simply by eyeballing this table, we see a big income disparity among people who think positively of Romney (71.6% high-income versus 28.4% low-income), while a slightly higher proportion of people who think negatively of Romney have relatively "low" incomes (55% versus 45%). Thus, we hypothesize that a relationship exists between opinion toward Romney and income. Formally, our hypotheses would be:

H0: Opinion toward Romney ⊥ Income (the two are independent)
H1: Opinion toward Romney is not independent of Income

• Formally test for a relationship between the independent and dependent variable, at the .05 significance level. What do you conclude?

First, find the row and column totals:

                             Income < $100,000/year    Income > $100,000/year    TOTAL
Positive Romney FT (> 50)              54                       136               190
Negative Romney FT (< 50)             102                        83               185
TOTAL                                 156                       219               375
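The proportion tables and the totals can all be produced from the observed counts with `prop.table()` and `addmargins()`. In this sketch the object name `romney` and the short dimension labels are illustrative choices, not names from the homework.

```r
# Observed counts from question 7 (rows: Romney FT, columns: income).
romney <- matrix(c(54, 136,
                   102, 83),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(FT = c("Positive", "Negative"),
                                 Income = c("Under100k", "Over100k")))

cell_props  <- prop.table(romney)              # each cell / grand total
row_props   <- prop.table(romney, margin = 1)  # each cell / its row total
col_props   <- prop.table(romney, margin = 2)  # each cell / its column total
with_totals <- addmargins(romney)              # appends row and column sums

print(round(cell_props, 3))
print(round(row_props, 3))
print(with_totals)
```

`margin = 1` answers "of the people with this opinion, what share falls in each income group?", while `margin = 2` answers the reverse question.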

We already have our observed values, so now we need to calculate our expected values (what we would expect to find if there were no relationship; i.e., what we would expect to find if "Opinion toward Romney ⊥ Income"). Each expected count is (row total)(column total)/N.

EXPECTED COUNTS:

                             Income < $100,000/year     Income > $100,000/year     TOTAL
Positive Romney FT (> 50)    (190)(156)/375 = 79.04     (190)(219)/375 = 110.96     190
Negative Romney FT (< 50)    (185)(156)/375 = 76.96     (185)(219)/375 = 108.04     185
TOTAL                        156                        219                         375

Now calculate the χ² test statistic:

$$\chi^2 = \sum_{\text{cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(54 - 79.04)^2}{79.04} + \frac{(136 - 110.96)^2}{110.96} + \frac{(102 - 76.96)^2}{76.96} + \frac{(83 - 108.04)^2}{108.04} = 27.53$$

Calculate degrees of freedom: (R-1)(C-1) = (2-1)(2-1) = 1

Find the corresponding p-value: 1 - pchisq(27.53, 1) ≈ 1.5e-07

This is a very tiny p-value; we can conclude, at any conventional level of significance, that there is a relationship between opinion toward Romney and income.

8.) I didn't compute the chi-squared statistics for these problems since they were taken directly from HW #6; however, if anyone wants to compare their results with mine, just shoot me an email and I'll do the problems!
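For question 7, the hand calculation can be cross-checked against R's built-in `chisq.test()`. The sketch below makes one assumption worth flagging: `correct = FALSE` is passed because `chisq.test()` applies Yates' continuity correction to 2x2 tables by default, which would not match the uncorrected formula used above. The object names are illustrative.

```r
# Observed counts from question 7 (rows: Romney FT, columns: income).
romney <- matrix(c(54, 136,
                   102, 83),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(FT = c("Positive", "Negative"),
                                 Income = c("Under100k", "Over100k")))

# correct = FALSE switches off the Yates continuity correction so the
# statistic matches the sum of (observed - expected)^2 / expected.
test <- chisq.test(romney, correct = FALSE)

expected  <- test$expected                         # (row total)(column total)/N
by_hand   <- sum((romney - expected)^2 / expected) # chi-squared statistic
p_by_hand <- 1 - pchisq(by_hand, df = 1)           # upper-tail p-value

print(round(expected, 2))
print(round(by_hand, 2))
print(p_by_hand)
```

The rows and columns of `expected` should sum to the same totals as the observed table, which is a quick sanity check on any hand-built expected-count table.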

8.) Formally test for a relationship between the variables from the Brazil data set (homework #6). Remember, n=1204.

• Is there a significant relationship between approval of the president and satisfaction with democracy, according to the data presented in the table below?

               Dissatisfied    Satisfied
Disapprove         348             90
Approve            380            268

• What about between perceptions toward the government's progress on reducing corruption and satisfaction with democracy, according to the data presented in the table below?

               Dissatisfied    Satisfied
No progress        533            207
Progress           206            141
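One way to work these two problems is to run the same chi-squared test on each table. This is a sketch only: `run_test()` is a hypothetical wrapper written for this example, the counts are entered exactly as printed in the tables above, and `correct = FALSE` is an assumption made so the result matches the uncorrected hand formula from question 7.

```r
# Cell counts entered exactly as they appear in the two Brazil tables.
approval <- matrix(c(348, 90,
                     380, 268),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(President = c("Disapprove", "Approve"),
                                   Democracy = c("Dissatisfied", "Satisfied")))

corruption <- matrix(c(533, 207,
                       206, 141),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Corruption = c("No progress", "Progress"),
                                     Democracy = c("Dissatisfied", "Satisfied")))

# run_test() is a hypothetical helper: it returns the statistic, the degrees
# of freedom, and the p-value, with the Yates correction switched off.
run_test <- function(tab) {
  res <- chisq.test(tab, correct = FALSE)
  c(chi_sq = unname(res$statistic),
    df = unname(res$parameter),
    p_value = res$p.value)
}

approval_result   <- run_test(approval)
corruption_result <- run_test(corruption)

print(approval_result)
print(corruption_result)
```

Compare each `p_value` with your chosen significance level (e.g., .05) to decide whether to reject independence.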