Class 09 Exam1 Prep Things you should know

Page 1: Class 09 Exam1 Prep

Class 09 Exam1 Prep

Things you should know

Page 2: Class 09 Exam1 Prep

Exam Details

• 75 minutes
• Allowed
  – Any book, the course website and all files linked to it, your notes, Excel, Word, calculators.
• Not Allowed
  – Communication with others (of any kind using any device)
  – Websites other than the course.

• Short answers, lots of partial credit. Some easy. Some difficult.

Page 3: Class 09 Exam1 Prep

Class 01 Jefferson Had Girls

• We used BINOMDIST (with the cumulative argument set to false) to find the probabilities of 0,1,2,3,4 girls in four children IF the coin flip model is true.

• We constructed a table of Observed vs Expected Counts for the 31,595 Danish Families.

• You should be able to finish the hypothesis test.
• You should be able to do the hypothesis test for a p other than 0.5.
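A minimal sketch of the expected-count column, written in Python rather than Excel (an assumption of this example; the class used BINOMDIST). The helper name `binom_pmf` is hypothetical. The observed counts are not reproduced in these slides, so only the expected side of the table is computed.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    the same quantity as BINOMDIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n_children = 4
n_families = 31595   # number of Danish families, from the slide
p_girl = 0.5         # the coin flip model; change this to test a p other than 0.5

probs = [binom_pmf(x, n_children, p_girl) for x in range(n_children + 1)]
expected = [n_families * pr for pr in probs]

for x, (pr, e) in enumerate(zip(probs, expected)):
    print(f"{x} girls: probability {pr:.4f}, expected count {e:.1f}")
```

Finishing the test would mean comparing these expected counts with the observed counts using the chi-squared distance.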

Page 4: Class 09 Exam1 Prep

Class 02 Prob, Pmfs, Binomial

• The first probability problem
  – What if the situation were slightly different?
• How probability works for two either/or events.
  – Athlete and Gender, Disease and Test.
  – Either build a 2x2 table of counts…or draw a prob tree…or use formulas.
  – What if one of the events had three outcomes?
    • Positive, Negative, Inconclusive
    • Strain A, Strain B, no disease
    • University Athlete, Intramural Athlete, Non-Athlete
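The 2x2 table approach can be sketched as follows. Every number here (population size, disease rate, test accuracy) is hypothetical and chosen only to illustrate the mechanics; none of it comes from the class examples.

```python
# Hypothetical inputs, for illustration only.
population = 10_000
p_disease = 0.01             # P(disease)
p_pos_given_disease = 0.95   # P(test positive | disease)
p_pos_given_healthy = 0.05   # P(test positive | no disease)

# Build the 2x2 table of counts: rows are disease status, columns are test result.
diseased = population * p_disease
healthy = population - diseased
table = {
    ("disease", "positive"): diseased * p_pos_given_disease,
    ("disease", "negative"): diseased * (1 - p_pos_given_disease),
    ("healthy", "positive"): healthy * p_pos_given_healthy,
    ("healthy", "negative"): healthy * (1 - p_pos_given_healthy),
}

# P(disease | positive) = diseased positives / all positives
all_positive = table[("disease", "positive")] + table[("healthy", "positive")]
p_disease_given_pos = table[("disease", "positive")] / all_positive
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")
```

A third outcome (for example, Inconclusive) would simply add one more column of counts to the same table.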

Page 5: Class 09 Exam1 Prep

Class 02 Prob, Pmfs, Binomial

• BINOMDIST(X,n,p,false)
  – Applying the rules of probability to calculate the probability of X successes in n independent trials.
  – This is a PMF
  – Characteristics of Pmfs
    • A schedule assigning the unit of prob to the possible values of X.
    • Mean, Mode, Median, Std Deviation, Variance.
  – Binomial Mean is n*p
  – Binomial Standard Deviation is [n*p*(1-p)]^.5
  – Most of our problems had p=.5. What if p was different?
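As a check on the two formulas, the sketch below computes the mean and standard deviation directly from the pmf and compares them with n*p and [n*p*(1-p)]^.5. The values n=10 and p=0.3 are hypothetical, picked to show a case where p is not .5.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """Same quantity as BINOMDIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3   # hypothetical values

pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Characteristics computed straight from the schedule of probabilities.
mean = sum(x * pr for x, pr in enumerate(pmf))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(pmf))
std_dev = sqrt(variance)
mode = max(range(n + 1), key=lambda x: pmf[x])

print(f"mean from pmf {mean:.4f} vs n*p {n * p:.4f}")
print(f"std dev from pmf {std_dev:.4f} vs formula {sqrt(n * p * (1 - p)):.4f}")
print(f"mode {mode}")
```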

Page 6: Class 09 Exam1 Prep

Class 03 Lady Tasting Tea

• Hypothesis Testing
  – Formulate H0 and Ha
  – Pick alpha (usually 0.05)
  – Identify and calculate the test statistic
    • So far it has either been number correct (proportion correct) or calculated chi-squared.
  – Calculate the p-value…the prob of observing a test statistic at least as extreme as the one observed if H0 is true.
    • So far we’ve used the binomial, the normal, the chi-squared to calculate p-values.
  – Reject H0 (in favor of Ha) if p-value is less than 0.05. Say the result is statistically significant. (Our result is rare if H0 is true).
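The steps can be sketched for the Lady Tasting Tea case. The observed count of 8 correct in 10 is inferred from the formula =1-BINOMDIST(7,10,0.5,TRUE) that appears later in these slides; the helper name `binom_cdf` is hypothetical.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x), the same quantity as BINOMDIST(x, n, p, TRUE)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# H0: p = 0.5 (she is guessing)   Ha: p > 0.5
n, p0, alpha = 10, 0.5, 0.05
observed = 8   # test statistic: number correct (inferred, see lead-in)

# p-value: probability of 8 or more correct if H0 is true.
p_value = 1 - binom_cdf(observed - 1, n, p0)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.3f}, {decision}")
```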

Page 7: Class 09 Exam1 Prep

Class 04 Wunderdog and Normal

• As n gets big, the binomial looks like the normal.

• NORMDIST(X,μ,σ,TRUE)
  – As an approximation to the binomial set μ=n*p and σ=[n*p*(1-p)]^.5
  – Normal is a PDF
  – Normal is a family of distributions….but all have identical properties…see next slide.
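A sketch comparing the exact binomial tail with its normal approximation for the Wunderdog case. The figures n=149 and 87 correct are inferred from the Wunderdog formulas in the summary table of these slides; `statistics.NormalDist` stands in for NORMDIST.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p0 = 149, 0.5
x_obs = 87   # number correct (inferred, see lead-in)

mu = n * p0                        # 74.5
sigma = sqrt(n * p0 * (1 - p0))    # about 6.10

# Exact: P(X >= 87), i.e. 1-BINOMDIST(86,149,0.5,TRUE)
exact = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x_obs, n + 1))

# Approximation: 1-NORMDIST(87,74.5,6.10,TRUE)
approx = 1 - NormalDist(mu, sigma).cdf(x_obs)

print(f"sigma = {sigma:.2f}")
print(f"binomial p-value = {exact:.3f}, normal p-value = {approx:.3f}")
```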

Page 8: Class 09 Exam1 Prep

EMBS Fig 6.4, p 249

For more detail, you can use Normdist(X,0,1,true) and Norminv(p,0,1).

Page 9: Class 09 Exam1 Prep

Normal continued

• NORMDIST(X,μ,σ,false)
  – The height of the normal density curve. We have never used it…and never will.
• NORMINV(p,μ,σ)
  – finds the x value such that P(X<x) = p
• Lots of variations of questions you can use the normal to answer.
  – What is p?
  – What X gives a specified p?
  – What is σ?
• Lots of decisions to make (what target Lorex should use)
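The three question types can be sketched with Python's `statistics.NormalDist`, where `cdf` plays the role of NORMDIST(…,TRUE) and `inv_cdf` plays the role of NORMINV. The numbers in the "What is σ?" part (mean 10.2, cutoff 10, 5% below) are hypothetical.

```python
from statistics import NormalDist

z = NormalDist(0, 1)

# What is p?  NORMDIST(1,0,1,TRUE)
p = z.cdf(1.0)

# What X gives a specified p?  NORMINV(0.975,0,1)
x = z.inv_cdf(0.975)

# What is sigma?  Hypothetical question: with mean 10.2, what sigma puts
# exactly 5% of the distribution below 10?
# Solve (cutoff - mu) / sigma = z-value for 0.05.
mu, cutoff, target_p = 10.2, 10.0, 0.05
sigma = (cutoff - mu) / z.inv_cdf(target_p)

print(f"p = {p:.4f}, x = {x:.4f}, sigma = {sigma:.4f}")
```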

Page 10: Class 09 Exam1 Prep

Class 06 Descriptive Statistics

• Each of the characteristics of a prob distribution (mean, median, mode, var, stdev, skew) has a corresponding summary descriptive statistic
• Know how to calculate summary descriptive statistics and what they tell you.
  – They are only useful for NUMERICAL (not categorical) scaled variables.
    • EXCEPTION: If there are only two categories (Male/Female), (Right/Wrong), (Athlete/Not) then you can change to a 1/0 number and descriptive statistics make sense.
• Descriptive statistics make some sense for birth months…especially in the soccer context where birth month number predicts tournament participation.
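A small sketch of the 1/0 exception. The ten Right/Wrong answers below are hypothetical data made up for the illustration. Once coded as 1/0, the mean is just the proportion of 1s.

```python
from statistics import mean, median, mode, stdev

# Hypothetical two-category data.
answers = ["Right", "Wrong", "Right", "Right", "Wrong",
           "Right", "Right", "Right", "Wrong", "Right"]

# Change the two categories to a 1/0 number.
coded = [1 if a == "Right" else 0 for a in answers]

proportion_right = mean(coded)   # mean of 1/0 data = proportion of 1s
sample_sd = stdev(coded)         # sample standard deviation (n-1 divisor)

print(f"mean {proportion_right}, median {median(coded)}, "
      f"mode {mode(coded)}, stdev {sample_sd:.3f}")
```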

Page 11: Class 09 Exam1 Prep

Class 07 Roulette and GOF

• Observations of a categorical variable with multiple categories (n=904) are summarized using a table of counts.
  – Use Countif or pivottable to calculate the counts.
• If we have an H0 that leads to expected counts, then we can perform a chi-squared GOF test.
  – H0: All categories are equally probable (roulette and soccer birth month)
• Know how to perform and interpret a GOF test.
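A sketch of the GOF calculation using only the Python standard library. The observed counts are hypothetical (four categories, made up for the example). The helper `chi2_sf` is a hypothetical stand-in for CHIDIST; it uses the standard closed-form series for integer degrees of freedom rather than whatever Excel does internally.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-squared variable with integer dof,
    the same quantity as CHIDIST(x, dof)."""
    if x <= 0:
        return 1.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum over k < dof/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, dof // 2):
            term *= (x / 2) / k
            total += term
        return exp(-x / 2) * total
    # Odd dof: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) *
    #          sum over k = 1..(dof-1)/2 of x^(k-1) / (1*3*...*(2k-1))
    term, total = 1.0, 0.0
    for k in range(1, (dof - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * total

def gof_test(observed, expected):
    """Chi-squared distance, degrees of freedom, and p-value."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1
    return stat, dof, chi2_sf(stat, dof)

# Hypothetical counts. H0: all four categories are equally probable.
observed = [30, 20, 25, 25]
n = sum(observed)
expected = [n / len(observed)] * len(observed)

stat, dof, p_value = gof_test(observed, expected)
print(f"chi-squared = {stat:.2f}, dof = {dof}, p-value = {p_value:.3f}")
```

Interpretation follows the usual rule: a p-value below 0.05 would reject the equally-probable H0; here it does not.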

Page 12: Class 09 Exam1 Prep

Class 07 Lorex GOF

• Start with n=144 observations of a numerical variable.

• Create (somewhat arbitrary) BINS, and a table of summary counts.

• H0: the data came from N(10.2,.16)
  – Under H0, we can calculate the expected counts in each bin.
• This is another use of the chi-squared GOF test.
  – Numerical data, bins and observed counts, H0 and expected counts, GOF test.
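A sketch of the expected-count step. It reads N(10.2,.16) as mean 10.2 and standard deviation .16, matching the N(μ,σ) notation used elsewhere in these slides. The bin edges are hypothetical, since the actual bins are not given here.

```python
from statistics import NormalDist

n = 144
h0 = NormalDist(mu=10.2, sigma=0.16)   # H0 distribution

# Hypothetical interior bin edges. The first bin is "below 9.9",
# the last bin is "10.5 and above", giving 8 bins in total.
edges = [9.9, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5]

# Probability of each bin = difference of cumulative probabilities at its edges.
cdf_values = [0.0] + [h0.cdf(e) for e in edges] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf_values, cdf_values[1:])]

for i, e in enumerate(expected, start=1):
    print(f"bin {i}: expected count {e:.2f}")
```

These expected counts would then be compared with the observed bin counts using the chi-squared distance, with (number of bins - 1) degrees of freedom.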

Page 13: Class 09 Exam1 Prep

Class 08 2-tailed tests and sample proportions

• P-value is always the probability of observing a test statistic AS EXTREME as ours under H0.
  – The interpretation of “extreme” depends on Ha.
    • Ha: p>.5, extreme means X ≥ 8.
    • Ha: p≠.5, extreme means X ≥ 8 or X ≤ 2
  – The calculation of the p-value depends on Ha.
• The p-value for a 2-tailed test will usually be twice that from a one-tailed test.
  – It is more difficult to find statistical significance when you are less sure what you are looking for.
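A sketch of the two calculations, assuming n=10 trials and H0: p=.5 (the slide does not state n; 10 is an assumption that makes X ≥ 8 and X ≤ 2 mirror images).

```python
from math import comb

def binom_pmf(x, n, p):
    """Same quantity as BINOMDIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p0 = 10, 0.5   # n = 10 is an assumption, see lead-in

# Ha: p > .5, extreme means X >= 8
one_tailed = sum(binom_pmf(x, n, p0) for x in range(8, n + 1))

# Ha: p != .5, extreme means X >= 8 or X <= 2
two_tailed = one_tailed + sum(binom_pmf(x, n, p0) for x in range(0, 3))

print(f"one-tailed p-value = {one_tailed:.3f}")
print(f"two-tailed p-value = {two_tailed:.3f}")
```

With p0=.5 the binomial is symmetric, so the two-tailed value is exactly twice the one-tailed value.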

Page 14: Class 09 Exam1 Prep

There is often another way…

• X is Binomial(n,p)
• X is N(n*p,[n*p*(1-p)]^.5)
• pbar = X/n is N(p,[p*(1-p)/n]^.5)
  – Going from binomial to normal is an approximation good when n is big
  – The last two ARE EXACTLY THE SAME.

Page 15: Class 09 Exam1 Prep

Suppose n=100, p=.5

X is N(50,5), where 5 = [n*p*(1-p)]^.5

pbar = X/n is N(0.5,0.05), where 0.05 = [p*(1-p)/n]^.5

[Figure: one normal curve drawn over two horizontal axes, X from 35 to 65 and pbar from 0.35 to 0.65]
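A quick numerical check that the count scale and the proportion scale give the same probabilities for n=100, p=.5. The three test points (40, 55, 60) are arbitrary choices for the illustration.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.5

x_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))   # X is N(50, 5)
pbar_dist = NormalDist(p, sqrt(p * (1 - p) / n))    # X/n is N(0.5, 0.05)

# P(X < x) on the count scale equals P(X/n < x/n) on the proportion scale.
pairs = [(x_dist.cdf(x), pbar_dist.cdf(x / n)) for x in (40, 55, 60)]

for (by_count, by_proportion), x in zip(pairs, (40, 55, 60)):
    print(f"x = {x}: {by_count:.6f} vs {by_proportion:.6f}")
```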

Page 16: Class 09 Exam1 Prep

When interpreting Statistics….

• It’s all about the n!
  – 60% correct is unimpressive if n=10 and very impressive if n=100.
  – Even with the chi-squared GOF test, n matters
    • Distance = (15-10)^2/10 = 2.5
    • Distance = (150-100)^2/100 = 25

n is the total count (907 for roulette, 288 for soccer, 144 for Lorex)
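Both claims can be checked with a short sketch. It assumes H0: p=.5 and a one-tailed test for the "60% correct" comparison; the helper name `upper_tail` is hypothetical.

```python
from math import comb

def upper_tail(x, n, p=0.5):
    """P(X >= x) for a binomial, i.e. 1-BINOMDIST(x-1, n, p, TRUE)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# 60% correct with n=10 versus n=100, under H0: p = .5
p_small = upper_tail(6, 10)
p_large = upper_tail(60, 100)
print(f"n=10:  p-value = {p_small:.3f}")
print(f"n=100: p-value = {p_large:.3f}")

# Same 50% excess over the expected count, ten times the n.
dist_small = (15 - 10) ** 2 / 10
dist_large = (150 - 100) ** 2 / 100
print(f"distances: {dist_small} and {dist_large}")
```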

Page 17: Class 09 Exam1 Prep

Hypothesis Tests we have come to love…

| Example | H0 | Ha | Test statistic | Distribution of test statistic given H0 | p-value formula | p-value | Conclusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LTT | p=0.5 | p>0.5 | X=number correct | Binomial | =1-BINOMDIST(7,10,0.5,TRUE) | 0.055 | fail to reject |
| Wunderdog | p=0.5 | p>0.5 | X=number correct | Binomial | =1-BINOMDIST(86,149,0.5,TRUE) | 0.024 | reject |
| Wunderdog | p=0.5 | p>0.5 | X=number correct | Normal | =1-NORMDIST(87,74.5,6.10,TRUE) | 0.020 | reject |
| Wunderdog | p=0.5 | p>0.5 | pbar = sample proportion correct | Normal | =1-NORMDIST(0.586,0.5,0.041,TRUE) | 0.020 | reject |
| Buttered toast | p=0.5 | p≠0.5 | X=number butter up | Binomial | =1-BINOMDIST(28,48,.5,TRUE)+BINOMDIST(19,48,.5,TRUE) | 0.193 | fail to reject |
| Roulette Wheel | All 38 p's = 1/38 | they do not | calculated chi-squared | chi-squared with 37 dof | =CHIDIST(31.2,37) | 0.737 | fail to reject |
| Lorex | Data came from N(10.2,.16) | they did not | calculated chi-squared | chi-squared with 7 dof | =CHIDIST(7.577,7) | 0.371 | fail to reject |

In the two Normal rows, 6.10 = [149*.5*.5]^.5 and 0.041 = [.5*.5/149]^.5.

Page 18: Class 09 Exam1 Prep

Pfeifer’s Pfoibles

• How many will she get correct?
• Do I have the disease?
  – Give me probabilities as answers even though I did not explicitly ask for them.
• H0 is a statement about the process, not the outcomes.
• If p-value < 0.05 say …
  – The result is statistically significant.
  – We reject the null hypothesis.
• If p-value > 0.05 say …
  – The result is NOT statistically significant.
  – We fail to (or cannot) reject H0.