Lecture 4W: Explaining Regression Results Describing Results in Everyday English


Page 1: Lecture 4W-InterpretingRegression

Lecture 4W: Explaining Regression Results
Describing Results in Everyday English

Page 2: Lecture 4W-InterpretingRegression

Explaining Regression Results

• Some things are technical, precise
▫ Everyone who runs the same command should get the same table
• Other things are more open to interpretation
▫ Should we care about our results?

Page 3: Lecture 4W-InterpretingRegression

Regression Table: Tons of Info

• Covered regression coefficients Monday
• Will cover rest of output today

Page 4: Lecture 4W-InterpretingRegression

Picking Up From Monday

• Start by looking at bottom table
▫ Column for coefficients usually first
• y = a + bx + E
• conrinc = -11679.16 + 1036.512 * prestg10 + E

Page 5: Lecture 4W-InterpretingRegression

Protip: Focus on IV More Than Constant

• Easy to overemphasize constant, particularly when it’s negative
▫ Constant usually matters less

Page 6: Lecture 4W-InterpretingRegression

Describing Coefficients in Words

• Number under coeff for prestg10 (1036.512)
▫ Slope of the line.
• If someone scores 1 point higher on occupational prestige, how much more money would we expect them to earn, given these results?

Page 7: Lecture 4W-InterpretingRegression

Explaining the 1036.512 Coefficient

• If someone scores 1 point higher on occupational prestige, how much more money would we expect them to earn, given these results?
▫ We would expect them to earn 1 * 1036.512 = $1036.512 more per year.
• If Mary’s occupational prestige is 10 points higher than Jess’s, how much more would we expect her to earn?
▫ 10 * 1036.512 = $10365.12 more per year.
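The arithmetic on this slide can be sketched in a few lines of Python. The slope comes from the regression output above; the function and constant names are made up for illustration.

```python
# Slope for prestg10 taken from the regression output discussed above.
PRESTIGE_SLOPE = 1036.512  # dollars of annual income per prestige point


def expected_income_gap(prestige_gap: float, slope: float = PRESTIGE_SLOPE) -> float:
    """Expected difference in annual income for a given difference in prestige.

    The constant cancels out when comparing two people, so only the slope matters.
    """
    return prestige_gap * slope


if __name__ == "__main__":
    # 1 point (the "one-unit" case) and 10 points (Mary vs. Jess).
    for gap in (1, 10):
        print(f"{gap:>2} point(s) higher: ${expected_income_gap(gap):,.2f} more per year")
```

Because the constant drops out of any comparison between two people, the slope is the number to focus on when describing the relationship.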

Page 8: Lecture 4W-InterpretingRegression

A “One-Unit Increase”

• Common way to describe a regression coefficient in English.
▫ A one-unit increase in occupational prestige is associated with a 1036.512-unit increase in annual income (here, $1036.512 per year).
▫ Better than saying “the regression coefficient is 1036.512.”
• Don’t round until every calculation is done

Page 9: Lecture 4W-InterpretingRegression

Why One Unit Increase vs. Ten?

• Start with one unit since it is the easiest to calculate.
• The only time you calculate just to show you can is on homework problems.
▫ When we talk about interpreting results and making arguments, it is up to you to pick a reasonable number of units as part of the argument

Page 10: Lecture 4W-InterpretingRegression

Predicting Scores

• We can also use this equation to predict how much money someone would make, based on their occupational prestige.
▫ conrinc = -11679.16 + 1036.512 * prestg10 + E
• Assume occupational prestige = 50.
▫ Income = -11679.16 + 1036.512 * 50 = $40146.44
• What about someone with 80 occ. prestige?
▫ Income = -11679.16 + 1036.512 * 80 = $71241.80
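As a minimal sketch, the same predictions can be computed with a small helper. The constant and slope are the values from the regression table; `predict_income` is a made-up name, and the error term E is left out because a prediction is the point on the regression line.

```python
# Coefficients from the regression of conrinc on prestg10 shown above.
INTERCEPT = -11679.16  # the constant (a)
SLOPE = 1036.512       # coefficient on prestg10 (b)


def predict_income(prestige: float) -> float:
    """Predicted annual income on the regression line: a + b * x."""
    return INTERCEPT + SLOPE * prestige


if __name__ == "__main__":
    for prestige in (50, 80):
        print(f"prestige {prestige}: predicted income ${predict_income(prestige):,.2f}")
```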

Page 11: Lecture 4W-InterpretingRegression

Prediction and Exceptions

• Reminder: most people are not exactly on the regression line
▫ Exceptions do not invalidate the pattern

Page 12: Lecture 4W-InterpretingRegression

Constants can be weird

• Imagine using age to predict income. Toddlers may be predicted to have negative income.
▫ It’s because there are no toddlers in our sample.
▫ The constant may be nonsense if we never see x = 0 in our sample.

Page 13: Lecture 4W-InterpretingRegression

Statistical Significance in Regression

• Goal is to see whether change in independent variable leads to change in dependent variable
▫ Is relationship relatively unlikely to appear just from random chance?
• Null hypothesis: regression coefficient = 0

Page 14: Lecture 4W-InterpretingRegression

Calculating Statistical Significance

• Each variable has its own standard error term.
• Use standard error to get a t statistic for each term.
▫ We don’t care about constant though

Page 15: Lecture 4W-InterpretingRegression

Computing SE for Regression Coeff.

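$$V(b) = \frac{\sigma_\varepsilon^2}{\sum_i (x_i - \bar{x})^2} = \frac{\sigma_\varepsilon^2}{(n-1)\, s_x^2}$$

$$V(a) = \frac{\sigma_\varepsilon^2 \sum_i x_i^2}{n \sum_i (x_i - \bar{x})^2} = \sigma_\varepsilon^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2} \right)$$

• Where $\sigma_\varepsilon^2$ is the variance in error term $\varepsilon_i$
• $s_x^2$ is the sample variance of x, $s_x$ is the sample standard deviation of x.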
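A minimal sketch of how these quantities are computed, using the standard textbook results SE(b) = sqrt(σε² / Σ(xi − x̄)²) and SE(a) = sqrt(σε² (1/n + x̄² / Σ(xi − x̄)²)). The five data points are made up for illustration, and σε² is estimated from the residuals with n − 2 degrees of freedom, which is the usual convention but an assumption about what the software in this lecture does.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by least squares and return coefficients, SEs, and the slope's t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = sxy / sxx           # slope
    a = y_bar - b * x_bar   # constant

    # Estimate the error variance from the residuals (n - 2 degrees of freedom).
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)

    se_b = math.sqrt(sigma2 / sxx)
    se_a = math.sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx))
    return {"a": a, "b": b, "se_a": se_a, "se_b": se_b, "t_b": b / se_b}


if __name__ == "__main__":
    # Made-up toy data, not from the lecture's data set.
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 5, 4, 5]
    for name, value in simple_ols(x, y).items():
        print(f"{name}: {value:.4f}")
```

The t statistic for a coefficient is the coefficient divided by its standard error, which is what gets compared against the null hypothesis that the coefficient equals 0.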

Page 16: Lecture 4W-InterpretingRegression

SE Formula Implications

• In general, a lower SE indicates a more precise estimate
▫ A worse regression model means a bigger error term and a higher SE for any variable
▫ Large N reduces SE

Page 17: Lecture 4W-InterpretingRegression

P>|t| is p-value for a Variable

• Read across to get the appropriate p-value
• Would we reject the null hypothesis?
▫ Yes, p < 0.05

Page 18: Lecture 4W-InterpretingRegression

What Does p-value Tell Us?

• A low p-value tells us a relationship this strong would be unlikely to appear by random chance alone.
▫ We can be very confident that people with higher prestige jobs tend to make more money.
• However, p-value does not tell us whether the relationship has any real world meaning.

Page 19: Lecture 4W-InterpretingRegression

Is the following important?

• If we survey everyone in LA, people born in January may make $10/year more than December babies.
▫ With millions in the data set, p < 0.001
• But should we care about $10 a year?
• Common problem when people who know a little stats encounter “big data”
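To see why a trivial effect can still earn a tiny p-value, here is a rough sketch that holds a $10 effect fixed and lets the sample size grow. Every input is hypothetical: the error standard deviation (4000), the standard deviation of the birth-month indicator (0.5), and the sample sizes are not from any real data set. It uses the standard result SE(b) = σ / (s_x · sqrt(n − 1)) and a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math


def slope_standard_error(sigma: float, s_x: float, n: int) -> float:
    """SE of a regression slope: sigma / (s_x * sqrt(n - 1))."""
    return sigma / (s_x * math.sqrt(n - 1))


def two_sided_p_value(t_stat: float) -> float:
    """Two-sided p-value from the normal approximation (large samples only)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def p_value_for_effect(effect: float, sigma: float, s_x: float, n: int) -> float:
    """p-value for testing 'coefficient = 0' given an effect size and sample size."""
    return two_sided_p_value(effect / slope_standard_error(sigma, s_x, n))


# Hypothetical inputs chosen only to illustrate the pattern.
EFFECT = 10.0    # dollars per year, as in the birth-month example
SIGMA = 4000.0   # assumed standard deviation of the error term
S_X = 0.5        # standard deviation of a 0/1 indicator split evenly

if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        p = p_value_for_effect(EFFECT, SIGMA, S_X, n)
        print(f"n = {n:>10,}: p = {p:.4g}")
```

The effect is $10 in every row; only the sample size changes. The p-value answers whether the effect is distinguishable from zero, not whether $10 a year is worth caring about.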

Page 20: Lecture 4W-InterpretingRegression

Statistical vs. Substantive Significance

• Ideally we want both.
• Statistical significance is based strictly on p-values.
• Substantive significance is based on our knowledge of the world. What is worth telling people about?
▫ These judgments won’t come from a statistics class!
▫ Often worth discussing substantively significant variables that don’t quite reach p < .05

Page 21: Lecture 4W-InterpretingRegression

Two Main Criteria for Substantive Significance

Effect Size
• Always need to explain regression coefficient in a sentence (or more).
• Is number large enough for us to care about relationship?
▫ If not, need to offer a reasonable benchmark
• Is number nonsense?

Personal Interest
• Could be intellectual or personal interest
• May feel any statistically significant relationship is important

Page 22: Lecture 4W-InterpretingRegression

Balancing Effect Size and Interest

[Figure: grid plotting Effect Size against Interest in Variable/Relationship, with regions labeled “Not Sig.” and “Significant”]

Page 23: Lecture 4W-InterpretingRegression

Common Problems: Strike Zone Metaphor

[Figure: baseball strike zone]

• Too low: not enough explanation
• Too high: too high a threshold for effect size
• Outside (either side): too much spin
• JUST THROW STRIKES!

Page 24: Lecture 4W-InterpretingRegression

Describe our results

• Would you say the effect age has on income is substantively significant? Why or why not?

Page 25: Lecture 4W-InterpretingRegression

Is There Substantive Significance?

• Arguments for yes:
▫ It is statistically significant and
▫ Some may feel $573 more per year is enough
• Arguments for no:
▫ Some may feel $573 is not enough.
▫ It doesn’t make sense. People retire!
• Hold off claims about other variables

Page 26: Lecture 4W-InterpretingRegression

Note on r-squared

• We were initially scheduled to cover r-squared today, but I wanted to spend more time on substantive significance because it is the hardest concept.
• r-squared appears on HW 2 but will be pushed to Monday’s lecture. If necessary I will pare down other concepts.