Linear Regression To accompany Hawkes lesson 12.2 Original content by D.R.S


Linear Regression

• Input: A bunch of data points in our sample.

• Requirement: We have found that the variables do have significant linear correlation.

• Output: the “Line of Best Fit.”

• It’s an equation of a line that models the relationship.

Living with Inconsistencies

• Traditional algebra class line: y = mx + b

– Slope is the number that multiplies the x.

– The y-intercept is the b (keep the sign.)

• Calculator may use either y = ax + b or y = a + bx

• Hawkes talks about ŷ = b0 + b1x

• Sometimes there are hats, as in ŷ.

(From Hawkes Learning Systems, Regression, Inference, and Model Building, 12.2 Linear Regression)

Slope:

When calculating the slope, round your answers to three decimal places.

y-Intercept:

When calculating the y-intercept, round your answers to three decimal places.
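
The slope and y-intercept formulas did not survive on this slide. As a minimal sketch, assuming the textbook uses the usual least-squares sum formulas, the calculation and the three-decimal rounding might look like this. The function name and the tiny data set are made up for illustration; they are not the horse data.

```python
def slope_and_intercept(xs, ys):
    """Least-squares slope (b1) and y-intercept (b0) from the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b1 = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # b0 = (Sum(y) - b1*Sum(x)) / n
    b0 = (sum_y - b1 * sum_x) / n
    return b1, b0


# Hypothetical toy data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b1, b0 = slope_and_intercept(xs, ys)
print("slope:", round(b1, 3))
print("y-intercept:", round(b0, 3))
```

Round only at the end. If you round the slope first and then use the rounded value to get the y-intercept, the last decimal place can drift.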

The Horses Example again

• Some horses were measured

– Height (in hands?), Girth (inches), Length (inches), Weight (pounds)

– Put these data values into your TI-84 lists L1, L2, L3, L4.

• Original data source and idea for this problem is “Elementary Statistics” by Johnson & Kuby, 10th Edition, © Brooks-Cole-Thomson, Page 702.

Recall: “Is Girth related to Weight?”

• We wondered: is the girth of a horse related to its weight? Significantly so?

• We determined “Yes, significant relationship!”

– Hypothesis test

– Null hypothesis H0: ρ = 0, no relationship.

– Alternative hypothesis Ha: ρ ≠ 0, significant relationship.

– We rejected the null hypothesis.

– Using the TI-84 LinRegTTest program.

– Because the p-value of the test < level of significance.

Recall: “Is Girth significantly related to Weight?”

• Here’s how we do the Hypothesis Test for a significant linear relationship.

• Let’s suppose a small level of significance α, requiring strong evidence.

• STAT, TESTS, F:LinRegTTest

– Shortcut instead of scrolling: ALPHA F directly.

– But it might be option E on TI-83/Plus.

Recall: LinRegTTest inputs

• Here are the inputs:

• Xlist and Ylist – where you put the data

– Shortcut: 2ND 2 puts L2

• Freq: 1 (unless…)

• β & ρ: ≠ 0

– This is the Alternative Hypothesis

• RegEQ: VARS, right arrow to Y-VARS, 1, 1

– Just put it in for later

• Highlight “Calculate”

• Press ENTER

LinRegTTest Outputs, first screen

• t = the t statistic value

• p = the p-value of that t

• Once you determine that “yes, significant relationship”, then it is valid to construct the equation y = a + bx.

• Confusion to avoid: algebra class said y = mx + b, different letters.

• Screen #1 has the a (the y-intercept).
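
For the curious, here is a rough sketch of where a t value like the one on the first screen comes from. It uses the standard relationship t = r·√((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The function name and the toy data are hypothetical, and the p-value is left to the calculator because it needs the t distribution.

```python
import math


def regression_t_statistic(xs, ys):
    """Return (t, df) for the test of 'no linear relationship'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)       # sample correlation coefficient
    df = n - 2                           # degrees of freedom
    t = r * math.sqrt(df / (1 - r * r))  # t statistic for slope / correlation
    return t, df


# Hypothetical toy data, for illustration only (not the horse data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

t, df = regression_t_statistic(xs, ys)
print("t =", round(t, 3), " df =", df)
```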

LinRegTTest Outputs, second screen

• Last time, r and r² were of interest.

• s was left for “advanced” regression

• Screen #2 has the b (the slope).

• You could form the equation by hand, but don’t bother with that, there’s an easier place.
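
If you want to see what the second-screen numbers mean, the following sketch computes a slope, s, r² and r by hand. It assumes s is the usual standard error of estimate, √(SSE/(n − 2)). The function name and the toy data are hypothetical.

```python
import math


def second_screen_values(xs, ys):
    """Return (b, s, r_squared, r) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # y-intercept
    # SSE: squared vertical distances from the points to the line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # SST: squared distances from the points to the mean of y
    sst = sum((y - mean_y) ** 2 for y in ys)
    s = math.sqrt(sse / (n - 2))                # standard error of estimate
    r_squared = 1 - sse / sst                   # coefficient of determination
    r = math.copysign(math.sqrt(r_squared), b)  # r takes the sign of the slope
    return b, s, r_squared, r


# Hypothetical toy data, for illustration only (not the horse data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, s, r_squared, r = second_screen_values(xs, ys)
print(round(b, 3), round(s, 3), round(r_squared, 3), round(r, 3))
```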

Getting the Linear “Model” Equation

• Recall your inputs:

• The calculator deposited the complete equation into your Y1.

• Press Y= to see it.

• Recall what x and y are

– List L2, Girth, is x (inches)

– List L4, Weight, is y (lbs.)

You can make predictions

• Data and equation

• “How much will a horse with 90-inch girth weigh?”

• Plug in x = 90

• Do it like this: VARS, Y-VARS, 1, 1, ( 90 ) ENTER
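
Typing Y1(90) on the calculator is just evaluating a function. A small sketch of the same idea follows. The coefficients are placeholders from a toy fit, not the horse model; your real a and b come from the LinRegTTest output. `make_y1` is a made-up helper name.

```python
def make_y1(a, b):
    """Build a function that plays the role of the calculator's Y1 = a + b*x."""
    def y1(x):
        return a + b * x
    return y1


# Placeholder coefficients, for illustration only.
A_PLACEHOLDER = 2.2   # y-intercept
B_PLACEHOLDER = 0.6   # slope

y1 = make_y1(A_PLACEHOLDER, B_PLACEHOLDER)
print(round(y1(4), 3))  # the predicted y when x = 4
```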

Beware of “out-of-bounds” usage of the equation

• Data and equation

• “How much will a horse with 70-inch girth weigh?”

• Our line only covers the range from the smallest to the largest girth in our data.

• We have no guarantee of the linear relationship being valid or significant beyond that range, so be cautious!
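
One way to build that caution into a program is to refuse to predict outside the range of x values in the sample. This is only a sketch: the function name, the toy data and the placeholder coefficients are all hypothetical.

```python
def predict_within_range(a, b, sample_xs, x_new):
    """Predict y = a + b*x_new, but only inside the range of the sample x values."""
    low, high = min(sample_xs), max(sample_xs)
    if not (low <= x_new <= high):
        raise ValueError(
            f"x = {x_new} is outside the sample range [{low}, {high}]; "
            "the line is not known to be valid there"
        )
    return a + b * x_new


# Hypothetical toy data and placeholder coefficients, for illustration only.
sample_xs = [1, 2, 3, 4, 5]
a, b = 2.2, 0.6

print(round(predict_within_range(a, b, sample_xs, 3), 3))  # inside the range

try:
    predict_within_range(a, b, sample_xs, 9)               # outside the range
except ValueError as err:
    print("Refused:", err)
```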

Scatter Plot and Regression Line

• 2ND STAT PLOT

• ZOOM 9:ZoomStat

Scatter Plot and Regression Line

• ZOOM 9:ZoomStat

• Note that the data points do fall close to the line.

• WINDOW (could be cleaned up manually)

TI-84 Inputs and Outputs for the Girth and Length question

Inputs

• (Data already in lists)

Outputs

• First screen

• Second screen

Recall: Girth and Length conclusions

Conclusions

• At the chosen level of significance, there is NOT a significant linear relationship.

• Therefore DO NOT USE THE EQUATION. IT IS NOT VALID, AND THE PROBLEM STOPS HERE.


Recall: Girth and Length conclusions

Conclusions

• If instead we had used a larger level of significance, we would have considered these variables to have a significant relationship.

• And then we could have used the equation to make predictions for girth and length.
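
The decision rule behind both conclusions is the same comparison of the p-value against the level of significance. In the sketch below the p-value and both α values are invented purely to show how one p-value can lead to two different conclusions; they are not the values from the calculator screens.

```python
def is_significant(p_value, alpha):
    """Reject the null hypothesis (no linear relationship) when p < alpha."""
    return p_value < alpha


# Hypothetical numbers, for illustration only.
p_value = 0.03

for alpha in (0.01, 0.05):
    if is_significant(p_value, alpha):
        print(f"alpha = {alpha}: significant, the equation may be used")
    else:
        print(f"alpha = {alpha}: NOT significant, do not use the equation")
```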


The Theory behind it.


Best fit means that the sum of the squares of the vertical distance from each point to the line is at a minimum.

(This slide is mostly from Bluman’s 5th edition, © McGraw Hill)

Your textbook has the awful formulas to determine the slope and the y-intercept in the regression equation. That’s what the calculator uses to come up with its results.
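
To see “best fit” in action, the sketch below fits a line to a toy data set and then nudges the slope and intercept a little in each direction. Every nudged line has a larger sum of squared vertical distances than the fitted one. The data, the function names and the nudge sizes are all made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data, for illustration only (not the horse data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
print("fitted line SSE:", round(best, 3))

# Nudge the intercept and slope; each nudged line should do worse.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05), (0.1, -0.05)]
nudged = [sum_squared_errors(a + da, b + db, xs, ys) for da, db in nudges]
for (da, db), value in zip(nudges, nudged):
    print(f"a{da:+}, b{db:+}: SSE = {round(value, 3)}")
```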
