12a. Regression Analysis, Part 1
CSCI N207 Data Analysis Using Spreadsheet
Lingma [email protected]
Department of Computer and Information Science, IUPUI
Student   Reading Aptitude   Reading Hours
1         20                 5
2         5                  1
3         5                  2
4         35                 7
5         30                 8
6         35                 8
7         10                 3
8         5                  2
9         15                 5
10        40                 9
Multivariate Analysis - Correlation
Scatter chart with a trend line:
[Figure: "Reading Aptitude and Reading Hours"; horizontal axis: Aptitude (0 to 45), vertical axis: Hours (0 to 10)]
• With a trend line, can we roughly estimate the reading aptitude of a person who reads 6 hours a week? If so, what is the estimate?
Student   Reading Aptitude   Reading Hours
1         20                 5
2         5                  1
3         5                  2
4         35                 7
5         30                 8
6         35                 8
7         10                 3
8         5                  2
9         15                 5
10        40                 9
11        25                 6
12        33                 7.8
13        46                 10
Regression and Prediction
• Regression refers to a mathematical method for determining the best equation to reproduce a data set.
• Linear regression is a regression method that applies a straight line (linear model) for analysis.
• How do we generate a formula for a line that we can use to predict a data value without having to use a chart?
• We use regression analysis to …
– … predict new X and Y values
– … aid our understanding of data behavior
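As a rough sketch of how such a formula could be generated without a chart, the snippet below fits a straight line to the ten students in the first table. It assumes the trend line is an ordinary least-squares fit and treats reading hours as X and reading aptitude as Y (the chart plots the axes the other way around). The function name `fit_line` and the variable names are illustrative only and do not come from the slides.

```python
# Reading hours (X) and reading aptitude (Y) for students 1 to 10 in the table.
hours = [5, 1, 2, 7, 8, 8, 3, 2, 5, 9]
aptitude = [20, 5, 5, 35, 30, 35, 10, 5, 15, 40]


def fit_line(xs, ys):
    """Return (m, b) of the least-squares line Y = mX + b (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # y-intercept: the line passes through the means
    return m, b


m, b = fit_line(hours, aptitude)
estimate = m * 6 + b           # predicted aptitude for 6 reading hours
print(f"Y = {m:.2f}X {b:+.2f}")
print(f"Estimated aptitude at 6 hours: {estimate:.1f}")
```

Under these assumptions the estimate for 6 hours comes out near 25, which is in line with what one would read off the trend line by eye.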
Reviewing the Linear Equation
• The equation for a line is:
Y = mX + b
– Y: dependent variable
– X: independent variable
– m: slope
– b: y-intercept
Slope and y-intercept
[Figure: three lines plotted with X from 0 to 25 and Y from 0 to 12: Y = 0.4X + 2, Y = 0.8X + 4, and Y = 0X + 5]
m and b
• m, the slope, is a ratio, defined as:
m = rise / run, or as m = ∆Y / ∆X
• ∆: change in
Example – Determining Slope
Data Point   Value
X1           1
Y1           2.4
X2           20
Y2           10

m = ∆Y / ∆X
m = (Y2 - Y1) / (X2 - X1)
m = (10 - 2.4) / (20 - 1)
m = 7.6 / 19
m = 0.4
Example of Determining Y-Intercept
• X1=1, Y1=2.4, X2=20, Y2=10, m=0.4

Example 1 (using X1, Y1):
Y = mX + b
Y1 = mX1 + b
b = Y1 - mX1
b = 2.4 - 0.4(1)
b = 2.4 - 0.4 = 2

Example 2 (using X2, Y2):
Y = mX + b
Y2 = mX2 + b
b = Y2 - mX2
b = 10 - 0.4(20)
b = 10 - 8 = 2

Equation: Y = 0.4X + 2
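The slope and y-intercept steps above can be sketched in a few lines of Python. The helper name `line_through` is hypothetical and used only for illustration. It computes m as rise over run and then rearranges Y1 = mX1 + b to get b.

```python
def line_through(p1, p2):
    """Return (m, b) for the line Y = mX + b through two points."""
    x1, y1 = p1
    x2, y2 = p2
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    m = (y2 - y1) / (x2 - x1)   # slope = rise / run
    b = y1 - m * x1             # from Y1 = m*X1 + b
    return m, b


# Data points from the example: (X1, Y1) = (1, 2.4), (X2, Y2) = (20, 10).
m, b = line_through((1, 2.4), (20, 10))

# The second point should give the same intercept, as in the two examples.
b_from_p2 = 10 - m * 20

print(round(m, 4), round(b, 4), round(b_from_p2, 4))
```

Either point can be used to solve for b, since both lie on the same line.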
Practice
• Find the equation for the line below.
p1(5, 1), p2(10, 3)
[Figure: line through p1 and p2, with X from 4 to 11 and Y from 0 to 3.5, labeled f(x) = 0.4x − 1]
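One way to check an answer to the practice problem is to substitute both points into the candidate equation and confirm that each one lies on the line. The sketch below does this for f(x) = 0.4x − 1. The names `f`, `points`, and `on_line` are illustrative, and `math.isclose` is used to allow for floating-point rounding.

```python
import math


def f(x):
    # Candidate equation for the practice line.
    return 0.4 * x - 1


# The two given points, p1 and p2.
points = [(5, 1), (10, 3)]

# For each point, check whether f(X) matches the given Y.
on_line = [math.isclose(f(x), y) for x, y in points]
print(on_line)
```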
Reading Aptitude and Reading Hours