Transformations to Achieve Linearity
Created by Mr. Hanson
Objectives
Course Level Expectations
◦ CLE 3136.2.3 Explore bivariate data
Check for Understanding (Formative/Summative Assessment)
◦ 3136.2.7 Identify trends in bivariate data; find functions that model the data and that transform the data so that they can be modeled.
Common Models for Curved Data
1. Exponential Model
y = a b^x
2. Power Model
y = a x^b
Linearizing Exponential Data
• Accomplished by taking ln(y)
• To illustrate, we can take the logarithm of both sides of the model.
y = a b^x
ln(y) = ln(a b^x)
ln(y) = ln(a) + ln(b^x)
ln(y) = ln(a) + ln(b)·x
The right-hand side has the linear form A + BX, with A = ln(a), B = ln(b), and X = x.
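The derivation can be checked numerically. This is a minimal sketch, assuming made-up parameters a = 3 and b = 1.5 (they are not from the slides): if y = a b^x, then ln(y) should rise by the same amount, ln(b), for each one-unit step in x.

```python
import math

# Hypothetical parameters, chosen only for illustration (not from the slides).
a, b = 3.0, 1.5

xs = [0, 1, 2, 3, 4]
ys = [a * b**x for x in xs]          # exponential model y = a * b^x
ln_ys = [math.log(y) for y in ys]    # transformed response ln(y)

# For equally spaced x, a straight line has constant successive differences.
# Here each difference should equal the slope B = ln(b).
slopes = [ln_ys[i + 1] - ln_ys[i] for i in range(len(xs) - 1)]

# At x = 0 the line's value is the intercept A = ln(a).
intercept = ln_ys[0]

print("successive differences of ln(y):", slopes)
print("ln(b) =", math.log(b))
print("intercept:", intercept, " ln(a) =", math.log(a))
```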
Linearizing the Power Model
• Accomplished by taking the logarithm of both x and y.
• Again, we can take the logarithm of both sides of the model.
y = a x^b
ln(y) = ln(a x^b)
ln(y) = ln(a) + ln(x^b)
ln(y) = ln(a) + b ln(x)
Note that this time the logarithm remains attached to both y AND x.
The right-hand side has the linear form A + BX, with A = ln(a), B = b, and X = ln(x).
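The same kind of check works for the power model. This is a minimal sketch, assuming made-up parameters a = 2 and b = 3 (not from the slides): if y = a x^b, then the slope of ln(y) against ln(x) should be b between every pair of points.

```python
import math

# Hypothetical parameters, chosen only for illustration (not from the slides).
a, b = 2.0, 3.0

xs = [1, 2, 4, 8, 16]
ys = [a * x**b for x in xs]          # power model y = a * x^b
ln_xs = [math.log(x) for x in xs]    # transformed explanatory variable ln(x)
ln_ys = [math.log(y) for y in ys]    # transformed response ln(y)

# Slope between consecutive (ln x, ln y) points; a straight line gives
# the same value every time, and here that value should be B = b.
slopes = [
    (ln_ys[i + 1] - ln_ys[i]) / (ln_xs[i + 1] - ln_xs[i])
    for i in range(len(xs) - 1)
]

# At x = 1, ln(x) = 0, so ln(y) equals the intercept A = ln(a).
intercept = ln_ys[0]

print("slopes on the (ln x, ln y) scale:", slopes)
print("intercept:", intercept, " ln(a) =", math.log(a))
```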
Why Should We Linearize Data?
Much of bivariate data analysis is built on linear models. By linearizing non-linear data, we can assess the fit of non-linear models using linear tactics.
In other words, we don't have to invent new procedures for non-linear data. HOORAY!
Procedures for testing models
Example: Starbucks Growth
This table represents the number of Starbucks stores from 1984 to 2004.
Put the data in your calculator:
Year in L1
Stores in L2
Construct a scatter plot.
year   stores   ln_year   ln_stores
84     1        4.43082   0
87     15       4.46591   2.70805
88     18       4.47734   2.89037
89     22       4.48864   3.09104
90     29       4.49981   3.3673
91     32       4.51086   3.46574
92     49       4.52179   3.89182
93     107      4.5326    4.67283
94     153      4.54329   5.03044
95     251      4.55388   5.52545
96     339      4.56435   5.826
97     397      4.57471   5.98394
98     474      4.58497   6.16121
99     249      4.59512   5.51745
100    1366     4.60517   7.21964
101    1208     4.61512   7.09672
102    1177     4.62497   7.07072
103    1339     4.63473   7.19968
104    1112     4.64439   7.01392
Note that the data appear to be non-linear.
[Scatter plot: stores vs. year]
Transformation time
Transform the data
Let L3 = ln(L1)
Let L4 = ln(L2)
Redraw scatterplot
Determine new LSRL
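Away from the calculator, the same steps can be sketched in Python. This is a minimal sketch, not the calculator's routine: the helper name lsrl is made up for illustration, and the rows are copied from the table above. The coefficients it prints are whatever those rows give and are not guaranteed to match the equations quoted on the slides.

```python
import math

# (year, stores) rows copied from the table; year is coded as in the table.
rows = [(84, 1), (87, 15), (88, 18), (89, 22), (90, 29), (91, 32), (92, 49),
        (93, 107), (94, 153), (95, 251), (96, 339), (97, 397), (98, 474),
        (99, 249), (100, 1366), (101, 1208), (102, 1177), (103, 1339),
        (104, 1112)]

L1 = [year for year, _ in rows]       # year
L2 = [stores for _, stores in rows]   # stores
L3 = [math.log(x) for x in L1]        # ln(year)
L4 = [math.log(y) for y in L2]        # ln(stores)


def lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line.

    Hypothetical helper written for this sketch.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


exp_fit = lsrl(L1, L4)   # exponential model: regress ln(stores) on year
pow_fit = lsrl(L3, L4)   # power model: regress ln(stores) on ln(year)

print("exponential (x, ln y): intercept, slope =", exp_fit)
print("power (ln x, ln y):    intercept, slope =", pow_fit)
```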
[Scatter plot: Original, stores vs. year]
[Scatter plot: Exponential (x, ln y), ln_stores vs. year]
ln_stores = -20.7 + 0.2707 year; r^2 = 0.90
[Scatter plot: Original, stores vs. year]
[Scatter plot: Power (ln x, ln y), ln_stores vs. ln_year]
ln_stores = -102.4 + 23.61 ln_year; r^2 = 0.88
Remember: Inspect Residual Plots!!
[Residual plot: Exponential, residuals vs. year]
[Residual plot: Power, residuals vs. ln_year]
NOTE: Since both residual plots show curved patterns, neither model is completely appropriate, but both are improvements over the basic linear model.
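A residual is the observed value minus the value the LSRL predicts, computed on the transformed scale. This is a minimal sketch using the exponential LSRL reported above (ln_stores = -20.7 + 0.2707 year) and only the first five rows of the table, chosen for brevity; a full residual plot would use every row.

```python
import math

# Reported LSRL for the exponential transformation (x, ln y).
INTERCEPT, SLOPE = -20.7, 0.2707

# First five (year, stores) rows from the table, used here for illustration.
rows = [(84, 1), (87, 15), (88, 18), (89, 22), (90, 29)]

residuals = []
for year, stores in rows:
    observed = math.log(stores)            # ln_stores from the data
    predicted = INTERCEPT + SLOPE * year   # ln_stores from the LSRL
    residuals.append(observed - predicted)

for (year, _), res in zip(rows, residuals):
    print(year, round(res, 4))
```

Plotting these residuals against year and looking for a curved pattern is the check the slide describes.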
R-squared (A.K.A. Tiebreaker)
If the residual plots are similar, the decision should be based on the value of r-squared.
Power has the highest value (r^2 = .94), so it is the most appropriate model for these data (given your choices of models in this course).
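The tiebreaker can be sketched in Python as the squared correlation on each transformed scale. This is a minimal sketch: the helper name r_squared is made up for illustration, the rows come from the table above, and the values it produces are not guaranteed to reproduce the r-squared figures quoted on the slides.

```python
import math

# Data copied from the table; year is coded as in the table.
years = [84, 87, 88, 89, 90, 91, 92, 93, 94, 95,
         96, 97, 98, 99, 100, 101, 102, 103, 104]
stores = [1, 15, 18, 22, 29, 32, 49, 107, 153, 251,
          339, 397, 474, 249, 1366, 1208, 1177, 1339, 1112]

ln_years = [math.log(x) for x in years]
ln_stores = [math.log(y) for y in stores]


def r_squared(xs, ys):
    """Squared correlation between xs and ys (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


r2_exp = r_squared(years, ln_stores)      # exponential: (x, ln y)
r2_pow = r_squared(ln_years, ln_stores)   # power: (ln x, ln y)

# Tiebreaker: prefer the transformation with the larger r-squared.
best = "exponential" if r2_exp > r2_pow else "power"
print("r^2 exponential:", r2_exp)
print("r^2 power:      ", r2_pow)
print("preferred by r^2:", best)
```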
Writing equation for model
Once a model has been chosen, the LSRL must be converted to the non-linear model.
This is done using inverses.
In practice, you would only need to convert the best fit model.
Conversion to Exponential
LSRL for transformed data (x, ln y)
ln y = -20.7 + 0.2707x
e^(ln y) = e^(-20.7 + 0.2707x)
e^(ln y) = e^(-20.7) (e^(0.2707x))
y = e^(-20.7) (e^(0.2707))^x
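The conversion can be checked numerically. This is a minimal sketch using the LSRL coefficients above: it computes a = e^(-20.7) and b = e^(0.2707), then confirms that a b^x agrees with e raised to the LSRL prediction. The function name predict_exponential is made up for illustration.

```python
import math

# LSRL on the (x, ln y) scale: ln y = A + B x
A, B = -20.7, 0.2707

# Back-transformed exponential model y = a * b^x
a = math.exp(A)
b = math.exp(B)


def predict_exponential(x):
    """Predicted y from the exponential model (hypothetical helper)."""
    return a * b ** x


x = 100                                    # a year value on the table's scale
direct = math.exp(A + B * x)               # undo the log on the LSRL prediction
via_model = predict_exponential(x)         # use the converted model

print("a =", a, " b =", b)
print(direct, via_model)
```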
Conversion to Power
LSRL for transformed data (ln x, ln y)
ln y = -102.4 + 23.6 ln x
e^(ln y) = e^(-102.4 + 23.6 ln x)
e^(ln y) = e^(-102.4) (e^(23.6 ln x))
y = e^(-102.4) x^23.6
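The power conversion can be checked the same way. This is a minimal sketch using the LSRL coefficients above: a = e^(-102.4) is extremely small and x^23.6 is extremely large, but their product is an ordinary number. The function name predict_power is made up for illustration.

```python
import math

# LSRL on the (ln x, ln y) scale: ln y = A + B ln x
A, B = -102.4, 23.6

# Back-transformed power model y = a * x^b, with b = B
a = math.exp(A)


def predict_power(x):
    """Predicted y from the power model (hypothetical helper)."""
    return a * x ** B


x = 100                                       # a year value on the table's scale
direct = math.exp(A + B * math.log(x))        # undo the log on the LSRL prediction
via_model = predict_power(x)                  # use the converted model

print("a =", a)
print(direct, via_model)
```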
View non-linear model with non-linear data
[Scatter plot: stores vs. year, with the fitted non-linear model for stores as a function of year drawn over the data]
Assignment
U.S. Population Handout
◦ Rubric for assignment