Transformations to Achieve Linearity Created by Mr. Hanson


Objectives

Course Level Expectations
◦ CLE 3136.2.3 Explore bivariate data

Check for Understanding (Formative/Summative Assessment)
◦ 3136.2.7 Identify trends in bivariate data; find functions that model the data and that transform the data so that they can be modeled.

Common Models for Curved Data

1. Exponential Model
y = a b^x

2. Power Model
y = a x^b

Linearizing Exponential Data

• Accomplished by taking ln(y)
• To illustrate, we can take the logarithm of both sides of the model.

y = a b^x
ln(y) = ln(a b^x)
ln(y) = ln(a) + ln(b^x)
ln(y) = ln(a) + ln(b) x

The last line has the linear form A + BX, with A = ln(a) and B = ln(b).
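The derivation above can be checked numerically. The following is a minimal sketch, assuming made-up parameters a = 3 and b = 1.5 (they are not from the slides): for equally spaced x values, the differences between successive ln(y) values should all equal ln(b).

```python
import math

# Hypothetical parameters for y = a * b**x, chosen only for illustration
a, b = 3.0, 1.5

xs = [0, 1, 2, 3, 4, 5]
ys = [a * b**x for x in xs]
ln_ys = [math.log(y) for y in ys]

# For equally spaced x, a straight line has constant successive differences
slopes = [ln_ys[i + 1] - ln_ys[i] for i in range(len(xs) - 1)]

print("differences of ln(y):", slopes)
print("ln(b) =", math.log(b))                    # slope B of the line
print("ln(a) =", math.log(a), "vs", ln_ys[0])    # intercept A (value at x = 0)
```

Because ln(y) rises by the same amount for every unit step in x, the points (x, ln y) fall on a line.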

Linearizing the Power Model

• Accomplished by taking the logarithm of both x and y.
• Again, we can take the logarithm of both sides of the model.

y = a x^b
ln(y) = ln(a x^b)
ln(y) = ln(a) + ln(x^b)
ln(y) = ln(a) + b ln(x)

Note that this time the logarithm remains attached to both y AND x.

The last line has the linear form A + BX, with A = ln(a), B = b, and X = ln(x).
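The same kind of check works for the power model. This is a small sketch with made-up parameters a = 2 and b = 3 (not from the slides): the slope between any two points in (ln x, ln y) should equal b, even when the x values are not equally spaced.

```python
import math

# Hypothetical parameters for y = a * x**b, chosen only for illustration
a, b = 2.0, 3.0

xs = [1, 2, 4, 8, 16]
ys = [a * x**b for x in xs]
ln_xs = [math.log(x) for x in xs]
ln_ys = [math.log(y) for y in ys]

# Slope between consecutive points of the transformed data (ln x, ln y)
slopes = [
    (ln_ys[i + 1] - ln_ys[i]) / (ln_xs[i + 1] - ln_xs[i])
    for i in range(len(xs) - 1)
]

print("slopes in (ln x, ln y):", slopes)         # each should be close to b
print("ln(a) =", math.log(a), "vs", ln_ys[0])    # intercept A, since ln(1) = 0
```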

Why Should We Linearize Data?

Much of bivariate data analysis is built on linear models. By linearizing non-linear data, we can assess the fit of non-linear models using linear tactics.

In other words, we don’t have to invent new procedures for non-linear data. HOORAY!

Procedures for testing models

Example: Starbucks Growth

This table represents the number of Starbucks stores from 1984 to 2004. Years are coded as years since 1900, so 84 means 1984.

Put the data in your calculator:
• Year in L1
• Stores in L2

Construct a scatterplot.

year  stores  ln_year  ln_stores
84    1       4.43082  0
87    15      4.46591  2.70805
88    18      4.47734  2.89037
89    22      4.48864  3.09104
90    29      4.49981  3.3673
91    32      4.51086  3.46574
92    49      4.52179  3.89182
93    107     4.5326   4.67283
94    153     4.54329  5.03044
95    251     4.55388  5.52545
96    339     4.56435  5.826
97    397     4.57471  5.98394
98    474     4.58497  6.16121
99    249     4.59512  5.51745
100   1366    4.60517  7.21964
101   1208    4.61512  7.09672
102   1177    4.62497  7.07072
103   1339    4.63473  7.19968
104   1112    4.64439  7.01392

Note that the data appear to be non-linear.

[Scatterplot: stores versus year]

Transformation time

Transform the data:
• Let L3 = ln(L1)
• Let L4 = ln(L2)

Redraw the scatterplot.
Determine the new LSRL.
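For readers working outside the calculator, the same steps can be sketched in Python. This is an illustration only: the `lsrl` helper is a hypothetical name, the data are the rows of the table in this deck, and the printed coefficients depend on exactly which rows are included, so they may not match the values quoted on the slides.

```python
import math

# (year since 1900, stores) pairs copied from the table in this deck
data = [
    (84, 1), (87, 15), (88, 18), (89, 22), (90, 29), (91, 32), (92, 49),
    (93, 107), (94, 153), (95, 251), (96, 339), (97, 397), (98, 474),
    (99, 249), (100, 1366), (101, 1208), (102, 1177), (103, 1339), (104, 1112),
]

def lsrl(xs, ys):
    """Least-squares line y = A + B*x; returns (A, B, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)

years = [x for x, _ in data]                 # L1
stores = [y for _, y in data]                # L2
ln_years = [math.log(x) for x in years]      # L3 = ln(L1)
ln_stores = [math.log(y) for y in stores]    # L4 = ln(L2)

fits = {
    "linear (x, y)": lsrl(years, stores),
    "exponential (x, ln y)": lsrl(years, ln_stores),
    "power (ln x, ln y)": lsrl(ln_years, ln_stores),
}
for name, (A, B, r2) in fits.items():
    print(f"{name}: A = {A:.4g}, B = {B:.4g}, r^2 = {r2:.3f}")
```

Each fit is an ordinary least-squares line; only the columns passed to `lsrl` change between models.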


Exponential (x, ln y)

[Scatterplots: Original (stores versus year) and transformed (ln_stores versus year)]

ln_stores = -20.7 + 0.2707 year; r^2 = 0.90

Power (ln x, ln y)

[Scatterplots: Original (stores versus year) and transformed (ln_stores versus ln_year)]

ln_stores = -102.4 + 23.61 ln_year; r^2 = 0.88

Remember: Inspect Residual Plots!!

[Residual plots: Exponential (residuals versus year) and Power (residuals versus ln_year)]

NOTE: Since both residual plots show curved patterns, neither model is completely appropriate, but both are improvements over the basic linear model.
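One rough way to look for a curved pattern without drawing the plot is to list the signs of the residuals in x order. The sketch below assumes the table data from this deck and uses hypothetical helper names (`fit`, `residuals`, `signs`); long runs of the same sign hint at curvature, while a haphazard mix of signs is what an adequate model tends to produce.

```python
import math

# (year since 1900, stores) pairs copied from the table in this deck
data = [
    (84, 1), (87, 15), (88, 18), (89, 22), (90, 29), (91, 32), (92, 49),
    (93, 107), (94, 153), (95, 251), (96, 339), (97, 397), (98, 474),
    (99, 249), (100, 1366), (101, 1208), (102, 1177), (103, 1339), (104, 1112),
]

def fit(xs, ys):
    """Least-squares intercept and slope for y = A + B*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(xs, ys):
    """Observed minus predicted values for the least-squares line."""
    A, B = fit(xs, ys)
    return [y - (A + B * x) for x, y in zip(xs, ys)]

def signs(res):
    """Compact view of the residual pattern, in x order."""
    return "".join("+" if r > 0 else "-" for r in res)

years = [x for x, _ in data]
ln_years = [math.log(x) for x in years]
ln_stores = [math.log(y) for _, y in data]

exp_res = residuals(years, ln_stores)       # exponential model: (x, ln y)
pow_res = residuals(ln_years, ln_stores)    # power model: (ln x, ln y)

print("exponential:", signs(exp_res))
print("power:      ", signs(pow_res))
```

The sign string is only a quick screen; the residual plot itself remains the thing to inspect.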

R-squared (A.K.A. Tiebreaker)

If the residual plots are similar, the decision should be based on the value of r-squared.

Power has the highest value (r^2 = .94), so it is the most appropriate model for this data (given your choices of models in this course).

Writing equation for model

Once a model has been chosen, the LSRL must be converted to the non-linear model.
This is done using inverses.
In practice, you would only need to convert the best fit model.

Conversion to Exponential

LSRL for transformed data (x, ln y)

ln y = -20.7 + 0.2707x
e^{ln y} = e^{-20.7 + 0.2707x}
e^{ln y} = e^{-20.7} (e^{0.2707x})
y = e^{-20.7} (e^{0.2707})^x
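The conversion can be checked numerically. In this minimal sketch, A and B are the LSRL coefficients quoted on the slide, while the `predict` function and the test year x = 95 are illustrative choices: exponentiating the intercept and slope gives a and b, and taking ln of the model's prediction should land back on the line.

```python
import math

A, B = -20.7, 0.2707     # LSRL coefficients for (x, ln y), as quoted above

a = math.exp(A)          # a = e^A
b = math.exp(B)          # b = e^B, the growth factor per year

def predict(x):
    """Exponential model y = a * b**x recovered from the LSRL."""
    return a * b ** x

x = 95                   # illustrative year (1995)
print("a =", a, " b =", b)
print("ln(predict(x)) =", math.log(predict(x)))
print("A + B*x        =", A + B * x)
```

The last two printed values should agree up to floating-point rounding.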

Conversion to Power

LSRL for transformed data (ln x, ln y)

ln y = -102.4 + 23.6 ln x
e^{ln y} = e^{-102.4 + 23.6 ln x}
e^{ln y} = e^{-102.4} (e^{23.6 ln x})
y = e^{-102.4} x^{23.6}
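A similar check applies to the power conversion. Again this is only a sketch: A and B come from the slide, and `predict` and the test year are illustrative. Only the intercept is exponentiated here; the slope becomes the exponent on x unchanged.

```python
import math

A, B = -102.4, 23.6      # LSRL coefficients for (ln x, ln y), as quoted above

a = math.exp(A)          # a = e^A: extremely small, but still a valid float
b = B                    # the slope carries over directly as the exponent

def predict(x):
    """Power model y = a * x**b recovered from the LSRL."""
    return a * x ** b

x = 95                   # illustrative year (1995)
print("a =", a, " b =", b)
print("ln(predict(x)) =", math.log(predict(x)))
print("A + B*ln(x)    =", A + B * math.log(x))
```

Because a is so small and x^b so large, rounding the coefficients even slightly changes the predictions noticeably, so it helps to carry as many digits as the calculator reports.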

View non-linear model with non-linear data

[Scatterplot: stores versus year, with the fitted non-linear model (stores as a function of year) drawn over the data]

Assignment

U.S. Population Handout
◦ Rubric for assignment
