
Applications

The General Linear Model

Transformations

Transformations to Linearity

• Many non-linear curves can be put into a linear form by appropriate transformations of either:
– the dependent variable Y, or
– some (or all) of the independent variables X1, X2, ..., Xp.

• This leads to the wide utility of the linear model.

• We have seen that, through the use of dummy variables, categorical independent variables can be incorporated into a linear model.

• We will now see that, through the technique of variable transformation, many examples of non-linear behaviour can also be converted to linear behaviour.

Intrinsically Linear (Linearizable) Curves

1. Hyperbolas

y = x/(ax − b)

Linear form: 1/y = a − b(1/x), or Y = β0 + β1X

Transformations: Y = 1/y, X = 1/x, β0 = a, β1 = −b

[Figure: the curve y = x/(ax − b) with positive curvature for b > 0 and negative curvature for b < 0; asymptotes at y = 1/a and x = b/a.]

2. Exponential

y = αe^(βx) = αB^x

Linear form: ln y = ln α + βx = ln α + (ln B)x, or Y = β0 + β1X

Transformations: Y = ln y, X = x, β0 = ln α, β1 = β = ln B

[Figure: exponential curves y = αB^x, increasing for B > 1 and decreasing for B < 1, passing through α at x = 0 and αB at x = 1.]

3. Power Functions

y = ax^b

Linear form: ln y = ln a + b ln x, or Y = β0 + β1X

Transformations: Y = ln y, X = ln x, β0 = ln a, β1 = b

[Figure: power curves for b > 0 (b > 1, b = 1, 0 < b < 1) and for b < 0 (b < −1, b = −1, −1 < b < 0).]

4. Logarithmic Functions

y = a + b ln x

Linear form: y = a + b ln x, or Y = β0 + β1X

Transformations: Y = y, X = ln x, β0 = a, β1 = b

[Figure: logarithmic curves, increasing for b > 0 and decreasing for b < 0.]

5. Other special functions

y = ae^(b/x)

Linear form: ln y = ln a + b(1/x), or Y = β0 + β1X

Transformations: Y = ln y, X = 1/x, β0 = ln a, β1 = b

[Figure: curves y = ae^(b/x) for b > 0 and b < 0.]

The Box-Cox Family of Transformations

x^(λ) = transformed x = (x^λ − 1)/λ  for λ ≠ 0
                      = ln(x)        for λ = 0

The Transformation Staircase

[Figure: the Box-Cox transformations x^(λ) plotted for λ = −4, −3, −2, −1, 1, 2, 3, 4.]

Graph of ln(x)

[Figure: the curve new x = ln(x) plotted for x from 0 to 180.]

[Figure: the effect of the transformation new x = ln(x); widely spaced large x values are mapped to narrowly spaced transformed values.]

• The ln transformation is a member of the Box-Cox family of transformations, with λ = 0.

• If you decrease the value of λ, the effect of the transformation will be greater.

• If you increase the value of λ, the effect of the transformation will be less.

The effect of the ln transformation:
• It spreads out values that are close to zero.
• It compacts values that are large.

[Figure: new x = ln(x) plotted for x from 0 to 60.]
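As a concrete illustration of these remarks, here is a minimal sketch (illustrative values, not from the slides) of the Box-Cox family; note how decreasing λ compresses large values more strongly:

```python
# Minimal sketch of the Box-Cox family: (x**lam - 1)/lam for lam != 0,
# ln(x) for lam == 0. Example values are made up for illustration.
import numpy as np

def box_cox(x, lam):
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

x = np.array([0.5, 1.0, 2.0, 5.0, 20.0, 100.0])
for lam in (1.0, 0.5, 0.0, -0.5):
    # smaller lambda: values near zero are spread out, large values compacted
    print(lam, np.round(box_cox(x, lam), 2))
```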

The Bulging Rule

[Figure: the bulging rule diagram; the direction in which a curved scatterplot bulges indicates whether to move x or y up or down the transformation staircase (x up, x down, y up, y down).]

Non-Linear Models: Nonlinearizable Models

Non-Linear Growth Models
• Many models cannot be transformed into a linear model.

The Mechanistic Growth Model

Equation: Y = α(1 − e^(−kx)) + ε

or (ignoring ε) "rate of increase in Y" = dY/dx = k(α − Y)

The Logistic Growth Model

Equation: Y = α / (1 + βe^(−kx)) + ε

or (ignoring ε) "rate of increase in Y" = dY/dx = (k/α) Y(α − Y)

[Figure: logistic growth curves rising from 0 toward α = 1 over x from 0 to 10, for k = 1/4, 1/2, 1, 2, 4.]

The Gompertz Growth Model

Equation: Y = αe^(−βe^(−kx)) + ε

or (ignoring ε) "rate of increase in Y" = dY/dx = kY ln(α/Y)

[Figure: Gompertz growth curve rising from near 0 toward α = 1 over x from 0 to 10, for k = 1.]
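Because these growth models cannot be linearized, they are fitted by non-linear least squares. A minimal sketch using SciPy's curve_fit on simulated logistic data (the data and starting values are illustrative assumptions):

```python
# Minimal sketch: fitting the logistic growth model by non-linear least
# squares with scipy.optimize.curve_fit. Data are simulated.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, alpha, beta, k):
    # Y = alpha / (1 + beta * exp(-k x))
    return alpha / (1.0 + beta * np.exp(-k * x))

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y = logistic(x, 1.0, 9.0, 1.0) + rng.normal(0.0, 0.02, x.size)

params, cov = curve_fit(logistic, x, y, p0=[1.0, 5.0, 0.5])
print(params)  # estimates of (alpha, beta, k), close to (1, 9, 1)
```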

Polynomial Regression Models

y = β0 + β1x + β2x² + β3x³

Linear form: Y = β0 + β1X1 + β2X2 + β3X3

Variables: Y = y, X1 = x, X2 = x², X3 = x³

[Figure: a cubic polynomial curve plotted over 0 ≤ x ≤ 3.]

Suppose that we have two variables

1. Y – the dependent variable (response variable)

2. X – the independent variable (explanatory variable, factor)

Assume that we have collected data on two variables X and Y. Let

(x1, y1), (x2, y2), (x3, y3), …, (xn, yn)

denote the pairs of measurements on the two variables X and Y for n cases in a sample (or population).

The assumption will be made that y1, y2, y3, …, yn are:

1. Independent random variables.
2. Normally distributed.
3. Have the common variance σ².
4. The mean of yi is:

μi = β0 + β1xi + β2xi² + β3xi³ + ⋯ + βkxi^k

That is, each yi is assumed to be randomly generated from a normal distribution with mean μi = β0 + β1xi + β2xi² + ⋯ + βkxi^k and standard deviation σ.

[Figure: scatterplot of data generated from the polynomial regression model.]

The Model

yi = β0 + β1xi + β2xi² + ⋯ + βkxi^k + εi,   i = 1, 2, …, n

The matrix formulation

y = Xβ + ε

where

y = [y1, y2, …, yn]′ is the vector of observations,

X is the n × (k + 1) matrix whose ith row is (1, xi, xi², xi³, …, xi^k),

β = [β0, β1, β2, …, βk]′ is the vector of parameters, and

ε = [ε1, ε2, …, εn]′ is the vector of errors.

The Normal Equations

(X′X) β̂ = X′y

Written out, with all sums running over i = 1, …, n, these are the k + 1 equations

β̂0·n      + β̂1·Σxi       + β̂2·Σxi²      + ⋯ + β̂k·Σxi^k      = Σyi
β̂0·Σxi    + β̂1·Σxi²      + β̂2·Σxi³      + ⋯ + β̂k·Σxi^(k+1)  = Σxi yi
⋮
β̂0·Σxi^k  + β̂1·Σxi^(k+1) + β̂2·Σxi^(k+2) + ⋯ + β̂k·Σxi^(2k)   = Σxi^k yi
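A minimal numerical sketch of these normal equations (illustrative data; in practice a least-squares routine is preferred to forming X′X explicitly):

```python
# Minimal sketch: solving the normal equations (X'X) beta-hat = X'y
# for a degree-k polynomial.
import numpy as np

def fit_poly(x, y, k):
    X = np.vander(x, k + 1, increasing=True)  # columns 1, x, x^2, ..., x^k
    return np.linalg.solve(X.T @ X, X.T @ y)  # beta-hat = (X'X)^(-1) X'y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 3.9, 9.1, 16.2, 24.8])
print(fit_poly(x, y, 2))  # coefficients of a fitted quadratic
```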

Example

In the following example two quantities are being measured:

X = amount of an additive to a chemical process
Y = the yield of the process

X     Y       X     Y       X      Y
4     35.0    44    93.9    84     65.4
8     70.1    48    88.1    88     98.0
12    61.4    52    88.4    92    104.3
16   106.9    56    76.6    96    128.0
20    93.2    60    76.0    100   150.8
24   100.1    64    68.1
28    90.8    68    75.0
32    99.4    72    81.2
36   103.2    76    68.5
40    86.2    80    78.1

[Figure: scatterplot of Y (0 to 160) against X (0 to 100) for the example data.]

The Model – cubic polynomial (degree 3)

yi = β0 + β1xi + β2xi² + β3xi³ + εi,   i = 1, 2, …, n

Comment:

A cubic polynomial in x can be fitted to y by defining the variables

X1 = x, X2 = x², and X3 = x³

and then fitting the linear model y = β0 + β1X1 + β2X2 + β3X3 + ε, as in the sketch below.
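A minimal sketch of this cubic fit, using the (X, Y) data of the example table above (coefficient values are printed, not asserted):

```python
# Minimal sketch: fitting the cubic by least squares with
# X1 = x, X2 = x^2, X3 = x^3, using the example data above.
import numpy as np

x = np.array([4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60,
              64, 68, 72, 76, 80, 84, 88, 92, 96, 100], dtype=float)
y = np.array([35.0, 70.1, 61.4, 106.9, 93.2, 100.1, 90.8, 99.4, 103.2,
              86.2, 93.9, 88.1, 88.4, 76.6, 76.0, 68.1, 75.0, 81.2, 68.5,
              78.1, 65.4, 98.0, 104.3, 128.0, 150.8])

X = np.column_stack([np.ones_like(x), x, x**2, x**3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # beta0, beta1, beta2, beta3 of the fitted cubic
```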

Response Surface Models

Extending polynomial regression models to k independent variables.

Response surface models (2 independent variables): dependent variable Y and two independent variables x1 and x2. (These ideas are easily extended to more than two independent variables.)

The Model (a cubic response surface model):

Y = β0 + β1x1 + β2x2 + β3x1² + β4x1x2 + β5x2² + β6x1³ + β7x1²x2 + β8x1x2² + β9x2³ + ε

Compare with a linear model:

Y = β0 + β1x1 + β2x2 + ε

[Figure: two response surfaces plotted over the (x1, x2) plane, a curved cubic surface and a flat linear surface.]

The response surface model

Y = β0 + β1x1 + β2x2 + β3x1² + β4x1x2 + β5x2² + β6x1³ + β7x1²x2 + β8x1x2² + β9x2³ + ε

can be put into the form of a linear model:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8 + β9X9 + ε

by defining

X1 = x1, X2 = x2, X3 = x1², X4 = x1x2, X5 = x2²,
X6 = x1³, X7 = x1²x2, X8 = x1x2², X9 = x2³

More generally, consider the random variable Y with

1. E[Y] = g(U1, U2, …, Uk)
        = β1φ1(U1, U2, …, Uk) + β2φ2(U1, U2, …, Uk) + ⋯ + βpφp(U1, U2, …, Uk)
        = Σ_{i=1}^{p} βi φi(U1, U2, …, Uk)

and

2. var(Y) = σ²

• where β1, β2, …, βp are unknown parameters,
• and φ1, φ2, …, φp are known functions of the nonrandom variables U1, U2, …, Uk.
• Assume further that Y is normally distributed.

Now suppose that n independent observations of Y, (y1, y2, …, yn), are made, corresponding to n sets of values of (U1, U2, …, Uk):

(u11, u12, …, u1k),
(u21, u22, …, u2k),
…
(un1, un2, …, unk).

Let xij = φj(ui1, ui2, …, uik), j = 1, 2, …, p; i = 1, 2, …, n. Then

E[yi] = Σ_{j=1}^{p} βj φj(ui1, ui2, …, uik) = Σ_{j=1}^{p} βj xij

or E[y] = Xβ where xij = φj(ui1, ui2, …, uik), i.e. y = Xβ + ε.

Polynomial Regression Model: one variable U.

φj(u) = u^(j−1),  j = 1, 2, …, k + 1,

so that g(u) = β1 + β2u + β3u² + ⋯ + β_{k+1}u^k.

Quadratic Response Surface Model: two variables U1, U2.

φj(u1, u2) ∈ { 1 (constant); u1, u2 (linear); u1², u1u2, u2² (quadratic) }

Trigonometric Polynomial Models

y = β0 + γ1cos(2πf1x) + δ1sin(2πf1x) + ⋯ + γkcos(2πfkx) + δksin(2πfkx)

Linear form: Y = β0 + γ1C1 + δ1S1 + ⋯ + γkCk + δkSk

Variables: Y = y, C1 = cos(2πf1x), S1 = sin(2πf1x), …, Ck = cos(2πfkx), Sk = sin(2πfkx)

[Figure: a trigonometric polynomial curve oscillating between about −20 and 30 over 0 ≤ x ≤ 1.]
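A minimal sketch (simulated data; the frequencies f = 1 and 2 are an assumption for illustration) of fitting a trigonometric polynomial as a linear model:

```python
# Minimal sketch: trigonometric polynomial regression with regressors
# C_j = cos(2 pi f_j x) and S_j = sin(2 pi f_j x).
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 200)
y = 3 + 2*np.cos(2*np.pi*x) - np.sin(2*np.pi*2*x) + rng.normal(0, 0.2, x.size)

cols = [np.ones_like(x)]
for f in (1, 2):                       # chosen frequencies
    cols += [np.cos(2*np.pi*f*x), np.sin(2*np.pi*f*x)]
X = np.column_stack(cols)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta.round(2))  # approx. [3, 2, 0, 0, -1]
```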

General set of models: Y = β0p0(X) + β1p1(X) + ⋯ + βkpk(X) + ε

The Normal Equations, given data (x1, y1), (x2, y2), …, (xn, yn), with all sums over i = 1, …, n:

β̂0·Σp0(xi)²       + β̂1·Σp0(xi)p1(xi) + ⋯ + β̂k·Σp0(xi)pk(xi) = Σp0(xi)yi
β̂0·Σp1(xi)p0(xi)  + β̂1·Σp1(xi)²      + ⋯ + β̂k·Σp1(xi)pk(xi) = Σp1(xi)yi
⋮
β̂0·Σpk(xi)p0(xi)  + β̂1·Σpk(xi)p1(xi) + ⋯ + β̂k·Σpk(xi)²      = Σpk(xi)yi

Two important special cases

Polynomial models: p0(X) = 1, p1(X) = X, p2(X) = X², …, pk(X) = X^k

Trig-polynomial models: p0(X) = 1, p1(X) = sin(2πf1X), p2(X) = cos(2πf1X), …

Orthogonal Polynomial Models

Definition

Consider the values x0, x1, …, xn and the polynomials

p0(x) = α00
p1(x) = α10 + α11x
p2(x) = α20 + α21x + α22x²
…
pk(x) = αk0 + αk1x + αk2x² + ⋯ + αkk x^k

The polynomials are orthogonal relative to x0, x1, …, xn if

Σ_{j=0}^{n} pm(xj) pl(xj) = 0 for all m ≠ l.

If in addition

Σ_{j=0}^{n} pm(xj)² = 1 for all m,

they are called orthonormal.

Consider the model Y = β0p0(X) + β1p1(X) + ⋯ + βkpk(X) + ε.

This is equivalent to a polynomial model. Rather than the basis for this model being

1, X, X², X³, etc. (polynomials of degree 0, 1, 2, 3, etc.),

the basis is

p0(X), p1(X), p2(X), p3(X), etc.

The Normal Equations, given the data (x1, y1), (x2, y2), …, (xn, yn):

As before, with all sums over i = 1, …, n,

β̂0·Σp0(xi)²      + β̂1·Σp0(xi)p1(xi) + ⋯ + β̂k·Σp0(xi)pk(xi) = Σp0(xi)yi
⋮
β̂0·Σpk(xi)p0(xi) + β̂1·Σpk(xi)p1(xi) + ⋯ + β̂k·Σpk(xi)²      = Σpk(xi)yi

If the polynomials p0, p1, …, pk are orthonormal relative to x1, …, xn, the matrix of these equations becomes the identity:

1·β̂0 + 0·β̂1 + ⋯ + 0·β̂k = Σp0(xi)yi
0·β̂0 + 1·β̂1 + ⋯ + 0·β̂k = Σp1(xi)yi
⋮
0·β̂0 + 0·β̂1 + ⋯ + 1·β̂k = Σpk(xi)yi

with solution

β̂j = Σ_{i=1}^{n} pj(xi) yi,   j = 0, 1, …, k.
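Numerically, an orthonormal polynomial basis over the observed x's can be obtained from a QR factorization of the Vandermonde matrix, after which each coefficient is just a sum of products, exactly as above. A minimal sketch (illustrative data):

```python
# Minimal sketch: with an orthonormal basis, beta-hat_j = sum_i p_j(x_i) y_i.
# The columns of Q from a QR factorization of the Vandermonde matrix form an
# orthonormal polynomial basis evaluated at the data points.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.2, 2.9, 4.1, 4.8])

V = np.vander(x, 3, increasing=True)  # ordinary basis 1, x, x^2
Q, R = np.linalg.qr(V)                # Q: orthonormal columns over the x's

beta = Q.T @ y                        # no matrix inversion needed
fitted = Q @ beta                     # identical to the ordinary OLS fit
print(beta, fitted)
```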

Derivation of Orthogonal Polynomials

With equally spaced data points: suppose x0 = a, x1 = a + b, x2 = a + 2b, …, xn = a + nb.

Derivation of p0(x) = α00:

Σ_{j=0}^{n} p0(xj)² = (n + 1)α00² = 1, thus p0(xj) = α00 = 1/√(n + 1) for all j.

Derivation of p1(x) = α10 + α11x:

1. Orthogonality to p0 requires Σ_{j=0}^{n} p0(xj)p1(xj) = 0, i.e. Σ_{j=0}^{n} (α10 + α11(a + jb)) = 0, or

(n + 1)α10 + α11[(n + 1)a + b·n(n + 1)/2] = 0, so that α10 = −α11(a + bn/2).

Thus p1(xj) = α11(a + jb) − α11(a + bn/2) = α11b(j − n/2) = K1(j − n/2), where K1 = α11b.

2. To find K1 we use Σ_{j=0}^{n} p1(xj)² = 1: K1² Σ_{j=0}^{n} (j − n/2)² = 1, so

K1 = 1/√(Σ_{j=0}^{n} (j − n/2)²),

and hence

p1(xj) = (j − n/2) / √(Σ_{j=0}^{n} (j − n/2)²), with j = (xj − a)/b.

Derivation of p2(x) = α20 + α21x + α22x²:

The three equations

1. Σ_{j=0}^{n} p0(xj)p2(xj) = 0
2. Σ_{j=0}^{n} p1(xj)p2(xj) = 0
3. Σ_{j=0}^{n} p2(xj)² = 1

can be used to find α20, α21, α22. Substituting p2(xj) = α20 + α21(a + jb) + α22(a + jb)² and using

Σ_{j=0}^{n} j = n(n + 1)/2 and Σ_{j=0}^{n} j² = n(n + 1)(2n + 1)/6,

equations 1 and 2 reduce to two linear equations in α20, α21, α22. Solving them shows that p2(xj) is proportional to (j − n/2)² − n(n + 2)/12, so that

p2(xj) = K2[(j − n/2)² − n(n + 2)/12],

where K2 is found from equation 3:

K2 = 1/√(Σ_{j=0}^{n} [(j − n/2)² − n(n + 2)/12]²).

Derivation of p3(x) = α30 + α31x + α32x² + α33x³:

We continue, finding α30, α31, α32, α33 from the four equations

1. Σ_{j=0}^{n} p0(xj)p3(xj) = 0
2. Σ_{j=0}^{n} p1(xj)p3(xj) = 0
3. Σ_{j=0}^{n} p2(xj)p3(xj) = 0
4. Σ_{j=0}^{n} p3(xj)² = 1

The process is continued to find p4(x), p5(x), p6(x), etc.

With orthonormal polynomials the normal equations again have solution

β̂j = Σ_{i=1}^{n} pj(xi) yi.

To do the calculations we need the values of pi(xj). These values depend only on:

1. n = the number of observations,
2. i = the degree of the polynomial, and
3. j = the index of xj.

Orthogonal Linear Contrasts for Polynomial Regression

k   Polynomial    1    2    3    4    5    6    7   Σai²
3   Linear       -1    0    1                         2
    Quadratic     1   -2    1                         6
4   Linear       -3   -1    1    3                   20
    Quadratic     1   -1   -1    1                    4
    Cubic        -1    3   -3    1                   20
5   Linear       -2   -1    0    1    2              10
    Quadratic     2   -1   -2   -1    2              14
    Cubic        -1    2    0   -2    1              10
    Quartic       1   -4    6   -4    1              70
6   Linear       -5   -3   -1    1    3    5         70
    Quadratic     5   -1   -4   -4   -1    5         84
    Cubic        -5    7    4   -4   -7    5        180
    Quartic       1   -3    2    2   -3    1         28
7   Linear       -3   -2   -1    0    1    2    3    28
    Quadratic     5    0   -3   -4   -3    0    5    84
    Cubic        -1    1    1    0   -1   -1    1     6
    Quartic       3   -7    1    6    1   -7    3   154

Orthogonal Linear Contrasts for Polynomial Regression (continued)

k    Polynomial    1    2    3    4    5    6    7    8    9   10   Σai²
8    Linear       -7   -5   -3   -1    1    3    5    7              168
     Quadratic     7    1   -3   -5   -5   -3    1    7              168
     Cubic        -7    5    7    3   -3   -7   -5    7              264
     Quartic       7  -13   -3    9    9   -3  -13    7              616
     Quintic      -7   23  -17  -15   15   17  -23    7             2184
9    Linear       -4   -3   -2   -1    0    1    2    3    4          60
     Quadratic    28    7   -8  -17  -20  -17   -8    7   28        2772
     Cubic       -14    7   13    9    0   -9  -13   -7   14         990
     Quartic      14  -21  -11    9   18    9  -11  -21   14        2002
     Quintic      -4   11   -4   -9    0    9    4  -11    4         468
10   Linear       -9   -7   -5   -3   -1    1    3    5    7    9    330
     Quadratic     6    2   -1   -3   -4   -4   -3   -1    2    6    132
     Cubic       -42   14   35   31   12  -12  -31  -35  -14   42   8580
     Quartic      18  -22  -17    3   18   18    3  -17  -22   18   2860
     Quintic      -6   14   -1  -11   -6    6   11    1  -14    6    780

The Use of Dummy Variables

• In the examples so far the independent variables are continuous numerical variables.

• Suppose that some of the independent variables are categorical.

• Dummy variables are artificially defined variables designed to convert a model including categorical independent variables to the standard multiple regression model.

Example: Comparison of Slopes of k Regression Lines with Common Intercept

Situation:
• k treatments or k populations are being compared.
• For each of the k treatments we have measured both:
– Y (the response variable), and
– X (an independent variable).
• Y is assumed to be linearly related to X with:
– the slope dependent on treatment (population), while
– the intercept is the same for each treatment.

The Model:

Y = β0 + β1^(i) X + ε  for treatment i (i = 1, 2, …, k)

[Figure: graphical illustration of the above model; k regression lines with a common intercept and different slopes, labelled Treat 1, Treat 2, Treat 3, …, Treat k.]

• This model can be artificially put into the form of the multiple regression model by using dummy variables to handle the categorical independent variable, Treatments.

• Dummy variables are variables that are artificially defined.

In this case we define a new variable for each category of the categorical variable. That is, we define Xi for each category of treatments as follows:

Xi = X if the subject receives treatment i, and 0 otherwise.

Then the model can be written as follows:

The Complete Model:

Y = β0 + β1^(1)X1 + β1^(2)X2 + ⋯ + β1^(k)Xk + ε

where Xi = X if the subject receives treatment i, and 0 otherwise.

In this case:

Dependent variable: Y
Independent variables: X1, X2, …, Xk

In the above situation we would likely be interested in testing the equality of the slopes, namely the null hypothesis

H0: β1^(1) = β1^(2) = ⋯ = β1^(k)  (q = k − 1)

The Reduced Model:

Y = β0 + β1X + ε

Dependent variable: Y
Independent variable: X = X1 + X2 + ⋯ + Xk

Example:

In the following example we are measuring yield, Y, as it depends on the amount, X, of a pesticide. Again we will assume that the dependence of Y on X is linear. (The concepts used in this discussion can easily be adapted to the non-linear situation.)

• Suppose that the experiment is going to be repeated for three brands of pesticide: A, B and C.
• The quantity, X, of pesticide in this experiment was set at 3 different levels:
– 2 units/hectare,
– 4 units/hectare, and
– 8 units/hectare.
• Four test plots were randomly assigned to each of the nine combinations of brand and level of pesticide.
• Note that we would expect a common intercept for each brand of pesticide, since when the amount of pesticide, X, is zero the three brands would be equivalent.

The data for this experiment is given in the following table:

      X = 2    X = 4    X = 8
A     29.63    28.16    28.45
      31.87    33.48    37.21
      28.02    28.13    35.06
      35.24    28.25    33.99
B     32.95    29.55    44.38
      24.74    34.97    38.78
      23.38    36.35    34.92
      32.08    38.38    27.45
C     28.68    33.79    46.26
      28.70    43.95    50.77
      22.67    36.89    50.21
      30.02    33.56    44.14

[Figure: scatterplot of yield Y (0 to 60) against amount X (0 to 8) for brands A, B and C.]

The data as it would appear in a data file. The variables X1, X2 and X3 are the "dummy" variables.

Pesticide  X  X1  X2  X3   Y
A          2   2   0   0   29.63
A          2   2   0   0   31.87
A          2   2   0   0   28.02
A          2   2   0   0   35.24
B          2   0   2   0   32.95
B          2   0   2   0   24.74
B          2   0   2   0   23.38
B          2   0   2   0   32.08
C          2   0   0   2   28.68
C          2   0   0   2   28.70
C          2   0   0   2   22.67
C          2   0   0   2   30.02
A          4   4   0   0   28.16
A          4   4   0   0   33.48
A          4   4   0   0   28.13
A          4   4   0   0   28.25
B          4   0   4   0   29.55
B          4   0   4   0   34.97
B          4   0   4   0   36.35
B          4   0   4   0   38.38
C          4   0   0   4   33.79
C          4   0   0   4   43.95
C          4   0   0   4   36.89
C          4   0   0   4   33.56
A          8   8   0   0   28.45
A          8   8   0   0   37.21
A          8   8   0   0   35.06
A          8   8   0   0   33.99
B          8   0   8   0   44.38
B          8   0   8   0   38.78
B          8   0   8   0   34.92
B          8   0   8   0   27.45
C          8   0   0   8   46.26
C          8   0   0   8   50.77
C          8   0   0   8   50.21
C          8   0   0   8   44.14

Fitting the complete model: ANOVA

             df   SS            MS            F             Significance F
Regression    3   1095.815813   365.2719378   18.33114788   4.19538E-07
Residual     32    637.6415754   19.92629923
Total        35   1733.457389

Coefficients:
Intercept   26.24166667
X1           0.981388889
X2           1.422638889
X3           2.602400794

Fitting the reduced model: ANOVA

             df   SS            MS            F             Significance F
Regression    1    623.8232508  623.8232508   19.11439978   0.000110172
Residual     34   1109.634138    32.63629818
Total        35   1733.457389

Coefficients:
Intercept   26.24166667
X            1.668809524

The ANOVA table for testing the equality of slopes

                    df   SS            MS            F             Significance F
Common slope zero    1    623.8232508  623.8232508   31.3065283    3.51448E-06
Slope comparison     2    471.9925627  235.9962813   11.84345766   0.000141367
Residual            32    637.6415754   19.92629923
Total               35   1733.457389
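The F statistic in the "Slope comparison" row can be reproduced from the two residual sums of squares; a minimal sketch using the numbers above:

```python
# Minimal sketch: F test for equality of slopes, from the complete and
# reduced fits reported above.
rss_reduced, df_reduced = 1109.634138, 34
rss_complete, df_complete = 637.6415754, 32

q = df_reduced - df_complete  # 2 extra parameters in the complete model
f_stat = ((rss_reduced - rss_complete) / q) / (rss_complete / df_complete)
print(round(f_stat, 4))       # about 11.8435, matching the table
```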

Example: Comparison of Intercepts of k Regression Lines with a Common Slope (One-way Analysis of Covariance)

Situation:
• k treatments or k populations are being compared.
• For each of the k treatments we have measured both Y (the response variable) and X (an independent variable).
• Y is assumed to be linearly related to X with the intercept dependent on treatment (population), while the slope is the same for each treatment.
• Y is called the response variable, while X is called the covariate.

The Model:

Y = β0^(i) + β1X + ε  for treatment i (i = 1, 2, …, k)

[Figure: graphical illustration of the one-way analysis of covariance model; k parallel regression lines (common slope) with different intercepts, labelled Treat 1, Treat 2, Treat 3, …, Treat k.]

Equivalent forms of the model:

1) Y = μi + β1(X − X̄) + ε, where μi = the adjusted mean for treatment i

2) Y = μ + δi + β1(X − X̄) + ε, where μ = the overall adjusted mean response and δi = the adjusted effect of treatment i

• This model can be artificially put into the form of the multiple regression model by using dummy variables to handle the categorical independent variable, Treatments.

In this case we define a new variable for each category of the categorical variable. That is, we define Xi for categories i = 1, 2, …, (k − 1) of treatments as follows:

Xi = 1 if the subject receives treatment i, and 0 otherwise.

Then the model can be written as follows:

The Complete Model:

Y = β0 + δ1X1 + δ2X2 + ⋯ + δ_{k−1}X_{k−1} + β1X + ε

where Xi = 1 if the subject receives treatment i, and 0 otherwise.

In this case:

Dependent variable: Y
Independent variables: X1, X2, …, X_{k−1}, X

In the above situation we would likely be interested in testing the equality of the intercepts, namely the null hypothesis

H0: δ1 = δ2 = ⋯ = δ_{k−1} = 0  (q = k − 1)

The Reduced Model:

Y = β0 + β1X + ε

Dependent variable: Y
Independent variable: X

Example:

In the following example we are interested in comparing the effects of five workbooks (A, B, C, D, E) on the performance of students in mathematics. For each workbook, 15 students are selected (total n = 15 × 5 = 75). Each student is given a pretest (pretest score ≡ X) and a final test (final score ≡ Y).

The data

Workbook A    Workbook B    Workbook C    Workbook D    Workbook E
Pre   Post    Pre   Post    Pre   Post    Pre   Post    Pre   Post
43.0  46.4    43.6  52.5    57.5  61.9    59.9  56.1    43.2  46.0
55.3  43.9    45.2  61.8    49.3  57.5    50.5  49.6    60.7  59.7
59.4  59.7    54.2  69.1    48.0  52.5    45.0  46.1    42.7  45.4
51.7  49.6    45.5  61.7    31.3  42.9    55.0  53.2    46.6  44.3
53.0  49.3    43.4  53.3    65.3  74.5    52.6  50.8    42.6  46.5
48.7  47.1    50.1  57.4    47.1  48.9    62.8  60.1    25.6  38.4
45.4  47.4    36.2  48.7    34.8  47.2    41.4  49.5    52.5  57.7
42.1  33.3    55.1  61.9    53.9  59.8    62.1  58.3    51.2  47.1
60.0  53.2    48.9  55.0    42.7  49.6    56.4  58.1    48.8  50.4
32.4  34.1    52.9  63.3    47.6  55.6    54.2  56.8    44.1  52.7
74.4  66.7    51.7  64.7    56.1  62.4    51.6  46.1    73.8  73.6
43.2  43.2    55.3  66.4    39.7  52.1    63.3  56.0    52.6  50.8
44.5  42.5    45.2  59.4    32.3  49.7    37.3  48.8    67.8  66.8
47.1  51.3    37.6  56.9    59.5  67.1    39.2  45.1    42.9  47.2
57.0  48.9    41.7  51.3    46.2  55.2    62.1  58.0    51.7  57.0

The Model:

Y = β0^(i) + β1X + ε  for workbook i (i = A, B, C, D, E)

[Figure: graphical display of the data; final score against pretest score (both 0 to 80), with separate point symbols for workbooks A, B, C, D and E.]

Some comments

1. The linear relationship between Y (final score) and X (pretest score) models the students' differing aptitudes for mathematics.

2. The shifting up and down of this linear relationship measures the effect of the workbooks on the final score Y.

The Model: Y = β0^(i) + β1X + ε for workbook i (i = A, B, C, D, E)

[Figure: graphical illustration of the one-way analysis of covariance model for the workbook example; parallel lines A, B, C, D with a common slope.]

The data as it would appear in a data file (excerpt):

Pre    Final  Workbook
43.0   46.4   A
55.3   43.9   A
59.4   59.7   A
51.7   49.6   A
53.0   49.3   A
48.7   47.1   A
45.4   47.4   A
42.1   33.3   A
60.0   53.2   A
32.4   34.1   A
74.4   66.7   A
43.2   43.2   A
44.5   42.5   A
47.1   51.3   A
57.0   48.9   A
43.6   52.5   B
45.2   61.8   B
54.2   69.1   B
45.5   61.7   B
43.4   53.3   B
…
54.2   56.8   D
51.6   46.1   D
63.3   56.0   D
37.3   48.8   D
39.2   45.1   D
62.1   58.0   D
43.2   46.0   E
60.7   59.7   E
42.7   45.4   E
46.6   44.3   E
42.6   46.5   E
25.6   38.4   E
52.5   57.7   E
51.2   47.1   E
48.8   50.4   E
44.1   52.7   E
73.8   73.6   E
52.6   50.8   E
67.8   66.8   E
42.9   47.2   E
51.7   57.0   E

The data as it would appear in a data file with the dummy variables (X1, X2, X3, X4) added (excerpt):

Pre    Final  Workbook  X1  X2  X3  X4
43.0   46.4   A          1   0   0   0
55.3   43.9   A          1   0   0   0
59.4   59.7   A          1   0   0   0
51.7   49.6   A          1   0   0   0
53.0   49.3   A          1   0   0   0
48.7   47.1   A          1   0   0   0
45.4   47.4   A          1   0   0   0
42.1   33.3   A          1   0   0   0
60.0   53.2   A          1   0   0   0
32.4   34.1   A          1   0   0   0
74.4   66.7   A          1   0   0   0
43.2   43.2   A          1   0   0   0
44.5   42.5   A          1   0   0   0
47.1   51.3   A          1   0   0   0
57.0   48.9   A          1   0   0   0
43.6   52.5   B          0   1   0   0
45.2   61.8   B          0   1   0   0
…
37.3   48.8   D          0   0   0   1
39.2   45.1   D          0   0   0   1
62.1   58.0   D          0   0   0   1
43.2   46.0   E          0   0   0   0
60.7   59.7   E          0   0   0   0
…
51.7   57.0   E          0   0   0   0

Here is the data file in SPSS with the dummy variables (X1, X2, X3, X4) added. They can be added within SPSS.

Fitting the complete model

The dependent variable is the final score, Y. The independent variables are the pre-score X and the four dummy variables X1, X2, X3, X4.

The Output

Variables Entered/Removed (Dependent Variable: FINAL)
Model 1: Variables Entered: X4, PRE, X3, X1, X2 (Method: Enter). All requested variables entered.

Model Summary (Predictors: (Constant), X4, PRE, X3, X1, X2)
Model 1: R = .908, R Square = .825, Adjusted R Square = .812, Std. Error of the Estimate = 3.594

The Output, continued

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), X4, PRE, X3, X1, X2)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    4191.378          5   838.276       64.895   .000
Residual       891.297         69    12.917
Total         5082.675         74

Coefficients (Dependent Variable: FINAL)

Model 1       B        Std. Error   Beta     t        Sig.
(Constant)    16.954   2.441                  6.944   .000
PRE             .709    .045         .809    15.626   .000
X1            -4.958   1.313        -.241    -3.777   .000
X2             8.553   1.318         .416     6.489   .000
X3             5.231   1.317         .254     3.972   .000
X4            -1.602   1.320        -.078    -1.214   .229

The interpretation of the coefficients (from the coefficients table above):

• The coefficient of PRE (.709) is the common slope.
• The constant (16.954) is the intercept for workbook E (the category with all dummies zero).
• The coefficients of X1, X2, X3, X4 (−4.958, 8.553, 5.231, −1.602) are the changes in the intercept when we change from workbook E to the other workbooks A, B, C, D respectively.

The model can be written as follows:

The Complete Model:

Y = β0 + δ1X1 + δ2X2 + δ3X3 + δ4X4 + β1X + ε

1. When the workbook is E, then X1 = 0, …, X4 = 0, and

Y = β0 + β1X + ε

2. When the workbook is A, then X1 = 1, X2 = X3 = X4 = 0, and

Y = (β0 + δ1) + β1X + ε

hence δ1 is the change in the intercept when we change from workbook E to workbook A.

Testing for the equality of the intercepts:

H0: δ1 = δ2 = δ3 = δ4 = 0

The reduced model:

Y = β0 + β1X + ε

The only independent variable is X (the pre-score).

Fitting the reduced model

The dependent variable is the final score, Y. The only independent variable is the pre-score X.

The Output for the reduced model

Variables Entered/Removed (Dependent Variable: FINAL)
Model 1: Variables Entered: PRE (Method: Enter). All requested variables entered.

Model Summary (Predictors: (Constant), PRE)
Model 1: R = .700, R Square = .490, Adjusted R Square = .483, Std. Error of the Estimate = 5.956

Note the lower R² than for the complete model.

The Output, continued

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), PRE)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    2492.779          1   2492.779      70.263   .000
Residual      2589.896         73     35.478
Total         5082.675         74

Note the increased residual sum of squares compared with the complete model.

Coefficients (Dependent Variable: FINAL)

Model 1       B        Std. Error   Beta    t       Sig.
(Constant)    23.105   3.692                6.259   .000
PRE             .614    .073         .700   8.382   .000

The F Test

F = (Reduction in R.S.S. / q) / (MSE for the complete model)

The Reduced model:

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), PRE)
Regression    2492.779    1   2492.779   70.263   .000
Residual      2589.896   73     35.478
Total         5082.675   74

The Complete model:

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), X4, PRE, X3, X1, X2)
Regression    4191.378    5   838.276    64.895   .000
Residual       891.297   69    12.917
Total         5082.675   74

The F test

Reduced ANOVA:
                 Sum of Squares   df   Mean Square     F          Sig.
Regression       2492.77885        1   2492.77885      70.2626    4.56272E-13
Residual         2589.89635       73     35.47803219
Total            5082.6752        74

Complete ANOVA:
                 Sum of Squares   df   Mean Square     F          Sig.
Regression       4191.377971       5    838.2755942    64.89532   9.99448E-25
Residual          891.297229      69     12.91735115
Total            5082.6752        74

Partitioned:
                 Sum of Squares   df   Mean Square     F          Sig.
Slope            2492.77885        1   2492.77885      192.9791   1.13567E-21
Equality of int. 1698.599121       4    424.6497803     32.87437  2.46006E-15
Residual          891.297229      69     12.91735115
Total            5082.6752        74

The "Equality of int." row is the test for equality of the intercepts.

Testing for zero slope, i.e. H0: β1 = 0

The reduced model:

Y = β0 + δ1X1 + δ2X2 + δ3X3 + δ4X4 + ε

The independent variables are X1, X2, X3, X4 (the dummies).

The Complete model:

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), X4, PRE, X3, X1, X2)
Regression    4191.378    5   838.276   64.895   .000
Residual       891.297   69    12.917
Total         5082.675   74

The Reduced model:

ANOVA (Dependent Variable: FINAL; Predictors: (Constant), X4, X3, X2, X1)
Regression    1037.475    4   259.369    4.488   .003
Residual      4045.200   70    57.789
Total         5082.675   74

The F test

Reduced:
              Sum of Squares   df   Mean Square     F          Sig.
Regression    1037.4752         4    259.3688        4.488237  0.002757501
Residual      4045.2           70     57.78857143
Total         5082.6752        74

Complete:
              Sum of Squares   df   Mean Square     F          Sig.
Regression    4191.377971       5    838.2755942    64.89532   9.99448E-25
Residual       891.297229      69     12.91735115
Total         5082.6752        74

Partitioned:
              Sum of Squares   df   Mean Square     F          Sig.
Regression    1037.4752         4    259.3688        20.0791   5.30755E-11
Zero slope    3153.902771       1   3153.902771     244.1602   2.3422E-24
Residual       891.297229      69     12.91735115
Total         5082.6752        74

The Analysis of Covariance

• This analysis can also be performed by using a package that can perform Analysis of Covariance (ANACOVA).
• The package sets up the dummy variables automatically.

Here is the data file in SPSS. The dummy variables are no longer needed.

In SPSS, to perform ANACOVA you select from the menu: Analyze -> General Linear Model -> Univariate.

In the dialog box that appears, you now select:
1. The dependent variable Y (final score).
2. The fixed factor (the categorical independent variable: workbook).
3. The covariate (the continuous independent variable: pretest score).

Compare the resulting output with the previously computed table.

The output: the ANOVA table

Tests of Between-Subjects Effects (Dependent Variable: FINAL)

Source            Type III Sum of Squares   df   Mean Square   F         Sig.
Corrected Model   4191.378                   5    838.276       64.895   .000
Intercept          837.590                   1    837.590       64.842   .000
PRE               3153.903                   1   3153.903      244.160   .000
WORKBOOK          1698.599                   4    424.650       32.874   .000
Error              891.297                  69     12.917
Total             219815.6                  75
Corrected Total   5082.675                  74

(R Squared = .825, Adjusted R Squared = .812)

The previously computed table:

                 Sum of Squares   df   Mean Square     F          Sig.
Slope            2492.77885        1   2492.77885      192.9791   1.13567E-21
Equality of int. 1698.599121       4    424.6497803     32.87437  2.46006E-15
Residual          891.297229      69     12.91735115
Total            5082.6752        74

The PRE sum of squares (3153.903) is the sum of squares in the numerator when we test whether the slope is zero (allowing the intercepts to be different); the WORKBOOK row tests the equality of the intercepts.

The Use of Dummy Variables: Summary

Example: comparison of slopes of k regression lines with common intercept.

The Model: Y = β0 + β1^(i)X + ε for treatment i (i = 1, 2, …, k); common intercept, different slopes.

The Complete Model: Y = β0 + β1^(1)X1 + β1^(2)X2 + ⋯ + β1^(k)Xk + ε, where Xi = X if the subject receives treatment i, and 0 otherwise.

Example: comparison of intercepts of k regression lines with a common slope (one-way analysis of covariance).

The Model: Y = β0^(i) + β1X + ε for treatment i (i = 1, 2, …, k); common slope, different intercepts. Y is the response variable and X is the covariate.

The Complete Model: Y = β0 + δ1X1 + δ2X2 + ⋯ + δ_{k−1}X_{k−1} + β1X + ε, where Xi = 1 if the subject receives treatment i, and 0 otherwise.

Another application of the use of dummy variables

• The dependent variable, Y, is linearly related to X, but the slope changes at one or several known values of X (nodes).

[Figure: a piecewise-linear relationship between Y and X; the slope changes at the nodes x1, x2, …, xk, with segment slopes β1, β2, …, βk.]

The model:

Y = β0 + β1X + ε                               if X ≤ x1
Y = β0 + β1x1 + β2(X − x1) + ε                 if x1 ≤ X ≤ x2
Y = β0 + β1x1 + β2(x2 − x1) + β3(X − x2) + ε   if x2 ≤ X ≤ x3
etc.

Now define

X1 = X if X ≤ x1;  x1 if x1 < X

X2 = 0 if X ≤ x1;  X − x1 if x1 ≤ X ≤ x2;  x2 − x1 if x2 ≤ X

X3 = 0 if X ≤ x2;  X − x2 if x2 ≤ X ≤ x3;  x3 − x2 if x3 ≤ X

etc.

Then the model above can be written

Y = β0 + β1X1 + β2X2 + β3X3 + ⋯ + ε

An Example

In this example we are measuring Y at time X. Y is growing linearly with time. At time X = 10, an additive is added to the process which may change the rate of growth.

The data

X   0.0   1.0   2.0   3.0   4.0   5.0   6.0
Y   3.9   5.9   6.4   6.3   7.5   7.9   8.5
X   7.0   8.0   9.0   10.0  11.0  12.0  13.0
Y   10.7  10.0  12.4  11.0  11.5  13.9  17.6
X   14.0  15.0  16.0  17.0  18.0  19.0  20.0
Y   18.2  16.8  21.8  23.1  22.9  26.2  27.7

[Figure: scatterplot of Y (0 to 30) against X (0 to 20); the growth rate visibly increases after X = 10.]

Now define the dummy variables

X1 = X if X ≤ 10;  10 if X > 10

X2 = 0 if X ≤ 10;  X − 10 if X > 10
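A minimal sketch of this piecewise fit using the example data above (it should reproduce, up to rounding, the coefficients in the SPSS output that follows):

```python
# Minimal sketch: piecewise-linear regression with one node at x = 10,
# using the dummy variables X1 and X2 defined above.
import numpy as np

x = np.arange(21, dtype=float)  # X = 0, 1, ..., 20
y = np.array([3.9, 5.9, 6.4, 6.3, 7.5, 7.9, 8.5, 10.7, 10.0, 12.4, 11.0,
              11.5, 13.9, 17.6, 18.2, 16.8, 21.8, 23.1, 22.9, 26.2, 27.7])

x1 = np.minimum(x, 10.0)        # X1 = X below the node, 10 above it
x2 = np.maximum(x - 10.0, 0.0)  # X2 = 0 below the node, X - 10 above it

X = np.column_stack([np.ones_like(x), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta.round(3))            # expect about [4.714, 0.673, 1.579]
```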

The data as it appears in SPSS – x1, x2 are the dummy variables

We now regress y on x1 and x2.

The Output

Model Summary (Predictors: (Constant), X2, X1)
Model 1: R = .990, R Square = .980, Adjusted R Square = .978, Std. Error of the Estimate = 1.0626

ANOVA (Dependent Variable: Y; Predictors: (Constant), X2, X1)

Model 1       Sum of Squares   df   Mean Square   F         Sig.
Regression    1015.909          2   507.954       449.875   .000
Residual        20.324         18     1.129
Total         1036.232         20

Coefficients (Dependent Variable: Y)

Model 1       B       Std. Error   Beta    t        Sig.
(Constant)    4.714   .577                  8.175   .000
X1             .673   .085         .325     7.886   .000
X2            1.579   .085         .761    18.485   .000

[Figure: the fitted piecewise-linear model plotted over the data; slope .673 before X = 10 and 1.579 after.]

Testing for no change in slope

Here we want to test H0: β1 = β2 vs HA: β1 ≠ β2.

The reduced model is

Y = β0 + β1(X1 + X2) + ε = β0 + β1X + ε

Fitting the reduced model

We now regress y on x.

The Output

Model Summary (Predictors: (Constant), X)
Model 1: R = .971, R Square = .942, Adjusted R Square = .939, Std. Error of the Estimate = 1.7772

ANOVA (Dependent Variable: Y; Predictors: (Constant), X)

Model 1       Sum of Squares   df   Mean Square   F         Sig.
Regression     976.219          1   976.219       309.070   .000
Residual        60.013         19     3.159
Total         1036.232         20

Coefficients (Dependent Variable: Y)

Model 1       B       Std. Error   Beta    t        Sig.
(Constant)    2.559   .749                  3.418   .003
X             1.126   .064         .971    17.580   .000

[Figure: the reduced fit with a common slope plotted over the data.]

The test for the equality of slope

Reduced model:
              Sum of Squares   df   Mean Square     F          Sig.
Regression     976.2194805      1   976.2194805     309.0697   3.27405E-13
Residual        60.01290043    19     3.158573707
Total         1036.232381      20

Complete model:
              Sum of Squares   df   Mean Square     F          Sig.
Regression    1015.908579       2   507.9542895     449.8753   0
Residual        20.32380204    18     1.129100113
Total         1036.232381      20

Equality of slope:
                   Sum of Squares   df   Mean Square    F          Sig.
Slope              976.2194805       1   976.2194805    864.5996   1.14256E-16
Equality of slope   39.6890984       1    39.6890984     35.15109  1.30425E-05
Residual            20.32380204     18     1.129100113
Total             1036.232381       20
