General Linear Least-Squares and Nonlinear Regression (cau.ac.kr/~jjang14/NAE/Chap14.pdf)


Part 4, Chapter 14

General Linear Least-Squares and Nonlinear Regression

All images copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

PowerPoints organized by Dr. Michael R. Gustafson II, Duke University. Revised by Prof. Jang, CAU.

Chapter Objectives

• Knowing how to implement polynomial regression.
• Knowing how to implement multiple linear regression.
• Understanding the formulation of the general linear least-squares model.
• Understanding how the general linear least-squares model can be solved with MATLAB using either the normal equations or left division.
• Understanding how to implement nonlinear regression with optimization techniques.

Polynomial Regression

• The least-squares procedure from Chapter 13 can be readily extended to fit data to a higher-order polynomial. Again, the idea is to minimize the sum of the squares of the estimate residuals.
• The figure shows the same data fit with:
a) a first-order polynomial
b) a second-order polynomial
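As a quick illustration of the comparison in the figure, the two fits can be reproduced with MATLAB's built-in polyfit and polyval (a sketch added here, not part of the original slides; the data vectors are those of Example 14.1 later in the chapter):

x = [0 1 2 3 4 5]';                    % Example 14.1 data
y = [2.1 7.7 13.6 27.2 40.9 61.1]';
p1 = polyfit(x, y, 1);                 % first-order (straight-line) fit
p2 = polyfit(x, y, 2);                 % second-order (quadratic) fit
xx = linspace(min(x), max(x))';        % dense grid for smooth curves
plot(x, y, 'o', xx, polyval(p1, xx), '--', xx, polyval(p2, xx), '-')
legend('data', '1st-order fit', '2nd-order fit', 'Location', 'northwest')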

Process and Measures of Fit

• For a second-order polynomial, the best fit would mean minimizing:

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$

• In general, this would mean minimizing:

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - \cdots - a_m x_i^m \right)^2$

• The standard error for fitting an mth-order polynomial to n data points is:

$s_{y/x} = \sqrt{\dfrac{S_r}{n - (m+1)}}$

because the mth-order polynomial has (m+1) coefficients.

• The coefficient of determination r^2 is still found using:

$r^2 = \dfrac{S_t - S_r}{S_t}$
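As a sketch (not in the original slides), these measures can be computed in MATLAB for an mth-order polynomial fit obtained with the built-in polyfit; x and y are assumed to be column vectors of data:

m = 2;                                % polynomial order
p = polyfit(x, y, m);                 % coefficients, highest power first
e = y - polyval(p, x);                % estimate residuals
Sr = sum(e.^2);                       % sum of the squares of the residuals
St = sum((y - mean(y)).^2);           % total sum of squares about the mean
syx = sqrt(Sr/(length(x) - (m+1)))    % standard error of the estimate, sy/x
r2 = (St - Sr)/St                     % coefficient of determination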

Example 14.1 (1/4)

Q. Fit a second-order polynomial to the data in the first two columns below.

x_i    y_i      (y_i - ybar)^2    (y_i - a_0 - a_1 x_i - a_2 x_i^2)^2
0      2.1      544.44            0.14332
1      7.7      314.47            1.00286
2      13.6     140.03            1.08160
3      27.2     3.12              0.80487
4      40.9     239.22            0.61959
5      61.1     1272.11           0.09434
Sum    152.6    2513.39           3.74657

Example 14.1 (2/4)

Setting the partial derivatives of S_r with respect to each coefficient to zero:

$\dfrac{\partial S_r}{\partial a_0} = -2 \sum \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0$

$\dfrac{\partial S_r}{\partial a_1} = -2 \sum x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0$

$\dfrac{\partial S_r}{\partial a_2} = -2 \sum x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0$

These yield the normal equations:

$n\, a_0 + \left( \sum x_i \right) a_1 + \left( \sum x_i^2 \right) a_2 = \sum y_i$

$\left( \sum x_i \right) a_0 + \left( \sum x_i^2 \right) a_1 + \left( \sum x_i^3 \right) a_2 = \sum x_i y_i$

$\left( \sum x_i^2 \right) a_0 + \left( \sum x_i^3 \right) a_1 + \left( \sum x_i^4 \right) a_2 = \sum x_i^2 y_i$

For these data: m = 2, n = 6, xbar = 2.5, ybar = 25.433, and the sums are Sum x_i = 15, Sum x_i^2 = 55, Sum x_i^3 = 225, Sum x_i^4 = 979, Sum y_i = 152.6, Sum x_i y_i = 585.6, Sum x_i^2 y_i = 2488.8.

The simultaneous linear equations are:

$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}$

Example 14.1 (3/4)

Using MATLAB:

>> N = [6 15 55; 15 55 225; 55 225 979];
>> r = [152.6 585.6 2488.8]';
>> a = N\r
a =
    2.4786
    2.3593
    1.8607

The least-squares quadratic for this case is y = 2.4786 + 2.3593x + 1.8607x^2.
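The same coefficients can be cross-checked with the built-in polyfit (a verification step added here, not in the original example); note that polyfit returns them highest power first:

>> polyfit([0 1 2 3 4 5], [2.1 7.7 13.6 27.2 40.9 61.1], 2)
ans =
    1.8607    2.3593    2.4786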

Example 14.1 (4/4)

The standard error of the estimate based on the regression polynomial is:

$s_{y/x} = \sqrt{\dfrac{3.74657}{6 - (2+1)}} = 1.1175$

The coefficient of determination:

$r^2 = \dfrac{2513.39 - 3.74657}{2513.39} = 0.99851$

The correlation coefficient: r = 0.99925.

99.851% of the original uncertainty has been explained by the fit (model) -> the equation represents an excellent fit.

General Linear Least Squares

• Linear, polynomial, and multiple linear regression all belong to the general linear least-squares model:

$y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e$

where z_0, z_1, ..., z_m are a set of m+1 basis functions and e is the error of the fit.
• The basis functions can be any function of the data but cannot contain any of the coefficients a_0, a_1, etc.

Solving General Linear Least-Squares Coefficients

• The equation

$y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e$

can be re-written for each data point as a matrix equation:

$\{y\} = [Z]\{a\} + \{e\}$

where {y} contains the dependent data, {a} contains the coefficients of the equation, {e} contains the error at each point, and [Z] is:

$[Z] = \begin{bmatrix} z_{01} & z_{11} & \cdots & z_{m1} \\ z_{02} & z_{12} & \cdots & z_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ z_{0n} & z_{1n} & \cdots & z_{mn} \end{bmatrix}$

with z_{ji} representing the value of the jth basis function calculated at the ith point.
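To make the construction of [Z] concrete, here is a small sketch (the sinusoidal basis is an illustrative assumption, not from the slides) for the model y = a0 + a1*cos(w*t) + a2*sin(w*t), where t is a column vector of the n independent-variable values and w is a known frequency:

Z = [ones(size(t)) cos(w*t) sin(w*t)]  % column j holds basis function j
                                       % row i is [z0(ti) z1(ti) z2(ti)]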

Solving General Linear Least-Squares Coefficients

• Generally, [Z] is not a square matrix, so simple inversion cannot be used to solve for {a}. Instead, the sum of the squares of the estimate residuals is minimized:

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2$

• This quantity can be minimized by taking its partial derivative with respect to each of the coefficients and setting the resulting equations to zero. The outcome of this minimization is the normal equations in matrix form:

$[Z]^T [Z] \{a\} = [Z]^T \{y\}$

• The coefficient of determination is

$r^2 = 1 - \dfrac{S_r}{S_t} = 1 - \dfrac{\sum \left( y_i - \hat{y}_i \right)^2}{\sum \left( y_i - \bar{y} \right)^2}$

where $\hat{y}_i$ represents the prediction of the least-squares fit and $\{\hat{y}\} = [Z]\{a\}$.
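Both MATLAB solution routes named in the chapter objectives follow directly from this result; a minimal sketch, assuming [Z] and {y} have already been built:

a = (Z'*Z)\(Z'*y)   % normal equations: solve [Z]'[Z]{a} = [Z]'{y}
a = Z\y             % left division on the rectangular [Z] (QR-based, better conditioned)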

Example 14.3 (1/3)

Q. Use matrix operations to repeat Ex. 14.1.

>> x = [0 1 2 3 4 5]';  % enter the data to fit
>> y = [2.1 7.7 13.6 27.2 40.9 61.1]';
>> z = [ones(size(x)) x x.^2]  % [Z] matrix
z =
     1     0     0
     1     1     1
     1     2     4
     1     3     9
     1     4    16
     1     5    25

Example 14.3 (2/3)

>> z'*z  % [Z]'[Z]
ans =
     6    15    55
    15    55   225
    55   225   979
>> a = (z'*z)\(z'*y)  % solve the normal equations [Z]'[Z]{a} = [Z]'{y} for the coefficients
a =
    2.4786
    2.3593
    1.8607

Example 14.3 (3/3)

>> Sr = sum((y-z*a).^2)  % Sr
Sr =
    3.7466
>> r2 = 1-Sr/sum((y-mean(y)).^2)  % r^2
r2 =
    0.9985
>> syx = sqrt(Sr/(length(x)-length(a)))  % sy/x
syx =
    1.1175

These commands evaluate $r^2 = 1 - \dfrac{S_r}{S_t} = 1 - \dfrac{\sum \left( y_i - \hat{y}_i \right)^2}{\sum \left( y_i - \bar{y} \right)^2}$ and $s_{y/x} = \sqrt{\dfrac{S_r}{n - (m+1)}}$.

Nonlinear Regression

• As seen in the previous chapter, not all fits are linear equations of coefficients and basis functions.
• One method is to perform optimization techniques to directly determine the least-squares fit.
– First write a function that returns the sum of the squares of the estimate residuals for a fit, and then use MATLAB's fminsearch function to find the values of the coefficients where a minimum occurs.

For example, for the model $y = a_0 \left( 1 - e^{-a_1 x} \right) + e$, the function to minimize is

$f(a_0, a_1) = \sum_{i=1}^{n} \left[ y_i - a_0 \left( 1 - e^{-a_1 x_i} \right) \right]^2$

and the general calling syntax is

[x, fval] = fminsearch(fun, x0, options, p1, p2, …)
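A self-contained sketch of this approach for the model above (the data here are hypothetical, generated from known coefficients purely for illustration):

x = (0:0.5:5)';                        % hypothetical independent data
y = 9.5*(1 - exp(-0.8*x));             % hypothetical data: a0 = 9.5, a1 = 0.8
f = @(a) sum((y - a(1)*(1 - exp(-a(2)*x))).^2);  % sum of squared residuals
[a, fval] = fminsearch(f, [1 1])       % recovers a close to [9.5 0.8], fval near 0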

Nonlinear Regression in MATLAB Example (option)

• Given dependent force data F for independent velocity data v, determine the coefficients for the fit:

$F = a_0 v^{a_1}$

• First, write a function called fSSR.m containing the following:

function f = fSSR(a, xm, ym)
yp = a(1)*xm.^a(2);
f = sum((ym-yp).^2);

• Then, use fminsearch in the command window to obtain the values of a that minimize fSSR:

a = fminsearch(@fSSR, [1, 1], [], v, F)

where [1, 1] is an initial guess for the [a0, a1] vector and [] is a placeholder for the options.
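Passing v and F through fminsearch's trailing arguments is an older calling style; an equivalent and more current approach wraps fSSR in an anonymous function (a sketch, assuming v and F are already in the workspace):

a = fminsearch(@(a) fSSR(a, v, F), [1, 1])  % same minimization, no options placeholder needed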

Nonlinear Regression vs. Linear Regression Employing Transformations

• Although the model coefficients are different, it is difficult to judge which fit is superior.
• Nonlinear regression fits the original data.
• Linear regression employing transformations fits the transformed data.
