Engineering Computation Curve Fitting 1 Curve Fitting By Least-Squares Regression and Spline Interpolation Part 7

Engineering Computation Curve Fitting 1

Curve Fitting by Least-Squares Regression and Spline Interpolation

Part 7

Engineering Computation Curve fitting 2

Curve Fitting:

Given a set of points:
- experimental data
- tabular data
- etc.

Fit a curve (surface) to the points so that we can easily evaluate f(x) at any x of interest.

If x is within the data range: interpolating (generally safe)

If x is outside the data range: extrapolating (often dangerous)

Engineering Computation Curve fitting 3

Curve Fitting:

Two main methods will be covered:

1. Least-Squares Regression
• Function is "best fit" to data.
• Does not necessarily pass through points.
• Used for scattered data (experimental).
• Can develop models for analysis/design.

2. Interpolation
• Function passes through all (or most) points.
• Interpolates values of well-behaved (precise) data or for geometric design.

Engineering Computation Curve Fitting & Interpolation

Curve Fitting:

1. We have discussed Least-Squares Regression, where the function is "best fit" to the points but does not necessarily pass through them.

2. We now discuss Interpolation & Extrapolation: the function passes through all (or at least most) points.

[Figure: interpolation and extrapolation regions labeled.]

Engineering Computation Least Squares Regression: General Procedure 5

Engineering Computation Least Squares Regression 6

Curve Fitting by Least-Squares Regression:

Objective: Obtain a low-order approximation (curve or surface) that "best fits" the data.

Note:
• Because the order of the approximation is less than the number of data points, the curve or surface cannot pass through all points.
• We will need a consistent criterion for determining the "best fit."

Typical Usage:
• Scattered (experimental) data
• Develop empirical models for analysis/design.

Engineering Computation Least Squares Regression 7

Least-Squares Regression:

1. In laboratory, apply x, measure y, tabulate data.

2. Plot data and examine the relationship.

[Figure: scatter plot of y versus x, with a data point (xi, yi) marked.]

Engineering Computation Least Squares Regression 9

Least-Squares Regression:

3. Develop a "model" – an approximate relationship between y and x:

y = m x + b

4. Use the model to predict or estimate y for any given x.

5. "Best fit" of the data requires:
• An optimal way of finding the parameters (e.g., slope and intercept of a straight line).
• Perhaps optimizing the selection of the model form (i.e., linear, quadratic, exponential, ...).
• That the magnitudes of the residual errors do not vary in any systematic fashion. [In statistical applications, the residual errors should be independent and identically distributed.]

Engineering Computation Least Squares Regression 10

Least-Squares Regression

Given: n data points: (x1,y1), (x2,y2), …, (xn,yn)
Obtain: "Best fit" curve:

f(x) = a0 Z0(x) + a1 Z1(x) + a2 Z2(x) + … + am Zm(x)

The ai's are the unknown parameters of the model.
The Zi's are known functions of x.

We will focus on two of the many possible types of regression models:

Simple Linear Regression: Z0(x) = 1 and Z1(x) = x

General Polynomial Regression: Z0(x) = 1, Z1(x) = x, Z2(x) = x^2, …, Zm(x) = x^m
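As a minimal sketch of the polynomial model above, the basis-function values can be collected column by column into a matrix. The sample points and the order m below are made-up illustrations, not values from the lecture.

```matlab
% Hypothetical sample points (illustrative only).
x = [0; 1; 2; 3];
m = 2;                        % polynomial order, so m+1 unknown coefficients

% Column j+1 holds Z_j(x) = x.^j for j = 0..m.
Z = zeros(numel(x), m+1);
for j = 0:m
    Z(:, j+1) = x.^j;
end
disp(Z)
```

Each row corresponds to one data point and each column to one basis function, so simple linear regression is the special case m = 1.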

b = REGRESS(y,X) returns the vector of regression coefficients, b, in the linear model y = Xb (X is an n x p matrix, y is the n x 1 vector of observations).

[B,BINT,R,RINT,STATS] = REGRESS(y,X,alpha) uses the input ALPHA to calculate 100(1 - ALPHA)% confidence intervals for B and the residual vector, R, in BINT and RINT respectively. The vector STATS contains the R-square statistic along with the F and p values for the regression.

>> x = linspace(0,1,20)';           % 20 equally spaced points on [0,1]
>> y = 2*x + 1 + 0.1*randn(20,1);   % line y = 2x + 1 plus Gaussian noise
>> plot(x,y,'.')

>> xx = [ones(20,1), x];            % column of ones for the intercept
>> b = regress(y,xx)
b =
    1.0115
    1.9941

>> yy = xx*b;                       % fitted values
>> hold on
>> plot(x,yy,'k-')

(The values of b vary from run to run because of randn.)
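REGRESS belongs to the Statistics Toolbox. Where that toolbox is not installed, the same coefficients can be obtained with base functions. The sketch below assumes a noise-free version of the data above so that the expected answer (intercept 1, slope 2) is known.

```matlab
x = linspace(0, 1, 20)';      % same grid as the session above
y = 2*x + 1;                  % noise removed (assumption for this sketch)

X = [ones(20, 1), x];
b = X \ y;                    % least-squares solution of X*b = y

p = polyfit(x, y, 1);         % p(1) = slope, p(2) = intercept
disp(b')
disp(p)
```

Note the ordering: b lists the intercept first, while polyfit returns the highest power first.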

Engineering Computation Least Squares Regression: General Procedure 12

Least Squares Regression (cont'd):

General Procedure: For the ith data point, (xi,yi), we find the set of coefficients for which:

yi = a0 Z0(xi) + a1 Z1(xi) + ... + am Zm(xi) + ei

where ei is the residual error, the difference between the reported value and the model:

ei = yi – a0 Z0(xi) – a1 Z1(xi) – … – am Zm(xi)

Our "best fit" will minimize the total sum of the squares of the residuals:

$$S_r = \sum_{i=1}^{n} e_i^2$$

Engineering Computation Least Squares Regression: General Procedure 13

Our "best fit" will be the function which minimizes the sum of squares of the residuals:

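[Figure: data point (xi, yi) plotted against the fitted line; the residual ei is the vertical distance between the measured value and the modeled value.]

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j Z_j(x_i) \right)^2$$

$$S_r = \sum_{i=1}^{n} \left( y_i - a_0 Z_0(x_i) - a_1 Z_1(x_i) - a_2 Z_2(x_i) - \cdots - a_m Z_m(x_i) \right)^2$$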
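To make the quantity being minimized concrete, here is a small sketch that evaluates the residuals and their sum of squares for a straight-line model. The four data points and the trial coefficients are invented for illustration.

```matlab
% Made-up data (not from the lecture).
x = [1; 2; 3; 4];
y = [2; 4; 5; 8];
Z = [ones(size(x)), x];       % Z0(x) = 1, Z1(x) = x

% Residuals and Sr for a trial guess a0 = 0, a1 = 2.
a_trial = [0; 2];
e_trial = y - Z*a_trial;
Sr_trial = sum(e_trial.^2);

% Residuals and Sr for the least-squares coefficients.
a_ls = Z \ y;
Sr_ls = sum((y - Z*a_ls).^2);

fprintf('Sr (trial) = %.2f, Sr (least squares) = %.2f\n', Sr_trial, Sr_ls);
```

For this data the trial line gives Sr = 1.00 and the least-squares line gives Sr = 0.70, so no other choice of a0 and a1 can go below 0.70.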

Engineering Computation Least Squares Regression: General Procedure 14

Least Squares Regression (cont'd):

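$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 Z_0(x_i) - \cdots - a_m Z_m(x_i) \right)^2$$

To minimize this expression with respect to the unknowns a0, a1, …, am, take derivatives of Sr and set them to zero:

$$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} Z_0(x_i) \left( y_i - a_0 Z_0(x_i) - \cdots - a_m Z_m(x_i) \right) = 0$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} Z_1(x_i) \left( y_i - a_0 Z_0(x_i) - \cdots - a_m Z_m(x_i) \right) = 0$$

$$\vdots$$

$$\frac{\partial S_r}{\partial a_m} = -2 \sum_{i=1}^{n} Z_m(x_i) \left( y_i - a_0 Z_0(x_i) - \cdots - a_m Z_m(x_i) \right) = 0$$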

Engineering Computation Least Squares: Linear Algebra 15

Least Squares Regression (cont'd):

In Linear Algebra form:

{Y} = [Z]{A} + {E}   or   {E} = {Y} – [Z]{A}

where:
{E} and {Y}: n x 1
[Z]: n x (m+1)
{A}: (m+1) x 1

n = # points, (m+1) = # unknowns

{E}T = [e1 e2 ... en],

{Y}T = [y1 y2 ... yn],

{A}T = [a0 a1 a2 ... am]

$$[Z] = \begin{bmatrix} Z_0(x_1) & Z_1(x_1) & \cdots & Z_m(x_1) \\ Z_0(x_2) & Z_1(x_2) & \cdots & Z_m(x_2) \\ \vdots & \vdots & & \vdots \\ Z_0(x_n) & Z_1(x_n) & \cdots & Z_m(x_n) \end{bmatrix}$$

Engineering Computation Least Squares: Sum Square error 16

Least Squares Regression (cont'd):

{E} = {Y} – [Z]{A}

Then Sr = {E}T{E} = ({Y} – [Z]{A})T ({Y} – [Z]{A})

= {Y}T{Y} – {A}T[Z]T{Y} – {Y}T[Z]{A} + {A}T[Z]T[Z]{A}

= {Y}T{Y} – 2{A}T[Z]T{Y} + {A}T[Z]T[Z]{A}

Setting $\partial S_r / \partial a_i = 0$ for i = 0, ..., m yields:

$\partial S_r / \partial \{A\}$ = 0 = 2[Z]T[Z]{A} – 2[Z]T{Y}   or   [Z]T[Z]{A} = [Z]T{Y}
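One way to fill in the two steps above, offered as a sketch rather than the lecture's own derivation: each cross term is a scalar, and a scalar equals its own transpose, which is why the two cross terms combine. The gradient then follows from the standard rules for linear and quadratic forms.

$$\{Y\}^T [Z] \{A\} = \left( \{Y\}^T [Z] \{A\} \right)^T = \{A\}^T [Z]^T \{Y\}$$

$$\frac{\partial}{\partial \{A\}} \left( \{A\}^T [Z]^T \{Y\} \right) = [Z]^T \{Y\}, \qquad \frac{\partial}{\partial \{A\}} \left( \{A\}^T [Z]^T [Z] \{A\} \right) = 2 [Z]^T [Z] \{A\}$$

$$\frac{\partial S_r}{\partial \{A\}} = -2 [Z]^T \{Y\} + 2 [Z]^T [Z] \{A\} = 0$$

The factor of 2 in the quadratic term relies on [Z]T[Z] being symmetric.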

Engineering Computation Least Squares: Normal Equations 17

Least Squares Regression (cont'd):

[Z]T[Z]{A} = [Z]T{Y} (C&C Eq. 17.25)

This is the general form of the Normal Equations.

They provide (m+1) equations in (m+1) unknowns.

(Note that we end up with a system of linear equations.)
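Because the result is an ordinary linear system, it can be solved directly. The following is a minimal sketch on invented data; it also compares the answer with the backslash operator applied to the rectangular system, which solves the same least-squares problem without forming [Z]T[Z].

```matlab
% Made-up data (not from the lecture).
x = [1; 2; 3; 4];
y = [2; 4; 5; 8];
Z = [ones(size(x)), x];        % n x (m+1) with m = 1

% Normal equations: (Z'Z) A = Z'Y.
A_normal = (Z'*Z) \ (Z'*y);

% Least-squares solve of the overdetermined system Z*A = y.
A_direct = Z \ y;

disp([A_normal, A_direct])
```

Both columns agree here (a0 = 0, a1 = 1.9). For poorly conditioned problems the second form is usually preferred, since forming Z'Z squares the condition number.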

Engineering Computation Least Squares: Simple Linear Regression 18

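Simple Linear Regression (m = 1):

Given: n data points, (x1,y1), (x2,y2), …, (xn,yn), with n > 2
Obtain: "Best fit" curve: f(x) = a0 + a1x

from the n equations:
y1 = a0 + a1x1 + e1
y2 = a0 + a1x2 + e2
...
yn = a0 + a1xn + en

Or, in matrix form: [Z]T[Z]{A} = [Z]T{Y}

$$\begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix}$$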

Engineering Computation Least Squares: Simple Linear Regression 19

Simple Linear Regression (m = 1): Normal Equations

[Z]T[Z] {A} = [Z]T{Y}

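upon multiplying the matrices, becomes

$$\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{Bmatrix}$$

Normal Equations for Linear Regression, C&C Eqs. (17.4-5)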

(This form works well for spreadsheets.)
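The same column-sum bookkeeping a spreadsheet would do can be sketched in a few lines. The data values are invented for illustration.

```matlab
% Made-up data (not from the lecture).
x = [1; 2; 3; 4];
y = [2; 4; 5; 8];
n = numel(x);

% 2 x 2 normal-equation matrix and right-hand side built from sums.
M   = [n,      sum(x);
       sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];

a = M \ rhs;                  % a(1) = a0, a(2) = a1
disp(M), disp(rhs), disp(a)
```

Here the sums are n = 4, Σx = 10, Σx² = 30, Σy = 19 and Σxy = 57, which give a0 = 0 and a1 = 1.9.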

Engineering Computation Least Squares: Simple Linear Regression 20

Simple Linear Regression (m = 1):

[Z]T[Z] {A} = [Z]T{Y}

C&C equations (17.6) and (17.7)

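Solving for {A}:

$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

$$a_0 = \frac{1}{n} \sum_{i=1}^{n} y_i - a_1 \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{y} - a_1 \bar{x}$$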

Engineering Computation Least Squares: Simple Linear Regression 21

Simple Linear Regression (m = 1):

[Z]T[Z] {A} = [Z]T{Y}

A better version of the first equation is:

$$a_1 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

which is easier and numerically more stable, but the 2nd equation remains the same:

$$a_0 = \bar{y} - a_1 \bar{x}$$
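The stability claim can be probed with a small experiment. This is a sketch on invented data whose x values carry a large offset, so the raw-sum formula from the previous slide has to subtract two huge, nearly equal numbers.

```matlab
% Made-up data with a large offset in x (illustrative only).
x = 1e8 + [1; 2; 3; 4];
y = [2; 4; 5; 8];
n = numel(x);

% Raw-sum form: numerator and denominator are differences of ~1e17 terms.
a1_raw = (n*sum(x.*y) - sum(x)*sum(y)) / (n*sum(x.^2) - sum(x)^2);

% Centered form: subtract the means first, then sum.
xbar = mean(x);
ybar = mean(y);
a1_centered = sum((y - ybar).*(x - xbar)) / sum((x - xbar).^2);
a0 = ybar - a1_centered*xbar;

fprintf('raw: %g   centered: %g\n', a1_raw, a1_centered);
```

The exact slope is 1.9, since shifting x does not change it. The centered form recovers that value, while the raw-sum result is not reliable for this data because the true denominator (20) is smaller than the rounding error of the terms being subtracted.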

ENGRD 241 / CEE 241: Engineering Computation Curve Fitting 22

Common Nonlinear Relations:

Objective: Use linear equations for simplicity.

Remedy: Transform data into linear form and perform regressions.

Given: data which appears as:

(1) Exponential-like curve:

$$y = a_1 e^{b_1 x}$$

(e.g., population growth, radioactive decay, attenuation of a transmission line)

Can also use: ln(y) = ln(a1) + b1x
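A short sketch of the transformation in practice: fit a straight line to ln(y) versus x, then map the intercept back with exp. The data are synthetic and noise-free, generated from assumed values a1 = 3 and b1 = 0.5, so the fit should recover them.

```matlab
% Synthetic data from y = 3*exp(0.5*x) (assumed values, illustrative only).
x = [0; 1; 2; 3];
y = 3*exp(0.5*x);

% Straight-line fit of ln(y) against x: ln(y) = ln(a1) + b1*x.
p  = polyfit(x, log(y), 1);   % p(1) = slope, p(2) = intercept
b1 = p(1);
a1 = exp(p(2));

fprintf('a1 = %.4f, b1 = %.4f\n', a1, b1);
```

With noisy data the recovered parameters minimize the error in ln(y), not in y, which is a different criterion from a direct fit of the exponential.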

ENGRD 241 / CEE 241: Engineering Computation Curve Fitting 23

Common Nonlinear Relations:

(2) Power-like curve:

$$y = a_2 x^{b_2}$$

ln(y) = ln(a2) + b2 ln(x)

(3) Saturation growth-rate curve (e.g., population growth under limiting conditions):

$$y = \frac{a_3 x}{b_3 + x}$$

$$\frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3} \, \frac{1}{x}$$

[Figure: saturation growth-rate curves for a3 = 5 and b3 = 1..10; horizontal axis ticks from 20 to 100, vertical axis ticks from 1 to 5.]

Be careful about the implied distribution of the errors. Always use the untransformed values for error analysis.
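Both linearizations can be sketched the same way as the exponential case. The data below are synthetic and noise-free, generated from assumed parameter values (a2 = 2, b2 = 1.5, a3 = 5, b3 = 2), so each fit should recover what went in.

```matlab
% Synthetic sample points (illustrative only).
x = [1; 2; 4; 8];

% (2) Power law: ln(y) = ln(a2) + b2*ln(x).
y_pow = 2*x.^1.5;
p  = polyfit(log(x), log(y_pow), 1);
b2 = p(1);
a2 = exp(p(2));

% (3) Saturation growth: 1/y = 1/a3 + (b3/a3)*(1/x).
y_sat = 5*x./(2 + x);
q  = polyfit(1./x, 1./y_sat, 1);  % q(1) = b3/a3, q(2) = 1/a3
a3 = 1/q(2);
b3 = q(1)*a3;

fprintf('a2 = %.3f, b2 = %.3f, a3 = %.3f, b3 = %.3f\n', a2, b2, a3, b3);
```

Goodness-of-fit measures should then be evaluated on the original (x, y) values, not on the transformed ones.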

Engineering Computation Goodness of fit 24

Major Points in Least-Squares Regression:

1. In all regression models one is solving an overdetermined system of equations, i.e., more equations than unknowns.

2. How good is the fit? Often based on a coefficient of determination, r^2.

Engineering Computation Goodness of fit 25

r^2 compares the spread of the data about the regression line with the spread of the data about the mean.

Spread of the data around the regression line (yi' is the modeled value):

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - y_i')^2$$

Spread of the data around the mean:

$$S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

Engineering Computation Goodness of fit 26

Coefficient of determination:

$$r^2 = \frac{S_t - S_r}{S_t}$$

describes how much of the variance is "explained" by the regression equation.

• Want r^2 close to 1.0.
• Doesn't work for comparing models that have different numbers of parameters.
• Be careful when using different transformations: always do the analysis on the untransformed data.
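A minimal sketch of the calculation for a straight-line fit, on invented data:

```matlab
% Made-up data (not from the lecture).
x = [1; 2; 3; 4];
y = [2; 4; 5; 8];

Z    = [ones(size(x)), x];
a    = Z \ y;                     % least-squares a0, a1
yfit = Z*a;                       % modeled values

Sr = sum((y - yfit).^2);          % spread about the regression line
St = sum((y - mean(y)).^2);       % spread about the mean
r2 = (St - Sr)/St;

fprintf('Sr = %.4f, St = %.4f, r^2 = %.4f\n', Sr, St, r2);
```

For this data the fit is y = 1.9x, giving Sr = 0.70, St = 18.75 and r^2 of about 0.963.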

Engineering Computation Standard Error of the estimate 27

Precision:

If the spread of the points around the line is of similar magnitude along the entire range of the data, then one can use

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$$

= standard error of the estimate (standard deviation in y)

to describe the precision of the regression estimate (in which m+1 is the number of coefficients calculated for the fit, e.g., m+1 = 2 for linear regression).
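As a sketch of the formula for a straight-line fit (m = 1), again on invented data:

```matlab
% Made-up data (not from the lecture).
x = [1; 2; 3; 4];
y = [2; 4; 5; 8];
m = 1;                            % straight line: m+1 = 2 coefficients

Z  = [ones(size(x)), x];
a  = Z \ y;
Sr = sum((y - Z*a).^2);

n   = numel(x);
syx = sqrt(Sr/(n - (m + 1)));     % standard error of the estimate

fprintf('s_y/x = %.4f\n', syx);
```

With Sr = 0.70 and n - (m+1) = 2, this gives the square root of 0.35, about 0.592. The formula needs n > m+1, otherwise the denominator is zero or negative.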

Engineering Computation Standard Error of the estimate 28

Statistics: Chapra and Canale, in sections PT5.2, 17.1.3 and 17.4.3, discuss the statistical interpretation of least squares regression and some of the associated statistical concepts.

The statistical theory of least squares regression is elegant, powerful, and widely used in the analysis of real data throughout the sciences. See Lecture Notes pages X-14 through X-16.