Nonlinear Regression Analysis



    CHEE824

    Nonlinear Regression Analysis

    J. McLellan

    Winter 2004


    Module 1:

    Linear Regression


    Outline -

    assessing systematic relationships

    matrix representation for multiple regression

    least squares parameter estimates

    diagnostics

    graphical, quantitative

    further diagnostics

    testing the need for terms; lack of fit test

    precision of parameter estimates, predicted responses

    correlation between parameter estimates


    The Scenario

    We want to describe the systematic relationship

    between a response variable and a number of

    explanatory variables

    multiple regression

    we will consider the case which is linear in the parameters


    Assessing Systematic Relationships

    Is there a systematic relationship?

    Two approaches: graphical

    scatterplots, casement plots

    quantitative - form correlations between response and explanatory variables

    consider forming a correlation matrix - a table of pairwise correlations between the response and the explanatory variables, and between pairs of explanatory variables

    correlation between explanatory variables leads to correlated parameter estimates


    Graphical Methods for Analyzing Data

    Visualizing relationships between variables

    Techniques

    scatterplots

    scatterplot matrices

    also referred to as casement plots

    Time sequence plots


    Scatterplots - Example

    [Scatterplot (teeth 4v*20c): DISCOLOR vs. FLUORIDE]

    tooth discoloration data - discoloration vs. fluoride

    trend - possibly nonlinear?


    Scatterplot - Example

    [Scatterplot (teeth 4v*20c): DISCOLOR vs. BRUSHING]

    tooth discoloration data - discoloration vs. brushing

    significant trend? - doesn't appear to be present


    Scatterplot - Example

    [Scatterplot (teeth 4v*20c): DISCOLOR vs. BRUSHING]

    tooth discoloration data - discoloration vs. brushing

    variance appears to decrease as # of brushings increases


    Scatterplot matrices

    are a table of scatterplots for a set of variables

    Look for -

    systematic trend between independent variable and dependent variables - to be described by estimated model

    systematic trend between supposedly independent variables - indicates that these quantities are correlated

    correlation can negatively influence model estimation results

    not independent information

    scatterplot matrices can be generated automatically with statistical software, or manually using Excel


    Scatterplot Matrices - tooth data

    [Matrix Plot (teeth 4v*20c): pairwise scatterplots of FLUORIDE, AGE, BRUSHING, DISCOLOR]


    Time Sequence Plot - Naphtha 90% Point

    [Time sequence plot: naphtha 90% point (degrees F) vs. observation number]

    Time sequence plot for the naphtha 90% point - indicates amount of heavy hydrocarbons present in gasoline range material

    excursion - sudden shift in operation

    meandering about average operating point - time correlation in data


    What do dynamic data look like?

    [Time series plot of industrial data: var1 and var2 vs. sample number]


    Assessing Systematic Relationships

    Quantitative Methods

    correlation

    formal defn plus sample statistic (Pearson's r)

    covariance

    formal defn plus sample statistic

    provide a quantitative measure of systematic LINEAR relationships


    Covariance

    Formal Definition

    given two random variables X and Y, the covariance is

    Cov(X, Y) = E{(X - μ_X)(Y - μ_Y)}

    E{ } - expected value

    sign of the covariance indicates the sign of the slope of the systematic linear relationship

    positive value --> positive slope
    negative value --> negative slope

    issue - covariance is SCALE DEPENDENT


    Covariance

    motivation for covariance as a measure of systematic

    linear relationship

    look at pairs of departures about the mean of X, Y

    [two scatterplots of Y vs. X, each marked with the mean of X, Y, showing how departures about the mean pair up]


    Correlation

    is the dimensionless covariance

    divide covariance by standard devns of X, Y

    formal definition:

    Corr(X, Y) = ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y)

    properties

    dimensionless

    range: -1 ≤ ρ(X, Y) ≤ 1

    ρ(X, Y) = -1 : strong linear relationship with negative slope
    ρ(X, Y) = +1 : strong linear relationship with positive slope

    Note - the correlation gives NO information about the actual numerical value of the slope.


    Estimating Covariance, Correlation

    from process data (with N pairs of observations)

    Sample Covariance:

    R_XY = (1/(N-1)) Σ_{i=1}^{N} (X_i - X̄)(Y_i - Ȳ)

    Sample Correlation:

    r_XY = Σ_{i=1}^{N} (X_i - X̄)(Y_i - Ȳ) / ((N-1) s_X s_Y)


    Making Inferences

    The sample covariance and correlation are STATISTICS, and have their own probability distributions.

    Confidence interval for sample correlation - the following is approximately distributed as a standard normal random variable:

    √(N-3) (tanh⁻¹(r) - tanh⁻¹(ρ))

    derive confidence limits for tanh⁻¹(ρ) and convert to confidence limits for the true correlation using tanh


    Confidence Interval for Correlation

    Procedure

    1. find z_{α/2} for desired confidence level

    2. confidence interval for tanh⁻¹(ρ) is

       tanh⁻¹(r) ± z_{α/2} / √(N-3)

    3. convert the limits to confidence limits for the correlation by taking tanh of the limits in step 2

    A hypothesis test can also be performed using this function of the correlation and comparing to the standard normal distribution
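    As a concrete illustration, here is a minimal Python sketch of this procedure (assuming numpy is available); applied to the solder data a few slides below (r = -0.92, N = 10) it reproduces the worked limits:

        import numpy as np

        def corr_confidence_interval(r, N, z_alpha2=1.96):
            # Fisher z (tanh^-1) confidence interval for a correlation
            z = np.arctanh(r)                        # tanh^-1(r)
            hw = z_alpha2 / np.sqrt(N - 3)           # z_{alpha/2} / sqrt(N-3)
            return np.tanh(z - hw), np.tanh(z + hw)  # convert back with tanh

        print(corr_confidence_interval(-0.92, 10))   # approx (-0.981, -0.690)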


    Example - Solder Thickness

    Objective - study the effect of temperature on solder thickness

    Data - in pairs:

    Solder Temperature (C) Solder Thickness (microns)

    245 171.6

    215 201.1

    218 213.2

    265 153.3

    251 178.9

    213 226.6

    234 190.3

    257 171

    244 197.5

    225 209.8


    Example - Solder Thickness

    [Scatterplot: solder thickness (microns) vs. temperature (C)]

    Correlation matrix (from Excel):

                                  Solder Temperature (C)   Solder Thickness (microns)
    Solder Temperature (C)        1
    Solder Thickness (microns)    -0.920001236             1


    Example - Solder Thickness

    Confidence Interval

    z_{α/2} of 1.96 (95% confidence level)

    limits in tanh⁻¹(ρ):  -2.329837282   -0.848216548

    limits in ρ:          -0.981238575   -0.690136605


    Empirical Modeling - Terminology

    response

    dependent variable - responds to changes in other variables

    the response is the characteristic of interest which we are

    trying to predict

    explanatory variable

    independent variable, regressor variable, input, factor

    these are the quantities that we believe have an

    influence on the response

    parameter

    coefficients in the model that describe how the

    regressors influence the response


    Models

    When we are estimating a model from data, we

    consider the following form:

    Y = f(x, β) + ε

    Y - response
    x - explanatory variables
    β - parameters
    ε - random error


    The Random Error Term

    is included to reflect fact that measured data contain

    variability

    successive measurements under the same conditions (values of the explanatory variables) are likely to be slightly different - this is the stochastic component

    the functional form describes the deterministic component

    random error is not necessarily the result of mistakes in experimental procedures - it reflects inherent variability - "noise"


    Types of Models

    linear/nonlinear in the parameters

    linear/nonlinear in the explanatory variables

    number of response variables

    single response (standard regression)

    multi-response (or multivariate models)

    From the perspective of statistical model-building, the key point is whether the model is linear or nonlinear in the PARAMETERS.


    Linear Regression Models

    linear in the parameters

    can be nonlinear in the regressors, e.g.,

    T95 = β0 + β1 T_LGO + β2 T_mid

    T95 = β0 + β1 T_GO + β2 T_mid


    Nonlinear Regression Models

    nonlinear in the parameters

    e.g., Arrhenius rate expression:

    r = k0 exp(-E/(RT))

    k0 - linear (if E is fixed)
    E  - nonlinear


    Nonlinear Regression Models

    sometimes transformably linear

    start with

    r = k0 exp(-E/(RT))

    and take ln of both sides to produce

    ln(r) = ln(k0) - E (1/(RT))

    which is of the form

    Y = β0 + β1 (1/(RT))

    linear in the parameters


    Transformations

    note that linearizing the nonlinear equation by

    transformation can lead to misleading estimates if the

    proper estimation method is not used

    transforming the data can alter the statistical

    distribution of the random error term


    Ordinary LS vs. Multi-Response

    single response (ordinary least squares):

    T95,LGO = b0 + b1 T_LGO + b2 T_mid

    multi-response (e.g., Partial Least Squares):

    T95,LGO  = b10 + b11 T_LGO  + b12 T_mid
    T95,kero = b20 + b21 T_kero + b22 T_mid

    issue - joint behaviour of responses, noise

    We will be focussing on single response models.


    Linear Multiple Regression

    Model Equation

    Y_i = β1 X_i1 + ... + βp X_ip + ε_i

    Y_i  - i-th observation of response (i-th data point)
    X_i1 - i-th value of explanatory variable X1
    X_ip - i-th value of explanatory variable Xp
    ε_i  - random noise in i-th observation of response

    The intercept can be considered as corresponding to an X which always has the value 1


    Assumptions for Least Squares Estimation

    Values of explanatory variables are known EXACTLY

    random error is strictly in the response variable

    practically - a random component will almost always be

    present in the explanatory variables as well

    we assume that this component has a substantially smaller effect on the response than the random component in the response

    if random fluctuations in the explanatory variables are important, consider an alternative method (Errors in Variables approach)


    Assumptions for Least Squares Estimation

    The form of the equation provides an adequate representation for the data

    can test adequacy of model as a diagnostic

    Variance of random error is CONSTANT over range of data collected

    e.g., variance of random fluctuations in thickness measurements at high temperatures is the same as variance at low temperatures

    data is heteroscedastic if the variance is not constant - a different estimation procedure is required

    thought - percentage error in instruments?


    Assumptions for Least Squares Estimation

    The random fluctuations in each measurement are statistically independent from those of other measurements

    at same experimental conditions

    at other experimental conditions - implies that random component has no memory

    no correlation between measurements

    Random error term is normally distributed

    typical assumption

    not essential for least squares estimation

    important when determining confidence intervals,

    conducting hypothesis tests


    Least Squares Estimation - graphically

    least squares - minimize sum of squared prediction errors

    [diagram: response (solder thickness) vs. T - data points scattered about the deterministic true relationship; the vertical distance from a point to the line is the prediction error (residual)]


    More Notation and Terminology

    Random error is independent, identically distributed

    (I.I.D) -- can say that it is IID Normal

    Capitals - Y - denotes random variable - except in case of explanatory variable, where capital is used to denote formal defn

    Lower case - y, x - denotes measured values of variables

    Model:        Y = β0 + β1 X + ε

    Measurement:  y = β0 + β1 x + ε


    More Notation and Terminology

    Estimate - denoted by "hat"

    examples - estimates of response, parameter:  ŷ, β̂0

    Residual - difference between measured and predicted response:

    e = y - ŷ


    Matrix Representation for Multiple Regression

    We can arrange the observations in tabular form - a vector of observations, and a matrix of explanatory values:

    [ Y_1 ]   [ X_11   X_12   ...   X_1p ] [ β_1 ]   [ ε_1 ]
    [ Y_2 ]   [ X_21   X_22   ...   X_2p ] [ β_2 ]   [ ε_2 ]
    [  ⋮  ] = [   ⋮      ⋮            ⋮  ] [  ⋮  ] + [  ⋮  ]
    [ Y_N ]   [ X_N1   X_N2   ...   X_Np ] [ β_p ]   [ ε_N ]


    Matrix Representation for Multiple Regression

    The model is written as:

    Y = Xβ + ε

    Y - N×1 vector;  X - N×p matrix;  β - p×1 vector;  ε - N×1 vector

    N --> number of data observations
    p --> number of parameters


    Least Squares Parameter Estimates

    We make the same assumptions as in the straight line

    regression case:

    independent random noise components in each observation

    explanatory variables known exactly - no randomness

    variance constant over experimental region (identically distributed noise components)


    Residual Vector

    Given a set of parameter values β̃, the residual vector is formed from the matrix expression:

    [ e_1 ]   [ Y_1 ]   [ X_11   X_12   ...   X_1p ] [ β̃_1 ]
    [ e_2 ]   [ Y_2 ]   [ X_21   X_22   ...   X_2p ] [ β̃_2 ]
    [  ⋮  ] = [  ⋮  ] - [   ⋮      ⋮            ⋮  ] [  ⋮   ]
    [ e_N ]   [ Y_N ]   [ X_N1   X_N2   ...   X_Np ] [ β̃_p ]

    i.e., e = Y - Xβ̃


    Sum of Squares of Residuals

    is the same as before, but can be expressed as the squared length of the residual vector:

    SSE = Σ_{i=1}^{N} e_i² = eᵀe = (Y - Xβ̃)ᵀ(Y - Xβ̃)


    Least Squares Parameter Estimates

    Find the set of parameter values that minimize the sum

    of squares of residuals (SSE)

    apply necessary conditions for an optimum from calculus (stationary point):

    ∂(SSE)/∂β |_{β̂} = 0

    system of N equations in p unknowns, with number of parameters < number of observations: over-determined system of equations

    solution - set of parameter values that comes closest to satisfying all equations (in a least squares sense)


    Least Squares Parameter Estimates

    The solution is:

    β̂ = (XᵀX)⁻¹ XᵀY

    (XᵀX)⁻¹Xᵀ is a generalized matrix inverse of X - a generalization of the standard concept of matrix inverse to the case of a non-square matrix


    Example - Solder Thickness

    Let's analyze the data considered for the straight line case:

    Solder Temperature (C)   Solder Thickness (microns)
    245                      171.6
    215                      201.1
    218                      213.2
    265                      153.3
    251                      178.9
    213                      226.6
    234                      190.3
    257                      171
    244                      197.5
    225                      209.8

    Model:

    Y = β0 + β1 X + ε


    Example - Solder Thickness

    In matrix form:

    [ 171.6 ]   [ 1   245 ]          [ ε_1  ]
    [ 201.1 ]   [ 1   215 ]          [ ε_2  ]
    [ 213.2 ]   [ 1   218 ]          [ ε_3  ]
    [ 153.3 ]   [ 1   265 ]          [ ε_4  ]
    [ 178.9 ] = [ 1   251 ] [ β0 ] + [ ε_5  ]
    [ 226.6 ]   [ 1   213 ] [ β1 ]   [ ε_6  ]
    [ 190.3 ]   [ 1   234 ]          [ ε_7  ]
    [ 171   ]   [ 1   257 ]          [ ε_8  ]
    [ 197.5 ]   [ 1   244 ]          [ ε_9  ]
    [ 209.8 ]   [ 1   225 ]          [ ε_10 ]

    Y = Xβ + ε


    Example - Solder Thickness

    In order to calculate the Least Squares Estimates:

    XᵀX = [ 10     2367   ]        XᵀY = [ 1913.3 ]
          [ 2367   563335 ]              [ 449420 ]


    Example - Solder Thickness

    The least squares parameter estimates are obtained as:

    β̂ = (XᵀX)⁻¹XᵀY = [ 18.373    -0.0772 ] [ 1913.3 ] = [ 458.10 ]
                      [ -0.0772    0.0003 ] [ 449420 ]   [ -1.13  ]


    Example - Wave Solder Defects

    (page 8-31, Course Notes)

    Wave Solder Defects Data

    Run   Conveyor Speed   Pot Temp   Flux Density   No. of Defects
    1     -1               -1         -1             100
    2      1               -1         -1             119
    3     -1                1         -1             118
    4      1                1         -1             217
    5     -1               -1          1              20
    6      1               -1          1              42
    7     -1                1          1              41
    8      1                1          1             113
    9      0                0          0             101
    10     0                0          0              96
    11     0                0          0             115


    Example - Wave Solder Defects

    In matrix form:

    [ 100 ]   [ 1  -1  -1  -1 ]          [ ε_1  ]
    [ 119 ]   [ 1   1  -1  -1 ]          [ ε_2  ]
    [ 118 ]   [ 1  -1   1  -1 ]          [ ε_3  ]
    [ 217 ]   [ 1   1   1  -1 ] [ β0 ]   [ ε_4  ]
    [  20 ]   [ 1  -1  -1   1 ] [ β1 ]   [ ε_5  ]
    [  42 ] = [ 1   1  -1   1 ] [ β2 ] + [ ε_6  ]
    [  41 ]   [ 1  -1   1   1 ] [ β3 ]   [ ε_7  ]
    [ 113 ]   [ 1   1   1   1 ]          [ ε_8  ]
    [ 101 ]   [ 1   0   0   0 ]          [ ε_9  ]
    [  96 ]   [ 1   0   0   0 ]          [ ε_10 ]
    [ 115 ]   [ 1   0   0   0 ]          [ ε_11 ]

    Y = Xβ + ε


    Example - Wave Solder Defects

    To calculate least squares parameter estimates:

    XᵀX = [ 11   0   0   0 ]        XᵀY = [ 1082 ]
          [  0   8   0   0 ]              [  212 ]
          [  0   0   8   0 ]              [  208 ]
          [  0   0   0   8 ]              [ -338 ]


    Example - Wave Solder Defects

    Least squares parameter estimates:

    β̂ = (XᵀX)⁻¹XᵀY = [ 1/11    0     0     0  ] [ 1082 ]   [  98.36 ]
                      [  0     1/8    0     0  ] [  212 ] = [  26.50 ]
                      [  0      0    1/8    0  ] [  208 ]   [  26.00 ]
                      [  0      0     0    1/8 ] [ -338 ]   [ -42.25 ]


    Examples - Comments

    if there are N runs, and the model has p parameters, XᵀX is a p×p matrix (smaller dimension than number of runs)

    element j of XᵀY is Σ_i x_ij y_i, for parameters j = 1, ..., p

    in the Wave Solder Defects example, the values of the explanatory variables for the runs followed very specific patterns of -1 and +1, and XᵀX was a diagonal matrix

    in the Solder Thickness example, the values of the explanatory variable did not follow a specific pattern, and XᵀX was not diagonal


    Graphical Diagnostics

    Residuals vs. Predicted Response Values

    residual e_i vs. predicted value ŷ_i

    [plot: residuals scattered evenly in a horizontal band about zero]

    - even scatter over range of prediction
    - no discernible pattern
    - roughly half the residuals are positive, half negative

    DESIRED RESIDUAL PROFILE


    Graphical Diagnostics

    Residuals vs. Predicted Response Values

    residual e_i vs. predicted value ŷ_i

    [plot: residuals scattered about zero, with one point lying far outside the main band]

    outlier lies outside main body of residuals

    RESIDUAL PROFILE WITH OUTLIERS


    Graphical Diagnostics

    Residuals vs. Predicted Response Values

    residual e_i vs. predicted value ŷ_i

    [plot: residual band fans out as the predicted value increases]

    variance of the residuals appears to increase with higher predictions

    NON-CONSTANT VARIANCE


    Graphical Diagnostics

    Residuals vs. Explanatory Variables

    ideal - no systematic trend present in plot

    inadequate model - evidence of trend present

    residual e_i vs. x

    [plot: residuals follow a curved, bowl-shaped pattern]

    left-over quadratic trend - need quadratic term in model


    Graphical Diagnostics

    Residuals vs. Explanatory Variables Not in Model

    ideal - no systematic trend present in plot

    inadequate model - evidence of trend present

    residual e_i vs. w

    [plot: residuals trend linearly with w]

    systematic trend not accounted for in model - include a linear term in w


    Graphical Diagnostics

    Residuals vs. Order of Data Collection

    residual e_i vs. time order t

    [two plots: residuals drifting steadily with time; residuals meandering in runs of like sign]

    failure to account for time trend in data

    successive random noise components are correlated
    - consider more complex model
    - time series model for random component?


    Quantitative Diagnostics - Ratio Tests

    Residual Variance Test

    is the variance of the residuals significantly larger than the inherent noise variance?

    same test as that for the straight line data

    only distinction - number of degrees of freedom for the Mean Squared Error => N-p, where p is the number of parameters in the model

    compare ratio to F_{N-p, M-1, 0.05} where M is the number of data points used to estimate the inherent variance

    significant? -> model is INADEQUATE


    Quantitative Diagnostics - Ratio Tests

    Residual Variance Ratio:

    s²_residuals / s²_inherent = MSE / s²_inherent

    Mean Squared Error of Residuals (Var. of Residuals):

    s²_residuals = MSE = Σ_{i=1}^{N} e_i² / (N - p)


    Quantitative Diagnostics - Ratio Tests

    Mean Square Regression Ratio

    same as in the straight line case except for degrees of

    freedom

    Variance described by model:

    MSR = Σ_{i=1}^{N} (ŷ_i - ȳ)² / (p - 1)


    Quantitative Diagnostics - Ratio Test

    Test Ratio:

    MSR / MSE

    is compared against F_{p-1, N-p, 0.95}

    Conclusions?

    ratio is statistically significant --> significant trend has been modeled

    NOT statistically significant --> significant trend has NOT been modeled, and model is inadequate in its present form

    For the multiple regression case, this test is a coarse measure of whether some trend has been modeled - it provides no indication of which X's are important


    Analysis of Variance Tables

    The ratio tests involve dissection of the sum of squares:

    TSS = Σ_{i=1}^{N} (y_i - ȳ)²

    SSR = Σ_{i=1}^{N} (ŷ_i - ȳ)²

    SSE = Σ_{i=1}^{N} (y_i - ŷ_i)²


    Analysis of Variance (ANOVA) for Regression

    Source of     Degrees of   Sum of    Mean              F-Value     p-value
    Variation     Freedom      Squares   Square
    ----------    ----------   -------   ---------------   ---------   -------
    Regression    p-1          SSR       MSR = SSR/(p-1)   F=MSR/MSE   p
    Residuals     N-p          SSE       MSE = SSE/(N-p)
    Total         N-1          TSS
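    A sketch of how this table could be assembled in Python (scipy is assumed for the F-distribution tail probability):

        import numpy as np
        from scipy import stats

        def anova_table(y, y_hat, p):
            # dissect the total sum of squares: TSS = SSR + SSE
            N = len(y)
            SSE = np.sum((y - y_hat) ** 2)
            SSR = np.sum((y_hat - np.mean(y)) ** 2)
            TSS = np.sum((y - np.mean(y)) ** 2)
            MSR, MSE = SSR / (p - 1), SSE / (N - p)
            F = MSR / MSE
            p_value = stats.f.sf(F, p - 1, N - p)   # upper-tail probability
            return SSR, SSE, TSS, F, p_value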


    Quantitative Diagnostics - R²

    Coefficient of Determination (R² Coefficient)

    square of correlation between observed and predicted values:

    R² = [corr(y, ŷ)]²

    relationship to sums of squares:

    R² = 1 - SSE/TSS = SSR/TSS

    values typically reported in %, i.e., 100 R²

    ideal - R² near 100%


    Adjusted R²

    Adjust for number of parameters relative to number of observations

    account for degrees of freedom of the sums of squares

    define in terms of Mean Squared quantities:

    R²_adj = 1 - (SSE/(N-p)) / (TSS/(N-1))

    want value close to 1 (or 100%), as before

    if N >> p, adjusted R² is close to R²

    provides measure of agreement, but does not account for magnitude of residual error


    Testing the Need for Groups of Terms

    In words: Does a specific group of terms account for significant trend in the model?

    Test

    compare difference in residual variance between full and reduced model

    benchmark against an estimate of the inherent variation

    if significant, conclude that the group of terms ARE required

    if not significant, conclude that the group of terms can be dropped from the model - not explaining significant trend

    note that remaining parameters should be re-estimated in this case


    Testing the Need for Groups of Terms

    Test:

    A - denotes the full model (with all terms)
    B - denotes the reduced model (group of terms deleted)

    Form:

    [ (SSE_B - SSE_A) / (p_A - p_B) ] / s²

    p_A, p_B are the numbers of parameters in models A, B

    s² is an estimate of the inherent noise variance:

    estimate as SSE_A / (N - p_A)


    Testing the Need for Groups of Terms

    Compare this ratio to F_{p_A - p_B, ν_inherent, 0.95}

    if MSE_A is used as the estimate of inherent variance, then the degrees of freedom of the inherent variance estimate are N - p_A
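    A small sketch of this test, under the stated assumption that SSE_A/(N - p_A) estimates the inherent variance:

        def group_terms_test(SSE_A, p_A, SSE_B, p_B, N):
            # A = full model, B = reduced model (SSE_B >= SSE_A)
            s2 = SSE_A / (N - p_A)                    # inherent variance estimate (MSE_A)
            F = ((SSE_B - SSE_A) / (p_A - p_B)) / s2
            return F                                  # compare to F_{p_A - p_B, N - p_A, 0.95}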


    Lack of Fit Test

    If we have replicate runs in our regression data set, we can break out the noise variance from the residuals, and assess the component of the residuals due to unmodelled trend

    Replicates -

    repeated runs at the SAME experimental conditions

    note that all explanatory variables must be at fixed conditions

    indication of inherent variance because no other factors are changing

    measure of repeatability of experiments


    Using Replicates

    We can estimate the sample variance for each set of replicates, and pool the estimates of the variance

    constancy of variance can be checked using Bartlett's test

    constant variance is assumed for ordinary least squares estimation

    For each replicate set, we have:

    s_i² = Σ_{j=1}^{n_i} (y_ij - ȳ_i)² / (n_i - 1)

    ȳ_i  - average of values in replicate set i
    n_i  - number of values in replicate set i
    y_ij - values in replicate set i


    Using Replicates

    The pooled estimate of variance is:

    s²_pooled = Σ_{i=1}^{m} (n_i - 1) s_i² / ( Σ_{i=1}^{m} n_i - m )

    i.e., convert back to sums of squares, and divide by the total number of degrees of freedom (the sum of the degrees of freedom for each variance estimate)


    The Lack of Fit Test

    Back to the sum of squares block:

    TSS = SSR + SSE

    SSE = SSELOF + SSEP

    SSEP   - pure error sum of squares
    SSELOF - lack of fit sum of squares


    The Lack of Fit Test

    We partition the SSE into two components:

    component due to inherent noise

    component due to unmodeled trend

    Pure error sum of squares (SSEP):

    SSEP = Σ_{i=1}^{m} Σ_{j=1}^{n_i} (y_ij - ȳ_i)²

    i.e., add together sums of squares associated with each replicate group (there are m replicate groups in total)


    The Lack of Fit Test

    The lack of fit sum of squares (SSELOF) is formed by backing out SSEP from SSE:

    SSELOF = SSE - SSEP

    Degrees of Freedom:

    - for SSEP:    Σ_{i=1}^{m} n_i - m

    - for SSELOF:  (N - p) - ( Σ_{i=1}^{m} n_i - m )


    The Lack of Fit Test

    The test ratio:

    MSELOF / MSEP = ( SSELOF / ν_LOF ) / ( SSEP / ν_pure )

    Compare to F_{ν_LOF, ν_pure, 0.95}

    significant? - there is significant unmodeled trend, and model should be modified

    not significant? - there is no significant unmodeled trend, and this supports model adequacy


    Example - Wave Solder Defects

    From earlier regression, SSE = 2694.0 and SSR = 25306.5

    LACK OF FIT TEST

    ANOVA
                 df   SS         MS         F          value from F-table (95% pt)
    Residual     7    2694.045
    LOF          5    2500.045   500.0091   5.154733   19.3 (this is F_{5,2,0.95})
    Pure Error   2    194        97

    Replicate Set (runs 9-11, all at 0, 0, 0): 101, 96, 115

    std. devn    9.848858
    sample var   97
    sum of sq    194  (as (n_i - 1) s²)

    This was done by hand - Excel has no Lack of Fit test
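    The same arithmetic as a short Python check (numbers from the example above):

        import numpy as np

        SSE = 2694.045                             # residual SS, 7 degrees of freedom
        reps = np.array([101., 96, 115])           # replicate runs 9-11 at (0, 0, 0)
        SSEP = np.sum((reps - reps.mean()) ** 2)   # pure error SS = 194, 2 dof
        SSELOF = SSE - SSEP                        # 2500.045, 7 - 2 = 5 dof
        F = (SSELOF / 5) / (SSEP / 2)              # approx 5.15 < F_{5,2,0.95} = 19.3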


    A Comment on the Ratio Tests

    Order of Preference (or value) - from most definitive to least definitive:

    1. Lack of Fit Test -- MSELOF/MSEP
    2. MSE/s²_inherent
    3. MSR/MSE

    If at all possible, try to include replicate runs in your experimental program so that the Lack of Fit test can be conducted

    Many statistical software packages will perform the Lack of Fit test in their Regression modules - Excel does NOT


    The Parameter Estimate Covariance Matrix

    summarizes the variance-covariance structure of the parameter estimates:

              [ Var(β̂1)        Cov(β̂1, β̂2)   ...   Cov(β̂1, β̂p) ]
    Cov(β̂) = [ Cov(β̂1, β̂2)   Var(β̂2)        ...   Cov(β̂2, β̂p) ]
              [      ⋮               ⋮          ⋱         ⋮       ]
              [ Cov(β̂1, β̂p)   Cov(β̂2, β̂p)   ...   Var(β̂p)      ]


    Properties of the Covariance Matrix

    symmetric -- Cov(b1,b2) = Cov(b2,b1)

    diagonal entries are always non-negative

    off-diagonal entries can be +ve or -ve

    matrix is positive definite - for any nonzero vector v,

    vᵀ Cov(β̂) v > 0


    Parameter Estimate Covariance Matrix

    Key point - the covariance structure of the parameter estimates is governed by the experimental run conditions used for the explanatory variables - the Experimental Design

    Example - the Wave Solder Defects data

    XᵀX = [ 11   0   0   0 ]        (XᵀX)⁻¹ = [ 1/11    0     0     0  ]
          [  0   8   0   0 ]                  [  0     1/8    0     0  ]
          [  0   0   8   0 ]                  [  0      0    1/8    0  ]
          [  0   0   0   8 ]                  [  0      0     0    1/8 ]

    Parameter estimates are uncorrelated, and variances of the non-intercept parameters are the same - towards uniform precision of parameter estimates


    Estimating the Parameter Covariance Matrix

    The X matrix is known - set of run conditions - so the only estimated quantity is the inherent noise variance

    from replicates, external estimate, or MSE

    For the wave solder defect data, the residual variance (MSE) is 384.86 with 7 degrees of freedom, and the parameter covariances are:

    Côv(β̂) = s_e² (XᵀX)⁻¹ = 384.86 [ 1/11   0    0    0  ]   [ 34.99    0       0       0     ]
                                    [  0    1/8   0    0  ] = [  0      48.11    0       0     ]
                                    [  0     0   1/8   0  ]   [  0       0      48.11    0     ]
                                    [  0     0    0   1/8 ]   [  0       0       0      48.11  ]
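    In numpy form, this calculation is just a scalar-matrix product; the square roots of the diagonal give the standard errors used on the following slides:

        import numpy as np

        XtX_inv = np.diag([1/11, 1/8, 1/8, 1/8])   # (X'X)^-1 for the coded design
        s2_e = 384.86                              # residual variance (MSE, 7 dof)
        cov_beta = s2_e * XtX_inv                  # diag approx [34.99, 48.11, 48.11, 48.11]
        std_err = np.sqrt(np.diag(cov_beta))       # approx [5.92, 6.94, 6.94, 6.94]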


    Using the Covariance Matrix

    Variances of parameter estimates

    are obtained from the diagonal of the matrix

    square root is the standard devn, or standard error, of the parameter estimates

    use to formulate confidence intervals for the parameters

    use in hypothesis tests for the parameters

    Correlations between the parameter estimates

    can be obtained by taking the covariance from the appropriate off-diagonal element, and dividing by the standard errors of the individual parameter estimates


    Correlation of the Parameter Estimates

    Note that

    β̂0 = ȳ - β̂1 x̄

    i.e., the parameter estimate for the intercept depends linearly on the slope!

    the slope and intercept estimates are correlated

    changing the slope changes the point of intersection with the axis, because the line must go through the centroid of the data


    Getting Rid of the Covariance

    Let's define the explanatory variable as the deviation from its average:

    z_i = x_i - x̄

    Least Squares parameter estimates:

    β̂0 = ȳ

    β̂1 = Σ_{i=1}^{N} z_i Y_i / Σ_{i=1}^{N} z_i²

    - note that now there is no explicit dependence on the slope value in the intercept expression

    - average of z is zero


    Getting Rid of the Covariance

    In this form of the model, the slope and intercept

    parameter estimates are uncorrelated

    Why is lack of correlation useful?

    allows independent decisions about parameter estimates

    decide whether slope is significant, intercept is significant - individually

    unique assignment of trend

    intercept clearly associated with mean of y's

    slope clearly associated with steepness of trend

    correlation can be eliminated by altering form of model, and choice of experimental points


    Confidence Intervals for Parameters

    similar procedure to straight line case:

    given the standard error for a parameter estimate, use the appropriate t-value, and form the interval as:

    β̂_i ± t_{ν, α/2} s_β̂i

    The degrees of freedom for the t-statistic come from the estimate of the inherent noise variance

    the degrees of freedom will be the same for all of the parameter estimates

    If the confidence interval contains zero, the parameter is plausibly zero and consideration should be given to deleting the term.
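    A small Python sketch of this construction (scipy assumed for the t quantile); with the wave solder values (β̂0 = 98.36, s_β̂0 = 5.915, 7 dof) it reproduces the Excel limits shown a few slides later:

        import numpy as np
        from scipy import stats

        def marginal_confidence_intervals(beta_hat, cov_beta, dof, alpha=0.05):
            se = np.sqrt(np.diag(cov_beta))       # standard errors
            t = stats.t.ppf(1 - alpha / 2, dof)   # t_{dof, alpha/2}
            return np.column_stack([beta_hat - t * se, beta_hat + t * se])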


    Hypothesis Tests for Parameters

    represent an alternative approach to testing whether the term should be retained in the model

    Null hypothesis - parameter = 0

    Alternate hypothesis - parameter is not equal to 0

    Test statistic:

    | β̂_i / s_β̂i |

    compare absolute value to t_{ν, α/2}

    if test statistic is greater (outside the fence), parameter is significant -- retain

    inside the fence? - consider deleting the term


    Example - Wave Solder Defects Data

    Test statistic will be compared to

    t_{7, 0.025} = 2.365

    because MSE is used to calculate standard errors of parameters, and has 7 degrees of freedom.

    Test statistic for intercept:

    | β̂0 / s_β̂0 | = 98.36 / √34.99 = 16.63

    Since 16.63 > 2.365, conclude that intercept parameter IS significant and should be retained.


    Example - Wave Solder Defects Data

    For the next term in the model:

    | β̂1 / s_β̂1 | = 26.5 / √48.11 = 3.82 > 2.365

    Therefore this term should be retained in the model.

    Because the parameter estimates are uncorrelated in this model, terms can be dropped without the need to re-estimate the other parameters in the model -- in general, you will have to re-estimate the final model once more to obtain the parameter estimates corresponding to the final model form.


    Example - Wave Solder Defects Data

    From Excel:

                     Coefficients   Standard Error   t Stat     P-value     Lower 95%   Upper 95%
    Intercept        98.36363636    5.915031978      16.62943   6.948E-07   84.376818   112.3505
    Conveyor Speed   26.5           6.935989803      3.820652   0.0065367   10.099002   42.901
    Pot Temp         26             6.935989803      3.748564   0.0071817   9.599002    42.401
    Flux Density     -42.25         6.935989803      -6.09142   0.0004953   -58.651     -25.849

    Standard Error - standard devns. of each parameter estimate
    t Stat - test statistic for each parameter
    P-value - prob. that a value is greater than the computed test ratio - 2-tailed test!
    Lower/Upper 95% - confidence limits


    Precision of the Predicted Responses

    The predicted response from an estimated model has uncertainty, because it is a function of the parameter estimates which have uncertainty:

    e.g., Solder Wave Defect Model - first response, at the point (-1, -1, -1):

    ŷ_1 = β̂0 + β̂1(-1) + β̂2(-1) + β̂3(-1)

    If the parameter estimates were uncorrelated, the variance of the predicted response would be:

    Var(ŷ_1) = Var(β̂0) + Var(β̂1) + Var(β̂2) + Var(β̂3)

    (recall results for variance of sum of random variables)


    Precision of the Predicted Responses

    In general, both the variances and covariances of the parameter estimates must be taken into account.

    For prediction at the k-th data point:

    Var(ŷ_k) = x_kᵀ (XᵀX)⁻¹ x_k σ²

    where x_kᵀ = [ x_k1  x_k2  ...  x_kp ] is the vector of explanatory variable values at the k-th data point


    Example - Wave Solder Defects Model

    In this example, the parameter estimates are uncorrelated

    XᵀX is diagonal

    variance of the predicted response is in fact the sum of the variances of the parameter estimates

    Variance of prediction at run #11 (0, 0, 0):

    Var(ŷ_11) = Var(β̂0) + Var(β̂1)(0) + Var(β̂2)(0) + Var(β̂3)(0) = Var(β̂0)


    Precision of Future Predictions

    Suppose we want to predict the response at conditions other than those of the experimental runs --> future run.

    The value we observe will consist of the deterministic component, plus the noise component.

    In predicting this value, we must consider:

    uncertainty from our prediction of the deterministic component

    noise component

    The variance of this future prediction is

    Var(ŷ_future) + σ²

    where Var(ŷ_future) is computed using the same expression as for the variance of predicted responses at experimental run conditions


    Estimating Precision of Predicted Responses

    Use an estimate of the inherent noise variance:

    s²_ŷk = x_kᵀ (XᵀX)⁻¹ x_k s_e²

    The degrees of freedom for the estimated variance of the predicted response are those of the estimate of the noise variance

    replicates, external estimate, MSE
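    As a sketch, using the wave solder quantities from earlier (run #11 has x_k = (1, 0, 0, 0), so the prediction variance reduces to Var(β̂0)):

        import numpy as np

        def prediction_variance(x_k, XtX_inv, s2_e):
            # estimated variance of the predicted response at conditions x_k
            x_k = np.asarray(x_k, dtype=float)
            return x_k @ XtX_inv @ x_k * s2_e

        print(prediction_variance([1, 0, 0, 0], np.diag([1/11, 1/8, 1/8, 1/8]), 384.86))  # 34.99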


    Confidence Limits for Predicted Responses

    Follow an approach similar to that for parameters - 100(1-α)% confidence limits for the predicted response at the k-th run are:

    ŷ_k ± t_{ν, α/2} s_ŷk

    degrees of freedom are those of the inherent noise variance estimate

    If the prediction is for a response at conditions OTHER than one of the experimental runs, the limits are:

    ŷ_k ± t_{ν, α/2} √( s²_ŷk + s_e² )


    Practical Guidelines for Model Development

    1) Consider CODING your explanatory variables

    Coding - one standard form:

    x̃_i = (x_i - x̄_i) / (range(x_i)/2)

    places designed experiment into +1, -1 form

    if run conditions are from an experimental design, this coding must be used in order to obtain all of the benefits from the design - uncorrelated parameter estimates

    if conditions are not from an experimental design, such a coding improves numerical conditioning of the problem -- similar numerical scales for all variables
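    A one-function sketch of this coding, following the slide's convention of centring at the average:

        import numpy as np

        def code_variable(x):
            # (x - x_bar) / (range/2) -> roughly -1 to +1 for a designed experiment
            x = np.asarray(x, dtype=float)
            return (x - x.mean()) / ((x.max() - x.min()) / 2)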


    Practical Guidelines for Model Development

    2) Types of models -

    linear in the explanatory variables

    linear with two-factor interactions (x_i x_j)

    general polynomials

    3) Watch for collinearity in the X matrix - run condition patterns for two or more explanatory variables are almost the same

    prevents clear assignment of trend to each factor

    shows up as singularity in the XᵀX matrix

    associated with very strong correlation between parameter estimates


    Practical Guidelines for Model Development

    4) Be careful not to extrapolate excessively beyond the range of the data

    5) Maximum number of parameters that can be fit to a data set = number of unique run conditions:

    N - Σ_{i=1}^{m} n_i + m

    N   - number of data points
    m   - number of replicate sets
    n_i - number of points in replicate set i

    as number of parameters increases, precision of predictions decreases - start modeling noise


    Practical Guidelines for Model Development

    6) Model building sequence

    building approach - start with few terms and add as necessary

    pruning approach - start with more terms and remove those which aren't statistically significant

    stepwise regression - terms are added, and retained according to some criterion - frequently R²

    uncorrelated? criterion?

    all subsets regression - consider all subsets of model terms of a certain type, and select the model with the best criterion

    significant computational load


    Polynomial Models

    Order - maximum over the p terms in the model of the sum of the exponents in a given term

    e.g.,

    Y = β0 + β1 x1 + β2 x2² + β3 x1² x2³ + ε

    is a fifth-order model

    Two factor interaction - product term - x1 x2

    implies that the impact of x1 on the response depends on the value of x2


    Polynomial Models

    Comments -

    polynomial models can sometimes suffer from collinearity problems - coding helps this

    polynomials can provide approximations to nonlinear functions - think of Taylor series approximations

    high-order polynomial models can sometimes be replaced by fewer nonlinear function terms

    e.g., ln(x) vs. 3rd order polynomial


    Joint Confidence Region (JCR)

    answers the question "Where do the true values of the parameters lie?"

    Recall that for individual parameters, we gain an understanding of where the true value lies by:

    examining the variability pattern (distribution) for the parameter estimate

    identifying a range in which most of the values of the parameter estimate are likely to lie

    manipulating this range to determine an interval which is likely to contain the true value of the parameter


    Joint Confidence Region

    Confidence interval for individual parameter:

    Step 1) The ratio of the estimate's deviation to its standard deviation,

    (β̂_i - β_i) / s_β̂i  ~  t_ν

    is distributed as a Student's t-distribution, with degrees of freedom equal to those of the standard devn (from the variance estimate)

    Step 2) Find the interval [ -t_{ν, α/2}, t_{ν, α/2} ] which contains 100(1-α)% of values - i.e., the probability of a t-value falling in this interval is (1-α)

    Step 3) Rearrange this interval to obtain the interval

    β̂_i ± t_{ν, α/2} s_β̂i

    which contains the true value of the parameter 100(1-α)% of the time


    Joint Confidence Region

    Comments on Individual Confidence Intervals:

    sometimes referred to as marginal confidence intervals - cf. marginal distributions vs. joint distributions from earlier

    marginal confidence intervals do NOT account for correlations between the parameter estimates

    examining only marginal confidence intervals can sometimes be misleading if there is strong correlation between several parameter estimates

    value of one parameter estimate depends in part on another

    deletion of the other changes the value of the parameter estimate

    decision to retain might be altered


    Joint Confidence Region

    Sequence:

    Step 1) Identify a statistic which is a function of the parameter estimate statistics

    Step 2) Identify a region in which values of this statistic lie a certain fraction of the time (a 100(1-α)% region)

    Step 3) Use this information to determine a region which contains the true value of the parameters 100(1-α)% of the time


    Joint Confidence Region

    The quantity

    (β̂ - β)ᵀ XᵀX (β̂ - β) / (p s²)  ~  F_{p, n-p}

    is the ratio of two sums of squares, and is distributed as an F-distribution with p degrees of freedom in the numerator, and n-p degrees of freedom in the denominator

    s² - estimate of inherent noise variance (if MSE is used, its degrees of freedom are n-p)


    Joint Confidence Region

    We can define a region by thinking of those values of the ratio which have a value less than F_{p, n-p, 1-α}

    i.e.,

    (β̂ - β)ᵀ XᵀX (β̂ - β) / (p s²)  ≤  F_{p, n-p, 1-α}

    Rearranging yields:

    (β̂ - β)ᵀ XᵀX (β̂ - β)  ≤  p s² F_{p, n-p, 1-α}


    Joint Confidence Region - Definition

    The 100(1-α)% joint confidence region for the parameters is defined as those parameter values β satisfying:

    (β̂ - β)ᵀ XᵀX (β̂ - β)  ≤  p s² F_{p, n-p, 1-α}

    Interpretation:

    the region defined by this inequality contains the true values of the parameters 100(1-α)% of the time

    if values of zero for one or more parameters lie in this region, those parameters are plausibly zero, and consideration should be given to dropping the corresponding terms from the model


    Joint Confidence Region - Example with 2 Parameters

    Let's reconsider the solder thickness example:

    XᵀX = [ 10     2367   ]        β̂ = [ 458.10 ]        s² = 135.38
          [ 2367   563335 ]             [ -1.13  ]

    95% Joint Confidence Region (JCR) for slope & intercept:

    [ β̂0 - β0   β̂1 - β1 ] XᵀX [ β̂0 - β0 ]  ≤  2 s² F_{2, 10-2, 0.95}
                               [ β̂1 - β1 ]


    Joint Confidence Region - Example with 2 Parameters

    95% Joint Confidence Region (JCR) for slope & intercept:

    [ 458.10 - β0   -1.13 - β1 ] XᵀX [ 458.10 - β0 ]  ≤  2 s² F_{2, 8, 0.95} = 2 (135.38)(4.46) = 1207.59
                                     [ -1.13 - β1  ]

    The boundary is an ellipse...
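    A short Python check of whether a candidate parameter pair falls inside this ellipse:

        import numpy as np

        XtX = np.array([[10., 2367], [2367, 563335]])
        beta_hat = np.array([458.10, -1.13])
        bound = 2 * 135.38 * 4.46                # p * s^2 * F_{2,8,0.95} = 1207.59

        def in_jcr(beta):
            d = beta_hat - np.asarray(beta, dtype=float)
            return d @ XtX @ d <= bound          # inside the 95% JCR?

        print(in_jcr([458.10, -1.13]))           # True - the centre of the ellipse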


    Joint Confidence Region - Example with 2 Parameters

    [plot: 95% joint confidence region - a rotated ellipse in (intercept, slope) space; intercept axis roughly 320 to 600, slope axis roughly -1.6 to -0.6]

    rotated - implies correlation between estimates of slope and intercept

    centred at least squares parameter estimates

    greater shadow along horizontal axis --> variance of intercept estimate is greater than that of slope


    Interpreting Joint Confidence Regions

    1) Are the axes aligned with the coordinate axes?

    is the ellipse horizontal or vertical? - indicates no correlation between parameter estimates

    2) Which axis has the greatest shadow?

    projection of ellipse along axis

    indicates which parameter estimate has the greatest variance

    3) The elliptical region is, by definition, centred at the least squares parameter estimates

    4) Long, narrow, rotated ellipses indicate significant correlation between parameter estimates

    5) If a value of zero for one or more parameters lies in the region, these parameters are plausibly zero - consider deleting from model


    Joint Confidence Regions

    What is the motivation for the ratio

    (β̂ - β)ᵀ XᵀX (β̂ - β) / (p s²)

    used to define the joint confidence region?

    Consider the joint distribution (density) for the parameter estimates:

    f(β̂) = (2π)^(-p/2) det(Σ_β̂)^(-1/2) exp{ -(1/2) (β̂ - β)ᵀ Σ_β̂⁻¹ (β̂ - β) }

    Substitute in the estimate for the parameter covariance matrix, Σ̂_β̂ = s² (XᵀX)⁻¹:

    (β̂ - β)ᵀ ( s² (XᵀX)⁻¹ )⁻¹ (β̂ - β) = (β̂ - β)ᵀ XᵀX (β̂ - β) / s²


    Confidence Intervals from Densities

    Individual Interval:

    [plot: univariate density f(β̂) with lower and upper limits marked; area between limits = 1-α]

    Joint Region:

    [plot: bivariate density f(β̂0, β̂1) over the (β0, β1) plane; the contour enclosing volume = 1-α defines the Joint Confidence Region]


    Relationship to Marginal Confidence Limits

    [plot: joint confidence region ellipse in (intercept, slope) space, centred at the least squares parameter estimates, with the marginal confidence interval for the intercept shown along the horizontal axis and the marginal confidence interval for the slope along the vertical axis]


    Relationship to Marginal Confidence Limits

    [plot: the 95% confidence region for the parameters considered jointly (the ellipse) compared with the rectangular 95% region implied by considering the parameters individually (the marginal intervals for intercept and slope)]


    Relationship to Marginal Confidence Intervals

    Marginal confidence intervals are contained in the joint confidence region

    potential to miss portions of plausible parameter values at the tails of the ellipsoid

    using individual confidence intervals implies a rectangular region, which includes sets of parameter values that lie outside the joint confidence region

    both situations can lead to

    erroneous acceptance of terms in model

    erroneous rejection of terms in model


    Going Further - Nonlinear Regression Models

    Model:

    Y_i = f(x_i, θ) + ε_i

    x_i - explanatory variables
    θ   - parameters
    ε_i - random noise component

    Estimation Approach:

    linearize model with respect to parameters

    treat linearization as a linear regression problem

    iterate by repeating linearization/estimation/linearization about new estimates, until convergence to parameter values - Gauss-Newton iteration - or solve a numerical optimization problem
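    A minimal sketch of a Gauss-Newton iteration, assuming a user-supplied model function f(x, theta) and a finite-difference Jacobian; a practical implementation would add step-size control and a convergence test:

        import numpy as np

        def gauss_newton(f, x, y, theta0, n_iter=20, h=1e-6):
            # iterate: linearize about current theta, solve the linear LS subproblem
            theta = np.asarray(theta0, dtype=float)
            for _ in range(n_iter):
                r = y - f(x, theta)                           # residual vector
                J = np.column_stack([                         # d f / d theta_j, by differences
                    (f(x, theta + h * np.eye(len(theta))[j]) - f(x, theta)) / h
                    for j in range(len(theta))])
                step, *_ = np.linalg.lstsq(J, r, rcond=None)  # linearized LS problem
                theta = theta + step
            return theta

        # e.g., Arrhenius-type model: f = lambda x, th: th[0] * np.exp(-th[1] / x)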


    Interpretation - Columns of X

    columns of X contain the values of a given variable at the different operating points

    entries in XᵀX are dot products of vectors of regressor variable values

    related to correlation between regressor variables

    form of XᵀX is dictated by experimental design

    e.g., 2^k design - diagonal form


    Parameter Estimation - Graphical View

    [diagram: observation vector y, approximating observation vector ŷ lying in the model plane, and the residual vector joining them]


    Parameter Estimation - Nonlinear Regression Case

    [diagram: observation vector y, approximating observation vector ŷ lying on the curved model surface, and the residual vector joining them]


    Properties of LS Parameter Estimates

    Key Point - parameter estimates are random variables

    because of how stochastic variation in data propagates through estimation calculations

    parameter estimates have a variability pattern - probability distribution and density functions

    Unbiased

    average of repeated data collection / estimation sequences will be the true value of the parameter vector:

    E{β̂} = β


    Properties of Parameter Estimates

    Consistent

    behaviour as number of data points tends to infinity

    with probability 1,

    lim_{N→∞} β̂ = β

    distribution narrows as N becomes large

    Efficient

    variance of least squares estimates is less than that of other types of parameter estimates


    Properties of Parameter Estimates

    Covariance Structure

    summarized by the variance-covariance matrix:

    Cov(β̂) = σ² (XᵀX)⁻¹

    structure dictated by experimental design; σ² - variance of noise

    e.g., for the straight line case:

    Cov(β̂) = [ Var(β̂0)        Cov(β̂0, β̂1) ]
              [ Cov(β̂0, β̂1)   Var(β̂1)      ]


    Prediction Variance

    in matrix form -

    var(ŷ_k) = x_kᵀ (XᵀX)⁻¹ x_k σ²

    where x_k is the vector of conditions at the k-th data point


    Joint Confidence Regions

    Variability in data can affect parameter estimates jointly, depending on structure of data and model

    [diagram: contours of the sum of squares (or likelihood) function in the (θ1, θ2) parameter plane - a section defines the joint confidence region]