139
Introduction to Radial Basis Function Networks

Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Embed Size (px)

Citation preview

Page 1: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Page 2: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Content

Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function Approximation The Projection Matrix Learning the Kernels Bias-Variance Dilemma The Effective Number of Parameters Model Selection Incremental Operations

Page 3: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Overview

Page 4: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Typical Applications of NN

Pattern Classification

Function Approximation

Time-Series Forecasting

( )l f xl C N

( )fy x mY R y

mX R x

nX R x

1 2 3( ) ( , , , )t t tft x x x x

Page 5: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Function Approximation

mX R fnY R

:f X YUnknown

mX R nY R

ˆ :f X YApproximator

f

Page 6: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

NeuralNetwork

Supervised Learning

UnknownFunction

ixiy

ˆ iy1,2,i +

+

ie

Page 7: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Neural Networks as Universal Approximators

Feedforward neural networks with a single hidden layer of sigmoidal units are capable of approximating uniformly any continuous multivariate function, to any desired degree of accuracy.

– Hornik, K., Stinchcombe, M., and White, H. (1989). "Multilayer Feedforward Networks are Universal Approximators," Neural Networks, 2(5), 359-366.

Like feedforward neural networks with a single hidden layer of sigmoidal units, it can be shown that RBF networks are universal approximators.

– Park, J. and Sandberg, I. W. (1991). "Universal Approximation Using Radial-Basis-Function Networks," Neural Computation, 3(2), 246-257.

– Park, J. and Sandberg, I. W. (1993). "Approximation and Radial-Basis-Function Networks," Neural Computation, 5(2), 305-316.

Page 8: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Statistics vs. Neural Networks

Statistics Neural Networksmodel networkestimation learningregression supervised learninginterpolation generalizationobservations training setparameters (synaptic) weightsindependent variables inputsdependent variables outputsridge regression weight decay

Page 9: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

The Model ofFunction Approximator

Page 10: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Linear Models

1

(( ) )ii

i

m

f w

x xFixed BasisFunctions

Weights

Page 11: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Linear Models

Feature VectorsInputs

HiddenUnits

OutputUnits

• Decomposition• Feature Extraction• Transformation

Linearly weighted output

1

(( ) )ii

i

m

f w

x x

y

11 22 mm

x1 x2 xn

w1 w2 wm

x =

Page 12: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Linear Models

Feature VectorsInputs

HiddenUnits

OutputUnits

• Decomposition• Feature Extraction• Transformation

Linearly weighted output

y

11 22 mm

x1 x2 xn

w1 w2 wm

x =

Can you say some bases?Can you say some bases?

1

(( ) )ii

i

m

f w

x x

Page 13: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example Linear Models

Polynomial

Fourier Series

( ) ii

iwf xx

0exp 2( )k

k jf w k xx

Are they orthogonal bases?Are they orthogonal bases?

( ) , 0,1,2,ii ix x

0( ) exp 2 0,1,, 2, k x j k x k

1

(( ) )ii

i

m

f w

x x

Page 14: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

y

11 22 mm

x1 x2 xn

w1 w2 wm

x =

Single-Layer Perceptrons as Universal Aproximators

HiddenUnits

With sufficient number of sigmoidal units, it can be a universal approximator.

1

(( ) )ii

i

m

f w

x x

Page 15: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

y

11 22 mm

x1 x2 xn

w1 w2 wm

x =

Radial Basis Function Networks as Universal Aproximators

HiddenUnits

With sufficient number of radial-basis-function units, it can also be a universal approximator.

1

(( ) )ii

i

m

f w

x x

Page 16: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Non-Linear Models

1

(( ) )ii

i

m

f w

x xAdjusted by the

Learning process

Weights

1

(( ) )ii

i

m

f w

x x

Page 17: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

The Radial Basis Function Networks

Page 18: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Radial Basis Functions1

(( ) )ii

i

m

f w

x x

Center

Distance Measure

Shape

Three parameters for a radial function:

xi

r = ||x xi||

i(x)= (||x xi||)

Page 19: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Typical Radial Functions

Gaussian

Hardy Multiquadratic

Inverse Multiquadratic

2

22 0 and r

r e r

2 2 0 and r r c c c r

2 2 0 and r c r c c r

Page 20: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Gaussian Basis Function (=0.5,1.0,1.5)

2

22 0 and r

r e r

0.5 1.0

1.5

Page 21: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Inverse Multiquadratic

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

-10 -5 0 5 10

2 2 0 and r c r c c r

c=1

c=2c=3

c=4c=5

Page 22: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

++

Most General RBF

1( ) ( )i iT

i μ x μx x

+1μ

2μ 3μ

Basis {i: i =1,2,…} is `near’ orthogonal.

Page 23: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Properties of RBF’s

On-Center, Off Surround

Analogies with localized receptive fields found in several biological structures, e.g.,– visual cortex;– ganglion cells

Page 24: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Topology of RBF

Feature Vectorsx1 x2 xn

y1 ym

Inputs

HiddenUnits

OutputUnits

Projection

Interpolation

As a function approximator ( )y f x

Page 25: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Topology of RBF

Feature Vectorsx1 x2 xn

y1 ym

Inputs

HiddenUnits

OutputUnits

Subclasses

Classes

As a pattern classifier.

Page 26: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

RBFN’s forFunction Approximation

Page 27: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The idea

x

y Unknown Function to Approximate

TrainingData

Page 28: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The idea

x

y Unknown Function to Approximate

TrainingData

Basis Functions (Kernels)

Page 29: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The idea

x

y

Basis Functions (Kernels)

Function Learned

1

)) ((m

iiiwy f

xx

Page 30: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The idea

x

y

Basis Functions (Kernels)

Function Learned

NontrainingSample

1

)) ((m

iiiwy f

xx

Page 31: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The idea

x

y Function Learned

NontrainingSample

1

)) ((m

iiiwy f

xx

Page 32: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Radial Basis Function Networks as Universal Aproximators

1

( ) ( )m

ii

iy wf

x x

x1 x2 xn

w1 w2 wm

x =

Training set ( )

1

( ),p

k

k

ky

xT

Goal (( ))k kfy x for all k

2

( )

1

( )min kp

k

k

SSE fy

x

2

)(

1

) (

1

p mk

ik

ki

i

y w

x

Page 33: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

w1 w2 wm

x1 x2 xnx =

Training set ( )

1

( ),p

k

k

ky

xT

Goal (( ))k kfy x for all k

1

( ) ( )m

ii

iy wf

x x

2

( )

1

( )min kp

k

k

SSE fy

x

2

)(

1

) (

1

p mk

ik

ki

i

y w

x

Page 34: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

2

( )

1

( )min kp

k

k

SSE fy

x

2

)(

1

) (

1

p mk

ik

ki

i

y w

x

Regularization

Training set ( )

1

( ),p

k

k

ky

xT

Goal (( ))k kfy x for all k

2

1

m

iiiw

2

1

m

iiiw

0i

If regularization is unneeded, set 0i

C

Page 35: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

22( )

1

( )

1

p mk

ki

ii

k wfyC

xMinimize

1

( ) ( )m

ii

iwf

x x

(

( )

) ( )

1

2 2kp

jk

j

k

k

j j

wfC

fw w

y

x

x

( ) ( ) ( )

1

2 2p

k kj

kjj

k f wy

x x

0

( ) * ( ) ( )

1 1

)* (p p

k k kj j

k kjj

kwf y

x x x

*

1

*( ) ( )i

m

ii

wf

x x

Page 36: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

(1) ( ), ,T

pj j j φ x x

* * (1) * ( ), ,T

pf ff x xDefine

(1) ( ), ,Tpy yy

* *j

T Tj jjw φ f φ y 1, ,j m

( ) * ( ) ( )

1 1

)* (p p

k k kj j

k kjj

kwf y

x x x

1

( ) ( )m

ii

iwf

x x *

1

*( ) ( )i

m

ii

wf

x x

Page 37: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

*1 1

*2 2

*1

*

*

1

2 2

*

T T

T T

T Tm mmm

w

w

w

φ f φ y

φ f φ y

φ f φ y

* *j

T Tj jjw φ f φ y 1, ,j m

1 2, , , mΦ φ φ φ

1

2

m

Λ

* * *1 2* , , ,

T

mw w ww

Define

**T T wΦ Λf Φ y

Page 38: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

**T T wΦ Λf Φ y

* (1)

* (2)*

* ( )p

f

f

f

x

xf

x

*

1

*( ) ( )i

m

ii

wf

x x

(1)

1

(2)

1

(

*

*

1

*

)

k

k

m

kk

m

kk

mP

kk

k

w

w

w

x

x

x

(1) (1) (1)*1 21

(2) (2) (2) *1 2 2

*( ) ( ) ( )

1 2

m

m

p p pm

m

w

w

w

x x x

x x x

x x x

*Φw

* *T T ΛwΦ wΦ Φ y

Page 39: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Learn the Optimal Weight Vector

* *T T ΛwΦ wΦ Φ y

*T T wΦ ΛΦ Φ y

1* T T Λw Φ Φ Φ y

Variance Matrix

1 TA Φ y

1 :ADesign Matrix:Φ

Page 40: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Summary

1* T T Λw Φ Φ Φ y

1 TA Φ y

(1)

(2)

( )

j

jj

pj

x

x

(1)

(2)

( )p

y

y

y

y

1 2, , , mΦ φ φ φ

*1*2

*

*

m

w

w

w

w

1

2

m

Λ

1

)) ((m

iiiwy f

xx

Training set ( )

1

( ),p

k

k

ky

xT

Page 41: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

The Projection Matrix

Page 42: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Empirical-Error Vector

1* Tw A Φ y

*1w *

2w *mw

( )kx ( )1

kx ( )2kx ( )k

nx( )kx ( )1

kx ( )2kx ( )k

nx

UnknownFunction

UnknownFunction

y ** f Φw

1φ 2φ mφ

Page 43: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Empirical-Error Vector

1* Tw A Φ y

*1w *

2w *mw

( )kx ( )1

kx ( )2kx ( )k

nx( )kx ( )1

kx ( )2kx ( )k

nx

UnknownFunction

UnknownFunction

y ** f Φw

1φ 2φ mφ

Error Vector

* e y f * y Φw 1 T ΦAy Φ y

1 Tp

AI Φ Φ yPy

Page 44: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Sum-Squared-Error

e Py1 T

p P I Φ ΦA

T A Φ Φ Λ

1 2, , , mΦ φ φ φError Vector

T Py Py2Ty P y

2( ) * ( )

1

pk k

k

SSE y f

x

If =0, the RBFN’s learning algorithm is to minimize SSE (MSE).

Page 45: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Projection Matrix

e Py1 T

p P I Φ ΦA

T A Φ Φ Λ

1 2, , , mΦ φ φ φError Vector

2TSSE y P y

1 2( , , , )mspane φ φ φΛ 0

( )P Py Pee Py

Ty Py

Page 46: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Learning the Kernels

Page 47: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

RBFN’s as Universal Approximators

x1 x2 xn

y1 yl

12 m

w11 w12w1m

wl1 wl2wlml

Training set

) ( )(

1,

pk

k

k

yxT

2

2( ) exp

2j

jj

x μx

Kernels

Page 48: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

What to Learn?

Weights wij’s Centers j’s of j’s Widths j’s of j’s

Number of j’s Model Selection

x1 x2 xn

y1 yl

12 m

w11 w12w1m

wl1 wl2wlml

Page 49: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

One-Stage Learning

( ) ( ) ( )1

k k kij i i jw y f x x

( ) ( )

1

lk k

i ij jj

f w

x x

( ) ( ) ( )2 2

1

ljk k k

j j ij i iij

w y f

x μμ x x

2

( ) ( ) ( )3 3

1

ljk k k

j j ij i iij

w y f

x μx x

Page 50: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

One-Stage Learning

( ) ( )

1

lk k

i ij jj

f w

x xThe simultaneous updates of all three sets of parameters may be suitable for non-stationary environments or on-line setting.

The simultaneous updates of all three sets of parameters may be suitable for non-stationary environments or on-line setting.

( ) ( ) ( )1

k k kij i i jw y f x x

( ) ( ) ( )2 2

1

ljk k k

j j ij i iij

w y f

x μμ x x

2

( ) ( ) ( )3 3

1

ljk k k

j j ij i iij

w y f

x μx x

Page 51: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

x1 x2 xn

y1 yl

12 m

w11 w12w1m

wl1 wl2wlml

Two-Stage Training

Step 1

Step 2

Determines Centers j’s of j’s. Widths j’s of j’s. Number of j’s.

Determines Centers j’s of j’s. Widths j’s of j’s. Number of j’s.

Determines wij’s.Determines wij’s.

E.g., using batch-learning.

Page 52: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Train the Kernels

Page 53: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

+

+

+

+

+

Unsupervised Training

Page 54: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Subset Selection– Random Subset Selection– Forward Selection– Backward Elimination

Clustering Algorithms– KMEANS– LVQ

Mixture Models– GMM

Methods

Page 55: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Subset Selection

Page 56: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Random Subset Selection

Randomly choosing a subset of points from training set

Sensitive to the initially chosen points.

Using some adaptive techniques to tune– Centers– Widths– #points

Page 57: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Clustering Algorithms

Partition the data points into K clusters.

Partition the data points into K clusters.

Page 58: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Clustering Algorithms

Is such a partition satisfactory?

Is such a partition satisfactory?

Page 59: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Clustering Algorithms

How about this?How about this?

Page 60: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

+

++

+

Clustering Algorithms

1

2

3

4

Page 61: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Bias-Variance Dilemma

Page 62: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Goal Revisit

Minimize Prediction Error

• Ultimate Goal Generalization

• Goal of Our Learning Procedure

Minimize Empirical Error

Page 63: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Badness of Fit Underfitting

– A model (e.g., network) that is not sufficiently complex can fail to detect fully the signal in a complicated data set, leading to underfitting.

– Produces excessive bias in the outputs. Overfitting

– A model (e.g., network) that is too complex may fit the noise, not just the signal, leading to overfitting.

– Produces excessive variance in the outputs.

Page 64: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Underfitting/Overfitting Avoidance

Model selection Jittering Early stopping Weight decay

– Regularization– Ridge Regression

Bayesian learning Combining networks

Page 65: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Best Way to Avoid Overfitting

Use lots of training data, e.g.,– 30 times as many training cases as there are we

ights in the network.– for noise-free data, 5 times as many training ca

ses as weights may be sufficient.

Don’t arbitrarily reduce the number of weights for fear of underfitting.

Page 66: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Badness of Fit

Underfit Overfit

Page 67: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Badness of Fit

Underfit Overfit

Page 68: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

Underfit Overfit

Large bias Small bias

Small variance Large variance

However, it's not really a dilemma.

Page 69: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Underfit Overfit

Large bias Small bias

Small variance Large variance

Bias-Variance Dilemma

More on overfitting

Easily lead to predictions that are far beyond the range of the training data.

Produce wild predictions in multilayer perceptrons even with noise-free data.

More on overfitting

Easily lead to predictions that are far beyond the range of the training data.

Produce wild predictions in multilayer perceptrons even with noise-free data.

Page 70: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

It's not really a dilemma.

fitmore

overfitmore

underfit

Bias

Variance

Page 71: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

Sets of functionsE.g., depend on # hidden nodes used.

The true modelnoise

bias

Solution obtained with training set 2.

Solution obtained with training set 1.

Solution obtained with training set 3.

biasbias

The mean of the bias=?

The variance of the bias=?

Page 72: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

Sets of functionsE.g., depend on # hidden nodes used.

The true modelnoise

The mean of the bias=?

The variance of the bias=?

bias

Variance

Page 73: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Model Selection

Sets of functionsE.g., depend on # hidden nodes used.

The true modelnoise

bias

Variance

Reduce the effective number of parameters.

Reduce the number of hidden nodes.

Page 74: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

Sets of functionsE.g., depend on # hidden nodes used.

The true modelnoise

bias

( )g x( ) ( )y g x x

( )f x

Goal: 2min ( ) ( )E y f x x

Page 75: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

Goal: 2min ( ) ( )E y f x x

2( ) ( )E y f x x 2( ) ( )E g f x x

2 2( ) ( ) 2 ( ) ( )E g f g f x x x x

2 2( ) ( ) 2 ( ) ( )E g f E g f E x x x x

Goal: 2min ( ) ( )E g f x x 2min ( ) ( )E y f x x

0 constant

Page 76: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

2( ) ( )E g f x x 2[ ( )] [( ) () )( ]E f E fE g f x xx x

2[ ]( )) (E fE g x x 2[ ]( )) (E fE f x x

[ ( )] [ ( )) ( ]2 ( )E g fE f E f x xx x

2 2[ ( )] [ ( )]( ) ( ) [ ( )] [2 ( ) ( ) ( )]E f E f E f EE g g f ff x x x xx x x x

( ) ( )[ ( )] [ ( )]E E f Eg f f x xxx

( ) ( )E g f x x [ )]( ()EE fg x x [ ( ()] )E E f f x x [ ( )] [ ( )]E f E fE x x

( ) ( )E g E f x x [ )]( () EE fg x x [ ( )] ( )EE f f x x [ ( )] [ ( )]E f E f x x 0

2 22( ) ( ) ( ) ( )E y f E E g f x x x x

0

Page 77: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

2 22( ) ( ) ( ) ( )E y f E E g f x x x x

2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx

Goal: 2min ( ) ( )E y f x x

noise bias2 variance

Cannot be minimized

Minimize both bias2 and variance

Page 78: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Model Complexity vs. Bias-Variance

2 22( ) ( ) ( ) ( )E y f E E g f x x x x

2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx

Goal: 2min ( ) ( )E y f x x

noise bias2 variance

ModelComplexity(Capacity)

Page 79: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Bias-Variance Dilemma

2 22( ) ( ) ( ) ( )E y f E E g f x x x x

2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx

Goal: 2min ( ) ( )E y f x x

noise bias2 variance

ModelComplexity(Capacity)

Page 80: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example (Polynomial Fits)

22exp( 16 )y x x

Page 81: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example (Polynomial Fits)

Page 82: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example (Polynomial Fits)

Degree 1 Degree 5

Degree 10 Degree 15

Page 83: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

The Effective Number of Parameters

Page 84: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Variance Estimation

Mean

Variance 22

1

p

ii

xp

In general, not available.

Page 85: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Variance Estimation

Mean1

p

ii

x xp

Variance 22 2

1

ˆ1

p

ii

s xp

Loss 1 degree of freedom

Page 86: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Simple Linear Regression

0 1y x 2~ (0, )N

01

y

x

y

x

Page 87: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Simple Linear Regression

01

y

x

y

x

01ˆ

ˆ

y

x

0 1y x 2~ (0, )N

21

ˆp

i ii

SSE y y

Minimize

Page 88: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Mean Squared Error (MSE)

01

y

x

y

x

01ˆ

ˆ

y

x

0 1y x 2~ (0, )N

21

ˆp

i ii

SSE y y

Minimize

2

SSEMSE

p

2

SSEMSE

p

Loss 2 degrees

of freedom

Page 89: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Variance Estimation

2ˆSSE

MSEmp

Loss m degrees of freedom

m: #parameters of the model

Page 90: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Number of Parameters

w1 w2 wm

x1 x2 xnx =

1

( ) ( )m

ii

iy wf

x x

m#degrees of freedom:

Page 91: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Effective Number of Parameters ()

w1 w2 wm

x1 x2 xnx =

1

( ) ( )m

ii

iy wf

x x The projection Matrix

1 Tp

P I Φ ΦA

T A Φ Φ Λ

Λ 0

( )p trace P m

Page 92: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Effective Number of Parameters ()

The projection Matrix

1 Tp

P I Φ ΦA

T A Φ Φ Λ

Λ 0

1( ) Tptrace trace AP I Φ Φ

1 Tp trace ΦA Φ

( ) ( ) ( )trace trace trace A B A BFacts:( ) ( )trace traceAB BA

1 TTp trace

Φ Φ ΦΦ

Pf)

1T Tp trace

Φ ΦΦ Φ

mp trace I

p m ( )p trace P m

Page 93: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Regularization

((

1

)

2

)p

k

k

k fy

x2

1

m

iiiw

Empirial

Er

Model s

penalror ts

yCo t

2

( )

1 1

( )p m

ki

k

k

iiwy

x

Penalize models withlarge weightsSSE

( )p trace PThe effective number of parameters:

Page 94: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Regularization

((

1

)

2

)p

k

k

k fy

x2

1

m

iiiw

Empirial

Er

Model s

penalror ts

yCo t

2

( )

1 1

( )p m

ki

k

k

iiwy

x

Penalize models withlarge weightsSSE Without penalty

(i=0), there are m degrees of freedom

to minimize SSE (Cost).

The effective number of

parameters =m.

Without penalty (i=0), there are m

degrees of freedom to minimize SSE

(Cost).

The effective number of

parameters =m.

The effective number of parameters:

( )p trace P

Page 95: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Regularization

((

1

)

2

)p

k

k

k fy

x2

1

m

iiiw

Empirial

Er

Model s

penalror ts

yCo t

2

( )

1 1

( )p m

ki

k

k

iiwy

x

Penalize models withlarge weightsSSE

With penalty (i>0), the liberty to

minimize SSE will be reduced.

The effective number of

parameters <m.

With penalty (i>0), the liberty to

minimize SSE will be reduced.

The effective number of

parameters <m.

The effective number of parameters:

( )p trace P

Page 96: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Variance Estimation

2ˆ MSE

Loss degrees of freedom

The effective number of parameters:

( )p trace P

SSE

p

Page 97: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Variance Estimation

2ˆ MSE

The effective number of parameters:

( )p trace P

( )

SSE

trace

P

Page 98: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Model Selection

Page 99: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Model Selection Goal

– Choose the fittest model Criteria

– Least prediction error Main Tools (Estimate Model Fitness)

– Cross validation– Projection matrix

Methods– Weight decay (Ridge regression)– Pruning and Growing RBFN’s

Page 100: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Empirical Error vs. Model Fitness

Minimize Prediction Error

• Ultimate Goal Generalization

• Goal of Our Learning Procedure

Minimize Empirical Error

Minimize Prediction Error

(MSE)

Page 101: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Estimating Prediction Error

When you have plenty of data use independent test sets– E.g., use the same training set to

train different models, and choose the best model by comparing on the test set.

When data is scarce, use– Cross-Validation– Bootstrap

Page 102: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Cross Validation

Simplest and most widely used method for estimating prediction error.

Partition the original set into several different ways and to compute an average score over the different partitions, e.g.,– K-fold Cross-Validation– Leave-One-Out Cross-Validation– Generalize Cross-Validation

Page 103: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

K-Fold CV

Split the set, say, D of available input-output patterns into k mutually exclusive subsets, say D1, D2, …, Dk.

Train and test the learning algorithm k times, each time it is trained on D\Di and tested on Di.

Page 104: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

K-Fold CV

Available DataAvailable Data

Page 105: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

K-Fold CV

Available DataAvailable DataD1 D2 D3. . . Dk

D1 D2 D3. . . Dk

D1 D2 D3. . . Dk

D1 D2 D3. . . Dk

D1 D2 D3. . . Dk

Test Set

Training Set

Estimate 2

Page 106: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Leave-One-Out CV

Split the p available input-output patterns into a training set of size p1 and a test set of size 1.

Average the squared error on the left-out pattern over the p possible ways of partition.

A special case of k-fold CV.

Page 107: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Error Variance Predicted by LOO

A special case of k-fold CV.

22

1

1ˆ ( )

p

LOO i i ii

y fp

x

( , ) : 1,2, ,k kD y k p x

\ ( , )i i iD D y x

Available input-output patterns.

1, ,i p Training sets of LOO.

if Function learned using Di as training set.

The estimate for the variance of prediction error using LOO:

Error-square for the left-out element.

Page 108: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Error Variance Predicted by LOO

A special case of k-fold CV.

22

1

1ˆ ( )

p

LOO i i ii

y fp

x

( , ) : 1,2, ,k kD y k p x

\ ( , )i i iD D y x

Available input-output patterns.

1, ,i p Training sets of LOO.

if Function learned using Di as training set.

The estimate for the variance of prediction error using LOO:

Error-square for the left-out element.

Given a model, the function with least empirical error for Di.

Given a model, the function with least empirical error for Di.

As an index of model’s fitness.We want to find a model also minimize

this.

As an index of model’s fitness.We want to find a model also minimize

this.

Page 109: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Error Variance Predicted by LOO

A special case of k-fold CV.

22

1

1ˆ ( )

p

LOO i i ii

y fp

x

( , ) : 1,2, ,k kD y k p x

\ ( , )i i iD D y x

Available input-output patterns.

1, ,i p Training sets of LOO.

if Function learned using Di as training set.

The estimate for the variance of prediction error using LOO:

Error-square for the left-out element.

How to estimate?How to estimate?

Are there any efficient ways?

Are there any efficient ways?

Page 110: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Error Variance Predicted by LOO

22

1

1ˆ ( )

p

LOO i i ii

y fp

x

Error-square for the left-out element.

2 2(1

ˆ ˆ ( )) ˆTLOO di

pag y PP Py

Page 111: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Generalized Cross-Validation

2 2(1

ˆ ˆ ( )) ˆTLOO di

pag y PP Py

( )( ) p

tracediag

p

PP I

2

22

ˆ ˆˆ

( )

T

GCV

p

trace

y P y

P

2

2

ˆ ˆ

( )

Tp

p

y P y

Page 112: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

More Criteria Based on CV

22

2

ˆ ˆˆ

( )

T

GCV

p

p

y P y GCV

(Generalized CV)

22 ˆ ˆ

ˆT

UEV p

y P y UEV

(Unbiased estimate of variance)

22 ˆ ˆ

ˆT

FPE

p

p p

y P y FPE(Final Prediction Error)

Akaike’sInformation Criter

ion

BIC(Bayesian Information Criterio)

22 ln( ) 1 ˆ ˆ

ˆT

BIC

p p

p p

y P y

Page 113: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

More Criteria Based on CV

22

2

ˆ ˆˆ

( )

T

GCV

p

p

y P y

22 ˆ ˆ

ˆT

UEV p

y P y

22 ˆ ˆ

ˆT

FPE

p

p p

y P y

22 ln( ) 1 ˆ ˆ

ˆT

BIC

p p

p p

y P y

2ˆUEV

p

p

2ˆUEV

p

p

2ln( ) 1ˆUEV

p p

p

2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

Page 114: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

More Criteria Based on CV

22

2

ˆ ˆˆ

( )

T

GCV

p

p

y P y

22 ˆ ˆ

ˆT

UEV p

y P y

22 ˆ ˆ

ˆT

FPE

p

p p

y P y

22 ln( ) 1 ˆ ˆ

ˆT

BIC

p p

p p

y P y

2ˆUEV

p

p

2ˆUEV

p

p

2ln( ) 1ˆUEV

p p

p

2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

?P

Page 115: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Regularization

((

1

)

2

)p

k

k

k fy

x2

1

m

iiiw

Empirial

Er

Model s

penalror ts

yCo t

2

( )

1 1

( )p m

ki

k

k

iiwy

x

Penalize models withlarge weightsSSE

Standard Ridge Regression,

,i i

Page 116: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Regularization

((

1

)

2

)p

k

k

k fy

x2

1

m

iiiw

Empirial

Er

Model s

penalror ts

yCo t

2

( )

1 1

( )p m

ki

k

k

iiwy

x

Penalize models withlarge weightsSSE

2

1

m

iiw

2

1

m

iiw

Standard Ridge Regression,

,i i

Page 117: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Solution Review

2

(( ) )

1 1

2

1

min p m m

ki

k

ki i

i i

C w wy

x

1* Tw A Φ yT

m Φ Φ IA1 T

p P I Φ ΦA

Used to compute model selection criteria

Page 118: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

22( ) (1 2 ) xy x x x e

50p

2~ (0,0.2 )N

Width of RBF r = 0.5Width of RBF r = 0.5

Page 119: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

22( ) (1 2 ) xy x x x e

50p

2~ (0,0.2 )N

Width of RBF r = 0.5Width of RBF r = 0.5

Page 120: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC

22( ) (1 2 ) xy x x x e

50p

2~ (0,0.2 )N

Width of RBF r = 0.5Width of RBF r = 0.5

How the determine the

optimal regularization

parameter effectively?How the determine the

optimal regularization

parameter effectively?

Page 121: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Optimizing the Regularization Parameter

Re-Estimation Formula

2 1 2

1

ˆˆ

( )

T

T

trace

trace

A

w

Ay P y

A Pw

Page 122: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Local Ridge Regression

Re-Estimation Formula

2 1 2

1

ˆˆ

( )

T

T

trace

trace

A

w

Ay P y

A Pw

Page 123: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF

( )p trace P( )p trace P

0.05r

Page 124: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF

( )p trace P( )p trace P

0.05r

Page 125: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF0.05r

There are two local-minima.

2 1 2

1

ˆˆ

( )

T

T

trace

trace

A

w

Ay P y

A Pw

Using the about re-estimation formula, it will be stuck at the nearest local minimum.

That is, the solution depends on the initial setting.

Page 126: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF0.05r

There are two local-minima.

2 1 2

1

ˆˆ

( )

T

T

trace

trace

A

w

Ay P y

A Pw

ˆ(0) 0.01

lim 0.1ˆ( )t

t

Page 127: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF0.05r

There are two local-minima.

2 1 2

1

ˆˆ

( )

T

T

trace

trace

A

w

Ay P y

A Pw

5ˆ(0) 10

4ˆ( )lim 2.1 10t

t

Page 128: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF0.05r

RMSE: Root Mean Squared ErrorIn real case, it is not available.

RMSE: Root Mean Squared ErrorIn real case, it is not available.

Page 129: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Example

sin(12 )y x 2~ (0,0.1 )N50m p

— Width of RBF0.05r

RMSE: Root Mean Squared ErrorIn real case, it is not available.

RMSE: Root Mean Squared ErrorIn real case, it is not available.

Page 130: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Local Ridge Regression

2

(( ) )

1 1

2

1

min p m m

ki

k

ki i

i i

C w wy

x

Standard Ridge Regression

2

(( ) )

1 1

2

1

min p m m

i ik

ik i

ik

i

C w wy

x

Local Ridge Regression

Page 131: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Local Ridge Regression

2

(( ) )

1 1

2

1

min p m m

ki

k

ki i

i i

C w wy

x

Standard Ridge Regression

2

(( ) )

1 1

2

1

min p m m

i ik

ik i

ik

i

C w wy

x

Local Ridge Regression

j implies that j() can be removed.

j implies that j() can be removed.

Page 132: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Solutions1* Tw A Φ y

Tm Φ Φ IA

1 Tp

P I Φ ΦAUsed to compute model selection criteria

T A Φ Φ Λ

TA Φ Φ Linear Regression

Standard Ridge Regression

Local Ridge Regression

Page 133: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Optimizing the Regularization Parameters

Incremental Operation

P: The current projection Matrix.

Pj: The projection Matrix obtained by removing j().

Tj j j j

j Tj j j j

P φ φ PP P

φ P φ

2

22

ˆ ˆˆ

( )

T

GCV

p

trace

y P y

P

Page 134: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Optimizing the Regularization Parameters

2

22

ˆ ˆˆ

( )

T

GCV

p

trace

y P y

P

Solve 2ˆ

0GCV

j

Subject to 0j

Tj j j j

j Tj j j j

P φ φ PP P

φ P φ

Page 135: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Optimizing the Regularization Parameters

2Tja y P y2T Tj j j jby P φ y P φ2 2( )T T

j j j j jc φ P φ y P φ

( )jtrace P2T

j j j φ P φ

if

if and ˆ0 if and

if

c bTj j j b a

c bTjj j j b a

c b c bT Tj j j j j jb a b a

a b

a b

a b

φ P φ

φ P φ

φ P φ φ P φ

if

if and ˆ0 if and

if

c bTj j j b a

c bTjj j j b a

c b c bT Tj j j j j jb a b a

a b

a b

a b

φ P φ

φ P φ

φ P φ φ P φ

Solve 2ˆ

0GCV

j

Subject to 0j

Page 136: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Optimizing the Regularization Parameters

2Tja y P y2T Tj j j jby P φ y P φ2 2( )T T

j j j j jc φ P φ y P φ

( )jtrace P2T

j j j φ P φ

if

if and ˆ0 if and

if

c bTj j j b a

c bTjj j j b a

c b c bT Tj j j j j jb a b a

a b

a b

a b

φ P φ

φ P φ

φ P φ φ P φ

if

if and ˆ0 if and

if

c bTj j j b a

c bTjj j j b a

c b c bT Tj j j j j jb a b a

a b

a b

a b

φ P φ

φ P φ

φ P φ φ P φ

Solve 2ˆ

0GCV

j

Subject to 0j Remove j()

Remove j()

Page 137: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

The Algorithm

Initialize i’s. – e.g., performing standard ridge regression.

Repeat the following until GCV converges:– Randomly select j and compute– Perform local ridge regression– If GCV reduce & remove j()

ˆj

ˆj

Page 138: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

References

Kohavi, R. (1995), "A study of cross-validation and bootstrap for accuracy estimation and model selection," International Joint Conference on Artificial Intelligence (IJCAI).

Mark J. L. Orr (April 1996), Introduction to Radial Basis Function Networks, http://www.anc.ed.ac.uk/~mjo/intro/intro.html.

Page 139: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function

Introduction to Radial Basis Function Networks

Incremental Operations