EE 225D, Spring 1999
University of California, Berkeley
College of Engineering, Department of Electrical Engineering and Computer Sciences
Professors: N. Morgan / B. Gold

Lecture 8: Pattern Classification (Lecture on Pattern Recognition)

Speech Pattern Recognition

• Soft pattern classification plus temporal sequence integration
• Supervised pattern classification: class labels used in training
• Unsupervised pattern classification: class labels not available or used

[Figure: block diagram. Pattern → Feature Extraction → Feature Vector (x1, x2, ..., xd) → Classification → class ωk, 1 ≤ k ≤ K]

• Training: learning parameters of classifier
• Testing: classify independent test set, compare with labels and score
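
The training/testing protocol above can be sketched in a few lines. This is only an illustration under stated assumptions: the 1-D synthetic data, the helper names (`split_train_test`, `train_threshold`, `score`) and the single-threshold "classifier" are all hypothetical and are not taken from the lecture.

```python
import random

def split_train_test(samples, test_fraction=0.25, seed=0):
    """Shuffle labelled samples and hold out a fraction as an independent test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train_threshold(train):
    """'Training': learn one parameter, the midpoint between the two class means.
    Assumes class 1 has the larger mean (true for the toy data below)."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    return (means[0] + means[1]) / 2.0

def score(threshold, test):
    """'Testing': classify held-out samples, compare with labels, return accuracy."""
    correct = sum(1 for x, y in test if (1 if x > threshold else 0) == y)
    return correct / len(test)

# Synthetic 1-D data: class 0 clusters near 0, class 1 near 5 (illustration only).
rng = random.Random(1)
data = ([(rng.gauss(0.0, 1.0), 0) for _ in range(40)]
        + [(rng.gauss(5.0, 1.0), 1) for _ in range(40)])

train, test = split_train_test(data)
threshold = train_threshold(train)
accuracy = score(threshold, test)
print(f"threshold={threshold:.2f}, test accuracy={accuracy:.2f}")
```

The point of the split is that `score` never sees a sample that influenced `threshold`.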

Feature Extraction Criteria

• Class discrimination
• Generalization
• Parsimony (efficiency)

[Figure: plosive + vowel energies for 2 different gains; two plots of E(t) versus t]

$$\frac{\partial}{\partial t}\log\bigl(C\,E(t)\bigr) = \frac{\partial}{\partial t}\bigl(\log C + \log E(t)\bigr) = \frac{\partial}{\partial t}\log E(t)$$
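
A quick numerical check of the identity above, using a first difference as a discrete stand-in for the time derivative. The frame energies and the gain value are made-up numbers, and `log_energy_delta` is a hypothetical helper name.

```python
import math

def log_energy_delta(energy):
    """First difference of log energy, a discrete stand-in for d/dt log E(t)."""
    return [math.log(b) - math.log(a) for a, b in zip(energy, energy[1:])]

energy = [0.5, 0.8, 2.0, 6.0, 5.0, 3.0]   # hypothetical frame energies E(t)
gain = 10.0                                # hypothetical constant gain C
scaled = [gain * e for e in energy]        # C * E(t)

d_original = log_energy_delta(energy)
d_scaled = log_energy_delta(scaled)

# The constant log C cancels in the difference, so the two feature tracks agree
# up to floating-point round-off.
max_diff = max(abs(a - b) for a, b in zip(d_original, d_scaled))
print(max_diff)
```

The raw energies differ by a factor of ten, yet the derived feature is the same, which is what makes it useful when the recording gain is unknown.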

Feature Vector Size

• Best representations for discrimination on training set are large (highly dimensioned)
• Best representations for generalization to test set are (typically) succinct

Dimensionality Reduction

• Principal components (i.e., SVD, KL transform, eigenanalysis ...)
• Linear Discriminant Analysis (LDA)
• Application-specific knowledge
• Feature Selection via PR Evaluation
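
As a rough sketch of the first item in the list, the code below finds the leading principal component of a small data set and projects the data onto it. To stay dependency-free it uses power iteration on the covariance matrix instead of a full SVD; that substitution, the toy data and all function names are assumptions made for illustration.

```python
import math

def mean_center(data):
    """Subtract the per-dimension mean from every feature vector."""
    d = len(data[0])
    mean = [sum(row[j] for row in data) / len(data) for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in data]

def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical 2-D feature vectors lying close to the line f2 = 2 * f1.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
centered = mean_center(data)
pc1 = first_principal_component(covariance(centered))

# Dimensionality reduction: each 2-D vector becomes one coordinate along pc1.
projected = [sum(a * b for a, b in zip(row, pc1)) for row in centered]
print(pc1, projected)
```

Note that nothing here looks at class labels: the direction is chosen by variance alone.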

[Figure: scatter plot of two classes (x and o) in the feature plane with axes f1 and f2]

PR Methods

• Minimum Distance
• Discriminant Functions
• Linear Discriminant
• Nonlinear Discriminant (e.g., quadratic, neural networks)
• Statistical Discriminant Functions

Minimum Distance

• Vector or matrix representing element
• Define a distance function
• Choose the class of stored element closest to new input
• Choice of distance equivalent to implicit statistical assumptions
• For speech, temporal variability complicates this

$z_i$ = template vector (prototype)
$x$ = input vector

Choose $i$ to minimize distance:

$$\begin{aligned}
\arg\min_i\,(x - z_i)^T (x - z_i) &= \arg\min_i\,\bigl(x^T x + z_i^T z_i - 2\,x^T z_i\bigr) \\
&= \arg\max_i\,\frac{z_i^T z_i - 2\,x^T z_i}{-2} = \arg\max_i\,\Bigl(x^T z_i - \tfrac{1}{2}\,z_i^T z_i\Bigr)
\end{aligned}$$

If $z_i^T z_i = 1$ for all $i$ $\Rightarrow$ $\arg\max_i\,(x^T z_i)$
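
The equivalence derived above can be checked directly: picking the template with the smallest squared distance gives the same index as picking the one with the largest value of x^T z_i - (1/2) z_i^T z_i. The templates, inputs and function names below are hypothetical.

```python
def dot(a, b):
    """Inner product a^T b."""
    return sum(p * q for p, q in zip(a, b))

def nearest_by_distance(x, templates):
    """argmin_i (x - z_i)^T (x - z_i)"""
    def sqdist(z):
        diff = [p - q for p, q in zip(x, z)]
        return dot(diff, diff)
    return min(range(len(templates)), key=lambda i: sqdist(templates[i]))

def nearest_by_score(x, templates):
    """argmax_i (x^T z_i - 0.5 * z_i^T z_i); x^T x is dropped since it does not depend on i."""
    return max(range(len(templates)),
               key=lambda i: dot(x, templates[i]) - 0.5 * dot(templates[i], templates[i]))

templates = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]                # hypothetical prototypes z_i
inputs = [[0.9, 0.2], [0.1, 1.5], [2.5, 2.0], [-1.0, 4.0]]      # hypothetical inputs x

by_distance = [nearest_by_distance(x, templates) for x in inputs]
by_score = [nearest_by_score(x, templates) for x in inputs]
print(by_distance, by_score)  # both print [0, 1, 2, 1]
```

The second form needs only one inner product per template once the terms z_i^T z_i are precomputed, and it reduces to a plain inner product when all templates have unit norm.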

Problems with Min Distance

• Proper scaling of dimensions (size, discrimination)
• For high dim, sparsely sampled space

Decision Rule for Min Distance

• Nearest Neighbor (NN) - in the limit of infinite samples, at most twice the error of optimum classifier
• k-Nearest Neighbor (kNN)
• Lots of storage for large problems; potentially large searches
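
A minimal sketch of the NN and kNN decision rules, assuming squared Euclidean distance and majority voting. The stored vectors and the name `knn_classify` are invented for the example.

```python
from collections import Counter

def knn_classify(x, training, k=3):
    """Label x by majority vote among its k nearest stored training vectors."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    # Every stored vector is kept and examined: this brute-force scan is the
    # storage and search cost mentioned in the last bullet.
    neighbours = sorted(training, key=lambda item: sqdist(x, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled training vectors for two classes, "o" and "x".
training = [
    ([0.0, 0.0], "o"), ([0.5, 0.2], "o"), ([0.1, 0.6], "o"),
    ([3.0, 3.0], "x"), ([3.2, 2.7], "x"), ([2.8, 3.4], "x"),
]

label_nn = knn_classify([0.4, 0.4], training, k=1)    # NN rule
label_knn = knn_classify([2.5, 2.9], training, k=3)   # kNN rule
print(label_nn, label_knn)  # prints: o x
```

With k = 1 this is the NN rule; larger k trades some locality for robustness to mislabeled or outlying stored vectors.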

Some Opinions

• Better to throw away bad data than to reduce its weight
• Dimensionality-reduction based on variance often a bad choice for supervised pattern recognition

Discriminant Analysis

• Discriminant functions max for correct class, min for others
• Decision surface between classes
• Linear decision surface for 2-dim is line, for 3 is plane; generally called hyperplane
• For 2 classes, surface at $\omega^T x + \omega_0 = 0$
• 2-class quadratic case, surface at $x^T W x + \omega^T x + \omega_0 = 0$

Training Discriminant Functions

• Minimum distance
• Fisher linear discriminant
• Gradient learning
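
Of the training methods listed, the Fisher linear discriminant has a compact closed form: the projection direction is w = Sw^{-1}(m1 - m2), where m1 and m2 are the class means and Sw is the within-class scatter. The sketch below handles only the 2-D case with an explicit 2x2 inverse. The data are hypothetical, and placing the threshold midway between the projected class means is an assumption made here, not something stated on the slide.

```python
def mean(vectors):
    """Mean of a list of 2-D vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(2)]

def scatter(vectors, m):
    """2x2 scatter matrix: sum over samples of (v - m)(v - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class1, class2):
    """Return (w, threshold) with w = Sw^{-1} (m1 - m2), via the closed-form 2x2 inverse."""
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = scatter(class1, m1), scatter(class2, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    # Assumption: decision threshold halfway between the projected class means.
    threshold = 0.5 * (w[0] * (m1[0] + m2[0]) + w[1] * (m1[1] + m2[1]))
    return w, threshold

def project(w, v):
    """Scalar projection w^T v."""
    return w[0] * v[0] + w[1] * v[1]

class_x = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.5], [2.0, 2.0]]   # hypothetical class "x"
class_o = [[4.0, 0.5], [5.0, 1.5], [6.0, 2.0], [5.0, 0.5]]   # hypothetical class "o"

w, threshold = fisher_direction(class_x, class_o)
print(w, threshold)
print([project(w, v) > threshold for v in class_x])  # all True for this data
print([project(w, v) > threshold for v in class_o])  # all False for this data
```

Unlike a variance-based projection, this direction is chosen using the class labels.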

Generalized Discriminators - ANNs

• McCulloch Pitts neural model
• Rosenblatt Perceptron
• Multilayer Systems

The Perceptron

McCulloch-Pitts Neuron - Rosenblatt Perceptron

[Figure: inputs x1, x2, ..., xd, weighted by w1, w2, ..., wd, are summed (+) together with a bias to produce the output yo]

Perceptron Convergence

If classes are linearly separable the following rule will converge in a finite number of steps:

For each pattern x at time step k;

if
$x(k) \in \text{class 1},\ \omega^T(k)\,x(k) \le 0 \Rightarrow \omega(k+1) = \omega(k) + c\,x(k)$
$x(k) \in \text{class 2},\ \omega^T(k)\,x(k) \ge 0 \Rightarrow \omega(k+1) = \omega(k) - c\,x(k)$
else
$\omega(k+1) = \omega(k)$
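
The update rule above translates almost line for line into code. In this sketch each pattern is augmented with a trailing 1 so that the bias is learned as an ordinary weight; that augmentation, the stopping test (one full pass with no updates), the toy data and the name `train_perceptron` are assumptions for illustration.

```python
def train_perceptron(samples, c=1.0, max_epochs=100):
    """samples: list of (x, cls) with cls in {1, 2}; x carries a trailing 1 for the bias.
    Returns (weights, converged)."""
    w = [0.0] * len(samples[0][0])
    for _ in range(max_epochs):
        updates = 0
        for x, cls in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x))   # w^T(k) x(k)
            if cls == 1 and activation <= 0:
                w = [wi + c * xi for wi, xi in zip(w, x)]       # w(k+1) = w(k) + c x(k)
                updates += 1
            elif cls == 2 and activation >= 0:
                w = [wi - c * xi for wi, xi in zip(w, x)]       # w(k+1) = w(k) - c x(k)
                updates += 1
            # else: w(k+1) = w(k), nothing to do
        if updates == 0:            # a full pass with no mistakes: training has converged
            return w, True
    return w, False

# Hypothetical linearly separable patterns, each augmented with a constant 1.
samples = [
    ([2.0, 2.0, 1.0], 1), ([3.0, 1.0, 1.0], 1), ([2.5, 3.0, 1.0], 1),
    ([-1.0, -1.0, 1.0], 2), ([0.0, -2.0, 1.0], 2), ([-2.0, 0.5, 1.0], 2),
]

weights, converged = train_perceptron(samples)
print(weights, converged)  # prints: [2.0, 2.0, 1.0] True
```

If the classes are not linearly separable the loop simply stops at `max_epochs` with `converged` set to `False`; the convergence guarantee applies only to the separable case.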

Multilayer Perceptrons

• Heterogeneous, “hard” nonlinearity: (DAID, 1961)
• Homogeneous, “soft” nonlinearity (“modern” MLP)

[Figure: block diagram; recoverable labels: "feature", "Gaus. class", "subsets", "Perceptron", "I/O"]

[Figure: plot of f(y) versus y]

$$f(y) = \frac{1}{1 + e^{-y}} \quad \text{(sigmoid)}$$

$$0 < f(y) < 1$$
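
For reference, a small sketch of the sigmoid as it might be coded. The two-branch form is an implementation choice made here to avoid floating-point overflow for large negative y; it computes the same function as the formula above.

```python
import math

def sigmoid(y):
    """f(y) = 1 / (1 + e^{-y}), evaluated without overflowing for large |y|."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    z = math.exp(y)          # for y < 0, e^{y} is small, so this cannot overflow
    return z / (1.0 + z)     # algebraically identical to 1 / (1 + e^{-y})

inputs = [-5.0, -1.0, 0.0, 1.0, 5.0]
values = [sigmoid(y) for y in inputs]
print(values)
```

The output rises smoothly from near 0 to near 1 and equals 0.5 at y = 0, which is the "soft" counterpart of a hard threshold.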

Some PR Issues

• Testing on the training set
• Training on the test set
• No. parameters vs no. training examples: overfitting and overtraining
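
The first issue can be made concrete with a 1-nearest-neighbour classifier, which effectively memorizes its training data. The sketch below uses synthetic, heavily overlapping classes; the data, seed and function names are assumptions for illustration only.

```python
import random

def nn_label(x, stored):
    """1-nearest-neighbour label for a scalar feature x."""
    return min(stored, key=lambda item: abs(item[0] - x))[1]

def accuracy(samples, stored):
    """Fraction of samples whose nearest stored neighbour has the correct label."""
    return sum(1 for x, y in samples if nn_label(x, stored) == y) / len(samples)

def draw(rng, n):
    """n synthetic samples per class from two heavily overlapping Gaussians."""
    return ([(rng.gauss(0.0, 1.0), 0) for _ in range(n)]
            + [(rng.gauss(1.0, 1.0), 1) for _ in range(n)])

rng = random.Random(0)
train = draw(rng, 50)
test = draw(rng, 100)

# Testing on the training set: every point is its own nearest neighbour,
# so the score is perfect and says nothing about generalization.
train_accuracy = accuracy(train, train)

# Testing on an independent set gives the honest (and much lower) figure.
test_accuracy = accuracy(test, train)
print(train_accuracy, test_accuracy)
```

The gap between the two numbers is the symptom of overfitting: the classifier has as many free "parameters" as training examples.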