
Page 1: Lecture8 classifiers ldc_rules

ICP3083 L.I. Kuncheva

Lecture 8: Classifiers

Linear discriminant classifier

Rule-based classifiers

Page 2


Classifier Models

1. Nearest mean classifier

2. Linear discriminant classifier (LDC)

3. Rule-based classifiers

4. k-Nearest Neighbour Classifier (k-nn)

5. Decision tree classifier

6. Support Vector Machine classifier (SVM)

7. Classifier Ensembles

Page 3


Linear discriminant classifier

The name shows the type of the discriminant functions: each one is linear in the features,

g1(x) = a10 + a11 x1 + a12 x2 + ... + a1n xn
...
gc(x) = ac0 + ac1 x1 + ac2 x2 + ... + acn xn

where x = (x1, ..., xn) is the feature vector. The coefficients aij can be any real numbers: positive, negative or zero.
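Classifying with linear discriminants amounts to evaluating every gi(x) and taking the largest. A minimal sketch in Python; the coefficient values here are made up purely for illustration:

```python
# Evaluate a set of linear discriminant functions and pick the winner.
def g(a, x):
    """Linear discriminant value: a[0] + a[1]*x1 + ... + a[n]*xn."""
    return a[0] + sum(ai * xi for ai, xi in zip(a[1:], x))

def classify(coeffs, x):
    """Return the (1-based) label of the largest discriminant value."""
    scores = [g(a, x) for a in coeffs]
    return scores.index(max(scores)) + 1

coeffs = [[1.0, 2.0, -1.0],   # g1(x) = 1 + 2*x1 - x2  (illustrative)
          [0.5, -1.0, 3.0]]   # g2(x) = 0.5 - x1 + 3*x2
print(classify(coeffs, [1.0, 0.0]))   # g1 = 3.0, g2 = -0.5 -> class 1
```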

Page 4


An example: two classes, 1-d feature space

x ∈ R, c = 2

g1(x) = x + 2
g2(x) = 3 - 2x

[Figure: plots of g1(x) and g2(x) against x, with the two classification regions marked on the x-axis.]

Q1. Find the threshold point that determines the classification regions.

Set g1(x) = g2(x): x + 2 = 3 - 2x, so 3x = 1 and x = 1/3. Class 2 is decided for x < 1/3 and class 1 for x > 1/3.
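Taking the two discriminant functions of this example as g1(x) = x + 2 and g2(x) = 3 - 2x, the threshold can be checked numerically; exact fractions avoid any rounding:

```python
# Verify the decision threshold of the 1-d, two-class example:
# g1(x) = x + 2 and g2(x) = 3 - 2x cross where x + 2 = 3 - 2x, i.e. x = 1/3.
from fractions import Fraction

def g1(x): return x + 2
def g2(x): return 3 - 2 * x

x_star = Fraction(1, 3)
print(g1(x_star) == g2(x_star))   # True: both equal 7/3 at the threshold
print(g1(1) > g2(1))              # True: class 1 wins to the right
print(g2(0) > g1(0))              # True: class 2 wins to the left
```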

Page 5


An example: three classes, 1-d feature space

x ∈ R, c = 3

Discriminant functions:

g1(x) = x + 2
g2(x) = 3 - 2x
g3(x) = 3

Q2. Draw a graph of the three discriminant functions and find the classification regions.

Classification regions:
Class 1: from 1 to ∞
Class 2: from -∞ to 0
Class 3: from 0 to 1
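Taking the three discriminant functions of this example as g1(x) = x + 2, g2(x) = 3 - 2x and g3(x) = 3, the regions can be confirmed by taking the argmax at a point inside each claimed region:

```python
# Label a 1-d point by the largest of the three discriminant functions.
def label(x):
    g = [x + 2, 3 - 2 * x, 3]     # g1, g2, g3
    return g.index(max(g)) + 1    # 1-based class label

print(label(2.0))    # 1: class 1 for x > 1
print(label(-1.0))   # 2: class 2 for x < 0
print(label(0.5))    # 3: class 3 for 0 < x < 1
```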

Page 6


An example: two classes, 2-d feature space

x ∈ R^2, c = 2

Each discriminant function is a plane, e.g.

g1(x) = 5 + 2 x1 + 3 x2
g2(x) = 1 + 4 x1 + x2
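With two classes in 2-d, the classification boundary is the straight line where the two planes meet, g1(x) = g2(x). A small sketch using the coefficients of this example (whose signs are partly an assumption of this reconstruction):

```python
# Boundary between two linear discriminants in 2-d:
# g1 - g2 = (5-1) + (2-4)*x1 + (3-1)*x2 = 4 - 2*x1 + 2*x2,
# so the boundary is the straight line x2 = x1 - 2.
def g1(x1, x2): return 5 + 2 * x1 + 3 * x2
def g2(x1, x2): return 1 + 4 * x1 + 1 * x2

for x1 in (0.0, 1.0, 5.0):
    x2 = x1 - 2                        # a point on the claimed boundary
    print(g1(x1, x2) == g2(x1, x2))    # True: both planes agree there
```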

Page 7


Linear discriminant classifier (LDC)

• How do we get the discriminant functions?

We train the classifier so that the separability between the classes is maximised. To train an LDC means to find all the coefficients aij in the discriminant functions.

• What do the classification regions of LDC look like?

For 1-d feature space these are intervals on the x-axis, one interval per class.

For 2-d feature space the classification regions are divided by straight lines. In a 2-class problem, the discriminant functions define a single line (classification boundary) that separates the two classification regions.
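The slides do not show the training procedure itself. Purely as an illustration, here is one classic way to obtain the coefficients for a 2-class linear discriminant, the perceptron update rule; this is a sketch, not necessarily the separability-maximising method referred to above:

```python
# Perceptron-style training of one linear discriminant g(x) = a0 + a1*x1 + a2*x2.
# Labels are +1 (class 1) and -1 (class 2); misclassified points nudge the weights.
def train_ldc(data, labels, epochs=100, lr=0.1):
    a = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x1, x2), y in zip(data, labels):
            s = a[0] + a[1] * x1 + a[2] * x2
            if s * y <= 0:             # wrong side (or on) the boundary: update
                a[0] += lr * y
                a[1] += lr * y * x1
                a[2] += lr * y * x2
    return a

data   = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]   # toy sample
labels = [-1, -1, 1, 1]
a = train_ldc(data, labels)
preds = [1 if a[0] + a[1] * x1 + a[2] * x2 > 0 else -1 for x1, x2 in data]
print(preds == labels)   # True: the toy training set is separated
```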

Page 8

[Figure: scatter plot of the data on the unit square, points coloured grey, green and blue.]

Rule-based classifiers

if  0.25 < x1 < 0.5  and  x2 > 0.2  then  class = “green”
if  x1 > 0.55  and  x2 > 0.35  then  class = “blue”
otherwise  class = “grey”
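The rules can be written directly as code. The threshold values (0.2, 0.5, 0.25, 0.55, 0.35) are from the slide, but the comparison directions are an editorial reconstruction:

```python
# A rule-based classifier: ordered if-then rules with a default class.
def classify(x1, x2):
    if 0.25 < x1 < 0.5 and x2 > 0.2:
        return "green"
    if x1 > 0.55 and x2 > 0.35:
        return "blue"
    return "grey"           # default when no rule fires

print(classify(0.3, 0.6))   # green
print(classify(0.7, 0.5))   # blue
print(classify(0.1, 0.1))   # grey
```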

Page 9


Rule-based classifiers

Class counts: Grey 831 (42%), Green 480 (24%), Blue 689 (34%); 2000 objects in total.

Q1. What would be the error rate of the Largest Prior (Majority) classifier?

The Majority classifier labels everything as Grey: error rate = 100% - 42% = 58%.


Zero-R classifier (zero rules) = Largest Prior classifier
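Zero-R in code, using the class counts above: predict the most frequent class for every object and count how often that prediction is wrong.

```python
# Zero-R / largest prior: always predict the most frequent class.
counts = {"grey": 831, "green": 480, "blue": 689}   # 2000 objects in total
total = sum(counts.values())
best = max(counts, key=counts.get)
error_rate = 1 - counts[best] / total

print(best)                       # grey
print(round(100 * error_rate))    # 58  (the slide's rounded 58%)
```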

Page 10


Rule-based classifiers

ONE-R classifier (1 rule)

Check each feature separately and calculate the accuracy at each split. Keep the ONE split with the highest accuracy.
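A sketch of that search for a single numeric feature: try candidate thresholds, label each side by its majority class, and keep the split with the highest training accuracy. The data here is a tiny made-up sample, not the slide's 2000 points:

```python
# One-R split search on one feature.
def one_r(xs, ys):
    best = (0.0, None)                     # (accuracy, threshold)
    for t in sorted(set(xs)):
        left  = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Each side is labelled with its majority class; count the hits.
        correct = sum(side.count(max(set(side), key=side.count))
                      for side in (left, right) if side)
        acc = correct / len(ys)
        if acc > best[0]:
            best = (acc, t)
    return best

xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = ["blue", "blue", "blue", "grey", "grey", "blue"]
print(one_r(xs, ys))   # best split at 0.3, accuracy 5/6
```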

Page 11

Rule-based classifiers

[Figure: the scatter plot with a vertical split on feature x1. The threshold is moved across the whole span of the feature; the best split is annotated with 0.6425.]

One side of the split is labelled Blue (its counts: Grey 147, Green 57, Blue 635); the other side is labelled Grey (Grey 650, Green 444, Blue 67).

Page 12


Rule-based classifiers

Even though the error is estimated on the TRAINING data only (resubstitution), the classifier is very robust, i.e., its generalisation is good!

The resubstitution error rate is 100% - 64.25% = 35.75%.
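The quoted figure comes straight from the counts of the two regions: the correctly labelled objects are the 635 blue points in the Blue-labelled region and the 650 grey points in the Grey-labelled region, out of 2000.

```python
# Resubstitution accuracy of the One-R split from the region counts.
blue_region = {"grey": 147, "green": 57, "blue": 635}   # labelled Blue
grey_region = {"grey": 650, "green": 444, "blue": 67}   # labelled Grey
total = sum(blue_region.values()) + sum(grey_region.values())
accuracy = (blue_region["blue"] + grey_region["grey"]) / total

print(total)                     # 2000
print(accuracy)                  # 0.6425
print(round(1 - accuracy, 4))    # 0.3575
```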

Page 13


Rule-based classifiers

Posterior probabilities for ANY x in the respective region:

Region labelled Blue: Grey 147/839 ≈ 0.18, Green 57/839 ≈ 0.07, Blue 635/839 ≈ 0.76
Region labelled Grey: Grey 650/1161 ≈ 0.56, Green 444/1161 ≈ 0.38, Blue 67/1161 ≈ 0.06
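These estimates are simply the region's class counts divided by the region total. A minimal sketch (note that 635/839 ≈ 0.757, which rounds to 0.76):

```python
# Estimate posterior probabilities in a region from its class counts.
def posteriors(counts):
    n = sum(counts.values())
    return {c: round(k / n, 2) for c, k in counts.items()}

print(posteriors({"grey": 147, "green": 57, "blue": 635}))
# {'grey': 0.18, 'green': 0.07, 'blue': 0.76}
print(posteriors({"grey": 650, "green": 444, "blue": 67}))
# {'grey': 0.56, 'green': 0.38, 'blue': 0.06}
```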
