
Formal Evaluation Techniques

Chapter 7


• Test set error rates, confusion matrices, lift charts

• Focus on formal evaluation methods for supervised learning and unsupervised clustering


7.1 What Should Be Evaluated?

1. Supervised Model

2. Training Data

3. Attributes

4. Model Builder

5. Parameters

6. Test Set Evaluation


[Figure: components of a supervised evaluation. Labels: Data (Instances, Attributes), Training Data, Test Data, Parameters, Model Builder, Supervised Model, Evaluation.]


7.2 Tools for Evaluation

Single-Valued Summary Statistics

• Mean

• Variance

• Standard deviation


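The Normal Distribution

[Figure: f(x) plotted against x, with x measured in standard deviations from the mean. On each side of the mean, 34.13% of the area lies between 0 and 1 standard deviation, 13.59% between 1 and 2, 2.14% between 2 and 3, and 0.13% beyond 3.]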
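The areas under the normal curve can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `normal_cdf` and `area_between` are illustrative and do not come from the slides.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_between(lo: float, hi: float) -> float:
    """Area under the standard normal curve between lo and hi standard deviations."""
    return normal_cdf(hi) - normal_cdf(lo)

# Area of each band on one side of the mean, as a percentage.
for lo, hi in [(0, 1), (1, 2), (2, 3)]:
    print(f"{lo} to {hi} standard deviations: {100 * area_between(lo, hi):.2f}%")
print(f"beyond 3 standard deviations: {100 * (1 - normal_cdf(3)):.2f}%")

# Area within two standard deviations of the mean (both sides).
within_two = area_between(-2, 2)
```

The last value is the reason a bound of two standard errors is commonly used for 95% confidence statements.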


Normal Distributions and Sample Means

• Means taken from random sets of independent samples of equal size are approximately normally distributed.

• A sample mean will fall within two standard errors of the population mean about 95% of the time.


Computing the Standard Error

• The variance of the sample mean is estimated by dividing the sample variance by the sample size.

• The standard error is computed by taking the square root of this estimated variance.
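The two steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the sample values are hypothetical, and the sample variance uses the n - 1 denominator (`statistics.variance`), a choice the slides do not specify.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sqrt(sample variance / sample size)."""
    n = len(sample)
    sample_variance = statistics.variance(sample)  # assumption: n - 1 denominator
    variance_of_mean = sample_variance / n         # step 1: divide by the sample size
    return math.sqrt(variance_of_mean)             # step 2: take the square root

# Hypothetical sample, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
print(f"mean = {statistics.mean(sample):.2f}, standard error = {se:.4f}")
```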


[Figure: random samples (Sample 1, Sample 2, Sample 3) drawn from a population of instances X1 through X10.]


A Classical Model for Hypothesis Testing

• Hypothesis: educated guess about the outcome of some event

• Experimental group, Control group

• Null hypothesis: There is no significant difference in the mean increase or decrease of total allergic reactions per day between patients in the group receiving treatment X and patients in the group receiving the placebo.


A Classical Model for Hypothesis Testing

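$$P = \frac{\left|\bar{X}_1 - \bar{X}_2\right|}{\sqrt{v_1/n_1 + v_2/n_2}}$$

where

$P$ is the significance score;

$\bar{X}_1$ and $\bar{X}_2$ are sample means for the independent samples;

$v_1$ and $v_2$ are variance scores for the respective means; and

$n_1$ and $n_2$ are the corresponding sample sizes.

To be 95% confident that the difference is significant, P must be >= 2.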
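As a rough illustration of the significance score, the sketch below evaluates P from summary statistics and applies the threshold of 2. The function name and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def significance_score(mean1, mean2, var1, var2, n1, n2):
    """P = |mean1 - mean2| / sqrt(var1/n1 + var2/n2)."""
    return abs(mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

# Hypothetical summary statistics for an experimental group and a control group.
p = significance_score(mean1=5.0, mean2=3.0, var1=4.0, var2=5.0, n1=100, n2=100)

# Reject the null hypothesis at the 95% level when P is at least 2.
significant = p >= 2.0
print(f"P = {p:.3f}, significant at 95%: {significant}")
```

Here the denominator is sqrt(0.04 + 0.05) = 0.3, so a difference of 2.0 between the means yields a score well above the threshold.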


Table 7.1 • A Confusion Matrix for the Null Hypothesis

                          Computed Accept    Computed Reject

Accept Null Hypothesis    True Accept        Type 1 Error

Reject Null Hypothesis    Type 2 Error       True Reject


7.3 Computing Test Set Confidence Intervals

$$\text{Classifier Error Rate } (E) = \frac{\text{\# of test set errors}}{\text{\# of test set instances}}$$


Computing 95% Confidence Intervals

1. Given a test set sample S of size n and error rate E

2. Compute the sample variance as V = E(1 - E)

3. Compute the standard error (SE) as the square root of V/n, that is, SE = sqrt(V/n).

4. Calculate an upper bound error as E + 2(SE)

5. Calculate a lower bound error as E - 2(SE)
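The five steps can be sketched as a short function. This is a minimal illustration, not a definitive implementation; the function name and the example counts (100 errors on 1000 test instances) are hypothetical.

```python
import math

def error_rate_interval(errors, n):
    """Approximate 95% confidence interval for a test set error rate."""
    e = errors / n              # step 1: error rate E on a test set of size n
    v = e * (1 - e)             # step 2: sample variance V = E(1 - E)
    se = math.sqrt(v / n)       # step 3: standard error SE = sqrt(V / n)
    upper = e + 2 * se          # step 4: upper bound
    lower = e - 2 * se          # step 5: lower bound
    return lower, upper

# Hypothetical test set: 100 misclassified instances out of 1000.
lower, upper = error_rate_interval(errors=100, n=1000)
print(f"error rate between {lower:.4f} and {upper:.4f} with about 95% confidence")
```

Because SE shrinks with the square root of n, quadrupling the test set size halves the width of the interval.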


Three general comments

• The test data must be randomly chosen from the pool of all possible test set instances

• Test, training, and validation data must represent disjoint sets

• The instances in each class should be distributed in the training, validation, and test data as they are seen in the entire dataset
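One way to satisfy the second and third comments is a stratified split, sketched below with the standard library only. The function name, the 60/20/20 fractions, and the class labels are assumptions made for illustration; the slides do not prescribe a particular procedure.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split instance indices into disjoint training, validation, and test
    sets that keep roughly the class proportions of the full dataset."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, valid, test = [], [], []
    for members in by_class.values():
        rng.shuffle(members)  # random choice within each class
        n_train = round(fractions[0] * len(members))
        n_valid = round(fractions[1] * len(members))
        train += members[:n_train]
        valid += members[n_train:n_train + n_valid]
        test += members[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test

# Hypothetical dataset: 40 instances of one class and 60 of another.
labels = ["sick"] * 40 + ["healthy"] * 60
train, valid, test = stratified_split(labels)
print(len(train), len(valid), len(test))
```

Each index lands in exactly one of the three lists, so the sets are disjoint by construction.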


7.4 Comparing Supervised Learner Models


Comparing Models with Independent Test Data

$$P = \frac{\left|E_1 - E_2\right|}{\sqrt{q(1 - q)\left(1/n_1 + 1/n_2\right)}}$$

where

E1 = the error rate for model M1

E2 = the error rate for model M2

q = (E1 + E2)/2

n1 = the number of instances in test set A

n2 = the number of instances in test set B
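The comparison can be sketched as follows, using the definitions of E1, E2, q, n1, and n2 given on this slide. The function name and the example error rates are hypothetical.

```python
import math

def compare_models(e1, e2, n1, n2):
    """P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2)), with q = (E1 + E2) / 2."""
    q = (e1 + e2) / 2
    return abs(e1 - e2) / math.sqrt(q * (1 - q) * (1 / n1 + 1 / n2))

# Hypothetical results: M1 errs on 20% of test set A, M2 on 30% of test set B,
# with 100 instances in each test set.
p = compare_models(e1=0.20, e2=0.30, n1=100, n2=100)
print(f"P = {p:.3f}")
print("significant at 95%" if p >= 2 else "not significant at 95%")
```

With these numbers the score stays below 2, so a ten-point gap in error rate on test sets of this size would not count as a significant difference at the 95% level.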


7.5 Attribute Evaluation


Locating Redundant Attributes with Excel

• Correlation Coefficient

• Positive Correlation

• Negative Correlation

• Curvilinear Relationship (the points follow a curved line)

– Two attributes having a low r value may still have a curvilinear relationship
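The slides compute r in Excel; the sketch below shows the same Pearson correlation coefficient in Python, with made-up attribute values, to illustrate why a low r does not rule out a curvilinear relationship. The function name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric attributes."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)

# Hypothetical values for attribute A and three candidate attributes B.
a = [-3, -2, -1, 0, 1, 2, 3]
rising = [2 * x + 1 for x in a]   # B rises linearly with A
falling = [-x for x in a]         # B falls linearly with A
curved = [x * x for x in a]       # B depends on A, but not linearly

r_rising = pearson_r(a, rising)
r_falling = pearson_r(a, falling)
r_curved = pearson_r(a, curved)
print(r_rising, r_falling, r_curved)
```

In the third case B is completely determined by A, yet r is zero, which is why a scatterplot is a useful companion to the coefficient when looking for redundant attributes.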


[Figure: Positive Correlation, r = 1. Scatterplot of Attribute B (vertical axis) against Attribute A (horizontal axis).]


[Figure: Negative Correlation, r = -1. Scatterplot of Attribute B (vertical axis) against Attribute A (horizontal axis).]


[Figure: Curvilinear Relationship, r = 0. Scatterplot of Attribute B (vertical axis) against Attribute A (horizontal axis).]


Creating a Scatterplot Diagram with MS Excel


[Figure: Blood Pressure vs. Cholesterol. Scatterplot with Blood Pressure on the horizontal axis (0 to 200) and Cholesterol on the vertical axis (0 to 450).]


Hypothesis Testing for Numerical Attribute Significance

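$$P_{ij} = \frac{\left|\bar{X}_i - \bar{X}_j\right|}{\sqrt{v_i/n_i + v_j/n_j}}$$

where

$\bar{X}_i$ is the class i mean and $\bar{X}_j$ is the class j mean for attribute A.

$v_i$ is the class i variance and $v_j$ is the class j variance for attribute A.

$n_i$ is the number of instances in $C_i$ and $n_j$ is the number of instances in $C_j$.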
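A sketch of the attribute significance test, starting from the raw attribute values in each class rather than from precomputed summaries. The class values are hypothetical, and the use of the n - 1 sample variance (`statistics.variance`) is an assumption; the slides do not say which variance estimate is intended.

```python
import math
import statistics

def attribute_significance(values_i, values_j):
    """P_ij for one numeric attribute, from its values in class i and class j."""
    mean_i, mean_j = statistics.mean(values_i), statistics.mean(values_j)
    var_i = statistics.variance(values_i)  # assumption: n - 1 denominator
    var_j = statistics.variance(values_j)
    n_i, n_j = len(values_i), len(values_j)
    return abs(mean_i - mean_j) / math.sqrt(var_i / n_i + var_j / n_j)

# Hypothetical values of one attribute for the instances of two classes.
class_i_values = [60, 62, 58, 64, 56]
class_j_values = [50, 52, 48, 54, 46]

p_ij = attribute_significance(class_i_values, class_j_values)
print(f"P_ij = {p_ij:.2f}")  # a score of 2 or more suggests the attribute separates the classes
```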


Table 7.2 • Cardiology Patient Data: Numerical Attribute Significance

Attribute        Class Sick    Class Healthy    ESX Attribute Significance    Hypothesis Test for Significance

Age  (Mean)      56.50         52.50            0.45                          4.076
     (Sd)        7.96          9.55

BP   (Mean)      134.40        129.30           0.29                          2.511
     (Sd)        18.73         16.17

Chol (Mean)      251.09        242.23           0.17                          1.495
     (Sd)        49.46         53.55

MHR  (Mean)      139.10        158.47           0.85                          7.955
     (Sd)        22.60         19.1

Peak (Mean)      1.59          0.58             0.86                          8.001
     (Sd)        1.30          0.78


7.6 Unsupervised Evaluation Techniques

• Unsupervised Clustering for Supervised Evaluation

– If the instances cluster into the predefined classes contained in the training data, a supervised learner model built with the training data is likely to perform well.

• Supervised Evaluation for Unsupervised Clustering

– Designate each formed cluster as a class

– Build a supervised learner model by choosing a random sampling of instances from each class

– Test the supervised learner model with the remaining instances

• Additional Methods
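The three steps of supervised evaluation for unsupervised clustering can be sketched as below. This is only an illustration: a nearest-centroid classifier stands in for the supervised learner (the slides do not name one), and the points, cluster assignments, and function names are hypothetical.

```python
import random
from collections import defaultdict

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def evaluate_clustering(points, cluster_ids, train_fraction=0.5, seed=0):
    """Treat clusters as classes, train on a random sample from each class,
    and return the error rate on the remaining instances."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for point, cid in zip(points, cluster_ids):
        by_cluster[cid].append(point)          # step 1: each cluster is a class

    centroids, held_out = {}, []
    for cid, members in by_cluster.items():
        rng.shuffle(members)
        n_train = max(1, round(train_fraction * len(members)))
        centroids[cid] = centroid(members[:n_train])            # step 2: build the model
        held_out += [(p, cid) for p in members[n_train:]]

    errors = 0
    for point, cid in held_out:                                  # step 3: test on the rest
        predicted = min(centroids, key=lambda c: squared_distance(point, centroids[c]))
        errors += predicted != cid
    return errors / len(held_out)  # assumes at least one instance was held out

# Hypothetical clustering of eight two-dimensional instances into two clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
error_rate = evaluate_clustering(points, cluster_ids)
print(f"test set error rate: {error_rate:.2f}")
```

A low error rate suggests the clusters are well separated; a high one suggests that the cluster boundaries are not something a supervised model can learn from the attributes.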


Additional Methods

• Designate all instances as training data

• Apply an alternative technique’s measure of cluster quality

• Create your own measure of cluster quality

• Perform a between-cluster attribute-value comparison.


7.7 Evaluating Supervised Models with Numeric Output


Mean Squared Error

$$mse = \frac{(a_1 - c_1)^2 + (a_2 - c_2)^2 + \dots + (a_i - c_i)^2 + \dots + (a_n - c_n)^2}{n}$$

where for the ith instance,

ai = actual output value

ci = computed output value


Mean Absolute Error

$$mae = \frac{|a_1 - c_1| + |a_2 - c_2| + \dots + |a_n - c_n|}{n}$$

where for the ith instance,

ai = actual output value

ci = computed output value


Table 7.3 • Absolute and Squared Error

Instance    Life Ins. Promo.    Computed    Absolute    Squared
Number      Actual Output       Output      Error       Error

1           0.0                 0.024       0.024       0.0005
2           1.0                 0.998       0.002       0.0000
3           0.0                 0.023       0.023       0.0005
4           1.0                 0.986       0.014       0.0002
5           1.0                 0.999       0.001       0.0000
6           0.0                 0.050       0.050       0.0025
7           1.0                 0.999       0.001       0.0000
8           0.0                 0.262       0.262       0.0686
9           0.0                 0.060       0.060       0.0036
10          1.0                 0.997       0.003       0.0000
11          1.0                 0.999       0.001       0.0000
12          1.0                 0.776       0.224       0.0502
13          1.0                 0.999       0.001       0.0000
14          0.0                 0.023       0.023       0.0005
15          1.0                 0.999       0.001       0.0000
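The mean absolute error and mean squared error for the fifteen instances in Table 7.3 can be computed from the actual and computed output columns. The sketch below is a minimal illustration; the function names are not from the slides, and the per-instance errors are recomputed from the rounded outputs shown in the table, so they may differ slightly in the last digit from the printed error columns.

```python
# Actual and computed outputs for instances 1 to 15, copied from Table 7.3.
actual = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0]
computed = [0.024, 0.998, 0.023, 0.986, 0.999, 0.050, 0.999, 0.262,
            0.060, 0.997, 0.999, 0.776, 0.999, 0.023, 0.999]

def mean_absolute_error(actual, computed):
    """Average of |a_i - c_i| over all instances."""
    return sum(abs(a - c) for a, c in zip(actual, computed)) / len(actual)

def mean_squared_error(actual, computed):
    """Average of (a_i - c_i)^2 over all instances."""
    return sum((a - c) ** 2 for a, c in zip(actual, computed)) / len(actual)

mae = mean_absolute_error(actual, computed)
mse = mean_squared_error(actual, computed)
print(f"mae = {mae:.4f}, mse = {mse:.4f}")
```

Squaring shrinks errors smaller than 1 but keeps the large ones dominant: instances 8 and 12 account for most of the squared error total, while contributing a smaller share of the absolute error total.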