
Parametric measures to estimate and predict performance of identification techniques

Amos Y. Johnson & Aaron Bobick

STATISTICAL METHODS FOR COMPUTATIONAL EXPERIMENTS IN VISUAL PROCESSING & COMPUTER VISION

NIPS 2002

Setup – for example

Given a particular human identification technique:

This technique measures 1 feature (q) from n individuals.

Measure the feature again.

[Figure: 1D feature space (axis x). Gallery: templates q_1, q_2, q_3, ..., q_n. Probe: repeated measurements q'_1, q'_2, q'_3, ..., q'_n. For a template, the probe set contains one target and the remaining probes are imposters.]

Question

For a given human identification technique, how should identification performance be evaluated?



Possible ways to evaluate performance

For a given classification threshold, compute:
  False accept rate (FAR) of imposters
  Correct accept rate (HIT) of genuine targets
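A minimal sketch of the two rates at one threshold, assuming the score is the absolute distance between a probe measurement and a gallery template, and that a probe is accepted when that distance is at or below the threshold. The function name, the sample values, and the acceptance rule are illustrative assumptions, not taken from the slides.

```python
def far_and_hit(target_distances, imposter_distances, threshold):
    """Return (FAR, HIT) for one classification threshold.

    A probe is accepted when its distance to the template is <= threshold
    (an assumed decision rule).
    """
    hit = sum(d <= threshold for d in target_distances) / len(target_distances)
    far = sum(d <= threshold for d in imposter_distances) / len(imposter_distances)
    return far, hit


# Hypothetical 1D gallery and probe values for three individuals.
gallery = [1.0, 4.0, 9.0]
probe = [1.2, 3.7, 9.1]

# Target distances: probe i against its own template i.
target_distances = [abs(p - g) for p, g in zip(probe, gallery)]
# Imposter distances: probe j against every other template i.
imposter_distances = [
    abs(probe[j] - gallery[i])
    for i in range(len(gallery))
    for j in range(len(probe))
    if i != j
]

for threshold in (0.25, 3.0):
    far, hit = far_and_hit(target_distances, imposter_distances, threshold)
    print(f"threshold={threshold}: FAR={far:.3f}, HIT={hit:.3f}")
```

A tight threshold rejects some genuine targets, while a loose one starts admitting imposters, which is the trade-off a single operating point captures.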




For various classification thresholds:
  Plot multiple FAR and HIT rates (ROC curve)
  Compute the area under the ROC curve (AUROC): probability of correct classification
  Compute 1 - area under the ROC curve (1 - AUROC): probability of incorrect classification
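One way to trace the ROC curve and its area can be sketched as follows, assuming distance scores where smaller means a better match and using trapezoidal integration over the swept thresholds. The helper names and the toy distances are hypothetical.

```python
def roc_points(target_distances, imposter_distances):
    """Sweep every observed distance as a threshold and collect (FAR, HIT) points."""
    thresholds = sorted(set(target_distances) | set(imposter_distances))
    points = [(0.0, 0.0)]  # threshold below every distance: nothing accepted
    for t in thresholds:
        far = sum(d <= t for d in imposter_distances) / len(imposter_distances)
        hit = sum(d <= t for d in target_distances) / len(target_distances)
        points.append((far, hit))
    return points  # last point is (1.0, 1.0): everything accepted


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy distances (assumed values): targets overlap partly with imposters.
targets = [1.0, 3.0]
imposters = [2.0, 4.0]
area = auroc(roc_points(targets, imposters))
print(f"AUROC = {area:.2f}, 1 - AUROC = {1.0 - area:.2f}")
```

With these toy values, three of the four (target, imposter) pairs are ordered correctly, so the area equals the fraction of correctly ordered pairs.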

Problem: Database size

If the database is not of sufficient size, then results may not estimate or predict performance on a larger population of people.

1 - AUROC

Our Goal

To estimate and predict identification performance with a small number of subjects

1 - AUROC

Our Solution

Derive two parametric measures:

Expected Confusion (EC): Probability that an imposter’s feature vector is within the measurement variation of a target’s template

Transformed Expected-Confusion (EC*): Probability that an imposter’s feature vector is closer to a target’s template than the target’s feature vector

EC* = 1 - AUROC

Expected Confusion

Probability that an imposter’s feature vector is within the measurement variation of a target’s template



Expected Confusion - Uniform

The templates of the n individuals are drawn from a uniform density: $P_p(x) = 1/n$

The measurement variation of a template is also uniform: $P_i(x) = 1/m$

[Figure: P(x) versus x over the templates q_1, q_2, q_3, ..., q_n, with the population density P_p(x) at height 1/n and the measurement density P_i(x) at height 1/m.]

The probability that an imposter’s feature vector is within the measurement variation of template q_3 is the area of overlap (true if m << n):

$$A(q_3) = \frac{m}{n}$$

The probability that an imposter’s feature vector is within the measurement variation of any template q (true if m << n):

$$EC = \int A(q)\, P_p(q)\, dq = \frac{m}{n}$$
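The m/n result can be checked with a small simulation. This is a sketch under stated assumptions: the population is uniform on an interval of width n, the measurement variation is a window of width m centred on the template, and the function name and parameter values are made up for illustration.

```python
import random


def uniform_expected_confusion(n, m, num_trials=200_000, seed=0):
    """Monte Carlo estimate of the chance that an imposter's feature falls
    inside the measurement window of a randomly chosen template."""
    rng = random.Random(seed)
    confused = 0
    for _ in range(num_trials):
        template = rng.uniform(0.0, n)  # target's template q
        imposter = rng.uniform(0.0, n)  # imposter's feature value
        # Window of width m centred on the template (assumed geometry).
        if abs(imposter - template) <= m / 2.0:
            confused += 1
    return confused / num_trials


n, m = 100.0, 2.0
estimate = uniform_expected_confusion(n, m)
print(f"simulated EC = {estimate:.4f}, m/n = {m / n:.4f}")
```

The estimate falls slightly below m/n because windows near the ends of the interval are truncated, which is why the slide's result carries the condition m << n.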

Expected Confusion - Gaussian

Following the same analysis, for the multidimensional Gaussian case:

$p_p(q) = N(\mu_p, \Sigma_p)$ : Population density

$p_i(x) = N(q, \Sigma_i)$ : Measurement variation

True if the measurement variation is significantly less than the population variation:

$$EC = \frac{|\Sigma_i|^{1/2}}{|\Sigma_p|^{1/2}}$$

Probability that an imposter’s feature vector is within the measurement variation of a target’s template
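To make the Gaussian measure concrete, here is a minimal sketch that evaluates the ratio of the square roots of the covariance determinants, |Sigma_i|^(1/2) / |Sigma_p|^(1/2). The covariance values are invented for illustration, and the small pure-Python determinant helper is a convenience for this example rather than anything from the slides.

```python
import math


def determinant(matrix):
    """Determinant of a square matrix via Gaussian elimination with partial pivoting."""
    a = [[float(value) for value in row] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
    return det


def expected_confusion(sigma_i, sigma_p):
    """EC = |Sigma_i|^(1/2) / |Sigma_p|^(1/2) for covariance matrices given as nested lists."""
    return math.sqrt(determinant(sigma_i) / determinant(sigma_p))


# Hypothetical 2D covariances: measurement variation much smaller than population variation.
sigma_i = [[1.0, 0.0], [0.0, 4.0]]
sigma_p = [[16.0, 0.0], [0.0, 25.0]]
print(f"EC = {expected_confusion(sigma_i, sigma_p):.3f}")
```

A smaller EC means the measurement variation occupies a smaller share of the population's spread, so fewer imposters are expected to be confused with a target.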


Relationship to other metrics: Mutual Information

The negative natural log of the EC is the mutual information of two Gaussian densities:

$$-\ln(EC) = -\ln\left(\frac{|\Sigma_i|^{1/2}}{|\Sigma_p|^{1/2}}\right) = \ln(|\Sigma_p|^{1/2}) - \ln(|\Sigma_i|^{1/2})$$
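As a quick worked case (an illustration, not from the slides), take a single feature so that the covariances reduce to the variances \sigma_p^2 and \sigma_i^2. The 10:1 ratio of standard deviations is an arbitrary assumed value.

```latex
EC = \frac{(\sigma_i^2)^{1/2}}{(\sigma_p^2)^{1/2}} = \frac{\sigma_i}{\sigma_p},
\qquad
-\ln(EC) = \ln \sigma_p - \ln \sigma_i

% Assumed example: \sigma_p = 10\,\sigma_i
EC = 0.1,
\qquad
-\ln(EC) = \ln 10 \approx 2.30 \text{ nats}
```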

Transformed Expected-Confusion

Probability that an imposter’s feature vector is closer to a target’s template, than the target’s feature vector




First: We find the probability that a target’s feature vector is some distance k away from its template

$$p_t^q(k)\,dk$$

[Figure: 1D feature space showing the distance k between a template and its target.]


Second: We find the probability that an imposter’s feature vector is at a distance less than or equal to k from the template

$$\int_0^k p_{im}^q(v)\,dv$$


Therefore: The probability that an imposter’s feature is closer to the target’s template than the target’s feature (for a distance k) is

$$p_t^q(k)\,dk \int_0^k p_{im}^q(v)\,dv$$

Therefore: The probability that an imposter’s feature is closer to the target’s template than the target’s feature (for any distance k) is

$$\int_0^\infty p_t^q(k) \int_0^k p_{im}^q(v)\,dv\,dk$$

Therefore: The expected value of this probability over all targets’ templates is

$$\int p_p(q) \int_0^\infty p_t^q(k) \int_0^k p_{im}^q(v)\,dv\,dk\,dq$$


Next: Replace the density of the distance between a target’s feature vectors and its template q, $p_t^q(k)$, with $p_t(k)$

$$EC^* = \int p_p(q) \int_0^\infty p_t(k) \int_0^k p_{im}^q(v)\,dv\,dk\,dq$$


Answer: Probability that an imposter’s feature vector is closer to a target’s template than the target’s feature vector

$$EC^* = \int p_p(q) \int_0^\infty p_t(k) \int_0^k p_{im}^q(v)\,dv\,dk\,dq$$

This probability can be shown to be one minus the area under a ROC curve, following the analysis of Green and Swets (1966).
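The link between this probability and 1 - AUROC can be illustrated numerically. The sketch below assumes a single Gaussian feature with population standard deviation sigma_p and measurement standard deviation sigma_i. It estimates the confusion probability trial by trial, then compares it with the fraction of all (target, imposter) distance pairs in which the imposter is closer, which is the pairwise-ordering form of 1 - AUROC. All names and parameter values are hypothetical.

```python
import bisect
import random


def simulate_distances(num_trials, sigma_p, sigma_i, seed=0):
    """Draw a template, a repeat measurement of the target, and an imposter
    measurement; return the two lists of distances to the template."""
    rng = random.Random(seed)
    target, imposter = [], []
    for _ in range(num_trials):
        q = rng.gauss(0.0, sigma_p)                           # target's template
        t = q + rng.gauss(0.0, sigma_i)                       # target measured again
        x = rng.gauss(0.0, sigma_p) + rng.gauss(0.0, sigma_i)  # imposter's measurement
        target.append(abs(t - q))
        imposter.append(abs(x - q))
    return target, imposter


def confusion_probability(target, imposter):
    """Fraction of trials in which the imposter is closer to the template than the target."""
    return sum(v < k for k, v in zip(target, imposter)) / len(target)


def one_minus_auroc(target, imposter):
    """Fraction of all (target, imposter) pairs with the imposter distance smaller."""
    sorted_imposter = sorted(imposter)
    closer = sum(bisect.bisect_left(sorted_imposter, k) for k in target)
    return closer / (len(target) * len(imposter))


target, imposter = simulate_distances(50_000, sigma_p=10.0, sigma_i=0.5)
print(f"P(imposter closer) = {confusion_probability(target, imposter):.4f}")
print(f"1 - AUROC          = {one_minus_auroc(target, imposter):.4f}")
```

The two estimates agree up to sampling noise in this setup because the target's distance distribution does not depend on the template, so pooling distances across templates does not change the comparison.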


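Integrate: With these assumptions

Population density of the templates:

$$p_p(q) = N(q; \mu_p, \Sigma_p)$$

Density of the distance k between a target’s feature vector and its template, for a d-dimensional feature:

$$p_t(k) = \frac{k^{d-1}\, e^{-k^2/(2\sigma_i^2)}}{2^{(d/2-1)}\, \sigma_i^d\, \Gamma(d/2)}$$

Probability that an imposter’s feature vector is within distance k of template q:

$$\int_0^k p_{im}^q(v)\,dv \approx p_p(q)\, V_d\, k^d$$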


Integrate: Probability that an imposter’s feature vector is closer to a target’s template than the target’s feature vector

$$EC^* = \int p_p(q) \int_0^\infty p_t(k) \int_0^k p_{im}^q(v)\,dv\,dk\,dq = \frac{V_d\,(d-1)!}{(2\pi)^{d/2}\,\Gamma(d/2)}\, EC$$
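The multiplier relating EC* to EC can be evaluated for a few dimensions. This sketch reads the result as EC* = V_d (d-1)! / ((2 pi)^(d/2) Gamma(d/2)) times EC and assumes V_d is the volume of the unit ball in d dimensions, pi^(d/2) / Gamma(d/2 + 1). Both the reading and the function names are assumptions made for illustration.

```python
import math


def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions (assumed meaning of V_d)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def ec_star_factor(d):
    """Multiplier c_d such that EC* = c_d * EC, under the assumptions above."""
    numerator = unit_ball_volume(d) * math.factorial(d - 1)
    denominator = (2 * math.pi) ** (d / 2) * math.gamma(d / 2)
    return numerator / denominator


for d in range(1, 5):
    print(f"d = {d}: EC* / EC = {ec_star_factor(d):.4f}")
```

Because the multiplier depends only on the dimension d, EC* and EC rank two techniques with the same feature dimension in the same order under these assumptions.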


Compare: EC* with 1 - AUROC

EC* = 1 - AUROC

Conclusion

Derived two parametric measures:

Expected Confusion (EC): Probability that an imposter’s feature vector is within the measurement variation of a target’s template

Transformed Expected-Confusion (EC*): Probability that an imposter’s feature vector is closer to a target’s template than the target’s feature vector



Future Work

Developing a mathematical model of the cumulative match characteristic (CMC) curve

Benefit: To predict how the CMC curve changes as more subjects are added
