13
* Corresponding author. Tel.: #30-31-998225; fax: #30-31- 998419. E-mail addresses: costas@zeus.csd.auth.gr (C. Kotropoulos), tefas@zeus.csd.auth.gr (A. Tefas), pitas@zeus.csd.auth.gr (I. Pitas). Pattern Recognition 33 (2000) 1935}1947 Morphological elastic graph matching applied to frontal face authentication under well-controlled and real conditions C. Kotropoulos*, A. Tefas, I. Pitas Department of Informatics, Aristotle University of Thessaloniki, Box 451, Thessaloniki 540 06, Greece Received 11 February 1999; accepted 12 July 1999 Abstract In this paper, morphological elastic graph matching is applied to frontal face authentication on databases ranging from small to large multimedia ones collected under either well-controlled or real-world conditions. It is shown that the morphological elastic graph matching achieves a very low equal error rate on databases collected under well-controlled conditions. However, its performance deteriorates when it is applied to databases recorded in real-world scenarios. The compensation for variable recording conditions such as changes in illumination, scale di!erences and varying face position prior to the application of morphological elastic matching is proposed. ( 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. Keywords: Frontal face veri"cation; Morphological elastic graph matching; Large databases; Face normalization; Receiver operating characteristics 1. Introduction The interest and the research activities in automatic face recognition have increased signi"cantly over the past few years. The growth is mainly driven by application demands, such as identi"cation for law enforcement and authentication for remote banking and access-control applications. A recent survey on face recognition can be found in Chellapa et al. [1]. Machine recognition of faces yields problems of the following two categories: f Face recognition. Given a test face and a set of refer- ence faces in a database "nd the N closest reference faces to a test one. f Face authentication. Given a test face and a reference one, decide if the test face is identical to the reference one. The just mentioned problems are conceptually di!erent. On the one hand, a face recognition system usually as- sists a human face-recognition expert to determine the identity of the test face by computing all similarity scores between the test face and each human face stored in the system database and by ranking them. On the other hand, a face authentication system should decide itself if the face is a client (i.e., he or she claims his/her own identity) or is an impostor (i.e., he or she pretends to be someone else). The evaluation criteria for face recognition systems are di!erent from those applied to face authentication sys- tems. The performance of face recognition systems is quanti"ed in terms of the percentage of correctly identi- "ed faces within the N best matches. By varying the rank N of the match, the curve of cumulative match score versus rank is obtained [2]. The performance of face authentication systems is measured in terms of the false rejection rate (FRR) achieved at a "xed false acceptance rate (FAR) or vice versa. By varying FAR, the receiver operating characteristic (ROC) curve is obtained. If a scaler "gure of merit is used to judge the performance of an authentication algorithm, then we usually choose the operating point having FAR"FRR, the so-called equal error rate (EER). However, a more detailed com- parison should also include the values FRRD FART0 and FARD FRRT0 . 0031-3203/00/$20.00 ( 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S 0 0 3 1 - 3 2 0 3 ( 9 9 ) 0 0 1 8 5 - 5

Morphological elastic graph matching applied to frontal face

Embed Size (px)

Citation preview

*Corresponding author. Tel.: #30-31-998225; fax: #30-31-998419.

E-mail addresses: [email protected] (C. Kotropoulos),[email protected] (A. Tefas), [email protected] (I. Pitas).

Pattern Recognition 33 (2000) 1935}1947

Morphological elastic graph matching applied to frontal faceauthentication under well-controlled and real conditions

C. Kotropoulos*, A. Tefas, I. Pitas

Department of Informatics, Aristotle University of Thessaloniki, Box 451, Thessaloniki 540 06, Greece

Received 11 February 1999; accepted 12 July 1999

Abstract

In this paper, morphological elastic graph matching is applied to frontal face authentication on databases rangingfrom small to large multimedia ones collected under either well-controlled or real-world conditions. It is shown that themorphological elastic graph matching achieves a very low equal error rate on databases collected under well-controlledconditions. However, its performance deteriorates when it is applied to databases recorded in real-world scenarios. Thecompensation for variable recording conditions such as changes in illumination, scale di!erences and varying faceposition prior to the application of morphological elastic matching is proposed. ( 2000 Pattern Recognition Society.Published by Elsevier Science Ltd. All rights reserved.

Keywords: Frontal face veri"cation; Morphological elastic graph matching; Large databases; Face normalization; Receiver operatingcharacteristics

1. Introduction

The interest and the research activities in automaticface recognition have increased signi"cantly over the pastfew years. The growth is mainly driven by applicationdemands, such as identi"cation for law enforcement andauthentication for remote banking and access-controlapplications. A recent survey on face recognition can befound in Chellapa et al. [1]. Machine recognition of facesyields problems of the following two categories:

f Face recognition. Given a test face and a set of refer-ence faces in a database "nd the N closest referencefaces to a test one.

f Face authentication. Given a test face and a referenceone, decide if the test face is identical to the referenceone.

The just mentioned problems are conceptually di!erent.On the one hand, a face recognition system usually as-

sists a human face-recognition expert to determine theidentity of the test face by computing all similarity scoresbetween the test face and each human face stored in thesystem database and by ranking them. On the otherhand, a face authentication system should decide itselfif the face is a client (i.e., he or she claims his/her ownidentity) or is an impostor (i.e., he or she pretends to besomeone else).

The evaluation criteria for face recognition systems aredi!erent from those applied to face authentication sys-tems. The performance of face recognition systems isquanti"ed in terms of the percentage of correctly identi-"ed faces within the N best matches. By varying the rankN of the match, the curve of cumulative match scoreversus rank is obtained [2]. The performance of faceauthentication systems is measured in terms of the falserejection rate (FRR) achieved at a "xed false acceptancerate (FAR) or vice versa. By varying FAR, the receiveroperating characteristic (ROC) curve is obtained. If ascaler "gure of merit is used to judge the performanceof an authentication algorithm, then we usually choosethe operating point having FAR"FRR, the so-calledequal error rate (EER). However, a more detailed com-parison should also include the values FRRD

FART0and

FARDFRRT0

.

0031-3203/00/$20.00 ( 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.PII: S 0 0 3 1 - 3 2 0 3 ( 9 9 ) 0 0 1 8 5 - 5

A third di!erence is in the requirements needed whenface recognition/authentication systems are trained. Facerecognition systems are usually trained on sets havingone frontal image per person. For example, in face recog-nition experiments conducted on FERET database, thefa images are used to train the system, while fb images areused to test the system. Face authentication systemsusually need more images per individual for training tocapture intra-class variability (i.e., to model the vari-ations of the face images corresponding to the sameindividual). The requirements increase dramaticallywhen linear discriminant analysis is employed to accom-plish feature selection [11]. Although the algorithmsemployed in both face recognition and authenticationsystems are of common origin (for example, the dynamiclink architecture [16]), there is no a priori guarantee thatthe same algorithm would enjoy the same performancelevel in both cases.

Face recognition has received more attention than faceauthentication by the scienti"c community. There are notmany published works on face authentication. Even forface recognition only a handful of algorithms have beentested on databases greater than 150 persons. For a com-parative study, the interested reader may consult Wiskottet al. [3]. Moreover, the conditions under which facedatabases are collected a!ect seriously the performanceof a face recognition system. Recently, a comparativestudy has been performed for three well-known facerecognition techniques, namely, the eigenfaces, the auto-association and classi"cation neural networks, and theelastic graph matching [4]. It has been found that theeigenfaces work well when the face images have relativelysmall lighting and moderate expression variations. Theirperformance deteriorates signi"cantly as lighting vari-ation increases. On the contrary, elastic graph matchingis found relatively insensitive to variations in lighting,face position and expressions. Zhang et al. attributed therobustness of elastic graph matching to the use of Gabor"lters and to the rigid and deformable matching stagesof the recognition algorithm. Recognition errors close to20% are the best rates reported for elastic graph match-ing when scale variations are present. Auto-associationand classi"cation networks actually break down (i.e.,their recognition errors are +60%) in the presence ofscale variations.

In the closely related work, Adini et al. [5] presentedan empirical study that evaluates the sensitivity of severalimage representations (e.g. edge maps, directional andnon-directional derivatives of Gaussian "lters, imagesconvolved by 2-D Gabor-like "lters) in changes of theillumination conditions. It has been found that all theaforementioned image representations are insu$cient toovercome variations due to changes in illumination di-rection, viewpoint and expressions.

Motivated by the studies [2,4,5], "rst we test the per-formance of morphological elastic graph matching to

databases ranging from small ones to large ones collectedunder either well-controlled or real conditions. Secondly,by evaluating the sensitivity of face authentication sys-tems employing the morphological elastic graph match-ing and other competitive algorithms developed withinthe framework of EU-ACTS project M2VTS to changesin illumination, face size and position and facial expres-sions, we propose simple and powerful face normaliz-ation techniques that compensate for varying recordingconditions. Accordingly, the paper complements the pre-viously published works of Zhang et al. [4] and Adiniet al. [5].

More speci"cally, several frontal-face authenticationalgorithms have been developed within M2VTS project.Among others we mention the morphological dynamiclink architecture (MDLA), a sort of morphological elasticgraph matching [8,11], the morphological signal de-composition-dynamic link architecture [9,10], the elasticgraph matching (residual matching and local dis-criminants) [12], the optimized robust correlation [13]and the grey level frontal face matching [14]. All algo-rithms were initially tested on the M2VTS database [15]which contains video data of 37 persons' in 4 shots.A brief comparative study of their performance is in-cluded in Section 3.1.

The paper is focused on the morphological dynamiclink architecture and its application to frontal face auth-entication on four databases. Section 2 brie#y describesthe morphological elastic graph matching algorithm (ab-breviated as MDLA throughout the paper). The "rst twodatabases are multimedia ones collected under well-controlled conditions ranging from small (37 persons) tolarge ones (295 persons). In Section 3, we demonstratethat the morphological elastic graph matching achievesa very low equal error rate on databases collected underwell-controlled conditions. The remaining two databasesare small galleries recorded during real-world tests, suchas access-control to buildings and cash dispenser servicesor access-control to tele-services via Internet in a typicalof"ce environment. Several types of degradation are pres-ent in the latter databases, such as, varying face size(i.e., scale di!erences), varying face position, changes inlighting as well as facial expression variations. Moredetails can be found in Section 4. Although, MDLAachieves a very low equal error rate under well-control-led conditions, its performance deteriorates when it isapplied to databases recorded during real-world tests.The compensation for variable recording conditions suchas changes in illumination, scale di!erences and varyingface position is addressed next. The use of simple andpowerful pre-processing techniques aiming at compen-sating for the aforementioned conditions prior to theapplication of morphological elastic matching is pro-posed. Section 5 outlines the face normalization tech-niques. Experimental results that quantify the success ofthe techniques are included in Section 5 as well. The

1936 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

results obtained indicate that the proposed normaliz-ation technique overcomes the image variations andstabilizes the performance of the authenticationalgorithm.

2. Morphological elastic graph matching

An alternative to linear techniques for generating aninformation pyramid is the scale-space morphologicaltechniques. In the following, a brief description of mor-phological dynamic link architecture (MDLA) is given.In MDLA, we substitute the image representation partof elastic graph matching [16] that is based on Gaborwavelets by the multiscale morphological dilation-erosion.Let R and Z denote the set of real and integer numbers,respectively. Given an image f (x) : D-Z2PR and astructuring function g (x) : G-Z2PR, the dilation ofthe image f (x) by g (x) is de"ned by [6,7]

( f = g) (x)" maxz|G, x~z|D

M f (x!z)#g(z)N. (1)

Its complementary operation, the erosion is given by

( f > g) (x)" minz|G, x`z|D

M f (x#z)!g(z)N. (2)

The scaled hemisphere is employed as a structuring func-tion. The multiscale dilation-erosion of the image f (x) bygp(x) is de"ned by [17]

( f w gp)(x)"G( f = gp) (x) if p'0,

f (x) if p"0,

( f > g@p@) (x) if p(0.

(3)

The outputs of multiscale dilation-erosion for p"!9,2, 9 form the feature vector located at the grid nodex, i.e., j (x)"(( f w g

9) (x),2, f (x),2, ( f w g

~9) (x)). An

8]8 sparse grid has been created by measuring thefeature vectors j(x) at equally spaced nodes over theoutput of the face detection algorithm described in Ko-tropoulos et al. [25].

Let the superscripts t and r denote a test and a refer-ence person (or grid), respectively. The ¸

2norm of the

di!erence between the feature vectors at the ith grid nodehas been used as a (signal) similarity measure, i.e.,

Sv( j (xt

i), j (xr

i))"E j (xt

i)!j (xr

i)E. (4)

As in DLA [16], the quality of a match is evaluated bytaking into account the grid deformation as well. Let usdenote by V the set of grid nodes. The grid nodes aresimply the vertices of a graph. Also let N(i) denote theneighborhood of vertex i. A four-connected neighbor-hood has been used. An additional cost function:

Se(i, j)"S

e(dt

ij, dr

ij)"Edt

ij!dr

ijE ∀i 3 V; j 3 N (i) (5)

with dij"(x

i!x

j) can be used to penalize grid deforma-

tions. The penalty (5) can be incorporated into the costfunction:

C(MxtiN)"+

i|VGSv

( j(xti), j(xr

i))#j +

j|N(i)

Se(i, j)H. (6)

The cost function (6) is actually a matching error thatde"nes a distance measure between two persons. Ladeset al. argue that a two-stage coarse-to-"ne optimizationprocedure su$ces for the minimization of Eq. (6) [16]. Inour experiments, the above-mentioned approach is pro-ved inadequate. Accordingly, we propose: (i) to exploitthe face detection results that are provided by the hier-archical rule-based system [25] for initializing the min-imization of the cost function, and (ii) to replace the twostage optimization procedure by a probabilistic hillclimbing algorithm (i.e., a simulated annealing algorithm)that is reminiscent of the Algorithm 1.4 [18] that doesnot make a distinction between coarse and "ne matching.The design of the face authentication system based onmorphological elastic graph matching spans severalissues that have been solved separately. Linear projectionalgorithms for feature selection could be used to increasethe e$ciency of the method [10]. Automatic weighting ofnodes according to their discriminatory power has alsobeen treated [11].

3. Performance assessment on small and largemultimedia databases collected under well-controlledconditions

In this section, we assess the performance of mor-phological elastic graph matching when it is applied forface authentication to two multimedia databases col-lected under well-controlled conditions. The databasesare the M2VTS database [15] and the extended M2VTSdatabase (XM2VTSDB) [19]. First, we brie#y review theperformance achieved by the morphological elastic graphmatching on the M2VTS database. We proceed then toits performance assessment on the XM2VTSDB.

3.1. Face authentication from M2VTS database images

The morphological dynamic link architecture has beeninitially tested on the M2VTS database [15]. Thedatabase contains both sound and image information.Four recordings (i.e., shots) of 37 persons have beencollected. In our experiments, the sequences of rotatedheads have been considered by using only the luminanceinformation at a resolution of 286]350 pixels. We haveused only one frontal image from the image sequences ofeach person that has been chosen based on symmetryconsiderations. Four experimental sessions have beenimplemented by employing the `leave one outa principle

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1937

Fig. 1. Receiver operating characteristics of morphological dynamic link architecture with/without linear projections (LP) or localdiscriminatory power coe$cients (DPCs).

Table 1Best equal error rates on M2VTS database achieved by frontalface authentication algorithms

Method EER (%)

Elastic graph matching (residual matching) [12] 11.8Morphological dynamic link architecture (with-

out linear projection) [8]9.3

Grey-level frontal face matching [14] 8.5Elastic graph matching (local discriminants) [12] 6.1Morphological dynamic link architecture (with

linear projections of the feature vectors) [10]5.4

Optimized robust correlation [13] 4.8Morphological dynamic link architecture with

discriminatory power coe$cients [11]3.7

thus producing 5328 client claims and another 5328imposter claims. Details on the experimental protocolused in the performance evaluation as well as on thecomputation of thresholds that discriminate each personfrom the remaining persons in the database can be foundin Kotropoulos et al. [8]. Fig. 1 shows the receiver operat-ing characteristics of morphological dynamic link archi-tecture with and without any linear projections as well aswhen discriminatory power coe$cients are employed.

Table 1 summarizes the equal error rates achieved bythe face authentication algorithm under study and com-pares them to those achieved by the other frontal authen-tication algorithms on M2VTS database. It is seen thatthe optimized MDLA with discriminatory power coe$-cients is ranked as the "rst method in terms of the equalerror rate.

3.2. Face authentication from extended M2VTSdatabase images

The extended M2VTS database contains video data of295 persons' which include speech and image sequencesof rotated heads [19]. Eight recordings of 295 personshave been collected under well-controlled recording con-ditions. Morphological elastic graph matching has beentested on the XM2VTSDB according to the two con"g-urations de"ned in Luettin et al. [20]. A subset of 200persons (i.e., training clients) has been used for training.Let us denote byS

1the set of training clients. The design

of a person authentication system based on MDLA is splitinto three procedures, namely the training, evaluation andtest procedure. In the following, we describe the design ofsuch a system in detail for the "rst con"guration.

In the "rst con"guration three frontal images havebeen used for each client. The objective of the trainingprocedure is to determine a threshold, ¹(X; Q), for eachtraining client as follows. Let i, j be the recording indices.For i, j"1, 2, 3 and jOi;

1. Compute the intra-class distances d (Xi, X

j). Let

D (X, X)"mini,j

Md (Xi, X

j)N.

2. Compute the inter-class distances:

d (>i, X

j), X3S

1, >3 (S

1!MXN) (7)

Let D (>, X)"mini,j

Md (>i, X

j)N.

3. Let OS(l)A denote the lth-order statistic of a set A.

For Q"0, 1,2 compute a threshold by

¹(X; Q)"OS(1`Q)

MD (>, X)>OXN. (8)

1938 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

Fig. 2. Receiver operating characteristics of morphological elastic matching on the training, evaluation and test sets on the extendedM2VTS database (a) in con"guration I, and (b) in con"guration II.

It is evident that the training procedure results in 200client claims and 39,800 impostor claims. For severalchoices of Q, di!erent pairs of false acceptance and falserejection rates are found. Accordingly, the receiver oper-ating characteristics on the training set is obtained whichis plotted in Fig. 2(a). It is seen that an EER+3.0% isachieved.

A second subset of 25 persons has been selected toevaluate the performance of the authentication algo-rithm. These persons act as evaluation impostors. Toimplement the evaluation procedure, another three frontalimages have been used for each client and eight frontalimages have been employed for each evaluation impos-tor. The evaluation client images and the evaluationimpostor images build the evaluation set. By using thejust-mentioned set, 600 client claims and 40,000 impostorones are produced. Although there is a possibility toadapt the thresholds determined previously to the evalu-ation set and to use the updated thresholds on the testset, we decided not to do so in order to keep the evalu-ation claims disjoint to the claims that are employed inthe subsequent test procedure. Then the evaluationclaims can be exploited to train a fusion manager. Theestimation of FAR and FRR parallels the steps describedsubsequently in the test procedure. The ROC on theevaluation set is overlaid in Fig. 2(a). It can easily be seenthat an EER+8.0% is obtained on the evaluation set.

A third subset of 70 persons has been selected toconstitute the set of test impostors. Two new frontalimages for each client have been used to implement client

accesses. Eight frontal images for each test have beenused for each test impostor. The test client images and thetest impostor images form the test set. By using the testset, 400 client claims and 112,000 impostor ones can beproduced. The objective of the test procedure is to pro-vide estimates of FAR and FRR by applying the pre-viously determined thresholds on the test set. Let usdenote the new frontal images of person X by X@

l, l"1, 2.

For each X 3 S1, the number of false rejections is ob-

tained as follows:

1. Compute the distances d (X@l, X

i) for i"1, 2, 3.

2. Normalize the distances to the range [0, 1]:z(X, X, i)"f (d(X@

l, X

i), ¹(X, Q)).

3. Compute D(X, X)"miniMz(X, X, i)N.

4. Count a false rejection if D (X, X)(0.5.

Let us denote the set of test impostor images by S3. For

each X 3 S1

and > 3 S3

the number of false accept-ances is obtained as follows:

1. Compute the distances d (>, Xi) for i"1, 2, 3.

2. Normalize the distances to the range [0, 1]:z(>, X, i)"f (d (>, X

i), ¹(X, Q)).

3. Compute D(>, X)"miniMz(>, X, i)N.

4. Count a false acceptance if D(>, X)*0.5.

By varying the parameter Q, di!erent thresholds arerecalled from the training procedure and the receiveroperating characteristic on the test set is created. The testROC is shown overlaid in Fig. 2(a). The inspection ofFig. 2(a) reveals that an EER+6.57% is achieved.

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1939

Table 2Rates at FAR+0, FAR+FRR, and FRR+0 on XM2VTSDBin the two con"gurations of the experimental protocol. All ratesare in %

Con"guration Operatingpoint

Evaluation set Test set

FAR FRR FAR FRR

I FAR+0 0.5 19 0.46 20FAR+FRR 8.11 8 8.23 6FRR+0 48.42 1.66 46.63 0.75

II FAR+0 0.46 18.75 0.46 16.25FAR+FRR 6.04 6.5 6.17 3.5FRR+0 36.85 1.25 34.67 0.75

Table 3False rejection rates measured a posteriori at approximately thesame false acceptance rates in the test procedure. The algorithmsare tested on the XM2VTSDB in both con"gurations. All ratesare in %

Method Con"guration I Con"guration II

FAR FRR FAR FRR

Elastic graph matching(residual matching) [21]

8.08 8.50 7.67 7.25

Morphological dynamiclink architecture (with-out linear projections)

8.23 6 7.94 3.25

Optimized robust correla-tion [22]

7.76 7.25 5.71 5.75

Morphological dynamiclink architecture (with-out linear projections)

7.61 6.25 5.52 4.00

1 It was collected by MATRANORTEL Communications,France.

2 It was collected by IbermaH tica, Spain.

A second con"guration is also tested. It employs fourtraining images for each client, two images for evaluationand another two images for testing. This con"gurationresults in 400 evaluation client claims whereas the num-ber of evaluation impostor claims, test client claims andtest impostor claims are left intact. By adapting the justdescribed training and evaluation procedures and byapplying the previously presented test procedure, theROCs plotted in Fig. 2(b) are obtained. It is seen that theEER is approximately 1.5% on the training set, 6.2% onthe evaluation set and 5% on the test set.

Table 2 summarizes the performance of morphologicalelastic graph matching in terms of the FAR and FRRachieved at three operating points de"ned on the evalu-ation set, namely, FAR+0, FAR+FRR, and FRR+0.

The false rejection rates at approximately the samefalse acceptance rate are measured a posteriori in the testprocedure for morphological elastic graph matching,standard elastic graph matching and optimized robustcorrelation. Both con"gurations are considered. By in-specting the entries in Table 3, it is seen that MDLAoutperforms the elastic graph matching by 2.5 and 4% inCon"gurations I and II, respectively. The correspondinggain in FRR with respect to the optimized robust correla-tion is 1 and 1.75%, respectively.

4. Problems occurred in real-world tests

It is well-known that the conditions under which facedatabases are collected a!ect drastically the performanceof face recognition/authentication algorithms [4,5]. Akey to the successful development of a general face authe-ntication system is to systematically account for the dif-ferent acquisition parameters, i.e., the varying lighting,background, pose and scale of the face, etc. Accordingly,the evaluation of any authentication algorithm on adatabase collected under well-controlled conditions isinadequate when the algorithm is to be integrated ina platform for commercial exploitation. In such a case, it

is of utmost importance to evaluate the authenticationperformance of the algorithm on real-world tests. To-wards this goal two small galleries are collected under`real operating conditionsa, namely, the MATRANOR-TEL database1 and the IBERMATICA database.2

The experiments reported on MATRANORTELdatabase have been conducted on 21 persons. However,the size of the database progressively increases (51 per-sons are currently included in the database). Severalsources of degradation are modeled in the database:

(a) Face size and position. In practice, it is very di$cult tocontrol the position of the subject with respect to thecamera.

(b) Changes in illumination. If a spotlight is not used,lighting variations occur. For example close to a win-dow, the lighting depends strongly on the day-timeand the weather.

(c) Facial expressions. In practice, it is almost impossibleto control the mood of the subject. The smile causesprobably the largest image variation among all facialexpressions.

In addition to images belonging to the just-mentionedcases, the database contains one set of training images(1 image per person) and one set of test images (2 imagesper person) recorded under well-controlled conditions.That is, a uniform white background exists in theimages, uniform lighting conditions are used, the faceis of neutral expression and is located at the center ofthe image. All images in the MATRANORTEL database

1940 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

Table 4Equal error rates achieved by elastic graph matching (EGM), morphological elastic graph matching (MDLA) and optimized robustcorrelation (ORC) when they are applied to MATRANORTEL database

Conditions Number of claims EER (%)

Client Impostor MDLA EGM ORC

Well controlled 38 781 12 6 11Lighting variations 21 441 33 25 32Face size variations 21 441 28 40 30Varying face position 21 441 23 15 20Variable facial expressions 21 441 17 15 10General 122 2545 22 21 20

Fig. 3. Sample images from MATRANORTEL database demonstrating the recording of images under (a) lighting variations, (b) scaledi!erences (c) varying face position, and (d) when facial expressions are not neutral.

are recorded in 256 grey levels and they are ofdimensions 144]192 pixels. Sample images demonstrat-ing the variable recording conditions are shown in Fig. 3.

A set of experiments has been implemented so that theperformance of the morphological elastic matching iscorrelated to each degradation source. More speci"cally,441 impostor claims and another 21 client claims havebeen tested under each degradation source. In additionto these claims, 781 impostor claims and 38 client claimshave been tested under the just-mentioned well-control-led conditions. This is tantamount to 2545 impostorclaims and 122 client claims in total. The fourth columnin Table 4 summarizes the performance of morphologicaldynamic link architecture for each source of degradationseparately and in the general case that encompasses allvarying conditions. The next two columns describe theperformance of the elastic graph matching [12] and theoptimized robust correlation [13] for each source ofdegradation [23].

A set of similar experiments has been conducted on theIBERMATICA database. The experiments reported herehave been conducted on 11 persons. It is worth noting

that the size of the database progressively increases (17persons are currently included in the database). For eachperson two training sample images are stored in thedatabase. A large number of test images has also beenrecorded. In particular, more than 20 test images of eachuser have been recorded when he or she claims thecorrect identity (client images). In addition, a couple oftest images per individual have been recorded when he orshe pretends that he or she is someone else (impostorimages). In total, 514 client accesses have been testedagainst 56 impostor ones. The three types of degradationpreviously described are present in this database as well.All images in the IBERMATICA database are recordedin 256 grey levels and they are of dimensions 320]240.Sample images are shown in Fig. 4. The performance ofthe elastic graph matching [12] and the morphologicalelastic graph matching has been evaluated on the IBER-MATICA database. The EERs obtained using the just-mentioned techniques are shown in Table 5. Fromthe inspection of Tables 4 and 5, it becomes evident thatthe performance of the algorithms developed withinthe M2VTS project depends strongly on the variable

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1941

Fig. 4. Sample images from IBERMATICA database exhibiting scale and lighting di!erences.

Table 5Equal errors rates achieved by elastic graph matcting and mor-phological elastic graph matching when they are applied toIBERMATICA database

Method EER (%)

Elastic graph matching [12,24] 25Morphological elastic graph matching (MDLA) 35Normalization#morphological elastic graph

matching20

recording conditions. To alleviate such a dependency,a compensation for the di!erent conditions is needed. Todo so, the use of simple and powerful face normalizationtechniques prior to the application of any authenticationalgorithm is proposed in Section 5.

5. Face normalization techniques

The proposed techniques are based on the detection ofthe facial region in the image and its splitting in twosegments, the left segment and the right one. We assumethat: (1) the background of the images is uniform, and (2)one only person appears in the scene. Fig. 5 demonstratespictorially the steps of the algorithm that are described indetail below

Step 1: The oval shape of a face can be approximatedby an ellipse. Therefore, the detection of the facial area inan image can be performed by detecting an object ofelliptical shape. To do so, "rst we have to discard theimage background. The starting point is an edge detec-tion algorithm. By thresholding the resulting image afteredge detection, a zero value is assigned to the back-ground. Since the background is uniform and does notcontain any edges, it can easily be discarded by employ-ing a grass-"re algorithm. Accordingly, the image is seg-mented into two regions, one of which contains the facialregion and the other contains the background.

Step 2: The next step is to model the face-like region byan ellipse using moment-based features [26,27]. Let usdenote the face-like area by C and the best-"t ellipse by E.An ellipse is de"ned by its center (x

0, y

0), its orientation

h and the length a and b of its semi-major and semi-minor

axes. The center of the ellipse is estimated by the center ofmass of the region C. The orientation of the ellipse iscomputed by determining the angle between the axis ofthe least moment of inertia and the horizontal axis of thecoordination system, i.e.,

h"1

2arctan A

2o1,1

o2,0

!o0,2B, (9)

where oi, j

denotes the (i, j)-central moment of the regionC. The length of the semi-major axis a and the length ofthe semi-minor axis b can be computed by evaluating theleast and the greatest moment of inertia, respectively. Theleast and the greatest moment of inertia of an ellipse withorientation h, I

minand I

max, are given by

Imin

" +(x,y)|C

[(y!y0) cos h!(x!x

0) sin h]2, (10)

Imax

" +(x,y)|C

[(y!y0) sin h#(x!x

0) cos h]2, (11)

where x, y denote the horizontal and vertical coordinateof a pixel. Accordingly, a and b are given by

a"A4

pB1@4

C(I

max)3

Imin

D1@8

, b"A4

pB1@4

C(I

min)3

Imax

D1@8

, (12)

respectively. To "nd the ellipse that models the givenregion best, we iteratively maximize the measure

M" +(x,y)|EWC

1! +(x,y)|EWC

c

1, (13)

where Cc denotes the complement of the region C (i.e., thebackground). The maximization of Eq. (13) correspondsto the maximization of the number of correctly modeledpixels (i.e., (x, y) 3 EWC) and the minimization of thenumber of incorrectly modeled pixels (i.e., (x, y) 3 EWCc).In order to "nd a maximum of M the following recursiveprocedure is proposed.

Step (a). Calculate the parameters of the initial ellipseE0

and the measure M0

for the initial region C0.

At the ith iteration of the algorithm:Step (b). Find the new region C

i"C

i~1W E

i~1.

Step (c). Calculate the parameters of the ellipse Ei

that"ts C

iand the measure M

i.

1942 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

Fig. 5. Pictorial description of the proposed face normalization algorithm.

Step (d). If Mi'M

i~1go the Step (b). Otherwise, the

best ellipse is Ei~1

that has already been found.

By using this iterative algorithm the ellipse "ttingbecomes more robust to noise, i.e., to pixels that corres-pond to clothes, hair, etc.

Step 3: The "rst ellipse found is a coarse approximat-ion of the facial area, because the hair, and in somecases, parts of the clothes are included in it. Thus, theskin region can be segmented. To overcome this problemthe ellipse is subdivided into its left and right segmentswith respect to the vertical axis. Moreover, the sub-division aids to compensate for the di!erent lighting

conditions which e!ect unevenly the two parts of theface.

Step 4: The next step is to apply a clustering algorithmto each segment of the ellipse, separately. By choosingK"2, a K-means algorithm hopefully succeeds to relatethe skin-like area with a single cluster in each segment.Thus, the skin region is detected accurately. The meanintensity values of each face side are also estimated. Theyprovide information for the lighting conditions duringthe recording procedure.

Step 5: The union of clusters in the left and rightsegments that correspond to skin-like areas is modeledby an ellipse using the algorithm described in Step 2. The

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1943

Fig. 6. Facial images collected under varying illumination conditions. The best ellipses determined by the algorithm are drawn on theoriginal images. The normalized images are shown in the second and fourth columns.

quality of "t is measured again by Eq. (13). By applyingSteps (a)}(d), a "ner approximation is obtained. Theellipse found at the last iteration is the best-"t ellipse wesearched for.

In addition to the use of the just-described techniquein facial area detection, the method provides: (i) anestimate of the face center that can be used in compensat-ing for face translation, (ii) an estimate of the facewidth which is related to the length of the minor axis ofthe best-"t ellipse, and (iii) an estimate of the mainintensity value of each segment of the face (left andright) which is related to the lighting conditions of therecording procedure.

5.1. Illumination normalization

Lighting may cause uneven illuminations of the rightand the left face segments. We assume that the illumina-tion conditions in the left and right face segments areuniform. To compensate for the aforementioned e!ect,the mean intensities of both face segments should beequalized. Let m

L, m

Rbe the main intensity values in the

left and the right segment, respectively. The initial imageI(x, y) is transformed so that the left and right segmentsof the normalized image, I

N(x, y), have the same (desired)

mean intensity Id:

IN(x, y)"I

d A(1/m

R)!(1/m

L)

1#exp ((x0!x)/j)

#

1

mLBI(x, y), (14)

where j controls the slope of the sigmoidal function thatappears in the denominator of the "rst fractional terminside parentheses. Let L be the image region in whichexp ((x

0!x)/j)PR and R be the image region in

which exp ((x0!x)/j)+0. These regions correspond to

the left and the right segments of the best-"t ellipsemodeling the face, respectively. It can be easily proventhat:

E[IN(x, y), (x, y)3L]"E[I

N(x, y), (x, y)3R]"I

d, (15)

which satis"es our objective. Fig. 6 demonstrates theillumination normalization achieved when images fromMATRANORTEL database are used.

5.2. Face position and size

Varying face position and size are easily compensatedfor, if the face is accurately approximated by an ellipse.The problem of varying face position can be solved bytranslating the initial image so that the center of theellipse always coincides with the image plane center. Thewidth of the face can be approximated by the length ofthe minor axis 2b. Size (i.e., scale) normalization can beachieved by resizing the image with a horizontal scalefactor=

d/2b, where =

dis the desired width of the nor-

malized face. Image resizing is achieved by linear interpo-lation. Fig. 7 shows the compensation for scale variationsin images from MATRANORTEL database.

The successful compensation for varying face positionin images from MATRANORTEL database is demon-strated in Fig. 8. Examples of normalized face imagesfrom IBERMATICA database are depicted in Fig. 9.

5.3. Impact of face normalization on the authenticationperformance of morphological elastic graph matching

To quantify the success of the face normalization tech-nique we evaluate the ROC when the proposed normaliz-ation technique is incorporated into the morphologicalelastic graph matching on the MATRANORTEL andthe IBERMATICA databases. The ROCs of mor-phological elastic graph matching with and without facenormalization for the three degradation sources, namely,the changes in lighting, the varying face size and position,are plotted in Fig. 10. In each subplot, "rst the ROCwithout applying the proposed face normalization tech-nique is given. The ROC, when we compensate for thespeci"c source of degradation, is plotted next. Finally, the

1944 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

Fig. 7. Facial images possessing scale di!erences. The best ellipses determined by the algorithm are drawn on the original images. Thenormalized images are shown in the second and fourth columns.

Fig. 8. Facial images exhibiting di!erences in face position. The best ellipses determined by the algorithm are drawn on the originalimages. The normalized images are shown in the second and fourth columns.

Fig. 9. Normalized images corresponding to the sample images of Fig. 4 from IBERMATICA database.

ROC when we simultaneously compensate for the threesources of degradation is also shown. The gain in EERwhen the proposed face normalization technique is em-ployed is shown in Table 6 for each degradation sourceseparately as well as in the general case. A signi"cantdrop 7.3% in EER (which amounts to a 33% relativedrop) is achieved in the general case. For comparisonpurposes, the EERs achieved without normalization arealso included.

The ROC, when we compensate for the three types ofdegradation on the IBERMATICA database is plottedin Fig. 11. The ROC without applying the proposed facenormalization technique is overlaid in the same plot as

well. The EER drops to 20% when the face normaliz-ation technique is employed. That is, a signi"cant drop of15% in EER (which amounts to a 43% relative drop) isachieved.

6. Conclusions

A thorough assessment of morphological elastic graphmatching as a frontal face authentication algorithm onfour databases has been presented. The databases rangefrom small galleries to large multimedia ones collectedunder either well-controlled or real-world conditions. It

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1945

Fig. 10. Receiver operating characteristics of morphologicalelastic graph matching with and without employing face nor-malization on MATRANORTEL database.

Table 6Equal errors rates achieved on MATRANORTEL database bymorphological elastic graph matching with and without facenormalization

Conditions EER (%)

Withoutnormalization

Withnormalization

Well controlled 12 9.5Lighting variations 33 15Face size variations 28 19Varying face position 23 17Variable facial expressions 17 13General 22 14.7

Fig. 11. Receiver operating characteristics of morphologicalelastic graph matching with and without employing face nor-malization on IBERMATICA database.

has been shown that the morphological elastic graphmatching achieves a very low equal error rate ondatabases recorded under well-controlled conditions(3.7}5%). However, its performance deteriorates when itis applied to real-world databases. A face normalization

method that succeeds to compensate for lighting changes,varying face size and position has been proposed. Itguarantees an almost stable authentication performanceof morphological elastic graph matching in "eld testsunder any recording conditions. As a by-product imagevariations attributed to di!erent facial expressions arecompensated for as well.

7. Summary

In this paper, morphological elastic graph matching isapplied to frontal face authentication on four databases.

1946 C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947

Two of them are multimedia databases collected undercontrolled conditions. The "rst one contains audio andvideo data of 37 persons in four shots. The seconddatabase is extended to audio and video data of 295persons recorded at eight time instances. The remainingtwo databases are small galleries recorded during real-world tests, such as access-control to buildings and cashdispenser services or access-control to tele-services viaInternet in a typical o$ce environment. It is demon-strated that the morphological elastic graph matchingachieves a very low equal error rate on databases col-lected under well-controlled conditions. However, itsperformance deteriorates when it is applied to databasesrecorded during "eld tests. The compensation for imagevariations attributed to variable recording conditions,i.e., changes in illumination, face size di!erences andvarying face position, is addressed next. The use of simpleand powerful pre-processing techniques aiming at com-pensating for the aforementioned variations prior to theapplication of morphological elastic graph matching isproposed. The results obtained indicate that such anapproach overcomes the image variations and stabilizesthe performance of the authentication algorithm.

References

[1] R. Chellapa, C.L. Wilson, S. Sirohey, Human and machinerecognition of faces: a survey, Proc. IEEE 83 (1995)705}740.

[2] P. Jonathon Phillips, Matching pursuit "lters applied toface identi"cation, IEEE Trans. Image Process. 7 (1998)1150}1164.

[3] L. Wiskott, J.-M. Fellous, N. KruK ger, C. von der Malsburg,Face recognition by elastic graph matching, TechnicalReport IR-INI 96-08, Insitut FuK r Neuroinformatik, Ruhr-UniversitaK t Bochum, 1996.

[4] J. Zhang, Y. Yan, M. Lades, Face recognition: eigenface,elastic matching and neural nets, Proc. IEEE 85 (1997)1423}1435.

[5] Y. Adini, Y. Moses, S. Ullman, Face recognition: the prob-lem of compensating for changes in illumination direction,IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997)721}732.

[6] R.M. Haralick, L.G. Shapiro, Computer and Robot Vi-sion, Vol. I, Addison-Wesley, Reading, MA, 1992.

[7] I. Pitas, A.N. Venetsanpoulos, Nonlinear Digital Filters:Principles and Applications, Kluwer Academic Publishers,Norwell, MA, 1990.

[8] C. Kotropoulos, I. Pitas, Face authentication based onmorphological grid matching, Proceedings of the IEEEInternational Conference on Image Processing (ICIP 97),1997, pp. I-105}I-108.

[9] A. Tefas, C. Kotropoulos, I. Pitas, Face veri"cation basedon morphological shape decomposition, Proceedings of

the Third International Workshop on Face and GestureRecognition, 1998, pp. 36}41.

[10] A. Tefas, C. Kotropoulos, I. Pitas, Variants of DynamicLink Architecture based on mathematical morphologyfor frontal face authentication, Proceedings of the IEEEComputer Society Conference on Computer Vision andPattern Recognition, 1998, pp. 814}819.

[11] C. Kotropoulos, A. Tefas, I. Pitas, Frontal face authentica-tion using variants of Dynamic Link Matching based onmathematical morphology, Proceeding of the IEEE Inter-national Conference on Image Processings, Vol. I, 1998,pp. 122}126.

[12] B. Duc, S. Fischer, J. BiguK n, Face authentication withGabor information on deformable graphs, IEEE Trans.Image Process. 8 (1998) 504}516.

[13] J. Matas, K. Jonsson, J. Kittler, Fast face localisation andveri"cation, Proceedings of the British Machine VisionConference, 1997.

[14] S. Pigeon, L. Vandendorpe, Image-based multimodal faceauthentication, Signal Process 69 (1998) 59}79.

[15] S. Pigeon, L. Vandendorpe, in: J. BiguK n, G. Chollet, G.Borgefors (Eds.), The M2VTS Multimodal Face Database,Audio- and Video- based Biometric Person Authentica-tion, Lecture Notes in Computer Science, Vol. 1206,Springer, Berlin, 1997, pp. 403}409.

[16] M. Lades, J.C. VorbruK ggen, J. Buhmann, J. Lange, C.v.d.Malsburg, R.P. WuK rtz, W. Konen, Distortion invariantobject recognition in the Dynamic Link Architecture,IEEE Trans. Comput. 42 (1993) 300}311.

[17] P.T. Jackway, M. Deriche, Scale-space properties of themultiscale morphological dilation-erosion, IEEE Trans.Pattern Anal. Mach. Intell. 18 (1996) 38}51.

[18] R.H.J.M. Otten, L.P.P.P. van Ginneken, The AnnealingAlgorithm, Kluwer Academic Publishers, Norwell, MA,1989.

[19] K. Messer, J. Matas, J. Kittler, Acquisition of a largedatabase for biometric identity veri"cation, Proceedings ofBioSig 98, 1998.

[20] J. Luettin, G. Mam( tre, Evaluation protocol for the extendedM2VTS database (XM2VTSDB), in: IDIAP Communica-tion 98-054, IDIAP, Martigny, Switzerland, 1998.

[21] Y. Abdeljaoued, J. Bigun, personal communication, 1998.[22] K. Jonsson, J. Matas, J. Kittler, personal communication,

1998.[23] Y. Menguy, Pilot demonstrators and prototyping plat-

form, in: G. Richard (Ed.), M2VTS Annual Project ReviewReport, MATRANORTEL Communications, 1997.

[24] C. Fernandez-Grande, Personal communication, Iber-maH tica, 1998.

[25] C. Kotropoulos, I. Pitas, Rule-based face detection infrontal views, Proceedings of the IEEE InternationalConference on Acoustics, Speech and Signal Processing(ICASSP 97), Vol. IV, 1997, pp. 2537}2540.

[26] A.K. Jain, Fundamentals of Digital Image Processing,Prentice-Hall, Englewood Cli!s, NJ, 1989.

[27] K. Sobottka, I. Pitas, A novel method for automatic facesegmentation, facial feature extraction and tracking, SignalProcess: Image Commun. 12 (1998) 263}281.

C. Kotropoulos et al. / Pattern Recognition 33 (2000) 1935}1947 1947