
Expert Systems with Applications 37 (2010) 5086–5093


Predicting trait impressions of faces using local face recognition techniques

Sheryl Brahnam a, Loris Nanni b,*

a Computer Information Systems, Missouri State University, 901 S. National, Springfield, MO 65804, USA
b DEIS, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy


Keywords: Trait impressions; Face recognition; Face classification; Human–computer interaction; Overgeneralization effects; Classifier ensembles

0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.12.002

* Corresponding author.
E-mail addresses: [email protected] (S. Brahnam), loris.nanni@unibo.it (L. Nanni).

The aim of this work is to propose a method for detecting the social meanings that people perceive in facial morphology using local face recognition techniques. Developing a reliable method to model people's trait impressions of faces has theoretical value in psychology and human–computer interaction.

The first step in creating our system was to develop a solid ground truth. For this purpose, we collected a set of faces that exhibit strong human consensus within the bipolar extremes of the following six trait categories: intelligence, maturity, warmth, sociality, dominance, and trustworthiness.

In the studies reported in this paper, we compare the performance of global face recognition techniques with local methods applying different classification systems. We find that the best performance is obtained using local techniques, where support vector machines or Levenberg–Marquardt neural networks are used as stand-alone classifiers. System performance in each trait dimension is compared using the area under the ROC curve. Our results show that not only are our proposed learning methods capable of predicting the social impressions elicited by facial morphology but they are also in some cases able to outperform individual human performances.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Over the last few decades, research in face recognition has expanded its focus to include some of the social aspects of faces, such as facial expression recognition. Emotional information is transitory in nature and is conveyed by deformations of facial features and by the dynamic movements involved in facial muscle contractions. Originally of theoretical interest, machine recognition of facial expressions is an established area of research, and we have witnessed rapid growth over the last decade in the application of this technology in such diverse areas as interface design, lip reading, synthetic face animation, human emotion analysis, face-image compression, security, and teleconferencing, to mention a few (Fasel & Luettin, 2003).

Another social aspect of faces that has sparked considerable interest in social psychology and other fields, but not so much in the area of machine classification, relates to the personality impressions that the static morphology of the face produces in the typical observer. Evidence indicates that people across cultures, age groups, genders, and socioeconomic status are consistent and similar to one another in their impressions of various facial forms (Zebrowitz, 1998). Several theoretical frameworks have been advanced that attempt to explain why certain facial characteristics consistently elicit specific impressions. One prominent approach is to examine facial appearances in terms of social affordances. In one variant of this theory, impression formation is based on a number of overgeneralization effects (McArthur & Baron, 1983). As outlined in Section 2, significant overgeneralization effects include facial attractiveness, maturity, gender, and emotion. According to this theory, specific facial characteristics advertise important information that leads people to behave in ways that increase their chances of survival. Recognizing a healthy potential mate, for example, increases the chances of producing robust offspring, and facial features indicative of health (rosy cheeks and unblemished skin textures) are generally sexually attractive. It is theorized that the positive effects of these facial characteristics are then overgeneralized so that attractive people, for instance, are assumed to possess a host of other desirable traits, such as intelligence, optimism, extroversion, and leadership and social skills.

In this paper, we attempt to classify faces according to the personality impressions they elicit. This task presents unique difficulties primarily because the ground truth is not based on ascertainable information about the subjects, such as identity, gender, and age. As noted in Sun, Sebe, and Lew (2004), developing a sound ground truth for detecting social aspects of faces is complicated even in the case of facial expression recognition, where research has identified a basic set of emotions that can be recognized using a well-established facial action coding system (Ekman & Friesen, 1978).

Lacking a strong theoretical foundation for building a ground truth for the social impressions of facial morphology, our classes must be rooted in the impressions a set of faces make on the average observer. We are thus asking machines to map faces to a set of subjective social and psychological categories.

Few face classification experiments have focused on matching consensual subjective judgments. One early attempt is Brahnam (2002). In that study, principal component classifiers were trained to match the human classification of faces along the bipolar trait dimensions of adjustment, dominance, warmth, sociality, and trustworthiness. Results varied depending on the trait dimension; the classifiers matched the average observer in dominance (64%), adjustment (71%), sociality (70%), trustworthiness (81%), and warmth (89%). We believe, however, that these classification rates were inflated by the low number of faces (<15) in each trait class.

Related to our work are experiments designed to detect facial attractiveness. In Kanghae and Sornil (2007), for instance, two sets of photographs of 92 Caucasian females, approximately 18 years old, were rated by subjects using a 7-point scale. In their experiments, they classified faces into two classes, attractive (highest 25% rated images) and unattractive (lowest 25% rated images). Performance, based on the percentage of correctly classified images, averaged 76.0% using K-nearest neighbor and 70.5% using linear support vector machines (SVMs). In this case, the authors believed that the low number of faces in the two classes (≈24 images each) accounted for the poor performance of the classifiers.

As illustrated in Brahnam (2002) and Kanghae and Sornil (2007), it is important to develop a database that contains a sufficient number of representative faces within each trait class. In Section 3, we describe the databases first used in Brahnam and Nanni (2009) and in this study that attempt to circumvent some of the difficulties mentioned above in developing a solid ground truth for this problem domain. For the studies reported in Brahnam and Nanni (2009), six databases of faces were produced for the following six trait dimensions: intelligence, maturity, dominance, sociality, trustworthiness, and warmth. Each database was divided into two classes: high and low. Because artists were asked to design the faces, the number of faces verified by human subjects as belonging in each trait class averaged 111 faces, a much higher number of faces in each class than those used above. Single classifiers and ensembles were then trained to match the bipolar extremes of the faces in each of the six trait dimensions. With performance measured by the area under the ROC curve (AROC) and averaged across all six dimensions, results showed that single classifiers, especially linear SVM (0.74) and Levenberg–Marquardt neural networks (LMNNs) (0.73), performed as well as human raters (0.77). These single classifiers, however, performed poorly in the trait dimension of maturity. Ensembles of 100 LMNNs, constructed using Bagging (BA) (0.76), Random Subspace (SUB) (0.77), and Class Switching (CS) (0.75), compared equally well to rater performance but were better than the single classifiers at handling all six trait dimensions.

This study proposes using local methods for this classification task. It is well known in the biometric literature that methods based only on the global information of the images are not very effective with differences in facial expressions, illumination conditions, and poses. In the last few years, several face recognition methods based on local information have been proposed (Gottumukkal & Asari, 2004; Nanni & Maio, 2007; Shan et al., 2006; Tan & Chen, 2005). In Shan et al. (2006), the authors proposed using ensembles of classifiers, where the face image is first divided into M subimages (with all subimages having the same size, without overlapping) according to their spatial locations. M feature vectors were then obtained by extracting the Local Binary Pattern (LBP) from the Gabor convolved subimages. A Fisher Discriminant Analysis (FDA) was then used to project the Gabor features onto a lower dimensional space. Finally, a nearest neighbor (NN) classification was performed in each space, and the results were combined.

LBP is a well-known descriptor for textures and has been used for representing faces (Ojala, Pietikainen, & Maeenpaa, 2002; Shan et al., 2006). In Shan et al. (2006) it is shown that if the LBP features are extracted from the Gabor convolved image, it is possible to obtain better performance with respect to that obtained by extracting the LBP features from the gray level image.

The remainder of this paper is organized as follows. In Section 2, we provide background information regarding this problem domain by summarizing the characteristics and traits associated with the overgeneralization effects of facial maturity, attractiveness, emotion, and gender. In Section 3, we present the study design and our method of generating the stimulus faces and of evaluating subject ratings. In Section 4, we describe our system architecture, and in Section 5 we compare the performance of simple classifiers and global classifier ensembles to the performance of different local methods. Our tests show that the local methods are more stable in performance across all six trait dimensions and closely approximate, if not exceed, the performance of individual human raters.

2. Social impressions of faces

In this section we provide some background material regarding the social impressions of facial morphology from the perspective of social psychology. As reviewed in Zebrowitz (2006), several theories have been advanced to explain why people form impressions of facial morphology that are fairly consistent. No one theory provides a complete understanding of this process. Several prominent social psychologists involved in facial impression research take an ecological approach by examining facial appearances in terms of their social affordances. As noted in the introduction, one important elaboration of the ecological approach is based on a number of overgeneralization hypotheses (McArthur & Baron, 1983). Research has shown that the structural characteristics of these effects are associated with specific sets of personality traits. As an introduction to this problem domain, we provide an overview in this section of the morphological characteristics and associated trait impressions of the following overgeneralization effects: facial attractiveness, maturity, gender, and emotion.

2.1. Overgeneralization effect of attractiveness

Attractive people are sexually desirable. Although debated, there is some evidence that attractiveness broadcasts actual indicators of physical, emotional, and intellectual fitness (Jones, 1995; Zebrowitz, Hall, & Murphy, 2002). The facial characteristics associated with attractiveness include proportion (Alley & Hildebrandt, 1988), symmetry (Grammer & Thornhill, 1994), straightness of profile (Carello, Grosofsky, & Shaw, 1989), closeness to the population average (Langlois & Roggman, 1990), and sex specific features (Alley, 1992).

The overgeneralization effect of attractiveness hypothesizes that attractive people, being more sexually desirable, would be attributed more positive character traits. Much evidence supports what has come to be called the attractiveness halo effect. Attractive people are generally considered more socially adroit (Kuhlenschmidt & Croger, 1988), better leaders (Cherulnik, 1989), healthier, psychologically more adapted (Langlois, Kalakanis, & Rubenstein, 2000), intellectually more capable, and more moral than those less attractive. As would be expected, unattractive faces are associated with a host of negative traits, and they elicit more negative behaviors from people. Unattractive people are thought to be untrustworthy (Bull & Rumsey, 1988), uncooperative (Mulford, Orbell, & Shatto, 1998), socially maladroit, unintelligent (Feingold, 1992), and psychologically unstable and antisocial (Langlois et al., 2000). People tend to gravitate towards the more attractive and ignore and avoid the unattractive (Bull & Rumsey, 1988). Attractive people are rewarded more often (Raza & Carpenter, 1987), whereas unattractive people suffer more inconsiderate behavior, aggression (Alcock, Solano, & Kayson, 1998) and abuse (Langlois et al., 2000).

1 Implemented as in PrTools 3.1.7 (ftp://ftp.ph.tn.tudelft.nl/pub/bob/prtools/prtools3.1.7).

2.2. Overgeneralization effect of babyfaceness

Infant faces trigger protective and nurturing behaviors in most mature animals and human beings (Lorenz, 1970–1971). The facial features that provoke these behaviors have been identified and include relatively large eyes, high eyebrows, red lips that are proportionally larger, and a small nose and chin. These facial features are also placed lower on the face (Berry & McArthur, 1985).

Given the strength of these nurturing-inducing facial stimuli, it is not surprising that adult faces that share some of the characteristics of baby faces also produce impressions similar to those associated with infants (Berry & McArthur, 1985). Babyfaced people are attributed such common childlike characteristics as submissiveness, naïveté, honesty, kind heartedness, weakness, and warmth (Berry & McArthur, 1985; McArthur & Apatow, 1984). Another strong correlation is lack of credibility and intellectual competence (Berry & McArthur, 1985). Babyfaced people are also thought to be more helpful, caring, and in need of protection (Berry & McArthur, 1986). In contrast, mature-faced individuals are more likely to command respect and to be perceived as experts (Zebrowitz, 1998). Facial maturity is characterized by a broader forehead, strong eyebrows, thin lips, lowered eyebrows, and narrow eyes (Keating, 1985).

2.3. Overgeneralization effect of gender

The trait impressions of babyfaced adults are similar to traits stereotypically attributed to females (Zebrowitz, 1998). This may be due to the fact that female faces tend to retain into adulthood more of the morphological characteristics of youth (Enlow & Hans, 1996). Female heads and facial features are smaller and the eyes tend to be more protrusive, making the eyes look larger. Female eyebrows are usually thinner and arched, resulting in eyebrows that look higher up on the face. In contrast, male faces tend to have features associated with more mature faces: a large nose, dominant jaw, angular cheekbones, deep-set eyes, and a more pronounced brow. The last two characteristics make the eyes look smaller (Enlow & Hans, 1996).

Females, like babyfaced individuals, are more likely to be ascribed characteristics associated with children. Women are stereotypically thought to be more submissive, caring, and in need of protection. Male faces, in contrast, are perceived as being more dominant, intelligent, and capable (Zebrowitz, 1998).

2.4. Overgeneralization effect of emotion

Research supports the idea that facial features that are associated with emotional states, such as smiling when happy, are regarded in ways appropriate to the emotional expression. People react positively to smiling faces and find them disarming and not very dominant (Keating, Mazur, & Segall, 1981). The overgeneralization effect of emotion predicts that a person whose lips resemble smiling would be perceived more positively. Research supports this hypothesis. People with upturned lips are considered friendly, kind, easygoing, and nonaggressive (Secord, Dukes, & Bevan, 1954). Similarly, facial morphology that resembles anger, i.e., faces that have low-lying eyebrows, thin lips, and withdrawn corners of the mouth, is perceived to be threatening, aggressive, and dominant (Keating et al., 1981).

3. Study design

In this section, we provide a brief description of the datasets used in our experiments. A complete description of the study design and datasets can be found in Brahnam and Nanni (2009).

Our concern in developing the study design was to produce a set of faces that produced strong human consensus in the following trait categories: dominance, intelligence, maturity, sociality, trustworthiness, and warmth. A slight modification of these categories was found to be most significant in the formation of person impressions according to a meta-analysis of the social psychology literature from 1967 to 1987 (Eagly, Ashmore, & Makhijani, 1991; Feingold, 1992; Rosenberg, 1977).

In Brahnam and Nanni (2009), we justify our decision to explore machine classification of the social impressions faces make on the average human observer using artificially constructed faces, as opposed to photographs and 3-D scans. Each representation offers both benefits and drawbacks and can be compared with numerous social psychology studies. None of the above representations truly presents faces as they appear to us in our social environment (Brahnam & Nanni, 2009). We chose to use facial composites because they have been widely studied in the person perception literature and, compared to 3-D scans, are easier to collect.

In earlier studies, we randomized the construction of faces using the popular composite program FACES (Freierman, 2000). This produced a large number of faces that failed to produce significant impressions along the six trait dimensions. To remedy this problem, we had four artists design an equal number of male and female faces they thought fit each of the bipolar extremes of the trait dimensions. This resulted in the production of 480 stimulus faces that had a greater probability of eliciting strong trait impressions (see Fig. 1 for example faces). We then asked human subjects from a large Midwestern university to assess each face along each dimension using a 3-point scale. The midpoint was labeled neutral, and the first and last points in the scale were randomly labeled using polar opposite descriptors, e.g., intelligent/unintelligent and dominant/submissive. Subjects were provided with a definition of each trait descriptor.

A statistical analysis of the subject ratings produced the six datasets of faces listed in Table 1. Faces in each trait database were divided into two groups, low and high. For a face to be included in one of the six datasets, it needed to have a standard deviation less than 1.0 and a mean rating <1.6 for low membership or >2.4 for high membership. Furthermore, the mode had to match the class (1 for low and 3 for high). Fig. 1 provides a sample of the faces that fell into the low (cold) and high (warm) classes for the trait dimension of warmth.
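The inclusion rule can be expressed as a simple filter over per-face rating statistics. The sketch below illustrates the criteria as stated; the rating vectors are invented and this is not the authors' analysis code:

```python
import statistics

def assign_class(ratings):
    """Apply the paper's inclusion rule to one face's ratings on the
    3-point scale: keep the face only if the standard deviation is
    below 1.0, the mean is < 1.6 (low) or > 2.4 (high), and the mode
    matches the class (1 for low, 3 for high)."""
    if statistics.stdev(ratings) >= 1.0:
        return None
    mean, mode = statistics.mean(ratings), statistics.mode(ratings)
    if mean < 1.6 and mode == 1:
        return "low"
    if mean > 2.4 and mode == 3:
        return "high"
    return None

# Ten hypothetical subject ratings for one face
print(assign_class([1, 1, 1, 2, 1, 1, 1, 1, 2, 1]))  # low
```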

4. System architecture

Fig. 2 provides a basic schematic of the architecture of our proposed system. Starting from the idea that each subwindow (SW) provides unique information, we train a different classifier using the features extracted from each SW. Moreover, we select only a subset of all the SWs by running Sequential Forward Floating Selection (SFFS) (Nanni & Lumini, 2007; Pudil, Novovicova, & Kittler, 1994).1 Hence, for each selected SW, a different classifier is trained, and then these classifiers are combined using the sum rule.
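The sum rule at the fusion stage simply adds the scores produced by the per-SW classifiers. A minimal sketch (the score values are illustrative):

```python
import numpy as np

def sum_rule_fusion(per_sw_scores):
    """Sum-rule combination: the final score of each face is the sum of
    the scores produced by the classifiers of the selected subwindows.
    per_sw_scores has shape (n_classifiers, n_faces)."""
    return np.asarray(per_sw_scores).sum(axis=0)

# Three SW classifiers each scoring two faces (higher = more "high")
scores = [[0.9, 0.2],
          [0.7, 0.4],
          [0.8, 0.1]]
print(sum_rule_fusion(scores))  # close to [2.4, 0.7]
```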


Fig. 1. Example faces from the warmth dataset. The top row of faces were rated significantly higher in the high set, warm. The bottom row of faces were rated significantly higher in the low set, cold. (Figure reproduced from Brahnam and Nanni (2009), reprinted with permission.)

Table 1
Number of images in the two classes (high and low) of each trait dimension.

Trait dimension    Low   High
Intelligence        39    140
Dominance           99    102
Maturity            49    126
Sociality          126    117
Trustworthiness     97    151
Warmth             139    146

Note: Since a face can be rated as high or low in more than one trait dimension, the total number of faces in the six datasets is >480 (the total number of stimulus faces). Of course, within a given dimension, no face could be rated as both high and low.


In the feature extraction step, the LBP features are extracted from the Gabor convolved image (Shan et al., 2006). The resultant Gabor feature set consists of the convolution results of an input SW image with 16 Gabor filters. Since we concatenate the invariant LBP (uniform patterns) histograms2 of 10 bins, 18 bins, and 26 bins, our feature vector is 864-dimensional (864 = 16 × (10 + 18 + 26)) for each SW.
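The stated dimensionality follows directly from the filter-bank and histogram sizes:

```python
# 16 Gabor filters; each convolved subwindow is described by three
# rotation-invariant uniform LBP histograms of 10, 18, and 26 bins
n_gabor = 16
lbp_bins = [10, 18, 26]   # (P=8,R=1), (P=16,R=2), (P=24,R=3)
dim = n_gabor * sum(lbp_bins)
print(dim)  # 864
```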

Gabor filters: The Gabor wavelets capture the properties of spatial localization, orientation selectivity, spatial frequency selectivity, and quadrature phase relationship (Shan et al., 2006). In the spatial domain, the 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave. The Gabor wavelet representation of an image is the convolution of the image with a family of Gabor kernels. Usually a number of Gabor filters of different scales and orientations are used. In this paper we apply Gabor wavelets at four different scales and four orientations (0°, 45°, 90°, 180°).
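A textbook form of such a kernel can be sketched as follows; the kernel size, wavelengths, and sigma below are illustrative choices, not the paper's parameters:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """2D Gabor kernel: an isotropic Gaussian envelope modulating a
    sinusoidal plane wave at orientation theta (a textbook form; the
    paper does not give its exact parameterization)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)   # rotated coordinate
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_t / wavelength)
    return envelope * carrier

# A bank of 4 scales x 4 orientations, matching the counts in the text
bank = [gabor_kernel(15, w, t, sigma=w / 2.0)
        for w in (4, 6, 8, 10)
        for t in np.deg2rad([0, 45, 90, 180])]
print(len(bank))  # 16
```

Convolving a subwindow with each of the 16 kernels yields the 16 response images from which the LBP histograms are then extracted.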

2 The Matlab implementation is available at the Outex site (http://www.ee.oulu.fi/mvg/page/lbp_matlab).

Invariant local binary patterns2: In Ojala et al. (2002) the authors propose a theoretically and computationally simple approach, based on local binary patterns, which is robust in terms of grayscale variations and which is shown to discriminate a large range of rotated textures efficiently. Starting from the joint distribution of gray values of a circularly symmetric neighbor set of pixels in a local neighborhood, they derive an operator that is, by definition, invariant against any monotonic transformation of the grayscale. Rotation invariance is achieved by recognizing that this grayscale invariant operator incorporates a fixed set of rotation invariant patterns. A binary pattern LBP(P,R) is calculated by considering the difference between the gray value of a pixel x and the gray values of a circularly symmetric neighborhood set of P pixels placed on a circle of radius R; when a neighbor does not fall exactly in the center of a pixel, its value is obtained by interpolation. The LBP histogram of dimension n (usually n = P + 2) is obtained by considering all the binary patterns of a given image. In this work we have tested the three configurations proposed in Ojala et al. (2002), namely, (P = 8, R = 1, n = 10), (P = 16, R = 2, n = 18), and (P = 24, R = 3, n = 26).
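For the P = 8, R = 1 configuration, the rotation-invariant uniform operator can be sketched directly for one pixel (following the LBP^riu2 definition in Ojala et al. (2002); the patch values are invented, and no interpolation is needed since the eight neighbors fall on pixel centers):

```python
import numpy as np

def lbp_riu2(patch):
    """Rotation-invariant uniform LBP (P=8, R=1) for the center pixel
    of a 3x3 patch. Returns a code in 0..9, i.e. one of n = P + 2 = 10
    histogram bins."""
    center = patch[1, 1]
    # The 8 neighbors in circular order
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if n >= center else 0 for n in neighbors]
    # Uniformity: number of 0/1 transitions around the circle
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    if transitions <= 2:          # uniform pattern: code = number of 1s
        return sum(bits)
    return 9                      # all non-uniform patterns share one bin

patch = np.array([[9, 9, 9],
                  [1, 5, 9],
                  [1, 1, 1]])
print(lbp_riu2(patch))  # 4
```

The per-SW histogram is then just the count of each code over all pixels of the (Gabor convolved) subwindow.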

Sequential Forward Floating Selection1 (SFFS) grows a feature subset one feature at a time, interleaving each forward step with backtracking steps that delete features that have become superfluous, in order to find a small, near-optimal set of features. Because the number of subsets to be considered grows exponentially with the number of original features, such algorithms provide a heuristic for determining the best order in which to traverse the feature subset space. The best feature subset Sk of size k is constructed by adding to Sk-1 (with k-1 initially equal to 0) the single feature that gives the best performance for the new subset. At this point, each feature in Sk is in turn excluded, and the resulting set S'k-1 is compared with Sk-1. If S'k-1 outperforms Sk-1, then it replaces Sk-1.

We use SFFS as a feature selection method where each feature, as noted above, is a given SW (i.e., a region of the face). We do this to find the SWs of the whole face that are most useful for classification. Therefore, we select (one for all) the classifiers to be combined by SFFS, using the maximization of the Area Under the ROC curve (AROC) as the objective function (see Section 6).
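A generic floating search over subwindow indices can be sketched as follows; the scoring function is pluggable (e.g., the AROC of the fused classifiers built from a candidate set of SWs). This is an illustration, not the PrTools routine the authors used:

```python
def sffs(n_features, score, k_max):
    """Sequential Forward Floating Selection (sketch). Grows the subset
    one feature at a time; after each forward step, 'floating' backward
    steps drop any feature whose removal beats the best subset
    previously seen at that size. `score` returns a quality value
    (higher is better)."""
    best = {0: frozenset()}      # best subset found so far at each size
    current = frozenset()
    while len(current) < k_max:
        # Forward: add the single most helpful remaining feature
        add = max((f for f in range(n_features) if f not in current),
                  key=lambda f: score(current | {f}))
        current = current | {add}
        k = len(current)
        if k not in best or score(current) > score(best[k]):
            best[k] = current
        # Floating backward: remove features while that improves matters
        while len(current) > 2:
            drop = max(current, key=lambda f: score(current - {f}))
            reduced = current - {drop}
            if score(reduced) > score(best[len(reduced)]):
                current = reduced
                best[len(reduced)] = reduced
            else:
                break
    return best[k_max]

# Toy objective: features 1, 3, 4 are informative, the rest add noise
useful = {1, 3, 4}
score = lambda s: len(s & useful) - 0.1 * len(s - useful)
print(sorted(sffs(6, score, 3)))  # [1, 3, 4]
```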


Fig. 2. System architecture.

Fig. 3. Comparison among the methods tested in this paper (PCA+NN, PCA+SUB, PCA+SVM, LEM+SVM, SW30(10), SW30(28), SW20(20), SW20(60)): AROC per trait dimension (intelligence, maturity, warmth, sociality, dominance, trustworthiness) and on average.


Given our proposed architecture, several methods can be built using different classifiers and methods for SW positioning. In this paper, we propose three different systems:

1. SW30(k), where each image is divided into 28 SWs (size 30 × 30 pixels) and where the LBP features are extracted from the Gabor convolved SWs (Shan et al., 2006). Linear SVMs are used to classify the data. We select (one for all) the best k classifiers to be combined by running SFFS using AROC as the objective function on each of the six datasets. For each dataset we randomly resample the learning and the testing sets (containing respectively half of the patterns) twenty times.

Fig. 4. Average AROC obtained in the six problems studied in this work (PCA+NN, PCA+SUB, PCA+SVM, LEM+SVM, SW30(28), SW30(10), SW20(20), SW20(60)).

2. SW20(k), as in 1 above, except in this method each image is divided into 60 SWs of dimension 20 × 20 pixels.

3. SW20RS(k), as in 2 above, except in this method the classifier we use is a random subspace ensemble of 100 LMNNs with 5 hidden nodes.3 To reduce dimensionality, the Gabor + LBP features are projected onto a lower PCA subspace (where the preserved variance is 0.98) before training the ensemble.
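The subwindow layouts in 1 and 2 are plain non-overlapping tilings. A sketch (the 210 × 120 face-image size is an assumption chosen so that the tile counts match the paper's 28 and 60; the actual image dimensions are not stated here):

```python
import numpy as np

def split_subwindows(img, sw):
    """Divide an image into non-overlapping sw x sw subwindows by
    spatial location, as in the SW30 and SW20 systems."""
    h, w = img.shape
    return [img[r:r + sw, c:c + sw]
            for r in range(0, h - sw + 1, sw)
            for c in range(0, w - sw + 1, sw)]

face = np.zeros((210, 120))             # hypothetical face-image size
print(len(split_subwindows(face, 30)))  # 28
print(len(split_subwindows(face, 20)))  # 60
```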

5. Classification experiments

We compare the performance gained by our local methods with the following four different global methods (as proposed in Brahnam & Nanni (in press)):

1. PCA + NN, classifier based on the PCA3 coefficients extracted from the gray values (where the preserved variance is 0.98), using the Euclidean distance;

2. PCA + SUB, classifier based on the PCA coefficients and Oja's subspace maps, where subspaces are computed3 for each class of the dataset and the distances of the test patterns to these subspaces are used to classify the data;

3 Implemented as in PrTools 3.1.7 ftp://ftp.ph.tn.tudelft.nl/pub/bob/prtools/prtools3.1.7.


Fig. 5. Plot of the performance of (a) SW30 and (b) SW20, varying the value of k.

Fig. 6. The three best sub-windows.

Fig. 7. The three worst sub-windows.


3. PCA + SVM, classifier based on the PCA coefficients and a LinearSVM classifier (Vapnik, 1995);

4. LEM + SVM, classifier based on the Laplacian eigenface (50 dimensions are retained and preprocessed using PCA with preserved variance of 0.98 (He, Yan, & Hu, 2005)) and a linear SVM.

In addition, we compare these systems with standard methods for building ensembles of classifiers, as reported in Brahnam and Nanni (2009). We selected the LMNN with 5 hidden nodes3 as the classifier for building the ensembles since it is well known that ensembles work better if unstable classifiers are used (Kuncheva & Whitaker, 2003). The ensembles were constructed using the following three methods (see Brahnam & Nanni (2009) for a more detailed description of each):

1. PCA + RS, classifier based on the PCA coefficients and a random subspace (RS) ensemble of 100 LMNNs. In RS the individual classifiers use only a subset of all features for training and testing (Ho, 1998). The percentage of the features retained in each training set was set to 50%.

2. PCA + BA, classifier based on the PCA coefficients and a bagging (BA) ensemble of 100 LMNNs. Given a training set S, BA generates K new training sets S1, ..., SK; each new set Si is used to train exactly one classifier (Breiman, 1996). Hence an ensemble of individual classifiers is obtained from the K new training sets.


Fig. 8. Comparison among the global and local ensemble methods tested in this paper (PCA+RS, PCA+BA, PCA+CS, PCA+SVM, SW20RS(20), SW20(20)): AROC per trait dimension and on average.


3. PCA + CS, classifier based on the PCA coefficients and a class switching (CS) ensemble of 100 LMNNs. CS creates an ensemble that combines the decisions of classifiers generated by using perturbed versions of the training set where the classes of the training examples are randomly switched (Martínez-Muñoz & Suárez, 2005).
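Of the three ensemble constructions, class switching is the least standard; its label-perturbation step can be sketched as follows (the switch rate p is an illustrative choice, not the paper's setting):

```python
import random

def class_switch(labels, p=0.3, seed=0):
    """Class switching sketch: each ensemble member is trained on a
    copy of the training set in which a fraction p of the labels has
    been flipped at random (binary high/low labels assumed)."""
    rng = random.Random(seed)
    flipped = list(labels)
    for i in rng.sample(range(len(labels)), int(p * len(labels))):
        flipped[i] = 1 - flipped[i]
    return flipped

labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
noisy = class_switch(labels)
print(sum(a != b for a, b in zip(labels, noisy)))  # 3
```

Repeating this with a different seed per member yields the perturbed training sets from which the 100 LMNNs would be trained.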

6. Results

As previously noted, the performance indicator adopted in this work is the AROC (Ling, Huang, & Zhang, 2003). AROC4 is a scalar measure for evaluating performance that can be interpreted as the probability that the classifier will assign a higher score to a randomly picked true positive sample than to a randomly picked true negative sample.
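This probabilistic interpretation gives a direct way to compute the AROC from the two sets of scores (it is the Wilcoxon–Mann–Whitney statistic; the scores below are illustrative):

```python
def aroc(pos_scores, neg_scores):
    """AROC from its probabilistic reading: the fraction of
    (positive, negative) pairs in which the positive sample receives
    the higher score, counting ties as one half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

print(aroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.1]))  # 8/9 = 0.888...
```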

In this section, we report the average AROC obtained by randomly resampling the learning and the testing sets 20 times (each set containing half of the patterns). In Fig. 3, we compare the performance gained using the different global methods with our local methods (where a stand-alone classifier is trained for each SW). In Fig. 4, we report the average AROC obtained in the six trait problems studied in this work.

The tests reported in Fig. 5 are aimed at comparing the verification performance gained by our system as a function of the number k of SWs selected by SFFS. We plot the AROC obtained by combining the k best matchers selected by SFFS. Fig. 5a reports the performance of the SW30 method, and Fig. 5b the performance of the SW20 method.

The conclusions that may be extracted from the results reported in Figs. 3–5 are the following. First, we see that local approaches with SW selection dramatically outperform global approaches; without feature selection, the performance of the local approaches is similar to that of the best global methods. This indicates to us that some SWs are very useful in this classification problem, whereas others are not. Second, we note that both the SW20 and the SW30 methods, with k = 3, obtain a performance similar to the best global approach. The best SWs selected in the SW20 and SW30 methods are shown in Fig. 6, and the worst SWs are shown in Fig. 7. Classifying the worst SWs using SW30, for example, obtained an average AROC of only 0.55.

Footnote 4: AUC is implemented as in dd tools 0.95 (http://[email protected]).

In Fig. 8, we compare the performance of the standard ensemble-of-classifiers methods described above with that of our local methods, described in Section 4, where an ensemble of classifiers is trained for each SW.

From this test it is clear that a global method based on an ensemble of LMNNs outperforms a global method based on SVM. Among the methods for building an ensemble of classifiers, RS performs best. The local methods SW20RS and SW20 obtain very similar results, both reaching an AROC of approximately 0.82.

In addition to the experiments reported above, we ran SW20 (the system with the best trade-off between performance and computation time) using a different protocol for selecting the best SWs. For each of the six trait datasets, the SWs are selected using only the other five datasets. For instance, the SWs used to classify the trait category of intelligence are selected considering the trait categories of maturity, warmth, sociality, dominance, and trustworthiness. We find that even under this blind testing protocol the AROC remains high (approximately 0.80).

It is important to stress that trait impressions are subjective and that two different people might assign different trait scores to the same face. We found the average AROC of the human raters to be 0.77. To calculate this value, we used the scores assigned to each image by each rater to calculate that rater's individual performance; these individual performances were then averaged. It is interesting to note that the performance obtained by a local ensemble is higher than the average performance of the individual raters. This result leads us to believe that machines are as capable of classifying faces according to the impressions they make on the general observer as are most human beings.
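This rater-averaging procedure is straightforward to reproduce; the sketch below assumes binary 0/1 ground-truth labels and one score list per rater (the example values in the test are invented for illustration):

```python
def rater_aroc(scores, labels):
    """AROC of one rater's scores against the binary ground-truth labels;
    tied pairs count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_rater_aroc(all_rater_scores, labels):
    """Human baseline: average the individual AROCs, one per rater."""
    return sum(rater_aroc(s, labels) for s in all_rater_scores) / len(all_rater_scores)
```

Note that averaging per-rater AROCs measures how well a typical individual ranks the faces, which is a stricter baseline than pooling all raters' scores into one consensus ranking.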

7. Conclusion

This study indicates that machines are capable of perceiving faces in terms of the social impressions they make on people. Using a dataset we developed specifically for this task, we trained single classifiers and ensembles to match human impressions of faces at the bipolar extremes of the following trait dimensions: intelligence, maturity, warmth, sociality, dominance, and trustworthiness.

Two single classifiers, SVM and LMNN, were able to closely match the performance of human raters at this task. However, local face recognition techniques proved superior to the single classifiers in their ability to handle all six dimensions. Our local methods even outperformed human raters in the AROC averaged over the six trait dimensions.

In Section 3, we noted a number of limitations of various methods of representing faces (e.g., photographs, facial constructions, and 3-D scans). The faces in this study were constructed from the database of facial features in the popular composite program FACES. Four artists were asked to produce 480 stimulus faces with an eye towards making faces they thought were clearly intelligent, unintelligent, mature, immature, warm, cold, social, unsocial, dominant, submissive, trustworthy, and untrustworthy. This method of generating faces succeeded in producing an average of 111 faces that were verified to elicit the impressions associated with the above classes. For future work, we are developing datasets of photographs of natural faces that have equally large numbers of faces in each of the twelve trait classes.

At present, our interest in building systems that match human classification of faces in terms of the personality impressions they make is purely theoretical. However, as with emotion recognition, we anticipate the development of application areas. We believe our local results and the SWs we have isolated could help social psychologists further investigate the specific facial characteristics that lead people to make various trait attributions. Aside from developing systems and models that may prove valuable in social psychology, research in this area could also prove useful in human–computer interaction (HCI). Many HCI researchers anticipate that human-like interfaces and robots of the future will need to possess an understanding of the social value of objects, including faces, if they are to engage with us in socially meaningful and gratifying ways. It is not enough in human society simply to recognize what an object is; one is expected to be able to manipulate objects in terms of the cultural layers of meaning that envelop them. Our preliminary research in this area indicates that it is possible for machines to detect some of the social significance embedded in images.

References

Alcock, D., Solano, J., & Kayson, W. A. (1998). How individuals' responses and attractiveness influence aggression. Psychological Reports, 82(3/2), 1435–1438.

Alley, T. R. (1992). Perceived age, physical attractiveness and sex differences in preferred mates' ages. Behavioral and Brain Sciences, 15(1), 92.

Alley, T. R., & Hildebrandt, K. A. (1988). Determinants and consequences of facial aesthetics. In T. R. Alley (Ed.), Social and applied aspects of perceiving faces. Resources for ecological psychology (pp. 101–140). Hillsdale, NJ: Lawrence Erlbaum Associates.

Berry, D. S., & McArthur, L. Z. (1985). Some components and consequences of a babyface. Journal of Personality and Social Psychology, 48(2), 312–323.

Berry, D. S., & McArthur, L. Z. (1986). Perceiving character in faces: The impact of age-related craniofacial changes on social perception. Psychological Bulletin, 100(1), 3–18.

Brahnam, S. (2002). Modeling physical personalities for virtual agents by modeling trait impressions of the face: A neural network analysis. Ph.D. dissertation, The Graduate Center of the City University of New York, Department of Computer Science, New York.

Brahnam, S., & Nanni, L. (2009). Predicting trait impressions of faces using classifier ensembles. In Computational intelligence (Vol. 1): Collaboration, fusion and emergence (pp. 403–439). Intelligent Systems Reference Library (ISRL). Springer.

Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.

Bull, R., & Rumsey, N. (1988). The social psychology of facial appearance. New York: Springer-Verlag.

Carello, C., Grosofsky, A., Shaw, R. E., et al. (1989). Attractiveness of facial profiles is a function of distance from archetype. Ecological Psychology, 1(3), 227–251.

Cherulnik, P. D. (1989). Physical attractiveness and judged suitability for leadership. In Midwestern Psychological Association, Chicago.

Eagly, A. H., Ashmore, R. D., Makhijan, M. G., et al. (1991). Psychological Bulletin, 110(1), 109–128.

Ekman, P., & Friesen, W. V. (1978). Facial action coding system. Palo Alto, CA: Consulting Psychologists Press.

Enlow, D. H., & Hans, M. G. (1996). Essentials of facial growth. Philadelphia: W.B. Saunders Company.

Fasel, B., & Luettin, J. (2003). Automatic facial expression analysis: A survey. Pattern Recognition, 36, 259–275.

Feingold, A. (1992). Good-looking people are not what we think. Psychological Bulletin, 111(2), 304–341.

Freierman, S. (2000). Constructing a real-life Mr. Potato Head. Faces: The ultimate composite picture. The New York Times, 6.

Gottumukkal, R., & Asari, V. K. (2004). An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25(4), 429–436.

Grammer, K., & Thornhill, R. (1994). Human (Homo sapiens) facial attractiveness and sexual selection: The role of symmetry and averageness. Journal of Comparative Psychology, 108(3), 233–242.

He, X., Yan, S., Hu, Y., et al. (2005). Face recognition using Laplacianfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3), 328–340.

Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832–844.

Jones, D. (1995). Sexual selection, physical attractiveness, and facial neoteny. Current Anthropology, 36(5), 723–748.

Kanghae, S., & Sornil, O. (2007). Face recognition using facial attractiveness. IAIAT, Bangkok, Thailand.

Keating, C. F. (1985). Gender and the physiognomy of dominance and attractiveness. Social Psychology Quarterly, 48, 61–70.

Keating, C. F., Mazur, A., & Segall, M. H. (1981). A cross-cultural exploration of physiognomic traits of dominance and happiness. Ethology and Sociobiology, 2, 41–48.

Kuhlenschmidt, S., & Croger, J. C. (1988). Behavioral components of social competence in females. Sex Roles, 18, 107–112.

Kuncheva, L. I., & Whitaker, C. J. (2003). Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51(2), 181–207.

Langlois, J. H., Kalakanis, L., Rubenstein, A. J., et al. (2000). Maxims or myths of beauty? A meta-analytic and theoretical review. Psychological Bulletin, 126(3), 390–423.

Langlois, J. H., & Roggman, L. A. (1990). Attractive faces are only average. Psychological Science, 1, 115–121.

Ling, C. X., Huang, J., & Zhang, H. (2003). AUC: A better measure than accuracy in comparing learning algorithms. Advances in Artificial Intelligence, 329–341.

Lorenz, K. (1970–1971). Studies in animal and human behaviour. Cambridge, MA: Harvard University Press.

Martínez-Muñoz, G., & Suárez, A. (2005). Switching class labels to generate classification ensembles. Pattern Recognition, 38(10), 1483–1494.

McArthur, L. Z., & Apatow, K. (1984). Impressions of baby-faced adults. Social Cognition, 2(4), 315–342.

McArthur, L. Z., & Baron, R. M. (1983). Toward an ecological theory of social perception. Psychological Review, 90(3), 215–238.

Mulford, M., Orbell, J., Shatto, C., et al. (1998). and success in everyday exchange. American Journal of Sociology, 103(6), 1565–1592.

Nanni, L., & Lumini, A. (2007). A hybrid wavelet-based fingerprint matcher. Pattern Recognition, 40(11), 3146–3203.

Nanni, L., & Maio, D. (2007). Weighted sub-Gabor for face recognition. Pattern Recognition Letters, 28(4), 487–492.

Ojala, T., Pietikäinen, M., & Mäenpää, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.

Pudil, P., Novovicova, J., & Kittler, J. (1994). Floating search methods in feature selection. Pattern Recognition Letters, 15(11), 1119–1125.

Raza, S. M., & Carpenter, B. N. (1987). A model of hiring decisions in real employment interviews. Journal of Applied Psychology, 72, 596–603.

Rosenberg, S. (1977). New approaches to the analysis of personal constructs in person perception. In A. L. Land & J. K. Cole (Eds.), Nebraska Symposium on Motivation (pp. 179–242). Lincoln: University of Nebraska Press.

Secord, P. F., Dukes, W. F., & Bevan, W. W. (1954). An experiment in social perceiving. Genetic Psychology Monographs, 49(second half), 231–279.

Shan, S., Zhang, W., Chen, X., et al. (2006). Ensemble of piecewise FDA based on spatial histograms of local (Gabor) binary patterns for face recognition. In The 2006 International Conference on Pattern Recognition (pp. 606–609).

Sun, Y., Sebe, N., Lew, M. S., et al. (2004). Authentic emotion detection in real-time video. Lecture Notes in Computer Science, 3058, 94–104.

Tan, K., & Chen, S. (2005). Adaptively weighted sub-pattern PCA for face recognition. Neurocomputing, 64, 505–511.

Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer-Verlag.

Zebrowitz, L. A. (1998). Reading faces: Window to the soul? Boulder, CO: Westview Press.

Zebrowitz, L. A. (2006). Finally, faces find favor. Social Cognition, 24(5), 657–701.

Zebrowitz, L. A., Hall, J. A., Murphy, N. A., et al. (2002). Looking smart and looking good: Facial cues to intelligence and their origins. Personality and Social Psychology Bulletin, 28, 238–249.