Deciphering the Face

Aleix M. Martinez

Computational Biology and Cognitive Science Lab

aleix@ece.osu.edu

[Figure: the human face at the center, connected to Human-Computer Interaction, Politics, Art, Sign Language, Language, Cognitive Science, and Computer Vision.]

Models of Face Perception

• Features: Shape vs. texture, …

• 2D vs. 3D

• Form of the computational space: Continuous vs. categorical

What we are going to show

• What is the form of the computational space in human face perception? Hybrid approach: a linear combination of continuous representations of categories.

c1·[category 1] + c2·[category 2] + … + cn·[category n]

• What are the dimensions? Mostly configural.

• In computer vision we need precise, detailed detection of faces and facial features.
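The "linear combination of continuous representations of categories" can be illustrated with a small sketch. It assumes each category is represented by a vector and that the coefficients c1 … cn are recovered by ordinary least squares. The slides do not say how the coefficients are obtained, so the fitting method, the function names, and the toy vectors below are all hypothetical.

```python
from typing import List

Vector = List[float]


def combine(coeffs: Vector, categories: List[Vector]) -> Vector:
    """Return c1*f1 + c2*f2 + ... + cn*fn for category vectors f1..fn."""
    dim = len(categories[0])
    return [sum(c * f[d] for c, f in zip(coeffs, categories)) for d in range(dim)]


def solve(a: List[Vector], b: Vector) -> Vector:
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_coefficients(face: Vector, categories: List[Vector]) -> Vector:
    """Least-squares coefficients from the normal equations (F F^T) c = F x.

    Assumes the category vectors are linearly independent.
    """
    gram = [[sum(p * q for p, q in zip(fi, fj)) for fj in categories] for fi in categories]
    rhs = [sum(p * q for p, q in zip(fi, face)) for fi in categories]
    return solve(gram, rhs)


# Toy 3-D category representations (hypothetical values, not data from the talk).
categories = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
face = combine([0.7, 0.3], categories)       # a face that is a known mixture
coeffs = fit_coefficients(face, categories)  # recovers the mixture weights
```

In this toy case the recovered coefficients match the weights used to build the face, which is the sense in which a face would be described by how much of each category it contains.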

Identity

Same or different?

Identity, expression, gender, etc.

Dimensions of the Face Space

Same or different?

Configural processing

Form of the Computational Face Space

Exemplar-based model

Norm-based model

[Figure labels: exemplar cells …; cells; mid-level vision; low-level vision.]

Facial Expressions of Emotion

Muscle Positions Model

• Global shape (bone structure) determines identity – configural.

• But ONLY muscles are responsible for expression, interaction, …

Configural Processing

Emotion perception in emotionless faces

[Figure: faces labeled Neutral, Angry, and Sad.]

Neth & Martinez, JOV, 2009.

Stimuli

[Figure: stimuli at 25%, 50%, 75%, and 100%.]

Neth & Martinez, JOV, 2009.

Experiment

Less, same, more.

Configural Processing

Sad

[Plot: response curves for Less, Same, and More; vertical axis from 0 to 90, horizontal axis from -100% to 100% in steps of 25%; several points are marked with asterisks.]

Neth & Martinez, JOV, 2009.

Configural Processing

Angry

[Plot: response curves for Less, Same, and More; vertical axis from 0 to 90, horizontal axis from -100% to 100% in steps of 25%; several points are marked with asterisks.]

Neth & Martinez, JOV, 2009.

Norm-based Face Space

[Figure: multidimensional face space around the MEAN face, with Sadness and Anger axes marked at 25%, 50%, 75%, and 100%; regions are labeled "+ density" and "- density", with the annotations "Easier" and "More difficult".]

Neth & Martinez, JOV, 2009.
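A norm-based code is commonly described as representing each face by its deviation from the mean face. The sketch below illustrates that general idea only. Reading the 25% to 100% steps as fractions of the way from the mean to a target face is an assumption, and the function names and toy vectors are hypothetical, not taken from the talk.

```python
import math
from typing import List, Tuple

Vector = List[float]


def mean_face(faces: List[Vector]) -> Vector:
    """Component-wise average of a set of face vectors (the norm)."""
    dim = len(faces[0])
    return [sum(f[d] for f in faces) / len(faces) for d in range(dim)]


def morph(mean: Vector, target: Vector, fraction: float) -> Vector:
    """Point lying the given fraction of the way from the mean to the target."""
    return [m + fraction * (t - m) for m, t in zip(mean, target)]


def norm_based_code(face: Vector, mean: Vector) -> Tuple[float, Vector]:
    """Return (distance from the mean, unit direction away from the mean)."""
    delta = [x - m for x, m in zip(face, mean)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0:
        return 0.0, [0.0] * len(delta)
    return dist, [d / dist for d in delta]


# Toy 2-D example with hypothetical values.
mean = mean_face([[-1.0, -1.0], [1.0, 1.0]])   # -> [0.0, 0.0]
target = [3.0, 4.0]                            # stand-in for a 100% stimulus
half = morph(mean, target, 0.5)                # stand-in for a 50% stimulus
intensity, direction = norm_based_code(half, mean)
```

Under this reading, stimuli along one axis share a direction and differ only in their distance from the mean.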

Configural Processing

Neth & Martinez, JOV, 2009.

Computational Space

Neth & Martinez, Vision Research, 2010

Computational Space

Thinner face

Wider face

Neth & Martinez, Vision Research, 2010

American Gothic Illusion

Neth & Martinez, Vision Research, 2010

Why Configural Features?

15 x 10 pixels

Why Configural cues?

sad neutral angry

Neth & Martinez, Vision Research, 2010; Du & Martinez, 2011
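As a rough illustration of what a configural cue could look like in code, the sketch below computes ratios of distances between facial landmarks. The landmark names, the choice of ratios, and the coordinates are all hypothetical; the slides do not specify which measurements were used.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two image points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def configural_features(lm: Dict[str, Point]) -> Dict[str, float]:
    """Ratios of inter-feature distances; unchanged under uniform rescaling."""
    face_width = dist(lm["left_cheek"], lm["right_cheek"])
    face_height = dist(lm["brow_center"], lm["chin"])
    return {
        "brow_to_mouth": dist(lm["brow_center"], lm["mouth_center"]) / face_height,
        "eye_separation": dist(lm["left_eye"], lm["right_eye"]) / face_width,
        "width_to_height": face_width / face_height,
    }


# Hypothetical landmark coordinates in pixels.
landmarks = {
    "left_cheek": (0.0, 50.0),
    "right_cheek": (80.0, 50.0),
    "brow_center": (40.0, 20.0),
    "chin": (40.0, 120.0),
    "mouth_center": (40.0, 90.0),
    "left_eye": (25.0, 35.0),
    "right_eye": (55.0, 35.0),
}
features = configural_features(landmarks)
```

Because every feature is a ratio of distances, shrinking the image uniformly leaves the values unchanged, as long as the landmarks can still be located.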

Proposed Hybrid Model: Recognizing other emotion labels

c1·[category 1] + c2·[category 2] + … + cn·[category n]

Happily surprised, angrily surprised

Martinez, CVPR, 2011

Configural Processing = Precise detection of facial features

3,930 images

4.2 pixels error (1.5%)

Ding & Martinez, PAMI, 2010

Face Detection

Features vs. context

Observation: Most detections are near the correct location – they are not incorrect, they are imprecise.

Key idea: Use context information to train where not to detect faces and facial features.

Ding & Martinez, CVPR, 2008; PAMI, 2010
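One way to picture "training where not to detect" is to take windows shifted slightly away from the annotated location and use them as negative examples, so that a classifier learns to reject near misses. The sketch below only generates such windows. The function name, the shift parameters, and the sampling grid are assumptions for illustration, not the procedure from the cited papers.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def context_windows(true_win: Window, min_shift: int, max_shift: int, step: int) -> List[Window]:
    """Same-size windows displaced from the true location, for use as negatives.

    Displacements smaller than min_shift on both axes are skipped so that a
    negative never coincides (or nearly coincides) with the positive window.
    """
    x, y, w, h = true_win
    windows = []
    for dx in range(-max_shift, max_shift + 1, step):
        for dy in range(-max_shift, max_shift + 1, step):
            if max(abs(dx), abs(dy)) < min_shift:
                continue
            windows.append((x + dx, y + dy, w, h))
    return windows


# Hypothetical annotated eye window and shift settings, in pixels.
true_window = (100, 80, 40, 20)
negatives = context_windows(true_window, min_shift=4, max_shift=8, step=4)
```

The positive example is the annotated window itself. The negatives surround it, which targets exactly the imprecise detections described in the observation above.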

Subclass Discriminant Analysis

Between-subclass scatter matrix:

$$\Sigma_B = \sum_{i=1}^{C} \sum_{j=1}^{H_i} p_{ij} (\mu_{ij} - \mu)(\mu_{ij} - \mu)^T.$$

Basis vectors:

$$\Sigma_B V = \Sigma_X V \Lambda.$$

How many subclasses (H): Minimize the conflict, K.

Zhu & Martinez, PAMI, 2006
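The between-subclass scatter matrix above can be computed directly from samples grouped by class and subclass. The sketch below assumes that p_ij is the fraction of all samples falling in subclass j of class i and that μ is the mean of all samples; neither definition appears on the slide. The function names and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def between_subclass_scatter(subclasses: Dict[Tuple[int, int], List[Vector]]) -> Matrix:
    """Sigma_B = sum_i sum_j p_ij (mu_ij - mu)(mu_ij - mu)^T.

    `subclasses` maps (class i, subclass j) to that subclass's samples.
    p_ij is assumed to be the share of all samples in subclass (i, j).
    """
    all_samples = [x for samples in subclasses.values() for x in samples]
    total = len(all_samples)
    mu = mean(all_samples)
    dim = len(mu)
    sigma_b = [[0.0] * dim for _ in range(dim)]
    for samples in subclasses.values():
        p_ij = len(samples) / total
        diff = [a - b for a, b in zip(mean(samples), mu)]
        for r in range(dim):
            for c in range(dim):
                sigma_b[r][c] += p_ij * diff[r] * diff[c]
    return sigma_b


# Toy data: class 1 split into two subclasses, class 2 kept as one.
data = {
    (1, 1): [[0.0, 0.0], [0.0, 2.0]],
    (1, 2): [[4.0, 0.0], [4.0, 2.0]],
    (2, 1): [[2.0, 3.0], [2.0, 5.0]],
}
sigma_b = between_subclass_scatter(data)
```

The basis vectors would then come from the generalized eigenproblem shown above. With SciPy, `scipy.linalg.eigh(sigma_b, sigma_x)` solves such a problem when the data covariance `sigma_x` is positive definite. That step is left out here to keep the sketch dependency-free.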

Precise Detailed Detection

Error: 6.2 pixels (2%) vs. manual: 4.2 pixels (1.5%)

Ding & Martinez, CVPR, 2008; PAMI, 2010
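Error figures of this kind can be computed as the average distance between detected and reference landmarks, then expressed as a percentage of some reference length. The slide does not state what the percentages are relative to, so the `reference_length` parameter below is an assumption, and the coordinates are made-up values chosen only to exercise the functions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_pixel_error(detected: List[Point], reference: List[Point]) -> float:
    """Average Euclidean distance, in pixels, between paired landmarks."""
    if len(detected) != len(reference):
        raise ValueError("landmark lists must have the same length")
    total = sum(math.hypot(dx - rx, dy - ry)
                for (dx, dy), (rx, ry) in zip(detected, reference))
    return total / len(detected)


def relative_error(pixel_error: float, reference_length: float) -> float:
    """Pixel error as a percentage of a reference length given in pixels."""
    return 100.0 * pixel_error / reference_length


# Hypothetical detections and manual annotations for two landmarks.
detected = [(3.0, 4.0), (10.0, 10.0)]
manual = [(0.0, 0.0), (10.0, 16.0)]
error_px = mean_pixel_error(detected, manual)
error_pct = relative_error(error_px, reference_length=275.0)
```

Running the same computation between two human annotators gives the kind of manual baseline against which an automatic detector's error can be compared.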

Detection + non-rigid SfM

Gotardo & Martinez, PAMI, 2011; Gotardo & Martinez, CVPR, 2011.

Take Home Messages

• What is the form of the computational space in human face perception? Linear combination of known categories.

c1·[category 1] + c2·[category 2] + … + cn·[category n]

• What are the dimensions? Mostly configural.

• Precise detection of facial features.

CBCSL

Paulo Gotardo, Shichuan Du, Don Neth, Liya Ding, Onur Hamsici, Samuel Rivera, Fabian Benitez, Hongjun Jia, Di You.

National Institutes of Health

National Science Foundation