Analysis of Facial Dynamics for Affect and Face Recognition
Douglas Fidaleo, Mohan Trivedi
Computer Vision and Robotics Research Laboratory,
University of California, San Diego
Motivation
Extracting events from video sequences
• G-folds: Appearance-based facial gesture model for gesture intensity analysis (applied to gesture and face detection/verification)
• Thin-plate spline features for affect analysis
[Figure: levels of analysis: I. Vehicle, Full Body; II. Head, Face; III. Affect, Gestures]
Modeling of Facial Gesture Dynamics
We seek a model that:
• Normalizes the temporal component of gestures
• Allows for analysis of dynamics
• Enables video AND static image analysis
• Is computationally efficient
[Figure: gesture intensity vs. time for two smiles (Smile 1, Smile 2)]
Gesture Manifolds (G-folds)
Low-dimensional continuous parameterization of the appearance of facial gestures.
Gesture data are samples of a continuous manifold.
Structure of the manifold depends on the appearance characteristics of the data.
Gesture modeled as a curve parameterized by gesture intensity.
[Figure: gesture curves labeled Smile, Grimace, Frown, and Neutral]
[Fidaleo and Neumann 2003]
G-folds Overview
• Coarticulation regions [Fidaleo and Neumann 2002]
• Appearance Manifolds [Murase, Nayar, Nene 1996]
• Principal Curves [Hastie and Stuetzle 1989]
G-folds Overview: Data
Gesture samples extracted from neutral to maximum (repeated).
G-folds Overview (cont.)
Gesture data modeled by a continuous curve.
Projection onto the curve determines gesture intensity.
[Figure: curve with parameter values t=0, t=0.6, and t=1 marked]
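As an illustration of how a projection onto a curve can yield an intensity value, the sketch below projects a sample onto a piecewise-linear curve and reports its normalized arc-length position t in [0, 1]. This is not the authors' implementation: the function name `project_to_polyline`, the polyline representation, and the example coordinates are all assumptions made for illustration.

```python
import math

def project_to_polyline(point, vertices):
    """Project a point onto a polyline (hypothetical helper).

    Returns (t, distance): t in [0, 1] is the normalized arc-length
    position of the closest point on the curve, and distance is how far
    the sample lies from the curve.
    """
    seg_lengths = [math.dist(a, b) for a, b in zip(vertices, vertices[1:])]
    total = sum(seg_lengths)
    if total == 0:
        raise ValueError("polyline needs at least two distinct vertices")

    best_t, best_d = 0.0, float("inf")
    travelled = 0.0  # arc length accumulated before the current segment
    for a, b, length in zip(vertices, vertices[1:], seg_lengths):
        u = 0.0
        if length > 0:
            # Fraction along the segment of the orthogonal projection,
            # clamped so the closest point stays on the segment.
            dot = sum((p - ai) * (bi - ai) for p, ai, bi in zip(point, a, b))
            u = min(1.0, max(0.0, dot / length**2))
        closest = [ai + u * (bi - ai) for ai, bi in zip(a, b)]
        d = math.dist(point, closest)
        if d < best_d:
            best_d = d
            best_t = (travelled + u * length) / total
        travelled += length
    return best_t, best_d

# Made-up curve in a 3-D feature space (illustrative values only).
curve = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
t, d = project_to_polyline((1.0, 0.2, 0.5), curve)
print(t, d)
```

Because t depends only on position along the curve and not on when a frame was captured, the same function applies to a single still image or to every frame of a video.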
G-folds Origin
Designed for person-specific analysis for performance-driven facial animation
[Fidaleo and Neumann 2003]
Example G-folds for different subjects and regions
SID_0001 SID_0002 SID_0003
More G-folds…
SID_0001 SID_0002 SID_0003
Current explorations: G-folds for Detection
Can we exploit the G-fold structure differences to:
a) detect the occurrence of specific gestures?
b) detect faces in video sequences?
c) discriminate between people?
Current experiments: Testbed
Thermal
Video
Audio
[Fidaleo and Trivedi 2003]
First round: Face verification
4 Gestures, 6 Subjects
To evaluate:
a) Which gestures can be most consistently posed?
b) Which gestures contain the most discrimination power?
c) How well do these perform for gesture detection and face verification?
G-fold comparison
Training: Compute each subject's G-fold for each gesture; retain the parameterized manifold and basis.
Testing: Project new gesture data into the selected G-fold basis. Construct an estimated principal curve for the gesture using the polyline algorithm [Kegl 2000].
[Figure: curve with endpoints t=0 and t=1 marked]
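The training and testing steps can be sketched roughly as follows. The slides do not say how the basis is computed, so this sketch assumes a PCA basis with three components (the limitations slide mentions "3 PC's"). The helper names `fit_basis` and `project`, the image size, and the synthetic data are illustrative assumptions, and the principal-curve fitting step is not shown.

```python
import numpy as np

def fit_basis(samples, n_components=3):
    """Fit a PCA basis (assumed here) to training frames.

    samples: array of shape (n_frames, n_pixels).
    Returns (mean, basis) where basis has shape (n_components, n_pixels).
    """
    mean = samples.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, basis):
    """Project frames into the retained low-dimensional basis."""
    return (samples - mean) @ basis.T

# Synthetic stand-in data: 50 training and 10 test frames of 400 pixels each.
rng = np.random.default_rng(0)
train = rng.normal(size=(50, 400))
test = rng.normal(size=(10, 400))

mean, basis = fit_basis(train)        # training: retain mean and basis
coords = project(test, mean, basis)   # testing: project new gesture data
print(coords.shape)
```

The low-dimensional `coords` are what a curve-fitting step would then operate on.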
Comparison (cont.)
Geometric templates
Uniformly resample curves
Compute similarity (aggregate distance of each point to curve)
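One way to realize the resample-and-compare step is sketched below. It assumes the curves are polylines, resamples the test curve uniformly in arc length, and uses the mean point-to-curve distance as the aggregate; the slides do not say which aggregate was used, and all function names here are hypothetical.

```python
import math

def resample(vertices, n):
    """Return n >= 2 points spaced uniformly in arc length along a polyline."""
    seg = [math.dist(a, b) for a, b in zip(vertices, vertices[1:])]
    total = sum(seg)
    points = []
    for k in range(n):
        target = total * k / (n - 1)
        i = 0
        # Walk forward to the segment that contains the target arc length.
        while i < len(seg) - 1 and target > seg[i]:
            target -= seg[i]
            i += 1
        u = min(1.0, target / seg[i]) if seg[i] > 0 else 0.0
        a, b = vertices[i], vertices[i + 1]
        points.append(tuple(ai + u * (bi - ai) for ai, bi in zip(a, b)))
    return points

def point_to_segment(p, a, b):
    """Distance from point p to the segment from a to b."""
    length2 = sum((bi - ai) ** 2 for ai, bi in zip(a, b))
    u = 0.0
    if length2 > 0:
        dot = sum((pi - ai) * (bi - ai) for pi, ai, bi in zip(p, a, b))
        u = min(1.0, max(0.0, dot / length2))
    return math.dist(p, [ai + u * (bi - ai) for ai, bi in zip(a, b)])

def curve_distance(test_curve, template, n=20):
    """Mean distance from resampled test-curve points to the template curve."""
    pts = resample(test_curve, n)
    segments = list(zip(template, template[1:]))
    return sum(min(point_to_segment(p, a, b) for a, b in segments)
               for p in pts) / n

# Made-up 2-D curves: one lies on the template, one is offset by 1 unit.
template = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
same = [(0.0, 0.0), (2.0, 0.0)]
shifted = [(0.0, 1.0), (2.0, 1.0)]
print(curve_distance(same, template), curve_distance(shifted, template))
```

A smaller value means the test curve lies closer to the stored template. Note that this distance is one-directional; a symmetric variant would average it with the distance from the template to the test curve.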
Results
Eyebrow Raise (L,R,C): 100%
Smile (L,R): 100%
Brow Furrow: 100%
Blink (L,R): 66%
• 10 actuations
• 5 test/5 train sequences
• Same session
• Average over 6 subjects and all combinations of train/test sequences
Results (cont.)
Problems with eye blink: too many DOFs
Eyeball rotation shifts the blink gesture across the manifold, so a 1D parameterization is insufficient
Limitations
• Small errors in registration lead to relatively large shifts in the manifold (as evidenced by eye motion effects)
• Visibility of important features such as wrinkles depends on the lighting angle
• Face partitioning does not allow for speech gestures
• May not scale well with number of subjects (3 PCs may be insufficient)
Future Work: Short Term – G-folds specific
Data
• Different acquisition sessions
• False alarm-capable sessions
Evaluation
• Sensitivity to non-ideal conditions
• Gesture detection
• Model dimensionality
• More thorough theoretical analysis of the G-folds model
• More robust G-fold distance metric
Future Work: Online System
• Generation of principal curves online
• Integration with robust online face tracking
• Connection to in-group face detection and capture
Future Work: Longer Term
• Multi-modal gesture analysis
• Auto enrolling of subjects
• Asymmetry profiles using gesture intensity extraction
• Speech Gestures
PART II: Pose-Invariant Facial Affect Analysis Using Thin-Plate Splines
Joel McCall
Advisor: Prof. Mohan Trivedi
Computer Vision and Robotics Research Laboratory,
University of California, San Diego
Motivation
Applications
• User Interface/Human Computer Interaction
• Remote Communications/Messaging
Requirements – Ideal Solution
• Robust to lighting and other environmental changes
• Robust to rigid-body motion while retaining sensitivity to the non-rigid motion caused by facial affect
• Real-time operation (efficient algorithms)
Solution
• Thin-Plate Splines!
System Overview
Feature Vector Extraction
Thin-Plate Spline Warping Parameters
• Closed-form solution can be calculated quickly
• Parameters can be separated into affine warping and non-linear warping
• Affine warping can approximate perspective projections of planar surfaces undergoing rotations
• Assume the points used to generate the model are nearly planar
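The closed-form solution and the affine/non-linear split can be illustrated with the standard 2-D thin-plate spline formulation, which uses the kernel U(r) = r² log r. This is a textbook sketch, not the system described in the slides: the function names, the five control points, and the rotation-plus-translation example are all assumptions.

```python
import numpy as np

def tps_kernel(r):
    """Thin-plate kernel U(r) = r^2 log r, with U(0) defined as 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(src, dst):
    """Fit a 2-D thin-plate spline mapping src points onto dst points.

    src, dst: arrays of shape (n, 2).
    Returns (affine, nonlinear): affine has shape (3, 2) with rows
    [translation, x-coefficients, y-coefficients]; nonlinear has shape
    (n, 2) and holds the kernel weights for each control point.
    """
    n = src.shape[0]
    K = tps_kernel(np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), src])
    # One linear system gives all parameters: [[K, P], [P^T, 0]] [w; a] = [dst; 0]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    params = np.linalg.solve(L, rhs)
    return params[n:], params[:n]

# Hypothetical control points: the corners and center of a unit square.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])

# A purely rigid motion (30-degree rotation plus translation) of those points.
theta = np.deg2rad(30.0)
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
shift = np.array([2.0, -1.0])
dst = src @ A.T + shift

affine, nonlinear = fit_tps(src, dst)
print(np.round(affine, 3))
print(np.abs(nonlinear).max())
```

In this example the motion is entirely affine, so the non-linear weights come out numerically zero and the rotation and translation are absorbed by the affine block. That separation is what allows the non-linear part to serve as a feature largely insensitive to rigid motion of a nearly planar point set.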
Results
Tests on Cohn-Kanade Facial Expression Database
• Over 200 subjects with FACS ground truth