Facial Type, Expression, and Viseme Generation Josh McCoy, James Skorupski, and Jerry Yee


Introduction

Virtual Human Faces
– Hard to generate
– Easy to criticize

Motivation
– Movies
– Games

Problems
– Hand-made models take time
– Physically-based models look weird

Contribution

Data-driven face generation
User-guided categorization
Real-time pose generation from data

Related Work: Face Retargeting

V. Blanz, C. Basso, and T. Vetter
Reanimating Faces in Images and Video.
– Use a morphable model to synthesize a 3D face of the 2D image.
– Capture 35 scans of static face poses (expressions and visemes in neutral expression) from a source actor.
– Find dense point-to-point correspondences.
– Retarget facial movements to the 3D face.
– Render the 3D face back into the 2D image.

Related Work: Face Retargeting

Problems
– Does not generate new expressions that are not in the source data set.
– Does not combine and retarget expressions and visemes together.

Related Work: Bilinear Model

E. Chuang and C. Bregler
Mood Swings: Expressive Speech Animation
– Capture a video of an actor reading a script under three different expressions (happy, angry, neutral).
– Create a bilinear model, factoring expressions and visemes into two separate components.
– Synthesize new facial movements with any expression and viseme.
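As a rough illustration of what "factoring into two separate components" means, the sketch below evaluates a generic bilinear form: a face vector is produced from an interaction tensor and two independent weight vectors, one for expression and one for viseme. The tensor `W`, the function `synthesize_bilinear`, and the toy dimensions are hypothetical; the exact formulation in the cited paper may differ.

```python
import numpy as np

def synthesize_bilinear(W, expression, viseme):
    """Return y with y[k] = sum over i, j of W[i, j, k] * expression[i] * viseme[j]."""
    return np.einsum("ijk,i,j->k", W, expression, viseme)

# Toy interaction tensor (assumed sizes): 2 expression dims x 3 viseme dims
# x 4 face coordinates.
W = np.arange(24, dtype=float).reshape(2, 3, 4)

expression = np.array([0.0, 1.0])    # hypothetical weights: second expression only
viseme = np.array([1.0, 0.0, 0.0])   # hypothetical weights: first viseme only

# With one-hot weights the model simply picks out the slice W[1, 0, :].
face = synthesize_bilinear(W, expression, viseme)
```

Because the two weight vectors enter independently, either one can be changed while the other is held fixed, which is how an expression could be swapped under the same viseme.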

Related Work: Bilinear Model

Problems
– Requires a full Cartesian product of facial expressions and visemes.
– Does not generate new expressions that are not in the source data set.
– Does not change the facial characteristics (identity).

Pres Videos\Jerry\moodswings.mov

Related Work: Multilinear Model

D. Vlasic, M. Brand, H. Pfister, & J. Popovic
Face Transfer with Multilinear Models
– Capture videos of 16 actors, each performing 5 visemes under 5 different expressions.
– Create a multilinear model, factoring expressions, visemes, and identity into three separate components.
– Synthesize new facial movements with any expression, viseme, and identity.
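The three-way factoring can be pictured as a core tensor contracted with one weight vector per attribute. The sketch below is only a toy: the core tensor is random, the function name `synthesize_multilinear` is hypothetical, and the mode sizes (5 expressions, 5 visemes, 16 identities) are borrowed from the capture setup above purely for illustration.

```python
import numpy as np

def synthesize_multilinear(core, w_expr, w_vis, w_id):
    """Contract the core tensor with one weight vector per attribute mode."""
    return np.einsum("vabc,a,b,c->v", core, w_expr, w_vis, w_id)

rng = np.random.default_rng(0)
# Assumed layout: 6 face coordinates x 5 expressions x 5 visemes x 16 identities.
core = rng.normal(size=(6, 5, 5, 16))

# One-hot weights select a single captured combination ...
w_expr = np.eye(5)[2]
w_vis = np.eye(5)[1]
w_id = np.eye(16)[7]
captured = synthesize_multilinear(core, w_expr, w_vis, w_id)

# ... while blended weights give a combination that was never captured.
blend_id = 0.5 * np.eye(16)[7] + 0.5 * np.eye(16)[3]
blended = synthesize_multilinear(core, w_expr, w_vis, blend_id)
```

The dense core tensor also makes the "full Cartesian product" requirement concrete: every (expression, viseme, identity) cell needs data before the model can be built.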

Related Work: Multilinear Model

Problems
– Requires a full Cartesian product of facial expressions, visemes, and identity.
– Limitations in the missing data imputation process.
– Does not generate new expressions that are not in the source data set.

Pres Videos\Jerry\vlasic-2005-ftm-sing.mp4

Methods

Acquire and Categorize
Learn
Generate

Acquire and Categorize

Three data sets are needed to fill the model space
– Set of many neutral faces
– Set of one face in many poses
– Set of visemes with reference face
Vertex correspondence
User "rates" attributes of each face
Video

Acquire and Categorize

Learn

[Figure: Reference Face with Expression deformation, Viseme deformation, and Type deformation; one factor labeled $R_k$]

$v'_{k,j} = Q_k S_k R_k \, v_{k,j}$

Analyze each triangle and transform type separately
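The per-triangle idea on this slide can be sketched as composing three 3x3 transforms and applying them to one triangle of the reference face. The names `Q`, `S`, and `R` follow the symbols on the slide; the composition order, the reading of `v` as a triangle edge vector, and the helper `deform_triangle` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def deform_triangle(Q, S, R, edges):
    """Apply v' = Q S R v to every edge vector of one triangle.

    edges: array of shape (n_edges, 3), one edge vector per row.
    """
    M = Q @ S @ R          # combined 3x3 transform for this triangle
    return edges @ M.T     # row-wise equivalent of M @ v

# Hypothetical factors: R rotates 90 degrees about z, S scales by 2, Q is identity.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
S = 2.0 * np.eye(3)
Q = np.eye(3)

# Two edge vectors of a reference-face triangle (toy values).
edges = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
deformed = deform_triangle(Q, S, R, edges)
```

Keeping one small matrix per triangle and per transform type is what lets each factor be analyzed separately before they are combined.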

Learn

[Figure: polygons × individuals data reduced to a low-dimensional subspace (PCA)]

Compare each pose to reference face
Principal Component Analysis (PCA)
– Apply to each axis of variation
– Analyze transformation of every face in mesh
Infer variation of single attribute from combination of many
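A minimal PCA sketch may help make the subspace step concrete. It assumes each row of `X` holds the flattened per-polygon transformation data for one individual or pose; the row layout, the toy data, and the helper names `pca`, `project`, and `reconstruct` are all hypothetical.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD. X has shape (n_samples, n_features)."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center the data
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                  # principal directions (rows)
    variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, components, variance

def project(X, mean, components):
    """Coordinates of each sample in the low-dimensional subspace."""
    return (X - mean) @ components.T

def reconstruct(coeffs, mean, components):
    """Map subspace coordinates back to full-size feature vectors."""
    return coeffs @ components + mean

# Toy data: 20 samples of 9 features that truly vary along only 2 directions.
rng = np.random.default_rng(0)
directions = rng.normal(size=(2, 9))
X = rng.normal(size=(20, 2)) @ directions + rng.normal(size=9)

mean, components, variance = pca(X, n_components=2)
coeffs = project(X, mean, components)
X_back = reconstruct(coeffs, mean, components)
```

In this toy case two components reproduce the data exactly because it was built with two degrees of freedom; real scan data would need a truncation chosen from the variance spectrum.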

Generate

Same sliders as categorization UI
Generate any combination of attributes
Runs in real-time
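One plausible way to connect sliders to geometry is a linear least-squares fit from the user ratings to each face's offset from the reference face. The slides do not state how the mapping is learned, so the regression, the function names `fit_slider_model` and `generate_face`, and the synthetic data below are assumptions for illustration only.

```python
import numpy as np

def fit_slider_model(ratings, offsets):
    """Least-squares map B such that ratings @ B approximates offsets."""
    B, *_ = np.linalg.lstsq(ratings, offsets, rcond=None)
    return B

def generate_face(reference, sliders, B):
    """One vector-matrix product per pose, cheap enough to evaluate per frame."""
    return reference + sliders @ B

rng = np.random.default_rng(1)
n_faces, n_attrs, n_coords = 12, 3, 9
ratings = rng.uniform(-1.0, 1.0, size=(n_faces, n_attrs))  # hypothetical user ratings
true_B = rng.normal(size=(n_attrs, n_coords))              # synthetic ground truth
offsets = ratings @ true_B                                 # face minus reference face
reference = np.zeros(n_coords)                             # toy reference face

B = fit_slider_model(ratings, offsets)
new_face = generate_face(reference, np.array([0.5, 0.0, -0.25]), B)
```

Under this assumption, moving a slider changes a single entry of `sliders`, so attributes can be mixed freely and the cost per generated pose stays constant.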

Results

Conclusion

Realistic face poses from real-world basis data
Arbitrary faces from sparse data set
Future Work
– Use high res data to drive low res morphing
– Incorporate more biologically accurate face model
