Tracking and Retargeting Facial Performances with a Data-Augmented Anatomical Prior

Michael Bao, Ronald Fedkiw
Stanford University

Abstract

Blendshape rigs are crucial to state-of-the-art facial animation systems at high-end visual effects studios. These models are used as geometric priors for tracking facial performances and are critical for retargeting a performance from an actor/actress to a digital character. However, the limited, linear, and non-physical nature of the blendshape system causes many inaccuracies, which result in an “uncanny valley”-esque performance. We instead propose the use of an anatomically-based model. The anatomical model can be used to target the facial performance given by the actor/actress; however, unlike blendshapes, the non-linear, simulation-based anatomical model preserves physical properties such as volume, yielding superior results. The model can furthermore be augmented with captured data to better capture the subtle nuances of the face. The captured facial performances on the anatomical model can then be easily transferred to any digital character with a corresponding anatomical model in a semantically meaningful manner.

Previous Work

Figure 1: Top Left: An artist-sculpted blendshape pose for an actor. Top Middle: The corresponding pose obtained by simulating using modified muscle tracks. Top Right: The modified muscle tracks captured from the blendshape pose. Bottom Left: The creature pose obtained by retargeting the modified muscle tracks and simulating. Bottom Right: The retargeted muscle tracks. [1]

References

[1] Matthew Cong, Kiran S. Bhat, and Ronald Fedkiw. Art-directed muscle simulation for high-end facial animation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 119–127. Eurographics Association, 2016.

[2] Thabo Beeler and Derek Bradley. Rigid stabilization of facial expressions. ACM Transactions on Graphics (TOG), 33(4):44, 2014.

[3] Matthew M. Loper and Michael J. Black. OpenDR: An approximate differentiable renderer. In European Conference on Computer Vision, pages 154–169. Springer, 2014.

Capture

We will utilize a three-camera setup to record RGB video data of facial performances. Each frame triplet will be used to reconstruct a point cloud/mesh of the face. This data, along with freely available 3D facial databases, will be at the core of each step of our algorithm.

Figure 2: RGB images captured for a facial performance.

Figure 3: Point cloud reconstructed from the RGB images.
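As a rough illustration of the reconstruction step, the sketch below triangulates matched 2-D points from one calibrated camera pair using OpenCV; the projection matrices and pixel correspondences are assumed to come from an offline calibration and stereo-matching stage (a minimal stand-in, not our full three-camera pipeline).

```python
import numpy as np
import cv2

def triangulate_pair(P1, P2, pts1, pts2):
    """P1, P2: 3x4 camera projection matrices; pts1, pts2: Nx2 pixel coords."""
    # OpenCV expects 2xN arrays of image points.
    X_h = cv2.triangulatePoints(P1, P2,
                                pts1.T.astype(np.float64),
                                pts2.T.astype(np.float64))  # homogeneous 4xN
    return (X_h[:3] / X_h[3]).T  # Nx3 Euclidean points
```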

Rigid Tracking

Given a point cloud that contains a face, local geometric features can be used to determine an initial alignment. Rigid Iterative Closest Point (ICP) algorithms can then refine and propagate the fit through time in a temporally consistent manner.

$$\min_{R,\,\vec{t}} \; \frac{1}{2} \sum_{i}^{n} \left\lVert R\,\vec{p}^{\,i}_s + \vec{t} - \vec{p}^{\,i}_t \right\rVert_2^2 \qquad (1)$$
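For a fixed set of correspondences, Eq. (1) has a closed-form solution (the Kabsch/Procrustes step that rigid ICP alternates with closest-point matching). A minimal NumPy sketch:

```python
import numpy as np

def rigid_fit(ps, pt):
    """ps, pt: Nx3 corresponding source/target points. Returns R (3x3), t (3,)."""
    cs, ct = ps.mean(axis=0), pt.mean(axis=0)
    H = (ps - cs).T @ (pt - ct)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = ct - R @ cs
    return R, t
```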

Figure 4: Manual rigid tracking of the point cloud.

Anatomical Prior: As seen in [2], the cranium can be used as an additional anatomical constraint in the rigid tracking minimization to produce better results.

Data Augmentation: Alignment/registration algorithms often rely on detecting persistent local features as correspondences. Since we are dealing exclusively with the face, we can train a detector to identify prominent features such as the nose and the corners of the eyes.
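As a stand-in for such a detector, the sketch below uses dlib's pretrained 68-point facial landmark predictor to recover sparse features (nose tip, eye corners, etc.) that can seed the rigid alignment; a production system would instead train a detector on our own captures.

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# dlib's standard pretrained landmark model.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(gray):
    """gray: 8-bit grayscale image. Returns 68x2 pixel coordinates or None."""
    faces = detector(gray, 1)  # upsample once to find smaller faces
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return np.array([[p.x, p.y] for p in shape.parts()])
```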

Non-Rigid Tracking

$$\min_{\vec{\theta}} \; E_{\mathrm{sfs}} + E_{\mathrm{reconstruction}} + E_{\mathrm{prior}} \qquad (2)$$

The goal of non-rigid tracking is to determine the anatomical simulation parameters ($\vec{\theta}$) that cause the face simulation to match the RGB images ($E_{\mathrm{sfs}}$) and the 3-D point cloud ($E_{\mathrm{reconstruction}}$) while being limited by anatomical constraints ($E_{\mathrm{prior}}$). Using a combination of shape-from-shading techniques (e.g., via a differentiable renderer such as OpenDR [3]) and the 3-D reconstructed point cloud will allow us to accurately match the given data. The anatomical prior will prevent the shape from going “off-model.”
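A minimal sketch of Eq. (2) as a black-box optimization over the simulation parameters; `simulate`, `E_sfs`, `E_reconstruction`, `E_prior`, and `num_params` are hypothetical placeholders for the anatomical simulation, the three energy terms, and the parameter count, and a real system would supply analytic gradients:

```python
import numpy as np
from scipy.optimize import minimize

def total_energy(theta, rgb, cloud):
    mesh = simulate(theta)                   # hypothetical: run the anatomical face simulation
    return (E_sfs(mesh, rgb)                 # hypothetical shading term (e.g., OpenDR renders)
            + E_reconstruction(mesh, cloud)  # hypothetical point-cloud distance term
            + E_prior(theta))                # hypothetical anatomical prior term

theta0 = np.zeros(num_params)                # hypothetical parameter count
result = minimize(total_energy, theta0, args=(rgb, cloud), method="L-BFGS-B")
theta_star = result.x
```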

Data Augmentation: By using the method from [1], our anatomical model will have the expressiveness necessary to target data. However, oftentimes the 2-D and/or 3-D data will be unreliable due to adverse lighting conditions, motion blur, occlusions, etc. Extra robustness can be baked into the anatomical prior by training it on a large number of poses captured in a controlled environment.
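One plausible form for such a data-trained prior (our assumption for illustration, not the poster's exact formulation) is to fit a low-dimensional PCA subspace to the parameters recovered from the controlled captures and penalize a new parameter vector's distance from that subspace:

```python
import numpy as np

def fit_pose_subspace(thetas, k):
    """thetas: M x P matrix of parameter vectors from controlled captures."""
    mean = thetas.mean(axis=0)
    _, _, Vt = np.linalg.svd(thetas - mean, full_matrices=False)
    return mean, Vt[:k]                  # mean and k x P orthonormal basis

def E_prior(theta, mean, basis, weight=1.0):
    d = theta - mean
    resid = d - basis.T @ (basis @ d)    # component outside the learned subspace
    return weight * float(resid @ resid) # squared distance to the subspace
```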

Retargeting

Blendshapes are commonly used to retarget facial performances. An animation created on one blendshape rig can be directly transferred to an identical blendshape rig for another model. However, the target model is oftentimes a digital actor for which one cannot capture shapes. As a result, artists must spend time creating shapes that may or may not be physically plausible. Furthermore, the linear nature of blendshapes results in “uncanny valley”-esque performances.

As a result, we propose the use of anatomical models for retargeting as in [1]. They use deformation transfer to control the muscle deformations on the target model. A simulation is then applied on top of the target model’s flesh to introduce nonlinearities into the result. By introducing physical constraints such as volume preservation and collision, they are able to generate significantly improved results in the area around the lips.
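A hedged sketch of the deformation-transfer idea (in the spirit of Sumner and Popović), simplified by assuming the source and target muscle meshes share triangulation: per-triangle deformation gradients measured on the source are imposed on the target's edges via a sparse least-squares solve. The real pipeline in [1] additionally simulates flesh on top of the transferred muscles.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def triangle_frame(v0, v1, v2):
    """Two edges plus a scaled normal span a local 3x3 frame (Sumner-style)."""
    e1, e2 = v1 - v0, v2 - v0
    n = np.cross(e1, e2)
    n /= np.sqrt(np.linalg.norm(n))
    return np.column_stack([e1, e2, n])

def transfer(src_rest, src_def, tgt_rest, tris):
    """All meshes Nx3 with a shared triangle list `tris` (Tx3 vertex indices)."""
    rows, cols, vals, rhs = [], [], [], []
    r = 0
    for i, j, k in tris:
        # Source deformation gradient for this triangle.
        S = triangle_frame(*src_def[[i, j, k]]) @ np.linalg.inv(
            triangle_frame(*src_rest[[i, j, k]]))
        for a, b in ((i, j), (i, k)):        # impose S on both target edges
            e = S @ (tgt_rest[b] - tgt_rest[a])
            for d in range(3):               # x[b,d] - x[a,d] = e[d]
                rows += [r, r]
                cols += [3 * b + d, 3 * a + d]
                vals += [1.0, -1.0]
                rhs.append(e[d])
                r += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(r, 3 * len(tgt_rest)))
    x = spla.lsqr(A, np.array(rhs))[0].reshape(-1, 3)
    # The edge-only system is translation-invariant; pin the global position.
    return x - x.mean(axis=0) + tgt_rest.mean(axis=0)
```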

Anatomical Model

An anatomical model can be generated for an actor by morphing from a template model.
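One common way to implement such a morph (the poster does not specify its exact method) is to fit a thin-plate-spline warp to sparse landmark correspondences between the template head and the actor's scan, then apply the same warp to every template point, including the interior anatomy (cranium, jaw, muscles):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator  # SciPy >= 1.7

def morph_to_actor(template_landmarks, actor_landmarks, template_points):
    """Landmark arrays: L x 3 correspondences; template_points: any N x 3 geometry."""
    warp = RBFInterpolator(template_landmarks, actor_landmarks,
                           kernel="thin_plate_spline")
    return warp(template_points)  # warped N x 3 positions
```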

Figure 5: Top: The simulation surface for the actor. Middle: The underlying cranium and jaw. Bottom: The underlying facial muscles.

Future Work

Our lab is pursuing three projects focused on coupling simulation and computer vision: cloth, trees, and faces. We believe that the use of real-world data can add a level of detail previously not found in computer graphics. As a result, our goal is to develop a general model for combining data with simulation models and to apply this technique to a wider variety of projects, such as fluid simulations.

Contact Information

Web: http://physbam.stanford.edu/~fedkiw/
Email: {mikebao, rfedkiw}@stanford.edu