Synthesis of 3D faces using region-based morphing under intuitive control
By Yu Zhang* and Norman I. Badler
This paper presents a new region-based method for automatically synthesizing varied,
natural looking 3D human face models by morphing local facial features according to
intuitive control parameters. We automatically compute a one-to-one vertex correspondence
among the unregistered face scans in a large database by deforming a generic mesh to fit
the specific person’s face geometry in a global-to-local fashion. With the obtained
correspondence, we transform the generated data sets of feature shapes into vector space
representations. We parameterize the example models using the face anthropometric
measurements that reflect the facial physical structure, and predefine the interpolation
functions for the parameterized example models based on radial basis functions. At runtime,
the interpolation functions are evaluated to efficiently generate the appropriate feature shape
by taking the anthropometric parameters as input. We use a shape blending approach
for generating a seamlessly deformed mesh around the feature region boundary. The
correspondence among all example textures is established by parameterizing the 3D generic
mesh over a 2D image domain. The new feature texture with desired attributes is synthesized
by interpolating the example textures. Our 3D face synthesis method has several advantages:
(1) it regulates the naturalness of synthesized faces, maintaining the quality existing in the
real face examples; (2) the region-based morphing and comprehensive face shape and texture
control parameters allow more diverse faces to be generated readily; and (3) the automatic
runtime face synthesis is efficient in time complexity and performs fast. Copyright © 2006 John Wiley & Sons, Ltd.
Received: 10 April 2006; Revised: 2 May 2006; Accepted: 10 May 2006
KEY WORDS: face modeling; facial features; region-based morphing; anthropometry; texture; 3D scanned data
Introduction
Generation of realistic human face models is one of the
most interesting problems in computer animation.
Many applications such as character animation for films
and advertisement, computer games, video teleconfer-
ence, user-interface agents, and avatars require a large
number of different faces. However, generating diverse
3D face models for these applications is a difficult and
time-consuming task, particularly if realism is desired.
One avenue for creating 3D human face models is
manual construction by deforming an existing model or
having an artist design one from scratch. Without a
proper parameterization closely tied to the facial
physical structure and constraints from real human
faces, it usually requires a great deal of expertise and
time-consuming manual control to avoid unrealistic
results. With a significant increase in the quality and
availability of 3D capture methods, a common approach
towards creating face models of real humans uses laser
range scanners to acquire both the face geometry and
texture simultaneously. The scanning technology
coupled with software for model reconstruction works
well for capturing static faces, but often requires
significant effort to process the noisy and incomplete
captured surface data. More limiting, however, is that
the resulting model corresponds to a single individual and is difficult to modify automatically to yield a realistic, novel face.
COMPUTER ANIMATION AND VIRTUAL WORLDS
Comp. Anim. Virtual Worlds 2006; 17: 421–432
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cav.145

*Correspondence to: Y. Zhang, Center for Human Modeling and Simulation, Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA 19104-6389, USA. E-mail: [email protected]
The reported psychophysical evidence1 suggests that
internal facial features (eyes, nose, mouth and chin) are
important for discriminating faces. Thus, we construct a
face synthesis system by perceiving the face as a set of
feature regions. Such a region-based face synthesis
allows us to generate more diverse faces through
various combinations of the synthesized features. In
this paper, we present a new region-based method for
automatically synthesizing varied realistic face models
by morphing local facial features according to intuitive
control parameters. Our method takes as examples 3D
scanned data to exploit the variations of the real human
faces. We use a three-step model fitting approach for the
3D registration problem. The obtained correspondence
enables the application of principal component analysis
(PCA) to exemplar shapes of each facial feature to build
a local shape space. We parameterize the example
models using the face anthropometric measurements,
and predefine the interpolation functions for the
parameterized example models based on radial basis
functions. At runtime, the interpolation functions are
evaluated to efficiently generate the appropriate feature
shapes by taking the anthropometric parameters as
input. We use a shape blending approach to generate a
seamlessly deformed mesh around the feature region
borders. We determine correspondence among the
example textures by constructing a parameterization
of the 3D generic mesh over a 2D image domain. Having
the texture interpolators formulated, the runtime texture
synthesis becomes the interpolation function evaluation
according to the texture attribute parameters.
Previous and Related Work
Face modeling and animation is an active area of
research in computer graphics (see Reference [2] for an
excellent survey). For our purpose, we focus on
modeling of static faces which is directly related to
our work. In this category, several methods are
documented in the literature. The goal of the early parametric models3–6 was to create an encapsulated model that could generate a wide range of faces based on a small set of input parameters. However, manual parameter tuning without constraints from real human
faces for generating a realistic face is difficult and time-
consuming. Furthermore, the choice of the parameter set
depends on the face mesh topology and therefore the
manual association of a group of vertices to a specific
parameter is required. Control of face texture is also
ignored in these models.
The image-based technique7–12 utilizes an existing 3D
face model and information from a few pictures for the
reconstruction of both geometry and texture. Although
this technique can provide reconstructed face models
easily, its drawbacks are the inaccurate geometry
reconstruction and inability to generate new faces
without image counterparts.
DeCarlo et al.13 construct a range of face models using a variational constrained optimization technique with anthropometric measurements as constraints. However, this approach requires one minute of computation for the optimization process to generate a new face and lacks texture control. In contrast, our method is
efficient to generate a face with desired shape and
texture within a second. Moreover, we utilize the prior
knowledge of the face shape in relation with the given
measurements to regulate the naturalness of modeled
faces, maintaining the quality that exists in the real faces
of individuals. Kahler et al.14 use statistical data of face
anthropometric measurements to drive the landmark-
based face deformation according to growth and aging.
Our goal is to rapidly generate variation of the face
shape using anthropometric measurements as direct
control.
Blanz and Vetter15 present a process for estimating the
shape of a face in a single photograph, and a set of
controls for intuitive manipulation of appearance
attributes. There are several key differences from our
work. First, they manually assign the attribute values to
face shape and texture, and devise attribute controls for
a single variable using linear regression. We automati-
cally compute the anthropometric measurements for
face shape and relate several variables simultaneously
by learning a mapping between the measurement space
and the shape space through scattered data interp-
olation. Second, they use a 3D variant of a gradient-
based optical flow algorithm to derive the point-to-point
correspondence. This approach will not work well for
faces of different races or in different illumination given
the inherent problem of using static textures. We
present a robust method of determining correspon-
dences that does not depend on the texture information.
Third, our goal here is to synthesize faces from direct
control by perceiving the face as a set of independent
features.
Example-based synthesis is another stream of
research related to our method. Rose et al.16 and Sloan
et al.17 propose example-based motion blending frame-
works, employing scattered data interpolation. Lewis
et al.18 introduce an example-based pose space defor-
mation technique, and Allen et al.19 apply a similar
technique to range scan data for creating a new pose
model. Allen et al.20 and Seo et al.21 present methods for
generating a variety of human body shapes with
intuitive parameters based on unorganized scanned
data. Our method differs from these approaches in the
following respects. First, we efficiently identify feature
points for scanned data with much less user interaction
and use a robust model fitting approach. Second, our framework is capable of generating variation of the
model texture. Third, we use a more comprehensive set
of control parameters to characterize the model
appearance. Last but not least, we focus on modeling
the face, especially facial features. We present effective
algorithms to address the problem of geometry and
texture blending that arises in this context.
Face Data and Feature Point Identification
We use the USF face database22 that consists of Cyberware scans of 186 human faces (126 male and 60 female) with a mixture of races and ages. Each subject is captured wearing a bathing cap and with a neutral expression. The laser scans provide face structure data (see Figure 1(a)) and a 360 × 524 RGB image for texture mapping (see Figure 1(b) and (c)). We use a generic head model created with Autodesk Maya (see Figure 1(h)). It consists of 2274 triangles.
Let each 3D face in the database be $F_i$ ($i = 1, \ldots, M$). Since the number of vertices in $F_i$ varies, we resample all faces in the database so that they have the same number of vertices, all in mutual correspondence. Feature points are identified to guide the resampling. In our method, the feature points are identified semi-automatically. Figure 1(d)–(g) depicts the process. A 2D feature template consisting of polylines groups a set of 83 feature points that correspond to the facial features such as the eyes, eyebrows, nose, mouth, and face outline. It is superimposed onto the front-view face image obtained by orthographic projection of a textured face scan. The facial features in this image are identified using Active Shape Models (ASM)23 and the feature template is fitted to the features automatically. A little user interaction is needed to tune the feature point positions, owing to slight inaccuracies in the automatic facial feature detection. The 3D positions of the feature points on the scanned surface are then recovered by re-projection to 3D space. In this way, we efficiently define a set of feature points in a face $F_i$ as $U_i = \{u_{i,1}, \ldots, u_{i,n}\}$, where $n = 83$. Our generic model $G$ is already tagged with the corresponding set of feature points $V = \{v_1, \ldots, v_n\}$ by default.
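As an aside on the re-projection step: since the front view is an orthographic projection, the 2D-to-3D recovery can be sketched in a few lines. The following is our own illustration (array names hypothetical, not the paper's code); it uses a nearest-vertex query where an exact implementation would intersect a viewing ray with the scanned mesh.

```python
import numpy as np

def reproject_to_3d(feature_2d, scan_vertices):
    """Recover 3D positions of 2D feature points located in the
    orthographic front view (assumed to be the x-y plane) of a
    textured face scan.

    feature_2d:    (n, 2) array of 2D feature point positions.
    scan_vertices: (V, 3) array of scanned surface vertices.
    """
    xy = scan_vertices[:, :2]
    points_3d = np.empty((len(feature_2d), 3))
    for i, p in enumerate(feature_2d):
        # Nearest scan vertex in the projection plane; its depth (z)
        # supplies the recovered coordinate. A ray/mesh intersection
        # would be more exact on sparse meshes.
        j = np.argmin(np.sum((xy - p) ** 2, axis=1))
        points_3d[i] = scan_vertices[j]
    return points_3d
```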
Figure 1. Face data and semi-automatic feature point identification: (a) scanned face geometry; (b) acquired color image; (c)
textured face scan; (d) initial outline of the feature template; (e) after automatic facial feature detection; (f) after interactive user
tuning; (g) and (h) feature points identified on the scanned data and generic model, respectively.
Model Fitting
The problem of deriving full correspondence for all models $F_i$ can be stated as: resample the surface of each $F_i$ using $G$ under the constraint that $v_j$ is mapped to $u_{i,j}$. We define an RBF interpolation function that gives a mapping between $v_j$ and $u_{i,j}$:

$$f(x) = \sum_{j=1}^{n} w_j \phi_j(\|x - v_j\|) + Mx + t \qquad (1)$$

where $x \in \mathbb{R}^3$ is a vertex on the generic head model and $\phi$ is a radial basis function. $w_j$, $M$, and $t$ are the unknown parameters: $w_j \in \mathbb{R}^3$ are the interpolation weights, $M \in \mathbb{R}^{3 \times 3}$ represents the rotation and scaling transformations, and $t \in \mathbb{R}^3$ represents the translation transformation. Many different functions for $\phi(r)$ have been proposed.24 We obtained better results with the multiquadric function $\phi(r) = \sqrt{r^2 + \rho^2}$, where $\rho$ is the locality parameter, determined as the Euclidean distance to the nearest other feature point. To determine $w_j$, $M$, and $t$, we solve the following equations:

$$u_{i,j} = f(v_j)\big|_{j=1}^{n}, \qquad \sum_{j=1}^{n} w_j = 0, \qquad \sum_{j=1}^{n} w_j^T v_j = 0 \qquad (2)$$

This system of linear equations is solved using an LU decomposition. The generic model is then deformed by feeding all its vertices into Equation 1 (see Figure 2(a)).
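Fitting Equations (1)–(2) reduces to a single linear solve. The sketch below is our reading of the system in numpy/scipy (names are ours, and the side conditions are written in the usual componentwise form); the LU factorization mirrors the solver choice stated above.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def fit_rbf_warp(v, u):
    """Fit the warp of Equation (1) mapping generic feature points
    v (n, 3) to scan feature points u (n, 3), and return a callable
    that warps arbitrary vertices.
    """
    n = len(v)
    d = np.linalg.norm(v[:, None] - v[None, :], axis=2)        # (n, n)
    # rho_j: distance from v_j to its nearest other feature point.
    rho = np.where(np.eye(n, dtype=bool), np.inf, d).min(axis=1)
    phi = np.sqrt(d ** 2 + rho[None, :] ** 2)                  # multiquadric basis

    # The affine part Mx + t enters through [x | 1] columns; the
    # extra zero rows enforce sum(w) = 0 and sum(w x^T) = 0.
    P = np.hstack([v, np.ones((n, 1))])
    A = np.zeros((n + 4, n + 4))
    A[:n, :n], A[:n, n:], A[n:, :n] = phi, P, P.T
    b = np.vstack([u, np.zeros((4, 3))])
    coef = lu_solve(lu_factor(A), b)                           # LU solve, (n+4, 3)
    w, affine = coef[:n], coef[n:]

    def warp(x):
        r = np.linalg.norm(x[:, None] - v[None, :], axis=2)
        basis = np.sqrt(r ** 2 + rho[None, :] ** 2)
        return basis @ w + np.hstack([x, np.ones((len(x), 1))]) @ affine

    return warp
```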
We further improve the shape using a local defor-
mation which ensures that all the generic mesh vertices
are truly embedded in the scanned surface. The local
deformation is based on the closest points on the
surfaces of the generic model and the scanned data. The
vertices of the generic model are displaced towards their
closest positions on the surface of the scanned data. The
polygons of the scanned data are organized into a binary
space partition tree to speed up the process of the closest
point identification. Each generic mesh vertex takes the texture coordinates of its sampling point. Figure 2(b) and (c) show the result of the local deformation.
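A compact stand-in for this local deformation step (our sketch, assuming scipy is available): the paper queries the closest point on the scan surface through a BSP tree over its polygons, whereas this version snaps to the closest scan vertex, which is adequate for dense scans.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_deformation(generic_vertices, scan_vertices):
    """Displace each globally warped generic-mesh vertex to its
    closest sample on the scan; the returned indices can also be
    used to copy the texture coordinates of the sampling points.
    """
    tree = cKDTree(scan_vertices)          # spatial index over the scan
    _, idx = tree.query(generic_vertices)  # closest scan vertex per vertex
    return scan_vertices[idx], idx
```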
In order to accurately represent the high-resolution
surface detail, we use a quaternary subdivision scheme
to construct the subdivision hierarchy on top of the base
mesh resulting from the local deformation process. The
vertices newly generated by subdivision do not necessarily lie on the scanned surface. We employ
a normal mesh25 to represent a hierarchy of surface
details. For each newly generated vertex, we compute
the distance from it along its normal direction
to the nearest point on the scanned surface. By apply-
ing the computed distances as displacements to the
vertices, the subdivision mesh is refined (see
Figure 2(d)). In our experiments, normal meshes up
to level 2 are used for a good trade-off between
approximation accuracy and computational cost.
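The displacement computation for the normal mesh can be sketched as follows (ours, with hypothetical names, and a simplification: a faithful implementation intersects each vertex's normal ray with the scanned surface, while this version keeps only the normal component of the closest-point offset).

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_mesh_detail(new_vertices, normals, scan_vertices):
    """Scalar detail coefficients for subdivision-inserted vertices:
    one signed distance along each unit vertex normal, applied as a
    displacement to refine the subdivision mesh.
    """
    tree = cKDTree(scan_vertices)
    _, idx = tree.query(new_vertices)
    offsets = scan_vertices[idx] - new_vertices
    d = np.einsum('ij,ij->i', offsets, normals)   # signed normal distances
    return new_vertices + d[:, None] * normals, d
```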
Region-Based Face Shape Morphing
Forming Local Shape Spaces
To morph local shapes of facial features, we form local
shape spaces using PCA. The model fitting process
generates the necessary vertex-to-vertex correspon-
dence across 3D faces in the database, which is the
prerequisite of PCA.

Figure 2. Model fitting: (a) generic model after global warping; (b) after local deformation; (c) textured; (d) level-by-level model refinement using normal mesh; (e) meshes of four features decomposed from the level 2 normal mesh shown in (d).

We divide faces into subregions that can be morphed independently. Since all face scans
are in correspondence through mapping onto the
generic model, it is sufficient to define these regions
only on the generic model. We partition the generic
mesh into four regions: eyes, nose, mouth, and chin. The
segmentation is transferred to the multi-resolution
normal meshes to generate individualized feature
shapes with correspondence (see Figure 2(e)). Note
that in order to isolate the shape variation from the
position variation, we normalize all face scans with
respect to the rotation and translation of the face before
the model fitting step. Thus, PCA can be performed
directly on the obtained data sets of feature shapes.
Given the set $\{F\}$ of features, we obtain a compact representation for the meshes of each facial feature using PCA. Let $\{F_i\}_{i=1,\ldots,M}$ be a set of example meshes of feature $F$, each mesh being associated with one of the $M$ meshes of the database. These meshes are represented as vectors that contain the $x$, $y$, $z$ coordinates of the $n_F$ vertices: $F_i = (x_{i1}, y_{i1}, z_{i1}, \ldots, x_{in_F}, y_{in_F}, z_{in_F}) \in \mathbb{R}^{3n_F}$. Each mesh can be expressed as a linear combination of $M + 1$ meshes $\{\Phi_j^F\}_{j=0,\ldots,M}$:

$$F_i = \Phi_0^F + \sum_{j=1}^{M} \alpha_{ij}^F \Phi_j^F \qquad (3)$$

where $\Phi_0^F$ is the mean shape and the $\Phi_j^F$ are the eigenvectors of the covariance matrix of the set $\{F_i - \Phi_0^F\}$. By truncating the expansion of Equation 3 at $j = k_F$ we introduce an error whose magnitude decreases as $k_F$ is increased. We choose $k_F$ such that $\sum_{j=1}^{k_F} \lambda_j^F \geq \tau \sum_{j=1}^{M} \lambda_j^F$, where $\tau$ defines the proportion of the total variation retained (98% for each feature in our experiments).

Each eigenvector is a new coordinate axis for our existing data; thus each feature mesh can be restated as a point in the space spanned by the PCA-yielded orthogonal mesh basis. We call these axes eigenmeshes. $\alpha_{ij}^F$ ($j = 1, \ldots, k_F$) give the coordinates of the feature mesh in terms of the reduced eigenmesh basis.
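The construction of a local shape space is standard PCA. A sketch of how we would realize it (via the SVD of the centered data matrix, which avoids forming the large $3n_F \times 3n_F$ covariance matrix explicitly):

```python
import numpy as np

def build_shape_space(meshes, tau=0.98):
    """Form a local shape space (Equation (3)) from example feature
    meshes given as an (M, 3*n_F) array of flattened coordinate
    vectors, one row per registered example.

    Returns the mean shape, the k_F eigenmeshes retaining a fraction
    tau of the total variance, and each example's coordinates in the
    reduced eigenmesh basis.
    """
    mean = meshes.mean(axis=0)
    X = meshes - mean
    # Right singular vectors of X are the covariance eigenvectors;
    # squared singular values are proportional to the eigenvalues.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    lam = s ** 2
    k = int(np.searchsorted(np.cumsum(lam) / lam.sum(), tau)) + 1
    basis = Vt[:k]            # (k_F, 3*n_F) eigenmeshes
    alpha = X @ basis.T       # (M, k_F) eigenmesh coordinates
    return mean, basis, alpha
```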
Anthropometric Parameters
Although eigenmeshes represent the most salient
directions of the feature shape variation in the dataset,
they bear little resemblance to the underlying inter-
dependent structure of biological forms. Arguably, face
anthropometry provides a set of meaningful measure-
ments or shape parameters that allow the most complete
control over the shape of the face. Anthropometric
study26 describes a set of 132 measurements for
characterizing the human face. The measurements are
taken between the landmarks defined in terms of
visually-identifiable or palpable features on the subject’s
face. Such measurements use a total of 47 landmarks for
describing the face. Following the conventions laid out
in Reference [26], we have chosen a subset of 38
landmarks from the standard landmark set for anthro-
pometric measurements (see Figure 3).
Instead of supporting all 132 measurements, in this
paper, we are only concerned with those related to four
facial features. The example models are placed in the
standard posture for measurements. The measurements
are computed using the Euclidean coordinates of
landmarks. In particular, the axial distances correspond to the x, y, and z axes of the world coordinate system. Such a systematic collection of anthropometric measurements is taken across all example models to determine their locations in a multi-dimensional measurement space.
Feature Shape Synthesis
From the previous stage we obtain a set of examples of
each feature with measured shape characteristics, each
of them consisting of the same set of dimensions, where
every dimension is an anthropometric measurement.
We assume that an example model $F_i$ of feature $F$ has $m_F$ dimensions, where each dimension is represented by a value in the interval $(0, 1]$. A value of 1 corresponds to the maximum measurement value of the dimension; that is, the example measurements are normalized. The measurements of $F_i$ can then be represented by the vector

$$q_i^F = [q_{i1}^F, \ldots, q_{im_F}^F], \qquad \forall j \in [1, m_F]: \ q_{ij}^F \in (0, 1] \qquad (4)$$
Figure 3. Anthropometric landmarks (green dots). The land-
mark names are taken from Reference [26].
This is equivalent to projecting each example model $F_i$ into a measurement space spanned by the $m_F$ selected anthropometric measurements.
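For concreteness, the projection into the normalized measurement space might look as follows. This sketch (ours) assumes each measurement is a Euclidean distance between a pair of landmarks; the axial distances mentioned above would instead take single-coordinate differences.

```python
import numpy as np

def measurement_vectors(landmarks, pairs):
    """Compute normalized measurement vectors (Equation (4)).

    landmarks: (M, L, 3) array of 3D anthropometric landmarks for
               M example models.
    pairs:     list of (a, b) landmark index pairs, one per
               measurement dimension.
    """
    a, b = map(list, zip(*pairs))
    # (M, m_F) raw distances between landmark pairs.
    q = np.linalg.norm(landmarks[:, a] - landmarks[:, b], axis=2)
    # Normalize per dimension so values lie in (0, 1], with 1 at the
    # maximum measurement value over the examples.
    return q / q.max(axis=0)
```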
With the input shape control thus parameterized, our
goal is to generate a new deformation of the generic
mesh by computing the corresponding eigenmesh
coordinates with control through the measurement
parameter. Given an input measurement $q^F$ in the measurement space, such controlled deformation should interpolate the example models. To do this, we interpolate the eigenmesh coordinates of the example models. The interpolation is multi-dimensional. Considering an $\mathbb{R}^{m_F} \rightarrow \mathbb{R}$ mapping, the interpolated eigenmesh coordinates $\alpha_j^F(\cdot) \in \mathbb{R}$, $1 \leq j \leq k_F$, at an input measurement vector $q^F \in \mathbb{R}^{m_F}$ are computed as:

$$\alpha_j^F(q^F) = \sum_{i=1}^{M} \gamma_{ij} R_i(q^F) \quad \text{for } 1 \leq j \leq k_F \qquad (5)$$

where $\gamma_{ij} \in \mathbb{R}$ are the radial coefficients and $M$ is the number of example models. Let $q_i^F$ be the measurement vector of an example model. The radial basis function $R_i(q^F)$ is a multiquadric function of the Euclidean distance between $q^F$ and $q_i^F$ in the measurement space:

$$R_i(q^F) = \sqrt{\|q^F - q_i^F\|^2 + \rho_i^2} \quad \text{for } 1 \leq i \leq M \qquad (6)$$

where $\rho_i$ is the locality parameter used to control the behavior of the basis function:

$$\rho_i = \min_{j \neq i} \|q_i^F - q_j^F\|, \quad i, j = 1, \ldots, M \qquad (7)$$

The $j$th eigenmesh coordinate of the $i$th example model, $\alpha_{ij}^F$, corresponds to the measurement vector of the $i$th example model, $q_i^F$. Equation 5 should be satisfied for $q_i^F$ and $\alpha_{ij}^F$. Hence, by substituting $q_i^F$ and $\alpha_{ij}^F$ for $q^F$ and $\alpha_j^F$ respectively in Equation 5, we have

$$\alpha_{ij}^F = \sum_{l=1}^{M} \gamma_{lj} R_l(q_i^F) \quad \text{for } 1 \leq j \leq k_F \qquad (8)$$

The $\gamma_{ij}$ are obtained by solving Equation 8 using an LU decomposition. We can now generate the eigenmesh coordinates, and hence the shape, corresponding to the input measurement vector $q^F$ according to Equation 5.
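Equations (5)–(8) amount to fitting one RBF interpolant per eigenmesh coordinate, all sharing a single basis matrix over the examples. A sketch of this stage as we read it (class and variable names are ours):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

class FeatureShapeInterpolator:
    """RBF interpolation of eigenmesh coordinates over the
    measurement space (Equations (5)-(8)).

    q_ex:  (M, m_F) normalized example measurements.
    alpha: (M, k_F) eigenmesh coordinates of the examples.
    """
    def __init__(self, q_ex, alpha):
        self.q_ex = q_ex
        d = np.linalg.norm(q_ex[:, None] - q_ex[None, :], axis=2)
        # rho_i: distance to the nearest other example (Equation (7)).
        self.rho = np.where(np.eye(len(q_ex), dtype=bool), np.inf, d).min(axis=1)
        R = np.sqrt(d ** 2 + self.rho[None, :] ** 2)    # Equation (6)
        self.gamma = lu_solve(lu_factor(R), alpha)      # Equation (8), LU solve

    def __call__(self, q):
        """Eigenmesh coordinates for a new measurement vector q."""
        r = np.linalg.norm(q - self.q_ex, axis=1)
        return np.sqrt(r ** 2 + self.rho ** 2) @ self.gamma   # Equation (5)
```

The synthesized feature mesh is then recovered by substituting the returned coordinates into Equation 3.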
Smooth Shape Blending
After the shape interpolation procedure, the surround-
ing facial areas should be blended with the deformed
facial features to generate a seamlessly smooth mesh.
The position of a vertex $x_i$ in the feature region $F$ after deformation is $x_i'$. Let $V$ denote the set of vertices of the head mesh. For smooth blending, positions of the subset $\bar{V}_F = V \setminus V_F$ of vertices that are not inside the feature region should be updated with the deformation of the facial features. For each vertex $x_j \in \bar{V}_F$, the vertex in each feature region that exerts influence on it, $x_{k_i}^F$, is the one of minimal distance to it, that is, $\|x_j - x_{k_i}^F\| = \min_{\{i \mid i \in V_F\}} \|x_j - x_i\|$. Note that the distance is measured offline in the original undeformed generic mesh. For each non-feature vertex $x_j$, the displacement vector of its corresponding closest feature vertex $x_{k_i}^F$ is used to update its position in shape blending. The displacement is weighted by an exponential fall-off function according to the distance between $x_j$ and $x_{k_i}^F$:

$$x_j' = x_j + \sum_{F \in G} \exp\left(-\frac{1}{a}\|x_j - x_{k_i}^F\|\right)\left(x_{k_i}'^F - x_{k_i}^F\right) \qquad (9)$$

where $G$ is the set of features and $a$ controls the size of the region influenced by the blending. We set $a$ to 1/10 of the diagonal length of the bounding box of the head model.
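A sketch of Equation (9) in code (ours, with hypothetical data structures: a mapping from each feature to its vertex index array, and the original rest-pose vertices used for all distance queries, as specified above):

```python
import numpy as np

def blend_surroundings(verts, verts_def, feature_regions, a):
    """Smooth shape blending (Equation (9)).

    verts:           (V, 3) original (undeformed) generic-mesh vertices.
    verts_def:       (V, 3) vertices after feature deformation; equal
                     to verts outside the feature regions.
    feature_regions: dict mapping feature name -> vertex index array.
    a:               fall-off size, e.g. 1/10 of the bounding-box diagonal.
    """
    out = verts_def.copy()
    all_feature = np.concatenate(list(feature_regions.values()))
    non_feature = np.setdiff1d(np.arange(len(verts)), all_feature)
    for idx in feature_regions.values():
        # Closest feature vertex (in the rest pose) per non-feature
        # vertex, and that vertex's displacement.
        diff = verts[non_feature][:, None, :] - verts[idx][None, :, :]
        d = np.linalg.norm(diff, axis=2)                 # (n_nf, n_f)
        k = d.argmin(axis=1)
        disp = verts_def[idx][k] - verts[idx][k]
        w = np.exp(-d[np.arange(len(non_feature)), k] / a)
        out[non_feature] += w[:, None] * disp            # Equation (9)
    return out
```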
Region-Based Face Texture Morphing
Mesh Parameterization
To morph textures of facial features, we form local texture spaces using PCA. Again, in general, applying PCA to
a set of face images requires normalization to remove
texture variation due to shape difference, and corre-
spondences must be found between face images. In our
case, however, correspondences between the two
textures are implicit in the texture coordinates of the
two associated face meshes. Since every face generated
from the one generic model shares the same layout of texture coordinates, we can produce shape-free face
textures by constructing a parameterization of the 3D
generic mesh over a 2D image plane.
Given the vertex-wise correspondence between a
fitted generic mesh (base mesh) and the original
undeformed generic mesh, it is trivial to transfer a
texture map between them. Each vertex on the original
generic mesh simply takes the texture coordinates of its
corresponding vertex on the base mesh for texture
mapping (see Figure 4(b)). We parameterize the 3D
generic head mesh over a 2D domain [0, 1]2 in order to
obtain a shape-free texture map. We project the original
generic mesh rendered with the transferred texture to a
2D image plane by implementing a cylindrical projec-
tion. The resulting cylindrical coordinates map to a
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2006 John Wiley & Sons, Ltd. 426 Comp. Anim. Virtual Worlds 2006; 17: 421–432
Y. ZHANG AND N. I. BADLER* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
suitable resolution cylindrical texture image (512� 512 in
our experiments) in which each pixel value represents
the surface color of the texture-mapped face surface in
cylindrical coordinates (see Figure. 4(c)). The generic
mesh can be textured with this cylindrical texture image
using normalized cylindrical coordinates as the texture
coordinates.
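The projection itself is a one-liner per coordinate. A sketch of the parameterization under our axis convention (head centered on a vertical y axis; the paper does not state its convention):

```python
import numpy as np

def cylindrical_uv(vertices):
    """Parameterize head-mesh vertices over [0, 1]^2 by cylindrical
    projection: azimuth around the y axis maps to u, height to v.
    Rendering the textured mesh with these coordinates into a
    512 x 512 image yields the shape-free cylindrical texture map.
    """
    x, y, z = vertices.T
    u = (np.arctan2(x, z) + np.pi) / (2.0 * np.pi)   # azimuth -> [0, 1]
    v = (y - y.min()) / (y.max() - y.min())          # height  -> [0, 1]
    return np.stack([u, v], axis=1)
```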
Forming Local Texture Spaces
As the generic mesh has been partitioned into four feature regions, the cylindrical texture image can be divided into corresponding texture patches (see Figure 4(d)). PCA is used to parameterize the local textures in a low-dimensional space. We represent the shape-free texture of a facial feature by a texture vector $I = (R_1, G_1, B_1, \ldots, R_n, G_n, B_n) \in \mathbb{R}^{3n}$ that contains the R, G, B color values, where $n$ is the number of pixels in the texture image. A texture model is then constructed using a data set of $M$ ($M \ll n$) exemplar local textures:

$$I = C_0 + \sum_{j=1}^{M} \beta_j C_j = C_0 + C\beta \qquad (10)$$

where $C_0$ is the mean texture and $C = (C_1 \mid \cdots \mid C_M)$ is the orthogonal texture basis consisting of eigentextures. The vector $\beta$ defines a set of parameters of a texture. Of the obtained $M$ eigentextures, we choose the top $k$ modes, which correspond to the largest eigenvalues.
Feature Texture Synthesis
While facial feature shapes can easily be related to
anthropometric measurements, texture attributes can
hardly be measured quantitatively. For each facial
feature, we define a set of distinct texture attributes to
build a texture attribute space where each attribute
represents an axis in the space. We manually assign to each example texture the attribute values (in the interval (0,1]) that describe the markedness of its attributes, projecting it into the texture attribute space. We
then map the high-level texture control parameters onto
the eigentexture coefficients through the scattered data
interpolation. Given a new set of input texture attribute
values, the desired characteristics can be synthesized on
the cylindrical full-head texture image (see Figure 4(e))
by blending the example local textures through RBF-
based interpolation. The reader is referred to Reference
[27] for detailed description of our full-head texture
generation technique.
Smooth Texture Blending
We perform a gradual blend with the surrounding area
for region-based texture morphing. The pixels on the
outermost ring of the feature region are grouped into
the boundary pixel set $P^0 = \{p_1^0, \ldots, p_{n_0}^0\}$, where $n_0$ is the number of boundary pixels. We then identify $N$ rings of pixels around the feature region to define the blending region (see Figure 4(e)). We denote by $C(P^0) = \{C(p_1^0), \ldots, C(p_{n_0}^0)\}$ and $C'(P^0) = \{C'(p_1^0), \ldots, C'(p_{n_0}^0)\}$ the color sets of boundary pixels before and after texture morphing, respectively. The change of the boundary pixel colors is used to update the colors of the set $P^j = \{p_k^j \mid k = 1, \ldots, n_j\}$ of pixels in the $j$th ring around the region, where $j$ ranges from 1 to $N$, and $n_j$ is
the number of pixels in the $j$th ring.

Figure 4. (a) Base mesh with texture mapping. (b) Texture transferred to the original undeformed generic model. (c) Cylindrical texture image. (d) Segmented textures of four features. (e) Facial feature regions and blending regions in the mean cylindrical full-head texture image: the local feature regions are in red with their boundaries in blue. The white areas are the texture blending regions. The overlap of two blending regions is in green.

The color updating is executed in an order starting at the 1st ring and expanding towards the $N$th ring:
$$C'(p_k^j) = C(p_k^j) + \frac{W_j}{s_k^j} \sum_{l=1}^{s_k^j} \left( C'(p_l^{j-1}) - C(p_l^{j-1}) \right), \quad j = 1, \ldots, N \qquad (11)$$

where $p_l^{j-1}$ are the pixels that are adjacent to the pixel $p_k^j$ and in the $(j-1)$th ring, $s_k^j$ is the number of all such pixels $p_l^{j-1}$ found for $p_k^j$, and $W_j$ is a weight function to attenuate the color updating. $W_j$ is defined according to the distance between the pixel and the region boundary, measured in terms of the number of the ring in which the pixel is located:

$$W_j = \frac{f(j)}{f(j-1)}, \quad j = 1, \ldots, N; \qquad f(x) = \frac{e^{-kx} - e^{-kN}}{1 - e^{-kN}} \qquad (12)$$
where function f defines an attenuation profile which
dictates the way the change of the texture color
decreases from the feature region boundary to the
surrounding $N$th-ring pixels. The parameter $k$ controls the profile of the attenuation. We use $k = 0.2$ and $N = 10$ in our experiments. As shown in Figure 4(e), the blending regions of every two features overlap. To obtain a smooth color transition, for pixels in the overlapping zone we compute a weighted average of the changes in pixel color contributed by each blending region.
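Equations (11)–(12) process the rings outward, each ring pulling in the attenuated average color change of its already-updated inner neighbors. A sketch with hypothetical precomputed structures (building the rings and the ring-to-ring adjacency from the region mask is straightforward but omitted):

```python
import numpy as np

def blend_rings(image, rings, adjacency, old_boundary, k=0.2):
    """Ring-by-ring texture blending (Equations (11)-(12)).

    image:        (H, W, 3) float image, feature region already morphed.
    rings:        rings[j] is an (n_j, 2) array of pixel coordinates;
                  ring 0 is the feature-region boundary.
    adjacency:    adjacency[j][i] lists the indices (into ring j-1) of
                  the neighbors of pixel i of ring j.
    old_boundary: (n_0, 3) boundary colors before the texture morph.
    """
    N = len(rings) - 1
    f = lambda x: (np.exp(-k * x) - np.exp(-k * N)) / (1.0 - np.exp(-k * N))
    # Color change on the boundary ring: C'(P^0) - C(P^0).
    prev_change = image[tuple(rings[0].T)] - old_boundary
    for j in range(1, N + 1):
        W = f(j) / f(j - 1)                       # attenuation (Equation (12))
        change = np.array([W * prev_change[nbrs].mean(axis=0)
                           for nbrs in adjacency[j]])
        image[tuple(rings[j].T)] += change        # Equation (11)
        prev_change = change
    return image
```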
Results
Table 1 shows the number of eigenmeshes/eigentex-
tures and input shape/texture control parameters used
for feature morphing. The shape/texture parameters
provide an 82-D combined appearance control for face
synthesis. The user can select the feature to work on
using a GUI. The parameter values are interactively
chosen within [0,1].
Figure 5 illustrates a number of distinct facial shapes produced by region-based shape morphing. Figure 6 shows some examples of new facial appearances generated by region-based texture morphing, together with snapshots of dynamic feature texture morphing. Note that it is not necessary to begin with the average model: we can start with any face model of a specific person and edit various aspects of its shape and texture. Figure 7 illustrates face editing results on two subjects.
In order to quantify performance, we arbitrarily
selected ten examples in the database for cross-validation. Each example was excluded from the
example database in training the face synthesis system,
and its shape and texture measurements were used as a
test input to the system. We assess the reconstruction
by measuring the maximum, mean, and root mean
square (RMS) errors from the feature regions of the
output model to those of the input model. For the
feature shapes, errors are computed by the distances
between the corresponding vertex positions. Table 2
shows the average errors measured for the ten
reconstructed models. The errors are given using both
absolute measures (/mm) and as a percentage of the
diameter of the output head model bounding box. For
textures, errors are measured as the differences
between colors of the corresponding pixels of the
input and output cylindrical full-head textures in terms
of Euclidean distance in the RGB color space. The
average errors in absolute measurements (the color
value of each channel is in the interval [0, 255]) are
given in Table 2.
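The error measures reported in Table 2 reduce, for shape, to per-vertex distances; a sketch (names ours):

```python
import numpy as np

def shape_errors(out_verts, in_verts, bbox_diameter):
    """Maximum, mean, and RMS vertex-position errors between a
    reconstructed feature region and the held-out input model,
    in mm and as a percentage of the bounding-box diameter.
    """
    d = np.linalg.norm(out_verts - in_verts, axis=1)
    errors = {'max': d.max(), 'mean': d.mean(),
              'rms': float(np.sqrt((d ** 2).mean()))}
    return {name: (e, 100.0 * e / bbox_diameter) for name, e in errors.items()}
```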
Our system is implemented on a 2.8 GHz PC with an
Nvidia Quadro FX 3450 graphics board. Even though
the preprocessing steps (model fitting, PCA of feature
shapes and textures, computing anthropometric mea-
surements, and LU decomposition) take up considerable
time, this does not impair usability due to the auto-
mation (beyond initial feature point identification and
assignment of texture attribute values) and their one-
time computation nature. At runtime, our scheme spends about 1 second generating a new face using the level 2 normal mesh (36,392 triangles) rendered with a 512 × 512 texture image upon receiving the input parameters.

Table 1. Number of eigenmodes and high-level control parameters used in our system

                                        Shape                     Texture
                                 Eyes  Nose  Mouth  Chin   Eyes  Nose  Mouth  Chin
Number of eigenmodes
  used for morphing                23    26     20    18     32    21     26    17
Number of control parameters       13    20     12     7     10     5      9     6
Conclusion and Future Work
We have presented a new region-based method for
synthesizing realistic faces by morphing local facial
features according to intuitive control parameters. The
original contribution of our method is given in terms of
the following advantages:
• As correlations between control parameters and the face shape and texture are estimated by exploiting the real faces of individuals, our method regulates the naturalness of synthesized faces.

• Our system provides sets of comprehensive anthropometric parameters to easily control face shape characteristics by taking into account the physical structure of real faces.
• The augmentation of the control from face shape to texture allows more diverse face appearances to be generated.

• The automatic runtime face synthesis is efficient in time complexity and performs fast.

Figure 5. (a) Automatically generated face models by morphing the shapes of four facial features on the average model (outlined) according to the input anthropometric parameters. (b) Close view of the synthesized shapes of each individual facial feature.

Figure 6. (a) Some novel facial appearances generated by morphing the textures of four facial features on the average model rendered with the mean texture (outlined). (b) Region-based face texture morphing (left to right in each example). Face shape is unchanged.

Figure 7. Variation of facial features of two individual faces. Top row: region-based shape morphing. Bottom row: region-based texture morphing. The original model is in the first image of each example.

Table 2. Cross-validation results of our 3D face synthesis method. Shape errors are in mm, with percentages of the output head bounding-box diameter in parentheses; texture errors are RGB Euclidean distances (each channel in [0, 255]).

                          Shape                                               Texture
              Eyes          Nose          Mouth         Chin           Eyes  Nose  Mouth  Chin
Average max.  3.85 (0.91%)  3.55 (0.84%)  6.58 (1.65%)  4.46 (1.06%)   18.3  15.9  23.5   25.7
Average mean  1.37 (0.33%)  1.62 (0.38%)  2.04 (0.49%)  2.57 (0.57%)    7.8   7.2  11.7    9.4
Average RMS   1.93 (0.46%)  2.23 (0.53%)  2.84 (0.67%)  3.62 (0.86%)   10.6  10.1  15.8   13.1
The scanned face data provides the best available resource for face synthesis. On the other hand, it might be a limitation, since the morphing is limited in its expressive power by the variety of the faces in the training database. We would like to extend our current database to incorporate more faces of different races as well as to increase the diversity of age. We also plan to extend our system to morphing other face regions such as the forehead, cheeks, and upper jaw. In order to fully automate the system implementation, we would like to use extensions of the ASM to achieve more robust facial feature detection. Automatic determination of the texture attribute values is also one of the future challenges.
References

1. Young A, Hay D. Configurational information in face perception. Experimental Psychology Society, January 1986.
2. Parke FI, Waters K. Computer Facial Animation. AK Peters: Wellesley, MA, 1996.
3. DiPaola S. Extending the range of facial types. Journal of Visualization and Computer Animation 1991; 2(4): 129–131.
4. Magnenat-Thalmann N, Minh H, deAngelis M, Thalmann D. Design, transformation and animation of human faces. The Visual Computer 1989; 5: 32–39.
5. Parke FI. Parameterized models for facial animation. IEEE Computer Graphics and Applications 1982; 2(9): 61–68.
6. Patel M, Willis P. FACES: the facial animation, construction and editing system. Eurographics'91, 1991, pp. 33–45.
7. Akimoto T, Suenaga Y, Wallace RS. Automatic creation of 3D facial models. IEEE Computer Graphics and Applications 1993; 13(5): 16–22.
8. Pighin F, Hecker J, Lischinski D, Szeliski R, Salesin DH. Synthesizing realistic facial expressions from photographs. In Proceedings of SIGGRAPH'98, July 1998, pp. 75–84.
9. Guenter B, Grimm C, Wood D, Malvar H, Pighin F. Making faces. In Proceedings of SIGGRAPH'98, July 1998, pp. 55–66.
10. Lee WS, Magnenat-Thalmann N. Fast head modeling for animation. Journal of Image and Vision Computing 2000; 18(4): 355–364.
11. Liu Z, Zhang Z, Jacobs C, Cohen M. Rapid modeling of animated faces from video. Journal of Visualization and Computer Animation 2001; 12(4): 227–240.
12. Park IK, Zhang H, Vezhnevets V, Choh HK. Image-based photorealistic 3D face modeling. In Proceedings of IEEE Automatic Face and Gesture Recognition, 2004, pp. 49–54.
13. DeCarlo D, Metaxas D, Stone M. An anthropometric face model using variational techniques. In Proceedings of SIGGRAPH'98, July 1998, pp. 67–74.
14. Kahler K, Haber J, Yamauchi H, Seidel H-P. Head shop: generating animated head models with anatomical structure. In Proceedings of ACM SIGGRAPH Symposium on Computer Animation, 2002, pp. 55–64.
15. Blanz V, Vetter T. A morphable model for the synthesis of 3D faces. In Proceedings of SIGGRAPH'99, August 1999, pp. 187–194.
16. Rose C, Cohen M, Bodenheimer B. Verbs and adverbs: multidimensional motion interpolation using RBF. IEEE Computer Graphics and Applications 1998; 18(5): 32–40.
17. Sloan P-P, Rose CF, Cohen MF. Shape by example. In Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics, 2001, pp. 135–143.
18. Lewis J, Cordner M, Fong N. Pose space deformation: a unified approach to shape interpolation and skeleton-driven deformation. In Proceedings of SIGGRAPH'00, July 2000, pp. 165–172.
19. Allen B, Curless B, Popovic Z. Articulated body deformation from range scan data. In Proceedings of SIGGRAPH'02, 2002, pp. 612–619.
20. Allen B, Curless B, Popovic Z. The space of human body shapes: reconstruction and parameterization from range scans. In Proceedings of SIGGRAPH'03, 2003, pp. 587–594.
21. Seo H, Magnenat-Thalmann N. Automatic modeling of human bodies from sizing parameters. In Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics, 2003, pp. 19–26.
22. USF DARPA HumanID 3D Face Database, courtesy of Professor Sudeep Sarkar, University of South Florida.
23. Cootes TF, Taylor CJ, Cooper DH, Graham J. Active shape models: their training and applications. Computer Vision and Image Understanding 1995; 61(1): 38–59.
24. Carr JC, Beatson RK, Cherrie JB, Mitchell TJ, Fright WR, McCallum BC, Evans TR. Reconstruction and representation of 3D objects with radial basis functions. In Proceedings of SIGGRAPH 2001, August 2001, pp. 67–76.
25. Guskov I, Vidimce K, Sweldens W, Schroder P. Normal meshes. In Proceedings of SIGGRAPH'00, July 2000, pp. 95–102.
26. Farkas LG. Anthropometry of the Head and Face. Raven Press: New York, 1994.
27. Zhang Y. An efficient texture generation technique for human head cloning and morphing. In Proceedings of International Conference on Computer Graphics Theory and Applications, February 2006.
Authors' biographies:

Yu Zhang is currently a postdoctoral researcher in the Computer and Information Science Department at the University of Pennsylvania. He received his B.E. and M.E. degrees from Northwestern Polytechnical University, Xi'an, China, in 1997 and 1999, respectively. He received his Ph.D. from Nanyang Technological University, Singapore, in 2004. From 2003 to 2005, he was a research fellow in the Department of Computer Science, School of Computing, National University of Singapore. In 2005, he worked as a research scientist at Genex Technologies, Inc., USA. His research interests include computer graphics, computer animation, physically-based modeling, visualization, and virtual reality. He is a member of the IEEE Computer Society and the ACM SIGGRAPH.

Norman I. Badler is a professor of Computer and Information Science at the University of Pennsylvania, and has been on the faculty since 1974. His research focuses on animation via simulation, embodied agent software, and computational connections between language, instructions, and action. He directs the Center for Human Modeling and Simulation at Penn and is a co-editor of Graphical Models.