Barış Uz, Uğur Güdükbay, Bülent Özgüç / Bilkent University
Computer Animation’98 / 1
Realistic Speech Animation on Synthetic Faces
Barış Uz, Uğur Güdükbay, Bülent Özgüç
Bilkent University, Dept. of Computer Eng. and Information Science
Bilkent 06533 Ankara Turkey
Outline
Previous Work
– Facial Animation
– Speech Animation
Face Model
Facial Muscles
– Linear Muscles
– Orbicularis Oris
Tongue Model
Speech Animation
Synchronizing Speech with Expressions
Future Work
Facial Animation
Keyframing [Parke’72]: Each keyframe must be completely specified. Tedious for 3D facial animation.
Parametric [Parke’82]: A set of parameters for the face is defined.
– Expression parameters: apply to different parts of the face; jaw rotation angle, width of the mouth, eyelid opening, eyebrow position and shape, etc.
– Conformation parameters: apply globally to the whole face; aspect ratio of the face, skin color, etc.
– Since each parameter affects a disjoint set of vertices, facial expressions cannot easily be blended.
Facial Animation (cont.)
Structure based [Platt’85]: the face is divided into regions based on anatomy of the face.
Physically-based [Terzopoulos and Waters’90]: the face is modeled in a layered fashion; an anatomically-based muscle model is incorporated with a physically based layered tissue model.
Speech Animation
Generating Speaking Face Models: model mouth and lip postures and interpolate them
– Parametric approach [Parke’75]
– Image-based approach [Watson’96]: morphing algorithm to interpolate phoneme images
– [Waters and Frisbie’95]: coordinated 2D muscle model to model muscle interactions
– [Basu’97]: 3D model of human lips and a framework for training it from real data. Mainly for reconstruction of lip shapes from real data, but can also be used for lip shape synthesis.
Synchronizing Speech with Animation
Non-automated techniques: changing the audio requires the whole synchronization process to be repeated.
– [Parke’75]
– [Pearce et al.’86]
Automatic techniques: an audio server is queried for each phoneme so that a mouth shape is computed synchronously.
– DecFace [Waters and Levergood’93]
Generation of Facial Expressions
Layered abstractions [Kalra et al.’91]: higher layers allow abstract manipulations; speech is synchronized with eye motion and emotions using a synchronization mechanism provided by a high-level language.
Animated conversation [Cassell et al.’94]: animated conversation between multiple human-like agents with synchronized speech, intonation, facial expressions and hand gestures.
Face Model
Face model consists of 888 triangles (1700 including eyes, teeth and tongue)
Face is divided into three regions
– Upper (610 triangles), lower (240 triangles) and intermediate
Changes to the original model
– Repeated the mouth vertices to open and close the mouth
– Added some polygons to close the nose
– Added eyes and teeth to the model
– Added a simple tongue
Regions of Face
[Figure: face model with the upper region, intermediate region and lower region labeled]
Motion (Muscle) and Vertex Relationships
                        Vertex Tag
Motion or Muscle Tag    UPPER   LOWER   BOTH   NONE   JAWROT
UPPER                   +       -       +      -      -
LOWER                   -       +       +      -      -
BOTH                    +       +       +      -      -
JAWROT                  -       +       +      -      +
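A minimal sketch of how the table above could be stored as a lookup, assuming that rows are motion (muscle) tags, columns are vertex tags, and "+" means the motion moves vertices carrying that tag. The names `AFFECTS` and `affects` are hypothetical and do not come from the slides.

```python
# Row = motion/muscle tag, value = set of vertex tags marked "+" in the table.
AFFECTS = {
    "UPPER": {"UPPER", "BOTH"},
    "LOWER": {"LOWER", "BOTH"},
    "BOTH": {"UPPER", "LOWER", "BOTH"},
    "JAWROT": {"LOWER", "BOTH", "JAWROT"},
}


def affects(motion_tag, vertex_tag):
    """Return True if a motion with motion_tag may displace a vertex with vertex_tag."""
    return vertex_tag in AFFECTS[motion_tag]
```

With this layout a vertex tagged NONE is never displaced, which matches the all-minus NONE column.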
Facial Muscle Vectors
Major Facial Muscles
Orbicularis oris: plays the most significant role in composing the shape of the mouth; a sphincter muscle.
Mentalis, buccinator, depressor anguli oris major, depressor labii inferioris: the lower lip and lower face are controlled by these muscles.
Zygomatic minor, levator labii superioris alaeque nasi: upper face muscles; rarely used for speech; important for expressions.
Zygomatic major, risorius: located around the cheeks; important for expressions.
Modeling of Muscles
Types of muscles
– Linear: e.g., zygomatic major (smiling)
– Sphincter: e.g., orbicularis oris (mouth opener)
– Sheet: e.g., orbicularis oculi major (eyelid opener)
Muscle parameters
– Influence zone: between 35 and 65 degrees
– Influence start: muscle’s influence starts at this tension
– Influence end: after this tension, skin resists deformation
– Contraction value: muscle tension
Modeling of Muscles (cont.)
P: original position
P’: new position
Rs and Rf: influence start and finish radii
Maximum zone of influence
D: distance of P from muscle head
α: angular displacement from muscle vector
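Muscle deformation [Waters’87]
If P is in V1P3P4
\[ P' = P + a\,k\,r\,\frac{PV_1}{\lVert PV_1 \rVert} \]
\[ r = \begin{cases} \cos\left(\left(1 - \dfrac{D}{R_s}\right)\dfrac{\pi}{2}\right) & \text{if } P \text{ in } (V_1 P_3 P_4) \\[2ex] \cos\left(\dfrac{D - R_s}{R_f - R_s}\,\dfrac{\pi}{2}\right) & \text{if } P \text{ in } (P_1 P_2 P_3 P_4) \end{cases} \]
where
k is a muscle spring constant,
a = cos(α), with α the angular displacement of P from the muscle vector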
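The displacement rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `displace` and its parameters are hypothetical, the two cases for r are selected by comparing D with Rs and Rf, and `zone_deg` is treated as the largest allowed angle between the muscle vector and the direction from the muscle head to P (the slides do not say whether the zone is a half-angle or a full angle).

```python
import math


def displace(p, head, tail, r_s, r_f, zone_deg, k):
    """Return P' for vertex p under one linear muscle running from head (V1) to tail.

    Assumes 0 < r_s < r_f. Vertices outside the zone of influence are returned unchanged.
    """
    to_p = [pc - hc for pc, hc in zip(p, head)]        # vector V1 -> P
    axis = [tc - hc for tc, hc in zip(tail, head)]     # muscle vector
    d = math.sqrt(sum(c * c for c in to_p))            # D: distance of P from muscle head
    axis_len = math.sqrt(sum(c * c for c in axis))
    if d == 0.0 or axis_len == 0.0 or d > r_f:
        return tuple(p)
    cos_alpha = sum(u * v for u, v in zip(to_p, axis)) / (d * axis_len)
    cos_alpha = max(-1.0, min(1.0, cos_alpha))         # guard acos against rounding
    if math.acos(cos_alpha) > math.radians(zone_deg):  # assumed zone test
        return tuple(p)
    if d <= r_s:
        r = math.cos((1.0 - d / r_s) * math.pi / 2.0)
    else:
        r = math.cos((d - r_s) / (r_f - r_s) * math.pi / 2.0)
    scale = cos_alpha * k * r / d                      # a * k * r / |PV1|
    # PV1 points from P to the head, i.e. it is the negative of to_p.
    return tuple(pc - scale * c for pc, c in zip(p, to_p))
```

For example, a vertex lying on the muscle vector at distance Rs gets a = 1 and r = 1, so it moves by exactly k toward the muscle head.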
Orbicularis Oris
Modeled as 4 linear muscles
Horizontal ones have 40 degrees of influence; vertical ones have 140 degrees of influence
Very practical to implement
A pseudo-muscle is added to simulate protrusion and purse effects for lower lip (f-tuck); necessary to say letters “f” and “v”
[Figure: real placement of Orbicularis Oris; our abstraction for Orbicularis Oris; muscle for f-tuck]
The Tongue Model
Tongue is composed of 4 sections of 20 polygons + 12 polygons to close the tip
Tongue is reconstructed for each change in the parameters
Each section has the following parameters
– height: height of the section floor from the tongue base
– width: total width of the section
– length: length of the section
– thickness: thickness of the tongue
– midline: height of the middle line of the tongue
The Tongue Model (cont.)
[Figure: top view and frontal view of the tongue model]
The Tongue Model (cont.)
For example, to say the letter “l”,
– there is no change in section 1 (the farthest section)
– In section 2, the width will be reduced to 1/2 of the relaxed width
– In section 3, the width will be reduced to 1/3 of the relaxed width and the height will be increased properly
– In section 4, the width will be reduced to 1/4 of the relaxed width and the height of the tongue will be equal to the bottom of upper teeth. The midline will be equal to the thickness of the tongue in section 4.
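The “l” posture above can be written as a transformation of the four relaxed sections. The sketch below is illustrative only: `TongueSection` and `posture_for_l` are hypothetical names, and because the slides only say that the height of section 3 is "increased properly", that value is left as a caller-supplied argument rather than guessed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TongueSection:
    height: float      # height of the section floor from the tongue base
    width: float       # total width of the section
    length: float      # length of the section
    thickness: float   # thickness of the tongue
    midline: float     # height of the middle line of the tongue


def posture_for_l(relaxed, section3_height, upper_teeth_bottom):
    """Return the four sections for "l"; relaxed[0] is section 1 (the farthest)."""
    s1, s2, s3, s4 = relaxed
    return [
        s1,                                                   # section 1: no change
        replace(s2, width=s2.width / 2),                      # section 2: half width
        replace(s3, width=s3.width / 3, height=section3_height),
        replace(s4, width=s4.width / 4,
                height=upper_teeth_bottom,                    # reach the upper teeth
                midline=s4.thickness),
    ]
```

Since the tongue is reconstructed for each change in the parameters, returning new section objects instead of mutating the relaxed ones keeps the relaxed posture available for the next letter.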
Speech Animation
Keyframing based on muscle parameters around the mouth and jaw rotation.
Each keyframe is a mouth shape dictated by the current expression setting and the current letter.
Cosine interpolation is used to generate in-betweens.
The database for mouth shapes contains
– the letter: the key field
– the muscle contraction values: determining which muscles are active while pronouncing this letter
– jaw rotation angle: necessary for some letters
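A small sketch of such a database and of cosine interpolation between two entries follows. Everything concrete here is an assumption made for illustration: the muscle names, the numeric values, and the function names are invented, and the sketch interpolates the stored parameters, whereas the same weight could equally be applied to vertex coordinates.

```python
import math

# Hypothetical mouth-shape database keyed by letter; all numbers are invented.
MOUTH_SHAPES = {
    "a": {"contractions": {"upper_lip": 0.1, "lower_lip": 0.3}, "jaw_rotation": 12.0},
    "m": {"contractions": {"upper_lip": 0.0, "lower_lip": 0.0}, "jaw_rotation": 0.0},
}


def cosine_interpolate(start, end, t):
    """Blend start -> end for t in [0, 1] with an ease-in/ease-out cosine weight."""
    weight = (1.0 - math.cos(math.pi * t)) / 2.0
    return start + (end - start) * weight


def inbetween_shapes(letter_from, letter_to, count):
    """Return `count` in-between mouth shapes, excluding both keyframes."""
    a, b = MOUTH_SHAPES[letter_from], MOUTH_SHAPES[letter_to]
    frames = []
    for step in range(1, count + 1):
        t = step / (count + 1)
        frames.append({
            "contractions": {
                muscle: cosine_interpolate(a["contractions"][muscle],
                                           b["contractions"][muscle], t)
                for muscle in a["contractions"]
            },
            "jaw_rotation": cosine_interpolate(a["jaw_rotation"], b["jaw_rotation"], t),
        })
    return frames
```

Compared with linear interpolation, the cosine weight changes slowly near both keyframes, so the mouth eases out of one shape and into the next.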
Speech Animation System
[Diagram with the components: Input Text, Parser, Expression, Letter, Database, Facial Animation System]
Synchronizing Speech with Expressions
Guessing from text
– Punctuation marks
– Keywords
– From the meanings of words
Ambiguous! The same word, punctuation mark or keyword can have different meanings.
By inserting tags into text
Insert tags into text to specify expressions and their degrees explicitly
\b{expression level}: starts an expression of degree level. If the expression is already set, the tag increases its degree by level.
\e{expression level}: ends or decreases the degree of an expression by level. If level is -1, the expression is removed from the face.
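The tag semantics above can be sketched as a small state update over a dictionary of active expressions. The details below are assumptions, not taken from the slides: the expression name and level are taken to be separated by whitespace, levels are taken to be integers, and an expression whose degree drops to zero or below is treated as ended. The names `TAG`, `apply_tag` and `annotate` are hypothetical.

```python
import re

# Matches \b{name level} or \e{name level}; the exact syntax is an assumption.
TAG = re.compile(r"\\([be])\{\s*(\w+)\s+(-?\d+)\s*\}")


def apply_tag(active, kind, name, level):
    """Update the dict of active expressions (name -> degree) for one tag."""
    if kind == "b":
        active[name] = active.get(name, 0) + level   # start, or increase if already set
    elif level == -1:
        active.pop(name, None)                       # \e with -1 removes the expression
    elif name in active:
        active[name] -= level                        # \e decreases the degree
        if active[name] <= 0:                        # assumed: non-positive degree ends it
            del active[name]


def annotate(text):
    """Return (character, active expressions) pairs for all untagged characters."""
    active, out, pos = {}, [], 0
    for match in TAG.finditer(text):
        out.extend((ch, dict(active)) for ch in text[pos:match.start()])
        apply_tag(active, match.group(1), match.group(2), int(match.group(3)))
        pos = match.end()
    out.extend((ch, dict(active)) for ch in text[pos:])
    return out
```

For instance, in `\b{smile 2}hi\b{smile 1}!` the letters "h" and "i" would be spoken with smile at degree 2 and the "!" with smile at degree 3.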
Speech Animation Algorithm
While not all of the text is processed
  1. Read a character
  2. If a tag is beginning /* "\" is read */
    2.1 Read tag /* name and degree of expression */
  3. If degree is -1
    3.1 Remove expression from the face
  else
    3.2 Set face according to expression with specified degree
  4. If a valid character /* a letter or a punctuation mark */
    4.1 If this is the first character to say
      4.1.1 Set face using current expression and letter settings
      4.1.2 Display face
    else
      4.2 for each in-between
        4.2.1 Calculate vertex coords using cosine interpolation
        4.2.2 Display face
    4.3 Store vertex coords for future reference
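One possible Python rendering of this loop is sketched below; it is not the authors' implementation. It assumes a caller-supplied function `shape_for(letter, expressions)` that returns the keyframe vertex coordinates as a flat list of floats, treats "display" as yielding a frame, follows step 3 literally (degree -1 removes, any other degree sets), and also yields the target keyframe after its in-betweens, which the pseudocode does not state explicitly.

```python
import math


def animate(text, shape_for, n_inbetweens=3):
    """Yield one list of vertex coordinates per displayed frame."""
    expressions = {}   # expression name -> degree
    previous = None    # step 4.3: coordinates stored from the last keyframe
    i = 0
    while i < len(text):                          # while not all text is processed
        ch = text[i]                              # 1. read a character
        i += 1
        if ch == "\\":                            # 2. a tag is beginning
            close = text.index("}", i)
            name, degree = text[text.index("{", i) + 1:close].split()
            i = close + 1
            if int(degree) == -1:                 # 3.1 remove expression
                expressions.pop(name, None)
            else:                                 # 3.2 set expression degree
                expressions[name] = int(degree)
            continue
        if not (ch.isalpha() or ch in ".,;:!?"):  # 4. letters and punctuation only
            continue
        key = list(shape_for(ch, expressions))
        if previous is not None:                  # 4.2 in-betweens from the last keyframe
            for step in range(1, n_inbetweens + 1):
                t = step / (n_inbetweens + 1)
                w = (1.0 - math.cos(math.pi * t)) / 2.0
                yield [a + (b - a) * w for a, b in zip(previous, key)]
        yield key                                 # display the keyframe itself
        previous = key                            # 4.3 store for future reference
```

Writing the loop as a generator keeps frame production separate from rendering, so the caller decides how each yielded frame is displayed.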
Future Work
Better mouth postures
Implementation of coarticulation
Synchronization of synthetic speech with facial animation (Turkish speech synthesizer is syllable-based; we should form a database of mouth postures for 2000 Turkish syllables and group them with respect to similar mouth postures)