POSE & OCCLUSION ROBUST FACE
ALIGNMENT USING MULTIPLE SHAPE MODELS
AND PARTIAL INFERENCE
- PhD Thesis Proposal -
Jongju Shin
Advisor : Daijin Kim
2013.01.03
I.M. Lab.
Dept. of CSE
1
Outline
• Introduction
• Previous Work
• Proposed Method
• Shape Representation
• Formulation
• Multiple Shape Models
• Local Feature Detection
• Hypothesizing Transformation Parameters
• Hypothesizing Shape Parameters
• Model Hypotheses Evaluation
• Experimental Results
• Conclusion
• Future Work
2
INTRODUCTION
3
What is face alignment?
• Face alignment extracts facial feature points (eye, mouth, eyebrow, nose, and chin) from a given image.
[Figure: example face with labeled feature points — eye, mouth, eyebrow, nose, chin]
* “The POSTECH Face Database (PF07) and Performance Evaluation”, FG 2008
4
Why is it important?
• Face alignment is a prerequisite for many face-related problems.
5
[Figure: applications — face recognition, facial expression recognition (angry, happy, surprise, neutral), head pose estimation (0°, +25°, -25°)]
Challenges
Illumination Pose
Expression Occlusion
6
PREVIOUS WORK
7
Previous work
• Two approaches
• 1. Discriminative approach
• Active Shape Model
• The shape parameters are iteratively updated by locally finding the best
nearby match for each feature point.
• 2. Generative approach
• Active Appearance Model
• The shape parameters are iteratively updated by minimizing the error
between appearance instance and input image.
8
Previous work
• 1. Discriminative approach
• They assume that all the feature points are visible.
• Wrongly detected feature points cause the alignment to fail.
9
[1] Saragih et al., “Face Alignment through Subspace Constrained Mean-Shifts”, ICCV 2009
[2] Zhou et al., “Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters via Bayesian Inference”, CVPR 2003
Constrained Local Model [1]
• Feature detector: linear SVM
• Alignment algorithm: mean-shifts
Bayesian Tangent Shape Model [2]
• Feature detector: gradient along the normal vector
• Alignment algorithm: Bayesian inference
Previous work
• 2. Generative approach
• Due to the high-dimensional solution space, it has a large number of local minima.
• It needs a good initialization, e.g. by eye detection.
10
[3] Xiaoming Liu, “Generic Face Alignment using Boosted Appearance Model”, CVPR 2007
[4] Navarathna et al., “Fourier Active Appearance Models”, ICCV 2011
Boosted Appearance Model [3]
• Appearance model: Haar-like features and boosting
• Weak classifiers discriminate aligned images from not-aligned images.
Fourier Active Appearance Model [4]
• Appearance model: Fourier-transformed appearance
• Alignment algorithm: gradient descent
PROPOSED METHOD
11
Motivation
• We follow the discriminative approach.
• Determine whether each feature point is visible or not.
• Only visible feature points are involved in the alignment step.
• Invisible feature points are estimated from the visible ones using the partial inference (PI) algorithm.
• Using multiple shape models, we solve the pose problem.
[Figure: example of visible vs. invisible feature points]
12
Proposed method
We propose pose and occlusion robust face alignment!
Shape Representation
• Point Distribution Model
• The non-rigid shape $\mathbf{x}$ is represented by a linear combination of the shape bases with the mean shape:
$$\mathbf{x} = s\,\mathbf{R}\,(\bar{\mathbf{x}} + \boldsymbol{\Phi}\mathbf{q}) + \mathbf{t}$$
13
$\bar{\mathbf{x}}$ : mean shape
$\boldsymbol{\Phi}$ : eigenvectors (shape bases)
$\mathbf{q}$ : shape parameters
$s$ : scale
$\mathbf{R}$ : rotation
$\mathbf{t}$ : translation (x, y)
Proposed method
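The linear shape generation above can be sketched in code (a minimal illustration; the array shapes and names are mine, not the proposal's):

```python
import numpy as np

def generate_shape(mean_shape, eigvecs, q, s, theta, t):
    """Point Distribution Model: x = s * R * (mean_shape + Phi @ q) + t.

    mean_shape: (2, N) mean shape
    eigvecs:    (2N, K) shape bases Phi
    q:          (K,) shape parameters
    s, theta:   scale and in-plane rotation angle
    t:          length-2 translation (x, y)
    """
    n = mean_shape.shape[1]
    # non-rigid deformation in the model frame
    deformed = mean_shape + (eigvecs @ q).reshape(2, n)
    # similarity transform: rotation, scale, translation
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return s * (R @ deformed) + np.asarray(t, float).reshape(2, 1)
```

With the identity transform (s = 1, theta = 0, t = 0) and q = 0 this returns the mean shape itself.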
Formulation
• Shape model with parameters $\mathbf{p} = \{s, \mathbf{R}, \mathbf{q}, \mathbf{t}\}$
• Energy function
$$E(\mathbf{p}) = \sum_{i=1}^{N} v_i \,\| \mathbf{y}_i - \mathbf{x}_i(\mathbf{p}) \|^2$$
14
where $v_i$ denotes whether the $i$th feature point $\mathbf{y}_i$ is aligned (visible) or not, and $N$ is the number of local features.
Proposed method
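As a small illustration of this energy (names are mine), summing squared point errors over the visible features only:

```python
import numpy as np

def alignment_energy(y, x_model, v):
    """E(p) = sum_i v_i * ||y_i - x_i(p)||^2.

    y:       (N, 2) detected feature points
    x_model: (N, 2) model points x_i(p)
    v:       (N,) visibility indicator (1 = visible, 0 = occluded)
    """
    d = np.asarray(y, float) - np.asarray(x_model, float)
    return float(np.sum(np.asarray(v) * np.sum(d * d, axis=1)))
```

Occluded points (v_i = 0) contribute nothing, which is what lets the later partial-inference steps ignore them.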
Multiple Shape Models
• To cover various poses and expressions, we build multiple shape models.
• We build mean shapes $\bar{\mathbf{x}}^{n,m}$ and eigenvectors $\boldsymbol{\Phi}^{n,m}$ for the $n$th pose and $m$th expression.
• Given $n$ and $m$, the shape is
$$\mathbf{x}^{n,m} = s\,\mathbf{R}\,(\bar{\mathbf{x}}^{n,m} + \boldsymbol{\Phi}^{n,m}\mathbf{q}) + \mathbf{t}$$
15
Proposed method
Formulation with multiple shape models
• Energy function
$$E(\mathbf{p}, n, m) = \sum_{i=1}^{N} v_i \,\| \mathbf{y}_i - \mathbf{x}_i^{n,m}(\mathbf{p}) \|^2$$
16
Proposed method
Algorithm Overview
17
Proposed method
[Pipeline: Input → Face Detection → Local Feature Detection → Hypothesis-and-test (Hypothesizing Transformation Parameters → Hypothesizing Shape Parameters → Model Hypotheses Evaluation) → Output]
Local Feature Detection
18
Proposed method
[Pipeline diagram: Local Feature Detection stage highlighted]
• Goal
• Based on the MCT+Adaboost algorithm [5], we propose a Hierarchical MCT to increase detection performance.
19
Proposed method
Local feature detection
[5] Jun and Kim, “Robust Real-Time Face Detection Using Face Certainty Map”, ICB, 2007
Detect feature point candidates with a Gaussian model!
Feature Descriptor
• Modified Census Transform (MCT)
• For a 3×3 patch with intensities $I_1, \dots, I_9$:
$$M = \frac{1}{9}\sum_{x=1}^{9} I_x, \qquad B_x = \begin{cases} 1 & \text{if } I_x \ge M \\ 0 & \text{otherwise} \end{cases}, \qquad C = \sum_{x=1}^{9} B_x \cdot 2^{\,x-1}$$
20
Proposed method
Local feature detection
• Example: the 3×3 intensities (102, 105, 118; 120, 111, 101; 123, 119, 109) have mean M = 112, giving bits (0, 0, 1; 1, 0, 0; 1, 1, 0) and C = 4 + 8 + 64 + 128 = 204 (with $B_1$ as the least significant bit).
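The transform can be written directly from the formula above (a sketch; taking B_1 as the least significant bit is my choice of ordering):

```python
import numpy as np

def mct(patch):
    """Modified Census Transform of a 3x3 patch:
    M = mean of the 9 intensities, B_x = 1 if I_x >= M,
    C = sum_x B_x * 2^(x-1) (B_1 taken as the least significant bit)."""
    p = np.asarray(patch, dtype=float).ravel()
    bits = (p >= p.mean()).astype(int)
    return int(sum(b << i for i, b in enumerate(bits)))

patch = [[102, 105, 118],
         [120, 111, 101],
         [123, 119, 109]]
# mean = 112, bits = 0 0 1 1 0 0 1 1 0 -> 4 + 8 + 64 + 128 = 204
print(mct(patch))
```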
Feature Descriptor
• Modified Census Transform (MCT)
• Transformed result
• MCT is a point feature:
• it represents local intensity differences,
• but is very sensitive to noise.
21
Proposed method
Local feature detection
[Figure: gray image and its MCT-transformed result]
Feature Descriptor
• Regional feature
• Represents regional differences
• Robust to noise
22
Proposed method
Local feature detection
• Partition the patch into a 3×3 grid, average each cell ($I_1, \dots, I_9$), then apply the MCT to the averages:
$$C = \sum_{x=1}^{9} B_x \cdot 2^{\,x-1}$$
We propose Hierarchical MCT
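Under the partition-and-average reading above, one hypothetical implementation (the cell layout and bit ordering are my assumptions):

```python
import numpy as np

def regional_mct(patch):
    """Regional MCT: partition the patch into a 3x3 grid of cells,
    average each cell, then apply the MCT rule to the 9 cell averages.
    Averaging makes the code robust to pixel-level noise."""
    p = np.asarray(patch, dtype=float)
    h, w = p.shape                      # assumes h, w divisible by 3
    cells = p.reshape(3, h // 3, 3, w // 3).mean(axis=(1, 3))
    bits = (cells.ravel() >= cells.mean()).astype(int)
    return int(sum(b << i for i, b in enumerate(bits)))
```

For a 3×3 patch the cells are single pixels, so this reduces to the plain MCT.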
Training procedure
• Hierarchical MCT + Adaboost
23
• Build an image pyramid (e.g. levels of size 35, 25, 15, 5 from the 35×35 input), computed efficiently with the integral image.
• Compute the MCT at each pyramid level and concatenate the results into one vector.
• Train Adaboost on the concatenated vector.
Proposed method
Local feature detection
Feature Response
• Feature responses of Adaboost classifiers trained with different feature descriptors
[Figure: training image and test image responses for conventional MCT, conventional LBP, hierarchical LBP, and hierarchical MCT]
24
Proposed method
Local feature detection
Process of local feature detection
[Figure: input → search region → Hierarchical MCT → Adaboost response → regressed response]
25
Proposed method
Local feature detection
Representation of Feature Response
• How to obtain feature point candidates?
• Local maximum points in the candidate search region
26
[Figure: input, response, segmented regions]
Proposed method
Local feature detection
• The center of the $k$th segmented region is its response maximum, $\boldsymbol{\mu}_i^k = \arg\max_{(x,y)} \bar{r}(x, y)$.
Representation of Feature Response
• How to obtain feature point candidates?
• We model the distribution of each segmented region through a convex quadratic function.
• We obtain $\boldsymbol{\Sigma}_i^k$ and $\boldsymbol{\mu}_i^k$: the feature candidate's distribution and centroid.
• Independent Gaussian distribution
27
where $S_i^k$ is the $k$th segmented region of the $i$th feature point, $\boldsymbol{\mu}_i^k$ is the centroid of $S_i^k$, and $\bar{r}$ is the inverted feature response function.
Proposed method
Local feature detection
The visibility is indicated by a Kronecker delta function (visible or not).
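One simple way to realize the Gaussian summary in code (a moment-based sketch, not necessarily the quadratic fit used in the proposal; names are mine):

```python
import numpy as np

def candidate_gaussian(response, region_mask):
    """Summarize one segmented region of a feature response map as an
    independent Gaussian: mu = response-weighted centroid of the pixel
    coordinates, Sigma = response-weighted covariance."""
    ys, xs = np.nonzero(region_mask)
    w = response[ys, xs].astype(float)
    w = w / w.sum()
    pts = np.stack([xs, ys], axis=1).astype(float)  # (x, y) coordinates
    mu = w @ pts
    d = pts - mu
    sigma = (w[:, None] * d).T @ d
    return mu, sigma
```

A sharply peaked response yields a small covariance, so confident detections constrain the later shape fit more strongly.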
Feature clustering
• The mouth corner's appearance varies according to facial expression.
• The detection performance degrades when a single detector is trained for all mouth shapes and appearances.
[Figure: mouth corner appearance for neutral, smile, and surprise expressions]
28
Proposed method
Local feature detection
Feature clustering
• Train each detector with each clustered feature.
• Run the detectors and combine their results.
29
Proposed method
Local feature detection
Local feature detection
30
Proposed method
Local feature detection
[Figure: input → search region → Adaboost response → candidates with Gaussians → output of detection]
Hypothesizing Transformation Parameters
31
Proposed method
[Pipeline diagram: Hypothesizing Transformation Parameters stage highlighted]
• Goal
• Assumption for occlusion
• We assume that at least half of the feature points are not occluded.
• Let N be the total number of feature points.
• Then N/2 feature points can be assumed visible.
Hypothesizing
32
[Feature point candidates]
Proposed method
Hypo. trans. param.
Find the best combination of local feature point candidates that represents the input image well.
Hypothesizing
• Coarse-to-fine approach
– The hypothesis space of feature point visibility is HUGE.
– Partial Inference (PI) Algorithm
• 1. Transformation parameters (s, R, t) are estimated by RANSAC.
• 2. Shape parameters (q) are estimated, and the transformation parameters updated, by RANSAC.
33
Proposed method
Hypo. trans. param.
Hypothesizing Transformation Parameters
34
Proposed method
Hypo. trans. param.
Algorithm 1. Partial Inference (PI) algorithm for transformation parameters
[PI algorithm]
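Since Algorithm 1 itself is not reproduced in this transcript, the sketch below shows the general shape of such a step: hypothesize a similarity transform (s, R, t) from random minimal sets of candidate correspondences and keep the hypothesis with the most inliers. This is a generic RANSAC under my own assumptions about the interfaces, not the thesis's exact algorithm:

```python
import numpy as np

def fit_similarity(src, dst):
    """Closed-form 2D similarity (s, R, t) mapping src -> dst
    (Umeyama / Procrustes; points as (N, 2) arrays)."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    a, b = src - mu_s, dst - mu_d
    cov = b.T @ a / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / a.var(0).sum()
    t = mu_d - s * (R @ mu_s)
    return s, R, t

def ransac_similarity(model_pts, cand_pts, n_iter=200, thresh=3.0, seed=0):
    """Hypothesize (s, R, t) from random minimal sets of 2 correspondences;
    keep the hypothesis with the most inliers."""
    rng = np.random.default_rng(seed)
    best = (0, None)
    for _ in range(n_iter):
        idx = rng.choice(len(model_pts), size=2, replace=False)
        s, R, t = fit_similarity(model_pts[idx], cand_pts[idx])
        proj = s * (model_pts @ R.T) + t
        inliers = np.linalg.norm(proj - cand_pts, axis=1) < thresh
        if inliers.sum() > best[0]:
            best = (inliers.sum(), (s, R, t, inliers))
    return best[1]
```

The inlier mask doubles as the visibility hypothesis: points far from the projected model shape are treated as occluded.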
Hypothesizing Shape Parameters
35
Proposed method
[Input]
[Output]
Face
Detection
Model Hypotheses
Evaluation Local Feature Detection
[Hypothesis-and-test]
Hypothesizing
Transformation Parameters
Hypothesizing
Shape Parameters
Hypothesizing Shape Parameters
• From the selected feature points, we calculate the parameters p in closed form by minimizing the Gaussian-weighted energy
$$\mathbf{p}^{*} = \arg\min_{\mathbf{p}} \sum_{i=1}^{N} v_i \,(\boldsymbol{\mu}_i - \mathbf{x}_i(\mathbf{p}))^{\top} \boldsymbol{\Sigma}_i^{-1} (\boldsymbol{\mu}_i - \mathbf{x}_i(\mathbf{p}))$$
• $v_i$ : visibility indicator
• $\boldsymbol{\mu}_1$ to $\boldsymbol{\mu}_N$ and $\boldsymbol{\Sigma}_1$ to $\boldsymbol{\Sigma}_N$ are the selected candidates' Gaussian parameters.
36
Proposed method
37
Proposed method
Hypo. shp. param.
Algorithm 2. Partial Inference (PI) algorithm for shape parameters
[Hallucinated shape]
[Selected feature points]
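Algorithm 2 is likewise not reproduced in this transcript. As an illustration of the weighted closed-form idea (my own simplified formulation: the similarity transform is assumed already applied, so only the shape parameters q are solved for), the Gaussian- and visibility-weighted linear least squares is:

```python
import numpy as np

def estimate_shape_params(y, mean_shape, basis, v, sigmas):
    """Sketch of a closed-form shape-parameter estimate:
        q* = argmin_q sum_i v_i (y_i - x_i(q))^T Sigma_i^{-1} (y_i - x_i(q))
    with x_i(q) = mean_i + Phi_i q in the model frame.

    y:          (N, 2) observed (selected) points
    mean_shape: (N, 2) mean shape
    basis:      (2N, K) shape bases Phi
    v:          (N,) visibility indicator (0/1)
    sigmas:     (N, 2, 2) per-candidate Gaussian covariances
    """
    N = len(y)
    W = np.zeros((2 * N, 2 * N))
    for i in range(N):
        if v[i]:  # occluded points get zero weight
            W[2*i:2*i+2, 2*i:2*i+2] = np.linalg.inv(sigmas[i])
    r = (np.asarray(y, float) - mean_shape).reshape(-1)
    A = basis.T @ W @ basis       # normal equations
    b = basis.T @ W @ r
    return np.linalg.solve(A, b)
```

The occluded points drop out of the normal equations entirely; the recovered q then hallucinates their positions through the shape basis.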
Hypothesizing for all poses and expressions
• Run the two hypothesizing steps for all shape models (of face pose and expression).
38
Proposed method
Proposed method
Model Hypotheses Evaluation
39
Proposed method
[Pipeline diagram: Model Hypotheses Evaluation stage highlighted]
Model Hypotheses Evaluation
• We should select the best pose and expression from all the hypotheses.
• The hypothesis error is the mean error of the inliers (E) divided by the number of inliers (v).
40
Num. of inliers (v):   54      52      43      40
Error of inliers (E):  2.9755  3.23    3.37    2.95
Proposed method
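Using the (E, v) values from the table on this slide, the selection rule can be illustrated (the hypothesis labels are made up for the example):

```python
def select_hypothesis(hypotheses):
    """Pick the hypothesis minimizing E / v: mean inlier error E divided
    by the number of inliers v, so hypotheses with many inliers and low
    error win. Each entry is (label, E, v)."""
    return min(hypotheses, key=lambda h: h[1] / h[2])

# the four (E, v) pairs from the table, with hypothetical labels
hyps = [("A", 2.9755, 54), ("B", 3.23, 52), ("C", 3.37, 43), ("D", 2.95, 40)]
print(select_hypothesis(hyps))  # ("A", 2.9755, 54): score 2.9755/54 ~ 0.0551
```

Note that hypothesis D has the lowest raw error (2.95) but only 40 inliers, so A still wins under the E/v score.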
Video
41
EXPERIMENTAL RESULTS
42
Training database
• CMU Multi-PIE [7]
• Various pose, expression and illumination
• We used 10,948 images among 750,000 images
• 5 Pose models
• 0°, 15°~30°, 30°~45° (70 feature points)
• 60°~75°, and 75°~90° (40 feature points)
• 2 Expression models
• Neutral and smile
• Surprise
43
[7] Gross et al., “Guide to the CMU Multi-PIE database”, Technical Report, CMU, 2007
Experimental results
Test database
• ARDB [8]
• Occlusion (sunglasses and scarf)
• CMU Multi-PIE
• Various pose, expression, illumination
• For artificial occlusion
• LFPW (Labeled Face Parts in the Wild) [9]
• Various pose, expression, illumination, and partial occlusion
• 29 feature points
• To compare our algorithm with other state-of-the-art ones
44
[8] A.M. Martinez and R. Benavente, “The AR Face Database”, CVC Technical Report #24, June 1998
[9] P. Belhumeur, et al., “Localizing parts of faces using a consensus of exemplars”, IEEE CVPR, 2011
AR DB LFPW
Experimental results
Alignment Accuracy
• Normalized error
• Euclidean distance between an aligned feature and the ground truth, divided by the face size.
• If the normalized error is 0.01 on a face of 100 pixels,
• the distance between the aligned feature and the ground truth is only one pixel.
45
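This metric is essentially one line of code (a sketch with assumed array shapes):

```python
import numpy as np

def normalized_error(aligned, ground_truth, face_size):
    """Mean Euclidean distance between aligned points and ground-truth
    points, divided by the face size. E.g. 0.01 on a 100-pixel face
    corresponds to a one-pixel mean error."""
    d = np.linalg.norm(np.asarray(aligned, float) - np.asarray(ground_truth, float), axis=1)
    return float(d.mean() / face_size)
```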
Experimental results
AR database
• Test result
• 60 images
46
Experimental results
AR database
• Normalized error by occlusion type
47
• Cumulative error
Normalized mean error by occlusion type:
  Non-occlusion  0.0226
  Scarf          0.0258
  Sunglasses     0.0338
Experimental results
CMU Multi-PIE Database
• Test result
• Test for pose
• 321 images
48
Experimental results
CMU Multi-PIE Database
49
• Normalized mean error for pose
• Cumulative error
* The 60°~90° poses perform a little worse than 0°~45°: since a large portion of the facial features is covered by hair, the number of visible feature points detected is too small to hallucinate the correct facial shape.
Normalized mean error for pose:
  0°   0.0263    60°  0.0352
  15°  0.0253    75°  0.0336
  30°  0.0273    90°  0.0368
  45°  0.0267
Experimental results
CMU Multi-PIE Database
• Test for artificial occlusion
• The face area is divided into a 5-by-5 grid.
• Among the 25 regions, 1 to 15 regions are selected randomly and filled with black.
• From 8 occluded regions onward, over 50% of the feature points are occluded.
• 2,100 images
50
Experimental results
CMU Multi-PIE Database
• Test result
51
Experimental results
CMU Multi-PIE Database
• Normalized error for pose
• For the profile (60°~90°) views, even small occlusions harm the alignment badly because there are fewer strong features such as eyes, mouth, and nostrils.
• However, in terms of mean error, the proposed method stays stable up to 7 occluded regions, which is nearly 50% occlusion.
52
Experimental results
LFPW database
* P. Belhumeur, et al., “Localizing parts of faces using a consensus of exemplars”, IEEE CVPR, 2011
53
• Mean error over inter-ocular distance for 21 feature points
• 240 of the 300 images
Experimental results
54
Conclusion
• We proposed a pose and occlusion robust face alignment method.
• To solve the pose problem, we used multiple shape models.
• To solve the occlusion problem, we proposed the partial inference (PI) algorithm.
• We explicitly determine which part is occluded.
• We proposed Hierarchical MCT + Adaboost as the local feature detector to improve detection performance.
55
FUTURE WORK
56
Future work
• We will combine the generative approach (Active Appearance Model) with the discriminative approach (the local feature detector).
• Current facial feature tracking
• AAM with temporal matching, template update, and motion estimation
57
• Problem in facial feature tracking
• Drift problem
Future work
58
[Figure: AAM fitting loop — the appearance error $E = \min_{\mathbf{p}, \boldsymbol{\alpha}} \| I_n(W(\mathbf{x}; \mathbf{p})) - A_{\text{AAM}}(\boldsymbol{\alpha}) \|^2$ is iteratively minimized: update the parameters ($\boldsymbol{\alpha} \leftarrow \boldsymbol{\alpha} + \Delta\boldsymbol{\alpha}$, $\mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p}$, shape $\mathbf{x} \leftarrow \mathbf{x}_0 + \sum_i p_i \mathbf{x}_i$), check the convergence condition, output the fit.]
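To make the iterative-update loop concrete, here is a toy version (entirely schematic: a synthetic one-parameter "warp" and a linear appearance model stand in for the real AAM, and none of the names come from the thesis):

```python
import numpy as np

def aam_fit(sample_appearance, A0, A, n_iter=50, lr=0.5):
    """Schematic AAM-style loop on a toy linear model: repeatedly form
    the appearance error ||g(p) - A0 - A @ alpha||^2, update alpha in
    closed form, and take a gradient step on the warp parameter p.
    `sample_appearance(p)` plays the role of the warped image I(W(x; p))."""
    p = np.zeros(1)              # toy warp parameter
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = sample_appearance(p)
        resid = g - A0 - A @ alpha
        # closed-form appearance update
        alpha = alpha + np.linalg.lstsq(A, resid, rcond=None)[0]
        # numerical gradient step on p (stand-in for the analytic Jacobian)
        eps = 1e-4
        e0 = np.sum((sample_appearance(p) - A0 - A @ alpha) ** 2)
        e1 = np.sum((sample_appearance(p + eps) - A0 - A @ alpha) ** 2)
        p = p - lr * (e1 - e0) / eps
    return p, alpha
```

The drift problem noted above arises because nothing in this loop anchors p to independently detected feature points; the point constraint sketched on the next slides adds exactly that anchor.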
Future work
• By local feature detection result,
• we can constrain the aligned feature points by AAM to the local
feature detector.
59
Future work
60
[Figure: point-constrained AAM fitting of input I_n — in addition to the appearance error, a point error $E_{\text{pts}}$ between the AAM feature points $(x_i, y_i)$ and the selected local-detector feature points is minimized in the iterative parameter update (point constraint).]
Future work
• Using the local feature detection result,
• we can build a validation matrix for the AAM for robust fitting.
• After alignment,
• we run the feature detector on the aligned feature points,
• and determine whether each point is occluded or not.
• Based on this feature-occlusion information, we build the validation matrix of the AAM for robust fitting.
• The validation matrix is used for robust AAM fitting on the next input image.
61
Future work
62
[Figure: for input I_n, the point-constrained AAM fitting loop is followed by an occlusion decision — the local feature detector classifies each aligned point as positive or negative (x1-pos., x2-neg., …, xn-pos.), building the validation matrix.]
Future work
63
[Figure: for the next input I_{n+1}, the validation matrix weights the appearance error (robust appearance error) inside the same point-constrained fitting loop.]
Thank you.
64