HYBRID-BOOST LEARNING FOR MULTI-POSE FACE DETECTION AND FACIAL EXPRESSION RECOGNITION

Hsiuao-Ying Chen Chung-Lin Huang Chih-Ming Fu

Pattern Recognition, Volume 41, Issue 3, March 2008, Pages 1173-1185

Student: Lourdes Ramírez Cerna.

INTRODUCTION

As we know, face detection has many applications, such as surveillance and human-computer interfaces. Nevertheless, most published methods are restricted: they do not handle varying poses or noisy, defocused images.

This paper proposes a hybrid-boost learning algorithm that selects Gabor features and Haar-like features to provide the most discriminating information for the strong classifier in the final stage. Finally, the authors compare their experimental results with those of other methods.

SEGMENTATION OF POTENTIAL FACE REGIONS

Segmentation of potential face regions consists of skin-color detection and segmentation.

For skin-color detection, they analyze the color of the pixels in RGB color space to decrease the effect of illumination changes, and then classify each pixel as face-color or non-face-color based on its hue component only.

The Bayesian decision rule can be expressed as follows: if p(c(i)|face) / p(c(i)|non-face) > t, then pixel i, with color c(i) = (r(i), g(i), b(i)), belongs to a face region; otherwise it lies in a non-face region. The threshold is t = p(non-face) / p(face).
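The likelihood-ratio test above can be sketched in a few lines. This is only an illustration: the function name `skin_mask`, the quantized color histograms `p_face_hist` and `p_nonface_hist`, and the bin count are assumptions, not details given in the slides.

```python
import numpy as np

def skin_mask(pixels, p_face_hist, p_nonface_hist, prior_face, bins=32):
    """Label each RGB pixel as face (True) or non-face (False).

    pixels: uint8 array of shape (..., 3).
    p_face_hist, p_nonface_hist: hypothetical (bins, bins, bins) arrays
    holding p(c|face) and p(c|non-face) for quantized colors.
    """
    # Quantize the 8-bit channels into histogram bin indices.
    idx = (pixels.astype(np.int64) * bins) // 256
    r, g, b = idx[..., 0], idx[..., 1], idx[..., 2]
    like_face = p_face_hist[r, g, b]
    like_nonface = p_nonface_hist[r, g, b]
    # Threshold t = p(non-face) / p(face).
    t = (1.0 - prior_face) / prior_face
    # Cross-multiplied form of the ratio test, so empty bins do not divide by zero.
    return like_face > t * like_nonface
```

With equal priors the threshold is t = 1, so a pixel is labeled face whenever its color is more likely under the face histogram than under the non-face one.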

HYBRID-BOOST LEARNING FOR FACE DETECTION

Gabor features (global features): obtained from the normalized 24 × 24 image block; they carry more detailed frequency and orientation information.

Haar-like features (local features): computed over blocks of various sizes (varying width and length).
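As an illustration of a local rectangle feature, the sketch below evaluates a two-rectangle (left minus right) Haar-like feature over a block of arbitrary width and height. The integral-image trick and the function names are assumptions borrowed from common practice; the slides do not say how the features are computed.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h box whose top-left corner is (x, y)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(ii, x, y, w, h):
    """Left half minus right half of a w-by-h block (w is assumed even)."""
    half = w // 2
    return box_sum(ii, x, y, half, h) - box_sum(ii, x + half, y, half, h)
```

Each box sum costs four table lookups regardless of block size, which is why rectangle features stay cheap even when many block sizes are tried.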

A Gabor filter is the product of a 2D Gaussian and a complex exponential function.
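A kernel of this form can be sketched as a Gaussian envelope multiplied by a complex exponential carrier. The parameters σ, γ and θ match the ones named for the Gabor features; the `wavelength` parameter, the default values, and the function name are assumptions made only for this example.

```python
import numpy as np

def gabor_kernel(size=24, sigma=4.0, gamma=0.5, theta=0.0, wavelength=8.0):
    """Complex Gabor kernel: 2D Gaussian envelope times complex exponential."""
    half = size // 2
    # Pixel coordinates centered on the kernel.
    y, x = np.mgrid[-half:size - half, -half:size - half].astype(np.float64)
    # Rotate the coordinates by the orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope; gamma sets the aspect ratio.
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    # Complex sinusoid oscillating along the rotated x axis.
    carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
    return envelope * carrier
```

One plausible feature response is the magnitude of the inner product between the kernel and a 24 × 24 patch, for example `np.abs(np.sum(patch * gabor_kernel()))`.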

Hybrid features: For Gabor features: position (x, y), σ, γ and θ.

For Haar-like features: height (H) and width (W).

Finally, each feature is defined as x = (t, x, y, p1, p2), where t = 1 indicates a Gabor feature and t = 2–8 specifies a Haar-like feature.

Soft-decision function for weak classifiers

We create a pool of 2D soft-decision functions for the weak classifiers of class wl before the hybrid-boost learning.

The soft-decision function for a weak classifier is denoted as:

where P(b(f(x))) is the histogram of the response of feature x over all training data, P(l) is the prior of class l, and P(b(f(x))|l) is the conditional probability.
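Given these three quantities, the soft decision is presumably the Bayes posterior P(l | b(f(x))) = P(b(f(x))|l) P(l) / P(b(f(x))). That form is an assumption here, since the equation itself is not reproduced. A minimal one-dimensional sketch under that assumption, with hypothetical function names and an arbitrary bin count:

```python
import numpy as np

def fit_soft_decision(responses, labels, target, n_bins=8):
    """Return bin edges and the per-bin posterior P(target | bin) for one feature."""
    responses = np.asarray(responses, dtype=np.float64)
    labels = np.asarray(labels)
    edges = np.linspace(responses.min(), responses.max(), n_bins + 1)
    # b(f(x)): histogram bin index of each feature response, in 0..n_bins-1.
    bins = np.clip(np.digitize(responses, edges[1:-1]), 0, n_bins - 1)
    is_target = labels == target
    p_bin = np.bincount(bins, minlength=n_bins) / len(bins)      # P(b(f(x)))
    p_l = is_target.mean()                                       # P(l)
    p_bin_given_l = (np.bincount(bins[is_target], minlength=n_bins)
                     / max(int(is_target.sum()), 1))             # P(b(f(x)) | l)
    # Bayes' rule; bins that no training sample fell into get posterior 0.
    posterior = np.divide(p_bin_given_l * p_l, p_bin,
                          out=np.zeros(n_bins), where=p_bin > 0)
    return edges, posterior

def soft_decision(response, edges, posterior):
    """Look up the posterior for a new feature response."""
    b = int(np.clip(np.digitize(response, edges[1:-1]), 0, len(posterior) - 1))
    return posterior[b]
```

The output is a value in [0, 1] rather than a hard label, which is what makes the decision "soft".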

Since there are many possible features and posterior functions (i.e., 8 × 24 × 24 × 8 × 8 = 294,912), the hybrid-boost learning algorithm selects only the most discriminant features from these hybrid features.
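The size of the pool follows directly from the descriptor (t, x, y, p1, p2). The sketch below enumerates it, assuming that the two trailing factors of 8 correspond to eight quantized values each for p1 and p2; the slides give the product but do not label its factors.

```python
from itertools import product

N_TYPES = 8   # t = 1 (Gabor) plus t = 2..8 (Haar-like)
BLOCK = 24    # positions x and y inside the 24 x 24 block
N_P1 = 8      # assumed number of quantized values for p1
N_P2 = 8      # assumed number of quantized values for p2

def feature_space():
    """Iterate over every descriptor (t, x, y, p1, p2) in the hybrid pool."""
    return product(range(1, N_TYPES + 1), range(BLOCK), range(BLOCK),
                   range(N_P1), range(N_P2))

def feature_count():
    """Closed-form size of the pool: 8 * 24 * 24 * 8 * 8."""
    return N_TYPES * BLOCK * BLOCK * N_P1 * N_P2
```

Boosting evaluates candidates from this pool in every round, which is why selecting only a small discriminative subset matters.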

MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION

Fig. 6 illustrates the formulation for profile face detection and expression recognition, where the face data are categorized into five classes of differently posed faces (i.e., pose angles of −90°, −45°, 0°, 45°, and 90°) and six classes of different expressions (i.e., happy, anger, sad, surprise, fear, and disgust).


The multi-class hybrid-boost learning algorithm


Different-posed face detection

The FERET face database consists of 14,051 eight-bit grayscale images of human faces with different poses, ranging from frontal to left and right profiles.


However, the strong classifier with the highest response does not necessarily indicate the correct class. So, we define the following two decision rules for selecting the correct strong classifier.


Facial expression recognition

Similar to multi-pose face detection, they apply the hybrid-boost learning for facial expression recognition and then compare the results with other facial expression systems.

The training data for the facial expression classifiers come from the Cohn and Kanade Facial Expression Database.

The input training images, which cover seven different expressions (happy, anger, sad, surprise, fear, disgust and neutral), are normalized to a standard size of 24 × 24 pixels.

EXPERIMENTAL RESULTS AND DISCUSSIONS

Detection rate (DR) = 99.44%

To decrease the false-alarm rate, they add 2000 misclassified samples and find that the false-alarm rate is reduced to 1.72%, with an average detection rate of 99.24%.

Here, they divide the whole training data, which consists of 1000 face images for each pose and 5000 non-face images, into four groups (m = 4). The average detection rate is reduced to about 94%.

DT = 93.1% when the testing data is the same as the training data.
