WANGYUHENG
11812310D
Supervisor: Professor Kenneth Lam
Facial Image Analysis for Age,
Gender and Ethnicity Classification
Contents
Background
Objective
Methodology
Results and Conclusion
Reference
Q&A
Background
Demographic information: age, ethnicity, gender
Fundamental in human identification and verification
Wide applications in security and human-computer interaction
Common features used: grid features, local binary pattern, Gabor features
Common learning algorithms: support vector machine, AdaBoost algorithm
Objective
This project aims to classify the demographic information of human faces.
It discusses the performance of PCA coefficients, local binary patterns and local circular patterns for demographic classification.
[Diagram: three classification tasks — gender classification (male / female), ethnicity classification (Asian / non-Asian), and age classification (child / youth / other age groups).]
Methodology
Normalization: in this research, the normalized image is 192x168, with 100 pixels between the two eye centers.
Step 1: obtain normalized training face images
Step 2: represent each face image as a face vector
Step 3: compute the mean face vector
Step 4: subtract the mean face vector to get the demeaned face vectors
Step 5: compute the covariance matrix (A^T A)
Step 6: compute the eigenvectors v_i and eigenvalues
The eigenvalues of A^T A and A A^T are the same, and the eigenvectors are related by u_i = A v_i (if A^T A v_i = λ_i v_i, then A A^T (A v_i) = λ_i (A v_i)).
Principal component analysis
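The small-covariance trick in steps 5 and 6 can be sketched in Python; the tiny random "faces" and sizes here are illustrative stand-ins, not the project's 192x168 data:

```python
import numpy as np

# Toy stand-ins: 10 "faces" of 24x21 pixels instead of 192x168.
rng = np.random.default_rng(0)
n, d = 10, 24 * 21
faces = rng.random((n, d))                 # steps 1-2: one face vector per row

mean_face = faces.mean(axis=0)             # step 3: mean face vector
A = (faces - mean_face).T                  # step 4: demeaned faces as columns (d x n)

# Steps 5-6: eigen-decompose the small n x n matrix A^T A instead of the
# huge d x d matrix A A^T; both share the same nonzero eigenvalues.
eigvals, V = np.linalg.eigh(A.T @ A)       # eigenvalues in ascending order
eigenfaces = A @ V                         # u_i = A v_i are eigenvectors of A A^T
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)

# Check the relationship: A A^T (A v_i) = lambda_i (A v_i) for the top pair.
u, lam = eigenfaces[:, -1], eigvals[-1]
assert np.allclose((A @ A.T) @ u, lam * u)
```

The point of the trick is that A^T A is only n x n (number of training images), while A A^T is d x d (number of pixels), which would be far too large to decompose directly.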
Step 1: transform each training image into K coefficients, for k = 1, ..., K.
Each face in the database is transformed into its eigenface components, and the weights form a weight vector, where k represents its class.
Step 2: given a query image, compute its corresponding weight vector.
The input face is classified as class m if the distance e_m between its weight vector and that of class m (Euclidean distance) is the minimum among all classes.
Principal component analysis
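The projection and nearest-neighbour rule above can be sketched as follows; the orthonormal "eigenfaces", sizes and labels are hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
d, K = 300, 5                                     # toy sizes (real images: 192x168)
eigenfaces = np.linalg.qr(rng.random((d, K)))[0]  # stand-in orthonormal eigenfaces
mean_face = rng.random(d)

train = rng.random((8, d))                        # 8 hypothetical training faces
labels = ["male", "female"] * 4                   # hypothetical class labels

# Step 1: project every training face onto the K eigenfaces -> weight vectors.
W = (train - mean_face) @ eigenfaces              # shape (8, K)

def classify(query):
    """Step 2: project the query and return the label of the nearest
    weight vector under Euclidean distance."""
    w = (query - mean_face) @ eigenfaces
    dists = np.linalg.norm(W - w, axis=1)
    return labels[int(np.argmin(dists))]

# Sanity check: a training image matches itself (distance zero).
assert classify(train[3]) == labels[3]
```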
[Figure: the basic LBP operator and multi-resolution LBP operators, e.g. LBP(4,1) and LBP(8,2).]
Uniform Pattern
If binary pattern contains at most two bitwise transitions from
0 to 1 or vice versa when the bit pattern is considered circular.
For LBPH, one single bin for non-uniform patterns, a separate bin
for every uniform pattern.
00000000(0 transition) 00000001(1 transition)
01110000(2 transitions) 01010011(6 transitions)
Local binary pattern
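The transition count and the basic 3x3 operator can be sketched in Python (the neighbour ordering is one common convention, chosen here for illustration):

```python
import numpy as np

def transitions(pattern, bits=8):
    """Count circular 0<->1 transitions in an integer bit pattern."""
    rotated = (pattern >> 1) | ((pattern & 1) << (bits - 1))
    return bin(pattern ^ rotated).count("1")

def is_uniform(pattern):
    """Uniform = at most two transitions when read circularly."""
    return transitions(pattern) <= 2

def lbp_code(patch):
    """Basic LBP for a 3x3 patch: threshold the 8 neighbours (clockwise
    from the top-left) at the centre value and read them as a byte."""
    center = patch[1, 1]
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(1 << i for i, v in enumerate(neighbours) if v >= center)

# The slide's examples (circular transition counts):
assert transitions(0b00000000) == 0
assert transitions(0b01110000) == 2
assert transitions(0b01010011) == 6
assert is_uniform(0b01110000) and not is_uniform(0b01010011)

patch = np.array([[9, 1, 1],
                  [9, 5, 1],
                  [9, 9, 9]])
assert is_uniform(lbp_code(patch))   # one solid run of set bits
```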
Chi-square distance: histograms are compared with χ²(x, y) = Σ_i (x_i − y_i)² / (x_i + y_i).
Local binary pattern
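The chi-square histogram distance above can be sketched as (the small epsilon guarding empty bins is an implementation choice, not from the slides):

```python
import numpy as np

def chi_square(x, y, eps=1e-10):
    """Chi-square distance between two histograms:
    sum_i (x_i - y_i)^2 / (x_i + y_i); eps avoids division by zero."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sum((x - y) ** 2 / (x + y + eps)))

h1 = np.array([4, 0, 2, 6])
h2 = np.array([2, 1, 2, 7])
assert chi_square(h1, h1) == 0.0   # identical histograms -> zero distance
assert chi_square(h1, h2) > 0.0
```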
Bilinear interpolation
• First, do linear interpolation in the x-direction
• Second, do linear interpolation in the y-direction
When a sampling point is not at the center of a pixel, we apply bilinear interpolation over the four surrounding pixels to obtain its value.
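The two interpolation passes can be written out directly (interior points only; boundary handling is omitted in this sketch):

```python
import numpy as np

def bilinear(img, x, y):
    """Sample img at real-valued (x, y): linear interpolation first
    along x, then along y, using the four surrounding pixels."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    top = (1 - dx) * img[y0, x0] + dx * img[y0, x0 + 1]            # x-direction
    bottom = (1 - dx) * img[y0 + 1, x0] + dx * img[y0 + 1, x0 + 1]
    return (1 - dy) * top + dy * bottom                            # y-direction

img = np.array([[0.0, 10.0],
                [20.0, 30.0]])
assert bilinear(img, 0.5, 0.5) == 15.0   # average of the four corners
assert bilinear(img, 0.0, 0.0) == 0.0    # exactly on a pixel centre
```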
Spatially enhanced histogram
Encodes both the appearance and the spatial relations of facial regions:
• Divide the face image into m facial regions
• Compute a histogram independently within each of the m regions
• Concatenate them into the spatially enhanced histogram of size m x n, where n is the length of a single histogram
Local circular pattern
Binary quantization vs. clustering quantization
For each pixel with gray value t and its P neighboring pixels with gray values {t1, t2, ..., tP}, located on a circular neighborhood of radius R, the local circular pattern is p_LCP(P, R) = (t1 − t, t2 − t, ..., tP − t)^T.
Given N training local circular patterns p_i, i = 1, 2, ..., N, the k-means clustering algorithm is applied to find a partition C = {c1, c2, ..., ck} minimizing the within-cluster sum of squared distances.
Distance function: Euclidean distance (L2) or city-block distance (L1).
A query local circular pattern p' = (p1', p2', ..., pP') is then quantized to the label of its nearest cluster mean.
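The LCP vector extraction can be sketched as below; the circle-sampling angles and orientation are one common convention, chosen here for illustration, with bilinear interpolation for off-grid neighbours:

```python
import numpy as np

def lcp_vector(img, cx, cy, P=8, R=1):
    """Local circular pattern at (cx, cy): the differences (t_p - t)
    between P neighbours sampled on a circle of radius R and the
    centre value t. Off-grid neighbours use bilinear interpolation."""
    t = img[cy, cx]
    diffs = []
    for p in range(P):
        ang = 2 * np.pi * p / P
        x, y = cx + R * np.cos(ang), cy - R * np.sin(ang)
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        dx, dy = x - x0, y - y0
        tp = ((1 - dx) * (1 - dy) * img[y0, x0]
              + dx * (1 - dy) * img[y0, x0 + 1]
              + (1 - dx) * dy * img[y0 + 1, x0]
              + dx * dy * img[y0 + 1, x0 + 1])
        diffs.append(tp - t)
    return np.array(diffs)

img = np.arange(25, dtype=float).reshape(5, 5)
v = lcp_vector(img, 2, 2, P=8, R=1)
assert v.shape == (8,)        # one difference per sampled neighbour
```

Unlike LBP, the differences are kept as real values rather than thresholded to bits; quantization happens later via clustering.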
Local circular pattern
K-means clustering: an unsupervised learning algorithm for the clustering problem.
• Randomly select k local circular patterns from the sampled LCPs as the initial means
• For every sampled pattern, calculate its distance to each cluster mean
• Assign the pattern to the nearest cluster
• Recalculate the cluster means and repeat until the stopping criterion is met
In this project, 59 clusters are used.
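The steps above can be sketched with plain NumPy; random 8-dimensional vectors stand in for sampled LCPs, and a fixed iteration count stands in for the stopping criterion:

```python
import numpy as np

def kmeans(patterns, k, iters=20, seed=0):
    """Plain k-means: pick k random patterns as initial means, assign
    each pattern to its nearest mean (Euclidean), recompute means."""
    rng = np.random.default_rng(seed)
    means = patterns[rng.choice(len(patterns), k, replace=False)]
    for _ in range(iters):
        # Distance of every pattern to every mean -> (N, k) matrix.
        d = np.linalg.norm(patterns[:, None, :] - means[None, :, :], axis=2)
        assign = d.argmin(axis=1)              # nearest-cluster assignment
        for c in range(k):                     # recompute non-empty means
            if np.any(assign == c):
                means[c] = patterns[assign == c].mean(axis=0)
    return means, assign

rng = np.random.default_rng(3)
pats = rng.random((500, 8))          # toy stand-ins for 8-dim LCPs
means, assign = kmeans(pats, 59)     # 59 clusters, as in the project
assert means.shape == (59, 8)
assert assign.shape == (500,)
```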
Adaboost Algorithm
AdaBoost is an algorithm that selects good features from a huge set of features to form an effective classifier. It is a learning algorithm that boosts classification performance by combining weak classifiers, each only slightly better than a random guess. After each round of learning, the weight of every training sample is updated according to the classification result.
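A minimal decision-stump AdaBoost illustrates the feature-selection view described above; this is a generic sketch on toy data, not the project's exact implementation:

```python
import numpy as np

def adaboost(X, y, rounds=10):
    """Each round picks the single feature (e.g. one histogram bin) whose
    threshold stump best classifies the reweighted samples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                    # uniform sample weights
    stumps = []
    for _ in range(rounds):
        best = None
        for j in range(d):                     # candidate features
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, j] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()   # weighted error
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # weak-classifier weight
        pred = np.where(sign * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)         # reweight: mistakes gain weight
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def predict(stumps, X):
    score = sum(a * np.where(s * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, s in stumps)
    return np.where(score >= 0, 1, -1)

# Toy data: the label is the sign of feature 0; AdaBoost should find it.
rng = np.random.default_rng(4)
X = rng.normal(size=(40, 5))
y = np.where(X[:, 0] >= 0, 1, -1)
stumps = adaboost(X, y, rounds=5)
assert (predict(stumps, X) == y).mean() == 1.0
```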
Experiment setting

Task                         Database               Training images   Testing images
Gender classification        FERET database         200               50
Ethnicity classification     FERET database         200               50
Age classification (child)   FG-NET Aging database  700               120 (only child)
Age classification (youth)   FG-NET Aging database  700               34 (only youth)
Age classification (all)     FG-NET Aging database  700               180 (all groups)
Parameter of PCA Method
Weight vector configurations:
total; remove first 5; remove first 50; remove first 100;
remove last 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100
Parameter of LBP and LCP Method
Operator types: Basic LBP; P=8, R=1; P=8, R=2; P=8, R=3
Window sizes: 7x8 windows, 6x6 windows, 4x4 windows

LCP Method — Adaboost algorithm
Each bin in the histogram is treated as a single feature.
Number of features selected (per window size):
7x8 windows: 50, 100, 200, 300, 400, 500, 1000, 2000, 3000
6x6 windows: 50, 100, 200, 300, 400, 500, 1000, 2000
4x4 windows: 50, 100, 200, 300, 400, 500, 1000
Age classification specification
Age group standard: child 0–14, youth 15–45, old >45
For each age group, there are two binary classes (non-A vs. A), and each group has its own classification rate.
E.g. child classification rate: c for child, d for non-child.
PCA Flow chart
1. Input test image
2. Feature selection (weight vector from projection onto the low-dimensional eigenspace)
3. Apply Euclidean distance to match one face in the database
4. Read the gender and race information of the matched face in the database
5. Read the gender and race information of the testing image
6. Compare the information
LBP Flow chart
1. Input test image
2. Feature selection (spatially enhanced histogram of LBP codes)
3. Apply chi-square distance to match one face in the database
4. Read the gender, race and age information of the matched face in the database
5. Read the gender, race and age information of the testing image
6. Compare the information
LCP Flow chart
1. Input test image from the face database
2. Feature extraction, then feature selection (one-dimensional spatial histogram from LCP)
3. Put all labels and data of the training and testing images into a CSV file
4. Feed the data file to AdaBoost and train a strong classifier
5. Predict the gender, race and age information of the testing image
6. Read the true gender, race and age information of the testing image
7. Compare the information
Evaluation Method
Gender classification: gender recognition rate = number of correct gender predictions / number of all testing images
Ethnicity classification: ethnicity recognition rate = number of correct ethnicity predictions / number of all testing images
Age classification: age recognition rate (child) = number of correct child predictions / number of all child testing images
Age recognition rate (youth) = number of correct youth predictions / number of all youth testing images
Age recognition rate (old) = number of correct old predictions / number of all old testing images
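The recognition rates can be computed with a small helper; the prediction and ground-truth lists below are hypothetical examples:

```python
def recognition_rate(predicted, actual, group=None):
    """Fraction of correct predictions; if `group` is given, restrict
    the denominator to testing images whose true label is `group`."""
    pairs = list(zip(predicted, actual))
    if group is not None:
        pairs = [(p, a) for p, a in pairs if a == group]
    return sum(p == a for p, a in pairs) / len(pairs)

pred = ["child", "youth", "child", "old", "youth"]
true = ["child", "child", "child", "old", "youth"]
assert recognition_rate(pred, true) == 0.8             # overall: 4 of 5
assert recognition_rate(pred, true, "child") == 2 / 3  # child-only rate
```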
Gender classification

[Chart: PCA gender classification rates across weight-vector configurations, ranging from 32% to 82%.]

Fine differences between faces may have a negative effect on gender classification. Hence, when some of the last elements of the weight vector are removed, the classification rate improves compared with using the entire weight vector.
[Chart: LBP gender classification rate by operator — Basic LBP 72%, LBP8,1 72%, LBP8,2 62%, LBP8,3 62%.]

[Chart: LBP gender classification rate by window size — 7x8: 72%, 6x6: 66%, 4x4: 64%.]
• Texture information is more representative than the holistic method.
• Basic LBP captures more detailed information than the other local binary pattern operators; the order of classification rates matches the order of information captured.
• The classification rate does not change much with window size, but the feature length differs greatly. For the 7x8 window, the feature length is 56x59 = 3304, while for the 6x6 window it is 36x59 = 2124. The feature length is reduced by about one third while the accuracy drops by only 8% in gender classification.
[Charts: LCP gender classification rates — LCP8,1 peaks at 84%, LCP8,2 peaks at 80%, LCP8,3 peaks at 80%.]
LCP8,1 achieves 74% when the number of features selected is 100 out of 3304, and its best performance is an 84% classification rate. LCP8,2 achieves 67% when only 50 features are selected, and its best performance is 80%, lower than LCP8,1. LCP8,3 achieves 66% with 50 selected features, and its best performance is also 80%.

LBP operator   Performance   LCP operator   Best performance
LBP8,1         72%           LCP8,1         84%
LBP8,2         62%           LCP8,2         80%
LBP8,3         62%           LCP8,3         80%
[Chart: LCP gender classification rate by window size — 7x8: 84%, 6x6: 81%, 4x4: 80%.]

LCP outperforms LBP in gender classification for all operator types, improving the classification rate by more than 10%. The clustering quantization method and the AdaBoost learning algorithm boost the ability to capture gender information from face images.

Window 7x8 achieves at best an 84% classification rate, window 6x6 achieves 81%, and window 4x4 achieves 80%. Performance decreases as the windows become larger (i.e. as the image is divided into fewer regions).
Ethnicity classification

PCA ethnicity classification rate: remove last 5: 58%; remove last 50: 40%; remove last 100: 32%; remove first 5: 68%; remove first 50: 70%; remove first 100: 70%; total: 70%.

For ethnicity, the fine differences between faces are of the utmost importance. Hence, removing the last elements of the weight vector decreases the classification rate.
[Charts: LBP ethnicity classification rate by window size (7x8, 6x6, 4x4) and by operator (Basic LBP, LBP8,1, LBP8,2, LBP8,3), in the 84–94% range.]
LBP Ethnicity classification rate
One possible reason that LBP8,3 achieves the same performance as basic LBP is that the scale of the extracted information represents ethnicity well. Although LBP8,1 and LBP8,2 can also represent ethnicity information, the amount of effective ethnicity information they capture may be lower than that of LBP8,3.
The classification rate does not change much with window size: it goes from 92% to 88%, and then to 84%. The feature length is reduced by about one third while the accuracy drops by only 4% in ethnicity classification.
[Charts: LCP ethnicity classification rates — LCP8,1 up to 94%, LCP8,2 around 81–85%, LCP8,3 around 80–88%.]

LCP8,1 achieves 90% when the number of features selected is 100 out of 3304, and its best performance is a 94% classification rate. However, LCP8,2 and LCP8,3 cannot exceed the performance of LBP8,2 and LBP8,3.
[Chart: LCP ethnicity classification rate by window size — 7x8: 94%, 6x6: 90%, 4x4: 87%.]

LBP operator   Performance   LCP operator   Best performance
LBP8,1         90%           LCP8,1         94%
LBP8,2         88%           LCP8,2         84%
LBP8,3         92%           LCP8,3         86%

The best performance of LCP8,2 and LCP8,3 is lower than that of the corresponding LBP operators. At radius 2 and radius 3, binary quantization gives the better quantization and hence the higher classification rate. At radius 1, however, the LCP rate is higher than the LBP rate, which shows that the radius-1 scale makes better use of clustering quantization.

Window 7x8 achieves at best a 94% classification rate, window 6x6 achieves 90%, and window 4x4 achieves 87%. Performance decreases as the windows become larger.
Age classification

[Charts: LBP child and youth classification rates by operator and window size.]

LBP8,3 is more representative of age information for the child group than the other three LBP operators. Basic LBP stands out in youth classification because of the detailed information it captures, without causing over-fitting. LBP8,3 and LBP8,2 are more representative than LBP8,1 of age information for the youth group.
In the 4x4 window case (youth), the classification rate is slightly higher than with the 6x6 window size; we may ignore this because the difference is small.
[Charts: LCP child, youth and old classification rates versus the number of selected features (50–3000).]
For the child classification rate, LCP8,1 achieves the best performance compared with LCP8,2 and LCP8,3. The best performance of LCP8,1 is 87.5%, and the best performances of LCP8,2 and LCP8,3 are both 84%. Although their final best performances are the same, for each feature-selection option the classification rate of LCP8,2 is in general slightly better than that of LCP8,3. Compared with the LBP method, the LCP method performs better in general: k-means clustering quantization is effective and represents the image more reasonably.

LBP operator   Performance   LCP operator   Best performance
LBP8,1         75%           LCP8,1         87.5%
LBP8,2         78%           LCP8,2         84%
LBP8,3         81%           LCP8,3         84%
For the youth classification rate, LCP8,1 again achieves the best performance compared with LCP8,2 and LCP8,3. The best performance of LCP8,1 is 91%, that of LCP8,2 is 80%, and that of LCP8,3 is 79%. Compared with the LBP method, LCP performs much better: the classification rate increases by 15% on average, which once again demonstrates the effectiveness of the clustering quantization method and the AdaBoost algorithm for feature selection.

LBP operator   Performance   LCP operator   Best performance
LBP8,1         62%           LCP8,1         91%
LBP8,2         68%           LCP8,2         80%
LBP8,3         68%           LCP8,3         79%
For old classification, these experiments show no consistent pattern in the classification rate, mainly because of insufficient data for old images. Although the FG-NET Aging database has 1001 pictures of 82 people, images above age 45 are scarce. With such limited training patterns, no conclusion can be drawn regarding old classification.
General Observation — Adaboost Algorithm
• With an increasing number of selected features, the classification rate increases accordingly, with a few exceptions. But once the number of features exceeds a certain point, for example 500 or 1000, the classification rate no longer increases. This rule is verified by every experiment in gender, race and age classification.
• The increase of classification rate with the number of features is due to training error; it is only consistent when the number of features is smaller than 200.
Conclusion

Gender classification:
Operator   Number of features used   Window size   Classification rate
LCP8,1     500                       7x8           84%

Ethnicity classification:
Operator   Number of features used   Window size   Classification rate
LCP8,1     200                       7x8           94%
Age classification:
        Operator   Number of features used   Window size   Classification rate
Child   LCP8,1     400                       7x8           87.5%
Youth   LCP8,1     1000                      7x8           91%
[Decision tree: test image → child vs. non-child; non-child → youth vs. non-youth; non-youth → old.]
Conclusion
• Texture information is more representative in classifying
demographic information. However, pure LBP is not enough
because of its sensitivity to noise and less descriptive power.
Binary quantization is the main reason to lead to these two
disadvantages. Clustering quantization method can alleviate the
problem and have a more descriptive ability to represent the image.
• When the image is divided into more windows, the classification
result should be better. But the computation complexity will
increase. In real experiment, we can find a proper operator, which
is a good tradeoff between recognition performance and
computation complexity.
• For feature selection by adaboost algorithm, not all histogram
bins are selected, we can still achieve good performance, because
the difference of demographic information lies in certain local
circular pattern.
Reference
[1] P. Viola and M. J. Jones, Robust real-time object detection, Proc. of IEEE Workshop on Statistical and Computational Theories of Vision, 2001.
[2] H. Bai, J. Wu and C. Liu, Motion and Haar-like features based vehicle detection, 12th International Multi-Media Modelling Conference Proceedings, 2006.
[3] T. Ojala, M. Pietikainen and T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, 2002.
[4] D. Huang, H. Ding, C. Wang, Y. Wang, G. Zhang and L. Chen, Local circular patterns for multi-modal facial gender and ethnicity classification, Image and Vision Computing, vol. 32, no. 12, pp. 1181–1193, 2014.
[5] X. Lu, H. Chen and A. Jain, Multimodal facial gender and ethnicity identification, Proceedings of the 2006 International Conference on Advances in Biometrics, pp. 554–561, 2006.
[6] Y. Gao, Y. Wang, X. Feng and X. Zhou, Face recognition using most discriminative local and global features, 18th International Conference on Pattern Recognition, pp. 351–354, 2006.
[7] S. Arivazhagan, J. Mumtaj and L. Ganesan, Face recognition using multi-resolution transform, International Conference on Computational Intelligence and Multimedia Applications, pp. 301–306, 2007.
[8] L. Wang, Y. Li, C. Wang and H. Zhang, 2D Gabor face representation method for
Q&A