CHAPTER 1

INTRODUCTION

Face recognition is a complex image-processing problem in real-world applications, where illumination, occlusion, expression, pose variation and imaging conditions all affect the live images. Facial analysis performs face detection before face recognition. Face detection finds the position of the face in a given image. Face recognition identifies the person in a given image by comparing it with the known structured properties of the faces in a database, and both steps are common in computer vision applications. The database images share common properties such as the same resolution, the same facial feature components and similar eye alignment; these images are referred to as standard images. Face detection locates the faces and extracts face images that contain the major facial features used to distinguish faces: the eyes, eyebrows, nose and mouth. Face recognition then compares the test image with the standard images using the common features extracted.

A face recognition system is a biometric information processing system. Compared to other biometric information processing systems (e.g., fingerprint, iris scanning and signature), a face recognition system has a larger working range. Face recognition identifies people using the essential characteristics of their faces and is used in many authentication applications. Compared to other biometrics, such as fingerprint, DNA or voice, face recognition is more natural and nonintrusive, and it can be used without the cooperation of the subject. Owing to recent advances in pattern recognition and the availability of powerful computers, face recognition systems now operate in real time and achieve satisfactory performance under controlled conditions, which opens up many potential applications. Automated face recognition draws on techniques from several research fields, such as computer vision, image processing, pattern recognition and machine learning. Computer vision applications are widely used in digital cameras, mobile phones, security areas, cars, toys, hospitals and airports.

The primary applications of face recognition are:

• Person verification (one-to-one matching): The face image of an unknown individual, together with a claim of identity, is compared with the stored face image of the claimed identity to establish whether the individual is who he or she claims to be.

• Person identification (one-to-many comparison): The face image of an unknown individual is compared with the face images of known individuals in the database to establish the identity of the person.

Face recognition can be used for both purposes and has several application areas, a few of which are described below.

• Security

Face recognition systems are used to control access to buildings, airports, harbors, ATMs and border checkpoints, and for network security and email authentication on multimedia workstations.

• Surveillance

A large number of CCTV cameras are used to monitor and look for known criminals, drug offenders, etc.; when such a person is located, the authorities can be notified.

• General identity verification

Face recognition can be used for electoral registration, banking, electronic commerce, identifying newborns, national IDs, passports, drivers’ licenses and employee IDs.

• Criminal justice systems

Facial recognition systems are also useful for mug-shot and booking systems, post-event analysis and forensic analysis.

• Image database investigations

Database investigations, such as searching for licensed drivers, benefit users; they are also used to identify missing children and immigrants.

• “Smart Card” applications

In certain cases the face image can be stored in a smart card, barcode or magnetic stripe, and authentication is performed by matching the live image with the stored template.

In multimedia environments with adaptive human-computer interfaces, a face recognition system can also be part of ubiquitous or context-aware systems, for example for behavior monitoring at childcare or old people’s centers, or for recognizing customers and assessing their needs. Face recognition systems can also be used for video indexing, for labeling faces in video and for witness face reconstruction.

1.1 BIOMETRIC RECOGNITION

Biometric recognition refers to the automatic recognition of individuals based on their physical and/or behavioral characteristics. Humans intuitively use some common characteristics to recognize each other. Any human physiological or behavioral measurement (Jain et al 2005) can be used as a biometric characteristic if it satisfies the following requirements:

• Universality, each person should have the characteristic.

• Distinctiveness, any two persons should be sufficiently different in terms of the characteristic.

• Permanence, the characteristic should not vary significantly over a period of time.

• Collectability, the characteristic should be quantitatively measurable.

In a practical system, the following parameters are also important:

• Performance, which refers to the recognition accuracy and

speed.

• Acceptability, which indicates the extent to which people are willing to accept the system in everyday life.

• Circumvention, which indicates how easily the system can be

fooled using fraudulent methods.

Any biometric system operates in one of two modes:

• Verification

• Identification

In the verification mode, the system compares the captured biometric data with the template of the claimed identity stored in the database, in a one-to-one comparison. This mode is typically used for positive recognition.

In the identification mode, the system recognizes the user by searching the templates of all users in the database. In this case, the comparison is one-to-many. This mode is typical of negative recognition applications.
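The difference between the two modes can be illustrated with a minimal sketch. It assumes, purely for illustration, that each template is a short feature vector, that templates are compared by Euclidean distance, and that a match is declared when the distance is at most a threshold. The names `verify`, `identify` and `gallery` and all numeric values are hypothetical and do not come from any particular system.

```python
import math

def verify(probe, template, threshold):
    """One-to-one comparison against the template of the claimed identity."""
    return math.dist(probe, template) <= threshold

def identify(probe, gallery, threshold):
    """One-to-many comparison: closest enrolled identity, or None if none is close enough."""
    best_id = min(gallery, key=lambda user_id: math.dist(probe, gallery[user_id]))
    if math.dist(probe, gallery[best_id]) <= threshold:
        return best_id
    return None

# Hypothetical enrolled templates (made-up feature vectors).
gallery = {
    "alice": [0.0, 0.0, 1.0],
    "bob": [1.0, 0.0, 0.0],
}
probe = [0.1, 0.0, 0.9]

print(verify(probe, gallery["alice"], threshold=0.5))  # claim: "I am alice"
print(identify(probe, gallery, threshold=0.5))         # no claim: search everyone
```

Verification touches a single stored template, whereas identification scans the whole gallery, so its cost grows with the number of enrolled users.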

A typical biometric system includes four main modules:

• The sensor module that captures the biometric data.

• The feature extraction module where the features are extracted

by processing the data.

• The matcher module in which the features extracted are

compared to the features stored in templates. The decision

making module may also be an integral part of matcher

module where the user’s identity is confirmed (verification) or

established (identification).

• The system database module is used to store the biometric

templates.

In any biometric system the main focus is on the feature extraction module. A number of biometric systems are used in real applications, each with its own strengths and weaknesses; the choice mostly depends on the application. Some of the commonly used biometric systems include:

DNA System:

The DNA is a unique code for one’s individuality but it is used mostly in

forensic applications. It’s not useful in automatic real-time recognition

applications.

Ear Recognition System:

The ear recognition is based on matching the distance of salient points on the

pinna, but it is not very distinctive in establishing the identity of a user.

Finger Print System:

Fingerprint systems offer very high matching accuracy at a reasonable price, but they require a large amount of computational resources, and the fingerprints of elderly persons may change or fail to be recognized.

Gait System:

Gait is the peculiar way one walks and is a very complex biometric. It is not very distinctive, but it can be used in low-security applications. This method is computationally expensive.

Hand Geometry Recognition System:

The hand geometry recognition system is very simple, easy to use and

relatively cheap, but the geometry is not so distinctive. It can be used in

verification mode. It is one of the earliest automated biometric systems.

Iris System:

The iris is the annular region of the eye bounded by the pupil and the sclera (the white of the eye). The iris texture carries very distinctive information useful for recognition. Early iris-based recognition systems required considerable user participation and were expensive, but newer systems have become more user-friendly and cost-effective.

Retinal Scan System:

The retinal scan is one of the most secure biometrics: the retinal vasculature is not easy to change or replicate, and it is characteristic of each individual and each eye. Image acquisition requires the user to look into an eyepiece and focus on a specific spot so that a predetermined part of the retinal vasculature can be imaged. It therefore requires a high degree of cooperation from the subject and contact with the eyepiece. These factors can affect the public acceptability of retinal biometrics.

Signature Recognition System:

The signature has been accepted as a verification method, but it can change over a period of time and can be influenced by the physical and emotional condition of the subject.

Voice Recognition System:

Voice is another biometric that is not very distinctive; it changes considerably over a period of time and is not useful for large-scale identification.

The samples of the same biometric characteristic from the same

person may not always be equal due to imperfect imaging conditions, changes

in the user’s characteristic, ambient conditions and user’s interaction with the

sensor.

The two major types of error in a biometric verification system are the false match and the false non-match. When biometric measurements from two different persons are recognized as coming from the same person, the error is called a false match; when two biometric measurements from the same person are recognized as coming from two different persons, the error is called a false non-match. These errors are also called false acceptance and false rejection, respectively. There is a tradeoff between the false match rate and the false non-match rate, since both are functions of the system threshold t: if t is decreased to make the system more tolerant to input variations and noise, the false match rate increases; if t is increased to make the system more secure, the false non-match rate increases.
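The tradeoff can be made concrete with a small numerical sketch. It assumes that the matcher outputs a similarity score and declares a match when the score is at least t. The scores below are made-up values, and `error_rates` is a hypothetical helper name.

```python
def error_rates(genuine_scores, impostor_scores, t):
    """Return (false match rate, false non-match rate) at threshold t.

    A comparison is declared a match when its similarity score is >= t.
    """
    false_matches = sum(1 for s in impostor_scores if s >= t)
    false_non_matches = sum(1 for s in genuine_scores if s < t)
    return false_matches / len(impostor_scores), false_non_matches / len(genuine_scores)

# Made-up similarity scores, for illustration only.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]   # comparisons of the same person
impostor = [0.12, 0.35, 0.48, 0.70, 0.22]  # comparisons of different persons

for t in (0.3, 0.5, 0.8):
    fmr, fnmr = error_rates(genuine, impostor, t)
    print(f"t={t:.1f}  FMR={fmr:.2f}  FNMR={fnmr:.2f}")
```

On these scores the lowest threshold produces only false matches and the highest produces only false non-matches, which is the behavior described above.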

Two more recognition errors include the failure to capture and the

failure to enroll; the first corresponds to the number of times the biometric

device fails to automatically capture a sample, and the second denotes the

number of times users cannot enroll in the recognition system.

The applications of biometrics can be divided into three main groups, namely commercial, government and forensic applications. Commercial applications, which perform positive recognition, can work in both verification and identification mode, whereas government and forensic applications, which perform negative recognition, mostly require identification.

Another important aspect of a biometric system is its interaction with the user and its effect on his or her privacy. If the interaction is easy and comfortable the system will be readily accepted, and if little cooperation and participation are required of the user the system may be perceived as more convenient. On the other hand, systems that do not require user participation may be perceived as a threat to privacy.

In addition to these applications, the underlying techniques in the

current face recognition system have also been modified and used for related

applications such as gender classification, expression recognition and pose

recognition and each of these has its utility in various domains.

Facial expression recognition can be used in medicine for intensive care monitoring, while facial feature detection and recognition can be used to track a vehicle driver’s eyes and thus monitor fatigue, as well as for stress detection. Face recognition is also being used in conjunction with other biometrics such as speech, iris, fingerprint, ear and gait recognition in order to enhance the recognition performance of these systems. This has been made possible by the availability of robust algorithms.

Face detection is a straightforward task for humans; in fact, humans recognize faces far more quickly than face recognition systems do. The task is challenging for a machine, however, because the face image captured by a vision sensor is altered by pose variations (out-of-plane rotation) due to changes in camera angle, and by illumination, facial expressions and occlusions (glasses, sunglasses, hats). This makes the face recognition process more complex. Face detection is one of the oldest problems in computer vision, dating back as early as 1972. Although the face detection systems developed in the past few years perform well, face detection remains an interesting problem, because pattern recognition by computer systems is still not completely understood when compared with the performance of a human being. The first step in a face recognition system is to acquire an image from a camera. The second step is to detect the face region in the acquired image. The third step is face recognition, which takes the face images output by the detection stage. The steps of the face recognition system are shown in Figure 1.1.

Figure 1.1 Stages involved in face processing (Input Image, Face Detection, Face Recognition)

Many problems are closely related to face detection. A general view of the different face processing problems is shown in Figure 1.2.

Figure 1.2 Face Recognition problems (an input image affected by pose variation, illumination variation, expression variation and presence of occlusion)

Face tracking methods continuously estimate the location, and possibly the orientation, of a face in an image sequence, while facial expression recognition concerns identifying the affective states (e.g., happy, sad and surprised) of humans. For any of these systems the first step is face detection.

1.2 FACTORS THAT IMPINGE ON FACE RECOGNITION SYSTEM

The factors affecting the overall performance of face recognition systems and portable solutions are given below:

• Camera distortion and noise problems.

• A human face is a 3D object and a non-rigid body; it changes with the mood of the subject, which generates facial expressions. A smiling face and a frowning face are completely different from the perspective of face recognition. To overcome these problems the search area can be reduced to the eyes and nose, excluding the mouth and ears.

• Changes in illumination direction can affect the quality and description of the 2D image representation. Light sources differ in spectrum, can be artificial or natural, and change during the day. These effects can be alleviated using image enhancement techniques.

• The subject’s appearance can change over time due to age and to voluntary changes in facial outlook.

• The pose variation problem is one of the hardest problems in face recognition. There can be variations in translation, scale and rotation. According to Chan et al (2010), translation can be easily handled using a windowing method. The scaling problem is also easy to solve by creating an image pyramid (a collection of the same image at different resolutions) to represent the input image. Rotation about the axis perpendicular to the image plane can be solved by rotating the image back. The hardest problem is handling rotations out of the image plane, because they can cause occlusions. An occluded face is usually not suitable for recognition and is not selected from image databases or videos.

• A complex image background can also affect a face recognition system; it can be removed by applying a face detector before face recognition to reduce the search area.

• Facial makeup and hair style are less influential than other facial variations. A face recognition system usually requires the user’s cooperation on this problem.
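The image pyramid mentioned under pose variation can be sketched as follows. This is a minimal illustration that assumes a grey-level image held in a NumPy array and halves the resolution at each level by averaging 2 x 2 blocks. Practical systems often smooth with a Gaussian filter before subsampling, and the function names and the `min_size` parameter are assumptions made for this example.

```python
import numpy as np

def downsample_by_two(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = image.shape
    h, w = h - h % 2, w - w % 2  # drop an odd trailing row or column
    blocks = image[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def build_pyramid(image, min_size=8):
    """Return [full, 1/2, 1/4, ...] while the shorter side stays >= min_size."""
    levels = [image.astype(np.float64)]
    while min(levels[-1].shape) // 2 >= min_size:
        levels.append(downsample_by_two(levels[-1]))
    return levels

# Synthetic 64 x 48 image standing in for a real photograph.
image = np.arange(64 * 48, dtype=np.float64).reshape(64, 48)
pyramid = build_pyramid(image)
print([level.shape for level in pyramid])
```

A fixed-size detection window applied to every level then responds to faces of different sizes in the original image.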

The face biometric is affected by a number of intrinsic (e.g., expression and age) and extrinsic (e.g., pose and lighting) variations. Face recognition performance has improved significantly during the past decade, but it is still below the acceptable level for many applications. The first problem that needs to be addressed in face recognition is face detection. More effort has been devoted to 2D face recognition because of the availability of commodity 2D cameras and the deployment opportunities in many security scenarios. However, 2D face recognition is sensitive to a variety of factors encountered in practice, including pose and lighting variations, expression variations, age variations and facial occlusions. The problems that affect face detection and recognition systems are discussed in detail below.

1.2.1 Lighting Variation

The difference between face images of the same person under severe lighting variation can be more significant than the difference between face images of different persons. Because the face is a 3D object, different lighting sources generate different illumination conditions and shading on the face image. Many methods have been developed to study invariant facial features that are robust against lighting variations, and to compensate for lighting variations using prior knowledge of the lighting sources learned from training data. Examples of face images affected by illumination are shown in Figure 1.3.

Figure 1.3 Example of varying illumination

These methods provide visually enhanced face images after lighting normalization and show improved recognition accuracy. Although the performance of face recognition systems in indoor settings has reached a certain level, face recognition in outdoor settings remains a challenging topic because of the illumination problem. Many illumination-invariant face recognition approaches have been introduced to overcome this challenge. In addition, various illumination normalization methods have been proposed; these remove the illumination variation to some extent and so facilitate the face recognition process.
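The text does not name a specific normalization method. As one simple and widely used example, the sketch below applies global histogram equalization to an 8-bit grey-level image. The choice of method and the function name `equalize_histogram` are assumptions made for illustration, and the input is a synthetic under-exposed image rather than a real face.

```python
import numpy as np

def equalize_histogram(image):
    """Spread the grey levels of a uint8 image over the full 0..255 range."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    span = cdf[-1] - cdf_min
    if span == 0:                      # constant image: nothing to equalize
        return image.copy()
    lut = np.round((cdf - cdf_min) / span * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]                  # apply the lookup table pixel by pixel

# Synthetic dark image whose grey levels occupy only a narrow range.
rng = np.random.default_rng(0)
dark = rng.integers(10, 60, size=(32, 32), dtype=np.uint8)
bright = equalize_histogram(dark)
print(dark.min(), dark.max(), "->", bright.min(), bright.max())
```

Global equalization corrects overall under- or over-exposure only; shadows cast by directional lighting generally need local or model-based normalization.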

1.2.2 Occlusion

Figure 1.4 Example of occluded face images

Figure 1.4 shows examples of face images affected by partial occlusion. Face images often appear occluded by other objects or by the face itself (i.e., self-occlusion), especially in surveillance videos. Most commercial face recognition systems reject an input image when the eyes cannot be detected. Occlusions cause erroneous facial feature localization. Local feature based methods have been proposed to overcome the occlusion problem. When the misalignment problem is solved, very high correct recognition rates can be achieved with a generic local appearance-based face recognition algorithm. In the case of lower face occlusion, only a slight decrease in performance is observed when a local appearance-based face representation is used. This indicates the importance of local processing when dealing with partial face occlusion. Moreover, improved alignment increases the correct recognition rate even in experiments performed on lower face occlusion, which shows that face registration plays a key role in face recognition performance. The challenge lies in obtaining the alignment points. Normally, eye points are used for face alignment, but sunglasses or other occlusions prevent the detection of the eye coordinates. Alignment then needs many points from different parts of the face, yet detecting many points on a face is not realistic in real-world conditions. This problem is overcome by the proposed patch-based feature extraction.

1.2.3 Facial Expression

Facial expression is an intrinsic variation that causes large intra-class variation. Some local feature based approaches and 3D model based approaches are designed to handle the expression problem. The recognition of facial expressions is itself an active research area in human-computer interaction and communications. Facial expression is a less significant issue than occlusion, viewing angle and illumination, but it still affects face recognition results.

Figure 1.5 Example of varying expression

An example of face images of a single person with varying expression is shown in Figure 1.5. Although a closed eye or a smiling face affects the recognition rate by 1% to 10%, a face with a broad laugh has an influence as high as 30%, since a laughing face changes the facial appearance and distorts the correlation between the eyes, mouth and nose. As a result the features are grouped into different classes, which sharply increases the false alarm rate. Much research focuses on small changes of the face surface; large changes in expression are still an open problem. Reconstruction of the face components is one way to tackle it, and recent research that remodels the face using texture, shape and spatial frequency decomposition addresses this challenge.

1.2.4 Pose Variation

Pose variation degrades the performance of a face recognition system. A face image may appear different depending on the direction from which the face is imaged. Thus, images of the same subject taken from two different viewpoints (intra-user variation) may appear more different than images of two different subjects taken from the same viewpoint (inter-user variation). Face images affected by pose variation are shown in Figure 1.6.

Figure 1.6 Example of varying pose

In a surveillance system, the camera is usually mounted where people cannot reach it. Because the camera is mounted high up, the faces it views appear in different poses (i.e., at varying angles). This is the simplest case in city surveillance applications. The most difficult case is when people pass through the camera view without looking directly at the lens. People’s behavior in public places cannot be restricted, yet recognition in such cases must still be accurate. However, even state-of-the-art techniques are limited to angles of 10 or 15 degrees when recognizing a face, and recognizing faces at larger angles is a further challenge. The most significant face features are lost at an angle of 25 or 30 degrees, so system reliability decreases sharply. The techniques proposed so far have not given good results in actual working conditions, since many other factors add to the pose problem in outdoor environments. In many face recognition scenarios the poses of the probe and gallery images differ. For example, the gallery image might be a frontal “mug-shot” and the probe image might be a 3/4 view captured by a camera in the corner of a room. Approaches addressing pose variation can be classified into two main categories, depending on the type of gallery images they use.

Multi-view face recognition is a direct extension of frontal face recognition in which the algorithms require gallery images of every subject at every pose. Face recognition across pose is concerned with building algorithms that recognize a face from a novel viewpoint, i.e., a viewpoint from which it has not previously been seen.

The method for acquiring face images depends upon the underlying application. Surveillance applications may best be served by capturing face images with a video camera, while image database investigations may require static intensity images taken by a standard camera. Some other applications, such as access to top-security domains, may even necessitate forgoing the nonintrusive quality of face recognition by requiring the user to stand in front of a 3D scanner or an infra-red sensor.

1.3 FACE DETECTION

Depending on the face data acquisition methodology, face detection techniques can be broadly divided into three categories: methods that operate on intensity images, those that deal with video sequences, and those that require other sensory data such as 3D information or infra-red imagery. The various techniques of face detection include:

• Feature based method

• Template matching

Figure 1.7 Face detection methods (diagram labels: Face Detection, Image-Based Methods, Knowledge-Based Methods)

1.3.1 Feature Based Method

The feature based method extracts invariant features of the face image, that is, features that remain stable under variations in facial expression and pose. The features used in this technique include the distance between the eyes, the eyebrows, the size of the lips, the nose, etc. Many methods have been proposed to extract the features of the face image, and many statistical models built on the extracted features have been developed for face detection.

Figure 1.8 Geometrical features (white) of the face image

The main advantage of the feature-based techniques is that such

methods are relatively robust to position variations in the input image. In

principle, feature-based schemes can be made invariant to size, orientation

and/or lighting. Other benefits of these schemes include the compactness of

representation of the face images and high speed matching. The major

disadvantage of these approaches is the difficulty of automatic feature

detection (as discussed above) and the fact that the implementer of any of

these techniques has to make arbitrary decisions about which features are

important. After all, if the feature set lacks discrimination ability, no amount

of subsequent processing can compensate for that intrinsic deficiency.
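The size invariance mentioned above can be illustrated with a small sketch. It assumes that landmark coordinates (eye centres, nose tip, mouth corners) are already available, which is exactly the difficult automatic detection step noted in the text. The landmark names and coordinates are hypothetical. Dividing every distance by the inter-ocular distance makes the features independent of image scale.

```python
import math

def geometric_features(landmarks):
    """Landmark distances, each divided by the inter-ocular distance."""
    left_eye, right_eye = landmarks["left_eye"], landmarks["right_eye"]
    eye_distance = math.dist(left_eye, right_eye)
    eye_midpoint = tuple((a + b) / 2 for a, b in zip(left_eye, right_eye))
    return {
        "eye_midpoint_to_nose": math.dist(eye_midpoint, landmarks["nose_tip"]) / eye_distance,
        "eye_midpoint_to_mouth": math.dist(eye_midpoint, landmarks["mouth_center"]) / eye_distance,
        "mouth_width": math.dist(landmarks["mouth_left"], landmarks["mouth_right"]) / eye_distance,
    }

# Hypothetical (x, y) pixel coordinates of hand-marked landmarks.
landmarks = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose_tip": (50.0, 60.0),
    "mouth_center": (50.0, 80.0),
    "mouth_left": (38.0, 80.0),
    "mouth_right": (62.0, 80.0),
}
print(geometric_features(landmarks))
```

The resulting vector has only three numbers, which reflects the compactness and matching speed claimed for feature-based schemes, but its usefulness depends entirely on whether the chosen distances discriminate between people.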

1.3.2 Template Matching

Template matching can be subdivided into two approaches: feature-based and template-based matching. The feature-based approach uses features of the search and template images, such as edges or corners, as the primary match-measuring metrics to find the best matching location of the template in the source image. The template-based, or global, approach uses the entire template, generally with a sum-comparing metric (sum of absolute differences (SAD), sum of squared differences (SSD), cross-correlation, etc.) that determines the best location by testing all, or a sample, of the viable locations within the search image at which the template may match.
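The template-based approach with a sum-comparing metric can be sketched as an exhaustive search. The example assumes grey-level images stored as NumPy arrays and tests every viable location. `match_template` is a hypothetical helper written for this illustration, not a library function, and the search image is random data with the template cut out of it.

```python
import numpy as np

def match_template(search, template, metric="ssd"):
    """Test every valid location; return (row, col, score) of the lowest score."""
    search = search.astype(np.float64)
    template = template.astype(np.float64)
    th, tw = template.shape
    sh, sw = search.shape
    best = (0, 0, np.inf)
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            diff = search[r:r + th, c:c + tw] - template
            # SAD sums absolute differences, SSD sums squared differences.
            score = np.abs(diff).sum() if metric == "sad" else (diff ** 2).sum()
            if score < best[2]:
                best = (r, c, score)
    return best

rng = np.random.default_rng(0)
search = rng.random((40, 40))
template = search[12:20, 25:33].copy()  # patch cut out at row 12, column 25
print(match_template(search, template, metric="ssd"))
print(match_template(search, template, metric="sad"))
```

The cost is proportional to the number of tested locations times the template area, which is why testing only a sample of the locations becomes attractive for large images.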

1.3.2.1 Feature based

If the template image has strong features, a feature-based approach may be considered. The approach may prove even more useful if the match in the search image might be transformed in some fashion. Since this approach does not consider the entire template image, it can be more computationally efficient when working with high-resolution source images. The alternative, template-based approach may require searching a potentially large number of points in order to determine the best matching location.

1.3.2.2 Template based approach

For templates without strong features, or when the bulk of the template image constitutes the matching image, a template-based approach may be effective. As mentioned above, template-based matching may require sampling a large number of points. The number of sampling points can be reduced in three ways: by reducing the resolution of the search and template images by the same factor and performing the operation on the resulting downsized images (multi-resolution, or pyramid, image processing); by providing a search window of data points within the search image so that the template does not have to be tested at every viable data point; or by a combination of both.

1.4 FACE RECOGNITION

Face recognition is the process of identifying a particular person if

he or she belongs to the group of members present in the database. There are

numerous methods for face recognition. The various face recognition

algorithms are classified as shown in Figure 1.9.

Figure 1.9 Face recognition methods

1.4.1 2D Face Recognition

1.4.1.1 Linear/Non-linear

Automatic face recognition is a pattern recognition problem that is very hard to solve because of its nonlinearity. In particular, it is a template matching problem in which recognition has to be performed in a high-dimensional space. The higher the dimension of the space, the longer the computation time needed to find a match, so a dimensionality reduction technique can be used to project the problem into a lower-dimensional space. Some dimensionality reduction techniques are discussed below.

The Eigenfaces method (Kirby & Sirovich 1990) can be considered one of the first approaches of this kind. In the eigenface method an $N \times N$ image I is linearized into a vector of length $N^2$, so that it represents a point in an $N^2$-dimensional space. A low-dimensional space is then found by means of a dimensionality reduction technique.

(Figure 1.9 labels: Face; 2D Recognition, with Linear/Non-Linear and Neural Network; 3D Recognition)

This is done with the principal component analysis (PCA) method, in which, after linearization, the mean vector is calculated over all images and subtracted from each of the vectors corresponding to the original faces. The covariance matrix is then computed in order to extract a limited number of its eigenvectors, those corresponding to the largest eigenvalues. These few eigenvectors, also referred to as eigenfaces, form a basis of a low-dimensional space. When a new image has to be tested, its eigenface expansion is computed and compared against the entire database according to a chosen distance measure. Because the PCA is performed only when training the system, the method is very fast when testing new face images. The PCA has been used intensively in face recognition applications.
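The training and testing steps described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the implementation used in the cited work. Each face is assumed to be already linearized into a row vector, and the "faces" are random vectors standing in for real images. The eigenvectors of the covariance matrix are obtained from a singular value decomposition of the centred data, which yields the same eigenvectors without forming the $N^2 \times N^2$ covariance matrix explicitly.

```python
import numpy as np

def train_eigenfaces(faces, num_components):
    """faces: (num_images, N*N) array with one linearized image per row."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are covariance eigenvectors, sorted by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:num_components]

def project(face, mean_face, eigenfaces):
    """Coefficients of the eigenface expansion of one linearized image."""
    return eigenfaces @ (face - mean_face)

def nearest_identity(probe, gallery_weights, labels, mean_face, eigenfaces):
    """Label of the gallery image closest to the probe (Euclidean distance)."""
    weights = project(probe, mean_face, eigenfaces)
    distances = np.linalg.norm(gallery_weights - weights, axis=1)
    return labels[int(np.argmin(distances))]

# Synthetic data: 3 subjects, 4 noisy images each, N = 16.
rng = np.random.default_rng(0)
n = 16
prototypes = rng.random((3, n * n))
labels = np.repeat(np.arange(3), 4)
faces = prototypes[labels] + 0.05 * rng.standard_normal((12, n * n))

mean_face, eigenfaces = train_eigenfaces(faces, num_components=5)
gallery_weights = (faces - mean_face) @ eigenfaces.T

probe = prototypes[1] + 0.05 * rng.standard_normal(n * n)  # new image of subject 1
print(nearest_identity(probe, gallery_weights, labels, mean_face, eigenfaces))
```

Testing a new image costs one projection onto five vectors plus twelve distance computations, which illustrates why the method is fast once training is done.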

Linear Discriminant Analysis (LDA) (Martinez & Kak 2001) is an alternative to the PCA. It explicitly provides discrimination among the classes, whereas the PCA deals with the input data in their entirety without paying any attention to the underlying class structure. The main aim of the LDA is to find basis vectors that provide the best discrimination among the classes by maximizing the between-class differences and minimizing the within-class ones. According to Martinez & Kak (2001), although the LDA can outperform the PCA, it provides better classification performance only when a large training set is available. Recent studies also support this argument; the problem is referred to as the Small Sample Size (SSS) problem.
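The discrimination criterion can be illustrated for the simplest case of two classes, where the Fisher direction is proportional to $S_w^{-1}(m_a - m_b)$, with $S_w$ the within-class scatter matrix and $m_a$, $m_b$ the class means. The sketch uses synthetic two-dimensional data with many samples per class. With fewer samples than dimensions $S_w$ becomes singular and the linear solve fails, which is the small sample size problem mentioned above. All names and data are illustrative assumptions.

```python
import numpy as np

def within_class_scatter(class_a, class_b):
    """Sum of the scatter matrices of the two classes."""
    centered_a = class_a - class_a.mean(axis=0)
    centered_b = class_b - class_b.mean(axis=0)
    return centered_a.T @ centered_a + centered_b.T @ centered_b

def fisher_direction(class_a, class_b):
    """Unit vector proportional to Sw^-1 (mean_a - mean_b)."""
    mean_diff = class_a.mean(axis=0) - class_b.mean(axis=0)
    w = np.linalg.solve(within_class_scatter(class_a, class_b), mean_diff)
    return w / np.linalg.norm(w)

def fisher_criterion(w, class_a, class_b):
    """Between-class separation along w divided by within-class scatter along w."""
    mean_diff = class_a.mean(axis=0) - class_b.mean(axis=0)
    between = float(w @ mean_diff) ** 2
    within = float(w @ within_class_scatter(class_a, class_b) @ w)
    return between / within

# Synthetic classes: large spread along x, small spread along y.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=[3.0, 0.3], size=(200, 2))
class_b = rng.normal(loc=[2.0, 1.5], scale=[3.0, 0.3], size=(200, 2))

w = fisher_direction(class_a, class_b)
naive = class_a.mean(axis=0) - class_b.mean(axis=0)  # plain difference of means
print(fisher_criterion(w, class_a, class_b), fisher_criterion(naive, class_a, class_b))
```

The Fisher direction leans toward the low-variance axis and scores higher on the criterion than the plain difference of means, because it accounts for the within-class scatter as well as the separation of the means.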

In some approaches, such as Fisherfaces, the PCA is used as a preliminary step to reduce the dimensionality of the input space, and the LDA is then applied to the resulting space to perform the actual classification. According to Chen et al (2000) and Yu & Yang (2001), when PCA and LDA are combined, discriminant information may be discarded together with redundant information. Thus, in some cases, the LDA is applied directly to the input space (Chen et al 2000; Yu & Yang 2001).

The Discriminant Common Vectors (DCV) method (Cevikalp et al 2005) represents a further development of this approach. The main idea of the DCV is to collect the similarities among the elements of the same class and drop their dissimilarities. In this way each class can be represented by a common vector computed from the within-class scatter matrix. When an unknown face has to be tested, the corresponding feature vector is computed and assigned to the class with the nearest common vector. The main disadvantage of the PCA, LDA and Fisherfaces is their linearity. In particular, the PCA extracts a low-dimensional representation of the input data by exploiting only the covariance matrix, so that no more than first- and second-order statistics are used.

Bartlett Marian et al (2002) argued that first- and second-order statistics hold information only about the amplitude spectrum of an image and discard the phase spectrum, whereas the human capability to recognize objects is mainly driven by the phase spectrum. To overcome this problem, Independent Component Analysis (ICA) was introduced (Bartlett Marian et al 2002) as a more powerful classification tool for the face recognition problem. The ICA can be considered a generalization of the PCA that provides three main advantages: (1) it allows a better characterization of data in an n-dimensional space; (2) the vectors found by the ICA are not necessarily orthogonal, so that they also reduce the reconstruction error; (3) they capture discriminant features by exploiting not only the covariance matrix but also the higher-order statistics.

1.4.1.2 Neural networks

Another nonlinear solution to the face recognition problem is the use of neural networks. Neural networks are widely used in many other pattern recognition problems and have been adapted to cope with the person authentication task. The advantage of neural classifiers over linear ones is that they can reduce misclassifications among neighboring classes. The basic idea is to consider a net with a neuron for every pixel in the image. However, because of the pattern dimensions (an image has a dimension of about 256 × 256 pixels), neural networks are not trained directly on the input images; instead, a dimensionality reduction technique is applied first.

Cottrell & Fleming (1990) introduced a neural net that operates in auto-association (AA) mode. First, the face image, represented by a vector x, is approximated by a new vector h of smaller dimension by the first network (auto-association); then h is used as input for the classification net. According to Cottrell & Fleming, the AA neural network does not perform better than the eigenfaces, even in optimal circumstances. Other kinds of neural networks have also been tested for face recognition, in order to exploit their particular properties.

The Self-Organizing Map (SOM) is invariant with respect to minor changes in the image sample, while convolutional networks provide partial invariance with respect to rotations, translations and scaling. In general, the structure of the network is strongly dependent on its application field, so that different contexts result in quite different networks. Lin et al (1997) proposed the Probabilistic Decision Based Neural Network, which they modeled for three different applications, namely a face detector, an eye localizer and a face recognizer. The flexibility of these networks is due to their hierarchical structure with nonlinear basis functions and a competitive credit assignment scheme.

Meng et al (2002) introduced a hybrid approach in which the most discriminating features are extracted through the PCA and used as the input of a Radial Basis Function (RBF) neural network. RBF networks perform well for face recognition problems, as they have a compact topology and a fast learning speed. However, this approach raises several problems. The network is prone to overfitting when the dimension of the network input is comparable to the size of the training set. A high input dimension also results in slow convergence. The sample size has to grow exponentially to obtain a reliable estimate of the multivariate densities as the dimension increases. Finally, there is the singularity problem: if the number of training patterns is less than the number of features, the covariance matrix is singular. In general, neural network based approaches encounter problems when the number of classes increases. Moreover, they are not suitable for a single model image recognition task, because multiple model images per person are necessary to train the system to an optimal parameter setting.
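The singularity problem mentioned above can be checked numerically. The following is a small sketch, assuming NumPy and random synthetic data; the sample and feature counts are arbitrary choices for illustration.

```python
import numpy as np

# Fewer training patterns than features (assumed sizes for illustration).
n_samples, n_features = 10, 50
rng = np.random.default_rng(2)
X = rng.normal(size=(n_samples, n_features))

# Sample covariance of the features: a 50 x 50 matrix.
cov = np.cov(X, rowvar=False)

# After mean removal, the rank cannot exceed n_samples - 1,
# so the matrix is singular and cannot be inverted.
rank = np.linalg.matrix_rank(cov)
```

Here `rank` is at most 9 while the matrix is 50 × 50, so any classifier that needs the inverse covariance matrix requires either more samples or a prior dimensionality reduction.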

1.4.2 3D Face Recognition

The majority of face recognition methods based on 2D image processing, using monochrome or color images, reach a recognition rate higher than 90% under controlled lighting conditions and with cooperative subjects. Unfortunately, system performance drops in the presence of pose, illumination and expression variations, which 2D face recognition methods still find difficult to handle. Xu et al (2004) compared intensity images against depth images with respect to their discriminating power for recognizing people. The depth maps give a more robust face representation, because intensity images are heavily affected by changes in illumination. In general, 3D face recognition refers to a class of methods that work on a three-dimensional dataset, representing both face and head shape as range data or polygonal meshes. The main advantage of the 3D based approaches is that the 3D model retains all the information about the face geometry. Moreover, 3D face recognition can be seen as a further evolution of the 2D recognition problem, because a more accurate representation of the facial features leads to a potentially higher discriminating power. In a 3D face model, facial features are represented by local and global curvatures that can be considered as the real signature for identifying persons. The 3D facial representation is a promising tool for coping with many of the human face variations, extra-personal as well as intra-personal.

Two main representations are commonly used to model faces in 3D applications: 2.5D and 3D images, as shown in Figure 1.10. A 2.5D image (range image) consists of a two-dimensional representation of a 3D point set (x, y, z), where each pixel in the X–Y plane stores the depth value z. One can think of a 2.5D image as a grey-scale image in which black pixels correspond to the background, while the whitest pixels represent the surface points nearest to the camera. A 2.5D image taken from a single viewpoint allows only the facial surface to be modeled, not the whole head. This problem is solved by taking several scans from different viewpoints and building a 3D head model during a training stage. In contrast, 3D images are a global representation of the whole head, and the facial surface is further related to the internal anatomical structure, while 2.5D images depend on the external appearance as well as on environmental conditions. The simplest 3D face representation is a 3D polygonal mesh, which consists of a list of points (vertices) connected by edges (polygons). There are many ways to build a 3D mesh; the most common are combining several 2.5D images, properly tuning a 3D morphable model, or exploiting a 3D acquisition system (3D scanner). A further difference between 2.5D and 3D images is that the latter are not affected by self-occlusions of the face when the pose is not fully frontal.
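The 2.5D representation described above can be sketched as a conversion from a 3D point set to a grey-scale range image. This is a minimal illustration, assuming NumPy, integer pixel coordinates for x and y, z measured as distance from the camera, and at most one point per pixel; the function name `to_range_image` is hypothetical.

```python
import numpy as np

def to_range_image(points, shape):
    """Convert an (N, 3) array of (x, y, z) points into an 8-bit range image.
    Background stays 0 (black); the nearest surface point becomes 255 (white)."""
    image = np.zeros(shape, dtype=np.uint8)
    x = points[:, 0].astype(int)
    y = points[:, 1].astype(int)
    z = points[:, 2]
    z_min, z_max = z.min(), z.max()
    span = z_max - z_min if z_max > z_min else 1.0
    # Map depth to 1..255 so that surface pixels never collide with background.
    gray = 1 + np.round(254 * (z_max - z) / span).astype(np.uint8)
    image[y, x] = gray
    return image

# Three surface points on a 2 x 2 grid; pixel (x=0, y=1) has no point.
points = np.array([[0, 0, 10.0], [1, 0, 12.0], [1, 1, 14.0]])
img = to_range_image(points, (2, 2))
```

Because each pixel stores a single depth value, a surface that folds behind itself along the viewing direction cannot be represented, which is the self-occlusion limitation of 2.5D images noted above.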

Figure 1.10 (a) 2D image, (b) 2.5D image and (c) 3D image

Many criteria can be adopted to compare existing 3D face

algorithms by taking into account the type of problems they address or their

intrinsic properties. Some approaches perform very well only on faces with

neutral expression, while some others try also to deal with expression

changes. An additional parameter for measuring the robustness of 3D model based approaches is their sensitivity to size variation. In fact, the distance between the target and the camera can sometimes affect the size of the facial surface, as well as its height, depth, etc. Approaches exploiting a curvature-based representation cannot distinguish between two faces with similar shape but different size. To overcome this problem, some methods based on point-to-point comparison or on volume approximation are used. However, the absence of an appropriate standard dataset containing a large number and variety of people, whose images were taken with a significant time delay and with meaningful changes in expression, pose and illumination, is one of the great limitations to empirical experimentation with existing algorithms.

In particular, 3D face recognition systems are tested on proprietary databases, with few models and with a limited number of variations per model. Consequently, comparing the performance of different algorithms often turns into a difficult task. Nevertheless, the algorithms can be classified based on the type of problems they address, such as mesh alignment, morphing, etc.

1.5 MOTIVATION

The number of vision applications in various digital devices is increasing. Each of these applications requires efficient processing of image sequences using increasingly complex pattern recognition algorithms. Speeding up any of these tasks makes it possible to integrate more vision algorithms in a device. Different approaches have been taken to improve detection algorithms. In the face detection task, a cascade of classifiers has been a popular choice. Object detection has been sped up using a branch-and-bound algorithm within a particular detection framework. The time consumed for feature computation has been reduced by approximating the features at different scales. Some have implemented the existing methods on GPUs, which make use of parallel computing to speed up the detection task. A new alternative search approach is suggested for face detection and recognition, which may add value to some of the above-mentioned methods.

1.6 OBJECTIVE

An alternative search strategy is devised to detect faces in an image, which can perform reasonably well even when the faces differ in pose or expression or are partially occluded.

1. To detect the face region in the given face image with maximum accuracy and a reduced false alarm rate, using skin color and the statistical features of the simplified local binary mean (SLBM) with a support vector machine (SVM).

2. To detect the occlusion present in a partially occluded face image and recognize the face by comparing the SLBM and MBWM features with the features of the images in the face database.

3. To validate the proposed algorithms SLBM and MBWM for

face expression recognition.

4. To validate the proposed algorithms SLBM and MBWM for

pose detection.

1.7 CONTRIBUTIONS

The main contributions of this thesis are as follows:

1. Face detection system

A face detection system is proposed to reduce the number of missed detections using skin region detection and extracted SLBM features. The features of the skin region are used to predict the location of the face region, which is further verified by an SVM classifier. Theoretical insights into the LBP approach and the proposed SLBM approach, with respect to detection rate, accuracy, false detection rate and precision, are discussed in detail.

2. Pre-Processing for face recognition

Pre-processing helps to increase the recognition rate. Several pre-processing algorithms are applied to the detected face image to support the face recognition process, and illumination correction is achieved almost completely at this stage. The pre-processing steps performed include gamma correction, log transform, histogram equalization and local histogram equalization. Recognition performed on the pre-processed image gave a better result than recognition on the original image without pre-processing.

3. An alternate occlusion detection and face recognition technique

Particular importance is given to the occlusion detection problem. A novel algorithm is proposed for the detection of occlusion and the recognition of the face. Statistical LBP, SLBM and MBWM features are extracted, and an SVM is used to detect the occluded region. The features of the occlusion-free region are used for face image recognition.

4. Expression and Pose estimation

The proposed algorithms are extended to estimate the varying expressions in the face image. Expression variation mostly affects the eye region and the mouth region. Therefore, weighted feature extraction is employed to estimate the expression in the face image. In addition, the MBWMH algorithm is used to find the pose of the face in the image. These features are also compared with the LBPH and the SLBMH algorithms.
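The global pre-processing steps named in contribution 2 (gamma correction, log transform and histogram equalization) can be sketched as follows. This is a minimal illustration using textbook formulations on 8-bit grey-scale images, assuming NumPy; the function names, the scaling constants and the gamma value are assumptions and are not the parameters used in this thesis. Local histogram equalization is omitted.

```python
import numpy as np

def gamma_correction(img, gamma):
    """Apply s = 255 * (r / 255) ** gamma to an 8-bit image."""
    norm = img.astype(np.float64) / 255.0
    return np.round(255.0 * norm ** gamma).astype(np.uint8)

def log_transform(img):
    """Apply s = c * log(1 + r), with c chosen so that 255 maps to 255."""
    scale = 255.0 / np.log(256.0)
    return np.round(scale * np.log1p(img.astype(np.float64))).astype(np.uint8)

def histogram_equalization(img):
    """Global histogram equalization through the cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # count at the darkest used level
    denom = max(img.size - cdf_min, 1)        # avoid division by zero
    lut = np.round(255.0 * (cdf - cdf_min) / denom)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]                           # look up each pixel

# Small dark, low-contrast test image (assumption).
img = np.array([[50, 50], [100, 150]], dtype=np.uint8)
brighter = gamma_correction(img, 0.5)         # gamma < 1 brightens dark areas
compressed = log_transform(img)
equalized = histogram_equalization(img)
```

In this sketch, equalization stretches the occupied grey levels of `img` to the full 0 to 255 range, while a gamma below 1 and the log transform both lift dark values more than bright ones.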

1.8 ORGANIZATION OF THE ENTIRE THESIS

Chapter 1, the introduction, describes face detection and recognition and the purpose of the thesis. It also gives a detailed explanation of the applications of face detection and recognition systems and of the various methods used for face detection and recognition.

Chapter 2 reviews the background of face detection and recognition. It also compares the various methods proposed for face detection and recognition, with their merits and demerits.

Chapter 3 covers the face detection process and the pre-processing techniques. In this chapter, skin color based face detection and the illumination normalization methods, namely gamma intensity correction, log transform, histogram equalization and local histogram equalization, are explained in detail.

Chapter 4 describes occlusion detection and face recognition using LBP, SLBM and MBWM features. The feature analysis technique is also described in detail. Finally, the feature classification method used for face recognition is described.

Chapter 5 deals with some of the applications of SLBM and MBWM features to face expression and pose estimation. The components of Chapter 3 and Chapter 4 are used for the expression and pose estimation.

Chapter 6 gives the results of testing carried out using major face databases. Most of the major databases are used to measure the efficiency and robustness of the methods. A comparison with other systems is also provided, with graphs and tables.

Chapter 7 presents the conclusion, a summary of the overall work and possible extensions of the current work.
