1
RESEARCH POSTER PRESENTATION DESIGN © 2011 www.PosterPresentati ons.com Design of a Pattern Recognition and Analysis System for 3-D Imaging Humans have an extremely well developed sense of detecting, recognizing and analyzing patterns around them. Extending this ability to machines has proved to be a very challenging task. In this project, a visual object detection framework was developed that is capable of processing images extremely rapidly while achieving high detection rates. First, a face detection system for 2D, frontal, upright and grayscale images was constructed, which was further extended for colored, multi-pose and 3D imaging. The system was tested successfully over a large database of standard images and gave results comparable to best similar systems available. INTRODUCTION METHODS OF SOLUTION EXTENSION OF SYSTEM FOR 3-D IMAGES 3D face recognition methods can achieve significantly higher accuracy than their 2D counterparts, rivaling fingerprint recognition as they depend upon geometry of rigid features on the face. This avoids such pitfalls of 2D face recognition algorithms as change in lighting, different facial expressions, make-up and head orientation. In this project, 3-D face recognition is carried on a real time video stream. Video streams are generally of low resolution and contain mostly non-frontal faces. This drawback is overcome by using 3D models. The proposed system consists of three stages: (i) reconstructing a 3D face model from multiple non-frontal frames in a video (ii) generating a frontal view from the derived 3D model (iii) using the Viola Jones 2D face recognition engine to recognize the synthesized frontal view Figure: Structure from motion (SfM) method for producing 3D models and synthesis of frontal view from 2D models Structure from Motion Obtaining a 3D face model from a sequence of 2D images is an active research problem. Morphable model (MM), stereography, and Structure from Motion (SfM) are well known methods in 3D face model construction from 2D images or video. In this project, SFM was used for producing a 3D model from a monocular video stream. SFM consists of four main elements: acquiring 2D video data, identification of points track and tracking those points over time to produce motion tracks. Finally, the motion tracks are converted into a viable 3D model using 3D factorization. 72 facial feature points that outline the eyes, eyebrows, nose, mouth and facial boundary were used. REFERENCES To improve the initial method of face detection, three additional properties were added in the system: Skin colour, Multi-View face detection and Cost-sensitive AdaBoost algorithm. This increased both the accuracy and speed of the detector. The algorithms for face recognition were implemented on MATLAB. The ROC (Receiver operating characteristic) curves of both the implementations: 2-D and 3-D are shown below. For figure 1) As the detection rate increases, number of false positives also increase. For figure 2) Best accuracy is achieved, as expected, by the frontal video frames, followed by Non-frontal video frames with 3D modeling, followed by Non-frontal 2D video frames 1. Paul Viola, Michael J. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision 57(2), 2004. 2. Unsang Park, Anil K. Jain, 3D Model-Based Face Recognition in Video, 2 nd International Conference on Biometrics, Seoul, Korea, 2007 3. Deng Peng, Pei Mingtao, Multi-view Face Detection Based on AdaBoost and Skin Color, First International Conference on Intelligent Networks and Intelligent Systems Sahil Bhatia Third Year Undergraduate Department of Applied Physics Delhi Technological University New Delhi, India Email Id: [email protected] Contact Number: +91 965 062 6665 MODIFICATIONS IN THE VIOLA-JONES METHOD Undergraduate Student, Delhi Technological University Sahil Bhatia AUTHOR CONTACT DETAILS IMPLEMENTATION AND RESULTS Initially, for face detection, grayscale and frontal images were considered as the input problem set and a facial recognition system for these 2-D images was implemented using Viola -Jones method. The Viola-Jones method, given in year 2001, is world’s first real-time face detection system. The method consists of two stages as mentioned below: Training and Face Detection. For training, the CMU+MIT standard database of facial images was used. 1. 1 Creation of dataset CMU+MIT Dataset 4547- Face images, 2529 Non Face images Each of size 19x19 pixels in Grayscale 1.2 Calculation of feature intensity values Using rectangular features, calculate the intensity values of images in dataset Calculate Training Error: Total, Negative and Positive Errors 1.3 AdaBoosting of weak classifiers Update weights of misclassified classifiers A linear combination of weak classifier works as a strong classifier APPLICATIONS OF THE CURRENT SYSTEM Medical Image Analysis Figure- Left ventricle detection results on 2D MRI images. Red boxes show the ground truth, while cyan boxes show the detection results. 3-D Television Figure- A camera is used to detect if there are users wearing special 3D viewing glasses. Based on detection, display automatically switches between 2D and 3D viewing modes. Crowd Surveillance Figure- Video quality in crowd surveillance is poor and there is a great variety in pose and lighting. It is somehow compensated by using continuous feature detection License Plate Detection and Recognition Figure-Video stream in real-time. System consists of a detection and a character recognition module. Detector is based on AdaBoost CONCLUSION AND FUTURE WORK STAGE 1: Training STAGE 2: Face Detection Figure: Binary images after skin binarization Figure: Sample Multi-View image 2. 1 Image acquisition and pre-processing Image acquisition is done through a normal webcam On-line image input by triggering a video every 1 second and it is rescaled It is also converted into grayscale and variance normalized :Mean=0, Variance=1 2.2 Calculation of feature intensity values A sub-window is passed over entire image and the calculated feature values are compared with corresponding values of classifiers starting from first cascade If image passes through all 38 cascaded classifiers, a rectangle is drawn at that location to indicate a face 2.3 Removal of multiple face detections Mainly four kinds of regions in a face: eye region, nose region, mouth region and cheek region. A true face will be detected multiple times . Detected sub-windows are post-processed in order to combine overlapping detections into a single detection. A method of pattern recognition and analysis for human face recognition was successfully developed and tested. It was shown that the speed and accuracy of the system was greatly enhanced by getting additional information through skin color, use of cost effective AdaBoost and by construction of 3D models from multi-view 2D images. Future work can include use of Support Vector Machines (SVM) and Neural Networks to further enhance the system. Figure: Sample images in CMU+MIT dataset Figure: Rectangular features Figure: AdaBoost algorithm Figure: A typical input image which has been variance normalized Figure: A cascaded classifier. Green circles are faces and red are non faces Figure: Detection of four kinds of regions and removal of multiple faces. PROBLEM ANALYSIS Face recognition is the problem of detecting the location and size of face images in a digital image. It is a challenging problem due to great variation in pose, expressions, ambient lighting conditions, occlusion and orientation in a data image and it has not been completely solved yet. The proposed final scheme has been tested on CMU’s Face In Action (FIA) video database with 221 subjects. Experimental results show a 40% improvement in matching performance as a result of using the 3D models. Skin Colour 1) Skin colour is an important feature of human face used to detect faces faster. 2) All the possible face regions are extracted by skin colour. As much area of the image is excluded by the skin-color information, speed of the algorithm is improved greatly. 3) A pixel in converted YCbCr image is considered as a skin-pixel if its Cb and Cr values satisfy following equation: 137<Cr<177, 77<Cb<111, Multi-view face detection 1) Two types of pose variations are considered: non-frontal faces- rotated out of image plane, non-upright faces- rotated in the image plane 2) Different detectors are built for different views of face. A decision tree is trained to determine viewpoint class for a given window of image being examined. 3) Appropriate detector for that viewpoint can then be run instead of Cost-Sensitive AdaBoost algorithm 1) Two main differences between CS-AdaBoost algorithm and the naive AdaBoost are: (I) unequal initial weights are given to each training sample according to its misclassification cost (2) the weights are updated separately for positives and negatives at each boosting step 2) Due to more effectively focus on face Figure: Skin binarization Figure: Sample input for multi-view Figure: ROC curve of naïve and CS AdaBoost Figure: ROC plot for Viola- Jones method of face detection for 2D, frontal images Figure: ROC plot for frontal, non-frontal video frames with and without 3D modeling

Poster ions

Embed Size (px)

Citation preview

Page 1: Poster ions

RESEARCH POSTER PRESENTATION DESIGN © 2011

www.PosterPresentations.com

Design of a Pattern Recognition and Analysis System for 3-D Imaging

Humans have an extremely well developed sense of detecting, recognizing and analyzing patterns around them. Extending this ability to machines has proved to be a very challenging task. In this project, a visual object detection framework was developed that is capable of processing images extremely rapidly while achieving high detection rates. First, a face detection system for 2D, frontal, upright and grayscale images was constructed, which was further extended for colored, multi-pose and 3D imaging. The system was tested successfully over a large database of standard images and gave results comparable to best similar systems available.

INTRODUCTION

METHODS OF SOLUTION

EXTENSION OF SYSTEM FOR 3-D IMAGES 3D face recognition methods can achieve significantly higher accuracy than their 2D counterparts, rivaling fingerprint recognition as they depend upon geometry of rigid features on the face. This avoids such pitfalls of 2D face recognition algorithms as change in lighting, different facial expressions, make-up and head orientation.

In this project, 3-D face recognition is carried on a real time video stream. Video streams are generally of low resolution and contain mostly non-frontal faces. This drawback is overcome by using 3D models. The proposed system consists of three stages: (i) reconstructing a 3D face model from multiple non-frontal frames in a video (ii) generating a frontal view from the derived 3D model (iii) using the Viola Jones 2D face recognition engine to recognize the synthesized frontal view

Figure: Structure from motion (SfM) method for producing 3D models and synthesis of frontal view from 2D models

Structure from Motion Obtaining a 3D face model from a sequence of 2D images is an active research problem. Morphable model (MM), stereography, and Structure from Motion (SfM) are well known methods in 3D face model construction from 2D images or video. In this project, SFM was used for producing a 3D model from a monocular video stream. SFM consists of four main elements: acquiring 2D video data, identification of points track and tracking those points over time to produce motion tracks. Finally, the motion tracks are converted into a viable 3D model using 3D factorization. 72 facial feature points that outline the eyes, eyebrows, nose, mouth and facial boundary were used.

 

REFERENCES

To improve the initial method of face detection, three additional properties were added in the system: Skin colour, Multi-View face detection and Cost-sensitive AdaBoost algorithm. This increased both the accuracy and speed of the detector.

The algorithms for face recognition were implemented on MATLAB. The ROC (Receiver operating characteristic) curves of both the implementations: 2-D and 3-D are shown below. For figure 1) As the detection rate increases, number of false positives also increase.For figure 2) Best accuracy is achieved, as expected, by the frontal video frames, followed by Non-frontal video frames with 3D modeling, followed by Non-frontal 2D video frames

1. Paul Viola, Michael J. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision 57(2), 2004.

2. Unsang Park, Anil K. Jain, 3D Model-Based Face Recognition in Video, 2nd International Conference on Biometrics, Seoul, Korea, 2007

3. Deng Peng, Pei Mingtao, Multi-view Face Detection Based on AdaBoost and Skin Color, First International Conference on Intelligent Networks and Intelligent Systems

Sahil BhatiaThird Year UndergraduateDepartment of Applied PhysicsDelhi Technological UniversityNew Delhi, IndiaEmail Id: [email protected] Number: +91 965 062 6665

MODIFICATIONS IN THE VIOLA-JONES METHOD

Undergraduate Student, Delhi Technological UniversitySahil Bhatia

AUTHOR CONTACT DETAILS

IMPLEMENTATION AND RESULTS

Initially, for face detection, grayscale and frontal images were considered as the input problem set and a facial recognition system for these 2-D images was implemented using Viola -Jones method. The Viola-Jones method, given in year 2001, is world’s first real-time face detection system. The method consists of two stages as mentioned below: Training and Face Detection. For training, the CMU+MIT standard database of facial images was used.

1. 1 Creation of dataset•CMU+MIT Dataset•4547- Face images, 2529 Non Face images

•Each of size 19x19 pixels in Grayscale

1.2 Calculation of feature intensity values•Using rectangular features, calculate the intensity values of images in dataset

•Calculate Training Error: Total, Negative and Positive Errors

1.3 AdaBoosting of weak classifiers•Update weights of misclassified classifiers

•A linear combination of weak classifier works as a strong classifier

APPLICATIONS OF THE CURRENT SYSTEM

Medical Image AnalysisFigure- Left ventricle detection results

on 2D MRI images. Red boxes show the ground truth, while cyan boxes

show the detection results.

3-D TelevisionFigure- A camera is used to detect if there are users wearing special 3D

viewing glasses. Based on detection, display automatically switches

between 2D and 3D viewing modes.

Crowd SurveillanceFigure- Video quality in crowd

surveillance is poor and there is a great variety in pose and lighting. It is somehow compensated by using

continuous feature detection

License Plate Detection and Recognition

Figure-Video stream in real-time. System consists of a detection and a

character recognition module. Detector is based on AdaBoost

CONCLUSION AND FUTURE WORK

STAGE 1: Training

STAGE 2: Face Detection

Figure: Binary images after skin binarization Figure: Sample Multi-View image

2. 1 Image acquisition and pre-processing• Image acquisition is done

through a normal webcam• On-line image input by

triggering a video every 1 second and it is rescaled

• It is also converted into grayscale and variance normalized :Mean=0, Variance=1

2.2 Calculation of feature intensity values• A sub-window is passed over

entire image and the calculated feature values are compared with corresponding values of classifiers starting from first cascade

• If image passes through all 38 cascaded classifiers, a rectangle is drawn at that location to indicate a face

2.3 Removal of multiple face detections•Mainly four kinds of regions in a face: eye region, nose region, mouth region and cheek region.

•A true face will be detected multiple times .

•Detected sub-windows are post-processed in order to combine overlapping detections into a single detection.

A method of pattern recognition and analysis for human face recognition was successfully developed and tested. It was

shown that the speed and accuracy of the system was greatly enhanced by getting additional information through skin

color, use of cost effective AdaBoost and by construction of 3D models from multi-view 2D images. Future work can include

use of Support Vector Machines (SVM) and Neural Networks to further enhance the system.

Figure: Sample images in CMU+MIT dataset

Figure: Rectangular features Figure: AdaBoost algorithm

Figure: A typical input image which has been variance normalized

Figure: A cascaded classifier. Green circles are faces and red are non faces

Figure: Detection of four kinds of regions and removal of multiple faces.

PROBLEM ANALYSISFace recognition is the problem of detecting the location and size of face images in a digital image. It is a challenging problem due to great variation in pose, expressions, ambient lighting conditions, occlusion and orientation in a data image and it has not been completely solved yet.

The proposed final scheme has been tested on CMU’s Face In Action (FIA) video database with 221 subjects. Experimental results show a 40% improvement in matching performance as a result of using the 3D models.

Skin Colour1) Skin colour is an important feature of human face used to

detect faces faster. 2) All the possible face regions are extracted by skin colour. As much

area of the image is excluded by the skin-color information, speed of the

algorithm is improved greatly. 3) A pixel in converted YCbCr image is considered as a skin-pixel if its Cb

and Cr values satisfy following equation:

137<Cr<177, 77<Cb<111, 190<Cb+Cr<215

Multi-view face detection

1) Two types of pose variations are considered: non-frontal faces-

rotated out of image plane, non-upright faces- rotated in the image

plane2) Different detectors are built for different views of face. A decision

tree is trained to determine viewpoint class for a given window

of image being examined. 3) Appropriate detector for that

viewpoint can then be run instead of running all detectors on all

windows.

Cost-Sensitive AdaBoost algorithm

1) Two main differences between CS-AdaBoost algorithm and the naive

AdaBoost are: (I) unequal initial weights are given to each training sample according to

its misclassification cost (2) the weights are updated separately for positives and

negatives at each boosting step2) Due to more effectively focus on face samples, it achieves robust and

high detection rate with modest false alarm rate

Figure: Skin binarization Figure: Sample input for multi-view Figure: ROC curve of naïve and CS AdaBoost

Figure: ROC plot for Viola-Jones method of face detection for 2D, frontal images

Figure: ROC plot for frontal, non-frontal video frames with and without 3D modeling