Modeling Faces with Active Shape Models

MITRE Corporation is a federally funded research-and-development corporation that has developed its own facial recognition system, known as MITRE Matcher. Non-frontal facial images pose a significant challenge to the recognition process for MITRE Matcher and other facial recognition systems, even when the pose variation is as small as ten or twenty degrees. This project's goal was to research, implement, and evaluate facial-landmarking algorithms and approaches to pose analysis and pose correction.

Modeling Faces with Active Shape Models

Background

2011-2012 MITRE Computer Science Clinic: Landmarking and Pose Correction for Face Recognition

An Active Shape Model (ASM) uses a dataset's statistics to capture the possible shapes that objects of a certain class can take. We used face shapes consisting of sets of landmarks (nose tip, mouth corners, etc.) and trained the model to capture the configuration of the average face in our training data. We also found the most significant parameters describing how a face can vary from the average while still representing a viable face.
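As an illustration of the idea, the following is a minimal sketch of the statistical core of a shape model, assuming the landmark sets have already been aligned (for example with a Procrustes step) and that principal component analysis is used to find the modes of variation. The function names (`train_shape_model`, `synthesize`, `project`) and the three-standard-deviation clamp are illustrative assumptions, not the team's code.

```python
import numpy as np

def train_shape_model(shapes, n_modes=2):
    """shapes: (n_samples, n_landmarks, 2) array of already-aligned landmarks."""
    X = np.asarray(shapes, float).reshape(len(shapes), -1)  # one row per face
    mean = X.mean(axis=0)                                   # the average face
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (len(X) - 1)
    return mean, Vt[:n_modes], variances[:n_modes]

def synthesize(mean, modes, variances, b_sigma):
    """Build a shape from offsets given in standard deviations per mode."""
    b = np.asarray(b_sigma, float) * np.sqrt(variances)
    return (mean + b @ modes).reshape(-1, 2)

def project(shape, mean, modes, variances, limit=3.0):
    """Shape parameters of a landmark set, clamped to +/- limit std devs."""
    b = modes @ (np.asarray(shape, float).ravel() - mean)
    bound = limit * np.sqrt(variances)
    return np.clip(b, -bound, bound)
```

Under these assumptions, `synthesize(mean, modes, variances, [3, 0])` would move the average face three standard deviations along the first mode, and clamping in `project` is one way to keep a candidate landmark set within the range of viable faces.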

Landmarking heuristics

Pose correction pipeline

Facial Landmarking

ASEF

The Average of Synthetic Exact Filters (ASEF) is a texture-based method for landmarking. To create an ASEF filter, we specify a desired synthetic output for each training image: a Gaussian dot centered at the ground-truth landmark position in that image. For each training image we then compute the exact filter, the filter that transforms that image exactly into its synthetic output. We average all of a dataset's exact filters to get the final ASEF filter, which can then be applied to facial images to locate the landmark.
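The construction can be sketched in a few lines, assuming the exact filters are computed in the frequency domain by elementwise division, as is usual for correlation filters. The names `gaussian_target`, `train_asef`, and `locate`, the Gaussian width `sigma`, and the small regularizer `eps` (added to avoid dividing by near-zero frequencies) are assumptions made for this sketch.

```python
import numpy as np

def gaussian_target(shape, center, sigma=2.0):
    """Synthetic output: a Gaussian dot at the ground-truth landmark."""
    rows, cols = np.indices(shape)
    r0, c0 = center
    return np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))

def train_asef(images, landmarks, sigma=2.0, eps=1e-8):
    """Average the per-image exact filters (kept in the frequency domain)."""
    acc = np.zeros(np.shape(images[0]), dtype=complex)
    for img, lm in zip(images, landmarks):
        F = np.fft.fft2(img)
        G = np.fft.fft2(gaussian_target(F.shape, lm, sigma))
        # Regularized form of G / F: the filter mapping this image to its target.
        acc += G * np.conj(F) / (np.abs(F) ** 2 + eps)
    return acc / len(images)

def locate(image, filt):
    """Apply the filter and return the (row, col) of the strongest response."""
    response = np.real(np.fft.ifft2(np.fft.fft2(image) * filt))
    return np.unravel_index(np.argmax(response), response.shape)
```

With a single training image the averaged filter is just that image's exact filter, so applying it to the same image reproduces the Gaussian dot. Averaging over many images is what lets the filter generalize to unseen faces.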

Image warping

The team is delivering to MITRE:
• Code implementing ASEF and UMACE landmarking, Active Shape Models and their component algorithms, and our pose-correction technique, as well as scripts and applications for testing and demonstrating all of these algorithms.
• Accuracy results for landmarking and pose-corrected match scores.

Because full-image pose correction can lead to lower match scores, improved face recognition may result from comparing feature-relative patches instead of warped full images. The team's feature extraction routines will form the basis of that process.

We use thin-plate splines to warp smoothly from one set of facial landmarks to another. By isolating the yaw component of the ASM, we transform a landmarked face into a neutral, frontal pose. The other ASM-derived vectors enable other transformations.
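A thin-plate spline between two landmark sets can be sketched as follows, assuming the standard two-dimensional formulation with kernel U(r) = r² log r plus an affine term. The function names `fit_tps` and `apply_tps` are illustrative and are not taken from the team's code.

```python
import numpy as np

def _tps_kernel(a, b):
    """U(r) = r^2 log r between every point in a and every point in b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        k = 0.5 * d2 * np.log(d2)          # r^2 log r written using r^2
    return np.where(d2 == 0, 0.0, k)       # define U(0) = 0

def fit_tps(src, dst):
    """Solve for the spline that maps each src landmark onto its dst landmark."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    n = len(src)
    P = np.hstack([np.ones((n, 1)), src])  # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = _tps_kernel(src, src)
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    params = np.linalg.solve(A, rhs)
    return src, params[:n], params[n:]     # control points, weights, affine

def apply_tps(model, pts):
    """Map arbitrary points through a fitted spline."""
    src, w, a = model
    pts = np.asarray(pts, float)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return _tps_kernel(pts, src) @ w + P @ a
```

To warp an image rather than a point set, one common choice is to fit the spline from the output landmarks back to the input landmarks and sample the source image at the mapped coordinates of every output pixel, which avoids holes in the result.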

Results and Deliverables

Acknowledgments

Team Members
Elliot Godzich '12
Dylan Marriner '12
Emily Myers-Stanhope '12
Emma Taborsky '12 (PM)
Heather Williams '12

MITRE Liaisons
Joshua Klontz '10
Mark Burge

Faculty Advisor
Zachary Dodds

UMACE

We also investigated and implemented a feature-detection algorithm that uses Unconstrained Minimum Average Correlation Energy (UMACE) filters. We take a standard square region around the ground-truth eye location in each training image, average those regions, and divide the average by the average power spectrum for that same region. This gives us a correlation filter that we can apply to a standard eye-containing region to determine possible eye locations within that region.
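The UMACE construction can be sketched as below, assuming that both the averaging and the division are done elementwise in the frequency domain, as in the usual formulation of the filter. The names `train_umace` and `correlate` and the regularizer `eps` are assumptions for this sketch.

```python
import numpy as np

def train_umace(patches, eps=1e-8):
    """patches: equally sized regions centered on the ground-truth eye."""
    F = np.fft.fft2(np.asarray(patches, float), axes=(-2, -1))
    mean_spectrum = F.mean(axis=0)               # average training spectrum
    avg_power = (np.abs(F) ** 2).mean(axis=0)    # average power spectrum
    return mean_spectrum / (avg_power + eps)

def correlate(region, filt):
    """Correlate a candidate region with the filter (circular correlation)."""
    return np.real(np.fft.ifft2(np.fft.fft2(region) * np.conj(filt)))
```

In this sketch the peak of the response gives the offset of the feature relative to its position in the training patches, wrapping around at the region boundary because the correlation is circular.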

Original off-pose image

Best landmarks

The original image, a forward-facing comparison image, and the resulting match scores. For reference, the self-match score is about 6.29.

Overview: The original image is cropped to the face and landmarked to determine possible feature locations on the face. The best combination of feature locations is selected using spatial heuristics and statistical estimation. Using a statistical model of how landmarks vary with pose, the landmarks are moved to a neutral pose. These neutral-pose landmarks are then used to warp the image to a neutral pose.

Combining features

Geometric constraints

Multiple responses

Example yaw warp: off-pose (right) to frontal pose (middle)

Circles showing 5%, 10%, and 25% of inter-ocular distance

Accuracy results for the best-match feature in each of six locations

The mean face (green) computed from 300 faces (white), after alignment via the Procrustes algorithm the team implemented.

Three standard deviations from the mean along the largest source of variation, roughly corresponding to yaw

King of the Hill is a technique for finding the n highest local maxima in a two-dimensional array. We use it to determine the top three candidate locations in our UMACE or ASEF filter responses.
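The poster does not spell out the procedure, so the following is only one plausible way to extract the strongest few peaks from a filter response: repeatedly take the global maximum and suppress its neighborhood. The function name `top_peaks` and the `radius` parameter are assumptions, and this greedy scheme may differ from the team's King of the Hill implementation.

```python
import numpy as np

def top_peaks(response, n=3, radius=2):
    """Return up to n (row, col, value) peaks, strongest first."""
    work = np.array(response, dtype=float)   # copy so the input is untouched
    peaks = []
    for _ in range(n):
        r, c = np.unravel_index(np.argmax(work), work.shape)
        if not np.isfinite(work[r, c]):
            break                            # everything has been suppressed
        peaks.append((int(r), int(c), float(work[r, c])))
        # Knock out the hill around this peak so the next pass finds another.
        work[max(r - radius, 0):r + radius + 1,
             max(c - radius, 0):c + radius + 1] = -np.inf
    return peaks
```

The suppression radius should be at least as wide as a typical response peak. Otherwise a later pick can land on the slope of a hill that was already counted.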

Facial features tend to end up in roughly the same area of each facial crop. We can take advantage of this by constraining the area in which we search for each landmark.
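A minimal sketch of such a constraint, assuming each landmark has a rectangular search window expressed in facial-crop coordinates. The function name `constrained_peak` and the window layout are hypothetical.

```python
import numpy as np

def constrained_peak(response, window):
    """window = (row_start, row_stop, col_start, col_stop) in crop coordinates."""
    r0, r1, c0, c1 = window
    sub = response[r0:r1, c0:c1]             # ignore everything outside
    r, c = np.unravel_index(np.argmax(sub), sub.shape)
    return int(r) + r0, int(c) + c0          # back to crop coordinates
```

Restricting the search this way discards strong but implausible responses, such as an eye-like texture near the chin, before any further scoring is done.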

An Active Shape Model and/or feature-strength heuristics can provide a probability that a particular set of landmarks forms a face shape. The current system uses only feature strength but can support additional metrics in the future.
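One way to structure this selection is an exhaustive search over candidate combinations, scored by summed feature strength with an optional hook for an additional metric. This is a sketch under that assumption. The name `best_combination` and the `shape_score` callback are hypothetical and do not describe the delivered system.

```python
from itertools import product

def best_combination(candidates, shape_score=None):
    """candidates: dict mapping landmark name -> list of (row, col, strength)."""
    names = list(candidates)
    best, best_score = None, float("-inf")
    for combo in product(*(candidates[n] for n in names)):
        # Baseline metric: total filter response strength.
        score = sum(strength for _, _, strength in combo)
        if shape_score is not None:
            # Optional extra metric, e.g. plausibility under a shape model.
            score += shape_score({n: (r, c) for n, (r, c, _) in zip(names, combo)})
        if score > best_score:
            best, best_score = dict(zip(names, combo)), score
    return best, best_score
```

With three candidates at each of six locations there are only 3^6 = 729 combinations, so a brute-force search like this stays cheap.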

Training image

UMACE filter

Facial image with right-eye region

Response of eye region to filter

Synthetic image

Average filter (ASEF)

Exact filter

1.97 0.93

Pose-corrected image

Three standard deviations from the mean along the largest source of variation, roughly representing pitch

Landmark map
