Emotion State Recognition System & Its Analysis Using Soft Computing
Presented by: Anuj Mehra (2006IPG13)
Guided by: Prof. Anupam Shukla
Introduction
Automatic recognition of facial gestures (i.e., facial muscle activity) is rapidly becoming an area of intense interest in the field of machine vision.
Facial expressions play a significant role in our social and emotional lives. They are visually observable, conversational, and interactive signals that clarify our current focus of attention and regulate our interactions with the environment and the people around us.
They are our most direct and natural means of communicating emotions.
Contd..
Nonverbal communication plays a very important role in human communication. Telephones were long used mainly for business communication, but they are now used more and more for everyday communication among family members and friends.
Beyond human-to-human communication, communication between humans and computer agents is becoming increasingly common. Computer agents that act as communication mediators will become common entities in our society, so the capability to communicate with humans through both verbal and nonverbal channels will be essential.
Objective
The main aim of this research is to develop and analyze an automatic emotion state recognition system, and its applications, using facial features and soft computing.
Nonverbal communication is increasingly important today, and this system makes its automatic recognition practical.
Literature Review
Ekman and Friesen [7] postulated six primary emotions in 1971: joy, surprise, anger, sadness, fear, and disgust. These emotions are referred to as the universally defined basic emotions.
Suwa et al. [8] presented a preliminary investigation of automatic facial expression recognition from an image sequence in 1978. Before this, the study of facial expression analysis was a subject for psychologists only.
In 1994, Kobayashi and Hara [9] developed an active human interface for machine recognition of human emotions from facial expressions using a neural network. They obtained a high recognition rate of 90% for the six basic emotions.
Contd..
In 1997, Huang and Huang [19] introduced an automatic facial expression recognition system consisting of two parts: facial feature extraction and facial expression recognition. The system applies the point distribution model and the gray-level model to find the facial features. The position variations of certain designated points on the facial features were described by 10 action parameters.
Donato et al. [23] quantified facial movement in terms of component actions in 1999. They compared various techniques for automatically recognizing facial actions in sequences of images, including analysis of facial motion through optical flow estimation, holistic spatial analysis, and independent component analysis.
Contd..
In 2000, Pantic and Rothkrantz [26] described an integrated system for facial expression recognition (ISFER), which performs recognition and emotional classification of human facial expressions from a still full-face image.
In 2001, E. Smith et al. [27] described a neural network analog of HMM interpolation methods to analyze facial expressions. The network demonstrated robust recognition for the six upper facial action units, whether they occurred individually or in combination.
Suzuki et al. [28], in 2001, described a model of the interrelationship between the physical features of a face and its emotional impression using a unique neural network.
Motivation
In contrast to previous approaches to automatic AU detection, which did not deal with static face images, the research proposed here addresses the problem of automatic AU coding from static face images.
The research is undertaken with two motivations:
1. While motion records are necessary for studying the temporal dynamics of facial behavior, static images are important for obtaining configurational information about facial expressions.
Since 100 still images, or a minute of videotape, take approximately one hour to score manually in terms of AUs [5], it is obvious that automating facial expression measurement would be highly beneficial. While some efforts have been made to automate FACS coding from face image sequences, no such effort has been made for static face images.
Contd..
2. A basic understanding of how to achieve automatic facial gesture analysis is necessary if facial expression analyzers capable of handling partial and inaccurate data are to be developed.
Methodology
The proposed system recognizes five basic emotions: anger, happiness, sadness, neutral, and disgust.
In this research work, two methodologies are proposed. Both begin with:
Detection of the face from the dataset
Applying PCA
Classification then uses either:
Euclidean distance to predict which class the test image belongs to, or
Neural networks to predict which class the test image belongs to
In both cases, Euclidean distance is used to calculate the distance from the neutral expression.
Discussion
Among the several methods for recognition of facial gestures, the Facial Action Coding System (FACS) [5] is the best known and most commonly used.
FACS is an index of facial expressions, but it does not actually provide any bio-mechanical information about the degree of muscle activation.
FACS [5] defines 32 AUs, each a contraction or relaxation of one or more muscles.
Intensities in FACS are annotated by appending the letters A-E (for minimal through maximal intensity) to the action unit number (e.g., AU 1A is the weakest trace of AU 1 and AU 1E is the maximum intensity possible for the individual person).
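As a small worked example of this annotation scheme (the helper function name is ours; only the AU-number-plus-letter convention comes from FACS as described above), a code such as "AU1E" can be decoded into its action unit and ordinal intensity:

```python
# Sketch of decoding FACS intensity annotations such as "AU1A" .. "AU1E".
# Letters A-E denote ordinal intensities 1 (minimal) to 5 (maximal).
def decode_au(code: str):
    """Split a code like 'AU12C' into (action_unit, intensity)."""
    assert code.startswith("AU")
    body = code[2:]
    if body and body[-1].isalpha():
        # 'A' -> 1 (weakest trace) ... 'E' -> 5 (maximum intensity)
        return int(body[:-1]), ord(body[-1].upper()) - ord("A") + 1
    return int(body), None  # no intensity annotated

print(decode_au("AU1A"))  # (1, 1): weakest trace of AU 1
print(decode_au("AU1E"))  # (1, 5): maximum intensity of AU 1
```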
Feature Points [1]
Database
Sample Database [31,32]
Identified Face
Input
Training Image, Expression
Image001.jpg to Image013.jpg, happy
Image014.jpg to Image024.jpg, disgust
Image025.jpg to Image034.jpg, anger
Image035.jpg to Image043.jpg, sad
Image044.jpg to Image050.jpg, neutral
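The training set above is simply a list of image-file/expression-label pairs; a minimal sketch for loading such comma-separated entries (the file names mirror the listing above, the function name is illustrative) might be:

```python
# Sketch: parse "filename,label" training entries, as listed above,
# into a dictionary mapping each image file to its expression label.
def load_labels(lines):
    labels = {}
    for line in lines:
        line = line.strip()
        if line:
            name, expression = line.split(",")
            labels[name] = expression
    return labels

entries = ["Image001.jpg,happy", "Image014.jpg,disgust", "Image044.jpg,neutral"]
print(load_labels(entries)["Image014.jpg"])  # disgust
```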
Algorithm
1. The training images are used to create a low-dimensional face space. This is done by performing Principal Component Analysis (PCA) on the training image set and keeping the principal components (i.e., the eigenvectors with the largest eigenvalues). In this process, projected versions of all the training images are also created.
2. The test images are likewise projected onto the face space; as a result, all the test images are represented in terms of the selected principal components.
3. The Euclidean distances of a projected test image from all the projected training images are calculated, and the minimum value is chosen to find the training image most similar to the test image. The test image is assumed to fall in the same class as this closest training image.
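The three steps above can be sketched in NumPy; the array shapes, variable names, and toy data here are our assumptions, with each image treated as a flattened grayscale vector:

```python
import numpy as np

# Steps 1-3: build a PCA face space, project images into it, and classify a
# test image by the Euclidean-nearest projected training image.
def build_face_space(train, k):
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are eigenvectors of the covariance matrix (the eigenfaces)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                  # top-k principal components
    projected = centered @ components.T  # projected training images
    return mean, components, projected

def classify(test_img, mean, components, projected, labels):
    p = (test_img - mean) @ components.T           # project the test image
    dists = np.linalg.norm(projected - p, axis=1)  # Euclidean distances
    return labels[int(np.argmin(dists))]           # class of closest match

# Toy data: six random "images", two per class (illustrative only)
rng = np.random.default_rng(0)
train = rng.random((6, 64))
labels = ["happy", "happy", "sad", "sad", "anger", "anger"]
mean, comps, proj = build_face_space(train, k=3)
print(classify(train[2] + 0.01 * rng.random(64), mean, comps, proj, labels))
```

With the tiny perturbation above, the nearest projected training image is train[2] itself, so this prints "sad".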
Algorithm (Contd.)
After the principal components are calculated, the back-propagation algorithm (BPA) is applied as a classifier. To determine the intensity of a particular expression, its Euclidean distance from the mean of the projected neutral images is calculated. The greater this distance, the farther the image is (by assumption) from the neutral expression, and hence the stronger the recognized expression.
The emotions extracted by the system are used to control a music player.
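The intensity measure just described, i.e. the Euclidean distance of a projected test image from the mean of the projected neutral images, can be sketched as follows (all arrays and values here are illustrative):

```python
import numpy as np

# Sketch: expression intensity as distance from the mean projected neutral.
def expression_intensity(projected_test, projected_neutrals):
    neutral_mean = projected_neutrals.mean(axis=0)
    # By the assumption above, a larger distance from the neutral mean
    # corresponds to a stronger expression.
    return float(np.linalg.norm(projected_test - neutral_mean))

neutrals = np.array([[0.0, 0.0], [0.2, -0.2]])  # toy projected neutral faces
weak = np.array([0.5, 0.0])                     # close to neutral
strong = np.array([3.0, 1.0])                   # far from neutral
print(expression_intensity(weak, neutrals) <
      expression_intensity(strong, neutrals))   # True: "strong" is stronger
```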
Output
Testing Image, Distance From Neutral, Expression, Best Match
Image001.jpg,2221,neutral,Image046.jpg
Image002.jpg,3669,happy,Image008.jpg
Image003.jpg,4764,disgust,Image014.jpg
Image004.jpg,4462,anger,Image029.jpg
Image005.jpg,3933,anger,Image025.jpg
Image006.jpg,4745,happy,Image003.jpg
Image007.jpg,5398,sad,Image041.jpg
Image008.jpg,5851,happy,Image010.jpg
Image009.jpg,2503,neutral,Image046.jpg
Image010.jpg,4183,happy,Image008.jpg
Image011.jpg,5135,sad,Image040.jpg
Image012.jpg,6319,anger,Image031.jpg
Image013.jpg,5292,happy,Image006.jpg
Image014.jpg,6207,happy,Image012.jpg
Image015.jpg,5899,happy,Image006.jpg
Image016.jpg,4163,sad,Image040.jpg
Image017.jpg,4002,neutral,Image046.jpg
Image018.jpg,6088,disgust,Image022.jpg
Image019.jpg,4331,disgust,Image022.jpg
Image020.jpg,5274,anger,Image026.jpg
Image021.jpg,5002,anger,Image029.jpg
Image022.jpg,5135,disgust,Image021.jpg
Image023.jpg,4134,disgust,Image018.jpg
Image024.jpg,4570,disgust,Image022.jpg
Image025.jpg,4331,disgust,Image023.jpg
Image026.jpg,3387,neutral,Image049.jpg
Image027.jpg,4800,disgust,Image016.jpg
Image028.jpg,4274,disgust,Image023.jpg
Image029.jpg,5359,anger,Image029.jpg
Image030.jpg,5994,disgust,Image022.jpg
Image031.jpg,5921,anger,Image029.jpg
Result
            PCA + Euclidean Distance   PCA + BPA
Training    100%                       89%
Validation  98.3%                      85.7%
Graphs
Future Scope
The proposed system can be extended to handle distractions such as occlusions (e.g., by a hand), glasses, and facial hair.
The output of the system is currently used to drive a music player, and it can also be applied in other areas and fields related to emotion recognition.
Previous Work
[1] Mehra Anuj, Shukla Anupam, Tiwari Ritu, “Intelligent Biometric System for Speaker Identification using Lip features with PCA and ICA”, Journal of Computing, Vol. 2, Issue 4, April 2010, pp 120-127.
[2] Mehra Anuj, Shukla Anupam, Tiwari Ritu, “Expert System for Speaker Identification Using Lip Features with PCA”, Intelligent Systems and Applications (ISA) 2010, IEEE, Wuhan, 22-23 May 2010.
[3] Mehra Anuj, Shukla Anupam, “Emotion State Recognition System using Euclidean Distance and PCA”, Journal of Computing [in review].
References
P. Ekman and E. Rosenberg (2005), “Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System”, Oxford University Press, 2nd Edition, Feb. 2005.
R. Gutierrez-Osuna et al. (2005), “Speech-Driven Facial Animation with Realistic Dynamics”, IEEE Transactions on Multimedia, Vol. 7(1), pp 33-41.
A. V. Barbosa and H. C. Yehia (2001), “Measuring the Relation between Speech Acoustics and 2D Facial Motion” Speech Communication Vol. 26, pp 23-48.
L. C. De Silva, T. Miyasato and R. Nakatsu (1997), “Facial Emotion Recognition Using Multimodal Information” Proceedings of IEEE International Conference on Information Communication and Signal Processing (ICICS’97) Singapore, pp 397-401.
R. G. Osuna, P. K. Kakumanu, A. Esposito, O. N. Garcia, A. Bojorquez, J. L. Castillo and I. Rudomin (2005), “Speech- Driven Facial Animation with Realistic Dynamics” IEEE Transactions on Multimedia. Vol. 7 (01), pp 33-42.
C. Darwin (1872), “The Expression of the Emotions in Man and Animals”, John Murray; reprinted by University of Chicago Press, 1965.
P. Ekman and W. V. Friesen (1971), “Constants across cultures in the face and emotion” Journal of Personality and Social Psychology, Vol. 17(2), pp 124-129.
M. Suwa, N. Sugie, and K. Fujimora (1978), “A preliminary note on pattern recognition of human emotional expression,” Proceedings of the Fourth International Joint Conference on Pattern Recognition, Kyoto (Japan): pp 408-410.
H. Kobayashi and F. Hara (1994), “Analysis of the Neural Network and Recognition Characteristics of 6 Basic Facial Expressions” IEEE, International Workshop on Robot and Human Communication 1994, pp 222-227.
F. Kawakami, S. Morishima, H Yamada and H. Harashima (1994), “Construction of 3-D Emotion Space Based on Parameterised Faces” IEEE, International Workshop on Robot and Human Communication, 1994, pp 216-221.
R. R. Advent, C. T. Ng and J. A. Nel (1994), “Machine Vision Recognition of Facial Affect Using Back-propagation Neural Networks”, Proceedings of the 16th Annual International Conference of the IEEE Engineering in Medicine and Biology Society 1994, Engineering Advances: New Opportunities for Biomedical Engineers, pp 1364-1365.
S. Morishima (1996), “Modeling of facial expression and emotion for human communication system”, Displays, 1996, pp 15-25.
R. Herpers, M. Michaelis, K. H. Lichtenauer and G. Sommer (1996), “Edge and Key point Detection in Facial Regions”, 2nd International Conference on Automatic Face and Gesture Recognition, pp 212-217.
H. Demirel, T. J. Clarke and P. Y. K. Cheung (1996), “Adaptive Automatic Facial Feature Segmentation”, 2nd International Conference on Automatic Face and Gesture Recognition, pp 277-282.
Y. Yacoob and L. S. Davis (1996), “Recognizing Human Facial Expressions From Long Image Sequences Using Optical Flow”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18 (06), pp 636-642.
M Rosenblum, Y. Yacoob and L. S. Davis (1996), “Human Expression Recognition from Motion Using a Radial Basis Function Network Architecture”, IEEE Transactions on Neural Networks, Vol. 7 (05), pp 1121-1138.
A. Essa and A. P. Pentland (1997), “Coding Analysis, Interpretation and Recognition of Facial Expression”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19(7), pp 757-763.
A. Lanitis, C. J. Taylor and T. F. Cootes (1997), “Automatic Interpretation and coding of Face Images using Flexible Models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19(7), pp 743-756.
C. L. Huang and Y. M. Huang (1997), “Facial Expression Recognition Using Modal-Based Feature Extraction and Action Parameters Classification”, Journal of Visual Communication and Image Representation, Vol. 8 (03), pp 278-290.
P. Eisert and B. Girod (1997), “Facial Expression Analysis for Modal-Based Coding of Video Sequences”, Picture Coding Symposium, Berlin-1997: pp 33-38.
G. A. Abrantes and F. Pereira (1998), “Interactive Analysis for MPEG-4 Facial Model Configuration”, EUROGRAPHICS 1998, Lisboa (Portugal), pp 1-4.
C. Gershenson (1999), “Modelling Emotions with Multidimensional Logic”, 18th International Conference of North American Fuzzy Information Processing Society, New York City, pp 42-46.
G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman and T. J. Sejnowski (1999), “Classifying Facial Actions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21(10), pp 974-989.
M. D. Bonis, P. D. Boeck, F. Perez-Diaz and M. Nahas (1999), “A Two-process Theory of Facial Perception of Emotions”, Life Science, Vol. 322(8), pp 669-675.
J. J. J. Lien, T. Kanade, J. F. Cohn and C. C. Li (2000), “Detection, Tracking and Classification of action units in facial expression”, Journal of Robotics and Autonomous Systems, Vol. 31(2000), pp 131-146.
M. Pantic & L. J. M. Rothkrantz (2000), “Expert system for automatic analysis of facial expressions”, Image and Vision Computing, Vol. 18, pp 881-905.
E. Smith, M. S. Bartlett and J. Movellan (2001), “Computer Recognition of Facial Actions: A study of co-articulation effects”, Proceedings of 8th Joint Symposium on Neural Computations, May 2001, pp 1-6.
K. Suzuki, H. Yamada and S. Hashimoto (2001), “Interrelating physical features of facial expression and its impression”, Proceedings of the IEEE International Conference on Neural Networks, 2001, pp 1864-1869.
Y. L. Tian, T. Kanade, J. F. Cohn (2001), “Recognizing Action Units for Facial Expression Analysis”, Transactions on Pattern Analysis and Machine Intelligence, Vol. 23(02), pp 97-105.
M. Nahas and M. D. Bonis (2001), “Image Technology and Facial Expression of Emotions”, Proceedings of 10th IEEE, International Workshop on Robot and Human Interactive Communication, 2001, pp 524-527.
Vidit Jain, Amitabha Mukherjee. The Indian Face Database. http://vis-www.cs.umass.edu/~vidit/IndianFaceDatabase/, 2002.
Japanese Female Facial Expression (JAFFE) Database - http://www.kasrl.org/jaffe.html
Thank You