Relational Learning based Happiness Intensity Analysis in a Group
Tuoerhongjiang Yusufu, Naifan Zhuang, Kai Li, Kien A. Hua
Department of Computer Science, University of Central Florida
Orlando, Florida
Email: {yusufu, kaili, kienhua}@cs.ucf.edu, [email protected]
Abstract—Pictures and videos from social events and gatherings usually contain multiple people. Physiological and behavioral science studies indicate that there are strong emotional connections among group members. These emotional relations are indispensable to better analyzing individual emotions in a group. However, most existing affective computing methods estimate the emotion of a single subject only. In this work, we concentrate on estimating the happiness intensities of group members while considering the reciprocities among them. We propose a novel facial descriptor that effectively captures happiness-related facial action units. We also introduce two structural regression models, Continuous Conditional Random Fields (CCRF) and Continuous Conditional Neural Fields (CCNF), for estimating the emotions of group members. Our experimental results on the HAPPEI dataset demonstrate the viability of the proposed features and the two frameworks.
Keywords—Action Units, Happiness Intensity, Group, Probabilistic Graphical Models
I. INTRODUCTION
Millions of images and videos from different social events and gatherings are uploaded and shared every day. At a social event, such as a party, a wedding, or a graduation ceremony, many pictures and videos are taken, and they usually contain multiple people. Techniques for analyzing and understanding group images and videos therefore have many applications.
Recently, the study of groups of people in images and videos has received much attention in the computer vision community for different research purposes. Gallagher and Chen [22] proposed contextual features based on the group structure for computing the age and gender of individuals. Eichner et al. [23] presented a novel multi-person pose estimation framework.
In this paper, we are also interested in group pictures; however, our topic is emotions in a group.
Human affect analysis is a long-studied problem because of its importance in human-computer interaction and affective computing. Most automatic affect analysis and recognition algorithms in existing work, however, focus on analyzing the expressions and emotions of an individual only [3][4]. Although there are some works on group affect [5][6][7], they are interested in inferring the emotional intensity of a group as a whole. Analyzing an individual's emotion in a group context is still an unexplored problem.
Figure 1: Group images from different social gatherings.
According to human cognitive and behavioral research [1][2], group members bring their individual-level emotional experiences, such as dispositional affect, moods, emotions, emotional intelligence, and sentiments, with them to a group interaction. Then, through a variety of explicit and implicit processes, individual-level moods and emotions are spread and shared among group members. In other words, in a group, the emotions of the members are connected to each other. Assessing this reciprocity among group members is indispensable to better understanding their individual-level emotions. In this paper, we focus on modeling the relations among individual emotions in a group.
After extensive research, we found that the HAPPEI [8] dataset is the only suitable dataset for our research, as it consists of group images in which each face is annotated with a happiness intensity level. Figure 1 shows some group images from the HAPPEI dataset. All pictures in this dataset were taken at different social gatherings. Since we use the HAPPEI dataset, in this paper we study only two types of basic human expressions: happiness and neutral. Interestingly, as people tend to present themselves in a favorable way [30], most of
the uploaded and shared pictures on websites are positive. Studying happiness in groups has many real-world applications, such as emotion ranking, event and highlight summarization, and image search and retrieval.
The key contributions of this paper are as follows:
1) We propose a novel compact facial descriptor that refers to happiness-related action units (AUs). This feature effectively represents happiness intensities.
2) We introduce a Continuous Conditional Random Fields (CCRF) based emotion prediction model. This model combines Support Vector Regressors (SVR) with CCRF to model the relations between the emotions of different individuals in a group image.
3) We also introduce a Continuous Conditional Neural Fields (CCNF) model that directly estimates the emotion intensities of all group members together while considering the relations among them.
This paper is organized as follows: In Section 2, we discuss related work. In Section 3, we introduce the proposed feature extraction and emotion estimation frameworks. In Section 4, we present the results of examining the proposed feature and structured regression models for happiness level estimation in a group. Finally, we draw our conclusions in Section 5.
II. RELATED WORKS
Facial image descriptors can be classified into appearance features and geometric features.

Appearance features describe the skin texture of faces. Because appearance features are usually extracted from small regions, they are robust to illumination variations. Moreover, most appearance features are obtained by concatenating normalized local histograms, which increases the robustness of the overall representation. They are also robust to registration errors because they involve pooling over histograms. However, as appearance features favor identity-related cues rather than expressions, they are affected by identity bias. The most popular appearance representations are local binary patterns (LBP) [17] and local phase quantization (LPQ) [18]. Other features, such as histograms of oriented gradients (HOG) [19], pyramids of histograms of oriented gradients (PHOG) [29], quantized local Zernike moments (QLZM) [20], and Gabor wavelets [21], are also frequently used as facial descriptors.
Geometric features represent the facial geometry, such as the shape of the face and the locations of facial landmarks [9][10][11]. Since these features are based on coordinate values instead of pixel values, they are more robust to illumination variations than appearance features. More importantly, geometric features are less affected by identity bias, which makes them more suitable for expression analysis. Their disadvantage, however, is that they are vulnerable to registration errors.
We want to model affect continuously. Because discretization may lead to a loss of information and of the relationships between neighboring classes, regression techniques are the natural choice for our problem.

The most popular regression techniques are linear and logistic regression, support vector regression, neural networks, and the relevance vector machine (RVM) [26]. However, they are all designed to model input-output dependencies while disregarding output-output relations.
Recently, Conditional Random Fields (CRF) based structured regression models have received much attention from researchers. The CRF is a powerful tool for relational learning because it can model both the relations between objects and the content of objects. As an extension of the classic CRF to the continuous case, Continuous Conditional Random Fields (CCRF) [31] have been successfully applied to global ranking [31], emotion tracking in music [32], and dimensional affect recognition in temporal data [33]. Continuous Conditional Neural Fields (CCNF) [25] are an extension of Conditional Neural Fields (CNF) and can likewise encode temporal and spatial relationships. CCNF has been applied to emotion prediction in music [24], facial action unit recognition, and facial landmark detection [35]. Both CCRF and CCNF can perform structured regression, and both can easily define temporal and spatial relationships.
III. PROPOSED FRAMEWORK
A. Facial Feature Extraction
In the HAPPEI dataset, each face in a group image is annotated with one of six happiness intensity levels: Neutral, Small Smile, Large Smile, Small Laugh, Large Laugh, and Thrilled. Since we are dealing with only two basic human expressions, neutral and happiness, we propose a problem-specific and more efficient facial feature for happiness intensity estimation.
Previous works in psychology and computer vision have shown the value of using action units (AUs) for analyzing facial expressions [11][12][13]. In the Facial Action Coding System (FACS) [27], AUs correspond to the contractions of specific facial muscles. Among the 30 AUs, 12 are for the upper face and 18 for the lower face. Any facial expression can be explained as the occurrence of a single AU or of a combination of several AUs.
To clearly show different happiness levels, Figure 2 presents pictures of the same subject from the CK database [34] at four levels of happiness intensity, together with the corresponding AUs. In a neutral face, the eyes, brows, and cheeks are relaxed, and the lips are relaxed and closed. When a person expresses happiness, the cheeks and the upper and lower eyelids are raised. At the same time, the lip corners are pulled obliquely, the lips are relaxed and parted, and the mandible may be lowered. Any level of happiness can be expressed as a combination of AU5, AU6, AU7, AU12, AU25, and AU26.

Figure 2: Happiness expressions and corresponding AUs: (a) Neutral, (b) AU6+12, (c) AU6+12+25, (d) AU6+7+12+25.

Figure 3: Facial landmarks.
Inspired by previous works [11][14][15], we extract geometric facial features that refer to happiness-related AUs. We call the new feature the Happiness Related Facial Feature (HRFF). The facial feature extraction steps are as follows:

1) Face detection: we use the Viola-Jones [28] face detection algorithm.

2) Facial landmark detection and non-face elimination: Intraface [16] is applied to detect 49 facial landmarks in each detected face. Using the landmark detection results, we can also eliminate most falsely detected faces, since the expected landmarks cannot be extracted from non-face objects. Figure 3 shows the locations and indices of the corresponding 2D facial landmarks.

3) Face resizing and alignment: each face is resized to 128 × 128 pixels. The Intraface results are used to perform face alignment.

4) Geometric feature computation: the features are calculated from the aligned landmarks. Table I presents the descriptions and measurements of the 6-dimensional facial feature that corresponds to happiness-related AUs.
Table I: Happiness Related Facial Feature (HRFF)

Feature | Implication     | Measurement                                                                       | AUs
f1, f2  | Eyelid movement | Sum of distances between corresponding landmarks on the upper and lower eyelids  | AU5, AU7
f3      | Lip tightener   | Sum of distances between corresponding points on the upper and lower outer mouth contour | AU25, AU26
f4      | Lips parted     | Sum of distances between corresponding points on the upper and lower inner mouth contour | AU25, AU26
f5      | Lip depressor   | Angle between the mouth corners and the upper lip center                          | AU12
f6      | Cheek raiser    | Angle between the nose wing and the nose center                                   | AU6
B. Group Happiness Intensity Estimation
We select CCRF and CCNF as our models for happiness intensity estimation in a group, as they have shown promising results for continuous variable modeling when extra context is required.
Both CCRF and CCNF are undirected graphical models that learn the conditional probability of a continuous-valued vector y given continuous inputs X. They are discriminative approaches, in which the conditional probability P(y|X) is modeled explicitly. The graphical models that represent CCRF and CCNF for emotion prediction in a group are presented in Figure 4.

The probability density function for CCRF and CCNF can be written as:
$$P(y|X) = \frac{\exp(\Psi)}{\int_{-\infty}^{\infty} \exp(\Psi)\, dy} \qquad (1)$$
In the CCRF model, Ψ is defined as:

$$\Psi = \sum_{i}\sum_{k=1}^{K_1} \alpha_k f_k(y_i, X_i) + \sum_{i,j}\sum_{k=1}^{K_2} \beta_k g_k(y_i, y_j, X) \qquad (2)$$
Above, X = {X1, X2, ..., Xn} is the set of facial feature vectors, which can be represented as a matrix in which each row corresponds to the feature vector of one detected face. y = {y1, y2, ..., yn} are the output variables that we want to predict; in our case, the happiness intensity of each individual in the group image.

Figure 4: Proposed frameworks: (a) CCRF model, (b) CCNF model.

In CCRF, two types of features are defined: vertex features fk and edge features gk.
$$f_k(y_i, X_i) = -(y_i - X_{i,k})^2 \qquad (3)$$

$$g_k(y_i, y_j, X) = -\tfrac{1}{2}\, S^{(k)}_{i,j}\,(y_i - y_j)^2 \qquad (4)$$
Vertex features fk represent the dependency between Xi,k and yi; in our case, the dependency between a happiness intensity prediction from a regressor and the actual happiness intensity level. The parameter αk controls the reliability of a particular signal for a particular emotion.
Edge features gk represent the dependencies between outputs yi and yj, for example, how related the happiness intensities of person A and person B in a group are. This relation is weighted by the similarity measure S(k). The parameters βk and the similarities S(k) allow us to control the effect of such connections between emotions; both αk and βk are constrained to be positive. We selected our similarity function as:
$$S_{i,j} = \exp\!\left(-\frac{\|X_i - X_j\|}{\delta}\right) \qquad (5)$$
In the CCNF model, Ψ is defined as:

$$\Psi = \sum_{i}\sum_{k=1}^{K_1} \alpha_k f_k(y_i, X_i, \theta_k) + \sum_{i,j}\sum_{k=1}^{K_2} \beta_k g_k(y_i, y_j, X) \qquad (6)$$
Here again, αk and βk are positive, while Θ is unconstrained. Similar to CCRF, CCNF has the same edge feature and uses the same similarity function to enforce smoothness between neighboring nodes. The vertex feature fk in CCNF, however, represents the mapping from Xi to yi through a one-layer neural network, and the new parameter θk represents the weight vector of a particular neuron k. The number of vertex features K1 is determined experimentally during cross-validation. The vertex feature in CCNF can be written as:
$$f_k(y_i, X_i, \theta_k) = -(y_i - h(\theta_k, X_i))^2 \qquad (7)$$

where

$$h(\theta, X_i) = \frac{1}{1 + e^{-\theta^{T} X_i}} \qquad (8)$$
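As a small illustration, the vertex feature of Eqs. (7)-(8) is just a sigmoid neuron followed by a squared-error penalty. The sketch below is ours and assumes `theta_k` and `X_i` are equal-length NumPy vectors:

```python
import numpy as np

def ccnf_vertex_feature(y_i, X_i, theta_k):
    """Sketch of the CCNF vertex feature, Eqs. (7)-(8)."""
    h = 1.0 / (1.0 + np.exp(-(theta_k @ X_i)))  # Eq. (8): one sigmoid neuron
    return -(y_i - h) ** 2                      # Eq. (7): squared-error potential
```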
In the learning stage, for the CCRF model we pick the α and β values, and for CCNF we pick the α, β, Θ, and K1 parameters, so as to optimize the conditional log-likelihood of the model on the training images. All parameters are optimized jointly:

$$L(\alpha, \beta, \Theta) = \sum_{q=1}^{n} \log P(y^{(q)}|X^{(q)}) \qquad (9)$$

$$(\hat{\alpha}, \hat{\beta}, \hat{\Theta}) = \arg\max_{\alpha,\beta,\Theta} L(\alpha, \beta, \Theta) \qquad (10)$$
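As a minimal sketch of this objective for the CCRF case, one can exploit the multivariate-Gaussian view noted in the next paragraph: Ψ is quadratic in y, so each image's likelihood is a Gaussian density. The code below is ours, not the authors'; it assumes a single edge feature, fixes δ = 1 for brevity, and exp-reparameterizes the weights to keep them positive.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, groups, K1):
    """Sketch: negative conditional log-likelihood (Eq. 9) for a CCRF.

    params : first K1 entries are log(alpha), the last is log(beta)
    groups : list of (X, y) pairs, one per training image
    """
    alpha, beta = np.exp(params[:K1]), np.exp(params[K1])
    nll = 0.0
    for X, y in groups:
        n = X.shape[0]
        # Similarity of Eq. (5) with delta = 1, no self-edges
        S = np.exp(-np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1))
        np.fill_diagonal(S, 0.0)
        # Psi = -y^T A y + 2 b^T y + const  =>  y | X ~ N(A^{-1} b, (2A)^{-1})
        A = alpha.sum() * np.eye(n) + beta * (np.diag(S.sum(1)) - S)
        b = X @ alpha
        mu = np.linalg.solve(A, b)
        _, logdet = np.linalg.slogdet(2 * A)
        nll -= 0.5 * (logdet - (y - mu) @ (2 * A) @ (y - mu))  # log-density up to constants
    return nll

# e.g. result = minimize(neg_log_likelihood, x0, args=(train_groups, K1), method="BFGS")
```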
Because both Eq. (2) and Eq. (6) are convex, the optimal parameter values can be determined using standard techniques such as stochastic gradient ascent or other general optimization methods. Moreover, since both the CCRF and CCNF models can be viewed as multivariate Gaussians [33][36], inferring the output values y that maximize P(y|X) is straightforward and efficient.
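To illustrate this, here is a hedged NumPy sketch of CCRF inference for one group image. Because Ψ in Eq. (2) is quadratic in y, the mode of P(y|X) solves a linear system; the sketch assumes a single edge feature with weight β, with all variable names ours rather than the paper's.

```python
import numpy as np

def ccrf_infer(X, alpha, beta, delta):
    """Sketch: MAP inference for the CCRF of Eqs. (2)-(5) on one group image.

    X     : (n, K1) matrix; column k holds base-regressor (e.g. SVR) predictions
    alpha : (K1,) positive vertex weights
    beta  : positive edge weight (single edge feature assumed)
    delta : bandwidth of the similarity kernel in Eq. (5)
    """
    n = X.shape[0]
    # S_ij = exp(-||X_i - X_j|| / delta), Eq. (5), with no self-edges
    S = np.exp(-np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1) / delta)
    np.fill_diagonal(S, 0.0)
    # Psi = -y^T A y + 2 b^T y + const, so the Gaussian mode solves A y = b
    L = np.diag(S.sum(axis=1)) - S           # graph Laplacian of the edge terms
    A = alpha.sum() * np.eye(n) + beta * L   # vertex anchoring + smoothing
    b = X @ alpha                            # pulls each y_i toward its predictions
    return np.linalg.solve(A, b)             # happiness intensity per face
```

In this quadratic view, the vertex terms anchor each face's output to its regressor predictions, while the Laplacian term smooths the outputs of similar-looking faces toward one another.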
IV. EXPERIMENTAL ANALYSIS
Because the HAPPEI database is the only dataset related
to both group and happiness intensity levels, we examine
the performance of our new facial feature and introduced
emotion estimation frameworks at the same time. All ex-
periments are conducted on MATLAB 2015a, with 3.16Hz
CPU and 4GB RAM computer environment.
2000 group images containing 7248 faces are used in our experiments. We conducted 4-fold cross-validation, where 1500 images are selected for training and 500 for testing in each fold. The reported results are the averages over the 4 folds.
First, we extracted LBP, LPQ, and PHOG features in order to compare their computational complexity with that of HRFF.
Table II: Average Feature Extraction Time

Features | Feature Dimension | Execution time (seconds)
LBP      | 256               | 0.0025
LPQ      | 256               | 0.5250
PHOG     | 680               | 0.0286
HRFF     | 6                 | 0.0004
As we can see from Table II, the LPQ feature takes the longest to extract. Although PHOG has the highest dimensionality (680), its extraction time is much smaller than that of LPQ. LBP is faster than PHOG and LPQ because calculating LBP does not require any transformation, whereas LPQ is based on computing the short-term Fourier transform (STFT) on each local image patch, and PHOG, as an extension of HOG, is based on simple gradient operations. That is why LBP is faster than PHOG, and PHOG is faster than LPQ. HRFF, however, outperforms all of these features in terms of extraction and processing speed, because it requires only a few calculations on coordinate values. This compactness and fast extraction time are highly desirable in real-time emotion analysis systems, such as real-time event satisfaction analysis and tracking.
Next, we use the extracted features to train and test the emotion estimation models introduced above, evaluating the performance of each descriptor and of the structured regression models at the same time. We compared the performance of CCRF and CCNF with the most popular regression model, Support Vector Regression (SVR), to show how relational learning models improve performance over single-face analysis methods.
For the SVR-based experiments, we used 2-fold cross-validation on each fold of training data to pick the hyper-parameters. The chosen hyper-parameters are then used to train on the whole training set.
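The paper's experiments were run in MATLAB, so purely as an illustration of this protocol, a scikit-learn equivalent could look like the following sketch; `X_train`, `y_train`, `X_test`, and the grid values are all assumptions of ours:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# 2-fold grid search on the training fold, then refit on all of it.
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, "scale"]}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=2)
search.fit(X_train, y_train)   # X_train: HRFF features, y_train: intensity labels
y_pred = search.best_estimator_.predict(X_test)
```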
For the CCRF-based experiments, each fold of training data is split into two parts: one part for training the SVR and the other for training the CCRF. We then performed 2-fold cross-validation on both the SVR and CCRF training data to choose the hyper-parameters, which are then used for training on the whole training set.
For the CCNF-based experiments, we also used 2-fold cross-validation on each fold of training data to pick the hyper-parameters. As with CCRF, the chosen hyper-parameters are used for training on the whole training set. The BFGS quasi-Newton method is used in both the cross-validation and training stages.
We used two different evaluation metrics: mean squared error (MSE) for prediction accuracy, and the average correlation coefficient for prediction structure. These are the most common evaluation metrics for regression models. Note that smaller MSE values correspond to better performance, while the opposite is true for correlation coefficients.
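For reference, both metrics are straightforward to compute; the helper below is a minimal sketch of ours, not code from the paper:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (MSE, Pearson correlation) for one set of predictions."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    mse = np.mean((y_true - y_pred) ** 2)
    corr = np.corrcoef(y_true, y_pred)[0, 1]
    return mse, corr
```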
Table III shows the average mean squared error (MSE) of happiness intensity estimation for the different models and facial features, and Table IV presents the corresponding average correlation coefficients.
Table III: Mean Squared Error

Model      | LBP   | LPQ   | PHOG  | HRFF
SVR        | 1.549 | 1.441 | 0.811 | 0.588
SVR + CCRF | 1.531 | 1.425 | 0.796 | 0.575
CCNF       | 1.514 | 1.410 | 0.783 | 0.561
Table IV: Correlation Coefficient

Model      | LBP   | LPQ   | PHOG  | HRFF
SVR        | 0.039 | 0.097 | 0.486 | 0.632
SVR + CCRF | 0.043 | 0.104 | 0.491 | 0.635
CCNF       | 0.041 | 0.107 | 0.496 | 0.640
As Tables III and IV show, the best result is achieved when CCNF and HRFF are combined. LBP and LPQ obtained the highest MSE and the lowest correlation coefficients, while the performance of PHOG falls between that of HRFF and the other appearance features. LBP and LPQ are strongly affected by identity bias, which makes them poor options for facial expression analysis. PHOG performs better than LBP and LPQ because it takes both gradient orientations and spatial layout into consideration. Our geometric feature outperforms all the other face descriptors on these images collected in the wild, because HRFF is directly related to the happiness-related facial AUs.
The experimental results in Tables III and IV also show that the combination of SVR and CCRF obtains consistently better results than SVR alone on both evaluation metrics. This confirms that considering the relations and reciprocities among group members improves emotion estimation. Of the two structured regression models we introduced, CCNF achieves the best results because of its learning capacity and the nonlinearity of its neural network. Compared to CCRF, the training process of CCNF is also simpler, because it does not have to be combined with another regression model: it takes the facial features as direct input and trains while considering the emotional relations from the beginning.
V. CONCLUSION
In this paper, we proposed a novel facial descriptor and introduced two models for the problem of happiness intensity estimation in a group context. We extracted compact geometric features from facial landmarks that refer to facial action units (AUs). For emotion estimation, we used two structured regression frameworks, Continuous Conditional Random Fields (CCRF) and Continuous Conditional Neural Fields (CCNF). The combination of the feature descriptor and the emotion estimation models is used to infer the happiness intensities of a group of people.
We conducted experiments on the HAPPEI database to show how the proposed facial feature considerably improves the performance of happiness intensity estimation. We also tested the performance of the two structured regression models and compared them with the most popular regression model, Support Vector Regression (SVR). The experimental results indicate that, compared to traditional single-face analysis methods, considering the relations between faces in a group improves emotion estimation accuracy significantly. The results also show that CCNF performs better than CCRF.
In the future, we will extend our method to real-time emotion tracking of multiple people in video sequences. We also expect to use deep learning methods to further improve the accuracy of emotion estimation and prediction.
VI. ACKNOWLEDGMENT
This material is based upon work partially supported by NASA under Grant Number NNX15AV40A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
REFERENCES
[1] J. R. Kelly and S. G. Barsade, Mood and Emotions in Small Groups and Work Teams, 3rd ed. Harlow, England: Addison-Wesley, 1999.
[2] S. Barsade and D. Gibson, Group Emotion: A View from Top and Bottom, 3rd ed. Harlow, England: Addison-Wesley, 1999.
[3] Z. Zeng, M. Pantic, G. I. Roisman, and T. S. Huang, A survey of affect recognition methods: Audio, visual, and spontaneous expressions, IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 1, pp. 39-58, Jan. 2009.
[4] E. Sariyanidi, H. Gunes, and A. Cavallaro, Automatic analysis of facial affect: A survey of registration, representation and recognition, IEEE Trans. Pattern Anal. Mach. Intell., pp. 1-22, 2014.
[5] A. Dhall, R. Goecke, and T. Gedeon, Automatic Group Happiness Intensity Analysis, IEEE Transactions on Affective Computing, vol. 6, no. 1, 2015.
[6] W. Mou, O. Celiktutan, and H. Gunes, Group-level Arousal and Valence Recognition in Static Images: Face, Body and Context, IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015.
[7] A. Dhall, J. Joshi, K. Sikka, R. Goecke, and N. Sebe, The More the Merrier: Analysing the Affect of a Group of People in Images, IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015.
[8] A. Dhall, J. Joshi, I. Radwan, and R. Goecke, Finding Happiest Moments in a Social Context, ACCV, 2012.
[9] S. Lucey, A. B. Ashraf, and J. Cohn, Investigating spontaneous facial action recognition through AAM representations of the face, Face Recognition Book. Mamendorf, Germany: Pro Literatur Verlag, 2007.
[10] M. Valstar, H. Gunes, and M. Pantic, How to distinguish posed from spontaneous smiles using geometric features, Proc. ACM Int. Conf. Multimodal Interfaces, 2007.
[11] Y. L. Tian, T. Kanade, and J. Cohn, Recognizing action units for facial expression analysis, IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 97-115, Feb. 2001.
[12] G. Littlewort, M. S. Bartlett, I. Fasel, J. Susskind, and J. Movellan, Dynamics of facial expression extracted automatically from video, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2004.
[13] D. McDuff, R. El Kaliouby, K. Kassam, and R. Picard, Affect valence inference from facial action unit spectrograms, IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2004.
[14] F. Zhou, F. De la Torre, and J. F. Cohn, Unsupervised discovery of facial events, IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2010.
[15] M. X. Huang, G. Ngai, and K. A. Hua, Identifying User-specific Facial Affects from Spontaneous Expressions with Minimal Annotation, IEEE Transactions on Affective Computing, 2015.
[16] X. Xiong and F. De la Torre, Supervised descent method and its applications to face alignment, IEEE CVPR, 2013.
[17] T. Ahonen, A. Hadid, and M. Pietikainen, Face Description with Local Binary Patterns: Application to Face Recognition, IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, 2006.
[18] V. Ojansivu and J. Heikkila, Blur Insensitive Texture Classification Using Local Phase Quantization, Proc. Int. Conf. Image Signal Process., pp. 236-243, 2008.
[19] N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, IEEE Conf. Comput. Vis. Pattern Recognit., vol. 1, pp. 886-893, 2005.
[20] E. Sariyanidi, H. Gunes, M. Gokmen, and A. Cavallaro, Local Zernike moment representations for facial affect recognition, British Machine Vision Conference, 2013.
[21] C. Liu and H. Wechsler, Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition, IEEE Transactions on Image Processing, 2002.
[22] A. C. Gallagher and T. Chen, Understanding Images of Groups of People, IEEE CVPR, 2009.
[23] M. Eichner and V. Ferrari, We are Family: Joint Pose Estimation of Multiple Persons, European Conference on Computer Vision, 2010.
[24] V. Imbrasaite, T. Baltrusaitis, and P. Robinson, CCNF for Continuous Emotion Tracking in Music: Comparison with CCRF and relative feature representation, IEEE Intern. Conf. on Multimedia and Expo, Multimedia Affective Computing, 2014.
[25] T. Baltrusaitis, P. Robinson, and L.-P. Morency, Continuous Conditional Neural Fields for Structured Regression, ECCV, 2014.
[26] C. M. Bishop, Pattern Recognition and Machine Learning, Springer-Verlag New York, Inc., 2006.
[27] P. Ekman and W. V. Friesen, The Facial Action Coding System: A Technique for the Measurement of Facial Movement, San Francisco: Consulting Psychologists Press, 1978.
[28] P. Viola and M. Jones, Rapid Object Detection using a Boosted Cascade of Simple Features, IEEE CVPR, 2001.
[29] A. Bosch, A. Zisserman, and X. Munoz, Representing shape with a spatial pyramid kernel, ACM International Conference on Image and Video Retrieval (CIVR), 2007.
[30] H. G. Chou and N. Edge, "They are Happier and Having Better Lives than I Am": The Impact of Using Facebook on Perceptions of Others' Lives, Cyberpsychology, Behavior, and Social Networking, vol. 15, no. 2, 2012.
[31] T. Qin, T.-Y. Liu, X. Zhang, D. Wang, and H. Li, Global Ranking Using Continuous Conditional Random Fields, Conference on Neural Information Processing Systems (NIPS), 2008.
[32] V. Imbrasaite, T. Baltrusaitis, and P. Robinson, Emotion Tracking in Music Using Continuous Conditional Random Fields and Relative Feature Representation, IEEE Intern. Conf. on Multimedia and Expo Workshops, 2013.
[33] T. Baltrusaitis, N. Banda, and P. Robinson, Dimensional Affect Recognition using Continuous Conditional Random Fields, IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2013.
[34] T. Kanade, J. F. Cohn, and Y. Tian, Comprehensive database for facial expression analysis, IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2000.