PROGRESS ON EMOTION RECOGNITION

ERMIS / IST-2000-29319

J G Taylor & N Fragopanagos, King’s College London

KCL WORK IN ERMIS

• Analysis of emotion v cognition in human brain (→ simulations of emotion/attention paradigms)

• → emotion recognition architecture ANNA

• ANNA hidden layer = emotion state, + feedback control for attention (= IMC)

• Learning laws for ANNA developed

• ANNA fuses all modalities or only one

• HUMAINE: WP3 + WP4


BASIC BRAIN EMOTION CIRCUIT

• Valence in amygdala & OBFC

• Attention in parietal & PFC

• Interaction in ACG

[Figure: brain emotion circuit diagram. Node labels: SC, Parietal, A, Thal, ACG, SFG, NBM]


• SIMPLIFIED ARCHITECTURE OF EMOTIONAL/COGNITIVE PROCESSING IN THE BRAIN:


DETAILED ARCHITECTURE FOR FACES CLASSIFICATION

[Figure: face classification architecture diagram. Surviving label: gender]


BASIC ERMIS EMOTION RECOGNITION ARCHITECTURE:

[Figure: architecture diagram. Labels: feature vector inputs; emotion state as hidden layer; output as recognised emotional state; attention control system]


ANNA:

Assume linear output:

$$\mathrm{OUT} = \sum_i a_i y_i$$

Hidden layer response:

$$y_i = f\Big(\sum_j w_{ij} x_j \Big[1 + \sum_k A_{ijk} z_k\Big]\Big)$$

IMC node response:

$$z_k = f\Big(\sum_i B_{ki} y_i\Big)$$

Then solve the self-consistent equations for (y, z) for each training input by relaxation.
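The relaxation step can be sketched as a damped fixed-point iteration. This is only an illustration, not the ERMIS implementation: it reads the hidden-layer response as y_i = f(Σ_j w_ij x_j [1 + Σ_k A_ijk z_k]) and the IMC response as z_k = f(Σ_i B_ki y_i), and it assumes a logistic sigmoid for f, which the slides do not specify. The function name, the damping factor and the stopping tolerance are likewise hypothetical choices.

```python
import numpy as np


def sigmoid(u):
    """Logistic squashing function (an assumed choice for f)."""
    return 1.0 / (1.0 + np.exp(-u))


def anna_relax(x, w, A, B, a, n_iter=200, tol=1e-8, damping=0.5):
    """Solve the coupled (y, z) equations for one input x by relaxation.

    Shapes: x (n_in,), w (n_hidden, n_in), A (n_hidden, n_in, n_imc),
    B (n_imc, n_hidden), a (n_hidden,).
    Returns the hidden state y, the IMC state z and the linear output.
    """
    y = np.zeros(w.shape[0])
    z = np.zeros(B.shape[0])
    for _ in range(n_iter):
        # gain[i, j] = 1 + sum_k A[i, j, k] * z[k]: attention feedback
        # multiplicatively modulates each input weight.
        gain = 1.0 + A @ z
        y_new = sigmoid((w * gain) @ x)
        z_new = sigmoid(B @ y_new)
        # Damped update: move only part of the way to the new values.
        y_next = (1.0 - damping) * y + damping * y_new
        z_next = (1.0 - damping) * z + damping * z_new
        delta = max(np.max(np.abs(y_next - y)), np.max(np.abs(z_next - z)))
        y, z = y_next, z_next
        if delta < tol:
            break
    out = a @ y  # linear output: OUT = sum_i a_i * y_i
    return y, z, out
```

With small weights the update map is a contraction and the loop settles quickly; for larger feedback weights convergence is not guaranteed, and a smaller damping factor or more iterations may be needed.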

NATURE OF ANNA

• Handles both unimodal and multi-modal data (input vector x of arbitrary dimension, not too large)

• Needs consistent input and output data {x(t), OUT(t)}, with t specified for both x & OUT = (activation, evaluation)

• Uses SALAS database (450 tunes) from QUB (Roddie/Ellie/Cate)

UNIMODAL RESULTS

• Can use numerous representations of emotion: extreme, continuous in n dimensions, …

• ANNA → FEELTRACE output (continuous 2-D)

• Trained on unimodal for prosody

• First look at word content


Text Post-Processing Module

• Prof. Whissell compiled ‘Dictionary of Affect in Language (DAL)’

• Mapping of ~9000 words → (activation-evaluation), based on students’ assessment

• Take words from meaningful segments obtained by pause detection → (activation-evaluation) space

• But humans use context to assign emotional content to words
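The word-to-space mapping described above can be sketched as a dictionary lookup followed by averaging over a segment. Everything specific here is an assumption for illustration: `TOY_DAL` is a hypothetical four-word stand-in with invented values (not entries from Whissell's dictionary), ratings are assumed to be centred on zero, averaging is one possible way to combine words, and the `threshold` rule is one possible reading of "spotting only highly rated words".

```python
# Hypothetical miniature dictionary: word -> (activation, evaluation).
# Values are invented for illustration and are NOT taken from the DAL.
TOY_DAL = {
    "happy": (0.4, 0.9),
    "angry": (0.8, -0.7),
    "tired": (-0.6, -0.3),
    "calm": (-0.5, 0.5),
}


def segment_to_ae(words, dal=TOY_DAL, threshold=0.0):
    """Map one pause-delimited segment to a point in (activation, evaluation) space.

    Words missing from the dictionary are skipped, as are words whose
    strongest rating is below `threshold`. Returns None if no word is spotted.
    """
    hits = []
    for word in words:
        rating = dal.get(word.lower())
        if rating is None:
            continue
        activation, evaluation = rating
        if max(abs(activation), abs(evaluation)) < threshold:
            continue
        hits.append(rating)
    if not hits:
        return None
    n = len(hits)
    return (sum(a for a, _ in hits) / n, sum(e for _, e in hits) / n)
```

Because each word is looked up in isolation, a sketch like this cannot capture the context effect noted in the last bullet.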


Text Post-Processing Module (SALAS data)

Table 1. Quadrant match for normal text (full DAL).

Participant          P1    P2    P3    P4    P9    P12   All
Quadrant match (%)   21.4  12.5  21.4  30.4  25.0  19.6  16.1

Table 2. Quadrant match for scrambled text (full DAL).

Participant          P5    P6    P7    P8    P10   P11   All
Quadrant match (%)   7.1   23.2  25.0  32.1  23.2  21.4  21.4

Table 3. Standard deviation of participants’ assessments for normal and scrambled text (average over all passages assessed).

             Normal  Scrambled
Evaluation   1.24    1.45
Activation   1.55    1.73

Table 4. Quadrant match averaged over participants’ groups for normal text and scrambled text when threshold for DAL range* is varied.

Threshold        0.0   0.25  0.5   0.75
Normal text      16.1  16.0  12.5  16.4
Scrambled text   21.4  21.4  19.6  21.8

*The higher the threshold, the more strongly emotionally rated a word must be in order to be spotted.

Conclude: need further context/semantics
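For reference, a quadrant match score of the kind reported in the tables could be computed as follows. This is a sketch under stated assumptions: points are (activation, evaluation) pairs centred on zero, a value of exactly zero is counted as positive, and the function names are hypothetical.

```python
def quadrant(activation, evaluation):
    """Identify a quadrant by the signs of its coordinates (zero counts as positive)."""
    return (activation >= 0, evaluation >= 0)


def quadrant_match(predicted, reference):
    """Percentage of paired (activation, evaluation) points lying in the same quadrant."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(quadrant(*p) == quadrant(*r) for p, r in zip(predicted, reference))
    return 100.0 * hits / len(predicted)
```

If the four quadrants were equally likely, random assignment would score about 25%, which gives a rough reference point for reading the percentages above.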


Correlational analysis of ASSESS features

• Correlational analysis between ~450 ASSESS features and FeelTrace =>

– ASSESS features correlate more highly with activation

– Similar top ranking features for 3 out of 4 FeelTracers (but still differences)

– Different top ranking features for different SALAS subjects

-> Is there a male/female trend? Difficult to say - insufficient data
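A feature ranking of this kind can be sketched with Pearson correlation. The slides do not say which correlation measure was used, so Pearson's r is an assumption here, as are the function name and the array layout (one row per tune, one column per ASSESS feature).

```python
import numpy as np


def rank_features_by_correlation(features, target, top_n=10):
    """Rank feature columns by absolute Pearson correlation with a target trace.

    features: array of shape (n_samples, n_features)
    target:   array of shape (n_samples,), e.g. FeelTrace activation
    Returns (indices, correlations) for the top_n features, strongest first.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    fc = features - features.mean(axis=0)
    tc = target - target.mean()
    denom = np.sqrt((fc ** 2).sum(axis=0) * (tc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (fc * tc[:, None]).sum(axis=0) / denom
    r = np.nan_to_num(r)  # constant columns have undefined r; treat as 0
    order = np.argsort(-np.abs(r))[:top_n]
    return order, r[order]
```

Running it once per FeelTracer and per SALAS subject, then comparing the returned index lists, would be one way to check how far the top-ranking features overlap.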


ANNA on top correlated ASSESS features

• Quadrant match using top 10 activation features + top 10 evaluation features and activation – evaluation output space:

Feeltracer       jd    cc    dr    em
Avg Quad Match   0.42  0.39  0.37  0.45
Std Dev          0.03  0.02  0.02  0.04


ANNA on top correlated ASSESS features

• Half-plane match using top 10 activation features and activation only output space:

Feeltracer       jd    cc    dr    em
Avg Quad Match   0.75  0.66  0.64  0.74
Std Dev          0.02  0.02  0.02  0.03
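A half-plane match on activation alone can be sketched as sign agreement between predicted and traced values. The definition below is an assumption (activation centred on zero, zero counted as positive, score returned as a fraction as in the table), and the function name is hypothetical.

```python
def half_plane_match(predicted, reference):
    """Fraction of paired activation values that fall on the same side of zero."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum((p >= 0) == (r >= 0) for p, r in zip(predicted, reference))
    return hits / len(predicted)
```

With only two half-planes, random assignment would score about 0.5 if both sides were equally likely, so these figures are not directly comparable with four-way quadrant scores.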


PRESENT SITUATION OF ANNA: MULTIMODAL

• Time-stamped data now becoming available for lexical (ILSP) & face streams (NTUA)

• Expect to have results in about 1 month for recognition for fused modalities (faces/prosody/words)

CONCLUSIONS

• UNIMODAL: ANNA on prosody OK (especially activation)

• MULTIMODAL: Soon to be done

• On semi-realistic data (SALAS QUB)

• Future work: 1) analysis of detailed results 2) insert temporality in ANNA


QUESTIONS

• How to handle variations across experiencers and across FEELTRACERS?

• How to incorporate expert knowledge?

• How to combine recognition across models?

• Coding of emotions: as dimensional reps or as dissociated states (sad AMYG v angry OBFC)?

• Nature of emotions as goal/reward assessment (frustration → anger; impossible → sadness, etc: brain-based)?
