Computational Aspects of Emotion in Adaptive Behavior Joost Broekens, Walter Kosters, Fons Verbeek LIACS, Leiden University, The Netherlands

Overview

• Emotion & Information Processing.

• Adaptive agents:
  – reactive,
  – cognitive,
  – emotion-modulated cognitive agents.

• Experiment: Pleasure regulates information processing.

• Future work.

Emotion: communication medium, decision heuristic and modulator.

• Common emotions: fear, anger, happiness, sadness, surprise, disgust.
• Short episode triggered by an (internal/external) event, composed of
  – subjective feelings,
  – inclinations to act (action preparation, action tendency (Frijda)),
  – facial expressions,
  – cognitive evaluation, and
  – physiological arousal (heartbeat, alertness).
• Emotion: communication medium.
  – Communicates internal state (biological & sociological evidence: Darwin, Ekman).
• Emotion: decision heuristic relating events to the goals, needs, desires and beliefs of an agent.
  – Result of evaluating personal relevance; helps decision-making (neurological & cognitive evidence: Damasio, appraisal theory).
• Emotion: influences information processing.
  – Neurocomputational & cognitive evidence: Doya; Frijda, Manstead and Bem.


Emotion & Information Processing

• Biology → Emotion: internal drives, homeostasis, hardwired reactions.
• Cognition → Emotion: cognitive emotion elicitation:
  – Emotions result from the interpretation of our world in relation to our goals, needs, desires, beliefs, etc. (appraisal theory: Frijda, Lazarus, Arnold, etc.).
• Emotion → Behavior: emotion influences adaptive behavior:
  – emotion as drive,
  – emotion as source of information,
  – emotion as modulator of cognitive processes.
  – Relates to different types of (views on aspects of) adaptive agents:
    • reactive,
    • cognitive,
    • emotion-modulated cognitive agents.


Emotions and reactive agents

• Reactive agents:
  – have predefined behaviors,
  – learn new behavior based on instrumental conditioning, and
  – select behaviors based on this learned model and on internal drives (motivations).
• Emotion influences behavior:
  – can be such an internal drive, and
  – can trigger typical behaviors (fight / flight).
• Computational models that study emotion within this context (drive/motivation): Avila-Garcia and Cañamero, 2004; Cañamero, 1997; Velasquez, 1998.


Emotion and cognitive agents

• Cognitive agents are reactive agents plus:
  – internally represented knowledge used in
  – planning and reasoning, and an
  – attention mechanism guiding perception and action,
  – etc.
• Emotion influences behavior:
  – is a source of (explicit) information used in reasoning (knowledge), and
  – can (implicitly) modulate information processing (systemic influence).
• Computational models in which emotion is used as information (e.g. Botelho and Coelho).


Thinking: Internal Simulation of Behavior

• Internal simulation of behavior
  – Covertly execute and evaluate potential interaction using sensory-motor substrates (Hesslow, 2002; Damasio; Cotterill, 2001), but see also
  – “interaction potentialities” (Bickhard), and
  – “state anticipation” (Butz, Sigaud, Gérard, 2003).
  – Existing mechanisms are the basis for simulation.
  – Evolutionary continuity!
• Our basis for information processing.


Emotion modulates information processing

• Emotion influences thinking and behavior at multiple levels of cognitive complexity (Frijda, Manstead and Bem, 2000; Damasio, 1994; Davidson, 2000; Berridge, 2003; Rolls, 2000).
• Emotion is integrated at multiple levels of processing, and higher levels of processing (conscious, reflective reasoning) have not always existed. An evolutionary advantage to integrating emotion at lower levels can therefore be expected: levels close to reward systems and behavioral control.
  – If thinking is internal simulation of behavior, these low-level integration mechanisms should also teach us about the influence of emotion on higher-level cognitive mechanisms, e.g., on attention.
• In this research we focus on the low-level influence of emotion on information processing in simulated adaptive agents.
• We use emotion as a metalearning parameter (Doya, 2000).
• Emotion: pleasure and arousal (Russell, 2003).


Experiment: Can pleasure regulate information processing such that this provides an adaptive advantage for the agent?


Pleasure regulates information processing

[Figure: agent architecture. Regions: cognitive influence; reactive behavior (RL model). Components: ENVIRONMENT, Perception, Interaction-selection, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure, predicted interactions, simulated interaction, simulated reinforcement.]


Learning

• The agent learns to interact with the environment through Reinforcement Learning (instrumental conditioning).
  – Agent’s actions are rewarded or punished.
  – Learns value-state predictions of potential next states.
  – Uses these predictions to determine what action to do next.
  – Basics of the model are based on (Sutton and Barto, 1998).
• Learns through continuous interaction.
• Learns based on perception-action pairs.


Learning: reinforcement example

Reward: propagate back to beginning, using a mechanism that solves the temporal credit assignment problem (i.e., find actions responsible for reward).
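
The backward propagation of reward can be sketched with tabular TD(0) on a small chain of states. This is only an illustration of temporal credit assignment, not the model used in the presentation; the chain environment, the function name `td0_chain` and the parameters `alpha` and `gamma` are assumptions made for the example.

```python
def td0_chain(n_states=5, episodes=200, alpha=0.5, gamma=0.9, reward=1.0):
    """Tabular TD(0) on a chain: state 0 -> 1 -> ... -> n_states-1 (terminal).

    Only the final transition is rewarded; repeated episodes propagate that
    reward back toward the start of the chain.
    """
    values = [0.0] * n_states  # the terminal state keeps value 0
    for _ in range(episodes):
        for s in range(n_states - 1):
            s_next = s + 1
            r = reward if s_next == n_states - 1 else 0.0
            # TD(0) update: move V(s) toward r + gamma * V(s_next)
            values[s] += alpha * (r + gamma * values[s_next] - values[s])
    return values


values = td0_chain()
print([round(v, 3) for v in values])
```

States closer to the reward end up with higher values, so an agent that follows the value gradient is led toward the rewarded transition.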


Action-Selection

[Figure: agent architecture with a distributed-state RL model as the reactive part. Components: ENVIRONMENT, Perception, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure.]

Action-Selection

• Value-state predictions are transformed into action-values.
• Action-selection is based on these action values.
  – Choose an action from the set of action-value pairs stochastically (e.g. using a Boltzmann distribution).
• Action-selection is responsible for exploration vs. exploitation behavior.
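
Stochastic selection from action values can be sketched as follows. This is a minimal example of Boltzmann (softmax) selection; the function names, the `temperature` parameter and the action values are hypothetical and not taken from the presentation.

```python
import math
import random


def boltzmann_probabilities(action_values, temperature=1.0):
    """Map action values to selection probabilities (softmax with temperature)."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(action_values.values())
    weights = {a: math.exp((v - top) / temperature) for a, v in action_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def select_action(action_values, temperature=1.0, rng=random):
    """Sample one action according to its Boltzmann probability."""
    probs = boltzmann_probabilities(action_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


action_values = {"a": 1.0, "b": 0.0, "c": -1.0}  # hypothetical values
greedy = boltzmann_probabilities(action_values, temperature=0.1)
exploratory = boltzmann_probabilities(action_values, temperature=10.0)
chosen = select_action(action_values, temperature=1.0, rng=random.Random(0))
```

A low temperature concentrates almost all probability on the best action (exploitation); a high temperature flattens the distribution (exploration).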


Our agent’s cognitive part (based on internal simulation of behavior)

[Figure: agent architecture showing the cognitive part. Regions: cognitive influence; model. Components: ENVIRONMENT, Perception, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure.]

Simulation: action-selection bias

At every step, instead of going straight to action-selection, a subset of the predicted interactions from the reinforcement learning model is selected and fed back to the RL model.

[Figure: agent architecture with a hierarchical-state RL model as the reactive part. Regions: cognitive influence; reactive behavior. Components: ENVIRONMENT, Perception, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure.]

1. Interaction-selection: select a subset of predicted interactions.
2. Simulate-and-bias-predicted-benefit: feed these back to the model as if they were real interactions.
3. Action-selection: select the next action using the action-selection mechanism explained earlier, based on the now biased action values.
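
The three steps can be sketched in code. This is a simplified illustration under stated assumptions, not the authors' implementation: the `lookahead` table stands in for the RL model's predictions, simulation overwrites an action value with a one-step look-ahead estimate, and the final step is greedy instead of stochastic. All names and numbers are hypothetical.

```python
def select_interactions(predicted, fraction):
    """Step 1: keep the best `fraction` of predicted interactions by value."""
    ranked = sorted(predicted, key=lambda p: p["value"], reverse=True)
    keep = max(1, round(fraction * len(ranked)))
    return ranked[:keep]


def simulate_and_bias(action_values, selected, lookahead, gamma=0.9):
    """Step 2: treat each selected interaction as if it had really happened.

    `lookahead[action]` is a hypothetical model prediction: the simulated
    reinforcement and the best predicted value one step further on.
    """
    biased = dict(action_values)
    for p in selected:
        reward, next_best = lookahead[p["action"]]
        biased[p["action"]] = reward + gamma * next_best
    return biased


def choose_greedy(action_values):
    """Step 3 (simplified to greedy): pick the highest-valued action."""
    return max(action_values, key=action_values.get)


action_values = {"left": 0.4, "right": 0.1}  # hypothetical values
predicted = [{"action": a, "value": v} for a, v in action_values.items()]
lookahead = {"left": (-1.0, 0.0), "right": (0.0, 1.0)}  # hypothetical model

before = choose_greedy(action_values)
selected = select_interactions(predicted, fraction=1.0)
biased = simulate_and_bias(action_values, selected, lookahead)
after = choose_greedy(biased)
print(before, "->", after)
```

In this toy case the unbiased values favor "left", while the biased values favor "right" because simulation reveals a penalty behind "left".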


Simulation: example

• Action list before simulation (hypothetical example!):
  – {up=0.2, down=-0.5, right=-1, left=-1}
• Action-selection would have selected “up”.
  – With Boltzmann selection, “up” has a high probability.
• Simulate all interactions.
  – Propagate back the predicted values by simulating interaction with the environment.
  – Effect is a “value look-ahead” of 1 step.
• Action list after simulation:
  – {up=0.1, down=0.5, right=-1, left=-1}
• Action-selection selects “down”.
• In this example, simulating all predicted interactions helps.

[Figure label: Roadblock, r=-.5]
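
The effect on selection probabilities can be worked out for the two action lists of this example. The temperature of 0.25 is an assumption chosen for illustration; the presentation does not state one.

```python
import math


def softmax(values, temperature):
    """Boltzmann probabilities for a dict of action values."""
    exps = {a: math.exp(v / temperature) for a, v in values.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}


before = {"up": 0.2, "down": -0.5, "right": -1.0, "left": -1.0}
after = {"up": 0.1, "down": 0.5, "right": -1.0, "left": -1.0}
temperature = 0.25  # assumed, not from the presentation

p_before = softmax(before, temperature)
p_after = softmax(after, temperature)
print({a: round(p, 3) for a, p in p_before.items()})
print({a: round(p, 3) for a, p in p_after.items()})
```

With this temperature, “up” gets a probability of about 0.93 before simulation, and “down” gets about 0.83 after simulation.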


But: Simulating Everything is not Always Best

• Even apart from the fact that simulating everything costs mental effort.
• Earlier experiments (Broekens, 2005) showed that
  – simulation has benefit, especially when many interactions are simulated. This is not surprising (better heuristic). However,
  – in some cases less simulation resulted in better learning. There is a dynamic relation between environment and simulation “strategy” (i.e. simulation threshold: percentage of all predicted interactions to be simulated).

Emotion as metalearning to adapt the amount of internal simulation? (Doya, 2002)
  – Pleasure is an indication of the current performance of the agent (Clore and Gasper, 2000). Also,
  – high pleasure → top-down thinking, and low pleasure → bottom-up thinking (Fiedler and Bless, 2000).


Pleasure Modulates Simulation

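[Figure: agent architecture showing pleasure modulating simulation. Regions: cognitive influence; model. Components: ENVIRONMENT, Perception, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure.]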



• Many theories of emotion.
• We use the core-affect (or activation-valence) theory of emotion as basis.
  – Two fundamental factors, pleasure and arousal (Russell, 2003).
  – Pleasure relates to emotional valence, and
  – arousal relates to action-readiness, or activity.
• In this study we model pleasure as simulation threshold.
  – We use pleasure to dynamically adapt the number of interactions that are simulated. It is thus used as a dynamic simulation threshold.
  – We study the indirect effect of emotion as a metalearning parameter affecting information processing, which in turn influences action-selection.



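• Pleasure quantification: indication of current performance relative to what the agent is used to.
  – We tried to capture this by the normalized difference between the short-term average reinforcement signal and the long-term average reinforcement signal:

\[ e_p = \frac{r_{star} - (r_{ltar} - f\,\sigma_{ltar})}{2 f\,\sigma_{ltar}} \]

    (symbols as recovered from the slide text: r_{star} and r_{ltar} are the short-term and long-term average reinforcement; \sigma_{ltar} is taken to be the standard deviation of the long-term signal and f a scaling factor)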

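[Figure: agent architecture with a hierarchical-state RL model as the reactive part. Regions: cognitive influence; reactive behavior. Components: ENVIRONMENT, Perception, Action-selection, Emotion process. Signals: interaction, action, reinforcement, percept, stimulus, pleasure (e_p).]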

• Continuous pleasure feedback:
  – High pleasure, going well? Continue strategy, goal-directed thinking.
    • High e_p: high threshold, simulate only the predicted best interactions.
  – Low pleasure? Look broader, pay more attention to all predicted interactions.
    • Low e_p: low threshold, simulate many interactions.

The formula for e_p is the only formula in the presentation!
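
A sketch of the pleasure signal and its use as a simulation threshold follows. It reads the slide's formula as e_p = (r_star - (r_ltar - f*sigma_ltar)) / (2*f*sigma_ltar), which is a reconstruction from garbled text. The clipping to [0, 1], the neutral value for a flat signal, and the mapping from e_p to a number of interactions are all assumptions made for illustration.

```python
import math


def pleasure(r_star, r_ltar, sigma_ltar, f=1.0):
    """Pleasure e_p from short-term and long-term average reinforcement."""
    spread = f * sigma_ltar
    if spread <= 0.0:
        return 0.5  # assumption: flat long-term signal gives neutral pleasure
    e_p = (r_star - (r_ltar - spread)) / (2.0 * spread)
    return min(1.0, max(0.0, e_p))  # assumption: clip to [0, 1]


def interactions_to_simulate(e_p, n_predicted):
    """Hypothetical mapping: high pleasure simulates few (best) interactions,
    low pleasure simulates many."""
    return max(1, math.ceil((1.0 - e_p) * n_predicted))


neutral = pleasure(r_star=0.5, r_ltar=0.5, sigma_ltar=0.2)   # doing as usual
better = pleasure(r_star=0.7, r_ltar=0.5, sigma_ltar=0.2)    # doing better
worse = pleasure(r_star=0.3, r_ltar=0.5, sigma_ltar=0.2)     # doing worse
print(neutral, better, worse)
print(interactions_to_simulate(1.0, 4), interactions_to_simulate(0.0, 4))
```

When the short-term average equals the long-term average, the signal sits at 0.5; performing one scaled standard deviation above or below the long-term average pushes it to 1 or 0.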


Experimental setup

• To measure the adaptive effect of pleasure-modulated simulation: force the agent to adapt to a new task.
  – First the agent has 128 trials to learn task 1, then
  – switch environment to a new task, 128 trials to learn task 2.
  – Repeat for many different parameter settings (e.g. the window of the long- and short-term average reinforcement signals, the learning rate, etc.).
• Pleasure predictions:
  – Pleasure increases to a value near 1 (agent gets better at task),
  – then slowly converges down to .5 (agent gets used to task).
  – At switch: pleasure drops (new task, drop in performance),
  – then increases to a value near 1, and converges down to .5 (agent gets used to new task).
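
The predicted shape of the pleasure curve can be reproduced on a toy reinforcement signal. This sketch does not run the experiment: the deterministic reward ramp, the window sizes, and the reading of the pleasure formula as (r_star - (r_ltar - f*sigma_ltar)) / (2*f*sigma_ltar) are assumptions, and only the 2 x 128 trial layout comes from the setup above.

```python
from collections import deque
from statistics import mean, pstdev


def pleasure_trace(rewards, short_window=5, long_window=40, f=1.0):
    """Return pleasure after each trial, using sliding-window statistics."""
    short = deque(maxlen=short_window)
    long_term = deque(maxlen=long_window)
    trace = []
    for r in rewards:
        short.append(r)
        long_term.append(r)
        history = list(long_term)
        spread = f * pstdev(history)
        if spread <= 1e-12:
            e_p = 0.5  # assumption: flat long-term signal gives neutral pleasure
        else:
            e_p = (mean(short) - (mean(history) - spread)) / (2.0 * spread)
        trace.append(min(1.0, max(0.0, e_p)))  # assumption: clip to [0, 1]
    return trace


def task_rewards(trials=128, ramp=20):
    """Hypothetical learning curve: reward ramps from 0 to 1, then stays at 1."""
    return [min(1.0, t / ramp) for t in range(trials)]


rewards = task_rewards() + task_rewards()  # task 1, then switch to task 2
trace = pleasure_trace(rewards)
print(round(max(trace[:128]), 2), trace[127], trace[128], trace[-1])
```

On this toy signal, pleasure rises while reward improves, settles at 0.5 once the reward is stable, and drops at the task switch before recovering. The second peak stays below 1 here because the long-term window still contains high rewards from task 1.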


Results

• Performance of pleasure-modulated simulation is comparable to simulating all predicted interactions, or the best 50% of them (static simulation thresholds), while using only 30% and 70% of the mental resources, respectively.


Results

• Some settings even have a significantly better performance at lower mental cost.

• The predicted pleasure curve was confirmed.


Some conclusions

• Can pleasure regulate information processing such that this provides an adaptive advantage for the agent?
  – Yes.
• Simple pleasure feedback can be used to determine how broadly an agent should internally simulate potential behavior.
  – Agent’s performance is comparable and mental effort decreases.
  – Since we introduce few new mechanisms for simulation, the results are relevant to understanding the evolutionary plausibility of the simulation hypothesis, as increased individual adaptation at lower cost is an evolutionarily advantageous feature.
• Our results provide clues about a relation between the simulation hypothesis and emotion theory.


Future work.

• Use emotion to modulate:
  – action-selection distribution (Doya, 2002), and
  – interaction-selection distribution (e.g. temperature of Boltzmann, threshold of our AS mechanism).
• Interplay between covert interaction (simulation) and overt interaction (action-selection).
  – Simulate the best interaction, but choose an action stochastically, see also (Gadanho, 2003): gives extra “drive” to certain actions.
  – The inverse? Seems rational too: simulate bad actions for “mental (covert) exploration”, choose best actions for “overt exploitation”. Early experiments do not (yet) show clear benefit.
• Use arousal factor as feedback.
    • Could arousal modify the amount of energy available for information processing, and thereby provide a bound for the amount of simulation?
    • Arousal resulting from low-level evaluation of familiarity and suddenness (e.g. Scherer).


Questions?

