Sequential & Temporally-Delayed Learning 1. The Problem. 2. Sequential Learning & Context. 3. Temporally-delayed Learning & Reinforcement.

Source: ski.clps.brown.edu/cogsim/cogsim.8temporal.pdf


Sequential & Temporally-Delayed Learning

1. The Problem.
2. Sequential Learning & Context.
3. Temporally-delayed Learning & Reinforcement.


The Problem

Error-driven + Hebbian: Solve tasks, learn systematic representations, generalize to new stimuli.

What's left?...

Time!

Currently: networks learn immediate consequence of a given input.

• What if current input only makes sense as part of a sequence of inputs (e.g., language, social interactions)?
• What if the consequence of this input comes later (e.g., school/work, life)?


Sequence Learning

How do we do it? For example:

My favorite color is purple.
Purple my color favorite is.
Is my purple color favorite.
Is purple my color favorite.

The girl picked up the pen.
The pig raced around the pen.

We represent the context, not just the current input.

in language, social interactions, driving (who goes at a 4-way stop?)


Representing Context for Sequence Learning

How does the brain do it?

How would we get our models to do it?

Add layers to keep track of context (prefrontal cortex; hippocampus...).


An Example Task

BTXSE
BPVPSE
BTSXXTVVE
BPTVPSE

Which of the following sequences are allowed?

BTXXTTVVE
TSXSE
VVSXE
BSSXSE


An Example Task

BTXSE, BPVPSE, BTSXXTVVE, BPTVPSE, BTXXTTVVE

TSXSE, VVSXE, BSSXSE

[Figure: finite-state grammar diagram running from "start" to "end" through nodes 0 to 5, with arcs labeled B, T, P, S, X, V, E.]

We implicitly learn such grammars (e.g., pressing buttons faster to letters that follow grammar).
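
The letter strings on this slide look consistent with the standard Reber artificial grammar. As a rough sketch under that assumption, the check for "allowed" strings can be written as a small state machine. The transition table and its state numbering below come from the textbook version of that grammar, not from the slide's diagram, so the node labels may not line up.

```python
# Assumed transition table for the standard Reber grammar.
# State numbers are illustrative and may not match the diagram's node labels.
REBER = {
    "start": {"B": 0},
    0: {"T": 1, "P": 2},
    1: {"S": 1, "X": 3},
    2: {"T": 2, "V": 4},
    3: {"X": 2, "S": 5},
    4: {"P": 3, "V": 5},
    5: {"E": "end"},
}


def is_grammatical(string):
    """Return True if every letter follows a legal arc and the string stops at 'end'."""
    state = "start"
    for letter in string:
        if state == "end":
            return False  # letters after the final E are not allowed
        arcs = REBER[state]
        if letter not in arcs:
            return False
        state = arcs[letter]
    return state == "end"


for s in ["BTXSE", "BPVPSE", "BTSXXTVVE", "BPTVPSE",
          "BTXXTTVVE", "TSXSE", "VVSXE", "BSSXSE"]:
    print(s, is_grammatical(s))
```

Under this assumed table the first five strings are accepted and the last three are rejected, because they either do not begin with B or take an arc that does not exist from the first node.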


Time & Sequences

Currently: networks learn immediate consequence of a given input.

What if current input only makes sense as part of a temporally-extended sequence of inputs? (context)

What if the consequence of this input comes later in time? (next week)


Why Copy the Hidden Representation?

• Copying input or output only lets the network hold onto one previous item
• Copying the hidden layer lets the network hold onto an arbitrarily large number of items, even though it is always just copying last hidden state at time t-1.
• The network learns how strongly to hold onto past items
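
The first two bullets can be illustrated with a tiny forward-only sketch. It is not a trained network: the weights are fixed, hypothetical values chosen for illustration, and learning (the third bullet) is not modeled. Two sequences that differ only in their first item are run with the context layer copying either the previous input or the previous hidden state.

```python
import math

# Hypothetical fixed weights for illustration; a real SRN would learn these.
W_IN = [[1.0, -0.5], [0.3, 0.8]]    # input   -> hidden
W_CTX = [[0.5, 0.2], [-0.4, 0.6]]   # context -> hidden


def step(x, context):
    """Hidden activity computed from the current input plus the context layer."""
    hidden = []
    for w_in_row, w_ctx_row in zip(W_IN, W_CTX):
        net = sum(w * xi for w, xi in zip(w_in_row, x))
        net += sum(w * ci for w, ci in zip(w_ctx_row, context))
        hidden.append(math.tanh(net))
    return hidden


def final_hidden(sequence, copy_from):
    """Run a sequence; the context copies the last hidden state or the last input."""
    context = [0.0, 0.0]
    hidden = context
    for x in sequence:
        hidden = step(x, context)
        context = list(hidden) if copy_from == "hidden" else list(x)
    return hidden


A, B = [1.0, 0.0], [0.0, 1.0]
seq1, seq2 = [A, B, B], [B, B, B]   # differ only in the first item

for source in ("input", "hidden"):
    h1 = final_hidden(seq1, source)
    h2 = final_hidden(seq2, source)
    gap = max(abs(a - b) for a, b in zip(h1, h2))
    print(source, round(gap, 4))
```

With the input copied, the two final states are identical, because the first item has already been overwritten by the time the third arrives. With the hidden layer copied, the final states differ, so a trace of the first item survives even though each step only copies the state from t-1.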


Simple SRN story is not flawless

• How is hidden → "copy" function implemented biologically?
• During settling, context must be actively maintained (ongoing hidden activity has no effect on context).
• Assumes all context is relevant: What if distracting information presented in middle of sequence? Want to only hold onto relevant context.

→ Stay tuned for specialized biological/computational mechanisms for updating/gating vs. robust maintenance of context.


Motivating Motivation

Why does anyone go to university?

(or, why do we ever do anything besides eat, sleep, have sex, etc)?

e.g., Why am I here today, instead of lying on a beach in Mexico, drinking mojitos and reading a good book?

Challenge: make a responsible neural network!


The Motivational Bootstrap

• Some motivations must be built-in (else we would die)
• Where do art/science come from?
  – Need to learn on top of built-in drives

Culture & social drives provide cumulative shaping of learning.

So, why does anyone go to university?

• Socially-mediated standards of success.
• Strong built-in desire to share w/ others.
• Strong built-in desire to learn (dopamine?)


What I'm Actually Talking About

Skinnerian learning

The basic stuff that every mammal has in common:

Neural mechanisms of Pavlovian conditioning (from a computational perspective).

No supervised target signal available: only good/bad outcomes

Enables bootstrap of new stimuli (CS's) onto built-in desires (US's):

CS (money) → US (food, etc)

But what if consequence of given input comes later in time?


Temporally-delayed Learning & Reinforcement

Reinforcement often delayed from the events that lead to it: need to "span the gap".

Key idea:

• We want to predict future rewards consistently over time.
• This allows us to learn what events are associated with rewards, earlier and earlier back in time.

We use the Temporal Differences (TD) algorithm (Sutton & Barto).


Reinforcement learning and dopamine: prediction errors

[Figure panels: Positive PE; Negative PE]

dopamine: Schultz, Satoh, Roesch, Zaghoul, Glimcher, Hyland.. and many more


Basic Data: VTA dopamine firing in Conditioning

Schultz, Montague & Dayan, 2007


Dopamine and Reward Probability


Burst/Pause correlations with Rew Prediction Errors

Bayer et al, 2007 J Neurophys


Temporal Difference Learning: Equations

Value function, sum of discounted future rewards:

$$V(t) = \langle \gamma^0 r(t) + \gamma^1 r(t+1) + \gamma^2 r(t+2) \ldots \rangle \tag{1}$$

Recursive definition:

$$V(t) = \langle r(t) + \gamma V(t+1) \rangle \tag{2}$$

Error in predicted reward (from previous to next time-step):

$$\delta(t) = \left( r(t) + \gamma \hat{V}(t+1) \right) - \hat{V}(t) \tag{3}$$

Update value estimate:

$$\hat{V}(t) \leftarrow \hat{V}(t) + \alpha \delta(t) \tag{4}$$

α = learning rate
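
Equations (3) and (4) can be traced on a short trial. The following is a minimal sketch, assuming one value estimate per time step within a trial, a value of zero after the trial ends, and hypothetical settings for the discount, learning rate, and reward schedule.

```python
def td_sweep(v_hat, rewards, gamma=0.9, alpha=0.5):
    """One pass over a trial: delta(t) from Eq. (3), then the update of Eq. (4)."""
    deltas = []
    for t in range(len(rewards)):
        # Assumption: nothing follows the trial, so the value beyond the end is 0.
        v_next = v_hat[t + 1] if t + 1 < len(v_hat) else 0.0
        delta = (rewards[t] + gamma * v_next) - v_hat[t]   # Eq. (3)
        v_hat[t] += alpha * delta                          # Eq. (4)
        deltas.append(delta)
    return deltas


rewards = [0.0, 0.0, 0.0, 1.0]   # hypothetical trial: reward only at the last step
v_hat = [0.0] * len(rewards)

first = td_sweep(v_hat, rewards)
for _ in range(200):
    last = td_sweep(v_hat, rewards)

print([round(d, 3) for d in first])
print([round(v, 3) for v in v_hat])
```

On the first pass the only nonzero error is at the rewarded step. After repeated passes the errors shrink toward zero and the estimates approach γ^k for a reward k steps ahead, which is what Eq. (1) gives for a single reward of 1.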


TD and Dopamine Relationship

Schultz, Dayan & Montague, 1997, Science

[Figure: traces labeled δ and V]


Model: CS at t=2, US at t=16

[Figure: panels a), b), c), each plotting TD Error (y-axis, 0 to 1) against Time (x-axis, 0 to 20).]


Network Implementation

[Figure: network diagram with labels Stimuli, Hidden, V̂(t), γV̂(t+1) + r(t), δ(t).]


Phase-b

asedIm

plem

entatio

n

Stim

ulus

1

−1+1

V̂(1)

δδ

−2+2

−3+3

32

rV̂

(t)

V̂(2)

V̂(t)

V(t+

1)^

γV̂

(t)V

(t+1)

γ _ 1γ _ 1

V̂(2)

V̂(3)

V̂(3)

r(3)

Tim

e

(ExtR

ew)

TD

Rew

Integ

TDRew

Integ

=TDRew

Pred

+ExtRew

Minusphase:

TDRew

Integ

clamped

toprev

plusphase

valu

e.

Plusphase:

TDRew

Integ

settlesvia

weig

hts

=expected

reward

att+1,p

lusan

yExtRew

attim

et.

Learn

ingsig

nal

δ(=

“TD”)

trainspred

ictionforprev

ioustim

estep

.( elig

ibility

tracesneed

ed)
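The idea that δ trains the prediction for the previous time step can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's phase-based implementation: the function name `simulate_td`, the learning rate, the number of trials and the choice γ = 1 are all assumptions, and the timing (CS at t=2, US at t=16, 20 time steps) is borrowed from the model slide. The error uses the indexing $\delta_t = [r_t + \gamma \hat{V}_t] - \hat{V}_{t-1}$.

```python
def simulate_td(n_steps=20, cs_on=2, us_at=16, gamma=1.0, lrate=0.2, n_trials=300):
    """Tabular TD sketch: one weight per time point once the CS is on.

    Returns the TD error trace of the first and of the last trial.
    All parameter values are illustrative assumptions.
    """
    w = [0.0] * n_steps  # value weight for each time point

    def value(t):
        # No unit is active before CS onset, so the prediction there is 0.
        return w[t] if cs_on <= t < n_steps else 0.0

    first, last = None, None
    for trial in range(n_trials):
        deltas = [0.0] * n_steps
        for t in range(1, n_steps):
            r = 1.0 if t == us_at else 0.0
            # delta_t = [r_t + gamma * V_t] - V_{t-1}
            delta = r + gamma * value(t) - value(t - 1)
            deltas[t] = delta
            # delta at time t trains the prediction made at time t-1
            if t - 1 >= cs_on:
                w[t - 1] += lrate * delta
        if trial == 0:
            first = deltas
        last = deltas
    return first, last


if __name__ == "__main__":
    first, last = simulate_td()
    print("first trial, peak at t =", max(range(20), key=lambda t: first[t]))
    print("last trial,  peak at t =", max(range(20), key=lambda t: last[t]))
```

Under these assumptions the error starts at the time of the US and, after training, sits at CS onset, because each trial pushes the prediction one step further back in time.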

Page 109

Exploration: [rl cond.proj]

[Figure: Input units for the stimuli tone, light and odor at each time step 0 to 19; TDRewPred...]

’Complete Serial Compound’ (CSC) input representation: unique unit for each stimulus at each time point (used in Sutton & Barto, Montague et al, etc).

Not realistic, but good for demonstration. This assumption can be relaxed without changing core ideas (e.g. Ludvig et al, 2008).
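A CSC input can be sketched as a lookup table with one unit per (stimulus, time point) pair. The function name `csc_input`, the onset times and the rule that a stimulus's unit is active from its onset onward are assumptions made for this illustration; they are not taken from the project file.

```python
def csc_input(onsets, n_steps):
    """Build a Complete Serial Compound code.

    onsets:  dict mapping stimulus name -> onset time step (hypothetical values)
    n_steps: number of time steps in a trial

    Returns (index, active), where index maps (stimulus, t) to a unique unit
    number and active[t] lists the units that are on at time step t.
    """
    stimuli = sorted(onsets)
    # One unique unit for each stimulus at each time point.
    index = {
        (s, t): i * n_steps + t
        for i, s in enumerate(stimuli)
        for t in range(n_steps)
    }
    active = []
    for t in range(n_steps):
        # A stimulus contributes its time-t unit once it has come on.
        active.append([index[(s, t)] for s in stimuli if t >= onsets[s]])
    return index, active


if __name__ == "__main__":
    index, active = csc_input({"tone": 2, "light": 5}, n_steps=20)
    print(len(index), "units in total")
    print("active at t=5:", active[5])
```

Because no unit is shared across time points, every moment in the trial gets its own independent weight, which is what makes the representation convenient for demonstration and unrealistic as a model of timing.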

Page 110

Exploration: [rl cond.proj]

Standard TD: $\hat{V}(t) = \sum_i w_i x_i(t)$ [$x_i$ are inputs: tone, light]

Here: passed through activation function; has to surpass threshold, subject to inhibitory competition from other value representations
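The contrast between the two value computations can be sketched as follows. `value_estimate` is the standard weighted sum from the slide. `thresholded_value` is only a hypothetical stand-in for "has to surpass threshold": the simulator's actual activation function and its inhibitory competition are not given in the slides, so the threshold value and the rectified-linear form are assumptions.

```python
def value_estimate(weights, inputs):
    """Standard TD value: V(t) = sum_i w_i * x_i(t)."""
    return sum(w * x for w, x in zip(weights, inputs))


def thresholded_value(weights, inputs, threshold=0.2):
    """Hypothetical stand-in for an activation function with a threshold.

    The net input must exceed `threshold` before any value is reported.
    This is NOT the simulator's function, only an illustration.
    """
    return max(0.0, value_estimate(weights, inputs) - threshold)


if __name__ == "__main__":
    w = [0.5, 0.25]          # weights for tone, light
    x = [1.0, 1.0]           # both stimuli present
    print(value_estimate(w, x))
    print(thresholded_value([0.1], [1.0]))  # weak input stays below threshold
```

One consequence of any threshold of this kind is that small learned weights produce no value signal at all until they grow past it.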

Page 111

DA and Timing: Late and Early Rewards

Hollerman & Schultz 1998

Page 112

DA and Learning: Auditory Cortex

Bao et al, 2001, Nature

Page 113

Learning Theory: Blocking (Behavior)

Page 114

Learning Theory: Blocking (Dopamine)

Waelti et al, 2001, Nature

[Figure panels: Blocked stimulus; Control (not blocked) stim]

Page 115

Learning Theory: (Un)Blocking

Page 116

TD prediction error and human functional imaging

O’Doherty et al, 2004, Science

Ventral striatum = DA enriched, correlates with TD PE = Critic!

Page 117

Optical phasic DA stimulation causally induces conditioning

(Tsai et al, 2009, Science)

Page 118

DA neuron spiking during reinforcement task in humans

(Zaghloul et al, 2009, Science)

Page 119

How are dopamine-based RPE signals used to select actions?

Will consider biological implementation in basal ganglia later

Page 120

Q learning: extending prediction error learning to actions

Error in predicted reward:
$\delta_t = \left( r_t + \gamma \max_a Q_t(s_{t+1}, a) \right) - Q_t(s, a)$

Update value estimate:
$Q_t(s, a) \leftarrow Q_t(s, a) + \alpha \delta_t$

Select among Q values:
$P_t(a) = \frac{e^{Q_t(s,a)/\beta}}{\sum_{i=1}^{n} e^{Q_t(s,i)/\beta}}$

γ = discount, α = learning rate, β = “temperature”/exploration parameter

Watkins & Dayan 1992
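The three Q learning steps can be sketched directly in Python. The function names, the list-of-lists Q table and the example numbers are assumptions for illustration. The softmax divides Q by β, which is the usual reading of β as a temperature; the slide's typeset formula is not fully recoverable from the extracted text, so treat that placement as an assumption too.

```python
import math
import random


def softmax_probs(q_values, beta):
    """P(a) = exp(Q(s,a)/beta) / sum_i exp(Q(s,i)/beta), beta = temperature."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / beta) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, beta, rng):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, beta)
    return rng.choices(range(len(q_values)), weights=probs)[0]


def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply one Q learning update in place and return the prediction error."""
    delta = (r + gamma * max(Q[s_next])) - Q[s][a]
    Q[s][a] += alpha * delta
    return delta


if __name__ == "__main__":
    rng = random.Random(0)
    Q = [[0.0, 0.0], [0.0, 0.0]]  # 2 states x 2 actions, hypothetical task
    a = select_action(Q[0], beta=1.0, rng=rng)
    delta = q_update(Q, s=0, a=a, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
    print("action", a, "delta", delta, "Q", Q)
```

With a high temperature the choice probabilities flatten toward uniform (more exploration); with a low temperature the action with the largest Q value is chosen almost always.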

Page 121

Google DeepMind RL Network (“DQN”) Plays Atari

Page 122

for video, DQN spaceinvaders.mov

Page 123

Google DeepMind RL Network (“DQN”) Plays Atari

Page 124

Extra

The following slides describe a recently developed alternative to TD, called PVLV, which we think is more biologically plausible and computationally powerful. This material is optional for the course.

Page 126

The Problem

Q: How do we learn to attach positive/negative valence to environmental stimuli?

A: The same way we learn lots of other stuff: the Delta Rule!

$\delta_{pv} = r - \hat{V}_{pv}$

$\hat{V}_{pv}$: expected reward based on prior associations
$r$: reward
$\delta_{pv}$: learning signal
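A trial-level sketch of the delta rule shows how the learning signal shrinks as the expectation catches up with the reward. The function name `delta_rule`, the learning rate and the reward schedule (100 rewarded trials followed by 100 unrewarded ones) are assumptions for illustration, with a single stimulus that is always present.

```python
def delta_rule(rewards, lrate=0.1, v=0.0):
    """Run the delta rule over a sequence of trial rewards.

    For each trial: delta = r - V, then V is moved toward r by lrate * delta.
    Returns a list of (V after the update, delta) pairs.
    """
    trace = []
    for r in rewards:
        delta = r - v          # learning signal
        v += lrate * delta     # update the expected reward
        trace.append((v, delta))
    return trace


if __name__ == "__main__":
    trace = delta_rule([1.0] * 100 + [0.0] * 100)
    print("first rewarded trial:   delta =", trace[0][1])
    print("last rewarded trial:    delta =", round(trace[99][1], 4))
    print("first unrewarded trial: delta =", round(trace[100][1], 4))
```

The signal is large and positive when reward is unexpected, falls toward zero once it is expected, and turns negative when an expected reward is withheld.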

Page 127

The Problem

Q: But what happens when environmental stimulus occurs before reward?

Page 128

Basic Data: VTA DA Neural Firing in Conditioning

[Figure panels: Before; After]

Page 131

Basic Data: VTA DA Neural Firing in Conditioning

[Figure: panels a) Acquisition and b) Trained, reward omission; each shows CS, Rew and DA traces.]

Dopamine spikes/dips are learning signals

Delta rule fails to account for predictive DA spike!

Page 132

Standard Approach: TD

Predict all future rewards (discounted):
$V_t = \sum_{\tau=t+1}^{\infty} \gamma^{\tau-(t+1)} r_\tau$

Recursively:
$\hat{V}_{t-1} = r_t + \gamma \hat{V}_t$

Error = Temporal Difference = TD:
$DA = \delta_t = [r_t + \gamma \hat{V}_t] - \hat{V}_{t-1}$
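The link between the discounted sum, its recursive form and the TD error can be checked numerically. The sketch below assumes a finite trial (rewards after the last step are taken to be zero); the function names and the example reward sequence are hypothetical.

```python
def discounted_values(rewards, gamma):
    """Compute V backward in time using V[t-1] = r[t] + gamma * V[t].

    Finite-horizon assumption: no reward arrives after the last step,
    so the value at the last step is 0.
    """
    n = len(rewards)
    V = [0.0] * n
    for t in range(n - 1, 0, -1):
        V[t - 1] = rewards[t] + gamma * V[t]
    return V


def direct_value(rewards, gamma, t):
    """V_t = sum over tau > t of gamma^(tau - (t+1)) * r_tau."""
    return sum(
        gamma ** (tau - (t + 1)) * rewards[tau]
        for tau in range(t + 1, len(rewards))
    )


def td_errors(rewards, V, gamma):
    """delta_t = [r_t + gamma * V_t] - V_{t-1} for t = 1 .. n-1."""
    return [(rewards[t] + gamma * V[t]) - V[t - 1] for t in range(1, len(rewards))]


if __name__ == "__main__":
    rewards = [0.0, 0.0, 0.0, 1.0]   # single reward at the last step
    V = discounted_values(rewards, gamma=0.5)
    print("V      =", V)
    print("direct =", [direct_value(rewards, 0.5, t) for t in range(4)])
    print("delta  =", td_errors(rewards, V, 0.5))
```

When the value estimates are exactly correct, every TD error is zero; a nonzero δ therefore signals that the prediction at the previous step needs to change.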

Page 133

TD Illustrated

[Diagram: tone input over Time steps 1 to 4 (states S1 to S4); each step shows $\hat{V}$, $r$ and $\delta$ linked by −, + and γ connections, starting from $\hat{V}_0$; $\delta$ shown for init and final.]

Page 134

Problems with TD

• Great algorithm, developed in computer science / machine learning, but is this actually what the brain does?

• Even if so, doesn’t specify how these signals are computed by systems upstream of DA... just predicts DA and δ but says nothing about V, etc.

• Current reward value is always relative to what happened just before. Too much temporal dependency?

• Chaining not seen in neural recordings.

• What determines “discount factor” γ, biologically?

Page 135

Rat study: simultaneous CS and US DA spike

Pan et al, 2005, Journal of Neuroscience

Inconsistent with standard TD!

Page 136

The PVLV Alternative

PVLV = Primary Value, Learned Value (O’Reilly, Frank, Hazy & Watz, 2007, Behav Neurosci)

• No reward predictions, just associations!

• No temporal dependencies: DA depends only on current state.

• Uses same basic delta-rule learning as TD (Rescorla-Wagner).

Page 137

PVLV: Two Separate Mechanisms (PV, LV)

[Diagram: excitatory and inhibitory connections among Stimuli (CS), Stimuli (US), CNA, VS patch, VTA/SNc (DA), Cereb. (Timing), PPT, LHA and ventral striatum NAc, with units (PV)e, (PV)i, (LV)e, (LV)i; trace labels: CS, Timing, DA, PVi, LVe, US/PVe.]

• PV (Primary Value): Primary rewards (US), canceled.

• LV (Learned Value): Learned associations (CS → DA).

Page 141

PVLV: Two Separate Mechanisms (PV, LV)

PV: Primary Value

• Trained at each point in time on actual reward value present:

$\delta_t = r_t - \hat{V}_t$

• This uses immediate prediction ($\hat{V}_t$) of current rew value ($r_t$)

• Accounts for canceling of DA spike @ rew, and DA dips when no rew received.

• But this doesn’t account for predictive DA spikes... (actually results in predictive DA dips!)
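A time-resolved sketch of the PV rule shows the canceling at reward time and the dip on omission. It assumes one independent value weight per time step (a CSC-style simplification), a reward at t=16 and hypothetical function names and learning rate. Because each time step has its own weight here, this sketch yields no signal at CS onset at all, so it illustrates the missing predictive spike but does not reproduce the predictive dip mentioned on the slide.

```python
def train_pv(n_steps=20, us_at=16, lrate=0.2, n_trials=100):
    """Train V[t] at every time step on the reward actually present then."""
    v = [0.0] * n_steps
    for _ in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == us_at else 0.0
            v[t] += lrate * (r - v[t])   # delta_t = r_t - V_t
    return v


def pv_deltas(v, us_at, rewarded=True):
    """PV learning signal at each time step for one test trial."""
    return [
        (1.0 if (rewarded and t == us_at) else 0.0) - v[t]
        for t in range(len(v))
    ]


if __name__ == "__main__":
    v = train_pv()
    print("reward delivered, delta at t=16:", round(pv_deltas(v, 16, True)[16], 4))
    print("reward omitted,   delta at t=16:", round(pv_deltas(v, 16, False)[16], 4))
    print("delta at CS onset (t=2):", pv_deltas(v, 16, True)[2])
```

After training, a delivered reward produces no signal, an omitted reward produces a negative one, and nothing in the PV rule by itself generates a positive signal at the time of the CS.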

Page 144

PVLV: Two Separate Mechanisms (PV, LV)

LV: Learned value

• Represents perceived values of stims even when there is no current rew expectation.

• Only gets training signal @ rew, or when PV expects some rew. (ie learning is filtered by primary PV system.)

• → Learns at time of rew, but not at CS onset.

• → Generalizes rew values to CS...

• → Accounts for DA spikes for stimuli that have previously been associated with reward!

Page 146

PVLV: Computationally Powerful

Comparison with TD on Random Delays (breaks TD chaining):

[Figure: two panels plotting Avg DA Value against Epochs (0 to 250) for Delay 3, Delay 6 and Delay 12. First panel: Rnd Delay, p=.2, TD Disc .95 Lrate .1 (axis −0.05 to 0.30). Second panel: Rnd Delay, p=.2, PVLV Lrate .005 (axis 0.00 to 1.00).]

Enables working memory model to learn complex WM tasks.

Page 147

Similar to Brown, Bullock & Grossberg, ’99

Diffs:

Anatomical (CNA vs. VS; Dorsal vs. Ventral Patch)

Functional (intrinsic timing? LV system cannot train itself).

Page 148

PVLV accounts for timing data better than TD!

• Data: during transient learning period, both rews and CS elicit activation.

• This accounted for by PV, LV systems operating in parallel.

• TD: predicts chaining back in time from rew to CS.

Page 150

PVLV accounts for timing data better than TD!

• Data: delayed rewards cause dips @ usual time, then spikes

• This accounted for by both TD and PV.

• Data: early rewards cause spikes, then dips @ usual time

• This accounted for by PV (spike), PV (dip), but TD only accounts for spike.

Page 151

More Key Predictions from PVLV

[Diagram: excitatory and inhibitory connections among Stimuli (CS), Stimuli (US), CNA, VS patch, VTA/SNc (DA), Cereb. (Timing), PPT, LHA and ventral striatum NAc, with units (PV)e, (PV)i, (LV)e, (LV)i.]

• CNA = Pavlovian conditioning (e.g., Killcross et al. ’97).

• NAc (patch/shell) = Extinction (Ferry et al. ’00; Annett et al., 89), Blocking (data?).

• NAc (matrix/core) = Basic actions (OR’s, approach, avoid).

• CNA can’t train itself: No 2nd order conditioning!

• BLA = 2nd order cond, uses DA-independent mechanisms (CNA/BLA double-dissoc).

Page 153

Conclusions

PVLV provides computationally motivated architecture that seems to fit with biology & behavioral data.

These learning mechanisms enable arbitrary stimuli/goals to be plugged into our fixed set of built-in motivational drives.

Something motivates every generated mental-state, always!

Page 154

PVLV, WM, and DA

[Diagram: panels a) and b), each with CS, DA and US/r; labels: DA (causes updating), PFC (spans the delay), (maint in PFC), BG−Go, (reinforces Go).]

Page 157

PVLV: Two Separate Mechanisms (PV, LV)

PV learning:

$\delta_{pv} = r - \hat{V}_{pv}$ or $\delta_{pv} = PV_e - PV_i$

$\Delta w_i = \epsilon x_i \delta_{pv}$

LV learning (filtered by PV):

$\Delta w_i = \begin{cases} \epsilon (r_t - \hat{V}_{lv}) x_i & \text{if } \hat{V}_{pv} > \theta_{pv} \text{ or } r_t > 0 \\ 0 & \text{otherwise} \end{cases}$

Global DA (PV dominates):

$\delta_t = \begin{cases} \delta_{pv} & \text{if } \hat{V}_{pv} > \theta_{pv} \text{ or } r_t > 0 \\ \delta_{lv} & \text{otherwise} \end{cases}$

$\delta_{lv} = LV_e - LV_i$
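The three rules can be combined into a single update step. This is a sketch under stated assumptions, not the published PVLV implementation: the function name `pvlv_step`, the parameter defaults and the linear value estimates are hypothetical, $LV_e$ is taken to be the LV value estimate, and $LV_i$ is passed in as a plain number (default 0) because the slides do not say how it is computed.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def pvlv_step(x, r, w_pv, w_lv, lrate=0.1, theta_pv=0.2, lv_i=0.0):
    """One PVLV update for input vector x and reward r.

    Returns (global DA signal, new PV weights, new LV weights).
    lv_i stands in for the LVi term; its default of 0 is an assumption.
    """
    v_pv = dot(w_pv, x)
    v_lv = dot(w_lv, x)

    delta_pv = r - v_pv        # PV learning signal
    delta_lv = v_lv - lv_i     # LVe - LVi, with LVe taken as the LV estimate

    # PV filter: reward is present, or PV expects reward.
    pv_active = v_pv > theta_pv or r > 0

    # PV learns at every time step.
    new_w_pv = [w + lrate * xi * delta_pv for w, xi in zip(w_pv, x)]

    # LV learns only when the PV filter is open; otherwise weights are kept.
    if pv_active:
        new_w_lv = [w + lrate * (r - v_lv) * xi for w, xi in zip(w_lv, x)]
    else:
        new_w_lv = list(w_lv)

    # Global DA: PV dominates whenever the filter is open.
    da = delta_pv if pv_active else delta_lv
    return da, new_w_pv, new_w_lv


if __name__ == "__main__":
    # Unexpected reward: DA comes from PV and both systems learn.
    print(pvlv_step([1.0], 1.0, [0.0], [0.0], lrate=0.5))
    # No reward and no PV expectation: DA comes from LV, LV weights are frozen.
    print(pvlv_step([1.0], 0.0, [0.0], [0.4], lrate=0.5))
```

The second call illustrates the point of the filter: a stimulus with a learned LV weight drives a positive DA signal by itself, and because the filter is closed, the absence of reward at that moment does not erode the LV weight.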

Page 159

LV Extras

• DA spikes only observed @ CS onset, don’t continue throughout delay until reward. Problem for PV?

• Solution: PV system has synaptic depression, accommodates to constant sensory inputs; only perceives values of stims that were not present in last time step.

• This is also important for PFC learning.. (stay tuned)
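The effect of accommodating to constant inputs can be sketched as an onset filter: a stimulus is passed on only if it was absent on the previous time step. The all-or-none rule and the function name `onset_filter` are simplifying assumptions that stand in for synaptic depression, which in a real model would be graded.

```python
def onset_filter(sequence):
    """Keep only inputs that were not present on the previous time step.

    sequence: list of input vectors, one per time step.
    Returns a list of filtered vectors of the same shape.
    """
    if not sequence:
        return []
    previous = [0.0] * len(sequence[0])
    filtered = []
    for current in sequence:
        # An input passes through only at its onset.
        filtered.append([c if p == 0 else 0.0 for c, p in zip(current, previous)])
        previous = current
    return filtered


if __name__ == "__main__":
    # A single stimulus that comes on, stays on, goes off, then returns.
    steps = [[0.0], [1.0], [1.0], [0.0], [1.0]]
    print(onset_filter(steps))
```

With this filter a sustained CS drives the value system only on its first time step, which matches the observation that the DA spike appears at CS onset and does not persist through the delay.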