
Page 1

Flexible Shaping: How learning in small steps helps

Hierarchical Organization of Behavior, NIPS 2007

Kai Krueger and Peter Dayan
Gatsby Computational Neuroscience Unit

Page 2

Outline

Introduction
– learning may require external guidance
– shaping as a concept

Set-up
– 12-AX task, LSTM network, shaping procedure

Results
– simple shaping
– when does shaping help most?
– flexibility to adapt to variations

Conclusions, issues and future work
– rules and habits

Page 3

Introduction

Learning is essential for flexibility:
– trial and error
– external guidance:
  – “one-shot teaching” by verbal explanation of abstract rules
  – imitation
  – shaping

Guidance is critical for complex behavior: branching, working memory, rapid changes

Page 4

Shaping

“a method of successive approximations” (Skinner 1938)

Key features:
– external alteration of reward contingencies
– withdrawal of intermittent rewards

Creates behavioral units, e.g. lever pressing in rats

Separates time scales / branching points by providing separate stages in shaping

Used ubiquitously (and implicitly) in animal experiments

Page 5

12-AX task

Demo: a stream of symbols (1, 2, A, B, C, X, Y, Z). The most recent digit sets the context: in context 1, respond R to an X that immediately follows an A; in context 2, to a Y that immediately follows a B; respond L to everything else.
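For concreteness, here is a minimal sketch of these contingencies in Python (my own illustration of the standard 12-AX rule, not the authors' code; respond_12ax is a hypothetical helper name):

```python
# Minimal sketch of the 12-AX response rule: respond "R" when a target
# sequence completes (A-X in context 1, B-Y in context 2), else "L".
def respond_12ax(stream):
    responses = []
    context = None   # most recent digit seen: "1" or "2"
    prev = None      # previous symbol, for the inner 1-back match
    for s in stream:
        if s in ("1", "2"):
            context = s
            responses.append("L")
        else:
            target = (context == "1" and prev == "A" and s == "X") or \
                     (context == "2" and prev == "B" and s == "Y")
            responses.append("R" if target else "L")
        prev = s
    return responses

print(respond_12ax(list("1BXAX")))  # ['L', 'L', 'L', 'L', 'R']
```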

Page 6

LSTM network

Long Short-Term Memory (Hochreiter and Schmidhuber 1997)

3-layer recurrent neural network

Provides built-in mechanisms for:
– working memory
– gating (input, output and forget gates)

An abstract, “over-simplified” model of PFC; the basis motivating PBWM (O’Reilly et al.)
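For reference, a generic one-step update of a single LSTM memory cell in Python/NumPy, showing the three gates named above (a textbook-style sketch, not the exact network used in the paper; the weight matrices W and biases b are placeholders):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One step of a single LSTM cell with input, output and forget gates.
# x: input vector, h: previous output, c: previous cell state.
def lstm_cell_step(x, h, c, W, b):
    z = np.concatenate([x, h])
    i = sigmoid(W["i"] @ z + b["i"])   # input gate: admit new content
    f = sigmoid(W["f"] @ z + b["f"])   # forget gate: decay old memory
    o = sigmoid(W["o"] @ z + b["o"])   # output gate: expose the memory
    g = np.tanh(W["g"] @ z + b["g"])   # candidate cell content
    c_new = f * c + i * g              # gated working-memory update
    h_new = o * np.tanh(c_new)         # gated read-out
    return h_new, c_new
```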

Page 7

Shaping procedure

Teach 12-AX by successive approximations

Separate WM timescales:
– long: outer loop (1 / 2)
– short: inner loop (AX / BY)

Learning in 7 stages; the last stage is the full 12-AX

Resource allocation (currently done by hand):
– each stage is learned into a new memory block
– all other memory blocks are disabled
– provides separation / no interference (see the sketch below)
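A pseudocode-style sketch of this staged loop, under my reading of the slides (train_until_criterion, net.memory_blocks and the stage list are hypothetical names; in the paper the block-to-stage assignment is done by hand):

```python
# Hypothetical sketch: each approximation is learned into its own memory
# block while the others are disabled; the final stage is the full 12-AX.
def shape(net, stage_tasks):
    *approximations, full_task = stage_tasks
    for i, task in enumerate(approximations):
        for j, block in enumerate(net.memory_blocks):
            block.enabled = (j == i)        # all other blocks disabled
        train_until_criterion(net, task)    # no interference between stages
    for block in net.memory_blocks:
        block.enabled = True                # full task uses all blocks
    train_until_criterion(net, full_task)
```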

Page 8

Simple shaping

• Improvement in learning times:
– 8-fold decrease (counting only the final stage)
– significantly better even when the complete shaping training is included
– median: 13 epochs, min: 8 epochs

• All 4 stages shaping the 1 / 2 context are needed

• High variance in shaping times

Page 9

What makes shaping work

Robustness to additional structure:
– irrelevant “experience”
– related and unrelated tasks / inputs

Resource allocation matters: interference between tasks => no benefit

Page 10

Shaping: when is it useful?

Can shaping prevent scaling of learning time with task complexity?

One aspect of complexity: Temporal credit assignment

Increasing the outer loop length => higher temporal complexity

Results: training time still increases, but it scales much more slowly; as complexity increases, shaping becomes more important
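A sketch of the comparison being made, with hypothetical helpers (make_12ax, fresh_net, epochs_to_criterion, shaping_stages) and illustrative loop lengths:

```python
# Hypothetical harness: how does time-to-criterion scale with the
# outer-loop length, with and without shaping?
for loop_length in (4, 8, 16, 32):          # illustrative values
    task = make_12ax(loop_length=loop_length)
    unshaped = epochs_to_criterion(fresh_net(), [task])
    shaped = epochs_to_criterion(fresh_net(), shaping_stages(task))
    print(loop_length, unshaped, shaped)    # shaped should grow more slowly
```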

Page 11

Rule abstraction

Rule abstraction: flexibility to cope with changes in statistics

Train on the base 12-AX task (loop length 4)

Test with variations: loop lengths 4, 5, 12, 40, with learning disabled

Should perform perfectly: the abstract rules have not changed
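A sketch of this test protocol, again with hypothetical helpers:

```python
# Hypothetical sketch of the abstraction test: train at loop length 4,
# freeze the weights, then probe longer outer loops.
net = fresh_net()
train_until_criterion(net, make_12ax(loop_length=4))
net.learning_enabled = False                 # disable learning before testing
for length in (4, 5, 12, 40):
    print(length, evaluate(net, make_12ax(loop_length=length)))
```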

Page 12

Generalisation

Generalisation: cope with as-yet-unseen data (inner loop combinations)

Train on the 12-AX task without the AZ and CX combinations

Test performance on full task

Only 7 of the 9 inner-loop combinations are seen in training: is there only one valid generalisation?

Mixed results: differences in emphasis (1-back / 0-back); overall, shaping is still better

Page 13

Reversal learning

• Reverse the stimulus-rule association
– shape all components needed

• Repeatedly reverse (after 500 epochs)
– learning of reversals

• Identify flexibility to perform reversals
– unshaped: mostly fails
– shaped: succeeds more often
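A sketch of the repeated-reversal protocol, with hypothetical helpers (make_12ax with a reversed_rules flag, train_epoch):

```python
# Hypothetical sketch: swap the stimulus-rule association every 500 epochs
# and ask whether the network relearns each reversal.
def reversal_protocol(net, n_phases=10, epochs_per_phase=500):
    for phase in range(n_phases):
        task = make_12ax(reversed_rules=(phase % 2 == 1))
        for _ in range(epochs_per_phase):
            train_epoch(net, task)
```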

Page 14

Conclusions

Shaping works

Reduces learning times

Helps learning across long time delays:
– separates time scales of actions
– recombines “behavioral units” into sequences

Improves abstraction and separation

Increases flexibility to reversals

Take-home message: sequential and transfer learning need to be taken more into account when looking at learning architectures.

Still issues to solve, though

Page 15

Limitations

Resource allocation is the prime computational issue:
– currently done by hand (a homunculus)
– ideas to automate: compute “responsibilities”, as in MOSAIC

Experimental data:
– no published data on learning 12-AX
– interesting manipulations: loop length, target frequency, …, natural grouping of the alphabet

Page 16

Future Work

Still based on “habitual” learning => no instant reprogramming

Need additional mechanisms:
– more explicit rules
– variable substitution

Bilinear rules framework (Dayan 2007): recall, match, execute

Close interaction between habitual and rule-based learning:
– rules supervise habit learning
– habits form the basis of rule execution

Does this result in a task grammar?

Page 17

Questions?

Thank you

Page 18

Shaping in computational work

Surprisingly few implementations; instead, elaborate learning frameworks try to learn the task in one go

Learning grammars:
– simple examples first (Elman 1993)
– “self-simplification” by network limitations

Robotics:
– learning in a grid world (Singh 1992; Dorigo and Colombetti 1998)
– robot vs. animal shaping (Savage 1998)

Page 19

Shaping procedure

Remembering “1” and “2”:
– anything following a 1 => L; anything following a 2 => R
– progressively increase the loop length

1-back memory for A / B:
– unconditional CPT-AX
– 2 separate stages for A and B

Page 20

LSTM network

Page 21

Examples of learning in LSTM

Page 22

Shaping at work

Page 23

Rule abstraction

Page 24

Reversal learning continued

Page 25

Alternative shaping paths

Not all shaping paths have equal properties

Trivial shaping path: each stage uses the full 12-AX rules, increasing the loop length from 1 through 4

Similar training time, but bad generalisation

Page 26

Rules

Individual rules: recall, match, execute