17
Learning to Plan with Logical Automata Brandon Araki 1 *, Kiran Vodrahalli 2 *, Thomas Leech 1,3 , Mark Donahue 3 , Cristian-Ioan Vasile 1 , Daniela Rus 1 1 Massachusetts Institute of Technology 2 Columbia University 3 MIT Lincoln Laboratory *Equal contributors 1

Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Learning to Plan with Logical Automata

Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3,Mark Donahue3, Cristian-Ioan Vasile1, Daniela Rus1

1Massachusetts Institute of Technology2Columbia University

3MIT Lincoln Laboratory

*Equal contributors

1

Page 2: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

2

Page 3: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Goals

Learn to plan in an environment with rules1. Learn the rules in a way that they can be easily interpreted by humans

2. Incorporate the rules into planning so that modifying the rules results in predictable changes in behavior

3

Page 4: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Packing a Lunchbox

Pack a burger or a sandwich; then pack a banana

4

Page 5: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Goal 1 – Interpretability

Pack a burger or a sandwich;

then pack a banana

5

InitialState

Picked upor

Packedor

Picked up Packed

GOAL!

Rules

Finite State Automaton

Page 6: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

6

Low-level MDP

High-level MDPPack sandwich or burger;

Then pack bananaAvoid obstacles

Factoring the Environment

Page 7: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

77

Discrete 2D gridworld

Finite state automaton

Representing the Environment

Pack sandwich or burger;Then pack banana

Avoid obstacles

InitialState

Picked upor

Packedor

Picked up Packed

Page 8: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Goal 2 – Manipulability

Incorporate FSA into planning

8

InitialState

Picked upor

Packedor

Picked up Packed

S0 S1 S2 S3 G

S0 o ØS0

S1

S2

S3

G

T

Page 9: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Learn reward

Learn transitions

Learn transitions of FSA

9

One VIN for each FSA state

Differentiable Recursive Planning

Based on Tamar, Aviv, et al. "Value iteration networks." Advances in Neural Information Processing Systems. 2016.

Page 10: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Experiments - Interpretability

10

Propositions

FSA

Sta

tes

S0 o ØS0

S1

S2

S3

G

T

Page 11: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

S0 o ØS0

S1

S2

S3

G

T

Experiments - Interpretability

11

Picking up the sandwich or the hamburger causes a transition to the next state

Page 12: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Experiments – Manipulability

We can modify the FSA so that it will only pick up the burger and not the sandwich.

12

InitialState

Picked upor

Packedor

Picked up Packed

Page 13: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Experiments – Manipulability

We can modify the FSA so that it will only pick up the burger and not the sandwich.

13

InitialState

Picked upor

Packedor

Picked up Packed

Page 14: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Experiments – Manipulability

We can modify the FSA so that it will only pick up the burger and not the sandwich.

14

S0 o ØS0

S1

S2

S3

G

T

Page 15: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

S0 o ØS0

S1

S2

S3

G

T

Experiments – Manipulability

We can modify the FSA so that it will only pick up the burger and not the sandwich.

15

Page 16: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

S0 o ØS0

S1

S2

S3

G

T

Experiments – Manipulability

We can modify the FSA so that it will only pick up the burger and not the sandwich.

16

Page 17: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark

Learning to Plan with Logical Automata

Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3,Mark Donahue3, Cristian-Ioan Vasile1, Daniela Rus1

1Massachusetts Institute of Technology2Columbia University

3MIT Lincoln Laboratory

*Equal contributors

17