Learning through Interactive Behavior Specifications


Tolga Konik, CSLI, Stanford University

Douglas Pearson, Three Penny Software

John Laird, University of Michigan


Goal

Automatically generate cognitive agents

Reduce the cost of agent development

Reduce the expertise required to develop agents.


Domains

Autonomous cognitive agents

Dynamic virtual worlds

Real-time decisions based on knowledge and sensed data

Soar agent architecture


Learning by Observation

Approach: Observe expert behavior and learn to replicate it

Why? We may want human-like agents

Why? In complex domains, imitating humans may be easier than learning from scratch


Bottleneck in pure Learning by Observation

PROBLEM: You cannot observe the internal reasoning of the expert

SOLUTION: Ask the expert for additional information (goal annotations)

SOLUTION: Use additional knowledge sources (task and domain knowledge)



[Diagram labels: Expert, Interface, Environment, Agent, Learner, Actions, Percepts, Goal annotations, Additional task knowledge.]


[Diagram labels: Agent, Interface, Environment.]

ILP 2004

Machine Learning Journal (forthcoming)



Learning by Observation: Critic Mode

[Diagram labels: Agent, Interface, Environment, Expert, critic, Learner.]


One Body, Two Minds

How and when to switch control

How the expert and the agent program communicate

[Diagram labels: Expert, Agent, Interface, Environment, annotated with question marks.]


Diagrammatic Behavior Specification

[Diagram labels: Expert, Agent, Environment, Redux, Learner.]


Redux

Visual rule editing

Diagrammatic Behavior Specification

Goal Hierarchy

Task-performance knowledge is represented with a hierarchy of durative goals.

[Diagram: goal hierarchy with nodes Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to(Door), Go-to-door(D), Go-through(Door).]

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]
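A hierarchy of parameterized goals like the one on this slide can be sketched as a small tree structure. The sketch below is illustrative only: the goal names come from the slide, but the parent/child arrangement, the `Goal` class, and the `instantiate` and `flatten` helpers are assumptions made for the example, not the authors' representation.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A durative goal with named parameters and optional subgoals."""
    name: str
    params: tuple = ()
    subgoals: list = field(default_factory=list)

    def instantiate(self, bindings):
        """Return a copy in which bound parameters are replaced by values."""
        args = tuple(bindings.get(p, p) for p in self.params)
        return Goal(self.name, args,
                    [g.instantiate(bindings) for g in self.subgoals])

    def __str__(self):
        if self.params:
            return f"{self.name}({', '.join(self.params)})"
        return self.name


def build_hierarchy():
    """Assumed arrangement of the goals named on the slide."""
    get_in_room = Goal("Get-item-in-room", ("Item",))
    go_to = Goal("Go-to", ("Door",))
    go_through = Goal("Go-through", ("Door",))
    next_room = Goal("Goto-next-room", (), [go_to, go_through])
    different_room = Goal("Get-item-different-room", ("Item",), [next_room])
    return Goal("Get-item", ("Item",), [get_in_room, different_room])


def flatten(goal, depth=0):
    """Yield (depth, text) pairs in top-down, left-to-right order."""
    yield depth, str(goal)
    for sub in goal.subgoals:
        yield from flatten(sub, depth + 1)


if __name__ == "__main__":
    root = build_hierarchy().instantiate({"Item": "i3", "Door": "d1"})
    for depth, text in flatten(root):
        print("  " * depth + text)
```

Binding `Item` to `i3` and `Door` to `d1` mirrors the bindings shown on the later Goal Hierarchy slides.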

Goal Hierarchy

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

[Diagram: goal hierarchy with binding Item=i3. Goals shown: Get-item(i3), Get-item-in-room(i3), Get-item-different-room(Item), Goto-next-room, Go-through(Door).]

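Goal Hierarchy

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

[Diagram: goal hierarchy with bindings Item=i3 and Door=d1. Goals shown: Get-item(i3), Get-item-different-room(Item) instantiated as Get-item-different-room(i3), Go-to(Door) instantiated as Go-to(d1), Go-through(Door).]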

Goal Hierarchy

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

[Diagram: goal hierarchy with binding Door=d1. Goals shown: Get-item(i3), Get-item-different-room(i3), Goto-next-room, Go-through(d1).]


Behavior Specification

[Diagram labels: Agent, Expert.]

Expert draws the initial abstract situation

Create a scenario by selecting actions


Goal Specification

[Diagram labels: Agent, Expert.]

Goals are explicitly selected

The agent contributes based on the current situation, the current goal, and its knowledge


Goal Hierarchy

Learning by Observation perspective: unobservable mental reasoning of the expert

Learning perspective: bias on the hypothesis space; the “learn agent” problem is reduced to “learn goal selection and termination”

MI (mixed-initiative) perspective: information exchange between the expert and the agent


Relevant Knowledge Specification

Expert can mark important objects in a decision

[Diagram labels: Agent, Expert, Prepare food.]

Rich Behavior Trace

Expert-specified undesired actions and goals

Expert-rejected actions and goals of the approximately learned agent program

[Diagram label: Watch TV.]

Rich Behavior Trace

Hypothetical actions and goals

Situation history: a tree structure of possible behaviors
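A situation history that branches into alternative behaviors, with some branches rejected by the expert, can be sketched as a tree whose edges carry decisions. This is a minimal illustration under stated assumptions: the `TraceNode` class, the status labels "selected" and "rejected", the fact tuples, and the demo decisions are hypothetical and are not taken from the authors' system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TraceNode:
    """One situation in the history, reached by a decision from its parent."""
    situation: frozenset                 # relational facts as tuples
    decision: Optional[str] = None       # None at the root
    status: str = "selected"             # "selected" or "rejected"
    children: list = field(default_factory=list)

    def add(self, decision, situation, status="selected"):
        """Attach an alternative continuation and return the new node."""
        child = TraceNode(frozenset(situation), decision, status)
        self.children.append(child)
        return child


def decision_examples(node):
    """Yield (situation, decision, is_positive) for every edge in the tree.

    Selected decisions become positive examples and rejected ones negative,
    each paired with the situation in which the decision was made.
    """
    for child in node.children:
        yield node.situation, child.decision, child.status == "selected"
        yield from decision_examples(child)


def build_demo_trace():
    """A tiny hypothetical trace with one rejected alternative."""
    root = TraceNode(frozenset({("in", "agent", "r1")}))
    at_door = root.add("Go-to(d1)", {("at", "agent", "d1")})
    root.add("Go-to(d2)", {("at", "agent", "d2")}, status="rejected")
    at_door.add("Go-through(d1)", {("in", "agent", "r2")})
    return root


if __name__ == "__main__":
    for situation, decision, positive in decision_examples(build_demo_trace()):
        print(sorted(situation), decision, "+" if positive else "-")
```

Keeping rejected branches in the same tree as selected ones lets a learner draw positive and negative examples from one structure.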

Relational Learning by Observation

Input: relational situations; goal and action selections and rejections; additional annotations (e.g., important objects); background knowledge

Output: rule-based agent program

Learn goal/action selection and termination by generalizing over multiple examples

Inductive Logic Programming is used to combine rich knowledge structures



Find the common structures in the decision examples
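One standard way to find structure shared by two relational examples is a least general generalization over ground facts, in the style of Plotkin. The sketch below illustrates that idea only; the slides do not say which generalization operator the learner uses, and the predicate names and the two example situations are invented for the illustration.

```python
def lgg_atom(a, b, table):
    """Generalize two ground atoms with the same predicate and arity.

    Differing argument pairs are replaced by a variable, and the same pair
    always maps to the same variable so shared structure is preserved.
    """
    if a[0] != b[0] or len(a) != len(b):
        return None
    args = []
    for s, t in zip(a[1:], b[1:]):
        if s == t:
            args.append(s)
        else:
            if (s, t) not in table:
                table[(s, t)] = f"V{len(table)}"
            args.append(table[(s, t)])
    return (a[0], *args)


def lgg(example_a, example_b):
    """Return the generalized atoms and the constant-pair-to-variable table."""
    table = {}
    general = []
    for a in example_a:
        for b in example_b:
            atom = lgg_atom(a, b, table)
            if atom is not None and atom not in general:
                general.append(atom)
    return general, table


# Two hypothetical decision examples (facts are assumptions for illustration).
EXAMPLE_1 = [
    ("in", "agent", "r1"), ("door_in", "d1", "r1"), ("leads_to", "d1", "r2"),
    ("contains", "r2", "i3"), ("wants", "agent", "i3"), ("select", "d1"),
]
EXAMPLE_2 = [
    ("in", "agent", "r3"), ("door_in", "d4", "r3"), ("leads_to", "d4", "r4"),
    ("contains", "r4", "i4"), ("wants", "agent", "i4"), ("select", "d4"),
]

if __name__ == "__main__":
    general, table = lgg(EXAMPLE_1, EXAMPLE_2)
    for atom in general:
        print(atom)
```

In the generalized result, the selected door, the room it leads to, and the wanted item are linked through shared variables, which is the kind of common structure the slide refers to.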



“Select a door in the current room, which leads to a room that contains the item the agent wants to get”

Learn relations between what the agent wants, perceives and knows.
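The quoted rule can be read as a conjunction of relations over what the agent wants, perceives, and knows. As a hedged sketch, the function below evaluates such a rule over a set of facts; the predicate names (`in`, `wants`, `door_in`, `leads_to`, `contains`) and the demo facts are assumptions for the example, not the learned rule's actual vocabulary.

```python
def query(facts, predicate):
    """Return the argument tuples of all facts with the given predicate."""
    return [fact[1:] for fact in facts if fact[0] == predicate]


def select_doors(facts):
    """Doors in the agent's current room that lead to a room holding a wanted item.

    Logic-program reading (hypothetical predicate names):
      select(Door) :- in(agent, Room), door_in(Door, Room),
                      leads_to(Door, Next), contains(Next, Item),
                      wants(agent, Item).
    """
    current = {room for who, room in query(facts, "in") if who == "agent"}
    wanted = {item for who, item in query(facts, "wants") if who == "agent"}
    target_rooms = {room for room, item in query(facts, "contains")
                    if item in wanted}
    doors_here = {door for door, room in query(facts, "door_in")
                  if room in current}
    return sorted(door for door, room in query(facts, "leads_to")
                  if door in doors_here and room in target_rooms)


# Hypothetical situation: two doors in the current room, one leads to the item.
DEMO_FACTS = {
    ("in", "agent", "r1"), ("wants", "agent", "i3"),
    ("door_in", "d1", "r1"), ("door_in", "d2", "r1"),
    ("leads_to", "d1", "r2"), ("leads_to", "d2", "r3"),
    ("contains", "r2", "i3"), ("contains", "r3", "i4"),
}

if __name__ == "__main__":
    print(select_doors(DEMO_FACTS))
```

Here `wants` stands for the agent's goal, `in` and `door_in` for percepts, and `leads_to` and `contains` for knowledge about the map.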



Summary

Diagrammatic behavior specification approach: extracts rich behavior knowledge; supports interactive behavior specification; serves as a communication medium between the agents (explicit goals and assumed situation)

Relational learning by observation approach: combines multiple complex knowledge sources


Future Work

Improve the mixed-initiative interaction of the interface

Explore domain-independent diagrammatic interface features

Allow the expert to enter context-sensitive knowledge


Mixed initiative perspective

Interactive behavior specification

Diagrammatic representation of behavior: a communication medium between the agents

Explicit goals and desired behavior

Facilitates interaction between the agents
