1
Learning through Interactive Behavior Specifications
Tolga Konik, CSLI, Stanford University
Douglas Pearson, Three Penny Software
John Laird, University of Michigan
2
Goal
Automatically generate cognitive agents
Reduce the cost of agent development
Reduce the expertise required to develop agents.
3
Domains
Autonomous cognitive agents
Dynamic virtual worlds
Real-time decisions based on knowledge and sensed data
Soar agent architecture
4
Learning by Observation
Approach:
Observe expert behavior
Learn to replicate it
Why?
We may want human-like agents
In complex domains, imitating humans may be easier than learning from scratch
5
Bottleneck in pure Learning by Observation
PROBLEM: You cannot observe the internal reasoning of the expert
SOLUTION:
Ask the expert for additional information: goal annotations
Use additional knowledge sources: task & domain knowledge
6
Learning by Observation
[Diagram: Expert, Interface, Environment, Agent, and Learner. Labels: Actions, Percepts, Goal annotations, Additional Task Knowledge]
7
Learning by Observation
[Diagram: Agent, Interface, Environment]
ILP 2004
Machine Learning Journal (forthcoming)
8
Learning by Observation: Critic Mode
[Diagram: Agent, Interface, Environment, Learner, and the Expert acting as critic]
9
One Body, Two Minds
How and when to switch control
How the expert and the agent program communicate
[Diagram: Expert, Agent, Interface, Environment]
10
Diagrammatic Behavior Specification
[Diagram: Expert, Redux, Agent, Environment, Learner]
11
Diagrammatic Behavior Specification
Redux
Visual rule editing
12
Goal Hierarchy
Task-performance knowledge is represented with a hierarchy of durative goals.
[Diagram: goal hierarchy with the goals Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Go-to-door(D), Go-to(Door), Go-through(Door), Goto-next-room]
[Diagram: map with rooms r1-r4, doors d1-d6, and items i3 and i4]
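One way to picture a hierarchy of durative goals with parameters is the minimal sketch below. The goal names come from the slide, but the `Goal` class, the `active_stack` helper, and the parent/child links are assumptions made for illustration; the slides do not specify how the hierarchy is stored or which goal is a subgoal of which.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A durative goal template with named parameters and subgoals."""
    name: str
    params: tuple = ()
    subgoals: list = field(default_factory=list)

    def instantiate(self, **bindings):
        """Render the goal as text, substituting any bound parameters."""
        if not self.params:
            return self.name
        args = ", ".join(str(bindings.get(p, p)) for p in self.params)
        return f"{self.name}({args})"


# Parent/child links are assumed for illustration only.
go_to = Goal("Go-to", ("Door",))
go_through = Goal("Go-through", ("Door",))
in_room = Goal("Get-item-in-room", ("Item",))
different_room = Goal("Get-item-different-room", ("Item",), [go_to, go_through])
get_item = Goal("Get-item", ("Item",), [in_room, different_room])


def active_stack(path, **bindings):
    """Instantiate every goal on a root-to-leaf path with shared bindings."""
    return [goal.instantiate(**bindings) for goal in path]


stack = active_stack([get_item, different_room, go_to], Item="i3", Door="d1")
print(stack)
# ['Get-item(i3)', 'Get-item-different-room(i3)', 'Go-to(d1)']
```

Unbound parameters stay as variable names, so `in_room.instantiate()` gives `Get-item-in-room(Item)`, which mirrors how the slides show uninstantiated goals.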
13
Goal Hierarchy
[Diagram: map with rooms r1-r4, doors d1-d6, and items i3 and i4, next to the goal hierarchy. Instantiated goals: Get-item(i3), Get-item-in-room(i3). Binding: Item=i3]
14
Goal Hierarchy
[Diagram: map with rooms r1-r4, doors d1-d6, and items i3 and i4, next to the goal hierarchy. Instantiated goals: Get-item(i3), Get-item-different-room(i3), Go-to(d1). Bindings: Item=i3, Door=d1]
15
Goal Hierarchy
[Diagram: map with rooms r1-r4, doors d1-d6, and items i3 and i4, next to the goal hierarchy. Instantiated goals: Get-item(i3), Get-item-different-room(i3), Go-through(d1). Binding: Door=d1]
17
Behavior Specification
[Diagram: Agent and Expert]
Expert draws initial abstract situation
Expert creates a scenario by selecting actions
18
Goal Specification
[Diagram: Agent and Expert]
Goals are explicitly selected
The agent contributes based on the current situation, the current goal, and its knowledge
20
Goal Hierarchy
Learning by Observation perspective: captures the unobservable mental reasoning of the expert
Learning perspective: biases the hypothesis space; the “learn agent” problem is reduced to “learn goal selection and termination”
Mixed-initiative (MI) perspective: information exchange between the expert and the agent
21
Relevant Knowledge Specification
Expert can mark important objects in a decision
[Diagram: Agent and Expert. Label: Prepare food]
22
Rich Behavior Trace
Expert-specified undesired actions and goals
Expert-rejected actions and goals of the approximately learned agent program
[Diagram label: Watch TV]
23
Rich Behavior Trace
Hypothetical actions and goals
Situation history: a tree structure of possible behaviors
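A situation history that branches into possible behaviors can be sketched as a small tree in which each node records a considered action or goal and whether the expert selected or rejected it. The `TraceNode` class, the `examples` helper, the status labels, and the sample decisions are all assumptions for illustration; the slides do not describe the actual trace format.

```python
from dataclasses import dataclass, field


@dataclass
class TraceNode:
    """One decision in the trace: an action or goal considered in a situation."""
    decision: str
    status: str = "selected"  # "selected" or "rejected" (assumed labels)
    children: list = field(default_factory=list)

    def add(self, decision, status="selected"):
        """Attach a follow-up decision and return the new node."""
        child = TraceNode(decision, status)
        self.children.append(child)
        return child


def examples(node, path=()):
    """Yield (history, decision, status) for every node below the given one."""
    for child in node.children:
        yield path, child.decision, child.status
        yield from examples(child, path + (child.decision,))


root = TraceNode("start")
go = root.add("Go-to(d1)")
root.add("Watch-TV", status="rejected")  # hypothetical rejected alternative
go.add("Go-through(d1)")

for history, decision, status in examples(root):
    print(history, decision, status)
```

Flattening the tree this way gives one record per decision, with rejected branches kept as negative examples next to the selected ones.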
24
Relational Learning by Observation
Input:
Relational situations
Goal and action selections and rejections
Additional annotations (i.e., important objects)
Background knowledge
Output: Rule-based agent program
Learn goal/action selection/termination by generalizing over multiple examples
Inductive Logic Programming to combine rich knowledge structures
25
Relational Learning by Observation
26
Relational Learning by Observation
Find the common structures in the decision examples
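Finding common structure across decision examples can be illustrated with a deliberately simplified sketch: replace example-specific constants with shared variables, then keep only the facts present in every example. This is not the ILP procedure used in the work. In particular, the constant-to-variable mapping is supplied by hand here, whereas an ILP system would search for it. The facts, predicate names, and function names are hypothetical.

```python
def variablize(facts, roles):
    """Replace example-specific constants with shared variable names."""
    return {tuple(roles.get(term, term) for term in fact) for fact in facts}


def common_structure(examples):
    """Intersect the variablized fact sets of all positive examples."""
    generalized = [variablize(facts, roles) for facts, roles in examples]
    return set.intersection(*generalized)


# Two hypothetical decisions in which the expert selected a door.
example_1 = (
    {("in", "agent", "r1"), ("door-of", "d1", "r1"), ("leads-to", "d1", "r2"),
     ("contains", "r2", "i3"), ("wants", "agent", "i3"), ("light-on", "r1")},
    {"d1": "Door", "r1": "Here", "r2": "There", "i3": "Item"},
)
example_2 = (
    {("in", "agent", "r3"), ("door-of", "d5", "r3"), ("leads-to", "d5", "r4"),
     ("contains", "r4", "i4"), ("wants", "agent", "i4")},
    {"d5": "Door", "r3": "Here", "r4": "There", "i4": "Item"},
)

rule_body = common_structure([example_1, example_2])
for literal in sorted(rule_body):
    print(literal)
```

The incidental fact `("light-on", "r1")` appears in only one example, so it drops out, while the five relations shared by both decisions remain as a candidate rule body.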
27
Relational Learning by Observation
“Select a door in the current room, which leads to a room that contains the item the agent wants to get”
Learn relations between what the agent wants, perceives, and knows.
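The quoted door-selection rule can be read as a relational query over facts about what the agent wants, perceives, and knows. The sketch below is only an illustration of that reading: the fact encoding, predicate names, and `select_doors` function are assumptions, and the learned agent program in the work is rule based rather than written in Python.

```python
def select_doors(facts, agent="agent"):
    """Doors in the agent's room that lead to a room holding a wanted item."""
    rooms_here = {r for p, a, r in facts if p == "in" and a == agent}
    wanted = {i for p, a, i in facts if p == "wants" and a == agent}
    doors_here = {d for p, d, r in facts if p == "door-of" and r in rooms_here}
    goal_rooms = {r for p, r, i in facts if p == "contains" and i in wanted}
    return sorted(
        d for p, d, r in facts
        if p == "leads-to" and d in doors_here and r in goal_rooms
    )


# Hypothetical situation: two doors in r1, only d1 leads to the wanted item.
world = {
    ("in", "agent", "r1"),
    ("wants", "agent", "i3"),
    ("door-of", "d1", "r1"), ("leads-to", "d1", "r2"),
    ("door-of", "d2", "r1"), ("leads-to", "d2", "r3"),
    ("contains", "r2", "i3"),
    ("contains", "r3", "i4"),
}

print(select_doors(world))  # ['d1']
```

Each set in the function corresponds to one clause of the quoted rule, which shows why the rule is relational: it links the door to the item through two rooms instead of testing attributes of the door alone.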
32
Summary
Diagrammatic behavior specification approach:
To extract rich behavior knowledge
Interactive behavior specification
Communication medium between the agents (explicit goals and assumed situation)
Relational learning by observation approach to combine multiple complex knowledge sources
33
Future Work
Improve mixed-initiative interaction of the interface
Explore domain-independent diagrammatic interface features
Allow the expert to enter context-sensitive knowledge
34
Mixed-initiative perspective
Interactive behavior specification
Diagrammatic representation of behavior: communication medium between the agents
Explicit goals and desired behavior
Facilitates interaction between the agents