IL Kickoff Meeting, June 20-21, 2006
DARPA Integrated Learning, POIROT Project
Learning Hierarchical Task Networks by Analyzing Expert Traces
Pat Langley, Tolga Konik, Negin Nejati
Institute for the Study of Learning and Expertise
Palo Alto, California
Formulation of the Learning Task
Given:
  A set of domain operators with known effects
  A worked out problem solution that consists of
    The goal to be achieved in the problem
    A sequence of operator instances that achieves the goal
    A related sequence of intermediate problem states
Find: A hierarchical task network that
  Reproduces the solution to the training problem
  Generalizes well to related problems in the domain
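The inputs to the learning task can be made concrete with a small sketch. The one below replays a sequence of operator instances from an initial state and checks that the goal holds at the end, using the blocks-world trace that appears later in the talk (unstack C B, putdown C, unstack B A, goal clear A). The STRIPS-style operator definitions, the `holding` and `ontable` literals, and all function names are assumptions made for illustration; the slides do not give the operator definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorInstance:
    """A ground operator with known effects (hypothetical STRIPS-style encoding)."""
    name: str
    args: tuple
    preconditions: frozenset
    add: frozenset
    delete: frozenset


def apply(state, op):
    """Return the state that results from applying op, or fail if inapplicable."""
    if not op.preconditions <= state:
        raise ValueError(f"{op.name}{op.args} is not applicable")
    return (state - op.delete) | op.add


def replay(initial, ops):
    """Produce the sequence of intermediate states for a solution trace."""
    states = [initial]
    for op in ops:
        states.append(apply(states[-1], op))
    return states


def unstack(x, y):
    # Assumed definition: pick up block x from on top of block y.
    pre = frozenset({("on", x, y), ("clear", x), ("hand-empty",)})
    add = frozenset({("holding", x), ("clear", y)})
    return OperatorInstance("unstack", (x, y), pre, add, delete=pre)


def putdown(x):
    # Assumed definition: place the held block x on the table.
    pre = frozenset({("holding", x)})
    add = frozenset({("ontable", x), ("clear", x), ("hand-empty",)})
    return OperatorInstance("putdown", (x,), pre, add, delete=pre)


initial = frozenset({
    ("ontable", "A"), ("on", "B", "A"), ("on", "C", "B"),
    ("clear", "C"), ("hand-empty",),
})
goal = ("clear", "A")
trace = [unstack("C", "B"), putdown("C"), unstack("B", "A")]

states = replay(initial, trace)
solved = goal in states[-1]
```

Here `goal`, `trace`, and `states` together play the role of the worked out problem solution; the hierarchical task network to be found is not represented in this sketch.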
The ICARUS Architecture
[Diagram of the architecture, with components: Conceptual Memory; Belief Memory; Goal/Intention Memory; Conceptual Inference; Skill Execution; Perception; Environment; Perceptual Buffer; Skill Learning; Motor Buffer; Skill Retrieval and Selection; Skill Memory]
Representing Long-Term Structures
ICARUS encodes two forms of general long-term knowledge:
Conceptual clauses: A set of relational inference rules with perceived objects or defined concepts in their antecedents.
Skill clauses: A set of executable skills that specify:
  a head that indicates a goal the skill achieves;
  a single (typically defined) precondition;
  a set of ordered subgoals or actions for achieving the goal.
These define a specialized class of hierarchical task networks in a syntax very similar to Nau et al.'s SHOP2 formalism.
Beliefs, goals, and intentions are instances of these structures.
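As a rough illustration of these two clause types, the sketch below stores them as plain Python records. The field names mirror the `:percepts`, `:relations`, `:tests`, `:start`, `:subgoals`, and `:actions` keywords used in the example clauses in this talk; the class names, the tuple encoding of literals, and the rule that a skill carries either subgoals or actions but not both are assumptions of this sketch, not part of ICARUS itself.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptClause:
    """A relational inference rule: the head holds when the body matches."""
    head: tuple
    percepts: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # other concepts referenced
    tests: list = field(default_factory=list)      # numeric tests on percepts

    def is_primitive(self) -> bool:
        # Assumed convention: a primitive concept refers only to percepts.
        return not self.relations


@dataclass
class SkillClause:
    """An executable skill: a goal head, one precondition, and a body."""
    head: tuple
    start: tuple                                   # the single precondition
    percepts: list = field(default_factory=list)
    subgoals: list = field(default_factory=list)   # ordered subgoals
    actions: list = field(default_factory=list)    # executable actions

    def __post_init__(self):
        # Assumption of this sketch: exactly one of the two bodies is given.
        if bool(self.subgoals) == bool(self.actions):
            raise ValueError("give either subgoals or actions, not both")

    def is_primitive(self) -> bool:
        return bool(self.actions)


in_lane = ConceptClause(
    head=("in-lane", "?self", "?lane"),
    percepts=[("self", "?self", "segment", "?seg"),
              ("line", "?lane", "segment", "?seg", "dist", "?dist")],
    tests=[(">", "?dist", -10), ("<=", "?dist", 0)],
)

in_rightmost_lane = SkillClause(
    head=("in-rightmost-lane", "?self", "?line"),
    start=("last-lane", "?line"),
    percepts=[("self", "?self"), ("line", "?line")],
    subgoals=[("driving-well-in-segment", "?self", "?seg", "?line")],
)

in_segment = SkillClause(
    head=("in-segment", "?self", "?endsg"),
    start=("in-intersection-for-right-turn", "?self", "?int"),
    actions=[("steer", 1)],
)
```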
Representing Concepts (Axioms)
Nonprimitive Concepts

((in-rightmost-lane ?self ?clane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?clane segment ?seg))
 :relations ((driving-well-in-segment ?self ?seg ?clane)
             (last-lane ?clane)
             (not (lane-to-right ?clane ?anylane))))

((driving-well-in-segment ?self ?seg ?lane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?lane segment ?seg))
 :relations ((in-segment ?self ?seg)
             (in-lane ?self ?lane)
             (aligned-with-lane-in-segment ?self ?seg ?lane)
             (centered-in-lane ?self ?seg ?lane)
             (steering-wheel-straight ?self)))

Primitive Concepts

((in-lane ?self ?lane)
 :percepts ((self ?self segment ?seg)
            (line ?lane segment ?seg dist ?dist))
 :tests    ((> ?dist -10) (<= ?dist 0)))
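To show how a primitive concept such as `in-lane` could be matched against percepts, here is a minimal sketch. It hard-codes the pattern of that one clause: a `self` percept and a `line` percept that share a segment, with the line's `dist` attribute passing the two tests `(> ?dist -10)` and `(<= ?dist 0)`. The dictionary encoding of percepts and the function name are assumptions; a real matcher would interpret the clause generically.

```python
def match_in_lane(percepts):
    """Return ("in-lane", self, lane) beliefs supported by the percepts."""
    beliefs = []
    for agent in percepts:
        if agent["type"] != "self":
            continue
        for line in percepts:
            if line["type"] != "line":
                continue
            # Both percepts must bind ?seg to the same segment.
            same_segment = line["segment"] == agent["segment"]
            # The :tests field: (> ?dist -10) and (<= ?dist 0).
            if same_segment and -10 < line["dist"] <= 0:
                beliefs.append(("in-lane", agent["name"], line["name"]))
    return beliefs


percepts = [
    {"type": "self", "name": "me", "segment": "seg1"},
    {"type": "line", "name": "lane1", "segment": "seg1", "dist": -4},
    {"type": "line", "name": "lane2", "segment": "seg1", "dist": 6},
    {"type": "line", "name": "lane3", "segment": "seg2", "dist": -2},
]
beliefs = match_in_lane(percepts)
```

Only `lane1` yields a belief here: `lane2` fails the distance tests and `lane3` lies in a different segment.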
Representing Skills (Methods)

Nonprimitive Skill Clauses

((in-rightmost-lane ?self ?line)
 :percepts ((self ?self) (line ?line))
 :start    ((last-lane ?line))
 :subgoals ((driving-well-in-segment ?self ?seg ?line)))

((driving-well-in-segment ?self ?seg ?line)
 :percepts ((segment ?seg) (line ?line) (self ?self))
 :start    ((steering-wheel-straight ?self))
 :subgoals ((in-segment ?self ?seg)
            (centered-in-lane ?self ?seg ?line)
            (aligned-with-lane-in-segment ?self ?seg ?line)
            (steering-wheel-straight ?self)))

Primitive Skill Clauses

((in-segment ?self ?endsg)
 :percepts ((self ?self speed ?speed)
            (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start    ((in-intersection-for-right-turn ?self ?int))
 :actions  ((steer 1)))
Hierarchical Structure of Memory
ICARUS organizes both concepts and skills in a hierarchical manner.
Each concept is defined in terms of other concepts and/or percepts.
Each skill is defined in terms of other skills, concepts, and percepts.
[Diagram: the concept hierarchy and the skill hierarchy]
Hierarchical Structure of Memory
ICARUS interleaves its long-term memories for concepts and skills.
For example, the skill highlighted here refers directly to the highlighted concepts.
[Diagram: the concept and skill hierarchies, with one skill and the concepts it refers to highlighted]
Basic ICARUS Processes
ICARUS matches patterns to recognize concepts and select skills.
Concepts are matched bottom up, starting from percepts.
Skill paths are matched top down, starting from intentions.
[Diagram: the concept hierarchy and the skill hierarchy]
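The two matching directions can be sketched in a few lines. The code below is a deliberately simplified, propositional illustration and not the ICARUS matcher: concepts and skills are ground strings rather than relational patterns, and the names `Skill`, `infer_beliefs`, `select_path`, and the example literals are all hypothetical. Concept inference runs bottom up to a fixed point; skill selection descends from a goal to the first primitive action whose precondition holds. Cyclic skill definitions are not handled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Skill:
    start: str                                   # single precondition
    subgoals: list = field(default_factory=list)
    action: Optional[str] = None                 # set only for primitive skills


def infer_beliefs(percepts, concept_rules):
    """Bottom up: add a concept once everything in its body is believed."""
    beliefs = set(percepts)
    changed = True
    while changed:
        changed = False
        for head, body in concept_rules:
            if head not in beliefs and set(body) <= beliefs:
                beliefs.add(head)
                changed = True
    return beliefs


def select_path(goal, skills, beliefs):
    """Top down: return a path from goal to an executable action, or None."""
    if goal in beliefs:
        return None  # goal already satisfied, nothing to execute
    for skill in skills.get(goal, []):
        if skill.start not in beliefs:
            continue
        if skill.action is not None:
            return [goal, skill.action]
        for sub in skill.subgoals:
            if sub in beliefs:
                continue  # skip subgoals that already hold
            path = select_path(sub, skills, beliefs)
            if path is not None:
                return [goal] + path
            break  # first unsatisfied subgoal is blocked: try the next clause
    return None


concept_rules = [
    ("in-lane", ["dist-in-range"]),
    ("driving-well", ["in-lane", "wheel-straight"]),
]
skills = {
    "driving-well": [Skill(start="in-lane",
                           subgoals=["in-lane", "wheel-straight"])],
    "wheel-straight": [Skill(start="in-lane", action="straighten")],
}

beliefs = infer_beliefs({"dist-in-range"}, concept_rules)
path = select_path("driving-well", skills, beliefs)
```

With these inputs, `in-lane` is inferred from the percept, and the selected path runs from `driving-well` through `wheel-straight` to the action `straighten`.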
Impasse-Driven Analytical Learning
[Diagram with labels: Problem (Initial State, Goal); Reactive Execution; Skill Hierarchy; If Impasse; Analytical Learning; Expert's Primitive Skill Sequence; Effects of Primitive Skills; Learned Skills]
Learning HTNs by Trace Analysis
[Diagram: concepts and primitive skills]
Learning HTNs by Trace Analysis
Skill Chaining
[Diagram: concepts and primitive skills]
Learning HTNs by Trace Analysis
Concept Chaining
[Diagram: concepts and primitive skills]
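Constructing an Explanation
[Diagram: an explanation of a blocks-world trace that links concepts to primitive skills.
 Primitive skills: unstack C B; putdown C; unstack B A
 Concepts: unstackable C B; clear B; putdownable C; hand-empty; on B A; unstackable B A; clear A
 The diagram also pictures the configurations of blocks A, B, and C between steps.]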
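From an Explanation to an HTN
[Diagram: the same blocks-world explanation, linking concepts to primitive skills.
 Primitive skills: unstack C B; putdown C; unstack B A
 Concepts: unstackable C B; clear B; putdownable C; hand-empty; on B A; unstackable B A; clear A
 The diagram also pictures the configurations of blocks A, B, and C between steps.]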
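From an Explanation to an HTN
[Diagram: the explanation with block names replaced by variables, linking concepts to primitive skills.
 Primitive skills: unstack ?z ?y; putdown ?z; unstack ?y ?x
 Concepts: unstackable ?z ?y; clear ?y; putdownable ?z; hand-empty; on ?y ?x; unstackable ?y ?x; clear ?x
 The diagram also pictures the configurations of blocks A, B, and C between steps.]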
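The move from the ground explanation to the generalized one, in which blocks A, B, and C become ?x, ?y, and ?z, amounts to a consistent substitution of variables for constants. A minimal sketch follows; the function name `variablize` and the tuple encoding of literals are assumptions, and the binding is supplied by hand rather than derived from the explanation.

```python
def variablize(literals, binding):
    """Replace constants with variables, using one binding for all literals."""
    return [
        (predicate, *(binding.get(arg, arg) for arg in args))
        for predicate, *args in literals
    ]


# Ground skills and concepts taken from the blocks-world explanation.
ground = [
    ("unstack", "C", "B"),
    ("putdown", "C"),
    ("unstack", "B", "A"),
    ("clear", "A"),
    ("hand-empty",),
]
binding = {"A": "?x", "B": "?y", "C": "?z"}

lifted = variablize(ground, binding)
```

Because a single binding is shared across every literal, each block maps to the same variable wherever it occurs, which preserves the links between steps of the explanation.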
Key Ideas of the Approach
Constrained form of hierarchical task networks
  Each skill clause/method has a goal as its head
  Each method has one (possibly defined) precondition
  The resulting semi-lattice makes learning tractable
Learning involves analyzing the expert trace
  Explanation draws on a form of goal regression
  Each step in the explanation becomes an HTN method
Similar to explanation-based learning for planning but retains the explanation structure
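One possible reading of how goal regression over a trace could yield one method per explanation step is sketched below, using the blocks-world example from the earlier slides. This is an illustrative simplification, not the authors' algorithm: literals are ground strings, each state is assumed to list the defined concepts that hold in it, `putdownable C` is treated as a direct effect of `unstack C B`, and every name in the code is hypothetical. A goal produced by an operator is regressed through that operator (labeled skill chaining here); a defined concept is decomposed into the subconcepts that were not true initially, ordered by when they became true (labeled concept chaining).

```python
def became_true(literal, states, end):
    """Earliest index s such that literal holds in every state from s to end."""
    s = end
    while s > 0 and literal in states[s - 1]:
        s -= 1
    return s


def explain(goal, end, trace, states, concepts, methods):
    """Explain why goal holds in states[end], adding one method per step."""
    since = became_true(goal, states, end)
    if since == 0:
        return  # held in the initial state: nothing to explain
    name, precondition, effects = trace[since - 1]
    if goal in effects:
        # Skill chaining: regress the goal through the operator achieving it.
        methods.append((goal, [precondition, name]))
        explain(precondition, since - 1, trace, states, concepts, methods)
    elif goal in concepts:
        # Concept chaining: order subconcepts by when they became true.
        timed = sorted((became_true(c, states, end), c) for c in concepts[goal])
        subgoals = [c for t, c in timed if t > 0]
        methods.append((goal, subgoals))
        for c in subgoals:
            explain(c, end, trace, states, concepts, methods)
    else:
        raise ValueError(f"cannot explain {goal!r}")


# states[i] holds before trace[i]; states[-1] is the final state.
states = [
    {"on B A", "on C B", "clear C", "hand-empty", "unstackable C B"},
    {"on B A", "clear B", "putdownable C"},
    {"on B A", "clear B", "hand-empty", "unstackable B A"},
    {"clear A"},
]
# Each step: (operator instance, its precondition, its assumed effects).
trace = [
    ("unstack C B", "unstackable C B", {"clear B", "putdownable C"}),
    ("putdown C", "putdownable C", {"hand-empty"}),
    ("unstack B A", "unstackable B A", {"clear A"}),
]
concepts = {"unstackable B A": ["on B A", "clear B", "hand-empty"]}

methods = []
explain("clear A", len(trace), trace, states, concepts, methods)
```

Each entry of `methods` pairs a goal head with its ordered subgoals, for instance `clear A` with `unstackable B A` followed by `unstack B A`. Replacing the block names with variables would then give the generalized methods.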
Related Research
Nonincremental, knowledge-lean approaches
  Behavioral cloning (Sammut, 1996; Urbancic & Bratko, 1994)
  Relational induction from traces (e.g., Reddy & Tadepalli, 1997)
Incremental, knowledge-intensive approaches
  Explanation-based learning (e.g., Shavlik, 1989; Mooney, 1990)
  Derivational analogy (e.g., Veloso & Carbonell, 1993)
  Programming by demonstration (e.g., Lau et al., 2003)
Plans for Future Research
Extend framework to use and learn partial-order skills
Augment approach to use known subtasks during learning
Extend method to learn skills with negated goals and subgoals
Modify approach to handle partially observable traces
Extend system to learn skills with uncertain outcomes
End of Presentation