IL Kickoff Meeting June 20-21, 2006 DARPA Integrated Learning POIROT Project 1 Learning Hierarchical Task Networks by Analyzing Expert Traces Pat Langley

IL Kickoff Meeting, June 20-21, 2006

DARPA Integrated Learning POIROT Project 1

Learning Hierarchical Task Networks by Analyzing Expert Traces

Pat Langley, Tolga Konik, Negin Nejati

Institute for the Study of Learning and Expertise

Palo Alto, California



Formulation of the Learning Task

Given:

  A set of domain operators with known effects

  A worked-out problem solution that consists of:

    The goal to be achieved in the problem

    A sequence of operator instances that achieves the goal

    A related sequence of intermediate problem states

Find: A hierarchical task network that:

  Reproduces the solution to the training problem

  Generalizes well to related problems in the domain
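The inputs to the learning task can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides say the operators have "known effects" but do not give their encoding, so the STRIPS-style preconditions and add/delete lists, the `OperatorInstance` type, and the blocks-world operator definitions below are all hypothetical. The sketch replays a sequence of operator instances to recover the intermediate states and checks that the goal holds at the end.

```python
from collections import namedtuple

# Hypothetical STRIPS-style encoding of an operator instance (an assumption,
# not the project's actual representation).
OperatorInstance = namedtuple("OperatorInstance", "name pre add delete")


def apply_operator(state, op):
    """Return the successor state, or raise if a precondition is unmet."""
    missing = op.pre - state
    if missing:
        raise ValueError(f"{op.name}: unmet preconditions {sorted(missing)}")
    return (state - op.delete) | op.add


def replay(initial_state, steps):
    """Return the list of states visited, starting with the initial state."""
    states = [frozenset(initial_state)]
    for op in steps:
        states.append(apply_operator(states[-1], op))
    return states


def solves(initial_state, steps, goal):
    """True if every goal literal holds after executing all steps."""
    return frozenset(goal) <= replay(initial_state, steps)[-1]


def lits(*literals):
    return frozenset(literals)


# Assumed blocks-world operator instances for a tower with C on B on A.
unstack_c_b = OperatorInstance(
    ("unstack", "C", "B"),
    pre=lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)),
    add=lits(("holding", "C"), ("clear", "B")),
    delete=lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)),
)
putdown_c = OperatorInstance(
    ("putdown", "C"),
    pre=lits(("holding", "C")),
    add=lits(("ontable", "C"), ("clear", "C"), ("hand-empty",)),
    delete=lits(("holding", "C")),
)
unstack_b_a = OperatorInstance(
    ("unstack", "B", "A"),
    pre=lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)),
    add=lits(("holding", "B"), ("clear", "A")),
    delete=lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)),
)

initial = lits(("on", "C", "B"), ("on", "B", "A"), ("ontable", "A"),
               ("clear", "C"), ("hand-empty",))
trace = [unstack_c_b, putdown_c, unstack_b_a]
goal = lits(("clear", "A"))

states = replay(initial, trace)
print(len(states), solves(initial, trace, goal))
```

A learner in this setting would receive `goal`, `trace`, and `states` as its training input; the replay step is only a consistency check on that input.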



The ICARUS Architecture

[Figure: ICARUS architecture diagram with the components Conceptual Memory, Belief Memory, Goal/Intention Memory, Conceptual Inference, Skill Execution, Perception, Environment, Perceptual Buffer, Skill Learning, Motor Buffer, Skill Retrieval and Selection, and Skill Memory]



Representing Long-Term Structures

ICARUS encodes two forms of general long-term knowledge:

Conceptual clauses: A set of relational inference rules with perceived objects or defined concepts in their antecedents.

Skill clauses: A set of executable skills that specify:

  a head that indicates a goal the skill achieves;

  a single (typically defined) precondition;

  a set of ordered subgoals or actions for achieving the goal.

These define a specialized class of hierarchical task networks in a syntax very similar to Nau et al.’s SHOP2 formalism.

Beliefs, goals, and intentions are instances of these structures.
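As a rough illustration of these two clause types, the following Python sketch mirrors the fields named on the slides (`:percepts`, `:relations`, `:tests` for concepts; a head, a `:start` precondition, and `:subgoals` or `:actions` for skills). The class names and the rule that a skill clause carries either subgoals or actions, but not both, are assumptions made for this sketch rather than details confirmed by the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptClause:
    head: tuple
    percepts: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # references to other concepts
    tests: list = field(default_factory=list)      # numeric tests on percept values

    def is_primitive(self):
        # Assumption: a concept with no :relations is defined on percepts alone.
        return not self.relations


@dataclass
class SkillClause:
    head: tuple                    # the goal this skill achieves
    start: Optional[tuple] = None  # the single precondition
    subgoals: list = field(default_factory=list)
    actions: list = field(default_factory=list)

    def __post_init__(self):
        # Assumption: exactly one of subgoals/actions is given.
        if bool(self.subgoals) == bool(self.actions):
            raise ValueError("give either subgoals or actions, not both or neither")

    def is_primitive(self):
        return bool(self.actions)


in_lane = ConceptClause(
    head=("in-lane", "?self", "?lane"),
    percepts=[("self", "?self", "segment", "?seg"),
              ("line", "?lane", "segment", "?seg", "dist", "?dist")],
    tests=[(">", "?dist", -10), ("<=", "?dist", 0)],
)
in_segment = SkillClause(
    head=("in-segment", "?self", "?endsg"),
    start=("in-intersection-for-right-turn", "?self", "?int"),
    actions=[("steer", 1)],
)
in_rightmost_lane = SkillClause(
    head=("in-rightmost-lane", "?self", "?line"),
    start=("last-lane", "?line"),
    subgoals=[("driving-well-in-segment", "?self", "?seg", "?line")],
)
print(in_lane.is_primitive(), in_segment.is_primitive(), in_rightmost_lane.is_primitive())
```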



Representing Concepts (Axioms)

Nonprimitive Concepts

    ((in-rightmost-lane ?self ?clane)
     :percepts  ((self ?self) (segment ?seg)
                 (line ?clane segment ?seg))
     :relations ((driving-well-in-segment ?self ?seg ?clane)
                 (last-lane ?clane)
                 (not (lane-to-right ?clane ?anylane))))

    ((driving-well-in-segment ?self ?seg ?lane)
     :percepts  ((self ?self) (segment ?seg)
                 (line ?lane segment ?seg))
     :relations ((in-segment ?self ?seg)
                 (in-lane ?self ?lane)
                 (aligned-with-lane-in-segment ?self ?seg ?lane)
                 (centered-in-lane ?self ?seg ?lane)
                 (steering-wheel-straight ?self)))

Primitive Concepts

    ((in-lane ?self ?lane)
     :percepts ((self ?self segment ?seg)
                (line ?lane segment ?seg dist ?dist))
     :tests    ((> ?dist -10) (<= ?dist 0)))
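To make the primitive concept concrete, here is a minimal Python sketch of how the `in-lane` clause could be matched against percepts. The dictionary-based percept format and the function name `in_lane_beliefs` are assumptions for illustration; only the shared `?seg` binding and the two tests on `?dist` come from the clause itself.

```python
def in_lane_beliefs(percepts):
    """Return (in-lane ?self ?lane) beliefs supported by the percepts.

    Mirrors the clause: the self and line percepts must share a segment,
    and the line's distance must satisfy -10 < dist <= 0.
    """
    selves = [p for p in percepts if p["type"] == "self"]
    lines = [p for p in percepts if p["type"] == "line"]
    beliefs = []
    for agent in selves:
        for line in lines:
            same_segment = line["segment"] == agent["segment"]
            if same_segment and -10 < line["dist"] <= 0:
                beliefs.append(("in-lane", agent["name"], line["name"]))
    return beliefs


# Hypothetical percepts: one agent and four lane lines.
percepts = [
    {"type": "self", "name": "me", "segment": "s1"},
    {"type": "line", "name": "line1", "segment": "s1", "dist": -4},
    {"type": "line", "name": "line2", "segment": "s1", "dist": 6},
    {"type": "line", "name": "line3", "segment": "s2", "dist": -2},
    {"type": "line", "name": "line4", "segment": "s1", "dist": -10},
]
print(in_lane_beliefs(percepts))
```

Only `line1` qualifies here: `line2` and `line4` fail a distance test, and `line3` lies in a different segment.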



Representing Skills (Methods)

Nonprimitive Skill Clauses

    ((in-rightmost-lane ?self ?line)
     :percepts ((self ?self) (line ?line))
     :start    ((last-lane ?line))
     :subgoals ((driving-well-in-segment ?self ?seg ?line)))

    ((driving-well-in-segment ?self ?seg ?line)
     :percepts ((segment ?seg) (line ?line) (self ?self))
     :start    ((steering-wheel-straight ?self))
     :subgoals ((in-segment ?self ?seg)
                (centered-in-lane ?self ?seg ?line)
                (aligned-with-lane-in-segment ?self ?seg ?line)
                (steering-wheel-straight ?self)))

Primitive Skill Clauses

    ((in-segment ?self ?endsg)
     :percepts ((self ?self speed ?speed)
                (intersection ?int cross ?cross)
                (segment ?endsg street ?cross angle ?angle))
     :start    ((in-intersection-for-right-turn ?self ?int))
     :actions  ((steer 1)))



Hierarchical Structure of Memory

ICARUS organizes both concepts and skills in a hierarchical manner.

Each concept is defined in terms of other concepts and/or percepts.

Each skill is defined in terms of other skills, concepts, and percepts.

[Figure: two hierarchies, labeled "concepts" and "skills"]



Hierarchical Structure of Memory

ICARUS interleaves its long-term memories for concepts and skills.

For example, the skill highlighted here refers directly to the highlighted concepts.

[Figure: the "concepts" and "skills" hierarchies, with one skill and the concepts it refers to highlighted]



Basic ICARUS Processes

ICARUS matches patterns to recognize concepts and select skills.

Concepts are matched bottom up, starting from percepts.

Skill paths are matched top down, starting from intentions.

[Figure: the "concepts" and "skills" hierarchies]
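The two matching directions can be sketched as follows. This is a deliberately simplified, propositional illustration, not the ICARUS matcher: clauses are ground strings with no variables, negated relations are dropped, and the function names `infer` and `select_path` are hypothetical. Concept inference runs bottom up to a fixed point; skill selection descends from a goal through the first unsatisfied subgoal until it reaches a clause with actions.

```python
def infer(primitive_beliefs, concept_rules):
    """Bottom-up inference: add concept heads until nothing changes."""
    beliefs = set(primitive_beliefs)
    changed = True
    while changed:
        changed = False
        for head, body in concept_rules:
            if head not in beliefs and all(b in beliefs for b in body):
                beliefs.add(head)
                changed = True
    return beliefs


def select_path(goal, skills, beliefs):
    """Top-down selection: return (skill path, actions) or None."""
    if goal in beliefs:
        return None  # goal already satisfied, nothing to execute
    for skill in skills:
        if skill["head"] != goal:
            continue
        if not all(c in beliefs for c in skill["start"]):
            continue
        if "actions" in skill:
            return [goal], skill["actions"]
        for sub in skill["subgoals"]:
            if sub in beliefs:
                continue  # skip subgoals that already hold
            found = select_path(sub, skills, beliefs)
            if found is not None:
                path, actions = found
                return [goal] + path, actions
            break  # first unsatisfied subgoal is not achievable here
    return None


# Ground, simplified versions of the driving clauses shown earlier.
CONCEPTS = [
    ("in-rightmost-lane", ["driving-well-in-segment", "last-lane"]),
    ("driving-well-in-segment",
     ["in-segment", "in-lane", "aligned-with-lane-in-segment",
      "centered-in-lane", "steering-wheel-straight"]),
]
SKILLS = [
    {"head": "in-rightmost-lane", "start": ["last-lane"],
     "subgoals": ["driving-well-in-segment"]},
    {"head": "driving-well-in-segment", "start": ["steering-wheel-straight"],
     "subgoals": ["in-segment", "centered-in-lane",
                  "aligned-with-lane-in-segment", "steering-wheel-straight"]},
    {"head": "in-segment", "start": ["in-intersection-for-right-turn"],
     "actions": [("steer", 1)]},
]

beliefs = infer({"last-lane", "steering-wheel-straight",
                 "in-intersection-for-right-turn"}, CONCEPTS)
selection = select_path("in-rightmost-lane", SKILLS, beliefs)
print(selection)
```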



Impasse-Driven Analytical Learning

[Figure: flow diagram with the elements Problem (Initial State, Goal), Reactive Execution, Skill Hierarchy, "If Impasse?", Analytical Learning, Expert’s Primitive Skill Sequence, Effects of Primitive Skills, and Learned Skills]



Learning HTNs by Trace Analysis

[Figure: diagram with layers labeled "concepts" and "primitive skills"]



Skill Chaining

[Figure: diagram with layers labeled "concepts" and "primitive skills"]




Concept Chaining

[Figure: diagram with layers labeled "concepts" and "primitive skills"]




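Constructing an Explanation

[Figure: explanation of a blocks-world trace over blocks A, B, and C, with layers labeled "concepts" and "primitive skills" and block diagrams of the successive states.

Primitive skills: unstack C B; putdown C; unstack B A.

Concepts: unstackable C B; clear B; hand-empty; putdownable C; on B A; unstackable B A; clear A.]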



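From an Explanation to an HTN

[Figure: the same blocks-world explanation over blocks A, B, and C, with layers labeled "concepts" and "primitive skills" and block diagrams of the successive states.

Primitive skills: unstack C B; putdown C; unstack B A.

Concepts: unstackable C B; clear B; hand-empty; putdownable C; on B A; unstackable B A; clear A.]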



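From an Explanation to an HTN

[Figure: the explanation with block names replaced by variables, with layers labeled "concepts" and "primitive skills" and block diagrams of the successive states.

Primitive skills: unstack ?z ?y; putdown ?z; unstack ?y ?x.

Concepts: unstackable ?z ?y; clear ?y; hand-empty; putdownable ?z; on ?y ?x; unstackable ?y ?x; clear ?x.]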
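The step from the ground explanation to the variablized one can be illustrated with a short Python sketch. It assumes literals are tuples and that a fixed mapping from block names to variables (A to ?x, B to ?y, C to ?z, as the two slides suggest) is supplied by hand; the function name `variabilize` is hypothetical, and the slides do not say how the system itself chooses variables.

```python
def variabilize(literal, mapping):
    """Replace constants in a literal with variables, consistently."""
    predicate, *args = literal
    return (predicate, *(mapping.get(arg, arg) for arg in args))


# Ground literals and skill instances from the explanation slide.
ground = [
    ("unstack", "C", "B"),
    ("unstackable", "C", "B"),
    ("clear", "B"),
    ("putdown", "C"),
    ("putdownable", "C"),
    ("hand-empty",),
    ("on", "B", "A"),
    ("unstackable", "B", "A"),
    ("unstack", "B", "A"),
    ("clear", "A"),
]
mapping = {"A": "?x", "B": "?y", "C": "?z"}

generalized = [variabilize(lit, mapping) for lit in ground]
for before, after in zip(ground, generalized):
    print(before, "->", after)
```

Because one mapping is applied everywhere, a block that appears in several literals is replaced by the same variable in each, which keeps the shared bindings of the explanation intact.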



Key Ideas of the Approach

Constrained form of hierarchical task networks

Each skill clause/method has a goal as its head

Each method has one (possibly defined) precondition

The resulting semi-lattice makes learning tractable

Learning involves analyzing the expert trace

Explanation draws on a form of goal regression

Each step in the explanation becomes an HTN method

Similar to explanation-based learning for planning but retains the explanation structure
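The idea that each explanation step becomes a method can be sketched with a simplified form of goal regression over a trace. The sketch below is an illustration under strong assumptions, not the project's algorithm: operators use assumed STRIPS-style preconditions and effects, each precondition literal is treated as its own subgoal (so defined concepts such as unstackable are not used), and the names `Step`, `achiever`, and `explain` are hypothetical. For a goal, it finds the step that made it true, recursively explains that step's preconditions from the earlier part of the trace, and records a ground method whose head is the goal.

```python
from collections import namedtuple

Step = namedtuple("Step", "action pre add delete")


def replay(initial, steps):
    """Recompute the state sequence from the initial state and the steps."""
    states = [frozenset(initial)]
    for s in steps:
        states.append((states[-1] - frozenset(s.delete)) | frozenset(s.add))
    return states


def achiever(goal, states, end):
    """Index of the latest step before `end` that made `goal` true, if any."""
    for j in range(end - 1, -1, -1):
        if goal in states[j + 1] and goal not in states[j]:
            return j
    return None


def explain(goal, steps, states, end, methods):
    """Explain why `goal` holds in states[end]; record one method per step."""
    if goal not in states[end]:
        raise ValueError(f"{goal} does not hold at step {end}")
    j = achiever(goal, states, end)
    if j is None:
        return None  # held in the initial state; nothing to learn
    achieved = []
    for pre in steps[j].pre:
        k = explain(pre, steps, states, j, methods)
        if k is not None:
            achieved.append((k, pre))
    # Order subgoals by when the trace achieved them, then add the action.
    subgoals = tuple(p for _, p in sorted(achieved)) + (steps[j].action,)
    method = (goal, subgoals)
    if method not in methods:
        methods.append(method)
    return j


HE = ("hand-empty",)
steps = [
    Step(("unstack", "C", "B"),
         pre=(("on", "C", "B"), ("clear", "C"), HE),
         add=(("holding", "C"), ("clear", "B")),
         delete=(("on", "C", "B"), ("clear", "C"), HE)),
    Step(("putdown", "C"),
         pre=(("holding", "C"),),
         add=(("ontable", "C"), ("clear", "C"), HE),
         delete=(("holding", "C"),)),
    Step(("unstack", "B", "A"),
         pre=(("on", "B", "A"), ("clear", "B"), HE),
         add=(("holding", "B"), ("clear", "A")),
         delete=(("on", "B", "A"), ("clear", "B"), HE)),
]
initial = [("on", "C", "B"), ("on", "B", "A"), ("ontable", "A"),
           ("clear", "C"), HE]

states = replay(initial, steps)
methods = []
explain(("clear", "A"), steps, states, len(steps), methods)
for head, body in methods:
    print(head, "<-", body)
```

The top-level method produced here has head clear A with subgoals clear B, hand-empty, and the action unstack B A, so the recorded methods retain the nesting of the explanation rather than flattening it into one macro-operator.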



Related Research

Nonincremental, knowledge-lean approaches

Behavioral cloning (Sammut, 1996; Urbancic & Bratko, 1994)

Relational induction from traces (e.g., Reddy & Tadepalli, 1997)

Incremental, knowledge-intensive approaches

Explanation-based learning (e.g., Shavlik, 1989; Mooney, 1990)

Derivational analogy (e.g., Veloso & Carbonell, 1993)

Programming by demonstration (e.g., Lau et al., 2003)



Plans for Future Research

Extend framework to use and learn partial-order skills

Augment approach to use known subtasks during learning

Extend method to learn skills with negated goals and subgoals

Modify approach to handle partially observable traces

Extend system to learn skills with uncertain outcomes



End of Presentation
