Flexible Planning
Ian Miguel
AI Group
Department of Computer Science
University of York
AI Planning
• Plan: Course of action to achieve pre-specified goals.
• Components of a planning problem:
  • Plan objects.
  • Initial state.
  • Goal state.
  • Operators.
Example – Initial State
• Goals:
  • Both packages to c4.
  • Guard to c3.
• pkg1 is valuable.
• pkg2 is not.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
• ci: Cities.
• ri: Major roads.
• mi: Mountainous roads.
Example - Operators
• Load-truck (guard present if pkg valuable).
• Unload-truck (guard present if pkg valuable).
• Guard-boards-vehicle.
• Guard-leaves-vehicle.
• Drive-truck (main roads only).
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
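The operators above can be sketched as data, assuming a STRIPS-style encoding with preconditions, add effects and delete effects. The class name `Action`, the helper functions and the proposition strings are hypothetical illustrations; the slides do not fix a syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground operator instance (hypothetical STRIPS-style encoding)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def applicable(action, state):
    # An action can be applied when every precondition holds in the state.
    return action.preconditions <= state


def apply_action(action, state):
    # Remove the deleted propositions, then add the new ones.
    if not applicable(action, state):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.del_effects) | action.add_effects


# Illustrative propositions only (assumed, not taken from the slides).
drive_c1_c2 = Action(
    name="Drive-truck c1 to c2 via r1",
    preconditions=frozenset({"truck at c1", "r1 links c1 c2"}),
    add_effects=frozenset({"truck at c2"}),
    del_effects=frozenset({"truck at c1"}),
)

state = frozenset({"truck at c1", "r1 links c1 c2"})
new_state = apply_action(drive_c1_c2, state)
```

Here `new_state` contains "truck at c2" and no longer contains "truck at c1", so the same drive action is not applicable a second time.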
Example – Solution
1. Drive-truck c1 to c2 via r1.
2. Load-truck pkg2. Guard-boards-vehicle truck.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
Example - Solution
3. Drive-truck c1 to c2 via r1.
4. Load-truck pkg2.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
Example - Solution
5. Drive-truck c1 to c2 via r1.
6. Drive-truck c2 to c3 via r2.
7. Drive-truck c3 to c4 via r3.
8. Unload-truck pkg1. Unload-truck pkg2.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
Example - Solution
9. Drive-truck c4 to c3 via r3.
10. Guard-leaves-vehicle truck.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
Solving Planning Problems
• Many and varied approaches (see [Weld99]).
• Focus here on Graphplan [Blum, Furst 97].
  • Sound and complete.
  • Finds a shortest plan, measured in planning-graph levels (time steps).
  • Constructs a planning graph, of which a valid plan is a sub-graph.
  • Easy to translate the search for a consistent sub-graph into a constraint satisfaction problem.
The Planning Graph
• Divided into levels (each equivalent to a plan step).
  • Each contains action and proposition nodes.
  • Level 0 contains propositions that capture the initial problem state.
• Graph extended by instantiating operators whose preconditions are met by propositions in the previous level.
[Figure: planning graph running from Initial Conditions through Actions1 and Propositions1, and onwards to the Goals]
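A minimal sketch of one expansion step, assuming ground actions are given as tuples of precondition, add and delete sets. The names `Action` and `expand_level` are hypothetical, and mutual exclusion bookkeeping is deliberately left out. As in Graphplan, every proposition is carried forward by an implicit no-op action, so proposition levels only grow.

```python
from collections import namedtuple

# Hypothetical ground-action record: name plus three sets of propositions.
Action = namedtuple("Action", "name pre add delete")


def expand_level(propositions, actions):
    """Return (actions instantiated at this level, next proposition level)."""
    # An action is added when all its preconditions appear in the level.
    instantiated = [a for a in actions if a.pre <= propositions]
    # No-op (persistence) actions carry every existing proposition forward.
    next_props = set(propositions)
    for a in instantiated:
        next_props |= a.add
    return instantiated, frozenset(next_props)


drive12 = Action("drive c1 c2", frozenset({"truck at c1"}),
                 frozenset({"truck at c2"}), frozenset({"truck at c1"}))
drive23 = Action("drive c2 c3", frozenset({"truck at c2"}),
                 frozenset({"truck at c3"}), frozenset({"truck at c2"}))

level0 = frozenset({"truck at c1"})
acts1, level1 = expand_level(level0, [drive12, drive23])
acts2, level2 = expand_level(level1, [drive12, drive23])
```

After one step only the first drive can be instantiated; after two steps "truck at c3" becomes reachable.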
Mutual Exclusion Constraints
• Record that a pair of actions or propositions cannot occur together in this level of a valid plan.
• Restricts the set of sub-graphs that must be considered.
Mutual Exclusion Constraints
• Exclusive actions:
  • Inconsistent effects:
    • Drive-truck c1 to c2 vs. Drive-truck c2 to c3.
    • 1st action: truck at c2. 2nd action: truck not at c2.
  • Interference:
    • Between an effect and a precondition.
    • Drive-truck c1 to c2 vs. Load-truck at c1.
    • Truck no longer at c1 after 1st action.
  • Competing needs:
    • Between preconditions.
    • Drive-truck c1 to c3 vs. Guard-boards-truck at c2.
    • Truck cannot be at c1 and c2 at the same time.
Mutual Exclusion Constraints
• Exclusive propositions:
  • Negation:
    • Truck at c2 vs. ¬(Truck at c2).
  • Inconsistent support:
    • Every way of supporting proposition a is exclusive of every way of supporting proposition b.
[Figure: propositions a and b]
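The three action exclusion tests can be sketched as set intersections, assuming each ground action carries precondition, add and delete sets. The `Action` record, the function names and the proposition strings are hypothetical; `prop_mutex` is an assumed set of exclusive proposition pairs from the previous level.

```python
from collections import namedtuple

Action = namedtuple("Action", "name pre add delete")


def inconsistent_effects(a, b):
    # One action deletes something the other adds.
    return bool(a.add & b.delete) or bool(b.add & a.delete)


def interference(a, b):
    # One action deletes a precondition of the other.
    return bool(a.delete & b.pre) or bool(b.delete & a.pre)


def competing_needs(a, b, prop_mutex):
    # Some pair of preconditions is exclusive in the previous level.
    return any(frozenset((p, q)) in prop_mutex for p in a.pre for q in b.pre)


def fs(*items):
    return frozenset(items)


drive12 = Action("Drive-truck c1 to c2", fs("truck at c1"),
                 fs("truck at c2"), fs("truck at c1"))
drive23 = Action("Drive-truck c2 to c3", fs("truck at c2"),
                 fs("truck at c3"), fs("truck at c2"))
load_c1 = Action("Load-truck at c1", fs("truck at c1", "pkg1 at c1"),
                 fs("pkg1 in truck"), fs("pkg1 at c1"))
drive13 = Action("Drive-truck c1 to c3", fs("truck at c1"),
                 fs("truck at c3"), fs("truck at c1"))
board_c2 = Action("Guard-boards-truck at c2", fs("truck at c2", "guard at c2"),
                  fs("guard in truck"), fs("guard at c2"))

# Assumed proposition mutex: the truck cannot be in two cities at once.
prop_mutex = {fs("truck at c1", "truck at c2")}
```

Each pair from the slide trips the corresponding test: `drive12`/`drive23` have inconsistent effects, `drive12`/`load_c1` interfere, and `drive13`/`board_c2` have competing needs.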
Finding a Valid Plan
• Search for a consistent sub-graph connecting goal propositions and initial conditions.
• If no such sub-graph, expand planning graph by one level and try again.
• Approaches:
  • Translate into a propositional satisfiability (SAT) problem, use a specialised SAT solver.
  • Translate into a constraint satisfaction problem.
    • Either one large problem, or connected sub-problems.
The Constraint Satisfaction Problem
• Given:
  1. A finite set of variables.
  2. Each variable has an associated finite domain of potential values.
  3. A set of constraints over these variables. Constraints specify allowed combinations of assignments of values to variables.
• Find:
  • A complete assignment of values to variables that satisfies all constraints.
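A minimal sketch of the definition above as a chronological backtracking solver. This is a generic textbook search, not the solver used in the talk; `solve` and `less_than` are hypothetical names, and constraints are assumed to be functions that accept a partial assignment and return False only when it is already violated.

```python
def solve(variables, domains, constraints, assignment=None):
    """Return a complete consistent assignment, or None if there is none."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    # Pick the next unassigned variable in the given order.
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(check(assignment) for check in constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next value
    return None


def less_than(a, b):
    # Binary constraint a < b; trivially satisfied until both are assigned.
    def check(assignment):
        if a in assignment and b in assignment:
            return assignment[a] < assignment[b]
        return True
    return check


variables = ["x", "y", "z"]
constraints = [less_than("x", "y"), less_than("y", "z")]
solution = solve(variables, {v: [1, 2, 3] for v in variables}, constraints)
no_solution = solve(variables, {v: [1, 2] for v in variables}, constraints)
```

With three values per variable the only solution is x=1, y=2, z=3; with two values per variable the chain x < y < z cannot be satisfied and `solve` returns None.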
The CSP Viewpoint
• Variables: proposition nodes.
• Domains: actions that assert these propositions as effects.
• Each sub-problem is a small CSP.
• Goals and their domains form a first sub-problem.
• Action preconditions specify new sub-problems…
[Figure: goal sub-problem]
Memoisation
• Goal: Solve as few sub-problems as possible.
• Generate memosets from unsolvable sub-problems.
  • New sets of mutually exclusive propositions.
  • If a memoset matches the propositions of a sub-problem, prune the search branch immediately.
[Figure: goal sub-problem]
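A minimal sketch of memoset lookup. It assumes that "matches" means the memoset is a subset of the sub-problem's propositions at the same graph level, which is sound because any superset of an unachievable set is also unachievable. The class `MemoTable` and its method names are hypothetical.

```python
class MemoTable:
    """Stores sets of propositions known to be unachievable at a level."""

    def __init__(self):
        self._memos = {}  # level -> list of frozensets

    def record(self, level, propositions):
        # Called when a sub-problem at this level has been proven unsolvable.
        self._memos.setdefault(level, []).append(frozenset(propositions))

    def prunes(self, level, propositions):
        # A sub-problem is hopeless if it contains a recorded memoset.
        props = frozenset(propositions)
        return any(memo <= props for memo in self._memos.get(level, []))


memos = MemoTable()
memos.record(2, {"pkg1 at c4", "guard at c3"})

hit = memos.prunes(2, {"pkg1 at c4", "guard at c3", "pkg2 at c4"})
miss = memos.prunes(2, {"pkg1 at c4"})
other_level = memos.prunes(3, {"pkg1 at c4", "guard at c3"})
```

Under the subset reading, a smaller memoset prunes more sub-problems than a larger one.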
Memoset Propagation
• Memosets are propagated forwards.
• If a parent sub-problem has no child leading to a solution, the propagated information is used to create a memoset for it.
[Figure: goal sub-problem]
A Weakness of Classical Planning
• Inability to compromise:
  • All goals must be satisfied.
  • Applicability of an operator in a particular situation is Boolean.
• Solution:
  • Introduce flexibility into planning.
  • Support compromise.
Flexible Planning Problems
• Incorporate preferences into operators and goals.
  • Describe both as fuzzy relations.
• Map from precondition combinations onto L, a totally ordered satisfaction scale.
• Load-truck:
  • Truck and (valuable) package in same place: l1
  • Guard also present: l2
• Can relax some preconditions with associated damage to the satisfaction degree of resultant plan.
Truth Degree
• From a scale, K. Endpoints indicate total truth/falsehood.
• Attached to each proposition.
• For example, can express how valuable a package is:
  • valuable pkga k⊥. Equivalent to ¬(valuable pkga).
  • valuable pkgb k1, valuable pkgb k2…
• Operators and goals can identify ranges of acceptable truth degrees in their preconditions.
Flexible Plan Quality
• Plan satisfaction degree is the combination of all action/goal satisfaction degrees.
  • Via min.
• Plan quality:
  • Length combined with plan satisfaction degree.
  • With same satisfaction, shorter preferred.
• Trade length of plan against number and severity of compromises made.
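A minimal sketch of min aggregation and the length tie-break. It assumes satisfaction degrees are encoded as integers (higher is better) and that plans are ranked by satisfaction first and length second, which is only one possible way of combining the two. The degree lists and function names are illustrative, not taken from the slides.

```python
def plan_satisfaction(degrees):
    # min aggregation: a plan is only as good as its worst compromise.
    return min(degrees)


def quality_key(plan):
    steps, degrees = plan
    # Higher satisfaction first; for equal satisfaction, fewer steps wins.
    return (plan_satisfaction(degrees), -steps)


# (number of steps, satisfaction degrees of the actions and goals used)
short_plan = (4, [2, 3, 1, 2])
long_plan = (6, [2, 3, 3, 3, 3, 2])

best = max([short_plan, long_plan], key=quality_key)
```

A flexible planner would typically report both plans and leave the trade-off to the user; the key above is only used to order them.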
Flexible Example
• Flexible goals:
  • Both packages to c4.
    • pkg2 is worth less, don’t deliver: l1
  • Guard to c3.
    • Can also leave guard at c2 or c4: l2
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
L = {l⊥, l1, l2, l⊤}
Flexible Example
• Operators:
  • Drive-truck.
    • Avoid mountains or: l1
  • Load/Unload-truck.
    • For valuable package, guard present or: l2
  • Guard-boards/leaves-truck.
[Figure: map of cities c1, c2, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
L = {l⊥, l1, l2, l⊤}
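The flexible operators above can be sketched as functions from precondition combinations onto the scale L. This assumes an integer encoding of L, that a hard precondition failure maps to l⊥, and that a fully satisfied operator maps to l⊤; the enum `L` and both function names are hypothetical.

```python
from enum import IntEnum


class L(IntEnum):
    """Totally ordered satisfaction scale (assumed integer encoding)."""
    BOTTOM = 0
    L1 = 1
    L2 = 2
    TOP = 3


def drive_truck_degree(road_exists, mountainous):
    if not road_exists:
        return L.BOTTOM  # hard precondition fails
    return L.L1 if mountainous else L.TOP  # mountain roads are a compromise


def load_truck_degree(same_place, valuable, guard_present):
    if not same_place:
        return L.BOTTOM  # truck and package must be co-located
    if valuable and not guard_present:
        return L.L2  # relaxed precondition: no guard for a valuable package
    return L.TOP


unguarded_load = load_truck_degree(True, valuable=True, guard_present=False)
guarded_load = load_truck_degree(True, valuable=True, guard_present=True)
mountain_drive = drive_truck_degree(True, mountainous=True)
```

Under these assumptions, loading the valuable pkg1 without the guard yields l2 and driving over a mountainous road yields l1.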
Flexible Planning Graph
• Actions annotated with their satisfaction degrees.
• CSP variable domains expressed as unary fuzzy constraints.
  • Prefer to assign an element with l⊤, then l⊤-1, …
[Figure: planning graph levels Actions1 and Propositions1, with actions annotated l1, l2, l3]
Finding Valid Flexible Plans: Flexible Graphplan
• Same basic process.
• Sub-problems are now fuzzy CSPs.
• Overall search is branch and bound:
  • Find a plan with a higher satisfaction degree than the highest currently known.
[Figure: goal sub-problem]
Short Compromise Plan
1. Load-truck pkg1 truck: l2.
2. Drive-truck truck c1 to c2 via r1: l⊤.
3. Drive-truck truck c2 to c4 via m2: l1.
4. Unload-truck pkg1 truck: l2.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
L = {l⊥, l1, l2, l⊤}
4 steps (l1)
Longer Plan, Fewer Compromises
1. Load-truck pkg1 truck: l2.
2. Drive-truck truck c1 to c2 via r1: l⊤.
3. Load-truck pkg2 truck: l⊤.
4. Drive-truck truck c2 to c3 via r2: l⊤.
5. Drive-truck truck c3 to c4 via r3: l⊤.
6. Unload-truck pkg1 truck: l2, pkg2 truck: l⊤.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
L = {l⊥, l1, l2, l⊤}
6 steps (l2)
Limited Graph Expansion
• A plan with satisfaction degree li has been found.
• Because of min aggregation:
  • A plan with a higher satisfaction degree than li cannot contain any action with satisfaction degree ≤ li.
• When expanding the graph, do not add actions with satisfaction degrees ≤ li.
• Effect:
  • Reduce size of planning graph/sub-problems.
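A minimal sketch of the filter this implies, assuming integer-encoded satisfaction degrees and candidate actions given as (name, degree) pairs. The function name and the sample actions are hypothetical.

```python
def limited_expansion(candidates, best_found):
    """Keep only actions that could appear in a strictly better plan."""
    # Under min aggregation, any action at or below the best known degree
    # caps the plan at that degree, so it cannot improve on it.
    return [(name, degree) for name, degree in candidates if degree > best_found]


candidates = [
    ("Drive via r1", 3),
    ("Drive via m2", 1),
    ("Load pkg1, no guard", 2),
]

# A plan of degree 1 is already known, so degree-1 actions are dropped.
kept = limited_expansion(candidates, best_found=1)
```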
Satisfaction Degree Propagation
• Action2 has a single precondition, an effect of Action1.
• Only way to support selection of Action2 at level a+1 is by also selecting Action1 at level a.
• If known when solving the sub-problem at level a+1, can possibly prune the branch earlier.
• So propagate sat degrees forwards as the graph is expanded.
[Figure: Action1 (l1) at level a and Action2 at level a+1]
Satisfaction Degree Propagation
• Stage 1:
  • Label proposition nodes with max sat degree of those attached to all actions that assert it as an effect.
[Figure: Action1 at level a and Action3 at level a+1, with labels l1, l2, l2]
Satisfaction Degree Propagation
• Stage 2:
  • Action satisfaction degree = min(own sat degree, min(sat degrees attached to each precondition)).
[Figure: Action1 at level a and Action3 at level a+1, with labels l1, l2, l2, l2]
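The two stages can be sketched for a single graph level as follows, assuming integer-encoded degrees and actions given as (name, own degree, preconditions, effects) tuples. The function `propagate_level` and the tiny example graph are hypothetical; persistence (no-op) actions would have to be listed explicitly.

```python
def propagate_level(prop_labels, actions):
    """Return (propagated action degrees, labels for the next propositions)."""
    action_degrees = {}
    next_labels = {}
    for name, own, preconditions, effects in actions:
        # Stage 2: cap the action's own degree by its weakest precondition.
        degree = min([own] + [prop_labels[p] for p in preconditions])
        action_degrees[name] = degree
        # Stage 1 for the next level: a proposition takes the best degree
        # among the actions that assert it.
        for effect in effects:
            next_labels[effect] = max(next_labels.get(effect, degree), degree)
    return action_degrees, next_labels


TOP = 3
labels_a = {"q": TOP}
level_a = [("Action1", 1, ["q"], ["p"]), ("Action2", 2, ["q"], ["p"])]
degrees_a, labels_next = propagate_level(labels_a, level_a)

level_next = [("Action3", TOP, ["p"], ["g"])]
degrees_next, _ = propagate_level(labels_next, level_next)
```

In this example proposition p is labelled 2, the better of its two supporters, and Action3 can therefore never take part in a plan of degree higher than 2 even though its own degree is the top of the scale.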
Results: FGP vs Boolean Solving
[Figure: time (ms, 0 to 30000) against problem number (1 to 12) for l1, l2, l⊤ and Boolean]
• Short compromise plans can often be found very quickly.
Utility of Limited Graph Expansion/Satisfaction Propagation
[Figure: time (ms, 0 to 40000) against problem number (1 to 12) for FGP, FGP+LGE, FGP+SP and FGP+LGE+SP]
• Limited Graph Expansion and Satisfaction Propagation are complementary.
Flexible Graphplan: Observations
• It is more expensive to search for a range of plans than for one compromise-free plan.
• But it is often possible to find short compromise plans quickly.
  • Supports anytime behaviour.
• The range of plans trades length against the number and severity of the compromises made.
Drowning and Leximin Ordering
• Low satisfaction degree from one action:
  • Drowns the others because of min aggregation.
• Leximin ordering:
  • Sort satisfaction degree vector associated with a solution.
  • Compare lexicographically:
    • {l2, l3, l3} >lex {l2, l2, l3}
• Find compromise plans that min-based search misses.
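A minimal sketch of the leximin comparison, assuming integer-encoded degrees and vectors of equal length; how vectors of different lengths are handled is not stated on the slide, so this sketch simply rejects them. The function names are hypothetical.

```python
def leximin_key(degrees):
    # Sort ascending so the worst compromises are compared first.
    return sorted(degrees)


def leximin_better(a, b):
    """True if vector a is strictly preferred to vector b under leximin."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares vectors of equal length")
    # Python compares lists lexicographically, element by element.
    return leximin_key(a) > leximin_key(b)


first = [3, 2, 3]   # {l2, l3, l3} in the slide's notation
second = [2, 3, 2]  # {l2, l2, l3}

prefers_first = leximin_better(first, second)
tied_under_min = min(first) == min(second)
```

Both vectors have the same minimum, so min aggregation cannot separate them, while leximin prefers the first because its second-worst degree is higher.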
Solution (Leximin)
1. Load-truck pkg1 truck: l2.
2. Drive-truck truck c1 to c2 via r1: l⊤.
3. Load-truck pkg2 truck: l⊤.
4. Drive-truck truck c2 to c4 via m2: l1.
5. Unload-truck pkg1 truck: l2, pkg2 truck: l⊤.
[Figure: map of cities c1, c2, c3, c4 with roads r1, r2, r3, m1, m2, packages pkg1, pkg2 and guard1]
L = {l⊥, l1, l2, l⊤}
5 steps (l1, l2, l2, l2)
Also finds 2 more plans that FGP misses.
Finding Leximin-optimal Plans: Leximin FGP
• Again, same basic search process.
• Sub-problems are now leximin fuzzy CSPs.
• Overall search is branch and bound:
  • Find a plan with a higher satisfaction degree vector than the highest currently known.
[Figure: goal sub-problem]
Enhancements
• Limited graph expansion works in the same way.
• Must now propagate satisfaction degree vectors.
• Stage 1:
  • Label proposition nodes with max sat degree vector of those attached to all actions that assert it as an effect.
[Figure: Action1 at level a and Action3 at level a+1, with vectors {l1, l2}, {l2, l3}, {l2, l3} and {l1}]
Satisfaction Degree Vector Propagation
• Stage 2:
  • Action satisfaction degree vector combines the satisfaction degree of this action, and
  • Satisfaction degree vectors of each precondition proposition.
[Figure: Action1 and Action2 at level a and Action3 at level a+1, with vectors {l1, l2}, {l2, l3}, {l2, l3} and {l1, l2, l3}]
Removing Duplicates
• Not correct: {l1, l1, l3}
  • Only one instance of l1 is guaranteed by selecting Action2.
• Solution: make satisfaction degrees unique.
  • Composite objects referring to the action that created them.
  • Simple matter to remove duplicates.
[Figure: Action1 at level a and Action2 at level a+1, with vectors {l1}, {l1}, {l1} and {?, l3}]
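One way to sketch the "composite object" idea is to tag each degree with the action that produced it and combine vectors by set union, so a degree inherited along two paths is counted once. The tuple representation, the function names and the assumption that Action1 supports two preconditions of Action2 are all hypothetical.

```python
def action_vector(action_id, own_degree, precondition_vectors):
    """Combine an action's own tagged degree with its preconditions' vectors."""
    tagged = {(action_id, own_degree)}
    for vector in precondition_vectors:
        tagged |= vector  # set union drops repeated (action, degree) pairs
    return frozenset(tagged)


def degrees(vector):
    # Strip the tags and sort, ready for a leximin comparison.
    return sorted(degree for _, degree in vector)


# Action1 (degree l1 = 1) asserts two propositions, p and q.
action1 = action_vector("Action1@a", 1, [])
p_label = action1
q_label = action1

# Action2 (degree l3 = 3) needs both p and q.
action2 = action_vector("Action2@a+1", 3, [p_label, q_label])

naive = sorted([3] + degrees(p_label) + degrees(q_label))
deduplicated = degrees(action2)
```

Plain concatenation counts Action1's l1 twice, whereas the tagged union keeps a single instance, matching the slide's point that only one l1 is guaranteed.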
Results: BBFGP vs. LFGP
[Figure: Logs8. Time (ms, log scale, 1 to 10000) against plan number (1 to 7) for BBFGP, LFGP and LFGP*]
• More compromise plans found effectively.
Results: BBFGP vs. LFGP
[Figure: Logs10. Time (ms, log scale, 1 to 100000) against plan number (1 to 7) for BBFGP, LFGP and LFGP*]
• Effectiveness of satisfaction vector propagation problem-dependent.
Results: BBFGP vs. LFGP
[Figure: Logs15. Time (ms, log scale, 1 to 10000) against plan number (1 to 9) for BBFGP, LFGP and LFGP*]
• Larger |L| can mean many possible compromises.
Results: Flexible Logistics
[Figure: Logs1-15. Sub-problems (log scale, 1 to 100000) against plan number (1 to 15) for BBFGP, LFGP and LFGP*]
• Explains time difference.
• LFGP* in particular is solving many more sub-problems.
Utility of Satisfaction Degree Vector Propagation
• Never degrades performance.
  • Overhead of propagation and duplicate removal is compensated for by performance gain.
• Sometimes hugely improves performance:
  • 17 times is best result so far.
  • Propagation allows branches of search to be pruned much earlier.
Leximin FGP: Observations
• More costly than FGP (efficiency is being improved).
• But effectively produces a greater range of compromise solutions.
  • Removes drowning.
  • Not limited by size of L.
    • In min version, can only be one plan of sat degree l1, one of sat degree l2…
Conclusions
• Flexible planning overcomes the inability to compromise in classical AI planning.
• Flexible planners produce a range of solutions from a given input problem, from which the user can select.
  • Trade length versus compromises made.
• FGP and LFGP planners effectively solve these problems using hierarchical decomposition of the planning graph.
Related Work
• Conformant Planning:
  • Knowledge about possible initial states, and possible outcomes of each action.
• Contingent Planning:
  • Sensing actions to detect the state of the world during execution.
• Numerically weighted constraints:
  • Quantitative means of differentiating plans.
• Pyrrhus:
  • Replaces goal formulae with utility models.
  • Does not associate utilities with individual actions.
Future Work
• Reasoning about:
  • Time.
  • Resources.
• Truth degrees go some way towards this.
• Further efficiency improvements:
  • Improve quality of memoisation.
  • Smaller memosets are better.
Acknowledgements
• Qiang Shen, University of Edinburgh.
• Peter Jarvis, SRI International.
Resources
• www-users.cs.york.ac.uk/~ianm/FlexiblePlanning.html
  • FGP, LFGP.
  • Example problems.
  • References.