Flexible Planning

Ian Miguel

AI Group

Department of Computer Science

University of York

AI Planning

• Plan: Course of action to achieve pre-specified goals.

• Components of a planning problem:
  • Plan objects.
  • Initial state.
  • Goal state.
  • Operators.

Example – Initial State

• Goals:
  • Both packages to c4.
  • Guard to c3.

• pkg1 is valuable.
• pkg2 is not.

[Figure: map of cities c1 to c4 joined by roads r1, r2, r3, m1 and m2, showing pkg1, pkg2 and guard1.]

• ci: Cities.

• ri: Major roads.

• mi: Mountainous roads.

Example - Operators

• Load-truck (guard present if pkg valuable).
• Unload-truck (guard present if pkg valuable).
• Guard-boards-vehicle.
• Guard-leaves-vehicle.
• Drive-truck (main roads only).


Example – Solution

1. Drive-truck c1 to c2 via r1.

2. Load-truck pkg2.

Guard-boards-vehicle truck.

4. Load-truck pkg2.

8. Unload-truck pkg1.

Unload-truck pkg2.

10. Guard-leaves-vehicle truck.


Solving Planning Problems

• Many and varied approaches (see [Weld99]).

• Focus here on Graphplan [Blum, Furst 97].
  • Sound/complete.
  • Optimal in the number of actions/length of plan.
  • Constructs a planning graph, of which a valid plan is a sub-graph.
  • Easy to translate the search for a consistent sub-graph into a constraint satisfaction problem.

The Planning Graph

• Divided into levels (equivalent to a step).
  • Each contains action and proposition nodes.
  • Level 0 contains propositions that capture the initial problem state.
  • Graph extended by instantiating operators whose preconditions met by propositions in previous level.

[Figure: planning graph layers: Initial Conditions, Actions 1, Propositions 1, ..., Goals.]
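The level-by-level extension described on this slide can be sketched roughly as follows. This is a minimal illustration rather than Graphplan's actual implementation: the `Action` record, the `expand_level` function and the example propositions are hypothetical names, delete effects are ignored, and propositions are simply carried forward in place of explicit no-op actions.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    effects: FrozenSet[str]


def expand_level(
    propositions: Set[str], operators: List[Action]
) -> Tuple[List[Action], Set[str]]:
    """Build one action level and the proposition level that follows it."""
    # An operator instance is added when all of its preconditions
    # appear in the previous proposition level.
    applicable = [a for a in operators if a.preconditions <= propositions]
    # Existing propositions persist; effects of applicable actions are added.
    next_propositions = set(propositions)
    for action in applicable:
        next_propositions |= action.effects
    return applicable, next_propositions


# Hypothetical ground operators and initial state for illustration only.
operators = [
    Action("Drive-truck c1 to c2", frozenset({"truck at c1"}),
           frozenset({"truck at c2"})),
    Action("Load-truck pkg at c2", frozenset({"truck at c2", "pkg at c2"}),
           frozenset({"pkg in truck"})),
]
level0 = {"truck at c1", "pkg at c2"}
actions1, props1 = expand_level(level0, operators)
actions2, props2 = expand_level(props1, operators)
```

In this toy run the load action only becomes available at the second level, once "truck at c2" has appeared in the first proposition level.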

Mutual Exclusion Constraints

• Record that a pair of actions or propositions cannot occur together in this level of a valid plan.

• Restricts the set of sub-graphs that must be considered.

Mutual Exclusion Constraints

• Exclusive actions:
  • Inconsistent effects:
    • Drive-truck c1 to c2 vs. Drive-truck c2 to c3.
    • 1st action: truck at c2. 2nd action: truck not at c2.
  • Interference:
    • Between an effect and a precondition.
    • Drive-truck c1 to c2 vs. Load-truck at c1.
    • Truck no longer at c1 after 1st action.
  • Competing needs:
    • Between preconditions.
    • Drive-truck c1 to c3 vs. Guard-boards-truck at c2.
    • Truck cannot be at c1 and c2 at the same time.
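The three action-exclusion tests can be written down compactly. The sketch below assumes a STRIPS-style encoding with add and delete lists; the `Action` record, the helper `drive` and the proposition strings are hypothetical and chosen only to mirror the slide's examples.

```python
from typing import FrozenSet, NamedTuple, Set


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # preconditions
    add: FrozenSet[str]     # propositions made true
    delete: FrozenSet[str]  # propositions made false


def inconsistent_effects(a: Action, b: Action) -> bool:
    """One action deletes a proposition that the other adds."""
    return bool(a.add & b.delete or b.add & a.delete)


def interference(a: Action, b: Action) -> bool:
    """One action deletes a precondition of the other."""
    return bool(a.delete & b.pre or b.delete & a.pre)


def competing_needs(a: Action, b: Action, mutex_props: Set[FrozenSet[str]]) -> bool:
    """Some pair of preconditions is itself mutually exclusive."""
    return any(frozenset({p, q}) in mutex_props for p in a.pre for q in b.pre)


def drive(src: str, dst: str) -> Action:
    at_src, at_dst = f"truck at {src}", f"truck at {dst}"
    return Action(f"Drive-truck {src} to {dst}",
                  frozenset({at_src}), frozenset({at_dst}), frozenset({at_src}))


load_c1 = Action("Load-truck at c1", frozenset({"truck at c1", "pkg at c1"}),
                 frozenset({"pkg in truck"}), frozenset({"pkg at c1"}))
board_c2 = Action("Guard-boards-truck at c2",
                  frozenset({"truck at c2", "guard at c2"}),
                  frozenset({"guard in truck"}), frozenset({"guard at c2"}))
# Assumed to have been derived already at the previous proposition level.
mutex_props = {frozenset({"truck at c1", "truck at c2"})}
```

With these definitions, each of the slide's three pairs triggers exactly the test it is listed under.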

Mutual Exclusion Constraints

• Exclusive propositions:
  • Negation:
    • Truck at c2 vs. ¬(Truck at c2).
  • Inconsistent support:
    • Every way of supporting proposition a is exclusive of every way of supporting proposition b.

Finding a Valid Plan

• Search for a consistent sub-graph connecting goal propositions and initial conditions.

• If no such sub-graph, expand planning graph by one level and try again.

• Approaches:
  • Translate into a propositional satisfiability (SAT) problem, use a specialised SAT solver.
  • Translate into a constraint satisfaction problem.
    • Either one large problem, or connected sub-problems.

The Constraint Satisfaction Problem

• Given:
  1. A finite set of variables.
  2. Each variable has an associated finite domain of potential values.
  3. A set of constraints over these variables. Constraints specify allowed combinations of assignments of values to variables.

• Find:
  • A complete assignment of values to variables that satisfies all constraints.
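To make the definition concrete, here is a minimal chronological backtracking sketch over variables, finite domains and constraints. It is a generic textbook-style solver, not the one used by Graphplan or FGP; the function names and the toy problem are illustrative assumptions.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A constraint is a scope (variable names) plus a predicate over their values.
Constraint = Tuple[Sequence[str], Callable[..., bool]]


def consistent(assignment: Dict[str, int], constraints: List[Constraint]) -> bool:
    """Check every constraint whose variables are all assigned."""
    for scope, allowed in constraints:
        if all(v in assignment for v in scope):
            if not allowed(*(assignment[v] for v in scope)):
                return False
    return True


def solve(variables: List[str], domains: Dict[str, List[int]],
          constraints: List[Constraint],
          assignment: Optional[Dict[str, int]] = None) -> Optional[Dict[str, int]]:
    """Return a complete consistent assignment, or None if there is none."""
    assignment = dict(assignment or {})
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment, constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None


variables = ["x", "y"]
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
constraints = [
    (("x", "y"), lambda x, y: x < y),
    (("x", "y"), lambda x, y: x + y == 5),
]
solution = solve(variables, domains, constraints)
unsolvable = solve(variables, domains, constraints + [(("x", "y"), lambda x, y: x > y)])
```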

The CSP Viewpoint

• Variables: proposition nodes.
• Domains: actions that assert these propositions as effects.
• Each sub-problem is a small CSP.
• Goals and their domains form a first sub-problem.
• Action preconditions specify new sub-problems…

[Figure: diagram labelled "Goal" and "Sub-problem".]

Memoisation

• Goal: Solve as few sub-problems as possible.

• Generate memosets from unsolvable sub-problems.
  • New sets of mutually exclusive propositions.
  • If a memoset matches the propositions of a sub-problem, prune the search branch immediately.

Memoset Propagation

• Memosets are propagated forwards.

• If parent sub-problem has no child leading to a solution, propagated information used to create a memoset for it.


A Weakness of Classical Planning

• Inability to compromise:
  • All goals must be satisfied.
  • Applicability of an operator in a particular situation is Boolean.

• Solution:
  • Introduce flexibility into planning.
  • Support compromise.

Flexible Planning Problems

• Incorporate preferences into operators and goals.
  • Describe both as fuzzy relations.

• Map from precondition combinations onto L, a totally ordered satisfaction scale.

• Load-truck:
  • Truck and (valuable) package in same place: l1
  • Guard also present: l2

• Can relax some preconditions with associated damage to the satisfaction degree of resultant plan.
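As a rough illustration of an operator described as a mapping from precondition combinations onto a satisfaction scale, the sketch below encodes the Load-truck case from this slide. Encoding the scale as increasing integers, returning a bottom degree when the truck and package are in different places, and the function name itself are all assumptions made for the example.

```python
# Hypothetical encoding of a totally ordered satisfaction scale.
L_BOTTOM, L1, L2 = 0, 1, 2


def load_valuable_package(truck_at: str, pkg_at: str, guard_at: str) -> int:
    """Satisfaction degree of loading a valuable package onto the truck."""
    if truck_at != pkg_at:
        # Assumed hard precondition: the action is not applicable at all.
        return L_BOTTOM
    if guard_at != truck_at:
        # Guard precondition relaxed: applicable, but a compromise.
        return L1
    # Truck, package and guard are all in the same place.
    return L2
```

Calling `load_valuable_package("c2", "c2", "c1")` yields the lower degree `L1`, which is then carried into the satisfaction degree of any plan using that action.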

Truth Degree

• From a scale, K. Endpoints indicate total truth/falsehood.

• Attached to each proposition.
  • For example, can express how valuable a package is:

• valuable pkga k⊥: equivalent to ¬(valuable pkga).

• valuable pkgb k, valuable pkgb k…

• Operators and goals can identify ranges of acceptable truth degrees in their preconditions.

Flexible Plan Quality

• Plan satisfaction degree is combination of all action/goal satisfaction degrees.
  • Via min.

• Plan quality:
  • Length combined with plan satisfaction degree.
  • With same satisfaction, shorter preferred.

• Trade length of plan against number and severity of compromises made.
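A small sketch of these two ideas: min aggregation of satisfaction degrees, and a comparison that prefers shorter plans at equal satisfaction. The integer encoding of the scale and the `dominates` test (a Pareto-style comparison on length and satisfaction) are assumptions for illustration; the slides do not specify how length and satisfaction are combined beyond the bullets above.

```python
from typing import Sequence, Tuple

# Hypothetical integer encoding of a four-element satisfaction scale.
L_BOTTOM, L1, L2, L_TOP = range(4)


def plan_satisfaction(degrees: Sequence[int]) -> int:
    """Aggregate action/goal satisfaction degrees via min."""
    return min(degrees)


def dominates(p: Tuple[int, int], q: Tuple[int, int]) -> bool:
    """p and q are (length, satisfaction); True if p is at least as good
    on both criteria and the two plans are not identical in quality."""
    (len_p, sat_p), (len_q, sat_q) = p, q
    return sat_p >= sat_q and len_p <= len_q and (len_p, sat_p) != (len_q, sat_q)


# A four-action plan with one low-degree action.
short_plan = [L2, L_TOP, L1, L2]
short_quality = (len(short_plan), plan_satisfaction(short_plan))
long_quality = (6, L2)
```

Here neither `short_quality` nor `long_quality` dominates the other, which is the kind of trade-off between length and compromise that a flexible planner leaves to the user.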

Flexible Example

• Flexible goals:
  • Both packages to c4.
    • pkg2 is worth less, don’t deliver: l1
  • Guard to c3.
    • Can also leave guard at c2 or c4: l2

L = {l⊥, l1, l2, lT}

Flexible Example

• Operators:
  • Drive-truck.
    • Avoid mountainous roads, otherwise: l1
  • Load/Unload-truck.
    • For valuable package, guard present, otherwise: l2
  • Guard-boards/leaves-truck.


Flexible Planning Graph

• Actions annotated with their satisfaction degrees.
• CSP variable domains expressed as unary fuzzy constraints.
  • Prefer to assign an element with lT, then lT-1, …

[Figure: planning graph layers Actions 1, Propositions 1, ..., with action nodes annotated l2, l3 and l1.]

Finding Valid Flexible Plans: Flexible Graphplan

• Same basic process.

• Sub-problems are now fuzzy CSPs.

• Overall search is branch and bound:
  • Find a plan with higher satisfaction degree than highest currently known.

Short Compromise Plan

1. Load-truck pkg1 truck l2.

2. Drive-truck truck c1 to c2 via r1 lT.

3. Drive-truck truck c2 to c4 via m2 l1.

4. Unload-truck pkg1 truck l2.

Result: 4 steps, satisfaction degree l1.

Longer Plan, Fewer Compromises



3. Load-truck pkg2 truck lT.



6. Unload-truck pkg1 truck l2, pkg2 truck lT.

Result: 6 steps, satisfaction degree l2.

Limited Graph Expansion

• A plan with satisfaction degree li has been found.

• Because of min aggregation:
  • A plan with a higher satisfaction degree than li cannot contain any action with satisfaction degree ≤ li.

• When expanding the graph, do not add actions with satisfaction degrees ≤ li.

• Effect:
  • Reduce size of planning graph/sub-problems.
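The pruning rule follows directly from min aggregation, and can be sketched as a filter applied while expanding the graph. The integer scale, the function name and the candidate list are hypothetical; only the rule itself (discard any action whose degree does not exceed the best plan found so far) comes from the slide.

```python
from typing import List, Tuple

# Hypothetical integer encoding of the satisfaction scale.
L_BOTTOM, L1, L2, L_TOP = range(4)


def expandable_actions(candidates: List[Tuple[str, int]],
                       best_found: int) -> List[Tuple[str, int]]:
    """Keep only actions that could appear in a strictly better plan.

    Under min aggregation a plan is no better than its worst action, so an
    action whose degree is <= best_found cannot be part of an improvement.
    """
    return [(name, degree) for name, degree in candidates if degree > best_found]


candidates = [
    ("Drive-truck c2 to c4 via m2", L1),
    ("Load-truck pkg1 (no guard)", L2),
    ("Drive-truck c1 to c2 via r1", L_TOP),
]
kept = expandable_actions(candidates, best_found=L1)
```

Once a plan at the top of the scale is known, the filter rejects every candidate and no further expansion is needed.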

Satisfaction Degree Propagation

• Action2 has single precondition, effect of Action1.
• Only way to support selection of Action2 at level a+1 is by also selecting Action1 at level a.
• If known when solving sub-problem at level a+1, can possibly prune branch earlier.
• So propagate sat degrees forwards as graph expanded.

[Figure: Action1 at level a and Action2 at level a+1, with satisfaction degree l1 shown.]

Satisfaction Degree Propagation

• Stage 1:
  • Label proposition nodes with max sat degree of those attached to all actions that assert it as an effect.

Satisfaction Degree Propagation

• Stage 2:
  • Action satisfaction degree = min(own sat degree, min(sat degrees attached to each precondition)).
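The two stages can be sketched as one function applied per graph level: actions first take the min of their own degree and their preconditions' labels (stage 2), then each proposition is labelled with the max over the actions asserting it (stage 1). The tuple layout, the function name and the integer scale are assumptions, and persistence (no-op) actions would need to be passed in explicitly.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical integer encoding of the satisfaction scale.
L_BOTTOM, L1, L2, L_TOP = range(4)

# (name, own satisfaction degree, preconditions, effects)
ActionSpec = Tuple[str, int, FrozenSet[str], FrozenSet[str]]


def propagate_level(actions: List[ActionSpec], prop_labels: Dict[str, int]):
    """Propagate satisfaction degrees across one level of the graph."""
    action_labels: Dict[str, int] = {}
    for name, own, pre, _eff in actions:
        # Stage 2: an action is capped by its own degree and by the
        # labels of every precondition proposition.
        action_labels[name] = min([own] + [prop_labels[p] for p in pre])
    next_labels: Dict[str, int] = {}
    for name, _own, _pre, eff in actions:
        for p in eff:
            # Stage 1: a proposition takes the best label among the
            # actions that assert it as an effect.
            next_labels[p] = max(next_labels.get(p, L_BOTTOM), action_labels[name])
    return action_labels, next_labels


action1 = ("Action1", L1, frozenset({"p0"}), frozenset({"p1"}))
action2 = ("Action2", L_TOP, frozenset({"p1"}), frozenset({"p2"}))
labels_a, props_a = propagate_level([action1], {"p0": L_TOP})
labels_a1, props_a1 = propagate_level([action2], props_a)

# Two actions asserting the same proposition: the label is the max.
alt = ("Alt", L2, frozenset({"p0"}), frozenset({"p1"}))
_, props_alt = propagate_level([action1, alt], {"p0": L_TOP})
```

Although Action2's own degree is the top of the scale, propagation reveals that selecting it can never yield better than l1, so a branch requiring more can be pruned at level a+1.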

Results: FGP vs Boolean Solving

[Figure: time (ms, 0 to 30000) against problem number (1-12); series: l1, l2, lT, Boolean.]

• Short compromise plans can often be found very quickly.

Utility of Limited Graph Expansion/Satisfaction Propagation

[Figure: time (ms, 0 to 40000) against problem number (1-12); series: FGP, FGP+LGE, FGP+SP, FGP+LGE+SP.]

• Limited graph expansion and satisfaction propagation are complementary.

Flexible Graphplan: Observations

• It is more expensive to search for a range of plans than for one compromise-free plan.

• But it is often possible to find short, compromise plans quickly.
  • Supports anytime behaviour.

• Range of plans trade length versus number and severity of the compromises made.

Drowning and Leximin Ordering

• Low satisfaction degree from one action:
  • Drowns the others because of min aggregation.

• Leximin ordering:
  • Sort satisfaction degree vector associated with a solution.
  • Compare lexicographically:
    • {l2, l3, l3} >lex {l2, l2, l3}

• Find compromise plans that min based search misses.
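The leximin comparison amounts to sorting each vector in ascending order and comparing the results lexicographically. The sketch below assumes integer-encoded degrees and vectors of equal length; how vectors of different lengths are handled is not stated on the slide, so the function rejects them rather than guess.

```python
from typing import Sequence

# Hypothetical integer encoding: l1 < l2 < l3.
L1, L2, L3 = 1, 2, 3


def leximin_better(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if vector a is strictly preferred to b under leximin."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length vectors")
    # Python compares lists lexicographically, element by element.
    return sorted(a) > sorted(b)


first = [L3, L2, L3]
second = [L2, L3, L2]
```

Both example vectors have the same minimum, so min aggregation cannot separate them, whereas `leximin_better(first, second)` is true because the second-worst degree differs.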

Solution (Leximin)



3. Load-truck pkg2 truck lT.

4. Drive-truck truck c2 to c4 via m2 l1.

5. Unload-truck pkg1 truck l2, pkg2 truck lT.

Result: 5 steps, satisfaction degree vector (l1, l2, l2, l2). Also finds 2 more plans that FGP misses.

Finding Leximin-optimal Plans: Leximin FGP

• Again, same basic search process.
• Sub-problems are now leximin fuzzy CSPs.
• Overall search is branch and bound:
  • Find a plan with higher satisfaction degree vector than highest currently known.

Enhancements

• Limited graph expansion works in the same way.
• Must now propagate satisfaction degree vectors.
• Stage 1:
  • Label proposition nodes with max sat degree vector of those attached to all actions that assert it as an effect.

[Figure: nodes labelled with satisfaction degree vectors {l1, l2}, {l2, l3}, {l2, l3} and {l1}.]

Satisfaction Degree Vector Propagation

• Stage 2:
  • Action satisfaction degree vector combines:
    • Satisfaction degree of this action.
    • Satisfaction degree vectors of each precondition proposition.

[Figure: Action2 and nodes labelled with vectors {l1, l2}, {l2, l3}, {l2, l3} and {l1, l2, l3}.]

Removing Duplicates

• Not correct: {l1, l1, l3}
  • Only one instance of l1 is guaranteed by selecting Action2.

• Solution: make satisfaction degrees unique.
  • Composite objects referring to the action that created them.
  • Simple matter to remove duplicates.
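One way to picture the composite objects is as (degree, source action) pairs held in sets, so that a degree reaching an action along two paths from the same source is counted once. This is only a sketch under that assumption; the tagging scheme, the names and the integer scale are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical integer encoding: l1 < l2 < l3.
L1, L2, L3 = 1, 2, 3

# A tagged degree records which action produced it.
Tagged = Tuple[int, str]


def combine(own: Tagged, precondition_vectors: Iterable[FrozenSet[Tagged]]) -> FrozenSet[Tagged]:
    """Union the action's own tagged degree with its preconditions' vectors."""
    combined = {own}
    for vector in precondition_vectors:
        combined |= vector
    return frozenset(combined)


def degrees(vector: FrozenSet[Tagged]) -> List[int]:
    """Drop the tags and return the sorted degree vector."""
    return sorted(degree for degree, _source in vector)


# Two precondition propositions, both supported by the same Action1.
p = frozenset({(L1, "Action1")})
q = frozenset({(L1, "Action1")})
action2_vector = combine((L3, "Action2"), [p, q])
naive = sorted([L3] + [d for vec in (p, q) for d, _ in vec])
```

The naive concatenation produces two copies of l1, while the tagged union keeps the single instance that selecting Action2 actually guarantees.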

Results: BBFGP vs. LFGP

[Figure: Logs8. Time (ms, log scale from 1 to 10000) against plan number (1-7); series: BBFGP, LFGP, LFGP*.]

• More compromise plans found effectively.

Results: BBFGP vs. LFGP

[Figure: Logs10. Time (ms, log scale from 1 to 100000) against plan number (1-7); series: BBFGP, LFGP, LFGP*.]

• Effectiveness of satisfaction vector propagation problem-dependent.

Results: BBFGP vs. LFGP

[Figure: Logs15. Time (ms, log scale from 1 to 10000) against plan number (1-9); series: BBFGP, LFGP, LFGP*.]

• Larger |L| can mean many possible compromises.

Results: Flexible Logistics

[Figure: Logs1-15. Sub-problems (log scale from 1 to 100000) against plan number (1-15); series: BBFGP, LFGP, LFGP*.]

• Explains time difference.
  • LFGP* in particular is solving many more sub-problems.

Utility of Satisfaction Degree Vector Propagation

• Never degrades performance.
  • Overhead of propagation and duplicate removal is compensated for by performance gain.

• Sometimes hugely improves performance:
  • 17 times is best result so far.
  • Propagation allows branches of search to be pruned much earlier.

Leximin FGP: Observations

• More costly than FGP (efficiency is being improved).
• But, effectively produces a greater range of compromise solutions.
  • Removes drowning.
  • Not limited by size of L.
    • In min version, can only be one plan of sat degree l1, one of sat degree l2…

Conclusions

• Flexible planning overcomes the inability to compromise in classical AI planning.

• Flexible planners produce a range of solutions from a given input problem from which the user can select.
  • Trade length versus compromises made.

• FGP and LFGP planners effectively solve these problems using hierarchical decomposition of the planning graph.

Related Work

• Conformant Planning:
  • Knowledge about possible initial states, and possible outcomes of each action.

• Contingent Planning:
  • Sensing actions to detect the state of the world during execution.

• Numerically weighted constraints:
  • Quantitative means of differentiating plans.

• Pyrrhus:
  • Replaces goal formulae with utility models.

• Does not associate utilities with individual actions.

Future Work

• Reasoning about:
  • Time.
  • Resources.

• Truth degrees go some way towards this.

• Further efficiency improvements:
  • Improve quality of memoisation.
  • Smaller memosets are better.

Acknowledgements

• Qiang Shen, University of Edinburgh.

• Peter Jarvis, SRI International.

Resources

• www-users.cs.york.ac.uk/~ianm/FlexiblePlanning.html
  • FGP, LFGP.

• Example problems.

• References.
