Feng Zhiyong Tianjin University Fall 2008 11 planning


Given: initial state, goal state, and actions

Find: a plan, i.e. a sequence of actions that, when applied beginning with the initial state, transforms the world into a goal state

11.1 The Planning Problem
11.2 Planning with State-Space Search
11.3 Partial-Order Planning
11.4 Planning Graphs
11.5 Planning with Propositional Logic
11.6 Analysis of Planning Approaches
11.7 Summary

Difficulties an ordinary problem-solving agent may face:
◦ It can be overwhelmed by irrelevant actions.
◦ It is difficult to find a good heuristic function.
◦ It is inefficient: it cannot take advantage of problem decomposition.

The agent is the sole cause of change in the environment.

The world is accessible (i.e. the agent knows all it needs to know about the environment).

Closed World Assumption:
◦ The state description lists all that is true
◦ Anything else is assumed false

The planning task is very difficult, even with such a simplified framework!

Dressing
◦ Initial state: socks, shoes
◦ Goal state: socks on, shoes on correct feet
◦ Actions: PutOnSock(f), PutOnShoe(f)

Blocks World
◦ Initial state: some configuration of blocks on a table
◦ Goal state: another configuration (stacked?)
◦ Actions: Pickup(x), Putdown(x), Stack(x,y), Unstack(x,y)

Shopping
◦ Initial state: at home, with no items
◦ Goal state: at home, having a list of items
◦ Actions: Go(store), Buy(item), etc.

Facts: ground literals (no variables)

◦ Poor, Unknown, At(Plane, Beijing)

Situations: conjunction of facts

◦ At(Plane1, Beijing) ⋀ At(Plane2, Tianjin)
◦ Poor ⋀ Unknown

Goal: conjunction of positive literals
◦ Variables allowed; assume all variables are existential
◦ Rich ⋀ Famous, At(Plane1, Xi’an)

Actions:
◦ Action name
◦ Preconditions: conjunction of positive literals that defines whether the action is legal/applicable
◦ Effects: conjunction of positive literals (called the add list) and negative literals (called the delete list)

Action(Fly(p, from, to),
  PRECOND: At(p, from) ⋀ Plane(p) ⋀ Airport(from) ⋀ Airport(to)
  EFFECT: ¬At(p, from) ⋀ At(p, to))

◦ Delete list, add list
◦ Assumption: everything stays the same unless explicitly on the delete list (avoids the frame problem)

Result of an action:
◦ The positive literals in the effect are added to the state.
◦ Any negative literals in the effect that match existing positive literals in the state make the positive literals disappear.

Exceptions:
◦ Positive literals already in the state are not added again.
◦ Negative literals that match nothing in the state are ignored.
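The result rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the `Action` type, the `result` function, and the representation of literals as plain strings are all assumptions made for the example.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    """A ground STRIPS action (hypothetical representation)."""
    name: str
    precond: FrozenSet[str]  # positive literals that must hold
    add: FrozenSet[str]      # add list (positive effects)
    delete: FrozenSet[str]   # delete list (negative effects)


def applicable(state: FrozenSet[str], action: Action) -> bool:
    """An action is applicable when all its preconditions are in the state."""
    return action.precond <= state


def result(state: FrozenSet[str], action: Action) -> FrozenSet[str]:
    """Remove the delete list, then add the add list.

    Set semantics cover both exceptions: literals already present are not
    added twice, and deleting an absent literal has no effect.
    """
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add


fly = Action(
    name="Fly(P1, Beijing, Tianjin)",
    precond=frozenset({"At(P1, Beijing)", "Plane(P1)",
                       "Airport(Beijing)", "Airport(Tianjin)"}),
    add=frozenset({"At(P1, Tianjin)"}),
    delete=frozenset({"At(P1, Beijing)"}),
)
state = frozenset({"At(P1, Beijing)", "Plane(P1)",
                   "Airport(Beijing)", "Airport(Tianjin)"})
new_state = result(state, fly)
```

Everything not on the delete list (here `Plane(P1)` and the two `Airport` facts) carries over unchanged, which is the STRIPS assumption stated above.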

[Figure: blocks-world example with blocks A, B, and C]

The planning problem can be seen as a search problem.

We can move from one state of the problem to another in both a forward and backward direction because the actions are defined in terms of both preconditions and effects.

Forward search: progression planning
Backward search: regression planning

Progression: Forward Chaining
◦ Like state-space search except for representation
◦ Inefficient due to large situation space to explore

Regression: Backward Chaining
◦ Start from the goal state and solve its sub-goals (preconditions)
◦ More efficient and goal-directed than progression (fewer applicable operators)

[Figure: forward (progression) and backward (regression) state-space search]

The initial state of the search is the initial state from the planning problem. In general, each state will be a set of positive ground literals; literals not appearing are false.

The actions that are applicable to a state are all those whose preconditions are satisfied. The successor state resulting from an action is generated by adding the positive effect literals and deleting the negative effect literals. (In the first-order case, we must apply the unifier from the preconditions to the effect literals.) Note that a single successor function works for all planning problems - a consequence of using an explicit action representation.

The goal test checks whether the state satisfies the goal of the planning problem.

The step cost of each action is typically 1. Although it would be easy to allow different costs for different actions, this is seldom done by STRIPS planners.
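The four ingredients above (initial state, applicable actions, goal test, unit step cost) can be put together as a breadth-first progression planner. This is only a sketch under assumptions: the `Action` type, the `forward_search` function, and the literal names for the dressing example are invented here for illustration.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    """A ground STRIPS action (hypothetical representation)."""
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str] = frozenset()


def forward_search(init: FrozenSet[str], goal: FrozenSet[str],
                   actions: List[Action]) -> Optional[List[str]]:
    """Breadth-first progression search; with unit costs it finds a shortest plan."""
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # goal test
            return plan
        for a in actions:
            if a.precond <= state:             # applicability
                nxt = (state - a.delete) | a.add   # successor state
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None


# Dressing example: a shoe can only go on after the sock on the same foot.
actions = []
for foot in ("Left", "Right"):
    actions.append(Action(f"PutOnSock({foot})", frozenset(),
                          frozenset({f"SockOn({foot})"})))
    actions.append(Action(f"PutOnShoe({foot})", frozenset({f"SockOn({foot})"}),
                          frozenset({f"ShoeOn({foot})"})))

plan = forward_search(frozenset(),
                      frozenset({"ShoeOn(Left)", "ShoeOn(Right)"}),
                      actions)
```

The same `forward_search` works for any problem expressed with these actions, which reflects the point that a single successor function serves all planning problems.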

Forward planning is equivalent to forward search and is very inefficient. In fact, it suffers from all the caveats of the underlying search algorithm.

A better way to solve a planning problem is through backward state-space search, i.e. by starting at the goal and working our way back to the initial state.

Advantage: we need only consider moves that achieve part of the goal!

In STRIPS, there is no problem in finding the predecessors of a state.

the goal in our 10-airport air cargo problem

Searching backwards is sometimes called regression planning

PreCon:

Consistent: the action must not undo any desired literal of the goal. To regress a goal G through an action A:
◦ Any positive effects of A that appear in G are deleted.
◦ Each precondition literal of A is added, unless it already appears.
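The regression step can be sketched as follows. The names `Action`, `relevant`, `consistent`, and `regress` are assumptions made for this example, as is the `Unload` action loosely modelled on the air cargo problem; the relevance test (the action achieves at least one goal literal) is the usual companion to the consistency test and is included here as an assumption.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    """A ground STRIPS action (hypothetical representation)."""
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def relevant(goal: FrozenSet[str], a: Action) -> bool:
    """The action achieves at least one literal of the goal."""
    return bool(a.add & goal)


def consistent(goal: FrozenSet[str], a: Action) -> bool:
    """The action does not undo any literal of the goal."""
    return not (a.delete & goal)


def regress(goal: FrozenSet[str], a: Action) -> FrozenSet[str]:
    """Predecessor goal: drop the achieved literals, add the preconditions."""
    if not (relevant(goal, a) and consistent(goal, a)):
        raise ValueError(f"cannot regress through {a.name}")
    return (goal - a.add) | a.precond


unload = Action(
    name="Unload(C1, P1, B)",
    precond=frozenset({"In(C1, P1)", "At(P1, B)"}),
    add=frozenset({"At(C1, B)"}),
    delete=frozenset({"In(C1, P1)"}),
)
goal = frozenset({"At(C1, B)", "At(C2, B)"})
subgoal = regress(goal, unload)
```

Because the set union ignores duplicates, a precondition that already appears in the goal is not added a second time.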

Substitution in FOL

State-space search (forward and backward) is not efficient enough.

Can we perform A* style search with an admissible heuristic?

Key assumption: sub-goals are independent of each other
◦ Divide and conquer the problem without worrying about other parts of the problem
◦ e.g. with putting on socks, the order doesn’t matter; putting on the left sock first doesn’t preclude putting on the right
◦ Whole plan is the sum of all sub-plans

This heuristic is:
◦ Optimistic (admissible) when the goals interact negatively, i.e. an action in one subplan deletes a goal achieved by another subplan.
◦ Pessimistic (inadmissible) when subplans contain redundant actions.

This heuristic assumes that all actions have only positive effects.

For example, if an action has the effects A and ¬B, the empty-delete list heuristic considers the action as if it only had the effect A.

In that way, we assume that no action can delete the literals achieved by another action.
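A small sketch of the empty-delete-list relaxation follows. The helper names (`relax`, `shortest_plan_length`) and the two toy actions are assumptions for illustration; the heuristic value is computed here by brute-force breadth-first search on the relaxed problem, which is only practical for tiny examples.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    """A ground STRIPS action (hypothetical representation)."""
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def relax(a: Action) -> Action:
    """Empty-delete-list relaxation: keep only the positive effects."""
    return a._replace(delete=frozenset())


def shortest_plan_length(init: FrozenSet[str], goal: FrozenSet[str],
                         actions: List[Action]) -> Optional[int]:
    """Length of a shortest plan, found by breadth-first search."""
    frontier = deque([(init, 0)])
    visited = {init}
    while frontier:
        state, depth = frontier.popleft()
        if goal <= state:
            return depth
        for a in actions:
            if a.precond <= state:
                nxt = (state - a.delete) | a.add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, depth + 1))
    return None


# a1 has the effects A and ¬B; a2 simply re-establishes B.
a1 = Action("a1", frozenset(), frozenset({"A"}), frozenset({"B"}))
a2 = Action("a2", frozenset(), frozenset({"B"}), frozenset())
init, goal = frozenset({"B"}), frozenset({"A", "B"})

true_cost = shortest_plan_length(init, goal, [a1, a2])
h = shortest_plan_length(init, goal, [relax(a1), relax(a2)])
```

In this toy problem the real plan needs two steps (a1 deletes B, so a2 must restore it), while the relaxed problem needs only one, so the relaxed cost underestimates the true cost.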

Up to now, plans have been totally ordered, i.e. the exact temporal relationships between the actions are known: Aᵢ comes after Aᵢ₋₁ and before Aᵢ₊₁

In partially ordered plans, we don’t have to specify the temporal relationships between all the actions.

In practice, this means that we can identify actions that happen in any order.

Total-order planner (linear):
◦ Maintains a partial solution as a “totally ordered” list of steps found so far
◦ e.g. STRIPS
◦ e.g. Situation-space progression/regression planners

Partial-order planner (non-linear):
◦ Only maintains partial order
◦ Constraints on the ordering of steps in the plan

Principle of Least Commitment: don’t make an ordering choice unless required to do so
◦ Property of partial-order planners (POP)
◦ Not a property of situation-space planners: they commit to an ordering when an operator is applied

Keep the ordering choice as general as possible

Reduces the amount of backtracking needed
◦ Don’t waste time undoing steps

Ordering constraints:
◦ S1 < S2: S1 before S2
◦ S1 must occur before S2, but not necessarily immediately before it
◦ Thin links

Causal constraints:
◦ S1 --c--> S2: S1 achieves c for S2
◦ S1 has a literal c in its effect list that is needed to satisfy part of the precondition for S2
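One way to picture these constraints is as a directed graph over plan steps: the plan is consistent when the graph has no cycle, and any topological order is a valid total ordering. The sketch below assumes Python 3.9+ for the standard `graphlib` module; the function name `linearize` and the step names for the dressing example are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List, Optional, Tuple


def linearize(steps: List[str],
              orderings: List[Tuple[str, str]]) -> Optional[List[str]]:
    """Return one total order consistent with the constraints, or None if cyclic.

    Each ordering (s1, s2) means s1 < s2, i.e. s1 occurs some time before s2.
    """
    preds = {s: set() for s in steps}
    for before, after in orderings:
        preds[after].add(before)
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None


steps = ["Start", "LeftSock", "LeftShoe", "RightSock", "RightShoe", "Finish"]

# Causal links (producer, condition c, consumer): producer achieves c for consumer.
causal_links = [
    ("LeftSock", "LeftSockOn", "LeftShoe"),
    ("RightSock", "RightSockOn", "RightShoe"),
    ("LeftShoe", "LeftShoeOn", "Finish"),
    ("RightShoe", "RightShoeOn", "Finish"),
]

# Every causal link implies an ordering constraint; Start precedes everything.
orderings = [(p, c) for p, _, c in causal_links]
orderings += [("Start", s) for s in steps if s != "Start"]

order = linearize(steps, orderings)
cyclic = linearize(steps, orderings + [("LeftShoe", "LeftSock")])
```

Nothing constrains the left foot relative to the right foot, so several different linearizations satisfy the same partial-order plan.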

An action threatens a causal link when it might delete the goal that the link satisfies.

Example: in the dynamic blocks world, pickup(a) has ¬handempty in its effects, so it threatens the link (putdown(c,b), handempty, pickup(d))

Adding an action that breaks a causal link has serious consequences for the plan. We have to remove the threat by demotion (ordering the threatening action before the link) or promotion (ordering it after the link).
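Threat detection and the two repairs can be sketched as below. This is a simplified illustration under assumptions: the `Step` type and function names are hypothetical, pickup(a) is assumed to delete handempty, and the check looks only at effects, ignoring whether existing ordering constraints already keep the step outside the link.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

# A causal link (producer, condition, consumer).
Link = Tuple[str, str, str]


class Step(NamedTuple):
    """A plan step with its effects (hypothetical representation)."""
    name: str
    add: FrozenSet[str]
    delete: FrozenSet[str]


def threatens(step: Step, link: Link) -> bool:
    """A third step threatens a link if it deletes the protected condition."""
    producer, condition, consumer = link
    return step.name not in (producer, consumer) and condition in step.delete


def resolutions(step: Step, link: Link) -> List[Tuple[str, str]]:
    """Ordering constraints (before, after) that would remove the threat."""
    producer, _, consumer = link
    return [
        (step.name, producer),   # demotion: threat goes before the producer
        (consumer, step.name),   # promotion: threat goes after the consumer
    ]


link = ("putdown(c,b)", "handempty", "pickup(d)")
pickup_a = Step("pickup(a)",
                add=frozenset({"holding(a)"}),
                delete=frozenset({"handempty"}))

is_threat = threatens(pickup_a, link)
options = resolutions(pickup_a, link)
```

A planner would try each option in turn and keep the first one that leaves the ordering constraints acyclic.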

An open (i.e. unsatisfied) precondition is one that does not have a causal link to it. How is an open precondition p for step S solved?
◦ Step addition: add a new plan step R that contains p in its Effects list
◦ Simple establishment: find an existing plan step R prior to S that has p in its Effects list
◦ Then add causal and ordering links from R to S

To keep the search focused, the planner only adds steps that achieve an open precondition

POP is sound and complete

A POP plan is a solution if:
◦ All preconditions are supported (by causal links), i.e., no open conditions.
◦ No threats
◦ Consistent temporal ordering

By construction, the POP algorithm reaches a solution plan

“Fast Planning Through Planning Graph Analysis,” Artificial Intelligence,

Propositionalize actions and situations

Construct a planning graph
◦ Levels (e.g. time steps) with potential action nodes
◦ Include persistence actions (inactions) to deal with the frame problem
◦ Link actions to situation nodes between each level
◦ Indicate which situation descriptions are mutually exclusive with “mutex links”

Planning graphs work only for propositional planning problems

Inconsistent effects: one action negates an effect of the other. For example, Eat(Cake) and the persistence of Have(Cake) have inconsistent effects because they disagree on the effect Have(Cake).

Interference: one of the effects of one action is the negation of a precondition of the other. For example, Eat(Cake) interferes with the persistence of Have(Cake) by negating its precondition.

Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other. For example, Bake(Cake) and Eat(Cake) are mutex because they compete on the value of the Have(Cake) precondition.
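The three mutex conditions can be written as simple set tests. This is a sketch under assumptions: literals are strings with a `~` prefix for negation, the `Action` type and function names are invented here, and the literal mutexes of the previous level are passed in as a set of unordered pairs.

```python
from typing import FrozenSet, NamedTuple, Set


class Action(NamedTuple):
    """A propositional action with signed literals (hypothetical representation)."""
    name: str
    precond: FrozenSet[str]
    effects: FrozenSet[str]


def neg(lit: str) -> str:
    """Negate a literal: 'X' <-> '~X'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def inconsistent_effects(a: Action, b: Action) -> bool:
    """One action negates an effect of the other."""
    return any(neg(e) in b.effects for e in a.effects)


def interference(a: Action, b: Action) -> bool:
    """An effect of one action negates a precondition of the other."""
    return (any(neg(e) in b.precond for e in a.effects)
            or any(neg(e) in a.precond for e in b.effects))


def competing_needs(a: Action, b: Action,
                    literal_mutexes: Set[FrozenSet[str]]) -> bool:
    """Some pair of preconditions is mutex at the previous literal level."""
    return any(frozenset({p, q}) in literal_mutexes
               for p in a.precond for q in b.precond)


eat = Action("Eat(Cake)", frozenset({"Have(Cake)"}),
             frozenset({"~Have(Cake)", "Eaten(Cake)"}))
bake = Action("Bake(Cake)", frozenset({"~Have(Cake)"}),
              frozenset({"Have(Cake)"}))
persist_have = Action("Persist(Have(Cake))", frozenset({"Have(Cake)"}),
                      frozenset({"Have(Cake)"}))

# A literal and its negation are always mutex with each other.
literal_mutexes = {frozenset({"Have(Cake)", "~Have(Cake)"})}

r1 = inconsistent_effects(eat, persist_have)
r2 = interference(eat, persist_have)
r3 = competing_needs(bake, eat, literal_mutexes)
```

Two actions are marked mutex at a level if any one of the three tests holds.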

Literals increase monotonically: Once a literal appears at a given level, it will appear at all subsequent levels. This is because of the persistence actions; once a literal shows up, persistence actions cause it to stay forever.

Actions increase monotonically: Once an action appears at a given level, it will appear at all subsequent levels. This is a consequence of literals' increasing; if the preconditions of an action appear at one level, they will appear at subsequent levels, and thus so will the action.

Mutexes decrease monotonically: If two actions are mutex at a given level Ai, then they will also be mutex for all previous levels at which they both appear. The same holds for mutexes between literals. It might not always appear that way in the figures, because the figures have a simplification: they display neither literals that cannot hold at level Si nor actions that cannot be executed at level Ai. We can see that "mutexes decrease monotonically" is true if you consider that these invisible literals and actions are mutex with everything.

“Planning as Satisfiability,”
◦ Initial state ⋀ all possible action descriptions ⋀ goal

Recall that a planning environment can be expressed in situation calculus
◦ Axioms of the form a → b (or rather ¬a ⋁ b)

Recall that plans are considered to be a conjunction of sub-goals:
◦ Start state ⋀ axioms ⋀ goals

The basic idea with SAT-Plan:
◦ Describe the environment in situation calculus
◦ Propositionalize all the axioms (as disjunctions), enumerated for each of an arbitrary number of steps
◦ Conjoin all instantiated rules with the initial state and goal descriptions

This provides us with a propositional logic (PL) formula in CNF, which we can try to solve using hill climbing (HC), simulated annealing (SA), tabu search, genetic algorithms (GAs), etc.

Initial state: time T0. Some propositions are unknown

Time T1: successor-state axioms

KB: initial state ⋀ successor-state axioms ⋀ goal

Precondition axioms

Action exclusion axioms
◦ ¬(Fly(P2, JFK, SFO)⁰ ⋀ Fly(P2, JFK, LAX)⁰)

The number of clauses is large. For example, with 10 time steps, 12 planes, and 30 airports, there are T × |Planes| × |Airports|² action symbols, and the complete action exclusion axiom has 583 million clauses.
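The 583 million figure can be checked with a short calculation. The sketch assumes that Fly is the only action schema, that every (from, to) airport pair counts as a separate action, and that the complete exclusion axiom has one clause per unordered pair of actions at each time step; the variable names are chosen here for illustration.

```python
from math import comb

T = 10          # time steps
planes = 12
airports = 30

# One Fly(p, from, to) action per plane and ordered airport pair, per step.
actions_per_step = planes * airports ** 2
action_symbols = T * actions_per_step

# Complete action exclusion: one clause for each pair of actions per step.
exclusion_clauses = T * comb(actions_per_step, 2)

print(action_symbols)      # 108000
print(exclusion_clauses)   # 583146000
```

The clause count grows with the square of the number of actions per step, which is why the exclusion axioms dominate the size of the encoding.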

Reduced to a set of binary predicates (symbol splitting): Fly(P1, SFO, JFK)⁰,

T × Act × P × O

Parallel actions: Fly(P1, SFO, JFK)⁰ and Fly(P2, JFK, SFO)⁰

State-space search (STRIPS) can be directed using logic, but is still incomplete

Partially-ordered planners are complete, but are practically limited in the number of steps they can accurately plan

Planning was sort of a “dead” AI research area for a while

Since 1992, several new approaches to the planning task have been discovered (e.g. Graph-Plan and SAT-Plan) that can find plans up to thousands of steps long

D. Weld, “Recent advances in AI planning,” AI Magazine, 1999

◦ Excellent coverage of these new approaches

Planning agents search to find a sequence of actions to achieve a goal using a flexible representation of states, operators, goals, and plans
◦ The STRIPS language describes actions in terms of their preconditions and effects

It is not feasible to search through the entire space as was done with search agents
◦ Regression planning focuses the search
◦ STRIPS assumes sub-goals are independent
◦ POP uses the principle of least commitment and declobbering

Partial-Order Planning (POP) is a sound and complete planning algorithm, but can be limited by plan length

Recent advances in AI planning reduce the planning environment to other problems (graphs, SAT formulas) that can be solved using other methods
