9/14: Belief Search Heuristics Today: Planning graph heuristics for belief search Wed: MDPs


9/14: Belief Search Heuristics

Today: Planning graph heuristics for belief search
Wed: MDPs


Heuristics for Belief-Space Planning


Evaluating Search/Planning: Theoretical

“Worst-case”: look at the complexity
  • The worst-case complexity of most search/planning problems is NP-complete or harder.
  • What does that tell us, other than “find something else easier (if less interesting) to do”?
  • We can consider formal restrictions on domains under which the complexity may be lower...
  • ...but these restrictions may not be natural.

“Average-case”
  • Average-case complexity would be more informative, but it is much harder to analyze.
  • What distribution of problems should we use?
  • Similar issues arise in empirical analyses.


Evaluating Search/Planning: Empirical

Look at actual performance on problems. WHICH PROBLEMS?

Random problems
  • Use randomly generated problems. Which distribution? (The hardest problems may live in small phase-transition regions, as in SAT.)
  • Find the phase-transition regions and generate random problems there.
  • But who said such problems are at all related to the problems that actually occur?

“Real” or “Benchmark” problems
  • Use “real world” problems.
      • Fine as far as the customers of that problem (or the boss) are concerned, but it is not clear whether the claims will carry over to any other problems.
      • May have to do analysis to figure out what it is about that domain that makes certain approaches work well.
  • Develop many “benchmark” domains inspired by various real-world problems and use them to evaluate the coverage of a planner.
      • It is easy to abstract away the critical characteristics when developing benchmarks.
      • See Cushing’s analysis of temporal planning domains.


Heuristics for Conformant Planning

First idea: Notice that “classical planning” (which assumes full observability) is a “relaxation” of conformant planning.
  • So the length of the shortest classical plan from any single state of the belief state is a lower bound (admissible heuristic) for conformant planning.
  • Further, the heuristics for classical planning are also heuristics for conformant planning (albeit probably not very informed ones).

Next idea: Let us get a feel for how estimating distances between belief states differs from estimating distances between states.


Three issues:
  • How many states are there?
  • How far is each of the states from the goal?
  • How much interaction is there between states? For example, if the length of the plan for taking S1 to the goal is 10 and the length for S2 is also 10, the length of the plan for taking both to the goal could be anywhere between 10 and infinity, depending on the interactions. [Notice that we talk about “state interactions” here just as we talked about “goal interactions” in classical planning.]

We need to estimate the length of the “combined plan” for taking all states to the goal.

World’s funniest joke (in USA)

In addition to interactions between literals, as in classical planning, we also have interactions between states (belief-space planning).
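The "anywhere between 10 and infinity" point can be made concrete with a small sketch. It runs breadth-first search over belief states for two toy domains; both domains (`corridor` and `clash`), the function name and the state names are hypothetical and not from the lecture. Classical planning is the special case of a singleton belief state, so the per-state distances come from the same function.

```python
from collections import deque

def conformant_distance(belief, goal, transitions):
    """Shortest plan length that takes EVERY state in `belief` into `goal`.

    Breadth-first search over belief states (sets of states). Returns None
    when no such plan exists, i.e. the distance is infinite.
    """
    start = frozenset(belief)
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        current, depth = frontier.popleft()
        if current <= goal:  # all possible states satisfy the goal
            return depth
        for effect in transitions.values():
            # Apply the same action in every state; unlisted states stay put.
            successor = frozenset(effect.get(s, s) for s in current)
            if successor not in seen:
                seen.add(successor)
                frontier.append((successor, depth + 1))
    return None

# Hypothetical corridor: one action that moves right, goal is cell 3.
corridor = {"right": {0: 1, 1: 2, 2: 3}}
# Hypothetical clash: each action saves one state and ruins the other.
clash = {"a": {"s1": "goal", "s2": "dead"},
         "b": {"s2": "goal", "s1": "dead"}}

corridor_single = [conformant_distance({s}, {3}, corridor) for s in (1, 2)]
corridor_joint = conformant_distance({1, 2}, {3}, corridor)
clash_single = [conformant_distance({s}, {"goal"}, clash) for s in ("s1", "s2")]
clash_joint = conformant_distance({"s1", "s2"}, {"goal"}, clash)
```

In the corridor the joint distance equals the larger per-state distance (no harmful interaction). In the clash domain each state is one step from the goal on its own, yet no plan works for both, so the joint distance is infinite (`None`).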


Belief-state cardinality alone won’t be enough…

  • Early work on conformant planning concentrated exclusively on heuristics that look at the cardinality of the belief state.
      • The larger the cardinality of the belief state, the higher its uncertainty, and the worse it is (for progression).
      • Notice that in regression we have the opposite heuristic: the larger the cardinality, the higher the flexibility (we are satisfied with any one of a larger set of states), and so the better it is.
  • From our example in the previous slide, cardinality is only one of the three components that go into actual distance estimation.
      • For example, there may be an action that reduces the cardinality (e.g. bomb the place), but the new belief state with low uncertainty will be at infinite distance from the goal.
  • We will look at planning graph-based heuristics for considering all three components (actually, unless we look at cross-world mutexes, we won’t be considering the interaction part…).


Using a Single, Unioned Graph

Union literals from all initial states into a conjunctive initial graph level.

Initially: (P V Q V R) & (~P V ~Q) & (~P V ~R) & (~Q V ~R) & M
  (possible initial states: PM, QM, RM)

Actions:
  A1: M & P => K
  A2: M & Q => K
  A3: M & R => L
  A4: K => G
  A5: L => G

Goal State: G

[Figure: unioned planning graph. The initial level holds P, Q, R, M; A1, A2, A3 add K and L; A4 and A5 add G. The highlighted support for G is A4 (from K), with K supported by A1 (from P and M).]

Heuristic Estimate = 2

• Minimal implementation
• Not effective
• Lose world-specific support information
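A minimal sketch of the unioned-graph estimate for this example. It assumes the actions are read as conditional-effect rules of the form "M & P => K", and it takes the estimate to be the first relaxed-graph level at which G appears, which here matches the slide's value of 2. The names `RULES`, `INITIAL_STATES` and `first_level` are made up for the illustration.

```python
# Conditional-effect rules from the slide: (action, condition literals, effect).
RULES = [
    ("A1", {"M", "P"}, "K"),
    ("A2", {"M", "Q"}, "K"),
    ("A3", {"M", "R"}, "L"),
    ("A4", {"K"}, "G"),
    ("A5", {"L"}, "G"),
]
INITIAL_STATES = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]

def first_level(literals, goal, rules=RULES):
    """Level of a relaxed graph (no deletes, no mutexes) at which `goal` first appears."""
    reached = set(literals)
    level = 0
    while goal not in reached:
        new = {eff for _, cond, eff in rules if cond <= reached} - reached
        if not new:
            return None  # goal unreachable even in the relaxation
        reached |= new
        level += 1
    return level

# Union the literals of all initial states into one conjunctive level.
unioned = set().union(*INITIAL_STATES)
estimate = first_level(unioned, "G")
```

Because P, Q and R all sit in the same initial level, the graph cannot tell which world supports K or L, which is why the estimate stays at 2 even though no two-step plan works in all three worlds.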


Using Multiple Graphs

[Figure: one planning graph per initial state (PM, QM, RM). In the PM graph, A1 gives K and then A4 gives G; in the QM graph, A2 gives K and then A4 gives G; in the RM graph, A3 gives L and then A5 gives G.]

• Same-world Mutexes
• Memory Intensive
• Heuristic Computation Can be costly

Unioning these graphs a priori would give much savings …
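A sketch of the multiple-graph idea for the running example: extract a relaxed plan separately in each possible world, then combine the results. How to combine them is a choice made here for illustration (the maximum per-world length and the size of the union of the plans); the slide does not fix it. The rule encoding and function name are assumptions.

```python
# Conditional-effect rules of the running example: action -> (condition, effect).
RULES = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}
WORLDS = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]

def relaxed_plan(state, goal):
    """Set of actions in a relaxed plan (no deletes, no mutexes) from one state."""
    reached, achiever = set(state), {}
    progress = True
    while goal not in reached and progress:
        progress = False
        for name, (cond, eff) in RULES.items():
            if eff not in reached and cond <= reached:
                reached.add(eff)
                achiever[eff] = name  # remember the first action that gave eff
                progress = True
    if goal not in reached:
        return None
    # Walk back from the goal, collecting the achievers that are needed.
    plan, open_literals = set(), [goal]
    while open_literals:
        literal = open_literals.pop()
        if literal in achiever and achiever[literal] not in plan:
            plan.add(achiever[literal])
            open_literals.extend(RULES[achiever[literal]][0])
    return plan

plans = [relaxed_plan(world, "G") for world in WORLDS]
h_max = max(len(p) for p in plans)
h_union = len(set().union(*plans))
```

Each world needs only two actions, but the three plans share just A4, so the union contains all five actions. Building and storing one graph per world is what makes this approach memory intensive as the belief state grows.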


Using a Single, Labeled Graph (joint work with David E. Smith)

Labels signify the possible worlds under which a literal holds.

  • Action Labels: Conjunction of Labels of Supporting Literals
  • Literal Labels: Disjunction of Labels of Supporting Actions

Label Key:
  ~Q & ~R
  ~P & ~R
  ~P & ~Q
  (~P & ~R) V (~Q & ~R)
  (~P & ~R) V (~Q & ~R) V (~P & ~Q)
  True

[Figure: a single labeled planning graph built from the initial states PM, QM, RM. The literals P, Q, R, M, K, L, G and the actions A1 to A5 each carry a label from the key above.]

Heuristic Value = 5

• Memory Efficient
• Cheap Heuristics
• Scalable
• Extensible
• Benefits from BDDs
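The labeling rules can be sketched on the running example. As a simplification, each label below is a set of world names (PM, QM, RM) rather than a propositional formula or BDD, so conjunction becomes set intersection and disjunction becomes set union. The plan extraction is a level-free shortcut written for this acyclic example, not the procedure from the lecture, and all function names are hypothetical.

```python
WORLDS = {"PM": {"P", "M"}, "QM": {"Q", "M"}, "RM": {"R", "M"}}
RULES = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}
ALL = frozenset(WORLDS)
EMPTY = frozenset()

def action_label(labels, cond):
    """Conjunction of the supporting literals' labels (set intersection)."""
    return frozenset.intersection(*(labels.get(c, EMPTY) for c in cond))

def propagate():
    """Build label levels; each maps literal -> worlds in which it is reachable."""
    level0 = {}
    for world, literals in WORLDS.items():
        for lit in literals:
            level0[lit] = level0.get(lit, EMPTY) | {world}
    levels = [level0]
    while True:
        prev, nxt = levels[-1], dict(levels[-1])
        for cond, eff in RULES.values():
            support = action_label(prev, cond)
            if support:
                # Disjunction over supporting actions (set union).
                nxt[eff] = nxt.get(eff, EMPTY) | support
        if nxt == prev:
            return levels
        levels.append(nxt)

def extract(lit, worlds, levels, plan):
    """Add actions to `plan` until `lit` is supported in every world in `worlds`."""
    worlds = worlds - levels[0].get(lit, EMPTY)  # worlds where lit holds initially
    for name, (cond, eff) in RULES.items():
        covered = worlds & action_label(levels[-1], cond) if eff == lit else EMPTY
        if covered:
            plan.add(name)
            for c in cond:
                extract(c, covered, levels, plan)
            worlds = worlds - covered

levels = propagate()
goal_level = next(i for i, labels in enumerate(levels) if labels.get("G") == ALL)
plan = set()
extract("G", ALL, levels, plan)
```

K ends up labeled with the worlds PM and QM, which corresponds to the key entry (~P & ~R) V (~Q & ~R), and G is labeled with all three worlds at level 2. Covering every world of G requires A4 and A5, which in turn require A1, A2 and A3, giving the five actions behind the heuristic value of 5.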


What about mutexes?

  • In the previous slide, we considered only relaxed plans (thus ignoring any mutexes).
  • We could have considered mutexes in the individual world graphs to get better estimates of the plans in the individual worlds (call these same-world mutexes).
  • We could also have considered the impact of having an action in one world on the other world (cross-world mutexes).
      • Consider a patient who may or may not be suffering from disease D. There is a medicine M which, if given in the world where he has D, will cure the patient. But if it is given in the world where the patient doesn’t have D, it will kill him. Since giving the medicine M will have an impact in both worlds, we now have a mutex between “being alive” in the world without D and “being cured” in the world with D!
      • Notice that cross-world mutexes take into account the state interactions that we mentioned as one of the three components making up the distance estimate.
  • We could compute a subset of same-world and cross-world mutexes to improve the accuracy of the heuristics…
      • …but it is not clear whether the accuracy comes at too much additional cost to have a reasonable impact on efficiency. [See Bryce et al., JAIR submission]


Heuristics for sensing

We need to compare the cumulative distance of B1 and B2 to the goal with that of B3 to the goal.
  • Notice that planning cost is related to plan size, while plan execution cost is related to the length of the deepest branch (or the expected length of a branch).
  • If we use the conformant belief-state distance (as discussed last class), then we will be overestimating the distance (since sensing may allow us to follow a shorter branch).
  • Bryce [ICAPS 05, submitted] starts with the conformant relaxed plan and introduces sensory actions into the plan to estimate the cost more accurately.

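[Figure: from the current belief state, the sensing action As leads to belief states B1 and B2, while the action A leads to belief state B3. The figure is annotated with the values 7, 12,000, 11,000 and 300.]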


Symbolic Projection

  • A set of states is a logical formula
  • A transition function is also a logical formula
  • Projection is a logical operation
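One standard way to write projection as a logical operation is the relational product (image computation). In the sketch below, $B(x)$ is the formula for the current belief state, $T(x, a, x')$ is the transition relation for action $a$, and $x$, $x'$ are the current-state and next-state variable vectors; these symbol names are assumptions, not taken from the slide.

```latex
\mathrm{Image}_a(B)(x') \;=\; \exists x .\; B(x) \wedge T(x, a, x')
```

The existential quantifier drops the current-state variables, and the next-state variables $x'$ are then renamed back to $x$ so that the result can be projected again.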


Symbolic Manipulation with OBDDs

Strategy
  • Represent data as a set of OBDDs
      • Identical variable orderings
  • Express the solution method as a sequence of symbolic operations
      • Sequence of constructor & query operations
      • Similar style to an on-line algorithm
  • Implement each operation by OBDD manipulation
      • Do all the work in the constructor operations

Key Algorithmic Properties
  • Arguments are OBDDs with identical variable orderings
  • Result is an OBDD with the same ordering
  • Each step has polynomial complexity

[From Bryant’s slides]


BDDs for Representing States & Transition Function

[Figure: a belief state as a BDD and a transition function as a BDD.]


Restriction Execution Example

[Figure: three OBDDs. “Argument F” tests the variables a, b, c, d; “Restriction F[b=1]” drops the test on b, leaving a, c, d; “Reduced Result” tests only c and d.]
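The restriction step can be sketched with a toy reduced decision diagram. Nodes are plain tuples `(var, low, high)` with `False`/`True` terminals, so equal subgraphs compare equal without a unique table; this is a teaching shortcut, not how a real OBDD package is built. The function F used here, ((a and not b) or c) and d, is a hypothetical stand-in chosen so that restricting b = 1 leaves a diagram over c and d only; it is not necessarily the function drawn on the slide.

```python
ORDER = ["a", "b", "c", "d"]  # fixed variable ordering shared by all diagrams

def mk(var, low, high):
    """Create a node, skipping the test when both branches agree (reduction)."""
    return low if low == high else (var, low, high)

def build(f, i=0, env=None):
    """Shannon-expand the Boolean function f along ORDER into a reduced diagram."""
    env = env or {}
    if i == len(ORDER):
        return bool(f(**env))
    var = ORDER[i]
    return mk(var,
              build(f, i + 1, {**env, var: False}),
              build(f, i + 1, {**env, var: True}))

def restrict(node, var, value):
    """Fix `var` to `value` and re-reduce the result."""
    if isinstance(node, bool):
        return node
    v, low, high = node
    if v == var:
        return high if value else low
    return mk(v, restrict(low, var, value), restrict(high, var, value))

def support(node):
    """Variables that the diagram still tests."""
    if isinstance(node, bool):
        return set()
    v, low, high = node
    return {v} | support(low) | support(high)

# Hypothetical argument function F (an assumption of this sketch).
F = build(lambda a, b, c, d: ((a and not b) or c) and d)
restricted = restrict(F, "b", True)
```

After fixing b = 1, both branches of the a node lead to the same subgraph, so `mk` removes the test on a as well and the reduced result is the diagram for c and d. Note that `build` enumerates all assignments and is exponential in the number of variables; it is only meant to set up the example.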


Don’t look beyond this point


A* vs. AO* Search

A* search finds a path in an “Or” graph

AO* search finds an “And” path in an And-Or graph

AO* reduces to A* if there are no AND branches

AO* typically used for problem reduction search
