3/4 The slides on quotienting were added after the class to reflect the white-board discussion in the class


Thoughts on Candidate Set semantics for Temporal Planning

Doing Temporal Planning Correctly [In search of a Complete Position-Constrained Planner]

Need

Talking about “complete” and “completely optimal” seems to make little sense unless we first define the space over which we want completeness

Qn: What is the space over which the candidate set of a temporal plan is defined? For classical planning, we know it is over “action sequences”.

Interestingly, even partial-order planners are essentially aiming for completeness over these action sequences

Dispatches as candidates

We can define candidate sets in terms of “dispatches”. A dispatch is a set of 3-tuples { <a, sa, ea> } where
• a is a ground (durative) action
• sa is the start time for the action a
• ea is the end time for the action a

For fixed-duration actions, ea is determined given sa.

Completeness, optimality, etc. should eventually be defined over these dispatches.

Quotient spaces

• The space of dispatches is “dense” when you have real-valued time points.
• It is more convenient to think of search in terms of quotient spaces defined over the space of dispatches.
• In fact, it seems necessary that we search in quotient spaces for temporal planning (especially with real-valued time), since we want the complexity of planning to be somehow related to the number of actions in the plan, and not to their durations(?)
• A quotient space essentially involves setting up disjoint equivalence classes over the base space.
• SNLP’s partial plans actually set up a quotient space over the ground operator sequences (otherwise, the space of partially ordered plans would be much larger than the space of sequences).
• There are multiple ways of setting up quotient spaces over dispatches.
• You can discuss completeness of any planner w.r.t. any legal quotient space. But some quotient spaces may be more natural for discussing some planners.

Start/End point permutations (SEPP)

• One quotient space over dispatches is to consider the space of permutations over the start and end points of actions.
• Specifically, we consider the space of sequences over the alphabet {as, ae} over all actions a, where:
  • If the sequence contains as, it must contain ae (and vice versa)
  • as must come before ae in the sequence
  • If the sequence contains end points of two actions a1 and a2, then their order must not violate the durations of the actions. If d(a1) < d(a2), then we can’t have ..a1s…a2s…a2e..a1e.. in the sequence
• Note that each element of SEPP space is a representative for a possibly infinite number of dispatches.
• Completeness over the SEPP space is a necessary condition for completeness over dispatch space.
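To make the quotienting concrete, here is a minimal Python sketch that maps a dispatch to its start/end point sequence. The function name `sepp_sequence`, the labels `a1s`/`a1e`, and the tie-breaking rule (ends before starts at the same instant) are assumptions made for illustration, not something fixed by the slides.

```python
from typing import List, Set, Tuple

# A dispatch is a set of (action, start time, end time) triples.
Dispatch = Set[Tuple[str, float, float]]

def sepp_sequence(dispatch: Dispatch) -> List[str]:
    """Map a dispatch to the ordered sequence of its start/end points."""
    events = []
    for action, start, end in dispatch:
        events.append((start, 1, action + "s"))  # start point of the action
        events.append((end, 0, action + "e"))    # end point of the action
    # Sort by time. Assumed tie-break: at equal times, end points come first.
    events.sort()
    return [label for _, _, label in events]

d1 = {("a1", 0.0, 4.0), ("a2", 1.0, 3.0)}
d2 = {("a1", 0.5, 4.5), ("a2", 2.0, 4.0)}  # same durations, shifted in time
d3 = {("a1", 0.0, 4.0), ("a2", 5.0, 7.0)}  # a2 now starts after a1 ends

print(sepp_sequence(d1))  # ['a1s', 'a2s', 'a2e', 'a1e']
print(sepp_sequence(d3))  # ['a1s', 'a1e', 'a2s', 'a2e']
```

Here d1 and d2 are different dispatches that fall into the same SEPP class, while d3 falls into a different one.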

POP space

The space of partially ordered causal-link plans that VHPOP/Zeno search in should be seen as quotienting further over the SEPP space, similar to the way SNLP plans can be seen as quotienting over the action sequences.


SAPA-space?

Another way of setting up a quotient space over dispatches is to consider specific dispatches themselves as the prototypes of an equivalence class of dispatches

Prototype-based quotient spaces

• SAPA seems to be easiest to understand in terms of associating a specific dispatch as the representative of a set of dispatches.
  • It then only searches over these dispatches
  • ..so it will be incomplete if the optimal solution of a problem is not in the space of these canonical dispatches
• The basic result of [Cushing et al 2007] can be understood as saying that there is no easy way to set up a finite set of representative dispatches that will be complete for all problems.
  • This, I believe, is the lesson of the failed quest for complete DEP planners
• Left-shifted plans as representatives?

Quotient Space & Navigation??

• Sapa can be understood as
  • trying to navigate in a quotient space of left-shifted dispatches
  • but with an incomplete navigational strategy
• Navigation is being effected through epochs
  • Our inability to find a good epoch-based navigation seems to suggest that there is no natural way to navigate this space?


Left-shifted plans

Two plans are equivalent if they have the same happening sequence

The canonical representation

Mid-term Feedback..

• 9 out of 12 gave feedback. I will post them all un-edited.
• People are generally happy (perhaps embarrassingly happy) with the way the class is going.
  • One person said it is all too overwhelming and the pace and coverage should be reduced significantly
• Readings: A mixture of reading before and after.
• Homeworks: Majority seem happy that they force them to re-read the paper.
  • There seems to be little support for “more” homework
  • One person said they should be more challenging and go beyond readings.
• Semester project: Majority seem to be getting started, and want to spend time on “their” project rather than homeworks etc.
• Interactivity: People think there is enough discussion (I beg to disagree, but I am just an instructor).
  • One person thought that there should be more discussion, and suggested design of more incentives for discussion (sort of like the blog discussion requirement)


Temporal Constraints

• Qualitative
  • Interval constraints (and algebra)
  • Point constraints (and algebra)
• Metric constraints
  • Best seen as putting distance ranges over time points
• Hybrid: allow qualitative and quantitative constraints

General temporal constraint reasoning is NP-hard. Tractable subclasses exist.

Most temporal constraint formalisms model only binary constraints.

Tradeoffs: Progression/Regression/PO Planning for metric/temporal planning

• Compared to PO, both progression and regression do a less than complete job of handling concurrency (e.g. slacks may have to be handled through post-processing).
• Progression planners have the advantage that the exact amount of a resource is known at any given state, so complex resource constraints are easier to verify. PO (and to some extent regression) will have to verify this by posting and then verifying resource constraints.
• Currently, SAPA (a progression planner) does better than TP4 (a regression planner). Both do oodles better than Zeno/IxTET.
  • However, TP4 could possibly be improved significantly by giving up the insistence on admissible heuristics
  • Zeno (and IxTET) could benefit by adapting ideas from RePOP.

Interleaving-Space: TEMPO

• Delay dispatch decisions until afterwards
• Choose
  • Start an action
  • End an action
  • Make a scheduling decision
• Solve temporal constraints
• Temporally Simple: Complete, Optimal
• Temporally Expressive: Complete, Optimal

Salvaging State-space Temporal Planning

[Figure: interleavings of the actions light, match, fuse, and fix]

Qualitative Temporal Constraints (Allen 83)

Relation        Inverse
x before y      y after x
x meets y       y met-by x
x overlaps y    y overlapped-by x
x during y      y contains x
x starts y      y started-by x
x finishes y    y finished-by x
x equals y      y equals x

[Figure: relative positions of the intervals X and Y for each relation]

Intervals can be handled directly

• The 13 relations on the previous page are primitive. The relation between a pair of intervals may well be a disjunction of these primitive ones:
  • A meets B OR A starts B
• There are “transitive” axioms for computing the relations between A and C, given the relations between A and B and between B and C:
  • A meets B & B starts C => A meets C
  • A starts B & B during C => ~ [C before A]
• Using these axioms, we can do constraint propagation directly on interval relations, to find the tightest relations between any given pair of intervals (as well as to check the consistency of a set of relations)
  • Allen’s Interval Algebra
• Intervals can also be handled in terms of their start and end points. This latter is what we will see next.

Qualitative Temporal Constraints May Be Expressed as Inequalities (Vilain, Kautz 86)

x before y      X+ < Y-
x meets y       X+ = Y-
x overlaps y    (X- < Y-) & (Y- < X+) & (X+ < Y+)
x during y      (Y- < X-) & (X+ < Y+)
x starts y      (X- = Y-) & (X+ < Y+)
x finishes y    (Y- < X-) & (X+ = Y+)
x equals y      (X- = Y-) & (X+ = Y+)

Inequalities may be expressed as binary interval relations:
X+ - Y- ∈ [-inf, 0]
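The endpoint view can be sketched in a few lines of Python. This is only an illustration: it assumes X- and X+ denote the start and end of interval x, that every interval has start < end, and the function name `allen_relation` is made up for the example.

```python
def allen_relation(x, y):
    """Return the primitive Allen relation of interval x to interval y.

    Intervals are (start, end) pairs with start < end.
    """
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "before"
    if xe == ys:
        return "meets"
    if xs == ys and xe == ye:
        return "equals"
    if xs == ys and xe < ye:
        return "starts"
    if ys < xs and xe == ye:
        return "finishes"
    if ys < xs and xe < ye:
        return "during"
    if xs < ys and ys < xe and xe < ye:
        return "overlaps"
    # None of the seven forward relations holds, so x is in an inverse
    # relation to y; classify y against x instead.
    return "inverse of " + allen_relation(y, x)

A, B, C = (0, 2), (2, 4), (2, 6)
print(allen_relation(A, B))  # meets
print(allen_relation(B, C))  # starts
print(allen_relation(A, C))  # meets
```

The three intervals at the end give one concrete instance of composing "meets" with "starts".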

Metric Constraints

• Going to the store takes at least 10 minutes and at most 30 minutes.
  → 10 ≤ [T+(store) – T-(store)] ≤ 30
• Bread should be eaten within a day of baking.
  → 0 < [T-(eating) – T+(baking)] < 1 day
• Inequalities, X+ < Y-, may be expressed as binary interval relations:
  → -inf < [X+ - Y-] < 0

Metric Time: Quantitative Temporal Constraint Networks (Dechter, Meiri, Pearl 91)

• A set of time points Xi at which events occur.
• Unary constraints
  (a0 < Xi < b0) or (a1 < Xi < b1) or . . .
• Binary constraints
  (a0 < Xj - Xi < b0) or (a1 < Xj - Xi < b1) or . . .
• Not n-ary constraints

An STN (simple temporal network) is a TCN that has no disjunctive constraints (each constraint has one interval).

TCSPs Are Visualized Using Directed Constraint Graphs

[Figure: directed constraint graph over nodes 0 to 4, with edges labeled [10,20], [30,40][60,inf], [10,20], [20,30][40,50], and [60,70]]

TCSPs vs CSPs

• TCSP is a subclass of CSPs with some important properties
  • The domains of the variables are totally ordered
  • The domains of the variables are continuous
• Most queries on TCSPs would involve reasoning over all solutions of a TCSP (e.g. earliest/latest feasible time of a temporal variable)
  • Since there are potentially an infinite number of solutions to a TCSP, we need to find a way of representing the set of all solutions compactly
  • The minimal TCSP network is such a representation

TCSP Queries (Dechter, Meiri, Pearl, AIJ91)

• Is the TCSP consistent? (Planning)
• What are the feasible times for each Xi?
• What are the feasible durations between each Xi and Xj?
• What is a consistent set of times? (Scheduling, Dispatch)
• What are the earliest possible times? (Scheduling)
• What are the latest possible times?

All of these can be done if we compute the minimal equivalent network.

Constraint Tightness & Minimal Networks

• A TCSP N1 is considered a minimal network if there is no other network N2 that has the same solutions as N1 and has at least one tighter constraint than N1
  • Tightness means there are fewer valid composite labels for the variables. This has nothing to do with the “syntactic complexity” of the constraint
    • A constraint a[1 3]b is tighter than a constraint a[0 10]b
    • A constraint a[1 1.5][1.6 1.9][1.9 2.3][2.3 4.8][5 6]b is tighter than a constraint a[0 10]b
• Computation of minimal networks, in general, involves doing two operations:
  • Intersection over constraints
  • Composition over constraints
• For each path p in the network connecting a pair of nodes a and b, find the path constraint between a and b (using composition)
• Intersect all the constraints between a pair of nodes a and b to find the tightest constraint between a and b
• Can lead to “fragmentation of constraints” in the case of disjunctive TCSPs


Union/Composition/Intersection of Temporal Constraints

Operations on Constraints: Intersection and Composition

[Figure: the directed constraint graph over nodes 0 to 4, shown twice, with edges labeled [10,20], [30,40][60,inf], [10,20], [20,30][40,50], and [60,70]]

Compose [10,20] with [30,40][60,inf] to get the constraint between 0 and 3.

An example where minimal network is different from the original one.

[Figure: nodes 0, 1, and 3 with edges 0→1 [10,20], 1→3 [30,40], and 0→3 [0,100]; after tightening, the 0→3 edge becomes [40,60]]

To compute the constraint between 0 and 3, we first compose [10,20] and [30,40] to get [40,60]; we then intersect [40,60] and [0,100] to get [40,60].
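The two operations can be sketched in Python as follows. The representation (a constraint as a sorted list of disjoint closed intervals) and the names `normalize`, `intersect`, and `compose` are assumptions made for this illustration.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]
Constraint = List[Interval]  # a disjunction of closed intervals

def normalize(intervals: Constraint) -> Constraint:
    """Sort the intervals and merge any that overlap or touch."""
    merged: Constraint = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def intersect(c1: Constraint, c2: Constraint) -> Constraint:
    """Values allowed by both constraints."""
    out = []
    for a_lo, a_hi in c1:
        for b_lo, b_hi in c2:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo <= hi:
                out.append((lo, hi))
    return normalize(out)

def compose(c1: Constraint, c2: Constraint) -> Constraint:
    """Constraint on Xk - Xi implied by c1 on Xj - Xi and c2 on Xk - Xj."""
    return normalize([(a_lo + b_lo, a_hi + b_hi)
                      for a_lo, a_hi in c1 for b_lo, b_hi in c2])

# The example from this slide: compose, then intersect.
path = compose([(10, 20)], [(30, 40)])
print(path)                          # [(40, 60)]
print(intersect(path, [(0, 100)]))   # [(40, 60)]

# The disjunctive composition from the previous slide.
print(compose([(10, 20)], [(30, 40), (60, math.inf)]))  # [(40, 60), (70, inf)]
```

The last call shows how a disjunctive constraint keeps its pieces separate under composition, which is where fragmentation comes from.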

Computing Minimal Networks Using Path Consistency

• Minimal networks for TCSPs can be computed by ensuring “path consistency”
  • For each triple of vertices i, j, k:
    C(i,k) := C(i,k) .intersection. [C(i,j) .compose. C(j,k)]
• For STPs we are guaranteed to reach fixpoint by the time we visit each constraint once
  • I.e., the outer loop executes only once.
• For disjunctive TCSPs, enforcing path consistency is NP-hard
  • Shouldn’t be surprising: consistency of disjunctive precedence constraints is NP-hard
  • “Fragmentation” happens
  • Approximation schemes possible
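For the STP case, the update rule above can be sketched directly on single intervals. This is a minimal sketch, not a definitive implementation: the function name `stp_path_consistency` is hypothetical, and the edge endpoints of the example network are an assumption, chosen to match the d-graph matrix shown on the later slides.

```python
import math

def stp_path_consistency(n, constraints):
    """constraints: {(i, j): (lo, hi)} meaning lo <= Xj - Xi <= hi.

    Returns the n x n matrix of tightened intervals, or None if some
    constraint becomes empty (the STP is inconsistent).
    """
    INF = math.inf
    C = [[(0, 0) if i == j else (-INF, INF) for j in range(n)]
         for i in range(n)]
    for (i, j), (lo, hi) in constraints.items():
        C[i][j] = (lo, hi)
        C[j][i] = (-hi, -lo)  # the same constraint read in the other direction
    for j in range(n):        # intermediate vertex in the outermost loop
        for i in range(n):
            for k in range(n):
                # C(i,k) := C(i,k) intersect [C(i,j) compose C(j,k)]
                lo = max(C[i][k][0], C[i][j][0] + C[j][k][0])
                hi = min(C[i][k][1], C[i][j][1] + C[j][k][1])
                if lo > hi:
                    return None
                C[i][k] = (lo, hi)
    return C

# Assumed edges of the example STN.
stn = {(0, 1): (10, 20), (1, 2): (30, 40), (3, 2): (10, 20),
       (3, 4): (40, 50), (0, 4): (60, 70)}
minimal = stp_path_consistency(5, stn)
print(minimal[0])  # [(0, 0), (10, 20), (40, 50), (20, 30), (60, 70)]
```

With the intermediate vertex in the outermost loop, a single pass suffices for an STP, which is the "outer loop executes only once" remark above.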

Solving Disjunctive TCSPs: Split disjunction

• Suppose we have a TCSP where just one of the constraints is disjunctive: a [1 2][5 6] b
  • We have two STPs: one contains the constraint a[1 2]b and the other contains a[5 6]b
• Disjunctive TCSPs can be solved by solving an exponential number of STPs
  • The minimal network for the DTP is the union of the minimal networks for the STPs
  • This is a brute-force method: there are exponentially many STPs, many of which have significantly overlapping constraints.
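A brute-force split can be sketched with `itertools.product`. The helper names are hypothetical, the consistency check uses Floyd-Warshall on the distance graph, and the assignment of the interval labels to edges in the example is an assumption (the slide figure does not survive in this text).

```python
import itertools
import math

def stp_consistent(n, edges):
    """edges: list of (i, j, lo, hi) meaning lo <= Xj - Xi <= hi."""
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j, lo, hi in edges:
        d[i][j] = min(d[i][j], hi)    # Xj - Xi <= hi
        d[j][i] = min(d[j][i], -lo)   # Xi - Xj <= -lo
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    # A negative entry on the diagonal means a negative cycle.
    return all(d[i][i] >= 0 for i in range(n))

def component_stps(tcsp):
    """Yield one STP for every way of picking one interval per edge."""
    keys = list(tcsp)
    for choice in itertools.product(*(tcsp[k] for k in keys)):
        yield [(i, j, lo, hi) for (i, j), (lo, hi) in zip(keys, choice)]

# Assumed edge placement for the disjunctive example network.
tcsp = {
    (0, 1): [(10, 20)],
    (1, 2): [(30, 40), (60, math.inf)],
    (3, 2): [(10, 20)],
    (3, 4): [(20, 30), (40, 50)],
    (0, 4): [(60, 70)],
}
stps = list(component_stps(tcsp))
results = [stp_consistent(5, stp) for stp in stps]
print(len(stps), results)  # 4 [True, True, True, False]
```

Two disjunctive edges with two intervals each already give four STPs; the count doubles with every further two-way disjunction.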

To Query an STN, Map to a Distance Graph Gd = < V, Ed >

Each constraint Tij = (aij ≤ Xj - Xi ≤ bij) gives two edges:
Xj - Xi ≤ bij (edge i → j with weight bij)
Xi - Xj ≤ -aij (edge j → i with weight -aij)

An edge encodes an upper bound on the distance to the target from the source.

[Figure: the STN over nodes 0 to 4 with edges labeled [10,20], [30,40], [10,20], [40,50], and [60,70], next to its distance graph with edge weights 20, -10, 40, -30, 20, -10, 50, -40, 70, and -60]

Conjoined Paths are Computed using All Pairs Shortest Path (e.g., Floyd-Warshall’s algorithm)

for i := 1 to n do dii ← 0;
for i, j := 1 to n do dij ← aij;
for k := 1 to n do
  for i, j := 1 to n do
    dij ← min{dij, dik + dkj};

[Figure: a path from i to j through the intermediate node k]
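The pseudocode can be turned into a short runnable Python sketch. The function names are made up for this example, and the edge endpoints of the STN are an assumption, read off the d-graph matrix shown on the following slides.

```python
import math

def distance_graph(n, stn):
    """stn: {(i, j): (lo, hi)} meaning lo <= Xj - Xi <= hi."""
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), (lo, hi) in stn.items():
        d[i][j] = hi     # Xj - Xi <= hi
        d[j][i] = -lo    # Xi - Xj <= -lo
    return d

def floyd_warshall(d):
    """All-pairs shortest paths; returns a new matrix."""
    n = len(d)
    d = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Assumed edges of the example STN.
stn = {(0, 1): (10, 20), (1, 2): (30, 40), (3, 2): (10, 20),
       (3, 4): (40, 50), (0, 4): (60, 70)}
d_graph = floyd_warshall(distance_graph(5, stn))
for row in d_graph:
    print(row)
```

Under that edge assumption, the printed rows match the d-graph on the next slide.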

Shortest Paths of Gd

d-graph (the entry in row i, column j is the shortest-path distance dij):

     0    1    2    3    4
0    0   20   50   30   70
1  -10    0   40   20   60
2  -40  -30    0  -10   30
3  -20  -10   20    0   50
4  -60  -50  -20  -40    0

[Figure: distance graph Gd over nodes 0 to 4 with edge weights 20, -10, 40, -30, 20, -10, 50, -40, 70, and -60]

STN Minimum Network

STN minimum network:

     0          1          2          3          4
0  [0]        [10,20]    [40,50]    [20,30]    [60,70]
1  [-20,-10]  [0]        [30,40]    [10,20]    [50,60]
2  [-50,-40]  [-40,-30]  [0]        [-20,-10]  [20,30]
3  [-30,-20]  [-20,-10]  [10,20]    [0]        [40,50]
4  [-70,-60]  [-60,-50]  [-30,-20]  [-50,-40]  [0]

d-graph:

     0    1    2    3    4
0    0   20   50   30   70
1  -10    0   40   20   60
2  -40  -30    0  -10   30
3  -20  -10   20    0   50
4  -60  -50  -20  -40    0

Testing Plan Consistency

[d-graph matrix and distance graph figure as on the “Shortest Paths of Gd” slide]

The STN is consistent iff the distance graph has no negative cycles. For example, a cycle of weight -5 through a time point TA would require -5 ≥ TA – TA = 0, which is impossible.
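One way to sketch the negative-cycle test in Python is with Bellman-Ford relaxation (a substitute chosen here for brevity; the slides themselves use all-pairs shortest paths). The function name `is_consistent`, the edge endpoints of the example STN, and the extra constraint added to break it are all assumptions for illustration.

```python
def is_consistent(n, stn):
    """stn: {(i, j): (lo, hi)} meaning lo <= Xj - Xi <= hi.

    Returns True iff the distance graph has no negative cycle.
    """
    edges = []
    for (i, j), (lo, hi) in stn.items():
        edges.append((i, j, hi))    # Xj - Xi <= hi
        edges.append((j, i, -lo))   # Xi - Xj <= -lo
    # Start as if a virtual source had a 0-weight edge to every node.
    dist = [0] * n
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True   # distances settled, so there is no negative cycle
    return False          # still relaxing after n passes: negative cycle

stn = {(0, 1): (10, 20), (1, 2): (30, 40), (3, 2): (10, 20),
       (3, 4): (40, 50), (0, 4): (60, 70)}
print(is_consistent(5, stn))  # True

# Hypothetical extra constraint: X4 - X1 in [0, 30]. It closes the cycle
# 1 -> 4 -> 0 -> 1 with weight 30 - 60 + 20 = -10.
broken = dict(stn)
broken[(1, 4)] = (0, 30)
print(is_consistent(5, broken))  # False
```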

Latest Solution

[d-graph matrix and distance graph figure as on the “Shortest Paths of Gd” slide]

Node 0 is the reference. The latest times are read off row 0 of the d-graph: (d01, . . . , d0n) = (20, 50, 30, 70).

Earliest Solution

[d-graph matrix and distance graph figure as on the “Shortest Paths of Gd” slide]

Node 0 is the reference. The earliest times are read off column 0 of the d-graph, negated: (-d10, . . . , -dn0) = (10, 40, 20, 60).

Solution: Earliest Times

[Figure: distance graph Gd over nodes 0 to 4 with edge weights 20, -10, 40, -30, 20, -10, 50, -40, 70, and -60]

S1 = (-d10, . . . , -dn0)

Scheduling: Feasible Values

[d-graph matrix as on the “Shortest Paths of Gd” slide]

• X1 in [10, 20]
• X2 in [40, 50]
• X3 in [20, 30]
• X4 in [60, 70]

The lower bounds are the earliest times and the upper bounds are the latest times.
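Reading these windows off the d-graph takes only a few lines of Python. The matrix below is copied from the slides; the helper `is_solution` is a hypothetical name, added to check that the all-earliest and all-latest assignments really satisfy every constraint.

```python
# d-graph from the slides: d[i][j] is the shortest-path distance from i to j.
d = [
    [0, 20, 50, 30, 70],
    [-10, 0, 40, 20, 60],
    [-40, -30, 0, -10, 30],
    [-20, -10, 20, 0, 50],
    [-60, -50, -20, -40, 0],
]

n = len(d)
latest = [d[0][i] for i in range(1, n)]       # row 0
earliest = [-d[i][0] for i in range(1, n)]    # column 0, negated
windows = list(zip(earliest, latest))

def is_solution(times):
    """times[i] is the value of Xi for i >= 1; X0 is the reference at 0."""
    x = [0] + list(times)
    return all(x[j] - x[i] <= d[i][j] for i in range(n) for j in range(n))

print(windows)  # [(10, 20), (40, 50), (20, 30), (60, 70)]
print(is_solution(earliest), is_solution(latest))  # True True
```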

Solution by Decomposition

[d-graph matrix as on the “Shortest Paths of Gd” slide]

• Select value for 1: 15
• Select value for 2, consistent with 1: 45
• Select value for 3, consistent with 1 & 2: 30
• Select value for 4, consistent with 1, 2 & 3

O(N^2)
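The backtrack-free assignment can be sketched as below. The function name `solve_by_decomposition` and the idea of passing a "preferred" value per node (clamped into the feasible window) are assumptions for illustration; the preferred value 60 for node 4 is made up, since the slide gives none.

```python
# d-graph from the slides: d[i][j] is the shortest-path distance from i to j.
d = [
    [0, 20, 50, 30, 70],
    [-10, 0, 40, 20, 60],
    [-40, -30, 0, -10, 30],
    [-20, -10, 20, 0, 50],
    [-60, -50, -20, -40, 0],
]

def solve_by_decomposition(d, preferred):
    """Assign X1..Xn-1 in order, each consistent with all earlier choices."""
    n = len(d)
    x = [0]  # node 0 is the reference, fixed at time 0
    for k in range(1, n):
        lo = max(x[j] - d[k][j] for j in range(k))  # from Xj - Xk <= d[k][j]
        hi = min(x[j] + d[j][k] for j in range(k))  # from Xk - Xj <= d[j][k]
        # Clamp the preferred value into the feasible window [lo, hi].
        x.append(min(max(preferred[k], lo), hi))
    return x

# 15, 45, and 30 are the slide's choices; 60 for node 4 is hypothetical.
solution = solve_by_decomposition(d, {1: 15, 2: 45, 3: 30, 4: 60})
print(solution)  # [0, 15, 45, 30, 70]
```

Each new variable is checked against all earlier ones, giving the O(N^2) total. Because the d-graph is minimal, the window is never empty; here node 4 is forced to exactly 70.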


10/30 (Don’t print hidden slides)

Multi-objective search

• Multi-dimensional nature of plan quality in metric temporal planning:
  • Temporal quality (e.g. makespan, slack: the time when a goal is needed minus the time when it is achieved)
  • Plan cost (e.g. cumulative action cost, resource consumption)
• Necessitates multi-objective optimization:
  • Modeling objective functions
  • Tracking different quality metrics and heuristic estimation
• Challenge: There may be inter-dependent relations between different quality metrics

Example

• Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
  • Less time: 3 hours; more expensive: $200
• Option 2: Tempe → Los Angeles (Car)
  • More time: 12 hours; less expensive: $50
• Given a deadline constraint (6 hours), only option 1 is viable
• Given a money constraint ($100), only option 2 is viable

[Figure: map showing Tempe, Phoenix, and Los Angeles]

Solution Quality in the presence of multiple objectives

• When we have multiple objectives, it is not clear how to define the global optimum
  • E.g. how does a <cost:5, makespan:7> plan compare to a <cost:4, makespan:9> plan?
• Problem: We don’t know what the user’s utility metric is as a function of cost and makespan.

Solution 1: Pareto Sets

• Present pareto sets/curves to the user
• A pareto set is a set of non-dominated solutions
  • A solution S1 is dominated by another solution S2 if S1 is worse than S2 in at least one objective and equal or worse in all other objectives. E.g. <C:5,M:9> is dominated by <C:4,M:9>
  • A travel agent shouldn’t bother asking whether I would like a flight that leaves at 6pm, arrives at 9pm, and costs $100, or another one that also leaves at 6 and arrives at 9 but costs $200.
• A pareto set is exhaustive if it contains all non-dominated solutions
• Presenting the pareto set allows the users to state their preferences implicitly by choosing what they like rather than by stating them explicitly.
• Problem: Exhaustive pareto sets can be large (exponentially large in many cases).
  • In practice, travel agents give you non-exhaustive pareto sets, just so you have the illusion of choice
• Optimizing with pareto sets changes the nature of the problem: you are looking for multiple solutions rather than a single one.
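The dominance test and the resulting filter can be sketched in a few lines of Python. This assumes every objective is to be minimized and represents a plan as a (cost, makespan) tuple; the function names and the sample plans are illustrative only.

```python
def dominates(a, b):
    """True if plan a dominates plan b (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_set(plans):
    """Keep the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# (cost, makespan) pairs; the values are made up for illustration.
plans = [(5, 7), (4, 9), (5, 9), (6, 6)]
print(pareto_set(plans))  # [(5, 7), (4, 9), (6, 6)]
```

The plan (5, 9) drops out because (4, 9) is cheaper and no slower; the remaining three are mutually incomparable, so the choice among them is left to the user.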

Solution 2: Aggregate Utility Metrics

• Combine the various objectives into a single utility measure
  • E.g. w1*cost + w2*make-span
    • Could model grad students’ preferences, with w1 = infinity, w2 = 0
  • Log(cost) + 5*(make-span)^25
    • Could model Bill Gates’ preferences.
• How do we assess the form of the utility measure (linear? nonlinear?), and how will we get the weights?
  • Utility elicitation process
  • Learning problem: ask tons of questions to the users and learn a utility function that fits their preferences
    • Can be cast as a sort of learning task (e.g. learn a neural net that is consistent with the examples)
    • Of course, if you want to learn a true nonlinear preference function, you will need many more examples, and the training takes much longer.
• With aggregate utility metrics, the multi-objective optimization, in theory, reduces to a single-objective optimization problem
  • *However*, if you are trying to find good heuristics to direct the search, then, since estimators are likely to be available for naturally occurring factors of the solution quality rather than random combinations thereof, we still have to follow a two-step process:
    1. Find estimators for each of the factors
    2. Combine the estimates using the utility measure
  • THIS IS WHAT IS DONE IN SAPA

Sketch of how to get cost and time estimates

• Planning graph provides “level” estimates
• Generalizing the planning graph to a “temporal planning graph” will allow us to get “time” estimates
  • For a relaxed PG, the generalization is quite simple: just use the bi-level representation of the PG, and index each action and literal by the first time point (not level) at which it can be introduced into the PG
• Generalizing the planning graph to a “cost planning graph” (i.e. propagating cost information over the PG) will get us cost estimates
  • We discussed how to do cost propagation over classical PGs. Costs of literals can be represented as monotonically reducing step functions w.r.t. levels.
• To estimate cost and time together, we need to generalize the classical PG into a temporal and cost-sensitive PG
  • Now, the costs of literals will be monotonically reducing step functions w.r.t. time points (rather than level indices)
  • This is what SAPA does

SAPA approach

• Using the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  • Estimation of the earliest time (makespan) to achieve all goals.
  • Estimation of the lowest cost to achieve goals
  • Estimation of the cost to achieve goals given a specific makespan value.
• Using this information to calculate the heuristic value for an objective function involving both time and cost
• Involves propagating cost over planning graphs..

Heuristics in Sapa are derived from the Graphplan-style bi-level relaxed temporal planning graph (RTPG)

Progression, so constructed anew for each state..

Relaxed Temporal Planning Graph

Relaxed action:
• No delete effects
  • May be okay given progression planning
• No resource consumption
  • Will adjust later

[Figure: RTPG for moving a person by airplane between locations A and B, with actions Load(P,A), Fly(A,B), Fly(B,A), Unload(P,A), and Unload(P,B) on a timeline from Init (t=0) to the goal deadline (tg)]

while (true)
  forall A ≠ advance-time applicable in S
    S = Apply(A,S)
      {Involves changing P, Π, Q, t. Update Q only with positive effects, and only when there is no other earlier event giving that effect}
  if S ⊨ G then Terminate{solution}
  S’ = Apply(advance-time,S)
  if ∃ (pi,ti) ∈ G such that ti < Time(S’) and pi ∉ S then Terminate{non-solution}   {Deadline goals}
  else S = S’
end while

RTPG is modeled as a time-stamped plan! (but Q only has +ve events)

Note: Bi-level representation; we don’t actually stack actions multiple times in the PG, we just keep track of the first time the action entered.

Heuristics directly from RTPG

• For Makespan: Distance from a state S to the goals is equal to the duration between time(S) and the time the last goal appears in the RTPG.
• For Min/Max/Sum Slack: Distance from a state to the goals is equal to the minimum, maximum, or summation of slack estimates for all individual goals using the RTPG.
  • Slack estimate is the difference between the deadline of the goal and the expected time of achievement of that goal.

These heuristics are ADMISSIBLE. Proof: All goals appear in the RTPG at times smaller than or equal to their achievable times.

[Figure: the RTPG from the “Relaxed Temporal Planning Graph” slide, shown twice]

Heuristics from Relaxed Plan Extracted from RTPG

RTPG can be used to find a relaxed solution, which is then used to estimate the distance from a given state to the goals.

• Sum actions: Distance from a state S to the goals equals the number of actions in the relaxed plan.
• Sum durations: Distance from a state S to the goals equals the summation of action durations in the relaxed plan.

[Figure: the RTPG from the “Relaxed Temporal Planning Graph” slide]

Resource-based Adjustments to Heuristics

Resource-related information, ignored originally, can be used to improve the heuristic values.

Adjusted Sum-Action:
h = h + Σ_R ⌈(Con(R) – (Init(R) + Pro(R))) / Δ_R⌉

Adjusted Sum-Duration:
h = h + Σ_R [(Con(R) – (Init(R) + Pro(R))) / Δ_R] · Dur(A_R)

Will not preserve admissibility.

The (Relaxed) Temporal PG

[Figure: relaxed temporal planning graph for travel among Tempe, Phoenix, and Los Angeles, with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), and Airplane(P,LA) on a timeline with marks at t = 0, 0.5, 1, 1.5, and 10]

Time-sensitive Cost Function

• Standard (temporal) planning graph (TPG) shows the time-related estimates, e.g. earliest time to achieve a fact or to execute an action
• TPG does not show the cost estimates to achieve facts or execute actions

Actions:
• Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
• Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
• Car(Tempe,LA): Cost: $100; Time: 10 hours
• Airplane(Phx,LA): Cost: $200; Time: 1.0 hour

[Figure: step cost function for reaching L.A., plotted as cost against time ($300 at t = 1.5, $220 at t = 2, $100 at t = 10), shown with the relaxed temporal planning graph over Tempe, Phoenix, and L.A. with marks at t = 0, 0.5, 1, 1.5, and 10]

Estimating the Cost Function

[Figure: step-by-step construction of the cost function Cost(At(LA)) over the relaxed temporal planning graph, using the four actions from the “Time-sensitive Cost Function” slide, with the labels Cost(At(LA)), Cost(At(Phx)), and Cost(Flight(Phx,LA)). Heli(T,P) reaches Phoenix at t = 0.5 and Shuttle(T,P) at t = 1 (cost $20); Airplane(P,LA) then reaches L.A. at t = 1.5 and t = 2.0, and Drive-car(Tempe,LA) at t = 10, giving the steps $300, $220, and $100]

Observations about cost functions

• Because cost functions decrease monotonically, we know that the cheapest cost is always at t_infinity (t∞), so we don’t need to look at other times
  • Cost functions will be monotonically decreasing as long as there are no exogenous events
  • Actions with time-sensitive preconditions are in essence dependent on exogenous events (which is why PDDL 2.1 doesn’t allow you to say that a precondition must be true at an absolute time point, only at a time point relative to the beginning of the action)
    • If you have to model an action such as “Take Flight” such that it can only be done with valid flights that are pre-scheduled (e.g. 9:40AM, 11:30AM, 3:15PM etc.), we can model it by having a precondition “Have-flight” which is asserted at 9:40AM, 11:30AM and 3:15PM using timed initial literals
• Because cost functions are step functions, we need to evaluate the utility function U(makespan, cost) only at a finite number of time points (no matter how complex the U(.) function is)
  • Cost functions will be step functions as long as the actions do not model continuous change (which will come in at PDDL 2.1 Level 4). If you have continuous change, then the cost functions may change continuously too

ADDED

Page 58: 3/4  The slides on quotienting were added after the class to reflect the white-board discussion in the class

Cost Propagation Issues:

At a given time point, each fact is supported by multiple actions
Each action has more than one precondition

Propagation rules:
  Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
  Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
    Sum-propagation: Σ Cost(f,t)
      The plans for individual preconds may be interacting
    Max-propagation: Max {Cost(f,t)}
    Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}

Probably other better ideas could be tried

Can't use something like the set-level idea here because that will entail tracking the costs of subsets of literals
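As a rough illustration of these rules on the Tempe-to-LA travel example, the sketch below propagates (time, cost) step points with sum aggregation. The tuple encoding, the function names `cost_at`, `aggregate` and `propagate`, and the convention that an action's cost is its execution cost plus the aggregated precondition cost at its start time are assumptions made for this example, not definitions from the slides.

```python
INF = float("inf")

# Travel example: (name, preconditions, effect, execution cost in $, duration in hours).
ACTIONS = [
    ("Shuttle(Tempe,Phx)", ["At(Tempe)"], "At(Phx)", 20, 1.0),
    ("Helicopter(Tempe,Phx)", ["At(Tempe)"], "At(Phx)", 100, 0.5),
    ("Car(Tempe,LA)", ["At(Tempe)"], "At(LA)", 100, 10.0),
    ("Airplane(Phx,LA)", ["At(Phx)"], "At(LA)", 200, 1.0),
]

def cost_at(steps, t):
    """Cheapest cost reachable by time t in a list of (time, cost) points."""
    return min((c for time, c in steps if time <= t), default=INF)

def aggregate(costs, mode="sum"):
    """Aggregate precondition costs (assumed reading of the three rules)."""
    if mode == "sum":
        return sum(costs)
    if mode == "max":
        return max(costs)
    return 0.5 * sum(costs) + 0.5 * max(costs)  # "combination" rule

def propagate(initial_fact, mode="sum"):
    """Propagate step points until no fact's cost can be improved."""
    steps = {initial_fact: [(0.0, 0)]}
    changed = True
    while changed:
        changed = False
        for _name, pres, eff, exec_cost, dur in ACTIONS:
            # Candidate start times: every point where some precondition improves.
            starts = sorted({t for p in pres for t, _ in steps.get(p, [])})
            for start in starts:
                pre_costs = [cost_at(steps.get(p, []), start) for p in pres]
                if INF in pre_costs:
                    continue  # some precondition is not yet achievable
                total = aggregate(pre_costs, mode) + exec_cost
                if total < cost_at(steps.get(eff, []), start + dur):
                    steps.setdefault(eff, []).append((start + dur, total))
                    changed = True
    return {fact: sorted(points) for fact, points in steps.items()}

cost_functions = propagate("At(Tempe)")
```

Under these assumptions, `cost_functions["At(LA)"]` contains the three steps drawn on the slides: $300 at t = 1.5, $220 at t = 2 and $100 at t = 10. Every action here has a single precondition, so the choice of aggregation rule only matters once multi-precondition actions are added.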


Termination Criteria

Deadline Termination: Terminate at time point t if:
  ∀ goal G: Deadline(G) ≤ t, or
  ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)

Fix-point Termination: Terminate at the time point t where we cannot improve the cost of any proposition.

K-lookahead approximation: At the time point t where Cost(g,t) < ∞, repeat k times the process of applying the (set of) actions that can improve the cost functions.

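[Figure: cost vs. time step function with cost levels $300, $220 and $100 and time ticks 0, 1.5, 2 and 10. Actions shown: Drive-car(Tempe,LA), H(T,P), Shuttle(T,P), Plane(P,LA) over t = 0, 0.5, 1, 1.5, ..., 10. Annotations: earliest time point; cheapest cost.]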


Heuristic estimation using the cost functions

If the objective function is to minimize time: h = t0

If the objective function is to minimize cost: h = CostAggregate(G, t∞)

If the objective function is a function of both time and cost, O = f(time,cost), then: h = min f(t, Cost(G,t)) s.t. t0 ≤ t ≤ t∞

E.g.: f(time,cost) = 100.makespan + Cost, then h = 100x2 + 220 at t0 ≤ t = 2 ≤ t∞

[Figure: Cost(At(LA)) as a step function of time, with time ticks 0, t0 = 1.5, 2, t∞ = 10 and cost levels $300, $220, $100. Earliest achieve time: t0 = 1.5. Lowest cost time: t∞ = 10.]

The cost functions carry enough information to track both the temporal and the cost metrics of the plan, and how the two depend on each other.
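The three cases can be sketched directly on the step function for Cost(At(LA)). The list `COST_AT_LA` is read off the figure; the helper names are hypothetical. The sketch evaluates f only at the step points, which is safe under the assumption that f does not decrease as time increases at a fixed cost.

```python
# (time, cost) step points of Cost(At(LA)), read off the slide's figure.
COST_AT_LA = [(1.5, 300), (2.0, 220), (10.0, 100)]

def h_time(steps):
    """Minimize time: the earliest achievement time t0."""
    return min(t for t, _ in steps)

def h_cost(steps):
    """Minimize cost: the cost at t_infinity, i.e. the lowest step."""
    return min(c for _, c in steps)

def h_combined(steps, f):
    """Minimize f(time, cost), evaluated only at the finitely many step points."""
    return min(f(t, c) for t, c in steps)

def objective(makespan, cost):
    # Example objective from the slide: 100.makespan + Cost.
    return 100 * makespan + cost

h = h_combined(COST_AT_LA, objective)
```

The candidates are 100x1.5 + 300 = 450, 100x2 + 220 = 420 and 100x10 + 100 = 1100, so the minimum is the 420 reached at t = 2, matching the slide's example.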


Heuristic estimation by extracting the relaxed plan

Relaxed plan satisfies all the goals ignoring the negative interaction:
  Takes positive interaction into account
  Serves as a base set of actions for possible adjustment according to the neglected (relaxed) information (e.g. negative interaction, resource usage etc.)

Need to find a good relaxed plan (among multiple ones) according to the objective function


Heuristic estimation by extracting the relaxed plan

General Alg.: Traverse backward searching for actions supporting all the goals. When A is added to the relaxed plan RP, then:
  Supported facts: SF = SF ∪ Effects(A)
  Goals: G = (G ∪ Precond(A)) \ SF

Temporal Planning with Cost: If the objective function is f(time,cost), then A is selected such that:
  f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew)) is minimal (Gnew = (G ∪ Precond(A)) \ Effects)

Finally, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated

[Figure: cost vs. time plot (time ticks 0, t0 = 1.5, 2, t∞ = 10; cost levels $300, $220, $100) for the Tempe, Phoenix, L.A. example, with f(t,c) = 100.makespan + Cost.]


End of 10/30 lecture


Adjusting the Heuristic Values

Ignored resource-related information can be used to improve the heuristic values (much like +ve and -ve interactions in classical planning)

Adjusted Cost:

$C = C + \sum_{R} \left\lceil \frac{Con(R) - (Init(R) + Pro(R))}{\Delta_R} \right\rceil \cdot C(A_R)$

Cannot be applied to admissible heuristics


Partialization Example

A1(10) gives g1 but deletes p
A3(8) gives g2 but requires p at start
A2(4) gives p at end
We want g1, g2

A position-constrained plan with makespan 22

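[Figure: the position-constrained plan runs A1, A2 and A3 one after another up to G; p links A2 to A3, and g1 and g2 are the goals.]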

Order-constrained plan: [et(A1) <= et(A2)] or [st(A1) >= st(A3)]; [et(A2) <= st(A3)]; ….

The best-makespan dispatch of the order-constrained plan has makespan 14+ [Figure: A1 on one row; A2 followed by A3 on another.]

There could be multiple o.c. plans because of multiple possible causal sources. Optimization will involve going through them all.
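The two makespans in this example can be checked with a small sketch that tries each disjunct of the ordering constraint and computes the earliest-start dispatch. The encoding of constraints as `(x, y, w)` triples meaning st(y) >= st(x) + w, and the function name `earliest_makespan`, are assumptions for illustration. Only the constraints spelled out on the slide are modeled.

```python
DUR = {"A1": 10, "A2": 4, "A3": 8}

def earliest_makespan(constraints):
    """Earliest-start dispatch for constraints (x, y, w), each meaning st(y) >= st(x) + w."""
    st = {a: 0 for a in DUR}
    # Longest-path relaxation; len(DUR) passes suffice when the constraints are acyclic.
    for _ in range(len(DUR)):
        for x, y, w in constraints:
            st[y] = max(st[y], st[x] + w)
    return max(st[a] + DUR[a] for a in DUR), st

# et(A2) <= st(A3)  ->  st(A3) >= st(A2) + 4
common = [("A2", "A3", DUR["A2"])]
# Disjunct 1: et(A1) <= et(A2)  ->  st(A2) >= st(A1) + (10 - 4)
first = common + [("A1", "A2", DUR["A1"] - DUR["A2"])]
# Disjunct 2: st(A1) >= st(A3)
second = common + [("A3", "A1", 0)]

# The p.c. plan runs the actions back to back: 10 + 4 + 8 = 22.
pc_makespan = sum(DUR.values())
best = min(earliest_makespan(first)[0], earliest_makespan(second)[0])
```

With non-strict constraints the first disjunct gives 18 and the second gives 14, in line with the slide's 14+ for the best dispatch versus 22 for the position-constrained plan.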


Problem Definitions

Position constrained (p.c) plan: The execution time of each action is fixed to a specific time point
  Can be generated more efficiently by state-space planners

Order constrained (o.c) plan: Only the relative orderings between actions are specified
  More flexible solutions, causal relations between actions

Partialization: Constructing an o.c plan from a p.c plan

[Figure: a p.c plan with actions fixed at time points t1, t2, t3 next to the corresponding o.c plan, both leading from {Q} to {G} through facts Q, R and G.]


Validity Requirements for a partialization

An o.c plan Poc is a valid partialization of a valid p.c plan Ppc, if:
  Poc contains the same actions as Ppc
  Poc is executable
  Poc satisfies all the top level goals
  (Optional) Ppc is a legal dispatch (execution) of Poc
  (Optional) Poc contains no redundant ordering relations

[Figure: example with facts P and Q in which one ordering relation is marked (X) as redundant.]
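The two optional requirements lend themselves to a quick check. The sketch below assumes each ordering (a, b) means et(a) <= st(b), and treats an ordering as redundant when it is already implied by a chain of other orderings. The function names and the three-action example are hypothetical.

```python
def is_legal_dispatch(start, dur, orderings):
    """True if every ordering (a, b), read as et(a) <= st(b), holds in the dispatch."""
    return all(start[a] + dur[a] <= start[b] for a, b in orderings)

def redundant_orderings(orderings):
    """Orderings (a, b) already implied by a path a -> ... -> b through other orderings."""
    succ = {}
    for a, b in orderings:
        succ.setdefault(a, set()).add(b)

    def reachable(src, dst, skip):
        # Depth-first search from src to dst that never uses the edge `skip`.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in succ.get(node, ()):
                if (node, nxt) == skip or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {(a, b) for a, b in orderings if reachable(a, b, (a, b))}

# Hypothetical p.c. plan (start times, durations) and o.c. orderings.
start = {"A": 0, "B": 2, "C": 5}
dur = {"A": 2, "B": 3, "C": 1}
orderings = {("A", "B"), ("B", "C"), ("A", "C")}

legal = is_legal_dispatch(start, dur, orderings)
redundant = redundant_orderings(orderings)
```

Here the dispatch is legal, and A before C is reported as redundant because it follows from A before B and B before C.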


Greedy Approximations

Solving the optimization problem for makespan and number of orderings is NP-hard (Bäckström, 1998)

Greedy approaches have been considered in classical planning (e.g. [Kambhampati & Kedar, 1993], [Veloso et al., 1990]):
  Find a causal explanation of correctness for the p.c plan
  Introduce just the orderings needed for the explanation to hold
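A minimal sketch of this two-step idea for a classical (sequential) plan follows. It is an illustration in the spirit of the cited greedy approaches, not a reproduction of either algorithm: it picks the latest earlier adder as the supporter of each precondition, then adds only the orderings that keep deleters out of that causal link. The plan encoding and the function name `greedy_orderings` are assumptions, and goal support is left out for brevity.

```python
def greedy_orderings(plan, init):
    """plan: list of (name, pre, add, delete) tuples in p.c. execution order."""
    orderings = set()
    for i, (name, pre, _add, _del) in enumerate(plan):
        for p in pre:
            # Causal explanation: latest earlier action that adds p (None = initial state).
            supporter = next(
                (j for j in range(i - 1, -1, -1) if p in plan[j][2]), None
            )
            if supporter is None:
                assert p in init, f"{name} has unsupported precondition {p}"
            else:
                orderings.add((plan[supporter][0], name))
            # Orderings needed for the explanation to hold: keep deleters of p
            # outside the interval between the supporter and the consumer.
            for k, (other, _pre, _add2, deletes) in enumerate(plan):
                if p not in deletes or k == i:
                    continue
                if k > i:
                    orderings.add((name, other))
                elif supporter is not None and k < supporter:
                    orderings.add((other, plan[supporter][0]))
    return orderings

# Partialization example read as a sequential plan A1, A2, A3.
plan = [
    ("A1", set(), {"g1"}, {"p"}),
    ("A2", set(), {"p"}, set()),
    ("A3", {"p"}, {"g2"}, set()),
]
result = greedy_orderings(plan, init=set())
```

On this input the sketch returns A1 before A2 and A2 before A3. That is a valid explanation, but it commits to the ordering already present in the p.c. plan, which is why greedy partialization need not find the best-makespan o.c. plan.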


Modeling greedy approaches as value ordering strategies

Variation of the [Kambhampati & Kedar, 1993] greedy algorithm for temporal planning as value ordering:

Supporting variables: $S^p_A = A'$ such that:
  $et^p_{A'} < st^p_A$ in the p.c plan $P_{pc}$
  $\nexists B$ s.t.: $et^p_{A'} < et^{\neg p}_B < st^p_A$
  $\nexists C$ s.t.: $et^p_C < et^p_{A'}$ and $C$ satisfies the two conditions above

Ordering and interference variables:
  $\rho^p_{AB} = {<}$ if $et^{\neg p}_B < st^p_A$; $\rho^p_{AB} = {>}$ if $st^{\neg p}_B > st^p_A$
  $\Theta^r_{AA'} = {<}$ if $et^r_A < st^r_{A'}$ in $P_{pc}$; $\Theta^r_{AA'} = {>}$ if $st^r_A > et^r_{A'}$ in $P_{pc}$; $\Theta^r_{AA'} = \bot$ otherwise.

Key insight: We can capture many of the greedy approaches as specific value ordering strategies on the CSOP encoding


Empirical evaluation

Objective: Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs.

Testing problems: Randomly generated logistics problems from TP4 (Haslum & Geffner)
  Load/unload(package,location): Cost = 1; Duration = 1;
  Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0;
  Flight(airport1,airport2): Cost = 15.0; Duration = 3.0;
  Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0;