3/25: Leaving STRIPS Planning and going to Sapa


Administrivia 3/25

Homework 4 due next class
Midterm soon after that
  Will be take home
  Will have a “Shock and Awe” flavor
  You can be an “embedded exam taker” by suggesting problems

Today:
  Metric/Temporal planning (MTP)
    Representation issues
    Modeling MTP in Progression and Regression
    And Graphplan and PO planning etc. etc.


Metric Temporal Planning

Time:
  Durative actions; temporal constraints on goals (deadlines; inter-goal constraints, e.g. make sure to be at the airport 2 hours before you are on the plane); exogenous events
  Durations may be static or “dynamic”: the duration depends on the context, e.g. the time to fill your gas tank depends on how empty the tank is to begin with
  Advanced issues: uncertain durations…
  Modeling issues: When are preconditions needed? How long will they persist? When are effects given?
    A default assumption is that all preconditions are needed at the beginning and must hold during the action’s entire duration, and that all effects become available at the end of the action
    E.g. consider the “Grading homeworks” action: when are the homeworks needed? When are the grades available? What does your teacher tell you?
  Planning issues:
    How to support concurrency? (see next slide)
    How to support multi-objective (cost/makespan/robustness) optimization
Resources:
  Actions may consume/produce (continuous quantity) “resources”
  Modeling issues: how to model resource availability (especially over the duration of an action)
  Planning issues: how to efficiently reason with continuous quantities during planning
Special cases:
  TP: Temporal planning
  RP: Resource planning


Concurrency

Suppose I tell you that a plan P contains actions A1…A10, each with duration d1…d10. What is the makespan (execution duration) of P?
  Makespan(P) >= max(d1…d10)
  If Makespan(P) = Sum(d1…d10), then it is a strictly serial plan
  If Makespan(P) > Sum(d1…d10), then there is idle time in the plan
    Actions don’t need to start right after the preceding action
    Think of the bank teller gossiping with his colleague in between servicing each customer
Planned idle/slack time may not always be a bad thing: it can sometimes improve the robustness of the plan
  Think of three travel plans involving connections in Minneapolis: plan 1 schedules 5 min for connection time; plan 2 schedules 1 hour; plan 3 schedules 2 days. Which one is better (all else being equal)?
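
The makespan bounds on this slide can be checked mechanically. Below is a minimal Python sketch, assuming every action in the plan comes with an explicit start time; the names ScheduledAction, makespan and classify are illustrative and do not come from any planner discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledAction:
    name: str
    start: float     # start time chosen by the plan (assumed given)
    duration: float  # d_i in the slide's notation


def makespan(plan):
    """Clock time from the earliest start to the latest end."""
    first_start = min(a.start for a in plan)
    last_end = max(a.start + a.duration for a in plan)
    return last_end - first_start


def classify(plan):
    """Compare the makespan against the sum of durations, as on the slide."""
    total = sum(a.duration for a in plan)
    span = makespan(plan)
    if span > total:
        return "idle time"    # Makespan(P) > Sum(d_i)
    if span == total:
        return "serial"       # Makespan(P) = Sum(d_i)
    return "concurrent"       # Makespan(P) < Sum(d_i): some actions overlap


# Hypothetical two-action plans
serial_plan = [ScheduledAction("A1", 0, 3), ScheduledAction("A2", 3, 2)]
overlapped_plan = [ScheduledAction("A1", 0, 3), ScheduledAction("A2", 1, 2)]
idle_plan = [ScheduledAction("A1", 0, 3), ScheduledAction("A2", 5, 2)]
```

With non-integer durations the equality test would need a tolerance; integers are used here to keep the comparison exact.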


Some Brand Names

Planners that can handle similar types of temporal and resource constraints: TLPlan, HSTS, IxTeT, Zeno, SAPA
  TLPlan and SAPA are progression-based planners
  HSTS, IxTeT and Zeno are partial-order-based planners
  TLPlan and HSTS are domain-customized planners; the rest are domain independent
Planners that can handle a subset of constraints:
  Only temporal: TGP, TPG, LPGP
  Only resources: LPSAT, GRT-R, Kautz-Walser
  Subset of temporal and resource constraints: TP4, Resource-IPP
  LPGP and LPSAT are “loosely-coupled” systems
    LPSAT connects SAT and LP solvers; LPGP connects Graphplan and an LP solver
    Issues of how “tight” the loose connection is
  TGP, TPG and LPGP are Graphplan-based
  LPSAT is based on SAT encodings being sent to LP solvers
  Kautz-Walser is based solely on LP encodings


Approaches for MTP

In theory, pretty much every one of the approaches we saw for classical planning can be (and has been) extended to MTP (with varying degrees of scalability)

There are some interesting tradeoffs
  PO planners are easiest to extend to support the concurrency needed for durative actions
    Have a harder time handling resources (because resource consumption depends on exactly what actions occurred before this time point)
  Progression planners are easiest to extend to support resource-consuming actions
    But have a harder time handling concurrency (need to consider “advancing the clock” as a separate option in addition to applying one of the actions)
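
The “advancing the clock” option can be made concrete with a small successor function. This is a simplified sketch, not Sapa's actual search code: it assumes preconditions are checked only at the start of an action and that all effects are additions delivered at the end (the default assumption from the earlier slide). The names Action, State and successors are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    duration: float
    preconditions: FrozenSet[str]  # checked at start only (simplification)
    effects: FrozenSet[str]        # facts added at the end (simplification)


@dataclass(frozen=True)
class State:
    time: float
    facts: FrozenSet[str]
    # Queue of (time, facts to add) for actions already started, sorted by time
    pending: Tuple[Tuple[float, FrozenSet[str]], ...] = ()


def successors(state, actions):
    """Return (label, next_state) pairs: start an action now, or advance the clock."""
    result = []
    for a in actions:
        if a.preconditions <= state.facts:
            event = (state.time + a.duration, a.effects)
            pending = tuple(sorted(state.pending + (event,), key=lambda e: e[0]))
            # Starting an action does not move the clock; it only queues its effects
            result.append((a.name, State(state.time, state.facts, pending)))
    if state.pending:
        # The extra option: jump to the next queued event and apply its effects
        t, added = state.pending[0]
        result.append(("advance-clock", State(t, state.facts | added, state.pending[1:])))
    return result


# Hypothetical one-action domain
load = Action("load", 2, frozenset({"at-depot"}), frozenset({"loaded"}))
initial = State(0, frozenset({"at-depot"}))
```

Because starting an action leaves the clock where it is, several actions can be started at the same time point, which is how concurrency arises in this style of search. A real planner would also handle delete effects, resource updates and conflicts between concurrent actions.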


3/27: Our Road Map

Will focus on conjunctive planning approaches, with special attention to Sapa
  Action models
    Using the PDDL 2.1 standard
  How to model the search
    Progression; Regression; PO planning
  How to extract good heuristics


Action Representation

[Figure: the Flying action. Labels: (in-city ?airplane ?city1); (fuel ?airplane) > 0; (in-city ?airplane ?city1), (in-city ?airplane ?city2); consume (fuel ?airplane)]

Durative, with EA = SA + DA (end time = start time + duration)
Instantaneous effects e at time te = SA + d, 0 <= d <= DA
Preconditions need to be true at the starting point, and protected during a period of time d, 0 <= d <= DA
Action can consume or produce a continuous amount of some resource

Action conflicts:
  Consuming the same resource
  One action’s effect conflicting with the other’s precondition or effect


PDDL 2.1 (Level 2): Pure Durative Actions

(:durative-action burn_match
  :parameters ()
  :duration (= ?duration 15)
  :condition (and (at start have_match)
                  (at start have_strikepad))
  :effect (and (at start have_light)
               (at end (not have_light))))

[Figure: timeline for burn_match (dur: 15). At start: have_match, have_strikepad needed; have_light added. At end: ~have_light]

(:durative-action cross_cellar
  :parameters ()
  :duration (= ?duration 10)
  :condition (and (at start have_light)
                  (over all have_light)
                  (at start at_steps))
  :effect (and (at start (not at_steps))
               (at start crossing)
               (at end at_fuse_box)))

[Figure: timeline for cross_cellar (dur: 10). At start: have_light, at_steps needed; ~at_steps, crossing added. Over all: have_light. At end: at_fuse_box]


PDDL 2.1 Level 3: Durative actions and numeric quantities (but discrete effects)

The entire energy to be consumed is “encumbered” at the very beginning (even though it gets consumed slowly over the full duration).


PDDL 2.1 Level 4: Durative actions and numeric quantities (with continuous effects)


Issues in modeling continuous change by discrete vs. continuous effects

Consider the action of boiling a pan of water
  The quantity “temperature of water” changes continuously over the duration of the action
We can ignore continuous effects by specifying that the temperature is 100 °C at the end
  Easy to handle, but the temperature can only be accessed at the end of the action
  Reduces concurrency (what if we also put a blow torch to the pan to “hasten” the process?)
Or we can specify that the temperature of the water rises at a linear rate until it reaches 100 °C
  Harder to handle, but allows more concurrency (the total rate of increase is the sum of all the individual rates of increase)
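
The additive-rate point can be illustrated numerically. The sketch below assumes each concurrent heating action contributes a constant rate in degrees per minute and that the temperature is capped at the target; the function names and the rates used in the example are made up for illustration.

```python
def time_to_reach(start_temp, target_temp, rates):
    """Minutes until the target temperature is reached under summed linear rates."""
    total_rate = sum(rates)  # concurrent continuous effects add up
    if total_rate <= 0:
        raise ValueError("the temperature never rises with a non-positive total rate")
    return max(0.0, (target_temp - start_temp) / total_rate)


def temperature_at(start_temp, target_temp, rates, minutes):
    """Temperature after the given time, capped at the target."""
    return min(target_temp, start_temp + sum(rates) * minutes)


# Hypothetical numbers: water at 20 degrees, stove and blow torch each add 8 degrees/min
stove_only = time_to_reach(20, 100, [8])
stove_and_torch = time_to_reach(20, 100, [8, 8])
```

Under the discrete model the only available fact is "100 °C at the end of the boil action", so the shortened time with the blow torch cannot be expressed; with rates, the intermediate temperature is available at any time point.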


PDDL 2.1 Standard: Summary

Durations
  Static and dynamic durations allowed
  Also allows duration inequalities
Preconditions
  Can be “at start”, “at end” or “over all” (throughout the duration)
  Doesn’t model preconditions being needed for arbitrary durations in the middle
Effects
  Can be “at start” or “at end”
  This makes effects “discrete”
Numeric quantities
  Can be present in the preconditions or effects
  Presence in the effects can be “discrete” (“at start”/“at end”) or continuous
    Continuous change is specified by giving a “rate” at which the quantity changes
    Non-linear rates are harder


State of the Art (as of IPC2002)

At IPC 2002, the PDDL 2.1 standard was exercised at three levels; no planner competed at a fourth
  Level 1: STRIPS/ADL
  Level 2: + Durative actions
    FF, LPG, SAPA…
  Level 3: + Numeric quantities with discrete change
    Sapa, LPG
  Level 4: + Continuous change
    None at IPC
    Some planners can handle it “in theory” but none are scalable


Problem Representation

Achievement goals are specified as a list <pi,ti>, where pi needs to hold by time ti
  ti is the deadline by which pi must hold. It can be metric time (e.g. make clear(b) true by 2pm)
  If ti is omitted, we assume that pi is a non-deadline goal (it must be true by the time the plan is done)
“Persist goals” are specified as a condition and an interval over which it must hold
  A persist goal may be supported by different actions for different parts of the interval (“goal reduction” a la ZENO)
  E.g. striking multiple matches to have light over a duration


Plan representation

[Figure: Gantt chart of a plan with actions A1, A2, A3; labels include Drive(cityA,cityB) and QAt(truck,B)]

An executable plan must provide
  the actions that need to be executed
  the start times for each of the actions
Or a set of simple temporal constraints on the set of actions (S.T.C. are a generalization of partial orders)
  E.g. A1 --[4,5]--> A2 (means 4 <= ST(A2) - ST(A1) <= 5)

Plan views: PERT and Gantt charts
  The Gantt chart is what is shown on the right
  PERT shows the causal links
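
A set of simple temporal constraints can be checked against concrete start times in a few lines. This sketch assumes constraints are given as tuples (a, b, lo, hi) meaning lo <= ST(b) - ST(a) <= hi; that encoding and the function name satisfies are choices made for this example only.

```python
import math


def satisfies(start_times, constraints):
    """True if every constraint lo <= ST(b) - ST(a) <= hi holds."""
    return all(
        lo <= start_times[b] - start_times[a] <= hi
        for (a, b, lo, hi) in constraints
    )


# The slide's example: A1 --[4,5]--> A2
stc = [("A1", "A2", 4, 5)]
on_time = satisfies({"A1": 0, "A2": 4.5}, stc)
too_late = satisfies({"A1": 0, "A2": 6}, stc)

# A plain ordering "A2 starts no earlier than A1" is the special case [0, infinity)
ordering = [("A1", "A2", 0, math.inf)]
ordered = satisfies({"A1": 0, "A2": 6}, ordering)
```

The last example shows why simple temporal constraints generalize partial orders: an ordering is a constraint with only a lower bound. Checking given start times is the easy direction; finding start times that satisfy a constraint set needs a consistency algorithm over the constraint network, which is not shown here.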


Plan Quality Measures

Makespan: clock time for the execution of the plan (more concurrency, lower makespan)
Slack: the difference between the deadline for a goal and the time by which the plan achieves it
  Tardiness is negative slack
  Optimize max/min/average slack/tardiness measures
Cost: sum of the costs of all the actions
  Can be split into multiple dimensions, one corresponding to each resource

[Figure: Gantt chart of a plan with actions A1, A2, A3; labels include Drive(cityA,cityB) and QAt(truck,B)]

Can two plans with same make-span have different slack measures?
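
One way to explore the question is to compute the slack measures for two small plans. The sketch below assumes a plan is summarized by the time at which it achieves each goal, that plans start at time 0, and that the plan ends when its last goal is achieved; the goal names, deadlines and times are invented for illustration.

```python
def slack_summary(deadlines, achieved):
    """Min, max and average slack, where slack = deadline - achievement time."""
    slacks = [deadlines[g] - achieved[g] for g in deadlines]
    return {
        "min": min(slacks),
        "max": max(slacks),
        "avg": sum(slacks) / len(slacks),
    }


# Hypothetical deadlines and two plans that both finish at time 10
deadlines = {"g1": 8, "g2": 10}
plan_a = {"g1": 3, "g2": 10}  # achieves g1 early
plan_b = {"g1": 8, "g2": 10}  # achieves g1 exactly at its deadline

makespan_a = max(plan_a.values())  # assumption: plan ends with its last goal
makespan_b = max(plan_b.values())
summary_a = slack_summary(deadlines, plan_a)
summary_b = slack_summary(deadlines, plan_b)
```

In this example both plans have makespan 10 and the same minimum slack, yet their maximum and average slack differ, so makespan alone does not determine the slack measures.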