An LP-Based Heuristic for Optimal Planning

Menkes van den Briel
Department of Industrial Engineering
Arizona State University
menkes@asu.edu

Subbarao Kambhampati
Department of Computer Science
Arizona State University

Thomas Vossen
Leeds School of Business
University of Colorado at Boulder

J. Benton
Department of Computer Science
Arizona State University

http://rakaposhi.eas.asu.edu/yochan/



What is automated planning?

[Figure: example states over locations loc1 and loc2]

• Initial state: $s_0 \in S$
• Goal: $s_* \in S$
• Action: $a = \langle \text{pre}, \text{post}, \text{prevail} \rangle$
• Plan: $P = \langle a_1, \ldots, a_n \rangle$
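The definitions above can be made concrete with a small sketch. It assumes a SAS+-style reading of the slide: `pre` and `post` give the value an action changes a variable from and to, and `prevail` gives values that must hold but stay unchanged. The `Action` class, the helper names, and the instance (truck at l2, package at l1, goal package at l2, read off the feasible plan shown later in the deck) are illustrative assumptions, not the authors' code.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical SAS+-style action: a = <pre, post, prevail>."""
    name: str
    pre: dict = field(default_factory=dict)      # variable -> value before the change
    post: dict = field(default_factory=dict)     # variable -> value after the change
    prevail: dict = field(default_factory=dict)  # variable -> value required, unchanged


def apply_action(state, action):
    """Return the successor state, or raise if the action is not applicable."""
    required = {**action.pre, **action.prevail}
    if any(state[var] != val for var, val in required.items()):
        raise ValueError(f"{action.name} is not applicable in {state}")
    return {**state, **action.post}


def run_plan(s0, plan):
    """Execute the plan P = <a1, ..., an> from the initial state s0."""
    state = dict(s0)
    for action in plan:
        state = apply_action(state, action)
    return state


def drive(src, dst):
    return Action(f"Drive({src},{dst})", pre={"truck": src}, post={"truck": dst})


def load(loc):
    return Action(f"Load(p1,t1,{loc})", pre={"package": loc},
                  post={"package": "T"}, prevail={"truck": loc})


def unload(loc):
    return Action(f"Unload(p1,t1,{loc})", pre={"package": "T"},
                  post={"package": loc}, prevail={"truck": loc})


# Assumed instance: truck starts at l2, package at l1, goal is package at l2.
s0 = {"truck": "l2", "package": "l1"}
goal = {"package": "l2"}
plan = [drive("l2", "l1"), load("l1"), drive("l1", "l2"), unload("l2")]

final = run_plan(s0, plan)
goal_reached = all(final[var] == val for var, val in goal.items())
```

Keeping `prevail` separate from `pre` mirrors the slide's three-part action and makes it easy to tell which variables an action changes and which it only reads.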

Motivation

• Why heuristics?
– Heuristic state-space search has been very successful in solving automated planning problems
• Why optimal planning?
– Real-world planning applications require optimal or near-optimal solutions
  • The difference between a (near-)optimal solution and a feasible solution may be the difference between winning or losing the interest of an investor or strategic partner

LP-based heuristic

Relax the ordering of the actions

Set up an integer programming formulation

Solve the LP relaxation and use the objective function value as an admissible distance estimate

Strengthen the formulation by adding valid inequalities
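One way to see why the LP objective value is admissible is the chain of bounds below. The symbols $h^*$, $z_{IP}$, and $z_{LP}$ are introduced here for illustration and do not appear on the slides.

```latex
% h^*    : length of an optimal plan
% z_{IP} : optimal value of the integer program (action ordering relaxed)
% z_{LP} : optimal value of its LP relaxation
z_{LP} \;\le\; z_{IP} \;\le\; h^*
```

The second inequality holds provided the action counts of every valid plan satisfy the order-relaxed constraints; the first holds because dropping integrality only enlarges the feasible region. Valid inequalities cut off fractional points without removing integer solutions, so they can raise $z_{LP}$ but never beyond $z_{IP}$.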

Action selection formulation

• Represent the planning problem as a set of loosely coupled network flow problems
– Each state variable defines one network flow problem
– Nodes correspond to the state variable values
– Arcs correspond to state variable transitions
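The mapping from actions to networks can be sketched as follows. The tuple encoding `(name, pre, post, prevail)`, the function name `build_dtgs`, and the value labels are assumptions for this example; the action names come from the logistics example used in the deck.

```python
from collections import defaultdict

# Each action is (name, pre, post, prevail); the encoding is an assumption
# made for this sketch, not the authors' data structure.
ACTIONS = [
    ("Drive(l1,l2)", {"truck": "1"}, {"truck": "2"}, {}),
    ("Drive(l2,l1)", {"truck": "2"}, {"truck": "1"}, {}),
    ("Load(p1,t1,l1)", {"package": "1"}, {"package": "T"}, {"truck": "1"}),
    ("Load(p1,t1,l2)", {"package": "2"}, {"package": "T"}, {"truck": "2"}),
    ("Unload(p1,t1,l1)", {"package": "T"}, {"package": "1"}, {"truck": "1"}),
    ("Unload(p1,t1,l2)", {"package": "T"}, {"package": "2"}, {"truck": "2"}),
]


def build_dtgs(actions):
    """Build one domain transition graph (network) per state variable."""
    dtgs = defaultdict(lambda: {"nodes": set(), "arcs": [], "prevail": []})
    for name, pre, post, prevail in actions:
        # An effect on a variable becomes an arc between two of its values.
        for var, new_value in post.items():
            old_value = pre[var]
            dtgs[var]["nodes"].update((old_value, new_value))
            dtgs[var]["arcs"].append((old_value, new_value, name))
        # A prevail condition attaches the action to a node without moving flow.
        for var, value in prevail.items():
            dtgs[var]["nodes"].add(value)
            dtgs[var]["prevail"].append((value, name))
    return dict(dtgs)


dtgs = build_dtgs(ACTIONS)
for var, graph in sorted(dtgs.items()):
    print(var, sorted(graph["nodes"]), len(graph["arcs"]), "arcs")
```

The networks are only loosely coupled: the package and truck graphs share no arcs, and the sole link between them is the prevail list on the truck nodes.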

Simple logistics example

[Figure: domain transition graphs for a package and a truck over locations loc1 and loc2. DTG_Package1 has nodes 1, 2, and T, with arcs Load(p1,t1,l1), Load(p1,t1,l2), Unload(p1,t1,l1), and Unload(p1,t1,l2). DTG_Truck1 has nodes 1 and 2, with arcs Drive(l1,l2) and Drive(l2,l1); Load and Unload labels are attached to its nodes.]

Action selection formulation

• Variables
– $x_a \in \mathbb{Z}_+$ for $a \in A$; $x_a$ is equal to the number of times action $a$ is executed (no time indices, no upper bound)
• Objective function
– $\min \sum_{a \in A} x_a$
• Constraints, for all $c \in C$, $f \in V_c$

$$\sum_{e \in V_c^+(f):\, a \in A_c^E(e)} x_a \;-\; \sum_{e \in V_c^-(f):\, b \in A_c^E(e)} x_b \;=\; \begin{cases} 1 & \text{if } f \ne s_0[c],\ f = s_*[c] \\ -1 & \text{if } f = s_0[c],\ f \ne s_*[c] \\ 0 & \text{otherwise} \end{cases}$$

$$x_a \;\le\; M \sum_{e \in V_c^+(f):\, b \in A_c^E(e)} x_b \qquad \text{for all } f \ne s_0[c],\ a \in A_c^V(f)$$
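The constraints can be checked directly on the logistics example. The sketch below does not solve the LP; it only tests whether a given vector of action counts satisfies flow balance and the big-M prevail link. Several points are assumptions rather than statements from the slides: the initial state (truck at 2, package at 1), the goal (package at 2), the value M = 10, and the treatment of the truck, which has no goal value and is therefore given a relaxed balance so that its flow may end at any node.

```python
from fractions import Fraction

# name -> (pre, post, prevail); encoding assumed for this sketch.
ACTIONS = {
    "Drive(l1,l2)": ({"truck": "1"}, {"truck": "2"}, {}),
    "Drive(l2,l1)": ({"truck": "2"}, {"truck": "1"}, {}),
    "Load(p1,t1,l1)": ({"package": "1"}, {"package": "T"}, {"truck": "1"}),
    "Load(p1,t1,l2)": ({"package": "2"}, {"package": "T"}, {"truck": "2"}),
    "Unload(p1,t1,l1)": ({"package": "T"}, {"package": "1"}, {"truck": "1"}),
    "Unload(p1,t1,l2)": ({"package": "T"}, {"package": "2"}, {"truck": "2"}),
}
DOMAIN = {"truck": ["1", "2"], "package": ["1", "2", "T"]}
S0 = {"truck": "2", "package": "1"}  # assumed initial state
GOAL = {"package": "2"}              # assumed goal; the truck has none


def inflow(x, var, f):
    """Total count of actions whose effect moves var into value f."""
    return sum(x.get(name, 0) for name, (_, post, _) in ACTIONS.items()
               if post.get(var) == f)


def outflow(x, var, f):
    """Total count of actions whose effect moves var out of value f."""
    return sum(x.get(name, 0) for name, (pre, post, _) in ACTIONS.items()
               if var in post and pre.get(var) == f)


def satisfies(x, M):
    """Check flow balance and prevail linking for the action counts x."""
    for var, values in DOMAIN.items():
        for f in values:
            net = inflow(x, var, f) - outflow(x, var, f)
            if var in GOAL:
                rhs = (int(f == GOAL[var] and f != S0[var])
                       - int(f == S0[var] and f != GOAL[var]))
                if net != rhs:
                    return False
            elif net < (-1 if f == S0[var] else 0):
                # Assumption: without a goal value, flow may end at any node.
                return False
            if f != S0[var]:
                for name, (_, _, prevail) in ACTIONS.items():
                    needs_f = prevail.get(var) == f
                    if needs_f and x.get(name, 0) > M * inflow(x, var, f):
                        return False
    return True


M = 10
plan_counts = {"Drive(l2,l1)": 1, "Load(p1,t1,l1)": 1,
               "Drive(l1,l2)": 1, "Unload(p1,t1,l2)": 1}
lp_counts = {"Load(p1,t1,l1)": 1, "Unload(p1,t1,l2)": 1,
             "Drive(l2,l1)": Fraction(1, M)}
no_drive = {"Load(p1,t1,l1)": 1, "Unload(p1,t1,l2)": 1}

plan_ok, plan_obj = satisfies(plan_counts, M), sum(plan_counts.values())
lp_ok, lp_obj = satisfies(lp_counts, M), sum(lp_counts.values())
no_drive_ok = satisfies(no_drive, M)
```

Exact fractions are used so the tight constraint $x_a = M \cdot (1/M)$ is compared without floating-point error. Under these assumptions both the plan counts and the fractional vector pass the check, while dropping the drive entirely fails the prevail link.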


Simple logistics example

Feasible plan

• $x_{Drive(l2,l1)} = 1$
• $x_{Load(p1,t1,l1)} = 1$
• $x_{Drive(l1,l2)} = 1$
• $x_{Unload(p1,t1,l2)} = 1$

Objective value: 4

Plan: Drive(l2,l1), Load(p1,t1,l1), Drive(l1,l2), Unload(p1,t1,l2)

[Figure: DTG_Package1 and DTG_Truck1 with the arcs used by the plan highlighted]

Simple logistics example

LP solution

• $x_{Load(p1,t1,l1)} = 1$
• $x_{Unload(p1,t1,l2)} = 1$
• $x_{Drive(l2,l1)} = 1/M$

Objective value: $2 + 1/M$

Drive(l2,l1) Load(p1,t1,l1) Unload(p1,t1,l2) … …

[Figure: DTG_Package1 and DTG_Truck1 with the arcs used by the LP solution highlighted]
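The fractional value can be traced to the prevail constraint. Assuming the truck starts at loc2 (as the feasible plan suggests), loc1 is not the truck's initial value, its only in-arc is Drive(l2,l1), and Load(p1,t1,l1) needs the truck at loc1:

```latex
x_{\mathrm{Load}(p1,t1,l1)} \le M \, x_{\mathrm{Drive}(l2,l1)}
\quad\text{and}\quad
x_{\mathrm{Load}(p1,t1,l1)} = 1
\;\Longrightarrow\;
x_{\mathrm{Drive}(l2,l1)} \ge \tfrac{1}{M}

x_{\mathrm{Load}(p1,t1,l1)} + x_{\mathrm{Unload}(p1,t1,l2)} + x_{\mathrm{Drive}(l2,l1)}
= 1 + 1 + \tfrac{1}{M} = 2 + \tfrac{1}{M}
```

The big-M link lets the LP pay only $1/M$ for the drive, which is why the estimate falls well below the plan length of 4.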

Preliminary results

Problem       LP-     Lplan  h+   hFF  Optimal
log4-0        16.0*   17     19   19   20
log4-1        14.0*   15     17   17   19
log4-2        10.0*   11     13   13   15
log5-1        12.0*   13     15   15   17
log5-2        6.0*    7      8    8    8
log6-1        10.0*   11     13   13   14
log6-9        18.0*   19     21   21   24
log12-0       32.0*   33     39   39   -
log15-1       54.0*   -      63   66   -
freecell2-1   9       9      9    9    9
freecell2-2   8       8      8    8    8
freecell2-3   8       8      8    9    8
freecell2-4   8       8      8    9    8
freecell2-5   9       9      9    9    9
freecell3-5   12      13     13   14   -
freecell13-3  55      -      -    95   -
freecell13-4  54      -      -    94   -
freecell13-5  52      -      -    94   -
driverlog1    3.0*    7      6    8    7
driverlog2    12.0*   13     14   15   19
driverlog3    8.0*    9      11   11   12
driverlog4    11.0*   12     12   15   16
driverlog6    8.0*    9      10   10   11
driverlog7    11.0*   12     12   15   13
driverlog13   15.0*   16     21   26   -
driverlog19   60.0*   -      89   93   -
driverlog20   60.0*   -      84   106  -

Preliminary results

Problem       LP-     Lplan  h+   hFF  Optimal
zenotravel1   1       1      1    1    1
zenotravel2   3.0*    5      4    4    6
zenotravel3   4.0*    5      5    5    6
zenotravel4   5.0*    6      6    6    8
zenotravel5   8.0*    9      11   11   11
zenotravel6   8.0*    9      11   13   11
zenotravel13  18.0*   19     23   23   -
zenotravel19  46.0*   -      62   63   -
zenotravel20  50.0*   -      -    69   -
tpp1          3.0*    5      4    4    5
tpp2          6.0*    7      7    7    8
tpp3          9.0*    10     10   10   11
tpp4          12.0*   13     13   13   14
tpp5          15.0*   17     17   17   19
tpp6          21.0*   23     21   21   -
tpp28         150.0*  -      -    88   -
tpp29         -       -      -    104  -
tpp30         174.0*  -      -    101  -
bw-sussman    4       6      5    5    6
bw-12step     4       8      4    7    12
bw-large-a    12      12     12   12   12
bw-large-b    16      18     16   16   18

Strengthening techniques

• Composition of state variables (i.e., fluent merging)
– Given the domain transition graphs (DTGs) of two state variables $c_1$, $c_2$, the composition of $DTG_{c_1}$ and $DTG_{c_2}$ is the domain transition graph $DTG_{c_1 \| c_2} = (V_{c_1 \| c_2}, E_{c_1 \| c_2})$ where
– $V_{c_1 \| c_2} = V_{c_1} \times V_{c_2}$
– $((f_1,g_1),(f_2,g_2)) \in E_{c_1 \| c_2}$ if $f_1, f_2 \in V_{c_1}$, $g_1, g_2 \in V_{c_2}$ and there exists an action $a \in A$ such that one of the following conditions holds
  • pre[$c_1$] = $f_1$, post[$c_1$] = $f_2$, and pre[$c_2$] = $g_1$, post[$c_2$] = $g_2$
  • pre[$c_1$] = $f_1$, post[$c_1$] = $f_2$, and prevail[$c_2$] = $g_1$, $g_1 = g_2$
  • pre[$c_1$] = $f_1$, post[$c_1$] = $f_2$, and $g_1 = g_2$

The term composition is also used in model checking to define the parallel composition or the synchronized product of automata [Cassandras & Lafortune, 1999]
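The three conditions can be sketched as a product construction over per-variable transitions. The encoding and the names `transitions` and `compose` are assumptions for this example. The slide states the conditions with $c_1$ as the changing variable; the sketch also applies them with the roles of $c_1$ and $c_2$ swapped, which is an assumption needed to pick up actions such as Load that change only the second variable.

```python
from itertools import product

# name -> (pre, post, prevail); encoding assumed for this sketch.
ACTIONS = {
    "Drive(l1,l2)": ({"truck": "1"}, {"truck": "2"}, {}),
    "Drive(l2,l1)": ({"truck": "2"}, {"truck": "1"}, {}),
    "Load(p1,t1,l1)": ({"package": "1"}, {"package": "T"}, {"truck": "1"}),
    "Load(p1,t1,l2)": ({"package": "2"}, {"package": "T"}, {"truck": "2"}),
    "Unload(p1,t1,l1)": ({"package": "T"}, {"package": "1"}, {"truck": "1"}),
    "Unload(p1,t1,l2)": ({"package": "T"}, {"package": "2"}, {"truck": "2"}),
}


def transitions(action, var, domain):
    """Value pairs (before, after) that the action allows for one variable."""
    pre, post, prevail = action
    if var in post:                      # effect: pre[var] -> post[var]
        return [(pre[var], post[var])]
    if var in prevail:                   # prevail: value fixed and required
        return [(prevail[var], prevail[var])]
    return [(v, v) for v in domain]      # untouched: any value, unchanged


def compose(actions, c1, dom1, c2, dom2):
    """Return (nodes, arcs) of the composed DTG for variables c1 and c2."""
    nodes = set(product(dom1, dom2))
    arcs = set()
    for name, action in actions.items():
        post = action[1]
        if c1 not in post and c2 not in post:
            continue                     # the action changes neither variable
        pairs = product(transitions(action, c1, dom1),
                        transitions(action, c2, dom2))
        for (f1, f2), (g1, g2) in pairs:
            arcs.add(((f1, g1), (f2, g2), name))
    return nodes, arcs


nodes, arcs = compose(ACTIONS, "truck", ["1", "2"], "package", ["1", "2", "T"])
drive_arcs = [arc for arc in arcs if arc[2] == "Drive(l1,l2)"]
```

On the logistics example this yields six nodes, with each Drive action appearing on three arcs (one per package value) and each Load or Unload action on exactly one, because the prevail condition on the truck pins its value.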

Example

• Two DTGs and their composition
– Small in-arcs denote the initial state
– Double circles denote the goal

[Figure: DTG_c1 with nodes f1, f2, f3 and DTG_c2 with nodes g1, g2, with arcs labeled a, b, c, d, next to their composition DTG_c1||c2 with nodes (f1,g1), (f1,g2), (f2,g1), (f2,g2), (f3,g1), (f3,g2). In the annotated version of the figure the node (f3,g2) is no longer shown.]

Simple logistics example

[Figure: composed graph DTG_Truck1||Package1 over locations loc1 and loc2, with nodes (1,1), (1,T), (2,T), (2,2), (1,2), (2,1). Arc labels: Drive(l1,l2) and Drive(l2,l1), each appearing three times, and Load(p1,t1,l1), Load(p1,t1,l2), Unload(p1,t1,l1), Unload(p1,t1,l2).]

Simple logistics example

LP solution

• $x_{Drive(l2,l1)} = 1$
• $x_{Load(p1,t1,l1)} = 1$
• $x_{Drive(l1,l2)} = 1$
• $x_{Unload(p1,t1,l2)} = 1$

Objective value: 4

Drive(l2,l1), Load(p1,t1,l1), Drive(l1,l2), Unload(p1,t1,l2)

[Figure: DTG_Truck1||Package1 with nodes (1,1), (1,T), (2,T), (2,2), (1,2), (2,1) and the arcs used by the LP solution highlighted]

Another example

• Two DTGs and their composition
– Solution to the individual state variables
– Solution to the individual state variables represented in the composed state variable
– Violates balance of flow constraints
– Adding new balance of flow constraints strengthens the formulation

[Figure: DTG_c1 with nodes f1, f2, f3 and DTG_c2 with nodes g1, g2, g3, next to their composition DTG_c1||c2 with nodes (f1,g1), (f1,g2), (f1,g3), (f2,g1), (f2,g2), (f2,g3), (f3,g1), (f3,g2), (f3,g3). Arcs labeled a and b mark the solution to the individual state variables; arcs labeled c, d, and e appear once the new balance of flow constraints are added.]

Identifying mergeable fluents

• When should we create a composition of two or more state variables?
– Look at the causal graph
– Look at the actions that introduce dependencies in the causal graph

[Figure: causal graph over Person 1, Person 2, Airplane 1, Airplane 2, Fuel 1, Fuel 2, and the same graph after merging, with nodes Person 1, Person 2, Airplane 1 Fuel 1, Airplane 2 Fuel 2]
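A causal graph and the actions behind each of its edges can be computed from the action descriptions. The sketch below uses the logistics actions from earlier in the deck rather than the airplane example, since the slides do not list the airplane actions. It assumes the common definition in which an edge runs from variable u to variable v when some action reads u (through pre or prevail) and changes v; the encoding and the name `causal_dependencies` are illustrative.

```python
from collections import defaultdict

# name -> (pre, post, prevail); encoding assumed for this sketch.
ACTIONS = {
    "Drive(l1,l2)": ({"truck": "1"}, {"truck": "2"}, {}),
    "Drive(l2,l1)": ({"truck": "2"}, {"truck": "1"}, {}),
    "Load(p1,t1,l1)": ({"package": "1"}, {"package": "T"}, {"truck": "1"}),
    "Load(p1,t1,l2)": ({"package": "2"}, {"package": "T"}, {"truck": "2"}),
    "Unload(p1,t1,l1)": ({"package": "T"}, {"package": "1"}, {"truck": "1"}),
    "Unload(p1,t1,l2)": ({"package": "T"}, {"package": "2"}, {"truck": "2"}),
}


def causal_dependencies(actions):
    """Map each causal-graph edge (src, dst) to the actions that induce it."""
    deps = defaultdict(list)
    for name, (pre, post, prevail) in actions.items():
        for src in set(pre) | set(prevail):   # variables the action reads
            for dst in post:                  # variables the action changes
                if src != dst:
                    deps[(src, dst)].append(name)
    return dict(deps)


deps = causal_dependencies(ACTIONS)
merge_candidates = sorted(deps)  # variable pairs linked by at least one action
for (src, dst), names in sorted(deps.items()):
    print(f"{src} -> {dst}: {', '.join(sorted(names))}")
```

Here the only edge is from the truck to the package, induced by the four Load and Unload actions, which matches the pair merged in the composed logistics graph.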

Experimental setup

• Objective
– Minimize number of actions
• Domains
– Selected domains from the International Planning Competition
  • Logistics
  • Freecell
  • Driverlog
  • Zenotravel
  • TPP
  • Blocksworld
• Resources
– 2.67 GHz Linux machine
– 1 GB memory
– 15 minutes runtime
– CPLEX 10.0

Experimental setup

• Distance estimates
– LP
  • Action selection formulation with strengthening
– LP-
  • Action selection formulation without strengthening
– Lplan
  • Step-based integer programming formulation by Lplan [Bylander, 1997]
– h+
  • Optimal relaxed plan when the delete effects are ignored
– hFF
  • Inadmissible but efficient relaxed plan heuristic by FF [Hoffmann and Nebel, 2001]
– Optimal
  • Optimal distance estimate given by Satplanner using the -opt flag [Rintanen, Heljanko, and Niemelä, 2005]

Experimental results

Problem       LP      LP-     Lplan  h+   hFF  Optimal
log4-0        20      16.0*   17     19   19   20
log4-1        19      14.0*   15     17   17   19
log4-2        15      10.0*   11     13   13   15
log5-1        17      12.0*   13     15   15   17
log5-2        8       6.0*    7      8    8    8
log6-1        14      10.0*   11     13   13   14
log6-9        24      18.0*   19     21   21   24
log12-0       42      32.0*   33     39   39   -
log15-1       67      54.0*   -      63   66   -
freecell2-1   9       9       9      9    9    9
freecell2-2   8       8       8      8    8    8
freecell2-3   8       8       8      8    9    8
freecell2-4   8       8       8      8    9    8
freecell2-5   9       9       9      9    9    9
freecell3-5   12      12      13     13   14   -
freecell13-3  55      55      -      -    95   -
freecell13-4  54      54      -      -    94   -
freecell13-5  52      52      -      -    94   -
driverlog1    7       3.0*    7      6    8    7
driverlog2    19      12.0*   13     14   15   19
driverlog3    11      8.0*    9      11   11   12
driverlog4    15.5    11.0*   12     12   15   16
driverlog6    11      8.0*    9      10   10   11
driverlog7    13      11.0*   12     12   15   13
driverlog13   24      15.0*   16     21   26   -
driverlog19   96.6*   60.0*   -      89   93   -
driverlog20   89.5*   60.0*   -      84   106  -

Experimental results

Problem       LP      LP-     Lplan  h+   hFF  Optimal
zenotravel1   1       1       1      1    1    1
zenotravel2   6       3.0*    5      4    4    6
zenotravel3   6       4.0*    5      5    5    6
zenotravel4   8       5.0*    6      6    6    8
zenotravel5   11      8.0*    9      11   11   11
zenotravel6   11      8.0*    9      11   13   11
zenotravel13  24      18.0*   19     23   23   -
zenotravel19  66.2*   46.0*   -      62   63   -
zenotravel20  68.3*   50.0*   -      -    69   -
tpp1          5       3.0*    5      4    4    5
tpp2          8       6.0*    7      7    7    8
tpp3          11      9.0*    10     10   10   11
tpp4          14      12.0*   13     13   13   14
tpp5          19      15.0*   17     17   17   19
tpp6          25      21.0*   23     21   21   -
tpp28         -       150.0*  -      -    88   -
tpp29         -       -       -      -    104  -
tpp30         -       174.0*  -      -    101  -
bw-sussman    4       4       6      5    5    6
bw-12step     4       4       8      4    7    12
bw-large-a    12      12      12     12   12   12
bw-large-b    16      16      18     16   16   18

Distance estimates from the initial state to the goal (highlighted values equal the optimal distance)

Experimental results

• Heuristic calculation time

[Figure: heuristic calculation time on a logarithmic axis from 0.01 to 1000 for lp, lp-, lplan, and h+ across the Logistics, Freecell, Driverlog, Zenotravel, TPP, and Blocks domains]

Conclusions and future work

• An LP-based heuristic that respects delete effects but ignores action ordering shows very promising results
– Finds the optimal distance estimate in several problem instances
– Can be used to calculate admissible distance estimates for various optimization problems in planning
– Ongoing work has successfully incorporated our LP-based heuristic in a search algorithm that solves oversubscription planning
• Interesting directions for future work
– Apply fluent merging more aggressively
– Extend the formulation into a complete planning system
