Strong-Cyclic Planning When Fairness is Not a Valid Assumption
Alberto Camacho, Sheila A. McIlraith
Department of Computer Science, University of Toronto, Canada
{acamacho,sheila}@cs.toronto.edu
KnowProS, July 10, 2016
Take Home Message
Motivation
Soundness of standard strong-cyclic solutions to Fully Observable Non-Deterministic (FOND) planning problems is guaranteed only when the fairness assumption holds.
Approach
We introduce L-fairness, a concept that generalizes the classical fairness assumption.
Contribution
The FOND+ class of planning problems, in which soundness of solutions is predicated on the L-fairness assumption.
We identify a class of FOND+ solutions that are also solutions to 1-primary normative fault-tolerant planning problems.
We present different algorithms to solve FOND+ problems.
Camacho and McIlraith: Strong-Cyclic Planning When Fairness is Not a Valid Assumption 2 / 21
Non-Deterministic Planning
Non-Deterministic Planning Domain D = 〈F, S, A, T〉
F: finite set of propositions
S: finite set of states, S ⊆ 2^F
A: set of actions a = 〈Pre_a, Eff_a〉
Preconditions Pre_a
Non-deterministic effects Eff_a = 〈Eff_a^1, ..., Eff_a^n〉
T : S × A → 2^S is the transition function
If s′ ∈ T(s, a), then s′ = Prog(s, a, Eff_a^i) for some Eff_a^i ∈ Eff_a
We write a state transition as (s, a, s′)
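The domain definition above can be sketched as a small data structure. This is only an illustration: the add/delete encoding of effects, the reduction of Pre_a to a set of required propositions, and all names (Effect, Action, prog, pick_up) are assumptions that do not come from the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

State = FrozenSet[str]  # a state is the set of propositions true in it (S ⊆ 2^F)


@dataclass(frozen=True)
class Effect:
    add: FrozenSet[str]     # propositions the effect makes true
    delete: FrozenSet[str]  # propositions the effect makes false


@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[str]          # Pre_a, simplified to a set of required propositions
    effects: Tuple[Effect, ...]  # Eff_a = <Eff_a^1, ..., Eff_a^n>


def prog(s: State, a: Action, e: Effect) -> State:
    """Prog(s, a, e): progress s through one effect e of a (assumed add/delete semantics)."""
    if not a.pre <= s or e not in a.effects:
        raise ValueError("effect is not an outcome of an applicable action")
    return frozenset((s - e.delete) | e.add)


def T(s: State, a: Action) -> Set[State]:
    """T(s, a): every state that applying a in s may produce; empty if a is inapplicable."""
    if not a.pre <= s:
        return set()
    return {prog(s, a, e) for e in a.effects}


# Hypothetical two-outcome action in the spirit of a Blocksworld pick-up.
pick_up = Action(
    name="pick-up-block(A,B)",
    pre=frozenset({"handempty", "on-block(A,B)"}),
    effects=(
        Effect(add=frozenset({"holding(A)"}), delete=frozenset({"handempty"})),
        Effect(add=frozenset({"on-table(A)"}), delete=frozenset({"on-block(A,B)"})),
    ),
)
s = frozenset({"handempty", "on-block(A,B)"})
successors = T(s, pick_up)  # one successor state per non-deterministic effect
```

Each element of successors corresponds to one transition (s, a, s′) in the notation above.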
In our paper, we address two classes of non-deterministic planning problems:
Fully Observable Non-Deterministic (FOND) Planning
Fault-Tolerant Planning
FOND Planning
FOND Planning Problem P = 〈D, s_0, S_G〉
D = 〈F, S, A, T〉 is a non-deterministic planning domain
s_0 ∈ S initial state
S_G ⊆ S goal states
Solutions are policies, or mappings from states into actions.
weak solutions
strong solutions
strong-cyclic solutions
Solutions to a FOND Problem (cf. [Cimatti et al., 2003])
Weak Solutions
Weak solutions are plans for which at least one execution achieves the goal; success is not guaranteed.
Strong Solutions
Strong solutions guarantee goal achievement in all executions.
Strong-Cyclic Solutions
Strong-Cyclic solutions guarantee goal achievement, provided that all executions are fair.
An execution σ is unfair when a state-action pair (s, a) appears infinitely often in σ, but the transition (s, a, s′) occurs only a finite number of times for some outcome s′ ∈ T(s, a).
Executions that are not unfair are said to be fair.
cf. [Cimatti et al., 2003]
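The three solution classes can be told apart on an explicit graph of the transitions a policy induces. The sketch below uses the standard graph characterization (a policy is strong when every execution reaches the goal, strong-cyclic when the goal stays reachable from every visited state, weak when some execution reaches it); it is not the symbolic procedure of [Cimatti et al., 2003], and the encoding succ[s] = T(s, π(s)) with string-named states is an assumption made for illustration.

```python
from typing import Callable, Dict, Iterable, Set

Graph = Dict[str, Set[str]]  # succ[s] = outcomes of the policy's action in s


def reachable(succ: Graph, s0: str, goals: Set[str]) -> Set[str]:
    """States visited by some execution of the policy from s0 (executions stop at goals)."""
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        for t in succ.get(s, set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def backward(succ: Graph, goals: Set[str], states: Set[str],
             quant: Callable[[Iterable[bool]], bool]) -> Set[str]:
    """Least fixpoint: add s once `quant` (any or all) of its outcomes are already good."""
    good = goals & states
    changed = True
    while changed:
        changed = False
        for s in states - good:
            outs = succ.get(s, set())
            if outs and quant(t in good for t in outs):
                good.add(s)
                changed = True
    return good


def classify(succ: Graph, s0: str, goals: Set[str]) -> str:
    states = reachable(succ, s0, goals)
    if s0 in backward(succ, goals, states, all):
        return "strong"          # every execution is finite and reaches the goal
    may = backward(succ, goals, states, any)
    if states <= may:
        return "strong-cyclic"   # the goal remains reachable from every visited state
    return "weak" if s0 in may else "no solution"
```

For instance, a policy whose only action either loops back to s0 or reaches the goal is strong-cyclic but not strong, because the execution that loops forever never terminates.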
Fault-Tolerant Planning
Fault-Tolerant Planning Problem P = 〈D, s_0, S_G, F, κ〉
D = 〈F, S, A, T〉 is a non-deterministic planning domain
s_0 ∈ S initial state
S_G ⊆ S goal states
F is an exception model
κ is an integer parameter
F : ⋃_{a∈A} Eff_a → ℕ is an exception model:
F(e) > 0 when the effect is faulty
F(e) = 0 when the effect is normative
If |{e | F(e) = 0, e ∈ Eff_a}| = 1 for all a ∈ A, then the problem is 1-primary
cf. [Jensen et al., 2004, Domshlak, 2013]
Solutions to Fault-Tolerant Planning Problems
κ-admissible Executions
A state-effect execution (s_0, e_0, ..., s_i, e_i, ...) is κ-admissible when Σ_i F(e_i) ≤ κ.
Solutions are κ-Plans
A policy is a κ-plan when all κ-admissible executions are finite and reach the goal.
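As a small worked instance of the κ-admissibility condition, the sketch below sums the exception values of the effects along a finite execution. The effect names and the dictionary encoding of the exception model F are hypothetical, and only finite executions are handled.

```python
from typing import Dict, Sequence


def is_kappa_admissible(effects: Sequence[str], F: Dict[str, int], kappa: int) -> bool:
    """True when the exception values along the execution sum to at most kappa."""
    return sum(F[e] for e in effects) <= kappa


# Hypothetical exception model: one normative effect and one faulty effect.
F = {"grasp_ok": 0, "grasp_slip": 1}

one_fault = ["grasp_slip", "grasp_ok"]
two_faults = ["grasp_slip", "grasp_slip", "grasp_ok"]
```

With κ = 1, the execution one_fault is admissible and two_faults is not, so a 1-plan has to reach the goal along the former but makes no promise about the latter.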
Motivation
Blocksworld domain:
Initial state: {on(A,B), ontable(B), handempty}
Actions:
pick-up-block(?b,?from):
  Pre = {handempty, on-block(?b,?from)}
  Eff 1 = {holding(?b) ∧ ¬handempty}
  Eff 2 = {on-table(?b) ∧ ¬on-block(?b,?from)}
put-block-on-table(?b):
  Pre = {holding(?b)}
  Eff = {on-table(?b) ∧ ¬holding(?b)}
put-on-block(?b1,?b2):
  Pre = {handempty ∧ clear(?b2)}
  Eff = {on-block(?b1,?b2) ∧ ¬handempty}
Goal condition: {on-table(A)}
[Figure: sequence of configurations of blocks A and B. Goal achievement is predicated on fairness.]
[Figure: sequence of configurations of blocks A and B. Goal achievement is not predicated on fairness.]
Desired Solutions
Guarantees vs. no guarantees of occurrence:
solutions must not rely on an effect whose occurrence is not guaranteed
Normative vs. faulty behaviour:
solutions need to achieve the goal when the system manifests itsnormative behaviour
Outline
1 Background in Non-Deterministic Planning
2 The Model: FOND+
3 Algorithms to solve FOND+
4 Experimental Results
5 Conclusions
L-fair Executions
L-fair Executions
For a labeling function L : S × A × S → {F, U}, we say that an execution in state s_0 is L-unfair when there exists a state-action pair (s, a) such that
(s, a) appears infinitely often, and
there exists a transition (s, a, s′) such that L(s, a, s′) = F and (s, a, s′) occurs a finite number of times.
Executions that are not L-unfair are said to be L-fair.
Note that fairness, as defined by [Cimatti et al., 2003], is a particular case of L-fairness that occurs when L assigns F to all transitions.
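The definition can be checked mechanically for executions that eventually repeat a fixed cycle forever. In such an execution a state-action pair or a transition occurs infinitely often exactly when it occurs in the cycle. The sketch below relies on that restriction; the dictionary encodings of T and L and the state and action names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str, str]  # a transition (s, a, s')


def is_l_unfair(cycle: List[Step],
                T: Dict[Tuple[str, str], Set[str]],
                L: Dict[Step, str]) -> bool:
    """L-unfairness of an execution that ends by repeating `cycle` forever."""
    taken_forever = set(cycle)                      # transitions occurring infinitely often
    pairs_forever = {(s, a) for (s, a, _) in cycle}  # pairs (s, a) appearing infinitely often
    for (s, a) in pairs_forever:
        for s2 in T[(s, a)]:
            # An F-labeled outcome that is not in the cycle occurs only finitely often.
            if L[(s, a, s2)] == "F" and (s, a, s2) not in taken_forever:
                return True
    return False


# Hypothetical example: action "pick" in state "s" either stays in "s" or reaches "g".
T = {("s", "pick"): {"s", "g"}}
stuck = [("s", "pick", "s")]  # the execution that repeats "pick" and never leaves "s"

all_fair = {("s", "pick", "s"): "F", ("s", "pick", "g"): "F"}
goal_unfair = {("s", "pick", "s"): "F", ("s", "pick", "g"): "U"}
```

Under all_fair, which is the classical fairness assumption, the stuck execution is L-unfair and can be ignored. Under goal_unfair it is L-fair, so a policy that relies on eventually reaching g from s gives no guarantee.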
Planning With Unfair Non-Determinism
FOND+ Planning Problem P = 〈D, s_0, S_G, L〉
D = 〈F, S, A, T〉 is a non-deterministic planning domain
s_0 ∈ S is the initial state
S_G ⊆ S is a set of goal states
L : S × A × S → {F, U} is a labeling function
Solutions
Solutions to a FOND+ problem P = 〈D, s_0, S_G, L〉 are policies that guarantee goal achievement, predicated on the assumption that all executions of D in s_0 are L-fair.
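One way to test the solution condition on an explicit policy graph is a greatest-fixpoint computation. This is a sketch based on one reading of the definition, not an algorithm taken from the slides: a policy fails when a dead end is reachable, or when some set of visited non-goal states can be occupied forever without ever skipping an F-labeled outcome. The encoding (succ[s] for the outcomes of π(s), label[(s, s′)] for the label of the induced transition) is an assumption.

```python
from typing import Dict, Set, Tuple


def is_fond_plus_solution(succ: Dict[str, Set[str]],
                          label: Dict[Tuple[str, str], str],
                          s0: str, goals: Set[str]) -> bool:
    # Collect the states visited by executions of the policy (they stop at goals).
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        if not succ.get(s):
            return False  # dead end: a finite execution that misses the goal
        for t in succ[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)

    # Greatest fixpoint: keep s while an execution can stay in the trap from s
    # and every F-labeled outcome of s also stays inside the trap.
    trap = seen - goals
    changed = True
    while changed:
        changed = False
        for s in list(trap):
            can_stay = any(t in trap for t in succ[s])
            fair_outcomes_stay = all(t in trap for t in succ[s] if label[(s, t)] == "F")
            if not (can_stay and fair_outcomes_stay):
                trap.discard(s)
                changed = True
    return not trap  # an empty trap means every L-fair execution reaches the goal


# Hypothetical policy graph: from "s" the chosen action either stays in "s" or reaches "g".
succ = {"s": {"s", "g"}}
goal_is_fair = {("s", "s"): "U", ("s", "g"): "F"}
goal_is_unfair = {("s", "s"): "F", ("s", "g"): "U"}
```

When every transition is labeled F the check coincides with the strong-cyclic condition, and when every transition is labeled U it coincides with the strong condition.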
Classes of FOND+ Solutions
Strictly Fair
A solution π to a FOND+ problem is strictly fair when all transitions t produced by L-fair plan executions have L(t) = F.
Strictly Unfair
A solution π to a FOND+ problem is strictly unfair when all transitions t produced by L-fair plan executions have L(t) = U.
Mixed
A solution π to a FOND+ problem is mixed when it is neither strictly fair nor strictly unfair.
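The three classes can be illustrated by collecting the labels of the transitions a policy can produce. The sketch assumes that π is already known to be a solution and that every transition reachable under π occurs in some L-fair execution; the encoding (succ[s] for the outcomes of π(s), label[(s, s′)] for the label) is hypothetical.

```python
from typing import Dict, Set, Tuple


def classify_solution(succ: Dict[str, Set[str]],
                      label: Dict[Tuple[str, str], str],
                      s0: str, goals: Set[str]) -> str:
    """Classify a FOND+ solution by the labels of the transitions it can produce."""
    labels: Set[str] = set()
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue  # executions stop at the goal
        for t in succ.get(s, set()):
            labels.add(label[(s, t)])
            if t not in seen:
                seen.add(t)
                stack.append(t)
    if labels <= {"F"}:
        return "strictly fair"   # also returned when no transition is produced at all
    if labels <= {"U"}:
        return "strictly unfair"
    return "mixed"


# Hypothetical policy graph with one fair and one unfair outcome from "s".
succ = {"s": {"s", "g"}}
label = {("s", "s"): "U", ("s", "g"): "F"}
```

Here the policy is mixed: it reaches the goal through an F-labeled transition but may also produce the U-labeled self-loop.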
FOND+ and Fault-Tolerant Planning
Normative Solutions
A FOND+ solution π is normative when, in each state s reachable by π:
there exists a plan execution in s that reaches the goal and such that all transitions t have L(t) = F, and
exactly one outcome of s by π(s) produces a transition t with L(t) = F.
Normative Solutions are Fault-Tolerant
Normative solutions to a FOND+ problem P = 〈D, s_0, S_G, L〉 are also 1-primary normative solutions to fault-tolerant planning problems P′ = 〈D, s_0, S_G, F, κ〉 such that F(e) = 0 (resp. F(e) > 0) when e produces a transition (s, a, s′) such that L(s, a, s′) = F (resp. L(s, a, s′) = U).
Normative FOND+ solutions are robust to the occurrence of any number of faults during execution, as opposed to standard fault-tolerant solutions.
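The two normativity conditions and the induced exception model can be sketched on an explicit policy graph. The code assumes π is already a FOND+ solution and only tests the two conditions; the encoding (succ[s] for the outcomes of π(s), label[(s, s′)] for the label) is hypothetical, each transition stands in for the effect that produces it, and the value 1 for faulty effects is an arbitrary choice satisfying F(e) > 0.

```python
from typing import Dict, Set, Tuple


def is_normative(succ: Dict[str, Set[str]],
                 label: Dict[Tuple[str, str], str],
                 s0: str, goals: Set[str]) -> bool:
    # States visited by executions of the policy (executions stop at goals).
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        for t in succ.get(s, set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)

    # Condition 2: exactly one F-labeled outcome in every visited non-goal state.
    normative_next: Dict[str, str] = {}
    for s in seen - goals:
        f_outs = [t for t in succ.get(s, set()) if label[(s, t)] == "F"]
        if len(f_outs) != 1:
            return False
        normative_next[s] = f_outs[0]

    # Condition 1: following only F-labeled transitions reaches the goal.
    for s in normative_next:
        cur, visited = s, set()
        while cur not in goals:
            if cur in visited:
                return False  # the normative behaviour loops without reaching the goal
            visited.add(cur)
            cur = normative_next[cur]
    return True


def exception_model(label: Dict[Tuple[str, str], str]) -> Dict[Tuple[str, str], int]:
    """F = 0 for transitions labeled F (normative), 1 for transitions labeled U (faulty)."""
    return {t: 0 if lab == "F" else 1 for t, lab in label.items()}


# Hypothetical policy graph: every fault sends the execution back to s0.
succ = {"s0": {"s1", "s0"}, "s1": {"g", "s0"}}
label = {("s0", "s1"): "F", ("s0", "s0"): "U", ("s1", "g"): "F", ("s1", "s0"): "U"}
```

In this example the normative path s0, s1, g exists from every visited state, and any number of faults only delays it, which matches the robustness remark above.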
Algorithm to Find Strictly Fair Solutions
For a FOND+ problem P, the algorithm consists of two steps:
1 P is relaxed into a FOND problem P′ = 〈D′, s_0, S_G〉. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = F.
2 A sound and complete strong-cyclic FOND planner, e.g. PRP [Muise et al., 2012], is used to search for a strong-cyclic solution to P′, which is returned as a strictly fair solution to P.
Theorem
Algorithm is sound and complete.
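Step 1 of the algorithm amounts to a filter over state-action pairs. The sketch below works on an explicit transition table, which is an assumption made for illustration; an actual implementation would presumably operate on the lifted problem description handed to the planner. The names relax, T, and L are hypothetical, and step 2 (calling the FOND planner) is not shown.

```python
from typing import Dict, Set, Tuple

Transitions = Dict[Tuple[str, str], Set[str]]  # T[(s, a)] = set of outcomes s'
Labels = Dict[Tuple[str, str, str], str]       # L[(s, a, s')] in {"F", "U"}


def relax(T: Transitions, L: Labels, keep: str) -> Transitions:
    """Build D': a stays applicable in s only if all its transitions carry label `keep`."""
    return {
        (s, a): outs
        for (s, a), outs in T.items()
        if all(L[(s, a, t)] == keep for t in outs)
    }


# Hypothetical domain fragment: "risky" has one unfair outcome, "safe" has none.
T = {("s0", "safe"): {"s1"}, ("s0", "risky"): {"s1", "s2"}}
L = {
    ("s0", "safe", "s1"): "F",
    ("s0", "risky", "s1"): "F",
    ("s0", "risky", "s2"): "U",
}
fair_only = relax(T, L, "F")
```

Calling relax with keep="U" gives the analogous restriction to actions whose transitions are all labeled U.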
Algorithm to Find Strictly Unfair Solutions
For a FOND+ problem P, the algorithm consists of two steps:
1 P is relaxed into a FOND problem P′ = 〈D′, s_0, S_G〉. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = U.
2 A sound and complete strong FOND planner, e.g. [Jaramillo et al., 2014], is used to search for a strong solution to P′, which is returned as a strictly unfair solution to P.
Theorem
Algorithm is sound and complete.
Algorithm to Find Normative Solutions
Three basic steps (also in PRP):
Step 1: Search for a plan in the all-outcomes determinization of the problem (i.e., ignore non-determinism).
Step 2: Select a state that results from non-determinism, and search for a plan to the goal or to a previously resolved state.
Step 3: Repeat Step 2 until convergence.
[Figure: plan through states Init, S1, S2, Goal with actions a1, a2, a3; "?" marks denote non-deterministic outcomes not yet resolved, and state S3 appears once Step 2 is applied.]
The difference from PRP lies in the open list of states.
In PRP: First-In, Last-Out
In our algorithm: states produced by normative effects are explored first.
Theorem
Algorithm is sound and complete.
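The change to the open list can be sketched as a priority queue. The slides only state that states produced by normative effects have preference; the heap-based structure, the last-in, first-out tie-breaking within each class, and the class name are assumptions made for illustration.

```python
import heapq
from itertools import count
from typing import List, Tuple


class NormativeFirstOpenList:
    """Open list that pops states produced by normative effects before all others."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._tick = count()

    def push(self, state: str, normative: bool) -> None:
        # Priority 0 for normative outcomes, 1 for faulty ones.
        # The negated counter makes the most recently pushed state win ties.
        priority = 0 if normative else 1
        heapq.heappush(self._heap, (priority, -next(self._tick), state))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __bool__(self) -> bool:
        return bool(self._heap)


open_list = NormativeFirstOpenList()
open_list.push("s3", normative=False)  # produced by a faulty effect
open_list.push("s1", normative=True)
open_list.push("s2", normative=True)
order = [open_list.pop() for _ in range(3)]
```

The faulty state s3 is pushed first but popped last, after both normative states.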
Objectives of the Experiments
Two Main Objectives:
Test the efficiency of one of our algorithms
Evaluate characteristics (planner run time and policy size) of normative solutions
Procedure:
Compute Normative solutions to FOND+ problems
Compute Strong-Cyclic solutions to FOND problems, using the PRP planner [Muise et al., 2012]
Blocksworld Problems
Blocksworld problems from [Muise et al., 2012], with actions:
pick-up-block(?b,?from):
  Pre = {handempty, on-block(?b,?from)}
  Eff 1 = {holding(?b) ∧ ¬handempty}
  Eff 2 = {on-table(?b) ∧ ¬on-block(?b,?from)}
put-block-on-table(?b):
  Pre = {holding(?b)}
  Eff = {on-table(?b) ∧ ¬holding(?b)}
put-on-block(?b1,?b2):
  Pre = {handempty ∧ clear(?b2)}
  Eff 1 = {on-block(?b1,?b2) ∧ ¬handempty}
  Eff 2 = {on-table(?b1) ∧ ¬handempty}
In FOND+ problems we consider:
Eff 1 is a normative effect
Eff 2 is a faulty effect
Results
           Strong-Cyclic        Normative
problem    run-time   size      run-time   size
p2         0          3         0          3
p3         0.002      5         0.016      5
p4         0.020      11        0.048      11
p5         0.070      27        0.178      27
p6         0.110      39        0.296      39
p7         0.114      32        0.270      32
p8         0.150      26        0.356      26
p9         0.278      46        0.664      46
p10        0.336      49        0.782      49
p11        0.522      120       1.936      97
p12        0.626      97        1.840      119.5
p13        0.682      57        1.810      57
p14        3.794      1117      37.10      1123
p15        1.500      278       7.814      278
Summary and Future Work
Strong-cyclic planning does not guarantee goal achievement in problems where the fairness assumption is not valid
We introduced L-fairness and the FOND+ model
We identified a connection between FOND+ and 1-primary normative fault-tolerant planning
We introduced algorithms to search for FOND+ solutions
Future Work:
Further investigate and formalise connections between FOND+ andfault-tolerant planning
More extensive experiments
Questions?
code, benchmarks, and slides available soon:
http://www.cs.toronto.edu/~acamacho
References I
Cimatti, A., Pistore, M., Roveri, M., and Traverso, P. (2003).
Weak, strong, and strong cyclic planning via symbolic model checking. Artificial Intelligence, 147:35–84.
Domshlak, C. (2013).
Fault tolerant planning: Complexity and compilation.
Hertle, A., Dornhege, C., Keller, T., Mattmüller, R., Ortlieb, M., and Nebel, B. (2014).
An Experimental Comparison of Classical, FOND and Probabilistic Planning. In Proc. of 37th International Conference on Artificial Intelligence (KI 2014), Prague.
Jaramillo, A. C., Fu, J., Ng, V., Bastani, F. B., and Yen, I.-L. (2014).
Fast strong planning for FOND problems with multi-root directed acyclic graphs. International Journal on Artificial Intelligence Tools, 23(06):1460028.
Jensen, R. M., Veloso, M. M., and Bryant, R. E. (2004).
Fault tolerant planning: Toward probabilistic uncertainty models in symbolic non-deterministic planning. Pages 335–344.
Little, I. and Thiebaux, S. (2007).
Probabilistic planning vs. replanning. ICAPS Workshop on IPC: Past, Present and Future.
Muise, C., McIlraith, S. A., and Beck, J. C. (2012).
Improved Non-deterministic Planning by Exploiting State Relevance. In ICAPS, pages 172–180.