Cross-Domain Action-Model Acquisition for Planning via Web Search
Hankz Hankui Zhuo(a), Qiang Yang(b), Rong Pan(a) and Lei Li(a)
(a) Sun Yat-sen University, China; (b) Hong Kong University of Science & Technology, Hong Kong
Motivation
There are many domains that share knowledge with each other, e.g.,
"walking" in the driverlog domain
"navigating" in the rovers domain
"moving" in the elevator domain
etc.
(Image credits: http://www.superstock.com/stock-photos-images/1778R-4701, http://www.pixelparadox.com/mars.htm, http://www.venusengineers.com/goods-lift.html)
Motivation
These actions all share common knowledge about location change; thus, it may be possible to "borrow" knowledge from one domain for another. Specifically:
Motivation
For example, the model of "walk" in driverlog is:
walk(?d - driver ?l1 - loc ?l2 - loc)
 :precondition (and (at ?d ?l1) (path ?l1 ?l2))
 :effect (and (not (at ?d ?l1)) (at ?d ?l2))
Motivation
Given the model of "walk", can we guess the model of "navigate" in rovers?
navigate(?d - rover ?x - waypoint ?y - waypoint)
 :precondition ??
 :effect ??
Motivation
A plausible guess, borrowed from "walk":
walk(?d - driver ?l1 - loc ?l2 - loc)
 :precondition (and (at ?d ?l1) (path ?l1 ?l2))
 :effect (and (not (at ?d ?l1)) (at ?d ?l2))
navigate(?d - rover ?x - waypoint ?y - waypoint)
 :precondition (at ?d ?x) (visible ?x ?y) …
 :effect (not (at ?d ?x)) (at ?d ?y)
Motivation
In this work, we aim to learn action models for a target domain, e.g., the model of "navigate" in rovers, by transferring knowledge from another domain, called the source domain, e.g., the model of "walk" in driverlog.
Problem Formulation
Formally, our learning problem can be stated as follows.
Given as inputs:
Action models from a source domain: As
A few plan traces from the target domain: {<s0,a1,s1,…,an,sn>}, where si is a partial state and ai is an action
Action schemas from the target domain: A'
Predicates from the target domain: P
Output: action models in the target domain: At
Problem Formulation
Our assumptions are:
the domains are STRIPS domains;
people do not write action names randomly, e.g., they do not use "eat" to express "move";
we need not observe full intermediate states in plan traces, i.e., intermediate states can be partial or empty;
action sequences in plan traces are correct;
actions in plan traces are totally ordered, i.e., there are no concurrent actions;
there is information available on the Web related to the actions.
Our Algorithm: LAWS
(Flowchart: the plan traces, the predicates and action schemas of the target domain, and the source action models As feed a "Build constraints" step, which produces Web constraints, State constraints, Action constraints and Plan constraints; a "Solve constraints" step then outputs the target action models At.)
Web constraints: constraints from web searching.
State constraints: constraints from the states between actions.
Action constraints: constraints imposed on action models.
Plan constraints: constraints to ensure causal links in traces.
Solve constraints: solving is done with a weighted MAXSAT solver.
Web constraints
Web constraints are used to measure the similarity between two actions; to do this, we search for the two actions on the Web.
Specifically, we build predicate-action pairs from the target domain as follows:
PA_t = { <p, a> | p ∈ P, a ∈ A', PARA(p) ⊆ PARA(a) }
where p is a predicate, a is an action schema, and p's parameters are included in a's.
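For intuition, here is a minimal Python sketch of building PA_t under this parameter-containment rule; the predicate/schema names and the dict-of-type-lists representation are assumptions for illustration, not the paper's implementation.

```python
# Sketch: build the target predicate-action pair set
# PA_t = { <p, a> | p in P, a in A', PARA(p) included in PARA(a) }.
# Predicate and schema names here (at, visible, navigate) are illustrative.

def contains(a_params, p_params):
    """True iff every parameter type of p also occurs in a, with multiplicity."""
    remaining = list(a_params)
    for t in p_params:
        if t not in remaining:
            return False
        remaining.remove(t)
    return True

def build_pa_t(predicates, schemas):
    """predicates/schemas: dicts mapping a name to its list of parameter types."""
    return [(p, a)
            for p, p_params in predicates.items()
            for a, a_params in schemas.items()
            if contains(a_params, p_params)]

predicates = {"at": ["rover", "waypoint"], "visible": ["waypoint", "waypoint"]}
schemas = {"navigate": ["rover", "waypoint", "waypoint"]}
pairs = build_pa_t(predicates, schemas)
```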
Web constraints
Similarly, we build predicate-action pairs from the source:
PA_s^pre = { <p, a> | a ∈ As, p ∈ PRE(a) }
PA_s^add = { <p, a> | a ∈ As, p ∈ ADD(a) }
PA_s^del = { <p, a> | a ∈ As, p ∈ DEL(a) }
where PA_s^pre, PA_s^add and PA_s^del denote the sets of precondition-action pairs, add-action pairs and del-action pairs, respectively.
Note that here we require p ∈ PRE(a) (and likewise for ADD and DEL), which is different from PA_t.
Web constraints
Next, we collect a set of web documents D = {di} by searching the keyword w = <p,a> ∈ PA_t.
We represent each page di as a vector yi by calculating its tf-idf weights (Jones 1972). As a result, we have a set of real-valued vectors Y = {yi}.
Likewise, we get a set of vectors X = {xi} by searching the keyword w' = <p',a'> ∈ PA_s^pre.
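The tf-idf step can be sketched in a few lines of plain Python; the toy token lists stand in for real search-result pages, and this is an illustrative implementation of standard tf-idf weighting, not the authors' code.

```python
import math
from collections import Counter

# Sketch: represent each retrieved web page as a tf-idf vector (Jones 1972).
# docs is a list of tokenized pages; the tokens below are illustrative.

def tfidf_vectors(docs):
    """Return one dict {term: tf-idf weight} per document."""
    n = len(docs)
    df = Counter()                      # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t])
                        for t in tf})
    return vectors

docs = [["rover", "navigate", "mars"],
        ["driver", "walk", "road"],
        ["rover", "mars", "mars"]]
vecs = tfidf_vectors(docs)
```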
Web constraints
We define the similarity function between two keywords w and w' as follows:
similarity(w, w') = MMD^2(F, Y, X),
where MMD is the Maximum Mean Discrepancy, given by (Borgwardt et al. 2006), and F is a set of feature mapping functions of a Gaussian kernel. Concretely:
MMD^2(F, Y, X) = 1/(m(m-1)) Σ_{i≠j} k(x_i, x_j) + 1/(n(n-1)) Σ_{i≠j} k(y_i, y_j) - 2/(mn) Σ_{i,j} k(x_i, y_j),
where k is the Gaussian kernel k(x_i, y_j) = exp( -||x_i - y_j||^2 / (2σ^2) ).
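A minimal sketch of this estimator, assuming plain tuples as vectors and a free bandwidth parameter sigma (the bandwidth choice is not specified in the slides):

```python
import math

# Sketch of squared MMD with a Gaussian kernel (Borgwardt et al. 2006),
# used here as similarity(w, w') = MMD^2(F, Y, X). Sample vectors are tuples.

def gaussian_kernel(x, y, sigma=1.0):
    sq = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    m, n = len(X), len(Y)
    xx = sum(gaussian_kernel(X[i], X[j], sigma)
             for i in range(m) for j in range(m) if i != j)
    yy = sum(gaussian_kernel(Y[i], Y[j], sigma)
             for i in range(n) for j in range(n) if i != j)
    xy = sum(gaussian_kernel(x, y, sigma) for x in X for y in Y)
    return xx / (m * (m - 1)) + yy / (n * (n - 1)) - 2.0 * xy / (m * n)

# Similar samples give a small discrepancy; well-separated samples a large one.
near = mmd2([(0.0,), (0.1,)], [(0.05,), (0.15,)])
far = mmd2([(0.0,), (0.1,)], [(5.0,), (5.1,)])
```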
Web constraints
Finally, we generate weighted web constraints by the following steps. For each w = <p,a> ∈ PA_t and w' = <p',a'> ∈ PA_s^pre, we:
calculate similarity(w, w');
generate a constraint p ∈ PRE(a), and associate it with similarity(w, w') as its weight.
We proceed likewise for ADD(a) and DEL(a).
State constraints (given by Yang et al. 2007)
Generally, if p frequently appears before a, it is probably a precondition of a. Specifically:
if p appears before a, and PARA(p) ⊆ PARA(a), then p ∈ PRE(a);
if p appears after a, and PARA(p) ⊆ PARA(a), then p ∈ ADD(a).
The weights of all these constraints are calculated by counting their occurrences in all the plan traces.
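As a sketch (not the authors' implementation), weighting the candidate constraint p ∈ PRE(a) by counting how often p is observed just before a could look like this; the encoding of a trace as alternating state-sets and action names is an assumption:

```python
from collections import Counter

# Sketch: weight "p in PRE(a)" by how often predicate p is observed in the
# partial state immediately before action a, over all plan traces.
# Predicate and action names below are illustrative.

def precondition_weights(traces):
    """traces: list of [s0, a1, s1, a2, s2, ...] with states as sets of predicates."""
    counts = Counter()
    for trace in traces:
        states, actions = trace[0::2], trace[1::2]
        for state, action in zip(states, actions):
            for p in state:            # p observed just before this action
                counts[(p, action)] += 1
    return counts

traces = [[{"at"}, "navigate", {"at"}, "navigate", set()],
          [{"at", "visible"}, "navigate", set()]]
w = precondition_weights(traces)
```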
Action constraints (given by Yang et al. 2007)
Action constraints are imposed to ensure the learned action models are succinct:
p ∈ ADD(a) ⇒ p ∉ PRE(a).
These constraints are associated with the maximal weight of all the state constraints, to ensure they are maximally satisfied.
Plan constraints (given by Yang et al. 2007)
We require that causal links in plan traces are not broken. Thus, we build constraints as follows.
For each precondition p of an action aj in a plan trace, either p is in the initial state, or there is some ai prior to aj that adds p, and no ak between ai and aj (i < k < j) that deletes p:
p ∈ Pre(aj) ⇒ p ∈ Add(ai) ∧ p ∉ Del(ak).
For each literal q in the goal, either q is in the initial state s0, or there is some ai that adds q and no later ak that deletes q:
q ∈ s0 ∨ (q ∈ Add(ai) ∧ q ∉ Del(ak)).
To ensure these constraints are maximally satisfied, we assign them the maximal weight of the state constraints.
Solve constraints
Before solving all these constraints, we adjust the weights of the web constraints by replacing each original weight wo with wo':
wo' = γ/(1-γ) · wm · wo,
where wm is the maximal weight of the state constraints, and γ ∈ [0,1).
We can easily adjust wo' from 0 to +∞ by varying γ from 0 to 1.
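A one-line sketch of this reweighting; the function name is invented for illustration:

```python
# Sketch: adjust a web-constraint weight wo by the slides' formula
# wo' = (gamma / (1 - gamma)) * wm * wo, with gamma in [0, 1).

def adjust_weight(wo, wm, gamma):
    assert 0.0 <= gamma < 1.0
    return gamma / (1.0 - gamma) * wm * wo

# gamma = 0 switches web knowledge off; gamma -> 1 grows wo' without bound.
```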
Solve constraints
We solve these weighted constraints by running a weighted MAXSAT solver.
The attained assignment is converted to action models, e.g., if "p ∈ ADD(a)" is assigned true, p is converted into an effect of a.
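To illustrate the solving step, here is a brute-force weighted MAX-SAT sketch over two propositional variables; real systems use dedicated MAXSAT solvers, and variable names like p_in_PRE (standing for "p ∈ PRE(a)") and the clause weights are invented. The third clause encodes the action constraint p ∈ ADD(a) ⇒ p ∉ PRE(a).

```python
from itertools import product

# Sketch: a tiny brute-force weighted MAX-SAT solver. Each clause is
# (weight, [literals]); a literal (var, polarity) is satisfied when the
# assignment gives var that polarity.

def max_sat(variables, clauses):
    best, best_assign = -1.0, None
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        score = sum(w for w, lits in clauses
                    if any(assign[v] == pol for v, pol in lits))
        if score > best:
            best, best_assign = score, assign
    return best_assign, best

variables = ["p_in_PRE", "p_in_ADD"]
clauses = [
    (3.0, [("p_in_ADD", True)]),                        # web/state evidence
    (1.0, [("p_in_PRE", True)]),                        # weaker evidence
    (5.0, [("p_in_PRE", False), ("p_in_ADD", False)]),  # ADD(a) => not PRE(a)
]
assign, score = max_sat(variables, clauses)
# p is assigned to ADD(a) only, so it becomes an effect of a.
```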
Experimental Result
Example result:
(:action walk (?d - rover ?x - waypoint ?y - waypoint)
 :precondition (and (at ?d ?x) (visible ?x ?y))
 :effect (and (not (at ?d ?x)) (at ?d ?y) (not (visible ?x ?y))))
(The slide marks one missing condition and one extra condition in this model.)
By comparing to hand-written action models, we identify the missing and extra conditions. We calculate the error rate by counting all missing and extra conditions, and from it obtain the accuracy.
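Counting missing and extra conditions can be sketched as a set comparison; the encoding of a model as (preconditions, adds, dels) sets of predicate strings, and the example conditions, are assumptions for illustration:

```python
# Sketch: count missing and extra conditions of a learned model against a
# hand-written reference. Each model is (preconditions, adds, dels), with
# conditions as strings; the condition strings below are illustrative.

def error_counts(learned, reference):
    missing = sum(len(r - l) for l, r in zip(learned, reference))
    extra = sum(len(l - r) for l, r in zip(learned, reference))
    return missing, extra

learned = ({"at ?d ?x", "visible ?x ?y"},
           {"at ?d ?y"},
           {"at ?d ?x", "visible ?x ?y"})   # one extra delete effect
reference = ({"at ?d ?x", "visible ?x ?y"},
             {"at ?d ?y"},
             {"at ?d ?x"})
missing, extra = error_counts(learned, reference)
```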
Experimental Result
We compared LAWS to t-LAMP (Zhuo et al. 2009) and ARMS (Yang et al. 2007): t-LAMP "borrows" knowledge by building syntax mappings, while ARMS learns without "borrowing" knowledge.
The results are shown below:
Experimental Result
We can see that LAWS > t-LAMP > ARMS: the accuracies of LAWS are higher than those of t-LAMP and ARMS, which empirically shows the advantage of LAWS. Accuracies increase as the number of plan traces increases, which is consistent with our intuition, since more information helps learning.
Experimental Result
We also test the following three cases:
Case I (γ = 0): not borrowing knowledge;
Case II (γ = 0.5 and wo = 1): all web constraints have the same weight, i.e., the similarity function is not used;
Case III (γ = 0.5): using the similarity function.
The results are shown below:
Experimental Result
We can see that:
Case III > the other two: this suggests the similarity function really helps improve the learning result;
Case II > Case I: this suggests that web constraints are helpful.
Experimental Result
Next, we test different ratios of observed states. Accuracy generally increases as the ratio increases. This is consistent with our intuition, since the additional information helps improve the learning result.
Experimental Result
We also test different values of γ. When γ increases from 0 to 0.5, the accuracy increases: as the effect of web knowledge grows, the accuracy gets higher. However, when γ is larger than 0.5, the accuracy decreases as γ increases, because the impact of the plan traces is relatively reduced. This suggests that knowledge from plan traces is also important for learning high-quality action models.
CPU Time
The CPU time is less than 1,000 seconds on a typical 2 GHz PC with 1 GB memory, which is quite reasonable for learning. However, this does not include web-searching time, since that mainly depends on network quality.
Conclusion
In this paper, we propose an algorithm framework to "borrow" knowledge from another domain via web search, and we empirically show the resulting improvement in learning quality.
Our work can be extended to more complex action models, e.g., PDDL models, and to multi-task action-model acquisition.
Thank You