Strategic and Tactical Planning
A glimpse of the future

Zinovi Rabinovich, Jeffrey S. Rosenschein

School of Engineering and Computer Sciences, Hebrew University in Jerusalem


Agenda

• Planning - the common view
• Blocks world example
• Driving a car
• Strategic vs. Tactical Planning
• 'Let it be' plans
• Tactics - proposed solution
• Potential applicability

Planning - AI inheritance

Recall the most classical AI view of the world, State Oriented Domains (SOD) [3]:

• A world is perceived to be in one of a preset group of states
• and a set of actions is provided to shift the world from one state to another

Notice that it is a very nice kind of world:

• The world's state is known
• Actions perform in exactly the prescribed way
• The closed world assumption holds


Planning - AI inheritance (cont)

• In SODs, the aim of planning is to move the world from one state to another in some optimal way.
• Thus, a plan is a sequence of actions that brings the world from the current state to a certain target state.
• The usual measure of optimality for such plans is the number of steps (actions) the plan prescribes to reach the desired state.
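A minimal sketch of this classical view, assuming a deterministic and fully observed domain: breadth-first search over states returns a plan with the fewest actions. The encoding below (tuples of stacks, the `moves` generator, the block names and positions) is a hypothetical toy domain chosen for illustration, not the configuration pictured in the talk.

```python
from collections import deque


def shortest_plan(start, goal, actions):
    """Breadth-first search; returns a shortest list of action names, or None."""
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, action name)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, name = parent[state]
                plan.append(name)
            return plan[::-1]
        for name, nxt in actions(state):
            if nxt not in parent:
                parent[nxt] = (state, name)
                frontier.append(nxt)
    return None


def moves(state):
    """Hypothetical blocks domain: move the top block of one stack onto another."""
    for i, src in enumerate(state):
        if not src:
            continue
        for j in range(len(state)):
            if i != j:
                stacks = list(state)
                stacks[i] = src[:-1]
                stacks[j] = state[j] + (src[-1],)
                yield f"move {src[-1]} from {i + 1} to {j + 1}", tuple(stacks)


# Each stack is listed bottom to top; positions are numbered 1..3.
start = (("white",), ("black",), ("gray",))
goal = (("black", "white"), (), ("gray",))
plan = shortest_plan(start, goal, moves)
```

Because every action costs one step, the first time the search reaches the goal it has found a plan that is optimal in the step-count sense used above.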


Blocks World

For example, consider the Blocks World domain...

Plan:

• Move black from 2 onto white at 1
• Move gray from 3 onto table at 2
• Move black from 1 onto gray at 3
• Move white from 1 onto gray at 2


Rigidity of approach

However, all the major assumptions of SOD cease to hold as we move toward a more realistic setup:

• Our sensory system is clogged by noise and is subject to aliasing
• Actions are seldom accurate and tend to have side effects
• The world keeps 'spinning' without any intervention from us


Planning solutions to world dynamics

• Conditional plans [2, 1] came to help deal with the contingencies of plan execution
• as well as Partial (Global) Planning, which uses multiple participating entities and develops the plan on the fly
• and many others: via negotiation, mixed-initiative, etc.

... but they all assume that we'd like to keep the world state under control, or at least within certain descriptive bounds...


Driving a car

Imagine a car running on a single-lane road, e.g. a Formula One race car.

• What is the set of all possible states?
    • The set of all possible margins from the road edge
    • We can discretize the domain for simplicity
• What is the subset of all states we'd like to be in?
    • Those in the middle of the road, far, far away from the edges, raw ground and pedestrians


Driving a car (cont)

Consider the following plan: push the car into (roughly) the middle of the road and leave it there.

• The plan achieves the presence of a car in the middle of the road
• The plan works almost always (we can't rule out the case where a 16-ton carrier propels the car into oblivion) and does not need revision
• Though the plan was correct, we didn't mean for the car to stay stationary...

So what happened?


Driving a (moving) car

There are actually two different reasoning levels for driving a (moving) car:

• The reason for being in that car: the desire to trace a trajectory from point A to point B over time
• The reason for adjusting the wheels: forcing the car to stay on a given trajectory over time

We do not have stationary goal margin(s) to the road edges. Rather, we'd like the margin to develop according to a certain design.


Strategic vs. Tactical Planning

Planning (especially in a dynamic environment) is (roughly) a two-level construction:

• Strategic (high level): transforming system goals into desired system dynamics
• Tactical (low level): building a sequence of actions that attempts to force the system into the desired form of dynamics

Strategic/Tactical Loop

The two levels of the hierarchy create a relentless flow of planning:

• Given the global goal and the previous success in following strategic directives, update and formulate an alternative strategy
• Given a strategy, provide tactical (= implementation) support and an evaluation of feasibility
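The loop between the two levels can be sketched as follows. This is only a skeleton under stated assumptions: `planning_loop`, `formulate`, `make_tactics`, the feedback dictionary and the toy "requested speed" strategy are all hypothetical names invented for the illustration; the talk does not prescribe an interface.

```python
def planning_loop(goal, formulate_strategy, run_tactics, max_rounds=5):
    """Alternate the strategic and tactical levels until the goal is reported reached."""
    strategy, feedback, log = None, None, []
    for _ in range(max_rounds):
        # Strategic level: revise the strategy using feedback on the previous one.
        strategy = formulate_strategy(goal, strategy, feedback)
        # Tactical level: try to implement the strategy and report feasibility.
        feedback = run_tactics(strategy)
        log.append((strategy, feedback))
        if feedback["goal_reached"]:
            break
    return log


def formulate(goal, previous, feedback):
    """Toy strategic level: the strategy is a requested speed, halved when infeasible."""
    if previous is None:
        return 8
    return previous if feedback["feasible"] else previous // 2


def make_tactics(max_speed, goal):
    """Toy tactical level: it can only follow speeds up to max_speed."""
    position = 0

    def run(speed):
        nonlocal position
        feasible = speed <= max_speed
        if feasible:
            position += speed
        return {"feasible": feasible, "goal_reached": position >= goal}

    return run


log = planning_loop(4, formulate, make_tactics(max_speed=3, goal=4))
strategies = [strategy for strategy, _ in log]
```

In this toy run an infeasible strategy is not a fatal failure: it simply feeds back into the next strategic update, which is the point of keeping re-planning inside the standard loop.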

Compare:

• In classical planning the tactical level is degenerate
• Even in conditional planning we allow the system to develop freely and simply describe the desired continuation for each contingency
• A strong 'tactical' level allows re-planning procedures (should they occur) to be part of the standard planning loop

A previously exceptional, radical, potentially fatal (strategic) plan failure now becomes a common, mild, recoverable situation, part of normal activity.


Formalism

To formalize the tactical level operations we use a POMDP-like description.

Given a system described by:

• A set of possible states
• Possible control actions
• System transition dynamics $T(s' \mid a, s)$
• An initial state $s_0$
• A set of possible observations $O$
• Observation probabilities $p(o \mid s, a)$
• A strategic target $\tau(s' \mid s)$

Find the sequence of actions such that the observed system dynamics $p_t(s' \mid s)$ would be as close as possible to $\tau$, i.e. minimize the tactical distance.


Stayin’ alive plans

• How can we measure the distance between two functions?
    • The functions are actually probabilities, so use the Kullback-Leibler distance

\[
D_{KL}(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}
\]

• How do we treat time and value over time?
    • Compute the resulting probability distribution of the distance and keep the probability of breaking a threshold low: just stay alive
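As a small illustration, the Kullback-Leibler distance between two discrete distributions follows directly from its standard definition, the sum over x of p(x) log(p(x) / q(x)). The second helper is an assumption made for this sketch: it estimates the probability of breaking a threshold as the fraction of sampled distances above it, which is one simple reading of "stay alive" rather than the procedure used in the talk. Both function names are hypothetical.

```python
import math


def kl_distance(p, q):
    """D_KL(p || q) for discrete distributions given as equal-length lists (natural log)."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qx == 0.0:
            return math.inf  # p puts mass where q has none
        total += px * math.log(px / qx)
    return total


def breach_probability(distances, threshold):
    """Empirical probability that a sampled distance exceeds the threshold."""
    return sum(d > threshold for d in distances) / len(distances)


same = kl_distance([0.5, 0.5], [0.5, 0.5])
forward = kl_distance([0.9, 0.1], [0.5, 0.5])
backward = kl_distance([0.5, 0.5], [0.9, 0.1])
risk = breach_probability([0.1, 0.4, 0.9, 0.2], threshold=0.5)
```

Note that `forward` and `backward` differ: the measure is not symmetric, so the order of the observed dynamics and the target matters.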


Tactics - proposed solution

• Keep track of the probable system state, $p_t(s)$
• Keep track of the estimated system transitions, $p_t(s' \mid s)$
• Given the current beliefs, select an action:

\[
a^* = \arg\min_a \; E_{p_t(s),\; T(s' \mid a, s),\; p(o \mid s', a, s)} \Big[ D_{KL}\big( p_{t+1}(s' \mid s) \,\|\, \tau(s' \mid s) \big) \Big]
\]

But how can we keep track of $p_t(s)$ and $p_t(s' \mid s)$?

Proposed solution (cont)

• Initialize your beliefs to some prior distribution
• Use "Bayesian anti-aliasing" for $p_t(s)$:

\[
p_{t+1}(s) \propto p(o \mid s, a) \sum_{s'} T(s \mid a, s') \, p_t(s')
\]
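The state-belief update is the usual Bayesian filter: predict with the transition model, weight by the likelihood of the observation, and normalize. A minimal sketch, assuming discrete states indexed 0..n-1 and nested-list tables; the table layouts, the function name `belief_update` and the two-state example are assumptions of this sketch.

```python
def belief_update(belief, T, obs_prob, action, obs):
    """Return the next belief: weight the predicted state by p(o | s, a), then normalize.

    T[action][s_prev][s]   -- transition probability from s_prev to s under action
    obs_prob[action][s][o] -- probability of observing o in state s after action
    """
    n = len(belief)
    # Prediction step: sum over previous states of transition probability times belief.
    predicted = [
        sum(T[action][s_prev][s] * belief[s_prev] for s_prev in range(n))
        for s in range(n)
    ]
    # Correction step: weight each state by the likelihood of the observation.
    weighted = [obs_prob[action][s][obs] * predicted[s] for s in range(n)]
    norm = sum(weighted)
    if norm == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [w / norm for w in weighted]


# Hypothetical two-state system: one action that leaves the state unchanged,
# and a sensor that reports the true state with probability 0.8.
T = [[[1.0, 0.0], [0.0, 1.0]]]
obs_prob = [[[0.8, 0.2], [0.2, 0.8]]]
posterior = belief_update([0.5, 0.5], T, obs_prob, action=0, obs=0)
```

Starting from a uniform prior, a single observation of state 0 shifts the belief toward state 0 in proportion to the sensor accuracy.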

• For $p_t(s' \mid s)$ solve the following:

\[
\min_{p(s' \mid s)} \; E_{p_t(s)} \Big[ D_{KL}\big( p(s' \mid s) \,\|\, p_t(s' \mid s) \big) \Big]
\]

s.t.

\[
p_{t+1}(s') = \sum_{s} p(s' \mid s) \, p_t(s), \qquad \sum_{s'} p(s' \mid s) = 1
\]
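One way to experiment with the transition-estimate update and the action choice is sketched below. Two substitutions are made here and are not taken from the talk: the constrained minimization is solved by iterative proportional fitting on the joint table of estimated transition times previous belief (a standard way to find the closest table, in the Kullback-Leibler sense, with prescribed marginals), and the expectation over observations in the action-selection rule is replaced by the belief predicted from the transition model alone. All function names and the two-state example are hypothetical.

```python
import math


def kl(p, q):
    """D_KL(p || q) for discrete distributions given as lists."""
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0.0:
            if qx == 0.0:
                return math.inf
            total += px * math.log(px / qx)
    return total


def update_transitions(est, p_prev, p_next, sweeps=200):
    """Adjust est[s][s2] (estimate of moving from s to s2) so it maps p_prev onto p_next."""
    n = len(p_prev)
    joint = [[est[s][s2] * p_prev[s] for s2 in range(n)] for s in range(n)]
    for _ in range(sweeps):
        for s2 in range(n):  # match the next-state marginal p_next
            col = sum(joint[s][s2] for s in range(n))
            if col > 0.0:
                for s in range(n):
                    joint[s][s2] *= p_next[s2] / col
        for s in range(n):  # restore the previous-state marginal p_prev
            row = sum(joint[s])
            if row > 0.0:
                for s2 in range(n):
                    joint[s][s2] *= p_prev[s] / row
    # Divide the joint by p_prev to get conditionals; keep the old row where p_prev is 0.
    return [
        [joint[s][s2] / p_prev[s] for s2 in range(n)] if p_prev[s] > 0.0 else list(est[s])
        for s in range(n)
    ]


def select_action(belief, est, T, target):
    """Pick the action whose predicted transition estimate is closest to the target."""
    n = len(belief)
    best_action, best_score = None, math.inf
    for a in range(len(T)):
        # Predicted next belief under action a (observations are ignored in this sketch).
        predicted = [sum(T[a][s][s2] * belief[s] for s in range(n)) for s2 in range(n)]
        new_est = update_transitions(est, belief, predicted)
        score = sum(belief[s] * kl(new_est[s], target[s]) for s in range(n))
        if score < best_score:
            best_action, best_score = a, score
    return best_action


T = [
    [[0.9, 0.1], [0.1, 0.9]],  # action 0: mostly stay in the current state
    [[0.1, 0.9], [0.9, 0.1]],  # action 1: mostly switch state
]
target = [[0.1, 0.9], [0.1, 0.9]]  # desired dynamics: move to state 1 and stay there
belief = [0.9, 0.1]
est = [[0.5, 0.5], [0.5, 0.5]]
new_est = update_transitions(est, belief, [0.18, 0.82])
action = select_action(belief, est, T, target)
```

With the system believed to be mostly in state 0 and a target that favors state 1, the sketch prefers the switching action, since the transition estimate it induces is far closer to the target.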

Conditional applicability

Consider a multi-agent system with communication under the following assumptions:

• Communication activity does not change the environment and is integrated into the global action set
• Action cost and state transition evaluation are separable: denoting by $V(s', a, s)$ the overall value of the transition from state $s$ to $s'$ under action $a$, there exist $\tau$ and $c$ such that

\[
V(s', a, s) = \tau(s' \mid s) + c(a \mid s)
\]

Then it is possible to create (optimal) control of the system using the above strategic vs. tactical planning paradigm.


MAS Strategic vs. Tactical protocol

• Strategic levels of different agents will agree upon a target in a communication session
    • Basically creating a common evaluation function $\tau(s' \mid s)$ and converting it into a target distribution
• Under the assumption of complete coordination, an agent will use the proposed tactical planning to comply with the strategy
• A strong failure of the strategy will initiate communication to repeat the strategic layer operation

Tactical communication timing

Communication cost is equivalent to one decision step:

• Tactical level communication will arise automatically as a sole action that does not have the capability to hinder the system development dynamics

\[
a^* = \arg\min_a \; E_{p_t(s),\; T(s' \mid a, s),\; p(o \mid s', a, s)} \Big[ D_{KL}\big( p_{t+1}(s' \mid s) \,\|\, \tau(s' \mid s) \big) \Big]
\]

Tactical communication timing

Communication cost is an elaborate function $c(a \mid s)$:

• Convert the function into a distribution
• Keep track of the action usage distribution $u(a \mid s)$
• Use the following to select an action:

\[
a^* = \arg\min_a \; E_{p_t(s),\; T(s' \mid a, s),\; p(o \mid s', a, s)} \Big[ D_{KL}\big( p_{t+1}(s' \mid s) \,\|\, \tau(s' \mid s) \big) + D_{KL}\big( u(a \mid s) \,\|\, c(a \mid s) \big) \Big]
\]

Conclusion

• A novel view of planning and plans was introduced.
• A new optimality measure for agent behavior in a stochastic environment was developed in the framework of continual planning.
• A feasible algorithm for multi-agent communication utilization under the measure was proposed.

Future Work

• Investigate the connections between the classical optimality measure(s) and tactical distance
• Prove/disprove that the proposed tactical solution is an optimal one relative to tactical distance
• Investigate the effects of the tactical solution on multi-agent systems with communication

References

[1] Jim Blythe. An overview of planning under uncertainty. Volume 1600 of Lecture Notes in Computer Science, pages 85–??, 1999.

[2] Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1–94, 1999.

[3] Jeffrey S. Rosenschein and Gilad Zlotkin. Rules of Encounter: Designing Conventions for Automated Negotiation Among Computers. MIT Press, Cambridge, Massachusetts, 1994.