Autonomous Target Assignment: A Game Theoretical Formulation Gurdal Arslan & Jeff Shamma Mechanical and Aerospace Engineering UCLA AFOSR / MURI

Autonomous Target Assignment:

A Game Theoretical Formulation

Gurdal Arslan & Jeff Shamma

Mechanical and Aerospace EngineeringUCLA

AFOSR / MURI

2

max Global Utility ( assignment )

Setup for Target Assignment Problem

3

Global Utility Example

Global Utility = Utility generated at target j

E [ total value of ( destroyed target – vehicles lost ) ]

j

No engagement hereIndependent engagements

(for example)

4

Joint Optimization

• Can be formulated as an integer programming problem

- Computationally hard

- Relaxation techniques available for suboptimal solutions

• Decentralized implementation

- Requires global information

- Agreement issues can arise

) ( maxa

aU g

Global Utility

),...,( 1 naaAssignment Profile :

5

Game Theory Formulation

• Vehicles are self-interested players with private utilities

• A vehicle need not know other vehicles’ utilities.

• Individual utilities depend on local information only.

• Vehicles negotiate an agreeable assignment.

)(aU i

6

Autonomous Target Assignment Problem

Design

- Vehicle utilities

- Negotiation mechanisms

so that

vehicles agree on an assignment with high Global Utility

using

- low computational power

- low inter-vehicle communication

)(aU i

7

Agreeable Assignment - Nash Equilibrium

• An assignment is a ( pure ) Nash equilibrium if no player has an incentive to unilaterally deviate from it.

• Example :

• Pure Nash equilibria : (1,1), (2,2), (3,3)

• Mixed Nash equilibria : ([ .54 .27 .18] , [ .27 .54 .18])

21 3

2,1 0,0 0,0

0,0 1,2 0,0

0,0 0,0 3,3

1

1 2

2

3

3

8

Utility Design

• Vehicle utilities should be aligned with Global Utility

• Ideal alignment :

– Only globally optimal assignments should be agreeable

– Not possible without computing globally optimal assignments

• Relaxed alignment ( factoredness in Wolpert et al. 2000 ) :

– Globally optimum assignment is always agreeable (pure Nash)

),...,,...,( ),...,~,...,(

),...,,...,( ),...,~,...,(

11

11

nignig

niinii

aaaUaaaU

aaaUaaaU

9

Aligned Utilities - Team Play

• For every vehicle,

• Example :

• Not localized- Each vehicle needs global information

- Low Signal-to-Noise-Ratio (Wolpert et al. 2000)

)()( aUaU gi

21

2 , 2 0 , 0

0 , 0 1 , 1

1

1 2

2

Suboptimal Nash

10

Aligned Utilities - Wonderful Life Utility (Wolpert et al. 2000)

• Marginal contribution of vehicle # i to Global Utility, i.e.,

• Localized :

- Equal marginal contribution to engagements within range

- Signal-to-Noise-Ratio is maximized

)" 0 " :()()( iggi aaUaUaU

no engagement

11

Aligned Utilities - Wonderful Life Utility (Wolpert et al. 2000)

• Aligned :

• Leads to a Potential Game with potential

• Convergent negotiation mechanisms for potential games

),...,,...,( ),...,~,...,(

),...,,...,( ),...,~,...,(

11

11

nignig

niinii

aaaUaaaU

aaaUaaaU

)(aU g

12

A Misaligned Utility Structure

• Equally Shared Utilities :

• Hence

• Global optimum may not be Nash agreeable

• A pure Nash agreeable assignment may not exists at all !

aaU i

i vehiclesingparticipat ofnumber

with engagementby generatedutility )(

i

ig aUaU )()(

13

Negotiation Mechanisms

• At step k, vehicle # i proposes a target

based on the past proposal profiles

• Is there a reasonable negotiation mechanism that leads to a Nash equilibrium ?

• Adopt learning methods in repeated games

)1(),...,1( kaa

)(kai

14

Fictitious Play (FP)

• Empirical frequencies of past proposals

• Vehicle # i proposes the best response to

• Initial proposals are random

.....) , %30 , 20(%)( kqi

target # 1 target # 2

) )1(,...),...,1( ( maxarg)( 1 kqakqUka niia

ii

15

Convergence of FP in Potential Games

• Convergent to a (possibly mixed) Nash equilibrium

• Believed to be generically convergent to a pure Nash

• May be trapped at a suboptimal Nash

21

2 , 2 0 , 0

0 , 0 1 , 1

1

1 2

2

1k allfor ) 2 , 2 () ( ) 2 , 2 () 1 ( kaa

16

Stochastic FP

• Randomness may help avoid a suboptimal assignment

• Positive probability of convergence to any pure Nash equilibrium in almost all potential games

• Conjecture :

Stochastic FP converges to one of the pure Nash equilibria almost surely, in almost all potential games.

)( ) )1(,...),...,1( ( maxarg~)( 1 iniip

i pHkqpkqUkai

17

Stochastic FP with Utility Measurements

• Empirical average payoff for past proposals

• Propose the target with largest empirical average payoff

Ti kVkVkV

ii.....) , )( , )( ()( 21

)( maxarg~)( iiT

pi pH(k)Vpka

ii

target # 1 target # 2

18

Stochastic FP with Utility Measurements

• Advantage :

Don’t need to keep track of empirical frequencies

• Conjecture :

Converges to one of the pure Nash equilibria almost surely, in almost all potential games.

• Convergence may be slower than FP

19

Near Optimum Performance

• Example:

40 uniform weapons negotiate 40 non-uniform targets

20

Simulation Environment

• Consists of entities and a battlefield

21

Entity Types

• Entities can be of different types

• Each entity type represented by a data structure

• For example:

type = ‘uav’

side = ‘blue’

attributes = {‘radius’ ‘pkill’ ‘SARscanrate’ … }

states = {‘health’ ‘location’ ‘heading’ ‘nstores’ … }

routine = ‘uav_rule’;

22

Entity Rules

• How an entity type interacts and performs tasks

• For example, routine for `uav’ type :

• If UAV is alive

– Find all SAM’s within radius

– If no missile is launched & UAV has ammo

Target closest live enemy

23

State Space

• State space consists of

- states of all entities

- environment states

number of iterations left, etc.

• Simulation state is updated essentially based on the entity rules

• At each step, simulation state is displayed using visualization tools.

24

Simulation

• Scalable: One routine for each entity type

• Easy to modify, introduce new types, rules, etc.

25

Extensions & Issues

• Investigate other negotiation mechanisms

- Gradient based mechanisms

- Replicator dynamics

- Finite Memory

- Asynchronous mechanisms

• Explore applications

- Traffic management

- Routing in communication networks

Documents

Autonomous Target Assignment: A Game Theoretical Formulation Gurdal Arslan & Jeff Shamma Mechanical and Aerospace Engineering UCLA AFOSR / MURI