Seminar on Coordinated Systems

Department of Systems and Information Engineering

University of Virginia

SYS 793, Fall 2004

Motivation

• We are increasingly reliant on large-scale, distributed engineering systems: the Internet, the national power grid, etc.
  – Coordination is often achieved in engineering systems through the specification of ad hoc protocols for relatively well-defined (constrained) interactions between distributed systems.
  – Existing protocols are not adequately tuned for new applications and/or unexpected situations.
  – There appears to be little in the way of underlying guiding principles and theory for designing and operating such systems.
  – Notions of game theory and decentralized control go only part way toward revealing the basic problems associated with distributed engineering systems.

SYS 793

• In this seminar:
  – We will explore the recent literature on the intersections between game theory, distributed planning in robotics and artificial intelligence, and distributed control.
  – Faculty and students are expected to present a paper (to be chosen from the list given below).
• After each presentation we shall have a discussion session where the main contributions of the paper will be critically assessed.

Parameters

• Time: Wednesdays, 5-6:15
• Place: Olsson 005
• Credit: One hour, pass/fail
• Registration: Schedule number 95083 (SYS 793, Section 2)
• Webpage: https://toolkit.itc.virginia.edu/cgi-local/tk/UVa_SEAS_2004_Fall_SYS793-2



Student Responsibilities

• To earn a passing grade, each student will:
  – prepare a presentation on at least one approved paper
  – lead the subsequent discussion on salient features of the paper
  – hand over slides for publication on the course website

Partial List of Approved Papers

R. Aumann. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1:67–96, 1974.

H. Bui, S. Venkatesh, and D. Kieronska. A framework for coordination and learning among teams of agents. Lecture Notes in Computer Science, 1441:164–178, 1998.

J. Doran, S. Franklin, N. Jennings, and T. Norman. On cooperation in multi-agent systems. The Knowledge Engineering Review, 12(3):309–314, 1997.

G. Ellison. Learning, local interaction and coordination. Econometrica, 61:1047–1071, 1993.

D. Gauthier. Coordination. Dialogue, 14:195–221, 1975.

D. Gilbert. Rationality and salience. Philosophical Studies, 57:61–77, 1989.

V. Gervasi and G. Prencipe. Robotic cops: The intruder problem. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, 2003.

M. Kandori, G. Mailath, and R. Rob. Learning, mutation and long-run equilibria in games. Econometrica, 61:29–56, 1993.

G. Prencipe. Corda: Distributed coordination of a set of autonomous mobile robots. In Proceedings of the Fourth European Research Seminar on Advances in Distributed Systems, pages 185–190, 2001.

P. Vanderschraaf. Learning and coordination: inductive deliberation, equilibrium and convention. Routledge, 2001.

X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002), 2002.

G. Weiss. Multiagent Systems: a Modern Approach to Distributed Artificial Intelligence. The MIT Press, Cambridge, 1999.

Agenda

• Coordinated Systems Research Group (CSRG)

• Introduction to “Coordinated Systems”

Coordinated Systems Research Group

CSRG Mandate

• The CSRG focuses on issues of coordination in large-scale decentralized engineering systems.
  – Engineering systems are increasingly reliant on coordination, more than on (centralized) optimization processes and/or control.
  – Typically, coordination is achieved through ad hoc protocols for relatively well-defined (constrained) interactions between distributed systems.
  – However, as information systems become more integrated into society, we find that existing protocols aren’t adequately tuned for new applications and/or unexpected situations.
• CSRG goals:
  – Develop theory for achieving coordination with limited communication and without the benefit of predefined protocols
  – Transform theory into practice by developing and evaluating prototype applications

CSRG Activities

• Distributed decision-making without communication
  – fault-tolerant and adaptive construction of overlay networks for group communication and collaboration
  – remote sensing
  – robotic coordination
• Auction mechanisms for “on-demand” IT services
  – market mechanisms for efficient allocation of distributed computation and communication services
• Fictitious play in Internet traffic engineering
  – account for competing interests of end-to-end connections within and across Internet autonomous systems

Peter Beling

• Associate Professor
  – Department of Systems and Information Engineering
• Research:
  – Financial engineering
  – Optimization theory & computational complexity
• Funded Projects:
  – Solution Concepts for Static Coordination Problems
    • NASA LaRC Grant NNL-04-AA66G

Alfredo Garcia

• Assistant Professor
  – Department of Systems and Information Engineering
• Research:
  – Modeling and control of communications networks
  – Stochastic Optimization and Optimal Control
• Funded Projects:
  – Complex Networks Optimization
    • NSF Grant DMI-0217371
  – Security of Supply & Strategic Learning in Restructured Power Markets
    • NSF Grant ECS-0224747

Stephen Patek

• Assistant Professor
  – Department of Systems and Information Engineering
• Research:
  – Modeling and control of communications networks
  – Stochastic Optimization and Optimal Control
  – Coordination Processes
• Funded Projects:
  – Dynamic Coordination Processes for Distributed Planning with Limited Communication, NSF (DST-0414727)

Current Students

• Himanshu Gupta

• Kaushik Sinha

• Yijia Zhao

Introduction to Coordinated Systems

Example: Dial / Wait

• Two players, two actions:
  [Figure: 2×2 cost matrix in which each player chooses Dial or Wait]
• What makes sense?
  – There are two “coordinated minimum-cost solutions”: (Dial, Wait) and (Wait, Dial).
  – Unfortunately, neither player knows which one to select.

Some Thoughts

• Arbitrarily selecting an action is irrational!
  – If Player 1 arbitrarily chooses to Dial, Player 2 could also arbitrarily choose to Dial, resulting in a worst-case outcome.
  – Worst-case solutions are easy to achieve arbitrarily (without coordination).
• Randomly selecting an action makes more sense!
  – If Player 1 chooses to Dial with probability p and Player 2 chooses to Dial with probability q, then the expected cost of the outcome is

    $$V(p, q) = E[\text{Cost} \mid p, q] = pq + (1-p)(1-q) = \Pr(\text{not connecting})$$

  – Only in introducing “mixed actions” (randomized decisions) is it meaningful to talk about expected cost.

Interesting Observations

• Suppose Player 1 chooses p = .5, then

    $$E[\text{Cost} \mid .5, q] = .5q + (1-.5)(1-q) = .5[q + (1-q)] = .5$$

  In other words, regardless of Player 2’s decision, Player 1 is able to “lock in” an expected cost of .5.
• Suppose Player 2 chooses q = .5, then

    $$E[\text{Cost} \mid p, .5] = p(.5) + (1-p)(1-.5) = .5[p + (1-p)] = .5$$

  In other words, regardless of Player 1’s decision, Player 2 is able to lock in an expected cost of .5.
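The lock-in property can be checked numerically. What follows is a minimal Python sketch, assuming the expected cost is V(p, q) = pq + (1-p)(1-q), the probability that the players fail to connect. The function name `expected_cost` and the grid of q values are illustrative choices, not part of the original slides.

```python
def expected_cost(p: float, q: float) -> float:
    """Expected cost V(p, q): probability that the two players fail to connect.

    p and q are the probabilities that Player 1 and Player 2 choose Dial.
    The players fail to connect when both Dial or both Wait.
    """
    return p * q + (1 - p) * (1 - q)


# Hypothetical grid of choices for Player 2: q = 0.0, 0.1, ..., 1.0.
Q_GRID = [i / 10 for i in range(11)]

# Player 1 fixes p = 0.5: the expected cost stays at 0.5 for every q.
locked_in = [expected_cost(0.5, q) for q in Q_GRID]

# For comparison, always dialing (p = 1) leaves Player 1 exposed:
# the expected cost then equals q and ranges from 0 to 1.
exposed = [expected_cost(1.0, q) for q in Q_GRID]

if __name__ == "__main__":
    print("p = 0.5:", [round(v, 3) for v in locked_in])
    print("p = 1.0:", [round(v, 3) for v in exposed])
```

The comparison with p = 1 illustrates why an arbitrary pure choice is risky: its cost depends entirely on what the other player does.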

A Strong Equilibrium

Each player has the ability to lock in an expected cost of .5, namely by choosing p = .5 and q = .5, respectively.
  – In fact, given p = .5 and q = .5, neither player has the ability to change (let alone improve) the expected cost of the solution.
  – So, p = .5 and q = .5 constitute a strong equilibrium solution for the Dial / Wait problem.

Connection to Game Theory

• The Dial / Wait problem is a finite, two-player, non-cooperative game in strategic form (with identical interests).
• For the Dial / Wait game there are two pure strategy Nash equilibria:
  – (Dial, Wait)
  – (Wait, Dial)
  and exactly one mixed (non-pure) strategy Nash equilibrium:
  – p = .5, q = .5.
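These equilibria can be verified by brute force. The sketch below assumes the shared cost is 1 when the players fail to connect and 0 when they connect, which is what the expected-cost formula pq + (1-p)(1-q) implies. All helper names are hypothetical.

```python
from itertools import product

ACTIONS = ("Dial", "Wait")


def cost(a1: str, a2: str) -> int:
    """Shared cost (assumed): 1 if the players fail to connect, 0 otherwise."""
    return 1 if a1 == a2 else 0


def is_pure_nash(a1: str, a2: str) -> bool:
    """True if neither player can lower the shared cost by deviating alone."""
    current = cost(a1, a2)
    player1_ok = all(cost(d, a2) >= current for d in ACTIONS)
    player2_ok = all(cost(a1, d) >= current for d in ACTIONS)
    return player1_ok and player2_ok


def mixed_cost(p: float, q: float) -> float:
    """Expected cost when Player 1 dials w.p. p and Player 2 dials w.p. q."""
    return p * q + (1 - p) * (1 - q)


def is_nash(p: float, q: float, tol: float = 1e-12) -> bool:
    """Check unilateral deviations to pure actions only.

    This suffices because each player's expected cost is linear in that
    player's own probability, so the best deviation is always a pure action.
    """
    value = mixed_cost(p, q)
    best_for_1 = min(mixed_cost(0.0, q), mixed_cost(1.0, q))
    best_for_2 = min(mixed_cost(p, 0.0), mixed_cost(p, 1.0))
    return best_for_1 >= value - tol and best_for_2 >= value - tol


pure_equilibria = [pair for pair in product(ACTIONS, repeat=2) if is_pure_nash(*pair)]

if __name__ == "__main__":
    print(pure_equilibria)
    print(is_nash(0.5, 0.5), is_nash(0.3, 0.3))
```

Note that `is_nash` also accepts the pure equilibria written as degenerate probabilities, for example p = 1, q = 0.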

Remarks

• The pure strategy equilibria are completely uninteresting.
  – They can only be achieved if the players are allowed to coordinate their actions.
• Minimax is correct here, but it is not a reasonable approach in general.
• The mixed strategy equilibrium makes a lot of sense.
  – It achieves a decent value of expected cost and is not the result of an arbitrary decision.
  – In fact, all that’s required in this example is that one of the two players play their equilibrium strategy.
• But, in general, there can be multiple mixed strategy equilibria, and it is not obvious which one each player would select.

N-Agent, Multistage Problems

Uncoordinated Decision-Making

We consider N-agent multistage decision situations, where
  – all N agents must select an available action at each stage, without coordinating their actions in advance (simultaneous decision making)
  – all N agents perceive the same cost (disutility) associated with the joint selection of actions at each stage.

Key Idea

Since all agents share the same notion of cost, they would coordinate their actions (if they could) to pick out a minimum-cost joint solution.
  – This would be easy, even without coordination, if there were a single minimum-cost joint selection of actions.
  – Unfortunately, if more than one “coordinated minimum-cost solution” exists, then there may not be a clear course of action for all agents.

Dynamic Version of the Dial / Wait Problem

• Both players decide to Dial or Wait in stages.
  – If both decide to Dial or if both decide to Wait, then they remain unconnected.
• Both players are interested in reaching the connected state in as few stages as possible.

[Figure: state transition diagram. From “not connected”, (Dial, Dial) and (Wait, Wait) return to “not connected”; (Dial, Wait) and (Wait, Dial) lead to “connected”.]
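To get a feel for the dynamic problem, suppose (as a simplifying assumption that the slide does not make) that Player 1 dials with the same probability p and Player 2 with the same probability q at every stage, independently of the past. The players then connect at each stage with probability p(1-q) + (1-p)q, so the number of stages is geometrically distributed. The Python sketch below, with hypothetical function names, computes the expected number of stages and cross-checks it by simulation.

```python
import random


def connect_probability(p: float, q: float) -> float:
    """Probability that exactly one player dials in a given stage."""
    return p * (1 - q) + (1 - p) * q


def expected_stages(p: float, q: float) -> float:
    """Mean of the geometric number of stages until the players connect."""
    s = connect_probability(p, q)
    return float("inf") if s == 0 else 1.0 / s


def simulate_stages(p: float, q: float, rng: random.Random,
                    max_stages: int = 10_000) -> int:
    """Play the staged game once and return the stage at which they connect.

    Returns max_stages if no connection occurs within the cap.
    """
    for stage in range(1, max_stages + 1):
        player1_dials = rng.random() < p
        player2_dials = rng.random() < q
        if player1_dials != player2_dials:  # one dials, the other waits
            return stage
    return max_stages


if __name__ == "__main__":
    rng = random.Random(0)
    runs = 20_000
    mean = sum(simulate_stages(0.5, 0.5, rng) for _ in range(runs)) / runs
    print("analytic:", expected_stages(0.5, 0.5), "simulated:", round(mean, 3))
```

Under this stationary assumption, and with both players using the same probability, the per-stage connection probability 2p(1-p) is largest at p = .5, which gives an expected two stages.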

Formalization

• State space, X
  – Set of conditions associated with the operation of an underlying system.
• Mixed action sets, A_i(x)
  – Actions available to player i when the system is in state x ∈ X.
• Transition reward function, g_x(a_1, …, a_N)
  – Expected reward (perceived equally by all players) associated with the profile of mixed actions (a_1, …, a_N) at state x ∈ X.
• State transition probability, p_xy(a_1, …, a_N)
  – Probability of transitioning from x ∈ X to y ∈ X under the profile of actions (a_1, …, a_N).
• Time horizon, T
  – Number of stages of decision-making before the process ends.
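The ingredients above map naturally onto a small data structure. The following Python sketch is one possible encoding, not the authors' implementation: the class `CoordinationModel`, the function `evaluate`, and the restriction to a stationary profile of mixed actions (the same profile at every stage) are all assumptions made for illustration. The Dial / Wait instance uses the negative of the expected cost as the reward.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Sequence, Tuple

State = Hashable
Profile = Tuple[float, ...]  # one mixed action per player (here: Pr of Dial)


@dataclass
class CoordinationModel:
    states: Sequence[State]                                      # X
    reward: Callable[[State, Profile], float]                    # g_x(a_1..a_N)
    transition: Callable[[State, Profile], Dict[State, float]]   # y -> p_xy(a_1..a_N)
    horizon: int                                                 # T


def evaluate(model: CoordinationModel,
             policy: Callable[[State], Profile]) -> Dict[State, float]:
    """Expected total reward over T stages, by backward recursion.

    The policy maps each state to a profile of mixed actions and is assumed
    to be the same at every stage (a simplification).
    """
    value = {x: 0.0 for x in model.states}
    for _ in range(model.horizon):
        updated = {}
        for x in model.states:
            a = policy(x)
            future = sum(prob * value[y]
                         for y, prob in model.transition(x, a).items())
            updated[x] = model.reward(x, a) + future
        value = updated
    return value


def dw_reward(x: State, a: Profile) -> float:
    """Dial / Wait: reward is minus the probability of failing to connect."""
    if x == "connected":
        return 0.0
    p, q = a
    return -(p * q + (1 - p) * (1 - q))


def dw_transition(x: State, a: Profile) -> Dict[State, float]:
    """Dial / Wait: stay unconnected when both Dial or both Wait."""
    if x == "connected":
        return {"connected": 1.0}
    p, q = a
    stay = p * q + (1 - p) * (1 - q)
    return {"not connected": stay, "connected": 1.0 - stay}


model = CoordinationModel(states=("not connected", "connected"),
                          reward=dw_reward, transition=dw_transition, horizon=3)
values = evaluate(model, lambda x: (0.5, 0.5))

if __name__ == "__main__":
    print(values)
```

With T = 3 and both players dialing with probability .5, the recursion gives -0.5, -0.75, and then -0.875 for the "not connected" state, that is, an expected 0.875 failed stages.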

Network Connection Recovery Example

Set Up

[Figure: nodes A and B joined by a direct link carrying n connections, with two alternative links of capacities m and n-m]

• Consider a network in which n connections are served by a direct link between two nodes A and B.
• Suppose two alternative links exist with capacities m and (n-m), respectively.
• Note that all n connections can still be accommodated, but how should they re-route themselves?

Dynamic Recovery

• The initial state of this process is < m, n-m >.
• Rules:
  – If k < m connections select the top link, then those k connections are satisfied, but the other link is overwhelmed.
    • The system transitions to < m-k, n-m >, with n-k connections left to be satisfied.
    • We are left with a network routing problem similar to the original one but involving the same or fewer unsatisfied connections.
  – If k = m connections select the top link, then all n connections are satisfied.
    • The system transitions to [...]
  – If k > m connections select the top link, then the top link is overwhelmed, but the n-k other connections are satisfied.
    • The system transitions to < m, k-m >, with k connections left to be satisfied.
    • We are left with a network routing problem similar to the original one but involving the same or fewer unsatisfied connections.
• All players randomize their selection of links.
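The transition rules can be written as a small function. In the Python sketch below, a state is the pair of remaining capacities (top, bottom), and the number of unsatisfied connections equals their sum. Two things are assumptions rather than statements from the slide: the terminal state is represented by `None`, and in the simulation each unsatisfied connection independently picks the top link with probability top / (top + bottom). Function names are hypothetical.

```python
import random
from typing import Optional, Tuple


def step(top: int, bottom: int, k: int) -> Optional[Tuple[int, int]]:
    """Apply one stage of the recovery rules.

    top, bottom: remaining capacities of the two alternative links.
    k: number of unsatisfied connections that select the top link.
    Returns the next state, or None once every connection is satisfied
    (None is this sketch's own encoding of the terminal state).
    """
    unsatisfied = top + bottom
    if not 0 <= k <= unsatisfied:
        raise ValueError("k must lie between 0 and the number of unsatisfied connections")
    if k < top:
        # Top link serves its k connections; the other link is overwhelmed.
        return (top - k, bottom)
    if k == top:
        # Both links receive exactly their capacity.
        return None
    # Top link is overwhelmed; the other link serves its unsatisfied - k connections.
    return (top, k - top)


def simulate(m: int, n: int, rng: random.Random, max_stages: int = 100_000) -> int:
    """Number of stages until all n >= 1 connections are re-routed.

    Assumes each connection picks the top link with probability
    top / (top + bottom); this is one possible randomization rule.
    """
    state: Optional[Tuple[int, int]] = (m, n - m)
    for stage in range(1, max_stages + 1):
        top, bottom = state
        remaining = top + bottom
        p_top = top / remaining
        k = sum(rng.random() < p_top for _ in range(remaining))
        state = step(top, bottom, k)
        if state is None:
            return stage
    raise RuntimeError("no recovery within max_stages")


if __name__ == "__main__":
    print(step(3, 2, 1), step(3, 2, 3), step(3, 2, 4))
    print(simulate(3, 5, random.Random(0)))
```

The case `step(3, 2, 5)` returns `(3, 2)`, which illustrates the "same or fewer" wording: if every connection picks the overwhelmed link, nothing is resolved in that stage.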

Remote Sensing Example

Problem Overview

• N sensing platforms
  – Identical in capability
• M targets
• Autonomous planning
• Strict collision avoidance protocol
  – Impossible for two or more platforms to investigate a single target simultaneously
• Finite time horizon, T

Simple Example

• Parameters
  – 3 UAVs
  – 5 points of interest
  – 2 sensing opportunities
• Stage 1
  – Green UAV successfully reaches its target
  – Blue and Red UAVs compete and fail to make the necessary observations

Simple Example (continued)

• Stage 2
  – Green and Blue UAVs compete for a new target
  – Red UAV is successful
• Summary
  – We end up observing only two targets!
  – (With coordination, we could have arranged for all points of interest to be revealed.)

Research Question

What decision rule should we implement within each sensing platform so that we gather as much information as possible within the available time, without requiring the platforms to coordinate their actions?
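One way to make the question concrete is to measure a naive baseline. The Python sketch below simulates a rule that is purely an assumption for illustration: at each stage every platform picks one of the still-unobserved targets uniformly at random, and, following the collision avoidance protocol, a target is observed only if exactly one platform selects it. Function names and parameters are hypothetical; any candidate decision rule could be compared against the average this baseline produces.

```python
import random
from collections import Counter


def run_mission(num_platforms: int, num_targets: int, horizon: int,
                rng: random.Random) -> int:
    """Simulate one mission and return the number of targets observed."""
    unobserved = set(range(num_targets))
    for _ in range(horizon):
        if not unobserved:
            break
        candidates = sorted(unobserved)
        # Each platform chooses independently, with no communication.
        choices = [rng.choice(candidates) for _ in range(num_platforms)]
        for target, count in Counter(choices).items():
            if count == 1:  # platforms that collide observe nothing
                unobserved.discard(target)
    return num_targets - len(unobserved)


def average_observed(num_platforms: int, num_targets: int, horizon: int,
                     trials: int, seed: int = 0) -> float:
    """Average number of targets observed over repeated missions."""
    rng = random.Random(seed)
    total = sum(run_mission(num_platforms, num_targets, horizon, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Parameters of the simple example: 3 UAVs, 5 points of interest, 2 stages.
    print(average_observed(3, 5, 2, trials=10_000))
```

With coordination, the three platforms in the simple example could observe all five points in two stages, so the gap between that and the simulated average indicates the cost of acting without coordination under this particular rule.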

SYS 793 Topics of Interest

Topics of Interest

• Game Theory
  – solution concepts for rational decision-making
  – alternatives to Nash and correlated equilibria for non-cooperative games
  – equilibrium selection
• Learning algorithms for games
  – including fictitious play
  – reinforcement learning
• Multi-Agent Frameworks/Applications
  – Robotic Cops, Robo-anything
  – Team theory
• Philosophy of Coordination
