33
Introduction to Collectives Kagan Tumer NASA Ames Research Center [email protected] http://ic.arc.nasa.gov/~kagan http://ic.arc.nasa.gov/projects/COIN/ index.html (Joint work with David Wolpert)

Introduction to Collectives

  • Upload
    kylia

  • View
    36

  • Download
    0

Embed Size (px)

DESCRIPTION

Introduction to Collectives. Kagan Tumer NASA Ames Research Center [email protected] http://ic.arc.nasa.gov/~kagan http://ic.arc.nasa.gov/projects/COIN/index.html (Joint work with David Wolpert). Outline. Introduction to collectives Definition / Motivation - PowerPoint PPT Presentation

Citation preview

Page 1: Introduction to Collectives

Introduction to Collectives

Kagan Tumer

NASA Ames Research Center

[email protected]

http://ic.arc.nasa.gov/~kagan

http://ic.arc.nasa.gov/projects/COIN/index.html

(Joint work with David Wolpert)

Page 2: Introduction to Collectives

CDCS 2002 K. Tumer 2

Ames Research Center

Outline

• Introduction to collectives– Definition / Motivation– A naturally occurring example

• Illustration of theory of collectives I– Central equation of collectives

• Interlude 1:– Autonomous defects problem (Johnson and Challet)

• Illustration of theory of collectives II– Aristocrat utility– Wonderful life utility

• Interlude 2:– El Farol bar problem: System equilibria and global optima– Collective of rovers: Scientific return maximization

• Final thoughts

Page 3: Introduction to Collectives

CDCS 2002 K. Tumer 3

Ames Research CenterMotivation

• Most complex systems, not only can be, but need to be viewed as collectives. Examples include:– Control of a constellation of communication satellites– Routing data/vehicles over a communication network/highway– Dynamic data migration over large distributed databases– Dynamic job scheduling across a (very) large computer grid– Coordination of rovers/submersibles on Mars/Europa– Control of the elements of an amorphous computer/telescope– Construction of parallel algorithms for optimization problems– Autonomous defects Problem

Page 4: Introduction to Collectives

CDCS 2002 K. Tumer 4

Ames Research Center

Collectives

• A Collective is– A (perhaps massive) set of agents;– All of which have “personal” utilities they are trying to achieve;– Together with a world utility function measuring the full

system’s performance.

• Given that the agents are good at optimizing their personal utilities, the crucial problem is an inverse problem:

How should one set (and potentially update) the personal utility functions of the agents so that they “cooperate unintentionally” and optimize the world utility?

Page 5: Introduction to Collectives

CDCS 2002 K. Tumer 5

Ames Research Center

Natural Example: Human Economy

• World utility is GDP– Agents are the individual humans– Agents try to maximize their own “personal” utilities

• Design problem is:– How to modify personal utilities of the agents through

incentives or regulations (e.g., tax breaks, SEC regulations against insider trading, antitrust laws) to achieve high GDP?

– Note: A. Greenspan does not tell each individual what to do.

• Economics hamstrung by “pre-set agents” – No such restrictions for an artificial collective

Page 6: Introduction to Collectives

CDCS 2002 K. Tumer 6

Ames Research Center

Outline• Introduction to Collectives

– Definition / Motivation– A naturally occurring example

• Illustration of Theory of Collectives IIllustration of Theory of Collectives I– Central Equation of CollectivesCentral Equation of Collectives

• Interlude 1:– Autonomous defects problem (Johnson and Challet)

• Illustration of theory of collectives II– Aristocrat utility – Wonderful life utility

• Interlude 2:– El Farol bar problem: System equilibria and global optima– Collective of rovers: Scientific return maximization

• Final thoughts

Page 7: Introduction to Collectives

CDCS 2002 K. Tumer 7

Ames Research Center

Nomenclature

an agentstate of all agents across all time t : state of agent at time t ^t : state of all agents other than at time t

tn

1,t0

^4,t0

4

Page 8: Introduction to Collectives

CDCS 2002 K. Tumer 8

Ames Research CenterKey Concepts for Collectives

• Intelligence: Percentage of states that would have resulted in agent having a worse utility (e.g., SAT-like percentile concept).

• Learnability: Signal-to-noise measure. Quantifies how sensitive an agent’s personal utility function is to a change in its state.

• Factoredness: Degree to which an agent’s personal utility is aligned with the world utility (e.g., quantifies “if you get rich, world benefits” concept).

Page 9: Introduction to Collectives

CDCS 2002 K. Tumer 9

Ames Research Center

• Our ability to control system consists of setting some parameters s (e.g, agents' goals):

Central Equation of Collectives

P(G |s) = dr ε G∫ P(G |

r ε G,s) d

r ε gP(

r ε G |

r ε g,s)P(

r ε g |s)∫

Learnability Factoredness Explore vs. Exploit

Operations Research Economics Machine Learning

– G and g are intelligences for the agents w.r.t the world utility (G) and their personal utilities (g) , respectively

Page 10: Introduction to Collectives

CDCS 2002 K. Tumer 10

Ames Research Center

Outline• Introduction to Collectives

– Definition / Motivation– A naturally occurring example

• Illustration of Theory of Collectives I– Central Equation of Collectives

• Interlude 1:Interlude 1:– Autonomous defects problem (Johnson and Autonomous defects problem (Johnson and

Challet)Challet)• Illustration of Theory of Collectives II

– Aristocrat utility – Wonderful life utility

• Interlude 2:– El Farol bar problem: System equilibria and global optima– Collective of rovers: Scientific return maximization

• Final thoughts

Page 11: Introduction to Collectives

CDCS 2002 K. Tumer 11

Ames Research Center

Autonomous Defects Problem

• Given a collection of faulty devices, how to choose the subset of those devices that, when combined with each other, gives optimal performance (Johnson & Challet).

G(ζ ) =n j a j

j =1

N

nk

k =1

N

∑ nk: action of agent k (nk = 0 ; 1)

aj distortion of component j

• Collective approach: Identify each agent with a component.• Question: what utility should each agent try to maximize?

Page 12: Introduction to Collectives

CDCS 2002 K. Tumer 12

Ames Research Center

Autonomous Defects Problem (N=100)

Page 13: Introduction to Collectives

CDCS 2002 K. Tumer 13

Ames Research Center

Autonomous Defects Problem (N=1000)

Page 14: Introduction to Collectives

CDCS 2002 K. Tumer 14

Ames Research Center

Autonomous Defects Problem: Scaling

Page 15: Introduction to Collectives

CDCS 2002 K. Tumer 15

Ames Research Center

Outline• Introduction to Collectives

– Definition / Motivation– A naturally occurring example

• Illustration of Theory of Collectives I– Central Equation of Collectives

• Interlude 1:– Autonomous defects problem (Johnson and Challet)

• Illustration of Theory of Collectives IIIllustration of Theory of Collectives II– Aristocrat utility Aristocrat utility – Wonderful life utilityWonderful life utility

• Interlude 2:– El Farol bar problem: System equilibria and global optima– Collective of rovers: Scientific return maximization

• Final thoughts

Page 16: Introduction to Collectives

CDCS 2002 K. Tumer 16

Ames Research Center

• Recall central equation:

Personal Utility

P(G |s) = dr ε G∫ P(G |

r ε G,s) d

r ε gP(

r ε G |

r ε g,s)P(

r ε g |s)∫

Learnability Factoredness

• Solve for personal utility g that maximizes learnability, while constrained to the set of factored utilities

Page 17: Introduction to Collectives

CDCS 2002 K. Tumer 17

Ames Research CenterAristocrat Utility

• One can solve for factored U with maximal learnability i.e., a U with good term 2 and 3 in central equation:

• Intuitively, AU reflects the difference between the actual G and the average G (averaged over all actions you could take).

• For simplicity, when evaluating AU here, we make the following approximation:

AUη (ζ ) ≡ G(ζ ) − E[G(ζ ) | ζ ^η ]

= G(ζ ) − pi.G(ζ

^η,CL

η

r s i )

i∑

1

Number of possible actions for pi() =

Page 18: Introduction to Collectives

CDCS 2002 K. Tumer 18

Ames Research Center

• Clamping parameter CLv: replace ’s state (taken

to be unary vector) with constant vector v• Clamping creates a new “virtual” worldline• In general v need not be a “legal” state for • Example: four agents, three actions. Agent 2 clamps

to “average action” vector a = (.33 .33 .33):

Clamping

0 0 0 1 1 1 3 0 9 0 0 0

Page 19: Introduction to Collectives

CDCS 2002 K. Tumer 19

Ames Research CenterWonderful Life Utility

• The Wonderful Life Utility (WLU) for is given by:

– Clamping to “null” action (v = 0) removes player from system (hence the name).

– Clamping to “average” action disturbs overall system minimally (can be viewed as approximation to AU).

– Theorem: WLU is factored regardless of v– Intuitively, WLU measures the impact of agent on the world

• Difference between world as it is, and world without • Difference between world as it is, and world where takes average

action

– WLU is “virtual” operation. System is not re-evolved.

WLUη (ζ ) ≡ G(ζ ) − G(ζ ^η ,CLη

r v )

Page 20: Introduction to Collectives

CDCS 2002 K. Tumer 20

Ames Research Center

Outline• Introduction to Collectives

– Definition / Motivation– A naturally occurring example

• Illustration of Theory of Collectives I– Central Equation of Collectives

• Interlude 1:– Autonomous defects problem (Johnson and Challet)

• Illustration of Theory of Collectives II– Aristocrat utility – Wonderful life utility

• Interlude 2:Interlude 2:– El Farol bar problem: System equilibria and global El Farol bar problem: System equilibria and global

optimaoptima– Collective of rovers: Scientific return maximization

• Final thoughts

Page 21: Introduction to Collectives

CDCS 2002 K. Tumer 21

Ames Research CenterEl Farol Bar Problem

• Congestion game: A game where agents share the same action space, and world utility is a function purely of how many agents take each action.

• Illustrative Example: Arthur’s El Farol bar problem:– At each time step, each agent decides whether to attend a bar:

• If agent attends and bar is below capacity, agent gets reward

• If agent stays home and bar is above capacity, agent gets reward

– Problem is particularly interesting because rational agents cannot all correctly predict attendance:

• If most agents predict attendance will be low and therefore attend, attendance will be high

• If most agents predict high attendance and therefore do not attend …

Page 22: Introduction to Collectives

CDCS 2002 K. Tumer 22

Ames Research Center

Modified El Farol Bar Problem

• Each week agents select one of seven nights to attend a bar

G(ζ ) = xk (ζ t )e− xk (ζ t )

c

k =1

7

∑t

Reward for night k at week t

Rt : Reward for week t

Attendance for night k at week t

Capacity of bar

• Further modifications:– Each week each agent selects two nights to attend bar.– ...– Each week each agent selects six nights to attend bar.

Page 23: Introduction to Collectives

CDCS 2002 K. Tumer 23

Ames Research CenterPersonal Utility Functions

• Two conventional utilities:– Uniform Division (UD): Divide each night’s total reward among

all agents that attended that night (the “natural” reward)

– Team Game (TG): Total world reward at time t (Rt)

• Three collective-based utilities:– WL 0 : WL utility with clamping parameter set to vector of 0s

(world utility minus “world utility without me”)

– WL 1 : WL utility with clamping parameter set to vector of 1s (world utility minus “world utility where I attend every night”)

– WL a : WL utility with clamping parameter set to vector of average action (world utility minus “world utility where I do what is “expected of me”)

Page 24: Introduction to Collectives

CDCS 2002 K. Tumer 24

Ames Research Center

Bar Problem: Utility Comparison

(Attend one night, 60 agents, c=3)

Page 25: Introduction to Collectives

CDCS 2002 K. Tumer 25

Ames Research CenterTypical Daily Bar Attendance

0

20

40

60

80

100

120

140

Daily Attendance

WLU TG UD

Days of week

(c=6; t=1000 s ; Number of agents = 168)

Page 26: Introduction to Collectives

CDCS 2002 K. Tumer 26

Ames Research Center

Scaling Properties (attend one night)

c=2,3,4,6,8,10,15, respectively

Page 27: Introduction to Collectives

CDCS 2002 K. Tumer 27

Ames Research Center

Performance vs. # of Nights to Attend

60 agents; c= 3,6,8,10,10,12,15 respectively

Page 28: Introduction to Collectives

CDCS 2002 K. Tumer 28

Ames Research Center

Collectives of Rovers

• Design a collective of autonomous agents to gather scientific information (e.g., rovers on Mars, submersibles under Europa)

– Some areas have more valuable information than others

– World Utility: Total importance weighted information collected

– Both the individual rovers and the collective need to be flexible so they can adapt to new circumstances

– Collective-based payoff utilities result in better performance than more “natural” approaches

Page 29: Introduction to Collectives

CDCS 2002 K. Tumer 29

Ames Research CenterWorld Utility

• Token value function:

– L : Location Matrix for all agents– L : Location Matrix agent – Lt

a: Location Matrix of agent at time t, had it taken action a at t-1

– : Initial token configuration

V (L,Θ) = Θx ,yx ,y∑ min(1,Lx ,y )

G(ζ ) = V (L ,Θ)

• World Utility :

• Note: Agents’ payoff utilities reduce to figuring out what “L” to use.

Page 30: Introduction to Collectives

CDCS 2002 K. Tumer 30

Ames Research CenterPayoff Utilities

WLUη

r 0 (ζ ) = G(ζ ) − V (L^η ,Θ)

SUη (ζ ) = V (Lη ,Θ)

AUη (ζ ) = G(ζ ) − p r a V (L^η + Lη

r a ,Θ)

r a ∈

r A η

∑• Collectives-Based Utility (theoretical):

• Selfish Utility :

TGη (ζ ) = V (L,Θ)• Team Game Utility :

• Collectives-Based Utility (practical):

WLUη

r a (ζ ) = G(ζ ) − V (L^η + p r

a Lη

r a

r a ∈

r A η

∑ ,Θ)

Page 31: Introduction to Collectives

CDCS 2002 K. Tumer 31

Ames Research CenterUtility Comparison in Rover Domain

100 rovers on a 32x32 grid

Page 32: Introduction to Collectives

CDCS 2002 K. Tumer 32

Ames Research CenterScaling Properties in Rover Domain

Page 33: Introduction to Collectives

CDCS 2002 K. Tumer 33

Ames Research Center

Summary

• Given a world utility, deploying RL algorithms provides a solution to the distributed design problem. But what utilities does one use?

• Theory of collectives shows how to configure and/or update the personal utilities of the agents so that they “unintentionally cooperate” to optimize the world utility

• Personal utilities based on collectives successfully applied to many domains (e.g., autonomous rovers, constellations of communication satellites, data routing, autonomous defects)

• Performance gains due to using collectives-based utilities increase with size of problem

• A fully fleshed science of collectives would benefit from and have applications to many other sciences