Connections between Learning Theory, Game Theory, and Optimization

Maria Florina (Nina) Balcan

Lecture 14, October 7th 2010


Improved Equilibria via Public Service Advertising

Good equilibria, Bad equilibria

Many games have both bad and good equilibria.

• In some places, everyone drives their own car. In some, everybody uses and pays for good public transit.

Fair cost-sharing

Fair cost-sharing: n players in a weighted directed graph G. Player i wants to get from s_i to t_i, and players share the cost of the edges they use with others.

• n players in directed graph G, each edge e costs c_e.

• Player i wants to get from s_i to t_i.

• All players share cost of edges they use with others.

• Each player wants to minimize his own cost.

[Figure: two parallel edges from s to t, one of cost 1 and one of cost n.]

Good equilibrium: all use the edge of cost 1 (paying 1/n each).

Bad equilibrium: all use the edge of cost n (paying 1 each).
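
The two-edge example can be checked by brute force. The sketch below is illustrative only: it assumes n = 4 and the hypothetical helper names `player_costs`, `is_pure_nash` and `social_cost`, and it treats ties as stable (a player moves only for a strict improvement).

```python
from itertools import product

def player_costs(profile, edge_costs):
    """Fair cost sharing: each user of edge e pays c_e / (number of users of e)."""
    load = {e: profile.count(e) for e in set(profile)}
    return [edge_costs[e] / load[e] for e in profile]

def is_pure_nash(profile, edge_costs):
    """True if no single player can strictly lower its cost by switching edges."""
    costs = player_costs(profile, edge_costs)
    for i in range(len(profile)):
        for e in range(len(edge_costs)):
            if e == profile[i]:
                continue
            deviated = profile[:i] + (e,) + profile[i + 1:]
            if player_costs(deviated, edge_costs)[i] < costs[i] - 1e-12:
                return False
    return True

def social_cost(profile, edge_costs):
    """Total cost of the edges that are actually used."""
    return sum(edge_costs[e] for e in set(profile))

n = 4                                # assumed number of players
edge_costs = [1.0, float(n)]         # edge 0 costs 1, edge 1 costs n
equilibria = [p for p in product(range(2), repeat=n)
              if is_pure_nash(p, edge_costs)]
for p in equilibria:
    print(p, social_cost(p, edge_costs))
```

Only the two uniform profiles survive the check: everyone on the cheap edge and everyone on the expensive edge. In the second one a deviating player would pay exactly 1 alone on the cheap edge, so it has no strict incentive to move.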

Inefficiency of equilibria, PoA and PoS

Price of Stability (PoS): ratio of best Nash equilibrium to OPT.

Price of Anarchy (PoA): ratio of worst Nash equilibrium to OPT.

Significant effort spent on understanding these in CS.

[Koutsoupias-Papadimitriou’99]

[Anshelevich et al., 2004]

E.g., for fair cost-sharing, PoS is log(n), whereas PoA is n.

“Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani

Fair Cost Sharing

• n players in directed graph G, each edge e costs c_e.

• Player i wants to get from s_i to t_i, and minimize its cost.

• All players share cost of edges they use with others.

PoA is n; PoS is log(n).

PoA is O(n): in any Nash equilibrium, no player pays more than OPT (otherwise that player could switch to its path in OPT and pay at most OPT), so the total is at most n·OPT.

PoA is Ω(n): [Figure: two parallel edges from s to t, one of cost 1 and one of cost n.] All players on the cost-n edge is a Nash equilibrium of cost n, while OPT = 1.

PoS is Ω(log(n)):

[Figure: players 1, ..., n with sources s_1, ..., s_n and a common sink t. Player i has a private edge from s_i to t of cost 1/i (costs 1, 1/2, ..., 1/(n-1), 1/n), and a zero-cost edge from s_i to a shared node that is joined to t by an edge of cost 1+ε.]
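
A small enumeration makes the lower bound concrete. The sketch assumes the usual reading of the figure: player i either uses a private edge of cost 1/i or joins a shared edge of cost 1+ε that is split equally among its users. The values n = 6 and ε = 0.01 and all function names are illustrative choices, not taken from the slides.

```python
from itertools import product

def cost_of(i, profile, eps):
    """Cost of player i (0-based); profile[i] is 'own' or 'shared'."""
    if profile[i] == "own":
        return 1.0 / (i + 1)
    return (1.0 + eps) / profile.count("shared")

def is_nash(profile, eps):
    """True if no player strictly gains by switching between its two options."""
    for i in range(len(profile)):
        other = "shared" if profile[i] == "own" else "own"
        deviated = profile[:i] + (other,) + profile[i + 1:]
        if cost_of(i, deviated, eps) < cost_of(i, profile, eps) - 1e-12:
            return False
    return True

def social_cost(profile, eps):
    """Sum of the costs of all edges in use."""
    own = sum(1.0 / (i + 1) for i, a in enumerate(profile) if a == "own")
    return own + ((1.0 + eps) if "shared" in profile else 0.0)

n, eps = 6, 0.01                      # assumed example parameters
profiles = list(product(("own", "shared"), repeat=n))
equilibria = [p for p in profiles if is_nash(p, eps)]
opt = min(social_cost(p, eps) for p in profiles)
harmonic = sum(1.0 / k for k in range(1, n + 1))
print(equilibria, social_cost(equilibria[0], eps), opt)
```

The only equilibrium found is the one where every player uses its private edge, with cost H(n) = 1 + 1/2 + ... + 1/n, while the optimum (everyone sharing) costs 1+ε. The reason is that in any profile with sharers, the highest-indexed sharer j pays at least (1+ε)/j, which is more than its private cost 1/j.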


PoS is O(log(n)): potential function argument

Potential function: Φ(S) = Σ_e Σ_{k=1}^{n_e(S)} c_e/k, where n_e(S) is the number of players using edge e in state S.

Social cost of S is cost(S) = Σ_{e : n_e(S) ≥ 1} c_e, so cost(S) ≤ Φ(S) ≤ H(n)·cost(S).

When a player moves, the change in that player’s cost equals the change in the potential.

Proof: player i moves, taking the state from S to S’; let A be the edges on its path in S but not in S’, and B the edges on its path in S’ but not in S. Write n_e for the number of players on edge e in S.

Its change in cost: Σ_{e∈B} c_e/(n_e+1) − Σ_{e∈A} c_e/n_e.

Change in the potential: the same quantity, since each edge in B gains the term c_e/(n_e+1) and each edge in A loses the term c_e/n_e.
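
The exact-potential property can also be spot-checked numerically. This is a hedged sketch: it uses the standard potential Φ(S) = Σ_e c_e·(1 + 1/2 + ... + 1/n_e) for fair cost sharing, and as a simplifying assumption it treats a player's "path" as an arbitrary non-empty set of edges rather than a real s_i-t_i path. All names are hypothetical.

```python
import random
from math import isclose

def loads(state):
    """Number of players on each edge; state is a list of edge sets, one per player."""
    counts = {}
    for path in state:
        for e in path:
            counts[e] = counts.get(e, 0) + 1
    return counts

def player_cost(i, state, c):
    """Player i pays an equal share of every edge on its path."""
    n_e = loads(state)
    return sum(c[e] / n_e[e] for e in state[i])

def potential(state, c):
    """Phi(S) = sum over edges of c_e * (1 + 1/2 + ... + 1/n_e)."""
    return sum(c[e] * sum(1.0 / k for k in range(1, m + 1))
               for e, m in loads(state).items())

def check_exact_potential(trials=200, seed=0):
    """Random unilateral moves: change in mover's cost must equal change in Phi."""
    rng = random.Random(seed)
    edges = list("abcdef")
    for _ in range(trials):
        c = {e: rng.uniform(0.1, 5.0) for e in edges}
        n = rng.randint(2, 5)
        state = [frozenset(rng.sample(edges, rng.randint(1, 3))) for _ in range(n)]
        i = rng.randrange(n)
        new_path = frozenset(rng.sample(edges, rng.randint(1, 3)))
        moved = state[:i] + [new_path] + state[i + 1:]
        d_cost = player_cost(i, moved, c) - player_cost(i, state, c)
        d_phi = potential(moved, c) - potential(state, c)
        if not isclose(d_cost, d_phi, abs_tol=1e-9):
            return False
    return True

print(check_exact_potential())
```

The check passes for every sampled move, including moves that make the mover worse off, which is what "exact" potential means: the equality holds for all unilateral deviations, not only improving ones.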


• Iterate best-response dynamics starting from an optimal solution [i.e., while there is a player that can improve, pick an arbitrary such player and let him play a best response].

• Potential always decreases, finite # of states, so reach a pure Nash.

• The potential does not increase, so we reach a pure Nash of cost ≤ H(n)·OPT.
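
Written out as one chain, the argument reads as follows. This is a sketch using notation assumed here: \Phi is the potential with \mathrm{cost}(S) \le \Phi(S) \le H(n)\,\mathrm{cost}(S), S_{\mathrm{OPT}} is the optimal starting state, and S_{\mathrm{NE}} is the pure Nash equilibrium where best-response dynamics stops.

```latex
\mathrm{cost}(S_{\mathrm{NE}})
  \;\le\; \Phi(S_{\mathrm{NE}})
  \;\le\; \Phi(S_{\mathrm{OPT}})
  \;\le\; H(n)\cdot \mathrm{cost}(S_{\mathrm{OPT}})
  \;=\; H(n)\cdot \mathrm{OPT}
```

The first and third inequalities are the sandwich bound on \Phi; the middle one holds because every best-response step lowers \Phi.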

Congestion games more generally

Game defined by n players and m resources.

• Cost of a resource j is a function f_j(n_j) of the number n_j of players using it.

• Each player i chooses a set of resources (e.g., a path) from collection S_i of allowable sets of resources (e.g., paths from s_i to t_i).

• Cost incurred by player i is the sum, over all resources being used, of the cost of the resource.

• Generic potential function: Φ = Σ_j Σ_{k=1}^{n_j} f_j(k).

Best-response dynamics always gives an equilibrium.
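
Best-response dynamics for a general congestion game can be sketched in a few lines. The code below is an illustration under stated assumptions: it uses Rosenthal's potential Φ = Σ_j Σ_{k=1}^{n_j} f_j(k), a toy instance with three players and two resources with cost f_j(x) = x, and hypothetical function names.

```python
def loads(state, m):
    """n_j for each of the m resources; state[i] is the resource set of player i."""
    counts = [0] * m
    for chosen in state:
        for j in chosen:
            counts[j] += 1
    return counts

def player_cost(i, state, f, m):
    """Sum of f_j(n_j) over the resources player i uses."""
    n_j = loads(state, m)
    return sum(f[j](n_j[j]) for j in state[i])

def rosenthal_potential(state, f, m):
    """Phi = sum_j sum_{k=1..n_j} f_j(k)."""
    n_j = loads(state, m)
    return sum(f[j](k) for j in range(m) for k in range(1, n_j[j] + 1))

def best_response_dynamics(state, allowed, f, m, max_steps=10_000):
    """Let players switch to a best response until nobody can strictly improve."""
    state = list(state)
    trace = [rosenthal_potential(state, f, m)]
    for _ in range(max_steps):
        improved = False
        for i in range(len(state)):
            def cost_if(choice, i=i):
                return player_cost(i, state[:i] + [choice] + state[i + 1:], f, m)
            best = min(allowed[i], key=cost_if)
            if cost_if(best) < cost_if(state[i]) - 1e-12:
                state[i] = best
                trace.append(rosenthal_potential(state, f, m))
                improved = True
        if not improved:
            return state, trace
    raise RuntimeError("did not converge within max_steps")

# Toy instance (assumed): 3 players, 2 resources, each costing its current load.
f = [lambda x: x, lambda x: x]
allowed = [[frozenset({0}), frozenset({1})]] * 3
final, trace = best_response_dynamics([frozenset({0})] * 3, allowed, f, 2)
print(final, trace)
```

Starting with all three players on resource 0, one player moves to resource 1 and the process stops; the recorded potential values strictly decrease along the way.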

• Nice general class of games with many players.

• Always have a pure-strategy equilibrium.

• Have a potential function s.t. whenever a player switches, potential drops by exactly that player’s improvement.

- Best-response dynamics always gives an equilibrium.

• But maybe a large gap between the quality of the best and the worst equilibrium.

• Lots of work on understanding properties of these games and quality of their equilibria.


Guiding from Bad to Good

Standard motivation for PoS: If a central authority could suggest a low-cost Nash equilibrium (ride public transit), and everyone followed the suggestion, then this would be stable.

Can a helpful authority encourage (guide) behavior to move from a bad state to a good state?

What if only some fraction α will pay attention?

• Can the authority guide behavior to a good state?

• Will it just snap back? How does this depend on α?

[Balcan-Blum-Mansour, SODA 2009]

Main Model

0. n players initially playing some arbitrary equilibrium.

1. Authority launches advertising, proposing joint action s_ad. Each player i follows with probability α. Call players that follow receptive players.

2. Remaining (non-receptive) players fall to some arbitrary equilibrium for themselves, given play of receptive players.

3. All players follow best-response dynamics to an overall Nash equilibrium.

Notes: potential games, pure Nash eqs.; social cost: sum of the players’ costs.

[Figure: sources s_1, ..., s_n and a common sink t. Each player has a private edge to t of cost 1, and a zero-cost edge to a shared node that is joined to t by an edge of cost k.]
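
The phases can be simulated on the example with n private edges of cost 1 and one shared edge of cost k. This is a rough sketch under assumptions that are not in the slides: the initial equilibrium is "everyone on a private edge" (stable when k ≥ 1), the advertised action is "everyone shares", players move only for a strict improvement, and the names `run_model` and `simulate` are hypothetical.

```python
import random

def run_model(n, k, receptive):
    """Run phases 1-3 for a given set of receptive players; return final social cost."""
    # Phase 1: receptive players follow the advice and use the shared edge.
    on_shared = set(receptive)

    # Phase 2: non-receptive players settle into an equilibrium among themselves.
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i not in receptive and i not in on_shared:
                if k / (len(on_shared) + 1) < 1.0:   # sharing beats private cost 1
                    on_shared.add(i)
                    changed = True

    # Phase 3: everybody, receptive or not, follows best-response dynamics.
    changed = True
    while changed:
        changed = False
        for i in range(n):
            m = len(on_shared)
            if i in on_shared and k / m > 1.0:
                on_shared.remove(i)                  # private edge is cheaper
                changed = True
            elif i not in on_shared and k / (m + 1) < 1.0:
                on_shared.add(i)
                changed = True

    shared_cost = k if on_shared else 0.0
    return shared_cost + (n - len(on_shared))

def simulate(n, k, alpha, seed=0):
    """Draw the receptive set at random: each player follows with probability alpha."""
    rng = random.Random(seed)
    receptive = {i for i in range(n) if rng.random() < alpha}
    return run_model(n, k, receptive)

print(run_model(10, 5.0, set(range(6))), run_model(10, 5.0, {0, 1}))
```

With k = 5 and six receptive players, the shared edge is cheap enough that everyone else joins and the final cost is k. With only two receptive players, nobody joins in phase 2 and the receptive players drift back to their private edges in phase 3, so the system snaps back to cost n.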

Main Results

Fair cost sharing (PoS = log(n), PoA = n):

• If only a constant fraction α of the players follow the advice, then we can still get within O(1/α) of the PoS.

• Extends to cost-sharing + linear delays.

Party affiliation games (PoS = 1, PoA = Θ(n²)):

• Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).

Fair Cost Sharing

If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS.

(PoS = log(n), PoA = n)

Note: this is best you can hope for. E.g., k = 2n.

[Figure: sources s_1, ..., s_n and a common sink t. Each player has a private edge to t of cost 1, and a zero-cost edge to a shared node that is joined to t by an edge of cost k.]

Advertiser proposes OPT (any apx also works).

Phase 1: the numbers of receptive players on each edge are random vars.

Cost of non-receptive players at the end of Phase 2

- In any NE, a non-receptive player i can’t improve by switching to his path P_i^OPT in OPT.

- Moreover, this option is guaranteed to be at least as good as if other NR players didn’t exist.

- Calculate total cost of these guaranteed options. Rearrange sum...


Use: X ~ Bi(n,p)
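
The formula that accompanied this line did not survive in the transcript. One standard fact about binomials that fits the cost-sharing calculation (a share c_e/(1+X) when X other players are on the edge) is E[1/(1+X)] = (1 − (1−p)^{n+1}) / ((n+1)p) ≤ 1/((n+1)p). Treat its use here as an assumption; the sketch below only verifies the identity itself, with hypothetical function names.

```python
from math import comb, isclose

def expected_inverse_one_plus_binomial(n, p):
    """Exact E[1/(1+X)] for X ~ Bi(n, p), summing over the probability mass function."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) / (1 + x)
               for x in range(n + 1))

def closed_form(n, p):
    """(1 - (1-p)^(n+1)) / ((n+1) p), the closed form of the same expectation."""
    return (1 - (1 - p)**(n + 1)) / ((n + 1) * p)

for n, p in [(1, 0.5), (5, 0.3), (20, 0.1), (50, 0.75)]:
    exact = expected_inverse_one_plus_binomial(n, p)
    assert isclose(exact, closed_form(n, p), rel_tol=1e-9)
    assert exact <= 1 / ((n + 1) * p)
    print(n, p, exact)
```

Under this reading, an edge that n_e players use in OPT costs a non-receptive player about c_e/(α·n_e) in expectation, and summing over players and edges gives a total of order OPT/α.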

Cost of receptive players at the end of Phase 2

Expected total cost at the end of Phase 2: O(OPT/α). In Phase 3, potential argument shows behavior cannot get worse by more than an additional log(n) factor.

Cost Sharing, Extension

+ linear delays:

- Still get same guarantee, but proof is trickier

Problem: can’t argue as if remaining NR players didn’t exist since they add to delays

- Shadow game wrt non-receptive players: pure linear latency fns. Offset defined by equilib. at end of phase 2 (# users on e at end of phase 2).

- This game has good PoA (5/2).

- Behavior of NR at end of phase 2 is equilib for this game too.

- Show: cost of the non-receptive players at the end of step 2 is O(OPT/α).

Need to still argue about the cost of the receptive players.

Edge by edge charging:

- more receptive players, lose a factor of two compared to OPT

- more non-receptive players, already paid for, lose a factor of two

Party affiliation games

• Given graph G, each edge labeled + or -.

• Vertices have two actions: RED or BLUE. Pay 1 for each + edge with endpoints of different color, and each - edge with endpoints of same color.

• Special cases:

• All + edges is consensus game.

• All - edges is cut-game.

[Figure: example graph with edges labeled + and -.]

OPT is an equilibrium so PoS = 1.

But even for consensus games, PoA = Ω(n²).

[Figure: clique with perfect matching removed, all edges labeled plus.]
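
The bad equilibrium in this consensus example can be verified directly. The sketch assumes one concrete layout of the figure: nodes 0..n-1 with n even, the removed matching pairs node v with node v + n/2, the first half is colored 0 (red) and the second half 1 (blue), and social cost is taken as the sum of the players' costs. Function names are hypothetical.

```python
def consensus_example(n):
    """Neighbors in the clique on n nodes minus the perfect matching {v, v + n/2}."""
    half = n // 2
    return {v: [u for u in range(n) if u != v and u != (v + half) % n]
            for v in range(n)}

def node_cost(v, color, nbrs):
    """All edges are '+': pay 1 for each neighbor of a different color."""
    return sum(1 for u in nbrs[v] if color[u] != color[v])

def is_nash(color, nbrs):
    """True if no node strictly lowers its cost by flipping its color."""
    for v in nbrs:
        flipped = dict(color)
        flipped[v] = 1 - color[v]
        if node_cost(v, flipped, nbrs) < node_cost(v, color, nbrs):
            return False
    return True

def social_cost(color, nbrs):
    """Sum of the players' costs (each disagreeing edge is paid by both endpoints)."""
    return sum(node_cost(v, color, nbrs) for v in nbrs)

n = 8                                            # assumed size, must be even
nbrs = consensus_example(n)
split = {v: 0 if v < n // 2 else 1 for v in range(n)}
all_red = {v: 0 for v in range(n)}
print(is_nash(split, nbrs), social_cost(split, nbrs), social_cost(all_red, nbrs))
```

In the split coloring every node has exactly n/2 − 1 neighbors of each color, so flipping never helps and the state is an equilibrium of total cost n(n/2 − 1), which grows as n², while the all-red state costs 0.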

(PoS = 1, PoA = Θ(n²))

- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).

(lower bound)

- Same example as for consensus PoA, but sparser across cut: degree γn/8 across cut, γ = 1/2 - α.

• For large n, whp all nodes have at most a 1/2 - γ/2 fraction of neighbs in R.

• Initially, each node has a γ/4 fraction of neighbs of the other color.

• So, players “locked” into place.

(upper bound, consensus games)

- Advertising strategy = follow OPT, e.g. all red.

- By Hoeffding, all nodes with degree Ω(log n/(α-1/2)²) have more than half of their neighbors in the set R, with prob. 1-1/n.

- At the end of step two, all nodes are red.

Note: for general cut games, OPT might not have zero cost for each player.

(upper bound, general party affiliation games)

- Advertising strategy = follow OPT.

- Split nodes into those incurring low-cost vs those incurring high-cost under OPT.

- Show that low-cost will switch to behavior in OPT. For high-cost, don’t care.

- Cost only improves in final best-response process.

• S is β-dominating if every vertex not in S has more than a ½+β fraction of neighbs in S.

• If α > ½+2β, then set R of receptive players is β-dominating whp.

• Split nodes into those incurring low-cost (less than a β-fraction of incident edges incur a cost in OPT) vs those incurring high-cost under OPT.

• Low-cost will switch to behavior in OPT. High-cost can incur a cost of at most 1/β times their cost in OPT.

Summary

Analyze ability of a central authority to guide behavior to a good equilibrium even if only an α fraction of players are paying attention.

Influencing Dynamics

A more adaptive model [Balcan Blum Mansour, ICS 2010]

[no rigid separation between receptive vs non-receptive players]

Each player has a few abstract actions (experts), e.g.:

• Play Best Response

• Play the Advertised Behavior

Uses a learning, experts-based alg. to decide which one to use.

Open Questions

Get around problem of natural dynamics converging to poor equilibrium without central authority by giving players more information about the game?