When is it Best to Best-Reply? Michael Schapira (Yale University and UC Berkeley) Joint work with Noam Nisan (Hebrew U), Gregory Valiant (UC Berkeley) and Aviv Zohar (Hebrew U)


When is it Best to Best-Reply?

Michael Schapira (Yale University and UC Berkeley)

Joint work with Noam Nisan (Hebrew U), Gregory Valiant (UC Berkeley), and Aviv Zohar (Hebrew U)


Motivation: Internet Routing

Establish routes between Autonomous Systems (ASes).

Currently handled by the Border Gateway Protocol (BGP).

[Figure: example network of ASes: AT&T, Qwest, Comcast, Sprint.]

Internet Routing as a Game [Levin-S-Zohar]

• Internet routing is a game!
  – players = ASes
  – players’ types = preferences over routes
  – strategies = routes

• BGP = Best-Response Dynamics
  – each AS constantly selects its best available route to each destination
  – … until a “stable state” (= PNE) is reached.

But…

• Challenge I: No synchronization of players’ actions
  – players can best-reply simultaneously.
  – players can best-reply based on outdated information.
  – When is BGP guaranteed to converge to a stable state?

• Challenge II: Are players incentivized to follow best-response dynamics?
  – Can an AS gain from not executing BGP?

Agenda

• Mechanism design approach to best-response dynamics (main focus of this talk).

• Convergence of best-response dynamics in asynchronous environments [Jaggard-S-Wright] (if time permits).

Agenda

• Part I: mechanism design approach to best-response dynamics.

• Part II: on the convergence of best-response dynamics in asynchronous environments.

Incentive-Compatible Best-Response Dynamics


Main Questions

• When is myopic best-replying also good in the long run?

• When can stable outcomes be implemented in partial-information settings?

• Can we reason about partial-information settings via complete-information games?

Our Results Have Implications For

• Internet protocols
  – Internet routing (BGP), congestion control (TCP)

• Auctions
  – 1st-price auctions, unit-demand auctions, GSP

• Matching
  – correlated markets, interns and hospitals

• Cost-sharing mechanisms
  – Moulin mechanisms, …

1st Price Auction

Rows: Bob's bid (vb=2). Columns: Alice's bid (va=4). Entries: winner:utility.

Bids   0      1      2      3      4      5
0      B:2    A:3    A:2    A:1    A:0    A:-1
1      B:1    B:1    A:2    A:1    A:0    A:-1
2      B:0    B:0    B:0    A:1    A:0    A:-1
3      B:-1   B:-1   B:-1   B:-1   A:0    A:-1
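The entries of the bid table appear to follow a few simple rules, so the table can be regenerated in a few lines. This is a minimal Python sketch, assuming (as read off the entries themselves) that the higher bid wins, ties go to Bob, and the winner pays his or her own bid; the function names `outcome` and `payoff_table` are illustrative only.

```python
VA, VB = 4, 2  # valuations from the slide: Alice va=4, Bob vb=2

def outcome(alice_bid, bob_bid):
    """Return (winner, winner's utility) in a 1st-price auction.

    Assumption read off the table: ties are broken in Bob's favor.
    """
    if alice_bid > bob_bid:
        return "A", VA - alice_bid
    return "B", VB - bob_bid

def payoff_table(alice_bids=range(6), bob_bids=range(4)):
    """Rows are Bob's bids, columns are Alice's bids, cells are 'winner:utility'."""
    return [[f"{w}:{u}" for w, u in (outcome(a, b) for a in alice_bids)]
            for b in bob_bids]

# Print one row per bid of Bob, in the same layout as the slide.
for bob_bid, row in enumerate(payoff_table()):
    print(bob_bid, *row)
```

Each printed row should match the corresponding row of the slide's table, which is a quick way to check the assumed tie-breaking rule.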

Ascending-Price English Auction

[Table: same winner:utility bid table as on the "1st Price Auction" slide (Alice, va=4; Bob, vb=2).]

Best-Reply (with some tie-breaking)

[Table: same winner:utility bid table as on the "1st Price Auction" slide (Alice, va=4; Bob, vb=2).]
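To make the best-reply process in this auction concrete, here is a small Python sketch that alternates best replies between Alice and Bob. The slide only says "some tie-breaking", so the specific rules used here are assumptions: ties in the auction go to Bob (as in the table), a loser gets utility 0, Alice picks the lowest of her best replies, and Bob picks the highest of his.

```python
VA, VB = 4, 2                          # valuations from the slide
ALICE_BIDS, BOB_BIDS = range(6), range(4)

def utilities(alice_bid, bob_bid):
    """(Alice's utility, Bob's utility); ties go to Bob, the loser gets 0."""
    if alice_bid > bob_bid:
        return VA - alice_bid, 0
    return 0, VB - bob_bid

def best_reply_alice(bob_bid):
    # Assumed tie-breaking: lowest bid among Alice's best replies.
    return max(ALICE_BIDS, key=lambda a: (utilities(a, bob_bid)[0], -a))

def best_reply_bob(alice_bid):
    # Assumed tie-breaking: highest bid among Bob's best replies.
    return max(BOB_BIDS, key=lambda b: (utilities(alice_bid, b)[1], b))

def run(alice_bid=0, bob_bid=0, max_rounds=20):
    """Alternate best replies (Alice first) until neither player moves."""
    history = [(alice_bid, bob_bid)]
    for _ in range(max_rounds):
        new_alice = best_reply_alice(bob_bid)
        new_bob = best_reply_bob(new_alice)
        if (new_alice, new_bob) == (alice_bid, bob_bid):
            break
        alice_bid, bob_bid = new_alice, new_bob
        history.append((alice_bid, bob_bid))
    return history

print(run())  # sequence of (Alice's bid, Bob's bid) pairs
```

Under these assumed rules the bids climb step by step, much like an ascending-price auction, and stop once Bob is no longer willing to outbid Alice. Other tie-breaking rules may end in a different profile.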

The Model

• n players

• Player i has
  – action set $A_i$
  – (private) type $t_i \in T_i$
  – utility function $u_i = u_i(t_i, a)$, for $t_i \in T_i$ and $a \in A_1 \times \dots \times A_n$

The Model: Dynamic Interaction

• Discrete time steps. Initial action profile $a^0$.

• One player is activated in each time step
  – round-robin (cyclic) order
  – our results are independent of the order (and also hold for asynchronous environments)

• Players’ strategies specify which actions are selected in each time step.
  – can be history-dependent

• Best-response dynamics = the strategy profile in which each player constantly best-replies to others’ actions

Two Possible Payoff Models

Cumulative model
  – Payoffs are accumulated: $U_i = \limsup_{k \to \infty} \frac{1}{k} \sum_{j=1}^{k} u_i(t_i, a^j)$
  – Alternative formulation with discount factors
  – More natural, but sometimes too restrictive.

Payoff at the limit
  – If the dynamics converges to a stable outcome a*: $U_i = u_i(t_i, a^*)$
  – If no convergence, the resulting payoff is low.
  – Weaker (actively discourages oscillations), with interesting applications.

Solution Concept

• A strategy profile $\sigma$ is an ex-post Nash equilibrium if no player wishes to deviate from $\sigma$ regardless of the types:

$$\forall i,\ \forall \sigma_i',\ \forall t \in T_1 \times \dots \times T_n:\quad U_i(t, \sigma_i, \sigma_{-i}) \ge U_i(t, \sigma_i', \sigma_{-i})$$

(this is essentially the best possible in a distributed environment [Shneidman-Parkes])

Best-Replying is Not Always Best

Row Player: Type 1
2,1 0,0
3,0 1,3

Row Player: Type 2
3,1 1,0
2,0 0,3

• dominance-solvable
• potential game
• unique and Pareto optimal PNE
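One way to see why best-replying can fail to be incentive-compatible in this two-type example is to let a Type 1 row player behave as if it were Type 2. The following Python sketch compares the two limit payoffs. Rows and columns are indexed 0 and 1 in the order shown on the slide, and the helper names and the alternating activation order (row first) are assumptions made for illustration.

```python
ROW_PAYOFF = {1: [[2, 0], [3, 1]],   # row player's payoffs, Type 1
              2: [[3, 1], [2, 0]]}   # row player's payoffs, Type 2
COL_PAYOFF = [[1, 0], [0, 3]]        # column player's payoffs (same in both)

def row_best_reply(row_type, col):
    return max((0, 1), key=lambda r: ROW_PAYOFF[row_type][r][col])

def col_best_reply(row):
    return max((0, 1), key=lambda c: COL_PAYOFF[row][c])

def limit_payoff(true_type, played_type, start=(0, 0), rounds=10):
    """Row's payoff at the limit when it best-replies as if its type were
    `played_type`, while the column player best-replies honestly."""
    row, col = start
    for _ in range(rounds):
        row = row_best_reply(played_type, col)
        col = col_best_reply(row)
    return ROW_PAYOFF[true_type][row][col]

honest = limit_payoff(true_type=1, played_type=1)
deviating = limit_payoff(true_type=1, played_type=2)
print(honest, deviating)
```

In this sketch the honest Type 1 row player ends at the bottom-right cell, while imitating Type 2 steers the column player to the left column and yields a strictly higher payoff for the row player.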

When is it Good to Best-Reply?

• Goal: identify a class of games in which best-response dynamics is an ex-post Nash equilibrium.
  – i.e., best-replying is incentive-compatible
  – close in spirit to “learning equilibria” [Brafman-Tennenholtz]

• This class is going to be VERY restricted. Still… a variety of mechanisms/protocols.

• Remark: The best replies are not always unique. Thus, we must handle tie-breaking.


One Class of Games

• Lemma: If each realization of types yields a game in which each player has a single dominant strategy, then best-response dynamics is an ex-post Nash equilibrium.

On the Other Hand…

Row Player: Type 1
9,0 1,1 1,3
10,0 0,2 0,1

Row Player: Type 2
10,0 0,1 0,3
9,0 1,2 1,1

• no player has a dominant strategy (in both realizations).

• best-response dynamics is an ex-post Nash equilibrium.

• This game is blindly solvable.

Blindly-Dominated Strategy Sets

8 7 9
5 6 8
3 2 1
3 4 0

A set $T \subset S_i$ of player i's strategies is blindly dominated if

$$\max_{s_i \in T,\ s_{-i}} u_i(s_i, s_{-i}) < \min_{s_i' \in S_i \setminus T,\ s_{-i}'} u_i(s_i', s_{-i}')$$

In the payoff matrix above, the two bottom rows form such a set T: their largest entry (4) is below the smallest entry (5) of the two top rows.

Blindly-Solvable Games

• Defn: A game is blindly-solvable if iterated elimination of blindly-dominated strategy sets results in a single strategy profile.
  – Observation: the “surviving” strategy profile is the unique PNE of the game.

• Defn: A partial-information game is blindly-solvable if every realization of types yields a blindly-solvable game.
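The elimination procedure behind these definitions can be sketched in Python. The sketch assumes that a set T of a player's remaining strategies is blindly dominated when every payoff obtainable from T is strictly below every payoff obtainable from the strategies outside T (the strictness is an assumption). It searches subsets by brute force, so it only suits tiny games. The function names are hypothetical, and the example is the 2x3 two-type game with payoffs 9,0 / 1,1 / 1,3 from the earlier slide.

```python
from itertools import combinations, product

def blindly_dominated_set(player, remaining, utility):
    """Return a nonempty proper subset T of `player`'s remaining strategies
    whose best payoff is below the worst payoff outside T, or None."""
    own = remaining[player]
    others = [remaining[j] for j in range(len(remaining)) if j != player]

    def payoffs(s):
        vals = []
        for rest in product(*others):
            profile = list(rest)
            profile.insert(player, s)      # put own strategy in its slot
            vals.append(utility(player, tuple(profile)))
        return vals

    hi = {s: max(payoffs(s)) for s in own}
    lo = {s: min(payoffs(s)) for s in own}
    for size in range(len(own) - 1, 0, -1):    # try larger sets first
        for T in combinations(own, size):
            keep = [s for s in own if s not in T]
            if max(hi[s] for s in T) < min(lo[s] for s in keep):
                return set(T)
    return None

def blindly_solve(action_sets, utility):
    """Iteratively eliminate blindly dominated sets; return what survives."""
    remaining = [list(a) for a in action_sets]
    progress = True
    while progress:
        progress = False
        for i in range(len(remaining)):
            T = blindly_dominated_set(i, remaining, utility)
            if T:
                remaining[i] = [s for s in remaining[i] if s not in T]
                progress = True
    return remaining

# Example game: row player's payoffs depend on its type, column's do not.
ROW = {1: [[9, 1, 1], [10, 0, 0]], 2: [[10, 0, 0], [9, 1, 1]]}
COL = [[0, 1, 3], [0, 2, 1]]

def make_utility(row_type):
    def utility(player, profile):
        r, c = profile
        return ROW[row_type][r][c] if player == 0 else COL[r][c]
    return utility

for row_type in (1, 2):
    print(row_type, blindly_solve([range(2), range(3)], make_utility(row_type)))
```

If a single strategy remains for every player, the game counts as blindly-solvable under this reading of the definition; in the example this should hold for both types of the row player.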

1st-Price Auctions Revisited

[Table: same winner:utility bid table as on the "1st Price Auction" slide (Alice, va=4; Bob, vb=2).]

Merits of Blindly-Solvable Games

• Thm: Let G be a blindly-solvable partial-information game. Let a* be the surviving strategy profile. Then,

1. Best-response dynamics converges to a* within $n \cdot \sum_j |A_j|$ time steps.

2. In the “payoff at the limit” model, best-response dynamics is incentive-compatible, and even collusion-proof, in ex-post Nash.
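As a sanity check of the convergence claim on one small instance, the Python sketch below runs round-robin best-response dynamics on the 2x3 two-type example game from every starting profile and stops after n times the total number of actions. The payoff matrices are taken from the earlier slide; the surviving profiles a* listed in the code, the step-counting convention, and all names are assumptions of this sketch. It tests one example and is not a proof.

```python
from itertools import product

ROW_PAYOFF = {1: [[9, 1, 1], [10, 0, 0]],   # row player, Type 1
              2: [[10, 0, 0], [9, 1, 1]]}   # row player, Type 2
COL_PAYOFF = [[0, 1, 3], [0, 2, 1]]         # column player
ACTIONS = [range(2), range(3)]              # A_1 (rows), A_2 (columns)

def utility(player, profile, row_type):
    r, c = profile
    return ROW_PAYOFF[row_type][r][c] if player == 0 else COL_PAYOFF[r][c]

def best_reply(player, profile, row_type):
    def value(action):
        trial = list(profile)
        trial[player] = action
        return utility(player, tuple(trial), row_type)
    return max(ACTIONS[player], key=value)

def run_round_robin(start, row_type, steps):
    """Activate one player per time step, in cyclic order."""
    profile = list(start)
    for step in range(steps):
        player = step % len(ACTIONS)
        profile[player] = best_reply(player, tuple(profile), row_type)
    return tuple(profile)

n = len(ACTIONS)
bound = n * sum(len(a) for a in ACTIONS)    # n * sum_j |A_j|

# Final profile after `bound` steps, for each type and each starting profile.
results = {
    (row_type, start): run_round_robin(start, row_type, bound)
    for row_type in (1, 2)
    for start in product(*ACTIONS)
}
print(bound, set(results.values()))
```

For each type of the row player, every starting profile should end at the same profile, which is the one that survives the elimination process for that type.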


Intuition for Proof of (2)

• The first action that was not “eliminated” in the elimination sequence of G must belong to a manipulator.

• The manipulator’s utility from that action is lower than his utility from a*.

Best-Response 1st-Price Auction Mechanism

[Table: same winner:utility bid table as on the "1st Price Auction" slide (Alice, va=4; Bob, vb=2).]

Implications for Internet Environments

• Under realistic conditions routing with the Border Gateway Protocol is incentive compatible. [Levin-S-Zohar]

• Convergence and incentive compatibility results for congestion control. [Godfrey-S-Zohar-Shenker]

Mechanism design without money!


BEYOND BLINDLY-SOLVABLE GAMES


Generalized 2nd-Price Auction (GSP)

• Used for selling ads on search engines.

• k slots. Each slot j has click-through rate $\alpha_j$.

• Users submit bids (per click) $b_i$.

• They are ranked in order of bids.

• If ad is clicked: pay next highest bid.
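The ranking and payment rules listed above can be written down directly. This is a minimal Python sketch under a few assumptions that the slide does not state: ties are broken in favor of the lower bidder index, a bidder with no lower bid below it pays 0, and a bidder's expected utility is the slot's click-through rate times (value per click minus price per click). The function names are illustrative.

```python
def gsp_outcome(bids, ctrs):
    """Rank bidders by bid (highest first) and assign slots in that order.

    bids: per-click bid of each bidder.
    ctrs: click-through rate of each slot, best slot first.
    Returns {bidder: (slot, price_per_click)} for bidders that get a slot.
    Assumptions: ties go to the lower index; if no lower bid exists,
    the price per click is 0.
    """
    ranking = sorted(range(len(bids)), key=lambda i: (-bids[i], i))
    outcome = {}
    for slot, bidder in enumerate(ranking[:len(ctrs)]):
        # Price per click is the next-highest bid in the ranking.
        next_bid = bids[ranking[slot + 1]] if slot + 1 < len(ranking) else 0
        outcome[bidder] = (slot, next_bid)
    return outcome

def gsp_utility(value_per_click, slot, price, ctrs):
    """Assumed utility model: click-through rate times (value - price)."""
    return ctrs[slot] * (value_per_click - price)

example = gsp_outcome(bids=[3, 5, 1], ctrs=[0.5, 0.2])
print(example)
```

In the example, the bidder with bid 5 takes the top slot and pays 3 per click, the bidder with bid 3 takes the second slot and pays 1 per click, and the lowest bidder gets no slot.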

Generalized 2nd-Price Auction (GSP)

• No dominant strategy equilibrium.

• There exists an equilibrium with VCG payments. [Edelman-Ostrovsky-Schwarz, Varian]

• Best-response dynamics (with tie-breaking) converge with probability 1 to that equilibrium. [Cary et al.]

• Thm (informal): Best-replying in GSP is incentive-compatible.
  – Generalizes the English auction of [Edelman-Ostrovsky-Schwarz]


Auctions With Unit-Demand Bidders

• n bidders. m items.

• Each bidder i has value vi,j for each item j, and is interested in at most one item.

• Thm: There exists a best-response mechanism for auctions with unit-demand bidders that is incentive-compatible in ex-post Nash and converges to the VCG outcome.
  – Generalizes the English auction of [Demange-Gale-Sotomayor]

• The proof of incentive-compatibility is simple. The proof of convergence is more complex and is based on Kuhn’s Hungarian method.
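For reference, the VCG outcome that the mechanism is said to converge to can be computed directly on tiny instances. The Python sketch below does this by brute-force search over assignments. It is neither the best-response mechanism from the talk nor the Hungarian method, only a way to see what the target outcome looks like. The function names and the example values are hypothetical.

```python
def best_assignment(values, bidders, items):
    """Max-welfare assignment giving each bidder at most one item.

    values[i][j] is bidder i's value for item j. Brute force, so only
    suitable for tiny instances. Returns (welfare, {bidder: item}).
    """
    if not bidders:
        return 0, {}
    first, rest = bidders[0], bidders[1:]
    best = best_assignment(values, rest, items)   # `first` gets nothing
    for item in items:
        remaining = [j for j in items if j != item]
        welfare, assignment = best_assignment(values, rest, remaining)
        welfare += values[first][item]
        if welfare > best[0]:
            best = (welfare, {**assignment, first: item})
    return best

def vcg_outcome(values):
    """Return (assignment, payments) under the VCG rule."""
    bidders = list(range(len(values)))
    items = list(range(len(values[0])))
    welfare, assignment = best_assignment(values, bidders, items)
    payments = {}
    for i in bidders:
        others = [b for b in bidders if b != i]
        welfare_without_i, _ = best_assignment(values, others, items)
        own_value = values[i][assignment[i]] if i in assignment else 0
        # Payment = harm that i's presence imposes on the other bidders.
        payments[i] = welfare_without_i - (welfare - own_value)
    return assignment, payments

# Hypothetical example: 2 bidders, 2 items.
assignment, payments = vcg_outcome([[5, 3], [4, 1]])
print(assignment, payments)
```

In the example, bidder 0 receives item 1 for free, and bidder 1 receives item 0 and pays 2, which is the loss bidder 0 suffers from not getting item 0.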


A NEW PERSPECTIVE ON SOME CENTRALIZED MECHANISMS

Centralized vs. Distributed

[Figure: two settings side by side.
  – centralized: players declare types; the mechanism simulates the interaction and outputs the outcome.
  – distributed: players reach a stable outcome in a distributed manner.
An ex-post equilibrium in the decentralized setting corresponds to a dominant-strategy implementation in the centralized setting.]


The Centralized Setting

• Each player i has an action set Ai, a private type ti, and a utility function ui (as before).

• Wanted: a direct revelation mechanism $M : T_1 \times \dots \times T_n \to A_1 \times \dots \times A_n$ that outputs a pure Nash equilibrium of the game and incentivizes truthfulness.

Clearly, This is Not Always Possible

Row Player: Type 1
2,1 0,0
3,0 1,3

Row Player: Type 2
3,1 1,0
2,0 0,3

Corollary I

• If every player has a single dominant strategy in every realization, then the direct-revelation mechanism is truthful.
  – Give each player his dominant strategy in the reported realization.

Corollary II

• If the game is blindly solvable, then the direct-revelation mechanism is truthful.

Row Player: Type 1
9,0 1,1 1,3
10,0 0,2 0,1

Row Player: Type 2
10,0 0,1 0,3
9,0 1,2 1,1

More Blindly-Solvable Games

• Cost-Sharing mechanisms
  – Moulin mechanisms [Moulin, Moulin-Shenker]
  – Acyclic mechanisms [Mehta-Roughgarden-Sundararajan]

• Matching games
  – Interns and Hospitals
  – Correlated two-sided markets

Directions for Future Research

• Implementability of other kinds of equilibria (mixed Nash, correlated, …)?

• Incentive-compatibility of other kinds of dynamics (fictitious play, regret minimization)?

Agenda

• Part I: mechanism design approach to best-response dynamics.

• Part II: on the convergence of best-response dynamics in asynchronous environments.

Best-Response Dynamics Out of Sync


Synchronous Environments

• In traditional best-response dynamics players are activated one at a time.

• More generally, the study of game dynamics normally supposes synchrony.

• What if the interaction between players is asynchronous? (Internet, markets)

Illustration

Row Player (rows) vs. Column Player (columns):

2,1 0,0
0,0 1,2


But…

Row Player (rows) vs. Column Player (columns):

2,1 0,0
0,0 1,2

Model for Analyzing Asynchronous Best-Response Dynamics

• Infinite sequence of discrete time-steps.

• In each time-step a subset of the players best-replies.

• The “schedule” is chosen by an adversarial entity (“the Scheduler”).

• The schedule must be fair (no player is indefinitely “starved” from best-replying).
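The effect of the schedule can be illustrated on the 2x2 game from the preceding illustration slides. The Python sketch below assumes the second row of that matrix reads 0,0 / 1,2 (the extracted text is ambiguous there), so the game has two pure Nash equilibria. It compares a schedule that activates one player at a time with a fair schedule that activates both players in every time step. The names `run`, `one_at_a_time`, and `simultaneous` are illustrative.

```python
ROW_PAYOFF = [[2, 0], [0, 1]]   # row player's payoffs
COL_PAYOFF = [[1, 0], [0, 2]]   # column player's payoffs

def row_best_reply(col):
    return max((0, 1), key=lambda r: ROW_PAYOFF[r][col])

def col_best_reply(row):
    return max((0, 1), key=lambda c: COL_PAYOFF[row][c])

def run(schedule, start):
    """Apply a schedule: each entry is the set of players activated in that
    time step ('row', 'col', or both). Activated players best-reply
    simultaneously to the profile of the previous step."""
    row, col = start
    trace = [(row, col)]
    for active in schedule:
        new_row = row_best_reply(col) if "row" in active else row
        new_col = col_best_reply(row) if "col" in active else col
        row, col = new_row, new_col
        trace.append((row, col))
    return trace

one_at_a_time = [{"row"}, {"col"}] * 3   # classic, synchronized activation
simultaneous = [{"row", "col"}] * 6      # fair, but both move at once

print(run(one_at_a_time, start=(0, 1)))
print(run(simultaneous, start=(0, 1)))
```

With one player moving at a time the dynamics settle on an equilibrium, whereas with simultaneous moves from the same starting profile both players keep switching and the profile alternates between the two non-equilibrium cells.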

Result [Jaggard-S-Wright]

• Thm: If two (or more) pure Nash equilibria exist in a game, then asynchronous best-reply dynamics can potentially oscillate.

• Implications for Internet protocols, diffusion of innovations in social networks, and more.

Directions for Future Research

• Characterization of games for which asynchronous best-response dynamics converge.

• More generally, exploring game dynamics in the realm that lies beyond synchronization (fictitious play, regret minimization).


THANK YOU!