What Coalitions Can Achieve
Jürgen Dix and Wojtek Jamroga
Department of Informatics, Clausthal University of Technology, Germany
European Agent Systems Summer School (Lisbon 2008)
Jürgen Dix and Wojtek Jamroga · European Agent Systems Summer School (Lisbon 2008) 1/177
Lecture Overview
1. Introduction
2. How to Form a Coalition
3. Reasoning about Coalitions
Pointers to Other Courses
Paul Harrenstein / Mathijs de Weerdt: Introduction to Game Theory and Mechanism Design (Tuesday 11-16)
John-Jules Meyer: Logics for Multiagent Systems (Monday/Tuesday afternoon)
Timetable
Section 1: Required Concepts. 20-30 min. Very brief recap.
Section 2: How to form a coalition. 60-70 min. Standard.
Section 3: Reasoning about coalitions. 90 min. Advanced.
Updated slides: http://cig.in.tu-clausthal.de/index.php?id=159
Chapter 1. Required Concepts
Required Concepts
1.1 Evaluation Criteria
1.2 Non-Coop Games in NF
1.3 Non-Coop Extensive Games
1.4 References
Outline (1)
We
consider in this chapter non-cooperative games,
state several evaluation criteria, the minmax theorem and its famous generalization: Nash's theorem,
introduce two different sorts of games: those in normal form and those in extensive form.
Classical DAI: The system designer fixes an interaction protocol which is uniform for all agents. The designer also fixes a strategy for each agent.

What is the outcome, assuming that the protocol is followed and the agents follow the strategies?
MAS: The interaction protocol is given. Each agent determines its own strategy (maximising its own good via a utility function, without looking at the global task).

Global optimum?

What is the outcome, given a protocol that guarantees that each agent's desired local strategy is the best one (and is therefore chosen by the agent)?
1.1 Evaluation Criteria
We need to compare protocols. Each such protocol leads to a solution. So we determine how good these solutions are.
Social Welfare: the sum of all utilities.
Pareto Efficiency: A solution x is Pareto-optimal if there is no solution x′ with:
(1) ∃ agent ag: ut_ag(x′) > ut_ag(x), and
(2) ∀ agents ag′: ut_ag′(x′) ≥ ut_ag′(x).
Individual rationality: the payoff is higher than from not participating at all.
Stability:
Case 1: The strategy of an agent depends on the others. The profile S*_𝐀 = ⟨S*_1, S*_2, ..., S*_|𝐀|⟩ is called a Nash equilibrium iff for all i: S*_i is the best strategy for agent i if all the others choose ⟨S*_1, S*_2, ..., S*_{i−1}, S*_{i+1}, ..., S*_|𝐀|⟩.
Case 2: The strategy of an agent does not depend on the others. Such strategies are called dominant.
Example 1.1 (Prisoner's Dilemma, Type 1)
Two prisoners are suspected of a crime (which they both committed). They can choose to (1) cooperate with each other (not confessing to the crime) or (2) defect (giving evidence that the other was involved). Both cooperating (not confessing) gives them a shorter prison term than both defecting. But if only one of them defects (the betrayer), the other gets the maximal prison term. The betrayer then has the maximal payoff.

                         Prisoner 2
                         cooperate   defect
Prisoner 1  cooperate     (3,3)      (0,5)
            defect        (5,0)      (1,1)
Social Welfare: both cooperate.
Pareto Efficiency: all profiles are Pareto-optimal, except both defecting.
Dominant Strategy: both defect.
Nash Equilibrium: both defect.
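These four answers can be recomputed by brute force over the payoff table of Example 1.1; a small Python sketch (variable names hypothetical):

```python
from itertools import product

# Payoff table of Example 1.1: (row action, column action) -> payoffs.
u = {("c", "c"): (3, 3), ("c", "d"): (0, 5),
     ("d", "c"): (5, 0), ("d", "d"): (1, 1)}
acts = ["c", "d"]

# Social welfare: profile with the largest sum of utilities.
welfare = {a: sum(u[a]) for a in u}
best_welfare = max(welfare, key=welfare.get)

# Pareto optimality: no other profile is at least as good for everyone
# and strictly better for someone.
def dominated(a, b):
    return all(u[b][i] >= u[a][i] for i in range(2)) and \
           any(u[b][i] > u[a][i] for i in range(2))
pareto = [a for a in u if not any(dominated(a, b) for b in u)]

# Dominant strategies of player 1: best no matter what player 2 does.
dominant = [r for r in acts
            if all(u[(r, c)][0] >= u[(rr, c)][0] for rr in acts for c in acts)]

# Pure Nash equilibria: each action is a best response to the other.
nash = [(r, c) for r, c in product(acts, acts)
        if u[(r, c)][0] == max(u[(rr, c)][0] for rr in acts)
        and u[(r, c)][1] == max(u[(r, cc)][1] for cc in acts)]

print(best_welfare)    # ('c', 'c')
print(sorted(pareto))  # [('c', 'c'), ('c', 'd'), ('d', 'c')]
print(dominant)        # ['d']
print(nash)            # [('d', 'd')]
```

The output reproduces the list above: cooperation maximises welfare, yet defection is dominant and the only equilibrium.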
1.2 Non-Coop Games in NF
Prisoner's dilemma revisited, with generic payoffs (the dilemma arises when b > a > d > c, as in Example 1.1 with a = 3, b = 5, c = 0, d = 1):

                         Prisoner 2
                         cooperate   defect
Prisoner 1  cooperate     (a,a)      (c,b)
            defect        (b,c)      (d,d)
Example 1.2 (Trivial mixed-motive game, Type 0)

                 Player 2
                 C        D
Player 1  C    (4,4)    (2,3)
          D    (3,2)    (1,1)
Example 1.3 (Battle of the Bismarck Sea)
In 1943 the northern half of New Guinea was controlled by the Japanese, the southern half by the Allies. The Japanese wanted to reinforce their troops. This could happen using two different routes: (1) north (rain and bad visibility) or (2) south (weather ok). The trip should take 3 days.

The Allies want to bomb the convoy as long as possible. If they search north, they can bomb for 2 days (independently of the route taken by the Japanese). If they go south, they can bomb for 3 days if the Japanese go south too, and only 1 day if the Japanese go north.
                         Japanese
                         Sail North   Sail South
Allies  Search North       2 days       2 days
        Search South       1 day        3 days
Allies: What is the largest of all row minima?
Japanese: What is the smallest of the column maxima?

In the Battle of the Bismarck Sea:
largest row minimum = smallest column maximum.

This is called a saddle point.
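The saddle-point check is a two-line computation; a minimal Python sketch over the Bismarck Sea table:

```python
# Battle of the Bismarck Sea (Example 1.3): rows = Allies, cols = Japanese,
# entries = days of bombing (Allies maximise, Japanese minimise).
M = [[2, 2],   # Search North vs (Sail North, Sail South)
     [1, 3]]   # Search South

maxmin = max(min(row) for row in M)                             # largest row minimum
minmax = min(max(M[r][c] for r in range(2)) for c in range(2))  # smallest column maximum

print(maxmin, minmax)  # 2 2 -> the two values coincide: a saddle point
```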
Example 1.4 (Rochambeau Game)

Also known as paper, rock and scissors: paper covers rock, rock smashes scissors, scissors cut paper.

            MinP
            P    S    R
MaxP  P     0   -1    1
      S     1    0   -1
      R    -1    1    0
Definition 1.5 (n-Person Normal Form Game)

A finite n-person normal form game is a tuple ⟨𝐀, A, O, ϱ, µ⟩, where
𝐀 = {1, ..., i, ..., n} is a finite set of players,
A = ∏_{i=1}^n A_i, where A_i is the set of actions available to player i. A vector a ∈ A is called an action profile. The elements of A_i are also called pure strategies,
O is the set of outcomes,
ϱ : A → O assigns each action profile an outcome,
µ = ⟨µ_1, ..., µ_i, ..., µ_n⟩, where µ_i : O → ℝ is a real-valued utility (payoff) function for player i.
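Definition 1.5 can be sketched directly in Python (class and field names hypothetical; outcomes are identified with action profiles, as the slides do later when ϱ is dropped):

```python
from itertools import product

# A finite n-person normal form game, with outcomes identified with
# action profiles (i.e. the outcome function is the identity).
class NormalFormGame:
    def __init__(self, actions, utilities):
        # actions[i]  : list of pure strategies of player i
        # utilities[i]: dict mapping action profiles (tuples) to reals
        self.n = len(actions)
        self.actions = actions
        self.utilities = utilities

    def profiles(self):
        # all action profiles: the Cartesian product of the action sets
        return list(product(*self.actions))

# The prisoner's dilemma of Example 1.1 ("c" = cooperate, "d" = defect).
payoffs = {("c", "c"): (3, 3), ("c", "d"): (0, 5),
           ("d", "c"): (5, 0), ("d", "d"): (1, 1)}
pd = NormalFormGame(
    actions=[["c", "d"], ["c", "d"]],
    utilities=[{a: p[i] for a, p in payoffs.items()} for i in range(2)],
)
print(len(pd.profiles()))            # 4 action profiles
print(pd.utilities[0][("d", "c")])   # 5: payoff of player 1 when betraying
```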
Games can be represented graphically using an n-dimensional payoff matrix. Here is a generic picture for 2-player, 2-strategy games (a_i^j denotes the j-th action of player i):

1\2         a_2^1                                a_2^2
a_1^1   ⟨µ_1(a_1^1, a_2^1), µ_2(a_1^1, a_2^1)⟩   ⟨µ_1(a_1^1, a_2^2), µ_2(a_1^1, a_2^2)⟩
a_1^2   ⟨µ_1(a_1^2, a_2^1), µ_2(a_1^2, a_2^1)⟩   ⟨µ_1(a_1^2, a_2^2), µ_2(a_1^2, a_2^2)⟩

We often forget about ϱ (thus we make no distinction between actions and outcomes). Thus we simply write µ_1(a_1^1, a_2^1) instead of the more precise µ_1(ϱ(⟨a_1^1, a_2^1⟩)).
Definition 1.6 (Common Payoff Game)

A common payoff game (team game) is a game in which for all action profiles a ∈ A_1 × ... × A_n and any two agents i, j the following holds: µ_i(a) = µ_j(a).

In such games agents have no conflicting interests. Their graphical depiction is simpler than above (the second component is not needed).
The opposite of a team game is a

Definition 1.7 (Constant Sum Game)

A 2-player n-strategy normal form game is called a constant sum game if there exists a constant c such that for each strategy profile a ∈ A_1 × A_2: µ_1(a) + µ_2(a) = c.

We usually set wlog c = 0 (zero sum games).

What we are really after is strategies. A pure strategy is one where one action is chosen and played. But does this always make sense?
Example 1.8 (Battle of the Sexes, Type 2)

A married couple looks for evening entertainment. They prefer to go out together, but have different views about what to do (say going to the theatre and eating in a gourmet restaurant).

                        Wife
                        Theatre   Restaurant
Husband  Theatre         (4,3)      (2,2)
         Restaurant      (1,1)      (3,4)
Example 1.9 (Leader Game, Type 3)

Two drivers attempt to enter a busy stream of traffic. When the cross traffic clears, each one has to decide whether to concede the right of way to the other (C) or drive into the gap (D). If both decide for C, they are delayed. If both decide for D, there may be a collision.

                  Driver 2
                  C       D
Driver 1  C     (2,2)   (3,4)
          D     (4,3)   (1,1)
Example 1.10 (Fighters and Bombers)
Consider fighter pilots in WW II. A good strategy to attack bombers is to swoop down from the sun: the Hun-in-the-Sun strategy. But the bomber pilots can put on their sunglasses and stare into the sun to watch the fighters. So another strategy is to attack them from below: the Ezak-Imak strategy. If the fighters are not spotted, it is fine; if they are, it is fatal for them (they are much slower when climbing). The table contains the survival probabilities of the fighter pilot.

                               Bomber Crew
                               Look Up   Look Down
Fighter Pilot  Hun-in-the-Sun   0.95        1
               Ezak-Imak        1           0
Example 1.11 (Matching Pennies Game)

Two players display one side of a penny (head or tails). Player 1 wins the penny if they display the same side, player 2 wins otherwise.

                  Player 2
                  Head      Tails
Player 1  Head   (1,-1)    (-1,1)
          Tails  (-1,1)    (1,-1)
Definition 1.12 (Mixed Strategy for NF Games)

Let ⟨𝐀, A, O, ϱ, µ⟩ be a normal form game. For a set X let Π(X) be the set of all probability distributions over X. The set of mixed strategies for player i is the set S_i = Π(A_i). The set of mixed strategy profiles is S_1 × ... × S_n.

The support of a mixed strategy is the set of actions that are assigned non-zero probabilities.

What is the payoff of such strategies? We have to take into account the probability with which an action is chosen. This leads to the expected utility µ^expected.
Definition 1.13 (Expected Utility for player i)

The expected utility for player i of the mixed strategy profile (s_1, ..., s_n) is defined as

µ_i^expected(s_1, ..., s_n) = Σ_{a∈A} µ_i(ϱ(a)) ∏_{j=1}^n s_j(a_j).

What is the optimal strategy (maximising the expected payoff) for an agent in a 2-agent setting?
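Definition 1.13 translates directly into code; a minimal Python sketch (function names hypothetical), using matching pennies from Example 1.11 as the test case:

```python
from itertools import product

# Expected utility of a mixed strategy profile (Definition 1.13):
# sum over all action profiles of the payoff times the probability
# that this profile is played.
def expected_utility(u_i, strategies):
    # u_i        : dict mapping action profiles to player i's payoff
    # strategies : one dict {action: probability} per player
    total = 0.0
    for profile in product(*(s.keys() for s in strategies)):
        prob = 1.0
        for s, a in zip(strategies, profile):
            prob *= s[a]
        total += u_i[profile] * prob
    return total

# Matching pennies (Example 1.11), payoffs of player 1.
u1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

# Both players uniform: expected payoff 0.
s = [{"H": 0.5, "T": 0.5}, {"H": 0.5, "T": 0.5}]
print(expected_utility(u1, s))  # 0.0

# Pure profile (H, T) recovers the matrix entry.
print(expected_utility(u1, [{"H": 1.0}, {"T": 1.0}]))  # -1.0
```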
Definition 1.14 (Maxmin strategy)

Given a game ⟨{1, 2}, A_1, A_2, µ_1, µ_2⟩, the maxmin strategy of player i is a mixed strategy that maximises the guaranteed payoff of player i, no matter what the other player −i does:

argmax_{s_i} min_{s_−i} µ_i^expected(s_1, s_2).

The maxmin value for player i is max_{s_i} min_{s_−i} µ_i^expected(s_1, s_2).

The minmax strategy for player i is

argmin_{s_i} max_{s_−i} µ_−i^expected(s_1, s_2)

and its minmax value is min_{s_i} max_{s_−i} µ_−i^expected(s_1, s_2).
Lemma 1.15
In each finite normal form 2-person game (not necessarily constant sum), the maxmin value of one player is never strictly greater than the minmax value for the other.
We illustrate the maxmin strategy using a 2-person 3-strategy constant sum game:

                  Player B
                  B-I    B-II    B-III
Player A  A-I      0      5/6     1/2
          A-II     1      1/2     3/4

We assume Player A's optimal strategy is to play strategy A-I with probability x and A-II with probability 1 − x.

In the following we want to determine x.
In accordance with the maxmin strategy, let us compute

argmax_{s_1} min_{s_2} µ_1^expected(s_1, s_2).

We assume Player A plays (as above) A-I with probability x and A-II with probability 1 − x (strategy s_1). Similarly, Player B plays B-I with probability y and B-II with probability 1 − y (strategy s_2).

We compute µ_1^expected(s_1, s_2):

0 · x · y + (5/6) x (1 − y) + 1 · (1 − x) y + (1/2)(1 − x)(1 − y),

thus

µ_1^expected(s_1, s_2) = y (1/2 − (4/3) x) + (1/3) x + 1/2.
According to the maxmin strategy, we have to choose x such that the minimal value of the above term is maximal. For each value of x the above is a straight line in y with some gradient. Thus we get the maximum when the line does not slope at all, i.e. when 1/2 − (4/3) x = 0.

Thus x = 3/8. A similar reasoning gives y = 1/4.
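The computation can be verified numerically; a small Python sketch using the expression for µ_1^expected derived above:

```python
# Expected payoff of Player A derived above:
# mu(x, y) = y * (1/2 - (4/3) x) + x/3 + 1/2.
def mu(x, y):
    return y * (0.5 - 4.0 / 3.0 * x) + x / 3.0 + 0.5

# At x = 3/8 the coefficient of y vanishes, so Player A guarantees
# the same expected payoff against every strategy of Player B.
x = 3.0 / 8.0
vals = [mu(x, y / 10.0) for y in range(11)]
print(min(vals), max(vals))  # 0.625 0.625: the payoff is independent of y

# Deviating from x = 3/8 lets Player B push the payoff below 0.625.
print(min(mu(0.2, y / 10.0) for y in range(11)))  # about 0.567
```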
Theorem 1.16 (von Neumann (1928))
In any finite 2-person constant-sum game the following holds:
1. The maxmin value for one player is equal to the minmax value for the other. The maxmin of player 1 is usually called the value of the game.
2. For each player, the set of maxmin strategies coincides with the set of minmax strategies.
3. The maxmin strategies are optimal: if one player does not play a maxmin strategy, its guaranteed payoff (expected utility) can only go down.

What is the optimal strategy (maximising the expected payoff) for an agent in an n-agent setting?
Figure 1: A saddle.
Definition 1.17 (Best Response to a Strategy Profile)

Given a strategy profile

⟨s_1, s_2, ..., s_{i−1}, s_{i+1}, ..., s_n⟩,

the best response of player i to it is any mixed strategy s*_i ∈ S_i such that

µ_i(s*_i, ⟨s_1, s_2, ..., s_{i−1}, s_{i+1}, ..., s_n⟩) ≥ µ_i(s_i, ⟨s_1, s_2, ..., s_{i−1}, s_{i+1}, ..., s_n⟩)

for all strategies s_i ∈ S_i.
Is the best response unique?
Example 1.18 (Best response sets for Rochambeau)

What does the set of best responses look like?
1. Player 2 plays the pure strategy P.
2. Player 2 plays P with probability 1/2 and S with probability 1/2.
3. Player 2 plays P with probability 1/3, S with probability 1/3, and R with probability 1/3.

If a non-pure strategy is in the best response set (say a strategy over (s_1, s_2) with probabilities ⟨p, 1 − p⟩, p ≠ 0), then so are all other mixed strategies with probabilities ⟨p′, 1 − p′⟩ where p ≠ p′ ≠ 0.
Definition 1.19 (Nash Equilibrium)

A strategy profile ⟨s*_1, s*_2, ..., s*_n⟩ is a Nash equilibrium if for any agent i, s*_i is the best response to ⟨s*_1, s*_2, ..., s*_{i−1}, s*_{i+1}, ..., s*_n⟩.

What are the Nash equilibria in the Battle of the Sexes? What about matching pennies?
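The pure-strategy part of both questions can be settled by brute force; a Python sketch (mixed equilibria, as guaranteed by Nash's theorem below, are not covered):

```python
from itertools import product

# Pure-strategy Nash equilibria of a 2-player game (Definition 1.19,
# restricted to pure strategies): each action is a best response.
def pure_nash(acts1, acts2, u):
    eqs = []
    for a1, a2 in product(acts1, acts2):
        best1 = max(u[(b, a2)][0] for b in acts1)
        best2 = max(u[(a1, b)][1] for b in acts2)
        if u[(a1, a2)] == (best1, best2):
            eqs.append((a1, a2))
    return eqs

# Battle of the Sexes (Example 1.8): two pure equilibria.
bos = {("T", "T"): (4, 3), ("T", "R"): (2, 2),
       ("R", "T"): (1, 1), ("R", "R"): (3, 4)}
print(pure_nash("TR", "TR", bos))  # [('T', 'T'), ('R', 'R')]

# Matching pennies (Example 1.11): none -- only a mixed equilibrium exists.
mp = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
      ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
print(pure_nash("HT", "HT", mp))   # []
```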
Theorem 1.20 (Nash (1950))
Every finite normal form game has a Nash equilibrium.
Corollary 1.21 (Nash Eq. in Constant-Sum Games)

In any finite normal form 2-person constant-sum game, the Nash equilibria are exactly all pairs ⟨s_1, s_2⟩ of maxmin strategies (s_1 for player 1, s_2 for player 2).
All Nash equilibria have the same payoff (expected utility): the value of the game, which player 1 gets.
1.3 Non-Coop Extensive Games
We have previously introduced normal form games (Definition 1.5). This notion does not allow us to deal with sequences of actions that are reactions to actions of the opponent.

Extensive form (tree form) games

Unlike games in normal form, those in extensive form do not assume that all moves between players are made simultaneously. This leads to a tree form, and allows us to introduce strategies that take into account the history of the game.

We distinguish between perfect and imperfect information games. While the former assume that the players have complete knowledge about the game, the latter do not: a player might not know exactly which node it is in.
The following definition covers a game as a tree:

Definition 1.22 (Extensive Form Games, Perfect Inf.)

A finite perfect information game in extensive form is a tuple G = ⟨𝐀, A, H, Z, α, ρ, σ, µ_1, ..., µ_n⟩, where
𝐀 is a set of n players and A is a set of actions,
H is a set of non-terminal nodes and Z a set of terminal nodes, with H ∩ Z = ∅,
α : H → 2^A assigns to each node a set of available actions,
ρ : H → 𝐀 assigns to each non-terminal node a player who chooses an action at that node,
σ : H × A → H ∪ Z assigns to each (node, action) pair a successor node, such that σ(h_1, a_1) = σ(h_2, a_2) implies h_1 = h_2 and a_1 = a_2,
µ_i : Z → ℝ are the utility functions.
Such games can be visualised as finite trees. Here is the famous "Sharing Game".
Figure 2: The Sharing game.
Here is another (generic) game.
Figure 3: A generic game.
Transforming extensive form games into normal form

Note that the definitions of best response and Nash equilibria carry over to games in extensive form. But they do not take into account the sequential nature of extensive games. Indeed we have:

Lemma 1.23 (Extensive form → Normal form)

Each game in extensive form can be transformed into a normal form game (such that the strategy spaces are the same).

The idea is to take the set of all strategies of agent i as the set of actions of agent i, and to define the utility function accordingly.
Transforming the generic game into normal form

      C   D
AE    W   Y
AF    X   Y
BE    Z   Z
BF    Z   Z
Is there a converse of Lemma 1.23? We consider the prisoner's dilemma and try to model it as a game in extensive form with the same payoffs and strategy profiles.
In fact, it is not surprising that we do not succeed in the general case:
Theorem 1.24 (Zermelo, 1913)
Every perfect information game in extensive form has a pure strategy Nash equilibrium.

We will later introduce imperfect information games (in extensive form): Slide 53.
Example 1.25 (Unintended Nash equilibria)
We consider the following perfect information game in extensive form.
Figure 4: Unintended Equilibrium.
Both (A,R) and (B,L) are Nash equilibria. But (B,L) is unintuitive.

This leads to the notion of subgame perfect Nash equilibria:
Definition 1.26 (Subgame Perfect Nash Equilibria)

Let G be a perfect information game in extensive form.
Subgame: A subgame of G rooted at node h is the restriction of G to the descendants of h.
SPE: The subgame perfect Nash equilibria of a perfect information game G in extensive form are those Nash equilibria of G that are also Nash equilibria for all subgames G′ of G.
Figure 5: The Centipede game.
The only SPE is for each player to go down. But many human players would rather go across and hope for a better payoff.
Definition 1.27 (Extensive Form Games, Imperfect Inf.)

A finite imperfect information game in extensive form is a tuple G = ⟨𝐀, A, H, Z, α, ρ, σ, µ_1, ..., µ_n, I_1, ..., I_n⟩ where
⟨𝐀, A, H, Z, α, ρ, σ, µ_1, ..., µ_n⟩ is a perfect information game in the sense of Definition 1.22 on Slide 43,
I_i is an equivalence relation on {h ∈ H : ρ(h) = i} such that (h, h′) ∈ I_i implies α(h) = α(h′).
Figure 6: An imperfect game.
Definition 1.28 (Pure strategy in imperfect inf. games)

Given an imperfect information game in extensive form, a pure strategy for player i is a vector ⟨a_1, ..., a_k⟩ with a_j ∈ α(I_{i,j}), where I_{i,1}, ..., I_{i,k} are the k equivalence classes for agent i.
Figure 7: Prisoner’s dilemma.
There is a pure strategy Nash equilibrium.
NF game → Imperfect game

Each game in normal form can be transformed into an imperfect information game in extensive form.

Each imperfect information game in extensive form can be transformed into a game in normal form.

This is obvious if we consider pure strategies. But what about mixed strategies?

What is the set of mixed strategies for an imperfect game?
SPE: What about subgame perfect equilibria (the analogue of Definition 1.26 on Slide 51) for imperfect games?

First try: In each information set we have a set of subgames (a forest). Why not require that a strategy be a best response in all subgames of that forest?
Figure 8: Subgames in Imperfect Games.
Nash equilibria: (L,U) and (R,D).
In one subgame U dominates D, in the other D dominates U.
But (R,D) seems to be the unique choice: both players can put themselves into the other's place and reason accordingly.
Requiring that a strategy be a best response in all subgames is too strong.
1.4 References
[FudTir] Fudenberg, D. and J. Tirole (1991).Game Theory. MIT Press.
[Nash50] Nash, J. (1950).Equilibrium points in n-person games.Proceedings of the National Academy of Sciences of the UnitedStates of America 36, 48–49.
[OsbRub] Osborne, M. and A. Rubinstein (1994).A Course in Game Theory. MIT Press.
[Rosenthal73] Rosenthal, R. (1973).A class of games possessing pure-strategy Nash equilibria.International Journal of Game Theory 2, 65–67.
[Shoham] Shoham, Yoav (2003).
Multiagent Systems. Preprint.
Chapter 2. How to Form a Coalition

How to Form a Coalition
2.1 Coalitional Games
2.2 Coalition-Structure-Search
2.3 Core versus Shapley Value
2.4 Computational Issues
2.5 References
Outline
We
consider in this chapter coalitional games (also called cooperative). In contrast to the previous chapter, the basic notion is a team of agents that work together, not a single agent as in non-cooperative games,
discuss two solution concepts: the core and the Shapley value,
describe an important anytime algorithm: CSS, the coalition structure search,
discuss the relations between the core and the Shapley value, and
state some complexity results (computing the Shapley value; is the core empty?).
Idea: Consider a protocol (to build coalitions) as a game and consider Nash equilibria.

Problem: Nash equilibrium is too weak!

Definition 2.1 (Strong Nash Equilibrium)

A profile is in strong Nash equilibrium if there is no subgroup that can deviate by changing strategies jointly in a manner that increases the payoff of all its members, given that nonmembers stick to their original choice.

This, in turn, is often too strong and frequently does not exist.
2.1 Coalitional Games
Definition 2.2 (Coalitional Game (CFG))

A coalitional game, also called a characteristic function game (CFG), is a pair ⟨𝐀, v⟩ where v : 2^𝐀 → ℝ is a function assigning each coalition a real-valued number.

Thus the value of a coalition is independent of the nonmembers.
Is that really true?

1. Positive externalities: overlapping goals. Nonmembers perform actions and move the world closer to the coalition's goal state.
2. Negative externalities: shared resources. Nonmembers may use the resources so that not enough is left.
Definition 2.3 (Coalition Formation in CFGs)

Coalition formation in CFGs consists of:
Forming CS: Formation of coalitions such that within each coalition agents coordinate their activities. This partitioning is called a coalition structure CS.
Solving the optimisation problem: For each coalition the tasks and resources of the agents have to be pooled. Maximise the monetary value.
Payoff division: Divide the value of the generated solution among the agents.
Definition 2.4 (Additive, Simple Games)

A game ⟨𝐀, v⟩ is called additive if

v(S ∪ T) = v(S) + v(T)

for all S, T ⊆ 𝐀 with S ∩ T = ∅.
A game is called simple if for every S ⊆ 𝐀 the following holds: v(S) ∈ {0, 1}.

Lemma 2.5

Additive games are constant sum games.
Definition 2.6 (Super-additive Games)

A game ⟨𝐀, v⟩ is called super-additive if

v(S ∪ T) ≥ v(S) + v(T),

where S, T ⊆ 𝐀 and S ∩ T = ∅.
Lemma 2.7
Coalition formation for super-additive games is trivial.
Conjecture
All games are super-additive.
The conjecture is wrong, because the coalition process is not for free: communication costs, penalties, time limits.

Definition 2.8 (Sub-additive Games)

A game ⟨𝐀, v⟩ is called sub-additive if

v(S ∪ T) < v(S) + v(T),

where S, T ⊆ 𝐀 and S ∩ T = ∅.

Coalition formation for sub-additive games is trivial.
Definition 2.9 (Convex Game)

A game ⟨𝐀, v⟩ is called convex if the following holds:

v(S) + v(T) ≤ v(S ∪ T) + v(S ∩ T)

for all S, T ⊆ 𝐀.

Obviously, this implies super-additivity.

Definition 2.10 (Veto Player)

A player i is called a veto player if v(𝐀 \ {i}) = 0.
Example 2.11 (Treasure of Sierra Madre Game)
There are n people finding a treasure of many gold pieces in the Sierra Madre. Each piece can be carried by two people, not by a single person.

Example 2.12 (3-player Majority Game)
There are 3 people that need to agree on something. If they all agree, there is a payoff of 1. If just 2 agree, they get a payoff of α (0 ≤ α ≤ 1). The third player gets nothing.

Example 2.13 (Parliament Game)
The parliament has to decide about passing a 100 million Euro spending bill. There are 4 parties with the following numbers of representatives: A: 45, B: 25, C: 15, and D: 15. The bill passes when at least 51 vote for it.
What do the values v(S) look like?

In the Sierra Madre Game: v(S) = ⌊|S|/2⌋.

For the Majority Game:
v(S) = 1 if S = {1, 2, 3}; α if |S| = 2; 0 if |S| = 1.

For the Parliament Game:

100 = v({A,B,C,D}) = v({A,B}) = v({A,C}) = v({A,D}) = v({B,C,D})

and

0 = v({B,C}) = v({B,D}) = v({C,D}).
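A quick Python check (helper names hypothetical) that the Sierra Madre characteristic function satisfies Definition 2.6:

```python
from itertools import chain, combinations

# All subsets of a set of agents, as frozensets.
def subsets(agents):
    s = list(agents)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

# Sierra Madre treasure game (Example 2.11): v(S) = floor(|S| / 2),
# since each gold piece needs exactly two carriers.
def v_treasure(S):
    return len(S) // 2

# Super-additivity: v(S u T) >= v(S) + v(T) for all disjoint S, T.
agents = range(4)
superadditive = all(
    v_treasure(S | T) >= v_treasure(S) + v_treasure(T)
    for S in subsets(agents) for T in subsets(agents) if not S & T)
print(superadditive)  # True: merging two odd coalitions gains a gold piece
```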
2.2 Coalition-Structure-Search
Maximise the social welfare of the agents 𝐀 by finding a coalition structure

CS* = argmax_{CS ∈ part(𝐀)} Val(CS),

where

Val(CS) := Σ_{S ∈ CS} v(S).

How many coalition structures are there?
Let Z(|A|, i) denote the number of coalition structures with i coalitions. Then:

Z(|A|, |A|) = Z(|A|, 1) = 1,
Z(|A|, i) = i · Z(|A| − 1, i) + Z(|A| − 1, i − 1)
(add one agent to a game with |A| − 1 agents).

Σ_{i=1}^{|A|} Z(|A|, i) is the number of coalition structures. This is of the order |A|^{|A|/2}.
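The recurrence is easy to check in code: Z(n, i) is the Stirling number of the second kind, and the totals are the Bell numbers (1, 2, 5, 15, 52, …). A quick sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def Z(n, i):
    # Number of coalition structures over n agents with exactly i coalitions.
    if i < 1 or i > n:
        return 0
    if i == 1 or i == n:
        return 1
    # Add agent n to a structure over n-1 agents: it either joins one of
    # the existing i coalitions, or starts a new coalition on its own.
    return i * Z(n - 1, i) + Z(n - 1, i - 1)

def num_structures(n):
    return sum(Z(n, i) for i in range(1, n + 1))

print([num_structures(n) for n in range(1, 6)])  # [1, 2, 5, 15, 52]
```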
Too many: Ω(|A|^{|A|/2}). Enumerating them is only feasible for |A| < 15.
Figure 9: Number of Coalition Structures.
How can we approximate Val(CS*)?

Choose a set N (a subset of all partitions of A) and pick the best coalition structure seen so far:

CS*_N = arg max_{CS ∈ N} Val(CS).
Figure 10: Coalition Structure Graph.
We want our approximation to be as good as possible. That means:

Val(CS*) / Val(CS*_N) ≤ k,

where k is as small as possible.
We consider 3 search algorithms:

MERGE: breadth-first search from the top.
SPLIT: breadth-first search from the bottom.
Coalition-Structure-Search (CSS1): first the bottom two levels are searched, then a breadth-first search from the top.

MERGE might not even establish a bound without looking at all coalition structures.
SPLIT establishes a good bound (k = |A|) after searching the bottom two levels (see below), but then it can get slow.
CSS1 combines the good features of MERGE and SPLIT.
Why is SPLIT slow after the first two bottom levels? Construct a bad example as follows:

v(S) = 1, if |S| = 1;
       0, otherwise.

So the optimum is the top node, and

Val(CS*) / Val(CS*_N) = |A| / (l − 1),

where l is the level that the algorithm has completed (the number of singleton coalitions in a structure at level l is always ≤ l − 1, except at the top level, where it equals l, namely |A|).
Theorem 2.14 (Minimal search to establish a bound)

To establish a bound k, it suffices to search the lowest two levels of the CS-graph. With this search, the bound k = |A| holds. This bound is tight, and the number of nodes searched is 2^{|A|−1}.

No other search algorithm can establish a bound k while searching fewer than 2^{|A|−1} nodes.
Proof.

There are at most |A| coalitions in CS*. Thus

Val(CS*) ≤ |A| · max_S v(S) ≤ |A| · max_{CS ∈ N} Val(CS) = |A| · Val(CS*_N).

Number of coalitions at the second lowest level: 2^{|A|} − 2.
Number of coalition structures at the second lowest level: (2^{|A|} − 2)/2 = 2^{|A|−1} − 1.

Thus the number of nodes visited is 2^{|A|−1}.
What exactly does the last theorem mean? Let n_min be the smallest size of N such that a bound k can be established.

Positive result: n_min / (number of partitions of A) approaches 0 for |A| → ∞.
Negative result: to establish a bound k, one needs to search through exponentially many coalition structures.
Algorithm (CS-Search-1)
The algorithm comes in 3 steps:
1. Search the bottom two levels of the CS-graph.
2. Do a breadth-first search from the top of the graph.
3. Return the CS with the highest value.

This is an anytime algorithm.
Theorem 2.15 (CS-Search-1 up to layer l)
With the algorithm CS-Search-1 we get the following bound for k after searching through layer l:

k = ⌈|A|/h⌉, if |A| ≡ h − 1 (mod h) and |A| ≡ l (mod 2);
k = ⌊|A|/h⌋, otherwise;

where h := ⌊(|A| − l)/2⌋ + 2.

Thus, for l = |A| (checking the top node), k switches from |A| to |A|/2.
1. Is CS-Search-1 the best anytime algorithm?
2. The search that establishes the best k for n′ > n agents is perhaps not the same search that establishes the best k for n agents.
3. CS-Search-1 does not use any information gathered while searching. Perhaps k can be made smaller by considering not only Val(CS) but also the values v(S) within the searched structures CS′.
2. How to Form a Coalition 3. Core versus Shapley Value
2.3 Core versus Shapley Value
From now on we assume super-additivity!

Definition 2.16 (Payoff Vector, Core of a Game)

A payoff vector for a CFG is a tuple ⟨x1, …, xn⟩ such that xi ≥ 0 and Σ_{i=1}^{n} xi = v(A).

The core of a CFG is the set of all payoff vectors such that the following holds:

∀S ⊆ A: Σ_{i∈S} xi ≥ v(S).

(The core corresponds to a strong Nash equilibrium.)

What about the core in the three examples from Slide 74?
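Core membership is a finite check over all 2^n coalitions. A minimal sketch (our own encoding of the 3-player majority game, using exact fractions to avoid rounding issues):

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(agents):
    s = list(agents)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def in_core(x, v, agents):
    # x maps agents to payoffs; x is in the core iff no coalition S can
    # improve on its own: sum_{i in S} x_i >= v(S) for every S.
    return all(sum((x[i] for i in S), Fraction(0)) >= v(frozenset(S))
               for S in subsets(agents))

# 3-player majority game (Example 2.12) with alpha = 2/3
def v_majority(S):
    if S == frozenset({1, 2, 3}):
        return 1
    return Fraction(2, 3) if len(S) == 2 else 0

equal_split = {i: Fraction(1, 3) for i in (1, 2, 3)}
print(in_core(equal_split, v_majority, {1, 2, 3}))   # True

skewed = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}
print(in_core(skewed, v_majority, {1, 2, 3}))        # False: {1,3} only gets 1/2 < 2/3
```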
Sierra Madre:
Case 1: |A| ≥ 4 and |A| is even. Then the core consists of the single payoff vector ⟨1/2, …, 1/2⟩.
Case 2: |A| ≥ 3 and |A| is odd. Then the core is empty.
3-Player Majority Game: The core consists of all payoff vectors that assign 1 to the grand coalition and at least α to every coalition of two agents. Thus we have three cases:

Case 1: α < 2/3. Then the core contains the payoff vectors ⟨α/2, α/2, 1−α⟩, ⟨α/2, 1−α, α/2⟩, and ⟨1−α, α/2, α/2⟩.
Case 2: α = 2/3. Then the core consists of the single vector ⟨1/3, 1/3, 1/3⟩.
Case 3: α > 2/3. Then the core is empty.
Parliament Game: Consider payoff vectors ⟨a, b, c, d⟩. Whenever one of a, b, c, d is non-zero, the remaining three parties can form a coalition that gets a higher payoff. Therefore the core is empty.
Definition 2.17 (Shapley Value)

The Shapley value of agent i is defined by

x_i = (1/|A|!) · Σ_{S ⊆ A\{i}} |S|! · (|A| − |S| − 1)! · (v(S ∪ {i}) − v(S)).

View coalition formation as adding one agent at a time. For a given sequence (of how the agents are added), what is agent i's marginal contribution at the moment it is added to the set S?

It is v(S ∪ {i}) − v(S).

The number of orderings in which the agents of S can join before i: |S|!.
The number of orderings in which the remaining agents can join after i: (|A| − |S| − 1)!.

This is summed over all possible sets S and averaged over all |A|! orderings of the agents.
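The "average marginal contribution over all orderings" reading translates directly into code; a brute-force sketch (fine only for small n) that reproduces the Parliament Game values:

```python
from fractions import Fraction
from itertools import permutations

WEIGHTS = {"A": 45, "B": 25, "C": 15, "D": 15}

def v(S):
    # Parliament Game: the bill (worth 100) passes with at least 51 votes.
    return 100 if sum(WEIGHTS[p] for p in S) >= 51 else 0

def shapley(agents, v):
    # Average, over all |A|! orderings, of each agent's marginal
    # contribution v(S + {i}) - v(S) at the moment i joins.
    agents = list(agents)
    orders = list(permutations(agents))
    x = {i: Fraction(0) for i in agents}
    for order in orders:
        S = set()
        for i in order:
            x[i] += v(S | {i}) - v(S)
            S.add(i)
    return {i: x[i] / len(orders) for i in agents}

values = shapley(WEIGHTS, v)
print({i: float(val) for i, val in values.items()})
# A gets 50; B, C, D each get 50/3 = 16.66...
```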
What are the Shapley values for the Parliament Game?

B, C, D should all get the same amount (why?). A should get more.
Doing the math gives: A gets 50, and the remaining three parties each get 16 2/3.
Theorem 2.18 ((Non-)Emptiness of the Core)

1. The core of a constant-sum game that is not additive is empty.
2. In a simple game, the core is empty if and only if there is no veto player. When there are veto players, the core consists of all payoff vectors in which the non-veto players get 0.
3. Convex games have non-empty cores.
4. For convex games, the vector of Shapley values belongs to the core.
The payoff division should be fair between the agents; otherwise they leave the coalition.

Definition 2.19 (Dummies, Interchangeability)

Agent i is called a dummy if for all coalitions S with i ∉ S:

v(S ∪ {i}) − v(S) = v({i}).

Agents i and j are called interchangeable if for all coalitions S with i ∈ S and j ∉ S:

v((S \ {i}) ∪ {j}) = v(S).
Three axioms:

Symmetry: If i and j are interchangeable, then x_i = x_j.
Dummies: For all dummies i: x_i = v({i}).
Additivity: For any two games v, w: x_i^{v⊕w} = x_i^v + x_i^w, where v ⊕ w denotes the game defined by (v ⊕ w)(S) = v(S) + w(S).
Theorem 2.20 (Shapley Value)

There is exactly one payoff division satisfying the above three axioms: the Shapley value from Definition 2.17.
2. How to Form a Coalition 4. Computational Issues
2.4 Computational Issues
A coalitional game specifies a value for each coalition, so the representation is already exponential in the number of agents.

Using Stirling's formula, it can be shown that the Shapley value can be computed in time O(N log log n), where N = 2^n is the input size.

We need to find a more succinct game representation.
Definition 2.21 (Weighted Graph Game (WGG))

Let ⟨V, W⟩ be an undirected weighted graph (V the set of vertices, W ∈ R^{V×V} the edge weights). The associated weighted graph game ⟨A, v⟩ is defined as follows:

A = V,
v(S) := Σ_{i,j ∈ S} w(i, j).

Real-life example?
Lemma 2.22

Let a weighted graph game be given in which all weights are non-negative. Then this game is convex, and membership of a payoff vector in the core can be tested in polynomial time.
Theorem 2.23 (Shapley Value of a WGG)

The Shapley value of a weighted graph game ⟨V, W⟩ is given by

x_i = (1/2) · Σ_{j ≠ i} w(i, j).

Therefore the Shapley value can be computed in quadratic time.
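The closed form is easy to check against the permutation-based Definition 2.17; a sketch on a small, hypothetical weighted graph:

```python
from fractions import Fraction
from itertools import combinations, permutations

# Symmetric edge weights of a small, hypothetical weighted graph game.
W = {frozenset({1, 2}): 3, frozenset({2, 3}): -1, frozenset({1, 3}): 4}

def v(S):
    # v(S) = total weight of the edges inside S.
    return sum(W.get(frozenset(e), 0) for e in combinations(S, 2))

def shapley_closed_form(i, nodes):
    # Theorem 2.23: x_i = 1/2 * sum_{j != i} w(i, j).
    return Fraction(sum(W.get(frozenset({i, j}), 0) for j in nodes if j != i), 2)

def shapley_bruteforce(i, nodes):
    # Average marginal contribution over all orderings (Definition 2.17).
    total, count = Fraction(0), 0
    for order in permutations(nodes):
        S = set()
        for j in order:
            if j == i:
                total += v(S | {i}) - v(S)
                break
            S.add(j)
        count += 1
    return total / count

nodes = [1, 2, 3]
for i in nodes:
    assert shapley_closed_form(i, nodes) == shapley_bruteforce(i, nodes)
print([str(shapley_closed_form(i, nodes)) for i in nodes])  # ['7/2', '1', '3/2']
```

The agreement is no accident: when the orderings are averaged, each j ≠ i precedes i in exactly half of them, which yields the 1/2 factor.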
Theorem 2.24 (Non-emptiness of the Core of a WGG)

The problem of deciding whether the core of a weighted graph game is empty is NP-complete.
2. How to Form a Coalition 5. References
2.5 References
[SandholmLAST98] Sandholm, T., K. Larson, M. Andersson,O. Shehory, and F. Tohmé (1998).Anytime coalition structure generation with worst caseguarantees.In AAAI/IAAI, pp. 46–53.
[SandholmLAST99] Sandholm, T., K. Larson, M. Andersson,O. Shehory, and F. Tohmé (1999).Coalition structure generation with worst case guarantees.Artif. Intell. 111(1-2), 209–238.
[Shoham] Shoham, Yoav (2003).
Multiagent Systems.
Preprint.
3. Reasoning about Coalitions
Chapter 3. Reasoning aboutCoalitions
Reasoning about Coalitions
3.1 Modal Logic
3.2 ATL
3.3 Rational Play (ATLP)
3.4 Imperfect Information
3.5 Model Checking
3.6 References
Outline
In the previous chapter, we showed how coalitions can be formed rationally.
In this chapter, we show how one can use modal logic to reason about their play and its outcome.
3. Reasoning about Coalitions 1. Modal Logic
3.1 Modal Logic
Why logic at all?
a framework for thinking about systems,
makes one realise the implicit assumptions,
... and then we can: investigate them, accept or reject them, relax some of them and still use a part of the formal and conceptual machinery;
reasonably expressive, but simpler and more rigorous than the full language of mathematics.
Why logic at all?
verification: check specification againstimplementation,executable specification,
planning as model checking
Modal logic is an extension of classical logic by two new connectives, □ and ♦: necessity and possibility.

"□p is true" means p is necessarily true, i.e. true in every possible scenario;
"♦p is true" means p is possibly true, i.e. true in at least one possible scenario.
Various modal logics:
knowledge → epistemic logic,
beliefs → doxastic logic,
obligations → deontic logic,
actions → dynamic logic,
time → temporal logic,
and combinations of the above; the most famous multimodal logics are the BDI logics of beliefs, desires, intentions (and time).
Definition 3.1 (Kripke Semantics)

A Kripke model (possible world model) is a tuple

M = ⟨W, R, π⟩, where

W is a set of possible worlds,
R ⊆ W × W is an accessibility relation,
π : W → P(Π) is a valuation of propositions.

M, w |= □ϕ iff for every w′ ∈ W with wRw′ we have that M, w′ |= ϕ.
An Example

[Figure: a Kripke model with three states q0, q1, q2, labelled x = 2, x = 0, x = 1 and connected by accessibility relations for agents s and c.]

x ≐ 1 → K_s x ≐ 1
3. Reasoning about Coalitions 2. ATL
3.2 ATL
ATL: What Agents Can Achieve

ATL: Alternating-time Temporal Logic [Alur et al. 1997]
Temporal logic meets game theory.
Main idea: cooperation modalities

⟨⟨A⟩⟩Φ: coalition A has a collective strategy to enforce Φ
⟨⟨jamesbond⟩⟩♦win:
"James Bond has an infallible plan to eventually win"
⟨⟨jamesbond, bondsgirl⟩⟩ fun U shot:
"James Bond and his girlfriend are able to have fun until someone shoots at them"

"Vanilla" ATL: every temporal operator is preceded by exactly one cooperation modality;
ATL*: no syntactic restrictions.
ATL Models: Concurrent Game Structures
Agents, actions, transitions, atomic propositions
Atomic propositions + interpretation
Actions are abstract
Definition 3.2 (Concurrent Game Structure)

A concurrent game structure is a tuple M = ⟨Agt, Q, π, Act, d, o⟩, where:

Agt: a finite set of all agents,
Q: a set of states,
π: a valuation of propositions,
Act: a finite set of (atomic) actions,
d : Agt × Q → P(Act) defines the actions available to an agent in a state,
o: a deterministic transition function that assigns an outcome state q′ = o(q, α1, …, αk) to each state and tuple of actions.
Example: Robots and Carriage
[Figure: two robots pushing a carriage around a circular track with positions pos0, pos1, pos2; the corresponding concurrent game structure has states q0 (pos0), q1 (pos1), q2 (pos2), with transitions labelled by joint actions from {wait, push} × {wait, push}.]
Definition 3.3 (Strategy)

A strategy is a conditional plan. We represent (memoryless) strategies by functions s_a : Q → Act.

The function out(q, S_A) returns the set of all paths that may result from agents A executing the collective strategy S_A from state q onward.
Definition 3.4 (Semantics of ATL)

M, q |= p iff p ∈ π(q);
M, q |= ϕ ∧ ψ iff M, q |= ϕ and M, q |= ψ;
M, q |= ⟨⟨A⟩⟩Φ iff there is a collective strategy S_A such that, for every path λ ∈ out(q, S_A), we have M, λ |= Φ;
M, λ |= ◯ϕ iff M, λ[1] |= ϕ;
M, λ |= ♦ϕ iff M, λ[i] |= ϕ for some i ≥ 0;
M, λ |= □ϕ iff M, λ[i] |= ϕ for all i ≥ 0;
M, λ |= ϕ U ψ iff M, λ[i] |= ψ for some i ≥ 0, and M, λ[j] |= ϕ for all 0 ≤ j < i.
Example: Robots and Carriage
[Figure: the robots-and-carriage structure again: states q0 (pos0), q1 (pos1), q2 (pos2), transitions labelled by joint actions wait/push.]
pos0 → ⟨⟨1⟩⟩□¬pos1
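Claims like pos0 → ⟨⟨1⟩⟩□¬pos1 can be verified mechanically: ⟨⟨A⟩⟩□ϕ is the greatest fixpoint of Z ↦ ϕ ∩ Pre_A(Z), where Pre_A(Z) contains the states in which A has an action forcing the next state into Z whatever the others do. A minimal sketch for robot 1 (the transition table is our own encoding of the example; the direction each robot pushes is an assumption):

```python
STATES = {"q0", "q1", "q2"}   # carriage at pos0 / pos1 / pos2
ACTS = ("wait", "push")       # actions available to both robots
RING = ["q0", "q1", "q2"]

def step(q, a1, a2):
    # If both robots wait or both push, the carriage stays put;
    # otherwise it moves one position in the pushing robot's direction.
    i = RING.index(q)
    if a1 == a2:
        return q
    return RING[(i + 1) % 3] if a1 == "push" else RING[(i - 1) % 3]

def pre_robot1(Z):
    # States where robot 1 has an action such that, for every action
    # of robot 2, the successor state lies in Z.
    return {q for q in STATES
            if any(all(step(q, a1, a2) in Z for a2 in ACTS) for a1 in ACTS)}

def enforce_globally(phi):
    # Greatest-fixpoint computation for <<1>> [] phi.
    Z = set(phi)
    while True:
        Z_next = phi & pre_robot1(Z)
        if Z_next == Z:
            return Z
        Z = Z_next

winning = enforce_globally(STATES - {"q1"})
print(sorted(winning))  # ['q0', 'q2']: pos0 -> <<1>> [] ~pos1 holds
```

The winning strategy found this way is memoryless: wait at q0, push at q2.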
Temporal operators allow a number of usefulconcepts to be formally specified
safety properties
liveness properties
fairness properties
Safety (maintenance goals):
"something bad will not happen"
"something good will always hold"

Typical example: □¬bankrupt

Usually: □¬ϕ.

In ATL: ⟨⟨os⟩⟩□¬crash
Liveness (achievement goals):
"something good will happen"

Typical example: ♦rich

Usually: ♦ϕ.

In ATL: ⟨⟨alice, bob⟩⟩♦paperAccepted
Fairness (service goals):

"if something is attempted/requested, then it will be successful/allocated"

Typical examples:
□(attempt → ♦success)
□♦attempt → □♦success

In ATL* (!):

⟨⟨prod, dlr⟩⟩□(carRequested → ♦carDelivered)
Connection to Games

Concurrent game structure = generalized extensive game

⟨⟨A⟩⟩γ:
⟨⟨A⟩⟩ splits the agents into proponents and opponents,
γ defines the winning condition
⇒ an infinite 2-player, binary, zero-sum game.

Flexible and compact specification of winning conditions.
Solving a game ≈ checking if M, q |= ⟨⟨A⟩⟩γ.
But: do we really want to consider all the possible plays?
3. Reasoning about Coalitions 3. Rational Play (ATLP)
3.3 Rational Play (ATLP)
Game-theoretical analysis of games:

Solution concepts define the rationality of players:
maxmin
Nash equilibrium
subgame-perfect Nash equilibrium
undominated strategies
Pareto optimality

Then: we assume that players play rationally... and we ask about the outcome of the game under this assumption.

Role of rationality criteria: constrain the possible game moves to "sensible" ones.
[Figure: a matching-pennies game structure with states q0 (start), q1 (money1), q2 (money2), q3 (money1, money2), q4, and q5 (money2); transitions are labelled by joint choices of heads/tails, written H/T for player 1 and h/t for player 2.]
start → ¬〈〈1〉〉♦money1
start → ¬〈〈2〉〉♦money2
ATL + Plausibility (ATLP)
ATL: reasoning about all possible behaviors.
〈〈A〉〉ϕ: agents A have some collective strategy to enforce ϕagainst any response of their opponents.
ATLP: reasoning about plausible behaviors.
〈〈A〉〉ϕ: agents A have a plausible collective strategy toenforce ϕ against any plausible response of theiropponents.
Important
The possible strategies of both A and Agt\A are restricted.
New in ATLP:

(set-pl ω): the set of plausible profiles is set/reset to the strategies described by ω. Only plausible strategy profiles are considered!

Example: (set-pl greedy1)⟨⟨2⟩⟩♦money2
Concurrent game structures with plausibility:

M = ⟨Agt, Q, Π, π, Act, d, δ, Υ, Ω, ‖·‖⟩

Υ ⊆ Σ: a set of (plausible) strategy profiles
Ω = {ω1, ω2, …}: a set of plausibility terms
Example: ω_NE may stand for all Nash equilibria
‖·‖ : Q → (Ω → P(Σ)): a plausibility mapping
Example: ‖ω_NE‖_q = {(confess, confess)}
Outcome = the paths that may occur when agents A perform s_A and only plausible strategy profiles from Υ are played:

out_Υ(q, s_A) = { λ ∈ Q^+ | ∃t ∈ Υ(s_A) ∀i ∈ N: λ[i+1] = δ(λ[i], t(λ[i])) }

P: the players always show the same sides of their coins
s1: always show "heads"
Semantics of ATLP

M, q |= ⟨⟨A⟩⟩γ iff there is a strategy s_A consistent with Υ such that M, λ |= γ for all λ ∈ out_Υ(q, s_A);

M, q |= (set-pl ω)ϕ iff M_ω, q |= ϕ, where the new model M_ω is equal to M except that its set Υ_ω of plausible strategy profiles is set to ‖ω‖_q.
Example: Pennies Game
[Figure: the matching-pennies game structure shown earlier.]
M, q0 |= (set-pl ωNE)〈〈2〉〉♦money2
What is a Nash equilibrium in this game?We need some kind of winning criteria!
Agent 1 "wins" if γ1 ≡ □(¬start → money1) is satisfied. Agent 2 "wins" if γ2 ≡ ♦money2 is satisfied.
[Figure: the matching-pennies game structure shown earlier.]

γ1\γ2   hh     ht     th     tt
HH      1,1    0,0    0,1    0,1
HT      0,0    0,1    0,1    0,1
TH      0,1    0,1    1,1    0,0
TT      0,1    0,1    0,0    0,1
Now we have a qualitative notion of success.
M, q0 |= (set-pl ωNE)⟨⟨2⟩⟩□(¬start → money1),

where ‖ωNE‖_{q0} = "all profiles belonging to the grey cells" (the Nash equilibria of the table).
How to obtain plausibility terms?

Idea
Formulae that describe plausible strategies!

(set-pl σ.θ)ϕ: "suppose that θ characterizes rational strategy profiles; then ϕ holds".

Sometimes quantifiers are needed...

E.g.: (set-pl σ. ∀σ′ dominates(σ, σ′))
Characterization of Nash Equilibrium

σ^a is a's best response to σ (w.r.t. ⃗γ):

BR_a^⃗γ(σ) ≡ (set-pl σ[Agt\{a}]) (⟨⟨a⟩⟩γ_a → (set-pl σ)⟨⟨∅⟩⟩γ_a)

σ is a Nash equilibrium:

NE^⃗γ(σ) ≡ ⋀_{a∈Agt} BR_a^⃗γ(σ)
Example: Pennies Game revisited
γ1 ≡ □(¬start → money1); γ2 ≡ ♦money2.
[Figure: the matching-pennies game structure shown earlier.]

γ1\γ2   hh     ht     th     tt
HH      1,1    0,0    0,1    0,1
HT      0,0    0,1    0,1    0,1
TH      0,1    0,1    1,1    0,0
TT      0,1    0,1    0,0    0,1
M1, q0 |= (set-pl σ.NE_{γ1,γ2}(σ))⟨⟨2⟩⟩□(¬start → money1)
...where NEγ1,γ2(σ) is defined as on the last slide.
Characterizations of Other Solution Concepts

σ is a subgame-perfect Nash equilibrium:

SPN^⃗γ(σ) ≡ ⟨⟨∅⟩⟩□ NE^⃗γ(σ)

σ is Pareto optimal:

PO^⃗γ(σ) ≡ ∀σ′ ( ⋀_{a∈Agt} ((set-pl σ′)⟨⟨∅⟩⟩γ_a → (set-pl σ)⟨⟨∅⟩⟩γ_a)
  ∨ ⋁_{a∈Agt} ((set-pl σ)⟨⟨∅⟩⟩γ_a ∧ ¬(set-pl σ′)⟨⟨∅⟩⟩γ_a) ).
σ is undominated:

UNDOM^⃗γ(σ) ≡ ∀σ1 ∀σ2 ∃σ3 (
  ((set-pl ⟨σ1^a, σ2^{Agt\{a}}⟩)⟨⟨∅⟩⟩γ_a → (set-pl ⟨σ^a, σ2^{Agt\{a}}⟩)⟨⟨∅⟩⟩γ_a)
  ∨ ((set-pl ⟨σ^a, σ3^{Agt\{a}}⟩)⟨⟨∅⟩⟩γ_a ∧ ¬(set-pl ⟨σ1^a, σ3^{Agt\{a}}⟩)⟨⟨∅⟩⟩γ_a) ).
Theorem 3.5
The characterizations coincide withgame-theoretical solution concepts in the class ofgame trees.
3. Reasoning about Coalitions 4. Imperfect Information
3.4 Imperfect Information
How can we reason about extensive games withimperfect information?
Let’s put ATL and epistemic logic in one box.
Problems!
[Figure: a card-game tree: from the start state there are six deals (A,K), (A,Q), (K,A), (K,Q), (Q,A), (Q,K) of cards to agent a and the environment; in each deal a chooses keep or trade, and the outcome states reached by a winning choice are labelled win.]
start → ⟨⟨a⟩⟩♦win
start → K_a⟨⟨a⟩⟩♦win
Does it make sense?
Problem:
Strategic and epistemic abilities are notindependent!
〈〈A〉〉Φ = A can enforce Φ
It should at least mean that A are able toidentify and execute the right strategy!
Executable strategies = uniform strategies
Definition 3.6 (Uniform Strategy)

A strategy s_a is uniform iff it specifies the same choices for indistinguishable situations:

(no recall:) if q ∼_a q′ then s_a(q) = s_a(q′);
(perfect recall:) if λ ≈_a λ′ then s_a(λ) = s_a(λ′), where λ ≈_a λ′ iff λ[i] ∼_a λ′[i] for every i.

A collective strategy is uniform iff it consists only of uniform individual strategies.
Note:
Having a successful strategy does not implyknowing that we have it!
Knowing that a successful strategy exists doesnot imply knowing the strategy itself!
Levels of Strategic Ability
From now on, we restrict our discussion touniform memoryless strategies.
Our cases for 〈〈A〉〉Φ under incompleteinformation:
2 There is σ such that, for every execution of σ,Φ holds
3 A know that there is σ such that, for everyexecution of σ, Φ holds
4 There is σ such that A know that, for everyexecution of σ, Φ holds
Case [4]: knowing how to play

Single-agent case: we take into account the paths starting from indistinguishable states, i.e.,

⋃_{q′ ∈ img(q, ∼_a)} out(q′, s_a).

What about coalitions?
Question: in what sense should they know the strategy? Common knowledge (C_A), mutual knowledge (E_A), or distributed knowledge (D_A)?
Given strategy σ, agents A can have:
Common knowledge that σ is a winningstrategy. This requires the least amount ofadditional communication (agents from A
may agree upon a total order over theircollective strategies at the beginning of thegame and that they will always choose themaximal winning strategy with respect to thisorder)Mutual knowledge that σ is a winningstrategy: everybody in A knows that σ iswinning
Distributed knowledge that σ is a winningstrategy: if the agents share their knowledgeat the current state, they can identify thestrategy as winning“The leader”: the strategy can be identifiedby agent a ∈ A“Headquarters’ committee”: the strategy canbe identified by subgroup A′ ⊆ A
“Consulting company”: the strategy can beidentified by some other group B
Many subtle cases...
Solution: constructive knowledge operators
Constructive Strategic Logic (CSL)

⟨⟨A⟩⟩Φ: A have a uniform memoryless strategy to enforce Φ
K_a⟨⟨a⟩⟩Φ: a has a strategy to enforce Φ, and knows that he has one
For groups of agents: C_A, E_A, D_A, ...
𝕂_a⟨⟨a⟩⟩Φ: a has a strategy to enforce Φ, and knows that this is a winning strategy (a constructive knowledge operator)
For groups of agents: ℂ_A, 𝔼_A, 𝔻_A, ...
Non-standard semantics:

Formulae are evaluated in sets of states.
M,Q |= 〈〈A〉〉Φ: A have a single strategy to enforce Φ from all states in Q.
Additionally:
out(Q, SA) = ⋃q∈Q out(q, SA)
img(Q, R) = ⋃q∈Q img(q, R)
M, q |= ϕ iff M, {q} |= ϕ
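The two lifted operators are plain unions over the member states. A minimal executable sketch (the dictionary-based encoding of relations and outcome sets is a hypothetical toy representation, not taken from the slides):

```python
def img(Q, R):
    """Lift img to a set of states Q: union of the images of each q in Q.
    R maps a state to the set of states related to it."""
    return set().union(*[R.get(q, set()) for q in Q])

def out(Q, outcome_sets):
    """Lift out to a set of states Q: union of the outcome sets of each q in Q,
    for one fixed strategy (outcome sets precomputed per state)."""
    return set().union(*[outcome_sets.get(q, set()) for q in Q])

# Example: an indistinguishability relation where q1 ~ q2
R = {"q0": {"q0"}, "q1": {"q1", "q2"}, "q2": {"q1", "q2"}}
assert img({"q1"}, R) == {"q1", "q2"}
assert img({"q0", "q1"}, R) == {"q0", "q1", "q2"}
```

Note that `img({"q1"}, R)` already contains q2: evaluating a formula at the image set is exactly what forces a strategy to work in all states the agents cannot tell apart.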
Definition 3.7 (Semantics of CSL)

M,Q |= p iff p ∈ π(q) for every q ∈ Q;
M,Q |= ¬ϕ iff not M,Q |= ϕ;
M,Q |= ϕ ∧ ψ iff M,Q |= ϕ and M,Q |= ψ;
M,Q |= 〈〈A〉〉◯ϕ iff there exists SA such that, for every λ ∈ out(Q,SA), we have M,λ[1] |= ϕ;
M,Q |= KAϕ iff M, q |= ϕ for every q ∈ img(Q, ∼K_A) (where K = C, E, D);
M,Q |= 𝕂Aϕ iff M, img(Q, ∼K_A) |= ϕ (where 𝕂 = ℂ, 𝔼, 𝔻 and K = C, E, D, respectively).
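The point of evaluating at the whole image set can be run on a two-state toy example (all names hypothetical): each state individually admits a winning action, yet no single action works across the entire indistinguishability set, so the constructive variant fails.

```python
# Hypothetical one-agent, one-step setting:
# delta[(state, action)] -> successor; 'win' is the set of good successors.
delta = {
    ("q1", "alpha"): "s_win",  ("q1", "beta"): "s_lose",
    ("q2", "alpha"): "s_lose", ("q2", "beta"): "s_win",
}
actions = ["alpha", "beta"]
win = {"s_win"}

def can_win_from(q):
    """Per-state check: some action reaches a winning successor from q."""
    return any(delta[(q, a)] in win for a in actions)

def constructive_can_win(Q):
    """CSL-style check at a *set* of states: one and the same action must
    work from every state in Q (the agent cannot tell them apart)."""
    return any(all(delta[(q, a)] in win for q in Q) for a in actions)

Q = {"q1", "q2"}                        # indistinguishable for the agent
assert all(can_win_from(q) for q in Q)  # knows *that* a winning action exists
assert not constructive_can_win(Q)      # ...but cannot identify one
```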
Example: Simple Market

[Figure: a concurrent epistemic game structure with states q0, q1, q2, ql, qs, labelled by the propositions bad-market, loss, success, oligopoly and s&m; the available actions include wait, subproduction and own-production.]

@ q1:
¬Kc〈〈c〉〉♦success
¬E1,2〈〈c〉〉♦success
¬K1〈〈c〉〉♦success
¬K2〈〈c〉〉♦success
¬D1,2〈〈c〉〉♦success
Theorem 3.8 (Expressivity)
CSL is strictly more expressive than most previousproposals.
Theorem 3.9 (Verification complexity)
The complexity of model checking CSL is minimal.
3. Reasoning about Coalitions 5. Model Checking
3.5 Model Checking
Model Checking Formulae of CTL and ATL

Model checking: Does ϕ hold in model M and state q?
Natural for verification of existing systems; also during design ("prototyping")
Can be used for automated planning
function plan(ϕ)
Returns the subset of Q for which formula ϕ holds, together with a (conditional) plan to achieve ϕ. The plan is sought within the context of a concurrent game structure S = 〈Agt, Q, Π, π, o〉.

case ϕ ∈ Π :
    return {〈q,−〉 | ϕ ∈ π(q)}
case ϕ = ¬ψ :
    P1 := plan(ψ);
    return {〈q,−〉 | q ∉ states(P1)}
case ϕ = ψ1 ∨ ψ2 :
    P1 := plan(ψ1); P2 := plan(ψ2);
    return {〈q,−〉 | q ∈ states(P1) ∪ states(P2)}
case ϕ = 〈〈A〉〉◯ψ :
    return pre(A, states(plan(ψ)))
case ϕ = 〈〈A〉〉□ψ :
    P1 := plan(true); P2 := plan(ψ); Q3 := states(P2);
    while states(P1) ⊈ states(P2)
    do P1 := P2|states(P1); P2 := pre(A, states(P1))|Q3 od;
    return P2|states(P1)
case ϕ = 〈〈A〉〉ψ1 U ψ2 :
    P1 := ∅; Q3 := states(plan(ψ1)); P2 := plan(true)|states(plan(ψ2));
    while states(P2) ⊈ states(P1)
    do P1 := P1 ⊕ P2; P2 := pre(A, states(P1))|Q3 od;
    return P1
end case
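The 〈〈A〉〉ψ1 U ψ2 case is a least-fixpoint iteration of the pre operator. A minimal executable sketch, computing state sets only (no plans); the tiny game structure, agent names and actions below are made up for illustration:

```python
from itertools import product

agents = ["1", "2"]
acts = {"1": ["a", "b"], "2": ["x", "y"]}
states = {"q0", "q1", "q2"}

# Transition function: (state, (action of 1, action of 2)) -> successor.
delta = {
    ("q0", ("a", "x")): "q1", ("q0", ("a", "y")): "q1",
    ("q0", ("b", "x")): "q0", ("q0", ("b", "y")): "q2",
    ("q1", ("a", "x")): "q1", ("q1", ("a", "y")): "q1",
    ("q1", ("b", "x")): "q1", ("q1", ("b", "y")): "q1",
    ("q2", ("a", "x")): "q0", ("q2", ("a", "y")): "q2",
    ("q2", ("b", "x")): "q2", ("q2", ("b", "y")): "q2",
}

def pre(A, Q):
    """States from which coalition A has a joint action that forces the
    successor into Q, whatever the remaining agents do."""
    mine_idx = [i for i, ag in enumerate(agents) if ag in A]
    rest_idx = [i for i, ag in enumerate(agents) if ag not in A]
    result = set()
    for q in states:
        for mine in product(*[acts[agents[i]] for i in mine_idx]):
            moves = []
            for rest in product(*[acts[agents[i]] for i in rest_idx]):
                joint = [None] * len(agents)
                for i, a in zip(mine_idx, mine):
                    joint[i] = a
                for i, a in zip(rest_idx, rest):
                    joint[i] = a
                moves.append(tuple(joint))
            if all(delta[(q, m)] in Q for m in moves):
                result.add(q)
                break
    return result

def until(A, Q1, Q2):
    """Least fixpoint for <<A>> Q1 U Q2, on state sets instead of formulae."""
    result = set(Q2)
    while True:
        bigger = result | (pre(A, result) & Q1)
        if bigger == result:
            return result
        result = bigger

# Agent 1 alone can force reaching q1 from q0 (play "a") and trivially from
# q1, but not from q2:
assert until({"1"}, states, {"q1"}) == {"q0", "q1"}
```

The same pre operator also drives the greatest-fixpoint case 〈〈A〉〉□ψ, with the iteration shrinking from plan(ψ) instead of growing from plan(ψ2).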
Complexity of Model Checking ATL
Theorem (Alur, Kupferman & Henzinger 1998)
ATL model checking is P -complete, and can bedone in time linear in the size of the modeland the length of the formula.
So, let’s model-check!Not as easy as it seems.
Nice results: model checking ATL is tractable.
But: the result is relative to the size of the model and the formula.
Well-known catch: the size of the model is exponential wrt a higher-level description.
Another problem: transitions are labeled with tuples of agents' actions.
So: the number of transitions can be exponential in the number of agents.
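The blow-up is easy to see concretely: with n agents choosing among k actions each, every state has one outgoing transition per joint action, i.e. k^n of them (a back-of-the-envelope sketch, not tied to any particular example's numbers):

```python
def transitions_per_state(n_agents, n_actions):
    """Each transition is labelled by one action per agent, so a state has
    n_actions ** n_agents outgoing transitions."""
    return n_actions ** n_agents

for n in range(1, 7):
    print(n, "agents:", transitions_per_state(n, 3), "transitions per state")
# 6 agents with 3 actions each already yield 729 transitions per state
```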
3 agents/attributes, 12 states, 216 transitions

[Figure: a concurrent game structure for a rockets-and-cargo scenario. The 12 states (1–12) are labelled by the propositions nofuel/fuelOK, roL/roP and caL/caR/caP; the transitions are labelled by action tuples such as 〈load1, nop2, fuel3〉, 〈unload1, unload2, nop3〉 and 〈nop1, nop2, nop3〉.]
Model Checking Temporal & Strategic Logics

      | m, l                | n, k, l             | n_local, k, l
CTL   | P-complete [1]      | P-complete [1]      | PSPACE-complete [2]
ATL   | P-complete [3]      | ∆P_3-complete [5,6] | EXPTIME-complete [8,9]
CSL   | ∆P_2-complete [4,7] | ∆P_3-complete [7]   | PSPACE-complete [9]

[1] Clarke, Emerson & Sistla (1986).
[2] Kupferman, Vardi & Wolper (2000).
[3] Alur, Henzinger & Kupferman (2002).
[4] Schobbens (2004).
[5] Jamroga & Dix (2005).
[6] Laroussinie, Markey & Oreiby (2006).
[7] Jamroga & Dix (2007).
[8] Hoek, Lomuscio & Wooldridge (2006).
[9] Jamroga & Ågotnes (2007).
Main message:
Complexity is very sensitive to the context!In particular, the way we define the input,and measure its size, is crucial.
Even if model checking appears very easy, it canbe very hard.
Still, people do automatic model checking!
LTL: SPIN
CTL/ATL: MOCHA, MCMAS, VeriCS
Even if model checking is theoretically hard, itcan be feasible in practice.
3. Reasoning about Coalitions 6. References
3.6 References
[Alur et al. 2002] R. Alur, T. A. Henzinger, and O. Kupferman.Alternating-time Temporal Logic.Journal of the ACM, 49:672–713, 2002.
[Emerson 1990] E. A. Emerson.Temporal and modal logic.Handbook of Theoretical Computer Science, volume B, 995–1072.Elsevier, 1990.
[Fisher 2006] M. Fisher.
Temporal Logics.
Kluwer, 2006.
[Jamroga and Ågotnes 2007] W. Jamroga and T. Ågotnes.Constructive knowledge: What agents can achieve underincomplete information.Journal of Applied Non-Classical Logics, 17(4):423–475, 2007.