
Analysis of Strategies to Promote Cooperation in Distributed Service Discovery

Guillem Martínez-Cànovas, Universitat de València and ERI-CES, Spain
Elena Del Val, DSIC, Universitat Politècnica de València, Spain
Vicente Botti, DSIC, Universitat Politècnica de València, Spain
Penélope Hernández, Universitat de València and ERI-CES, Spain
Miguel Rebollo, DSIC, Universitat Politècnica de València, Spain

March, 2014

DPEB 02/14

Analysis of Strategies to Promote Cooperation in Distributed Service Discovery

Guillem Martínez-Cànovas^b, Elena Del Val^a, Vicente Botti^a, Penélope Hernández^b, Miguel Rebollo^a

^a Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain

^b ERI-CES, Departamento de Análisis Económico, Universitat de València, Spain

Abstract

New systems can be designed, developed, and managed as societies of agents that interact with each other by offering and providing services. These systems can be viewed as complex networks where nodes are bounded rational agents. In order to deal with complex goals, agents require the cooperation of other agents to locate the required services. In this paper, we present a theoretical model that formalizes the interactions among agents in a search process. We present a repeated game model where the actions involved in the search process have an associated cost. Also, if the task arrives at an agent that can perform it, there is a reward for the agents that collaborated by forwarding queries. We propose a strategy based on random walks, and we study under what conditions this strategy is a Nash Equilibrium. We performed several experiments in order to validate the model and the strategy and to analyze which network structures are more appropriate to promote cooperation.

Keywords: Repeated games, Networks, Nash Equilibrium, Random-walk, Service Discovery

Email addresses: [email protected] (Guillem Martínez-Cànovas), [email protected] (Elena Del Val), [email protected] (Vicente Botti), [email protected] (Penélope Hernández), [email protected] (Miguel Rebollo)

Preprint submitted to Nuclear Physics B March 22, 2014

1. Introduction

Social computing has emerged as a discipline in different fields such as Economics, Psychology, and Computer Science. Computing can be seen as a social activity rather than as an individual one. New systems are designed, developed, and managed as societies of independent entities or agents that offer services and interact with each other by providing and consuming these services [18]. These systems and applications can be represented through formal models from the field of Complex Networks [22]. This area provides a sound theoretical basis for the development of models that help us to reason about how distributed systems are organized [14]. Complex Network models have been used in different contexts such as social networks (collaboration, music, religious networks), economic networks (trade, tourism, employment networks), the Internet (structure and traffic networks), bio-molecular networks, and computer science networks, among others [5, 21].

In systems of this kind, one of the challenges is the design of efficient search strategies to locate the resources or services required by entities in order to deal with complex goals [2, 21, 7]. Taking into account the autonomy of the entities that participate in the search process, three levels of search decentralization can be considered. At the first level, the search process is centralized: there is a common protocol adopted by all the entities of the system, and this protocol dictates the actions that must be followed (i.e., the protocol specifies the entity that starts the process, the sequence of participation of entities, and the target). At the second level, this protocol is relaxed. The entities adopt the protocol and, therefore, carry out the same set of actions, but the search path (i.e., the sequence of entities that participate in the search process) is not specified. At the third level, the search is fully decentralized: there is a protocol adopted by all the entities that specifies the set of available actions, but these entities can decide whether or not they are going to follow the protocol. Imposing the same behavior on all the nodes would not be desirable, since it would take away their individual choice; instead, it is desirable that all the nodes follow the protocol willingly. Therefore, we look for a concept of stability within the strategies of the entities of the system. This concept, which comes from Game Theory, is known as Nash Equilibrium.

As an application scenario, we consider a P2P system that is modeled as a multi-agent system. Agents act on behalf of users, playing the role of a service provider or a service consumer. Agents that play the role of service consumers should be able to locate services, make contract agreements, and receive and present results [28]. Agents that play the role of service providers should be able to manage the access to services and ensure that contracts are fulfilled. By considering the system as a network, it is assumed that all the information is distributed among the agents. Since agents only have a local view of the network, the collaboration of other agents is required in order to reach the target. During a search process, agents can carry out a set of actions: create a task that must be performed by a qualified agent, forward the task to one or several neighbors if they do not know how to solve it, or perform the task if they can provide the required service. The cooperation of agents forwarding queries plays a critical role in the success of the search process [6]. This action facilitates the location of a resource based on local knowledge. However, in our scenario, this action has an associated cost, and agents are free to decide whether or not the forwarding action is profitable to them based on its cost and the expected reward.

In this paper, we propose a model to formally describe the distributed search for services in a network as a game. Specifically, we use the repeated games framework to model both the process that a task follows through the network and the global task-solving process. In the former, each period is a decision stage for the agent who is in possession of the task. In the latter, a task is generated in each period and randomly assigned to an agent in the network.

Our intention is to analyze the relationship between the cost of forwarding the task and the reward that agents obtain later when the task is solved, in order to guarantee that cooperation is a stable behavior in the game. We call this reward α. We establish a bound for the total length of the search process using the Mean First Passage Time (MFPT), which is the average number of steps necessary to go from an agent i to another agent j in the same network. Therefore, the structure of the network also characterizes α through the MFPT, and, consequently, the network structure influences the agents' behavior. In order to verify this, we ran simulations to contrast the possible differences among network structures. The results show that the structure of the network has a significant influence on the emergence of cooperation. The structure that offers the best results is the Scale-Free structure, since its diameter is closer to the limit of steps in the search process than those of the other network structures.

The paper is organized as follows. Section 2 presents a repeated game model to formalize the search process of services in agent networks. In Section 3, some strategies that agents can follow in the repeated game are analyzed in order to determine whether or not they are a Nash Equilibrium. Section 4 describes several experiments we performed to empirically validate the theoretical results in different network structures, as well as to analyze the influence of the network structure and to determine which structure facilitates the emergence of cooperation in the proposed repeated game. Section 5 presents other works related to the emergence of cooperation in distributed environments. Finally, Section 6 presents the conclusions.

2. The Model

Consider a finite set of agents N = {1, 2, . . . , n} that are connected by undirected links in a fixed network g. A link between two agents i and j, with i, j ∈ N, is represented by g_ij = g_ji = 1, and g_ij = 0 means that i and j are not connected. The set of neighbors of agent i is

$$N_i = \{ j \mid g_{ij} = 1 \}$$

For simplicity, we assume that g_ii = 0, so all neighbors in N_i(g) are different from i. The number of neighbors that agent i has (its degree of connection) is denoted by k_i = |N_i(g)|, the cardinality of the set N_i(g). Alternatively, we use the adjacency matrix to represent the network, which is denoted by A. A link between agents i and j is represented by A_ij = 1, and by A_ij = 0 if there is no link.

We consider that each agent has a type (service) θ_i ∈ [0, 1] that represents the degree of ability of agent i. Let ρ ∈ [0, 1] be a task that must be carried out by one of the agents in the network. We assume that there is at least one agent i ∈ N that is suitable to perform the task, which means that, for a fixed ε, its type θ_i is 'similar' to the task ρ, i.e., |θ_i − ρ| ≤ ε.
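To make these definitions concrete, the following minimal sketch (in Python, using networkx and numpy) builds a network, assigns types, and checks which agents are suitable for a given task. The generator, seed, and variable names are our own illustrative assumptions, not part of the model:

    import networkx as nx
    import numpy as np

    rng = np.random.default_rng(42)

    # A fixed undirected network g over N agents; an Erdos-Renyi graph
    # stands in here for any of the topologies studied later.
    N = 100
    g = nx.gnm_random_graph(N, 200, seed=42)
    A = nx.to_numpy_array(g)            # adjacency matrix A
    theta = rng.uniform(0.0, 1.0, N)    # type (service) theta_i of each agent

    rho = rng.uniform(0.0, 1.0)         # a task to be performed
    epsilon = 0.1                       # similarity threshold

    # Agent i is suitable for rho when |theta_i - rho| <= epsilon.
    suitable = [i for i in range(N) if abs(theta[i] - rho) <= epsilon]

    # Neighborhood N_i and degree k_i, read off the network.
    i = 0
    N_i = list(g.neighbors(i))          # N_i = { j : g_ij = 1 }
    k_i = g.degree(i)                   # k_i = |N_i|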

We define an N-person network game Γ^∞_ρ that takes place in g. Each agent has a set of actions A_i = {∅, 1, 2, . . . , N_i, ∞}, where:

• ∞ means the agent itself does the task

• {1, 2, . . . , N_i} means forwarding the task to one of the agent's N_i neighbors

• ∅ means doing nothing


In the first period of the game, a task ρ is assigned to a uniformly randomly selected agent. Beginning at stage 1, the task passes through the network stage by stage. At stage t > 0, each agent chooses one of the above actions depending on whether or not the task is in the agent's node. The action performed by an agent when the task is not in the agent's node is considered to be ∅ or "doing nothing". We associate a null payoff to this action. At stage 1, agent i(1) chooses one action from its action set.

At the initial stage, if the first agent j ∈ {1, . . . , N} chooses to do the task itself because its type is ε-close to the task ρ, then the game ends. Agent j gets a payoff of 1 − |ρ − θ_j|, which depends on its type θ_j and the task ρ. The more similar the type and the task are, the greater the payoff is. The rest of the agents can take any action in their action sets. More specifically, let c > 0 be the cost of forwarding the task. If an agent forwards the task, the agent may eventually earn a payoff α > c if the task ends successfully. If an agent chooses the action ∅, its payoff is 0 if the agent did not forward any task in a previous period, or if the agent did forward the task but the task ended unsuccessfully (i.e., nobody chose ∞).

Formally:

$$u^t_i(a_i, a_{-i}; j) = \begin{cases} 1 - |\rho - \theta_i| & \text{if } a^t_i = \infty \\ -c & \text{if } a^t_i \in \{1, \ldots, N_i\} \\ 0 & \text{if } a^t_i = \emptyset \ \land\ \nexists\, t' < t : a^{t'}_i \in \{1, \ldots, N_i\} \\ \alpha & \text{if } a^t_i = \emptyset \ \land\ \exists\, t' < t : a^{t'}_i \in \{1, \ldots, N_i\} \ \land\ \exists\, j \in N : a^t_j = \infty \end{cases}$$
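The stage payoff can be transcribed directly as a small function. This is a sketch under our own encoding conventions (None for ∅, a neighbor index for forwarding, and the string "do" for ∞); none of these names come from the paper:

    def stage_payoff(action, theta_i, rho, alpha, c,
                     forwarded_before, task_solved):
        """One-stage payoff u_i^t of agent i.

        action           -- None (the empty action), "do" (perform the task),
                            or a neighbor index (forward the task)
        forwarded_before -- True if i forwarded the task at some t' < t
        task_solved      -- True if some agent performed the task
        """
        if action == "do":                 # a_i^t = infinity
            return 1.0 - abs(rho - theta_i)
        if action is not None:             # a_i^t in {1, ..., N_i}: forward
            return -c
        # a_i^t = empty action: alpha if i forwarded earlier and the task
        # was eventually solved, 0 otherwise.
        return alpha if (forwarded_before and task_solved) else 0.0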

When choosing actions at stage t, agents are informed of the actions chosen in previous stages of the game; we therefore consider a complete-information set-up. Formally, let H_t, t = 1, . . ., be the Cartesian product of A, t − 1 times, i.e., H_t = A^{t−1}, with the common set-theoretic identification A^0 = ∅, and let H = ∪_{t≥1} H_t. A pure strategy σ_i for agent i is a mapping from H to A_i, σ_i : H → A_i. Obviously, H is a disjoint union of the H_t, t = 1, . . . , T, and we write σ_it : H_t → A_i for the restriction of σ_i to H_t.

The payoff function of each agent, when the game is repeated a certain number of times and the task may start at any agent, is formalized as:

$$u_i(\sigma_i, \sigma_{-i}) = \frac{1}{N} \sum_{j=1}^{N} \sum_{t=1}^{\infty} u^t_i(a_i, a_{-i}; j)$$

This induces an order in the payoffs for each strategy σ_i that agent i chooses given the action profile σ_{-i}, which allows us to rank the strategies and calculate the Nash Equilibria of the game.

In order to characterize the set of feasible and individually rational payoffs that defines the set of equilibrium payoffs of the repeated game (i.e., the equilibrium payoffs attained as a consequence of the well-known Folk Theorem) [9], we have to establish the min-max level in pure actions. The min-max strategy for agent i is the one that guarantees the highest possible payoff in the action profile that is the worst-case scenario for agent i. This is sometimes called the reservation payoff. Formally,

$$\underline{u}_i = \min_{a_{-i}} \max_{a_i} u_i(a_i, a_{-i}), \qquad a_i \in A_i,\ a_{-i} \in A_{-i} \qquad (1)$$

In our set-up, for the one-shot payoff function the min-max strategy is ∅ and, therefore, the min-max level is 0. An action profile (σ*_1, . . . , σ*_n) is a Nash Equilibrium in the network game Γ if and only if

$$u_i(\theta_i, \sigma^*_1, \ldots, \sigma^*_n) \geq u_i(\theta_i, \sigma^*_1, \ldots, \sigma_i, \ldots, \sigma^*_n), \qquad \forall\, \sigma_i \neq \sigma^*_i,\ i \in N,\ \theta_i \in \Theta.$$

We define the set of feasible payoff vectors as

$$F := \mathrm{conv}\{u(a),\ a \in A\}.$$

The set of strictly individually rational payoff vectors (relative to the min-max value in pure strategies) is

$$V := \{x = (x_1, \ldots, x_n) \in F : x_i > \underline{u}_i\ \forall i \in N\}.$$

Folk theorems in the context of game theory establish feasible payoffs for repeated games. Each Folk Theorem considers a subclass of games and identifies a set of payoffs that are feasible under an equilibrium strategy profile. Since there are many possible subclasses of games and several concepts of equilibrium, there are many Folk Theorems.

The Folk Theorem states that any payoff profile in V can be implemented as a Nash equilibrium payoff if the discount factor δ is large enough. The intuition behind the Folk Theorem is that any combination of payoffs in which each agent gets at least its min-max payoff is sustainable in a repeated game, provided each agent believes the game will be repeated with high probability. For instance, the punishment imposed on an agent who deviates is that the agent will be held to its min-max payoff for all subsequent rounds of the game. Therefore, the short-term gain obtained by deviating is offset by the loss of payoff in future rounds. Of course, there may be other, less radical (less grim) strategies that also lead to the feasibility of some of those payoffs. The good news from the Folk Theorem is that a wide range of payoffs may be sustainable in equilibrium. The bad news is that there may exist multiple equilibria.

3. Equilibrium Strategies

In this section, we study which strategy profiles are a Nash Equilibrium in the game Γ^∞_ρ. We start by defining the Nobody works strategy, which basically consists of doing nothing, even in the case that an agent can perform the task. We prove that the Nobody works strategy is not a Nash Equilibrium in the game. Then we consider the so-called random-walk strategy. In this strategy, if an agent is not able to solve the task, it uniformly and randomly chooses one of its neighbors to forward the task to. We establish the conditions under which the strategy profile where every agent plays the random-walk strategy is a Nash Equilibrium. We then enrich the model by adding a threshold on the number of times that a task can be forwarded, and we also study under which conditions this strategy is a Nash Equilibrium.

3.1. Nobody works

One possible strategy is the strategy we call Nobody works, in which every agent always chooses the action ∅ and consequently gets a payoff of 0. One of our model's assumptions is that for every possible task ρ there exists an agent that is able to perform it. Let that agent be i, and let its type be θ_i. From our payoff criterion, we can state that in some period t the task will start at agent i and agent i will be able to solve it. In that case, if agent i chooses the ∞ action (doing the task), agent i gets a payoff of 1 − |ρ − θ_i| > 0; therefore, the Nobody works strategy is not an equilibrium strategy.

3.2. Random Walk

In this subsection, we study the case where all agents play a behavioral strategy σ^∞_i : H_{t−1} → Δ(A_i), which leads to the well-known dynamics of a "random walk". We call this behavioral strategy the random-walk strategy.

Let us formally define the random-walk strategy. At each stage t, agent i performs one of the three possible actions:

• the ∅ action, if no task arrives.

• the ∞ action, if agent i's type θ_i is close to the task ρ_t.

• the forwarding action, when the task arrives and agent i cannot solve it; agent i uniformly and randomly chooses one of its neighbors to forward the task to.

This strategy is a "myopic" strategy, since agents do not update the expected payoff. Each agent i uniformly and randomly chooses one of its neighbors to continue searching for the agent that can solve the task ρ. Recall that in our game, for every task ρ there exists an agent k* that can do the task (i.e., |θ_{k*} − ρ| ≤ ε). As a consequence of the random-walk strategy, we can assert the existence of a finite time 0 ≤ t < ∞ and an agent k* ∈ {1, . . . , N} such that a^t_{k*} = ∞. Therefore, given a task ρ, the achieved payoff of each agent i first depends on whether or not agent i was part of the search path to the agent that did the task. If agent i did not take part in the procedure, then agent i gets 0, which is the min-max value.

Now, suppose that i is part of the path. Let us define some parameters that take part in the utility function. We refer to the probability of an agent being capable of performing the task as γ_i, and since it is the same for all agents, we simply call it γ. P^∞_x is the probability that the task reaches a specific agent x in the long run, and the previously defined parameters α and c are the reward and the cost of forwarding the task, respectively.

Hence, the utility function of the game Γ^∞_ρ for agent i is:

$$u_i(\theta_i, \sigma_i, \sigma_{-i}) = P^\infty_i\Big(\gamma(1 - |\rho - \theta_i|) + (1-\gamma)\big(P^\infty_{k^*}(\alpha - c) + (1 - P^\infty_{k^*})(-c)\big)\Big) \qquad (2)$$
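A minimal simulation of the random-walk dynamics behind Equation (2) can look as follows. The parameter values, the step cap, and the isolated-node guard are our own assumptions; the success of a search is observed empirically rather than computed from P^∞:

    import networkx as nx
    import numpy as np

    rng = np.random.default_rng(0)
    N, epsilon, alpha, c = 100, 0.1, 10.0, 5.0
    g = nx.gnm_random_graph(N, 200, seed=0)
    theta = rng.uniform(0.0, 1.0, N)

    def random_walk_search(rho, start, max_steps=10_000):
        """Forward rho from neighbor to neighbor until a suitable agent
        performs it; returns the forwarding path and the solver (or None)."""
        i, path = start, []
        for _ in range(max_steps):
            if abs(theta[i] - rho) <= epsilon:   # i plays the infinity action
                return path, i
            nbrs = list(g.neighbors(i))
            if not nbrs:                         # isolated node: search dies
                return path, None
            path.append(i)                       # i pays c and forwards
            i = int(rng.choice(nbrs))
        return path, None                        # truncated, no solver found

    rho = rng.uniform(0.0, 1.0)
    path, solver = random_walk_search(rho, start=int(rng.integers(N)))
    if solver is not None:
        # Every forwarder on the path nets alpha - c once the task is solved.
        print(f"solved by {solver} after {len(path)} forwards")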

The following proposition states that the strategy profile in which every agent plays a random-walk strategy is a Nash Equilibrium in the game Γ^∞_ρ.

Proposition 1. The strategy profile (σ^∞_1, . . . , σ^∞_n) is a Nash Equilibrium in the game Γ^∞_ρ.

Proof. Let i be an agent that selects a strategy σ^∅_i ≠ σ^∞_i; let t be a time period such that the task ρ arrives at i with |θ_i − ρ| < ε, and let t′ denote any other time period, t′ ≠ t. The strategy σ^∅_i is formally defined as

$$(\sigma^\emptyset_i)^t : H_t \to A_i, \qquad \begin{cases} (\sigma^\emptyset_i)^{t} = \infty & \\ (\sigma^\emptyset_i)^{t'} = \emptyset & \forall\, t' \neq t \end{cases} \qquad (3)$$

When selecting that strategy, if agent i is able to perform the task, it does so, and that is the only profit that agent i eventually gets, because it never forwards the task. Consequently, agent i's utility function is

$$u_i(\theta_i, \sigma^\emptyset_i, \sigma^\infty_{-i}) = P^\infty_i\big(\gamma(1 - |\rho - \theta_i|)\big) \qquad (4)$$

In order to prove that the strategy profile (σ^∞_1, . . . , σ^∞_n) is a Nash Equilibrium, the utility function described in (2) must be greater than or equal to the utility function specified in (4):

$$P^\infty_i\Big(\gamma(1 - |\rho - \theta_i|) + (1-\gamma)\big(P^\infty_{k^*}(\alpha - c) + (1 - P^\infty_{k^*})(-c)\big)\Big) \geq P^\infty_i\big(\gamma(1 - |\rho - \theta_i|)\big)$$

$$(1-\gamma)\big(P^\infty_{k^*}(\alpha - c) + (1 - P^\infty_{k^*})(-c)\big) \geq 0$$

Since 0 < γ < 1, (1 − γ) is always positive. Then

$$P^\infty_{k^*}(\alpha - c) + (1 - P^\infty_{k^*})(-c) \geq 0$$

$$\alpha \geq \frac{c}{P^\infty_{k^*}}$$

By the definition of random-walk dynamics, in the long term a task ρ will always find the agent k* that is capable of solving it, so P^∞_{k*} = 1. Since α > c by assumption, the strategy profile (σ^∞_1, . . . , σ^∞_n) is a Nash Equilibrium in the game Γ^∞_ρ.

Now we enrich the model by introducing a "time" condition for solving the task. It makes sense to impose a time limit within which the task must be solved, and to reward efforts to solve or forward the task only in that case (i.e., efforts are only rewarded if the task is solved within a certain number of time periods).

3.3. Random-walk strategy with a finite number of steps

An interesting measure for establishing the limit on the number of steps that a task ρ can take to be solved is the Mean First Passage Time (hereafter MFPT). The MFPT between two nodes i and j of a network is defined as the average number of steps needed to go from i to j in that particular network [32]. Therefore, we define the strategy σ^τ_i for an agent i, which consists of forwarding the task to a randomly selected neighbor only if the task has advanced a number of times t_i < τ, where τ is the average MFPT of the network (which we formally define below).

From equation 34 of [32], we know that the MFPT from any agent to aparticular agent j in a network is defined as:

$$\langle T_j \rangle = \frac{1}{1-\pi_j} \sum_{i=1}^{N} \pi_i T_{ij} = \frac{1}{1-\pi_j} \sum_{k=2}^{N} \left( \frac{1}{1-\lambda_k}\, \psi^2_{kj} \sum_{i=1}^{N} \frac{k_i}{k_j} \right) - \frac{1}{1-\pi_j} \sum_{k=2}^{N} \left( \frac{1}{1-\lambda_k}\, \psi_{kj} \frac{\sqrt{K}}{k_j} \sum_{i=1}^{N} \psi_{ki} \sqrt{\frac{k_i}{K}} \right)$$

where k_i and k_j are the degrees of agents i and j, respectively, ψ_k is the kth eigenvector of S corresponding to the kth eigenvalue λ_k (with $S = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$, A being the adjacency matrix of the network, D being the diagonal degree matrix of the network, and the eigenvalues rearranged so that $1 = \lambda_1 > \lambda_2 \geq \lambda_3 \geq \ldots \geq \lambda_N \geq -1$), and $\pi_j = d_j/K$ (with $K = \sum_{j=1}^{N} d_j$).

It follows from Eq. (6) of the same article that $\sum_{i=1}^{N} \psi_{ki}\sqrt{k_i/K} = \sum_{i=1}^{N} \psi_{ki}\psi_{1i} = 0$. Thus, the second term is equal to 0, so

$$\langle T_j \rangle = \frac{1}{1-\pi_j} \sum_{k=2}^{N} \left( \frac{1}{1-\lambda_k}\, \psi^2_{kj} \sum_{i=1}^{N} \frac{k_i}{k_j} \right) = \frac{1}{1-\pi_j} \frac{K}{k_j} \sum_{k=2}^{N} \frac{1}{1-\lambda_k}\, \psi^2_{kj} \qquad (5)$$

We define the maximum number of steps that can be taken for a task ρ to be solved as the average of ⟨T_j⟩ over all j ∈ N, which we denote by τ. Formally:

$$\tau = \frac{1}{N} \sum_{j=1}^{N} \langle T_j \rangle \qquad (6)$$
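Equations (5) and (6) can be evaluated directly with numpy's symmetric eigendecomposition. The sketch below is our own transcription of the formulas above, not code from [32], and it assumes a connected graph (so that λ_2 < 1):

    import networkx as nx
    import numpy as np

    g = nx.gnm_random_graph(100, 200, seed=1)   # assumed connected
    A = nx.to_numpy_array(g)
    d = A.sum(axis=1)                    # degrees d_j (= k_j)
    K = d.sum()                          # K = sum_j d_j
    pi = d / K                           # pi_j = d_j / K

    # S = D^{-1/2} A D^{-1/2}; eigh returns eigenvalues in ascending order.
    S = A / np.sqrt(np.outer(d, d))
    lam, psi = np.linalg.eigh(S)
    lam, psi = lam[::-1], psi[:, ::-1]   # reorder so lambda_1 = 1 comes first

    # <T_j> = (1/(1 - pi_j)) (K/k_j) sum_{k>=2} psi_{kj}^2 / (1 - lambda_k)
    T = np.array([(K / d[j]) / (1.0 - pi[j])
                  * np.sum(psi[j, 1:] ** 2 / (1.0 - lam[1:]))
                  for j in range(len(d))])

    tau = T.mean()                       # Equation (6)
    print(f"tau = {tau:.2f}, Log(MFPT) = {np.log10(tau):.2f}")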

We now define a new game Γ^τ_ρ. In this game, if it takes more than τ steps to solve the task, the game ends and the collaborating agents get no reward. In the following, we explain the equilibrium strategies for the game Γ^τ_ρ.

Let us define some new parameters that play a role in the new game: t_i is the number of steps a task has advanced until it reaches agent i; Q^{τ−t_i}_{i,k*} is the probability that the task reaches an agent k* starting from agent i in τ − t_i or fewer steps; and P^τ_{s,i} is the probability that the task reaches an agent i starting from agent s in τ or fewer steps.

In order to formally define P^τ_{s,i}, we use the adjacency matrix of the network (denoted by A) and one of its properties, which states that the entries (i, j) of A^n indicate the number of paths of length n between i and j in that network. P^τ_{s,i} can be defined as the number of paths of length τ or less between s and i divided by the total number of paths of the same length starting at s and ending at any possible agent j of the network. Formally:

$$P^{\tau}_{s,i} = \frac{\displaystyle\sum_{t=1}^{\tau} (A^t)_{si}}{\displaystyle\sum_{j=1}^{N} \sum_{t=1}^{\tau} (A^t)_{sj}} \qquad (7)$$
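Equation (7) translates into a few lines of matrix arithmetic. This is a sketch with our own variable names; it treats τ as an integer number of steps and, as with matrix powers generally, counts walks rather than simple paths:

    import numpy as np

    def path_count_probability(A, s, tau):
        """P^tau_{s,i} for all i: walks of length <= tau from s to i,
        normalized over all walks of length <= tau leaving s (Equation 7)."""
        counts = np.zeros(A.shape[0])
        At = np.eye(A.shape[0])
        for _ in range(int(tau)):
            At = At @ A                  # (A^t)_{sj}: number of length-t walks
            counts += At[s, :]
        return counts / counts.sum()

    # Example: a 4-cycle; probabilities of reaching each node from node 0
    # within 3 steps.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    print(path_count_probability(A, s=0, tau=3))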

Let us define r^{τ−t_i}_i, or simply r_i, as the number of agents that the task can reach starting from i in τ − t_i or fewer steps. For this purpose, we use the reachability matrix, denoted by R, which is defined as

$$\forall i, j \in N, \quad (R^{\tau-t_i})_{ij} = \begin{cases} 1 & \text{if there exists at least one path between } i \text{ and } j \text{ of length } \tau - t_i \text{ or less} \\ 0 & \text{otherwise} \end{cases} \qquad (8)$$

The process for obtaining R from the adjacency matrix is straightforward. Then, we formally define r_i as

$$r_i = \sum_{j=1}^{N} (R^{\tau-t_i})_{ij} \qquad (9)$$

To define Q^{τ−t_i}_{i,k*}, we compute the probability that none of the agents reachable from agent i is able to solve the task, which is (1 − γ)^{r_i}. Then Q^{τ−t_i}_{i,k*} is defined as

$$Q^{\tau-t_i}_{i,k^*} = 1 - (1-\gamma)^{r_i} \qquad (10)$$

Hence, the utility function of the game Γ^τ_ρ for an agent i when all agents play the strategy σ^τ is

$$u_i(\theta_i, \sigma^\tau_i, \sigma^\tau_{-i}) = P^\tau_{s,i}\Big(\gamma(1 - |\rho - \theta_i|) + (1-\gamma)\big(Q^{\tau-t_i}_{i,k^*}(\alpha - c) + (1 - Q^{\tau-t_i}_{i,k^*})(-c)\big)\Big) \qquad (11)$$

Now we study a bound for α for which the strategy profile (σ^τ_1, . . . , σ^τ_n) is a Nash Equilibrium in the game Γ^τ_ρ.

Proposition 2. If $\alpha_i \geq \dfrac{c}{1 - (1-\gamma)^{r_i}}$, the strategy profile (σ^τ_1, . . . , σ^τ_n) is a Nash Equilibrium in the game Γ^τ_ρ.

Proof. The proof proceeds exactly like the proof of Proposition 1, but substituting the proper probabilities P^τ_{s,i} and Q^{τ−t_i}_{i,k*}. Finally, we have

$$\alpha \geq \frac{c}{Q^{\tau-t_i}_{i,k^*}} \qquad (12)$$

By substituting (10) in (12), we have

$$\alpha \geq \frac{c}{1 - (1-\gamma)^{r_i}} \qquad (13)$$

The fact that this bound depends on r_i implies that each agent has its own bound for α, which depends on the agent's connectivity (α becomes α_i).
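The per-agent bound in Equation (13) follows from the reachability matrix of Equations (8)-(10). The sketch below is our own transcription: here horizon stands for τ − t_i, and the reachable set counts walks, including those that return to i:

    import numpy as np

    def alpha_bound(A, gamma, c, horizon):
        """Equation (13): alpha_i = c / (1 - (1 - gamma)^{r_i}), where r_i
        counts the agents reachable from i in <= horizon steps."""
        n = A.shape[0]
        step = np.eye(n, dtype=int)
        reach = np.zeros((n, n), dtype=bool)
        Abin = (A > 0).astype(int)
        for _ in range(int(horizon)):
            step = np.minimum(step @ Abin, 1)  # does a length-t walk exist?
            reach |= step.astype(bool)         # (R^horizon)_{ij}, Equation (8)
        r = reach.sum(axis=1)                  # r_i, Equation (9)
        q = 1.0 - (1.0 - gamma) ** r           # Q, Equation (10)
        return c / q                           # per-agent bound alpha_i

    A = np.array([[0, 1, 0, 0],                # path graph 0-1-2-3
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    # Better-connected agents reach more peers, hence need a lower alpha_i.
    print(alpha_bound(A, gamma=0.1, c=5.0, horizon=2))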

This means that the network structure has a deep impact on the α bounds. In highly clustered networks, r_i is high for each agent i and, consequently, α_i is low. The opposite occurs in networks with low clustering (e.g., Erdős-Rényi networks), where, in addition, α_i is uniform among all agents. In networks with a non-uniform degree distribution (e.g., scale-free networks), the average α may be similar to the α for Erdős-Rényi networks, but it varies greatly between hub and terminal agents.

4. Experiments

In this section, we validate the proposed mathematical model for service search in different network structures. Specifically, we focus on how the structural parameters of the networks influence the reward α required to promote cooperation (i.e., forwarding tasks) and to improve the success of the search process. The structural parameters are represented by the parameter r_i (see Equation 9). For the evaluation, we compare the success rate of the searches and the average agent utility in different network structures. The network structures considered in the experiments are: Random, Scale-Free, and Small-World networks.

topology      N    edges   avDg  std   clust  dens  τ=Log(MFPT)  diameter  diameter/τ
Random        100  200.00  4.00  1.46  0.02   0.04  5.00         7.09      1.418
Random        100  300.00  6.00  1.82  0.03   0.06  4.00         5.00      1.25
Scale-Free    100  197.00  3.94  3.94  0.02   0.04  5.00         5.09      1.09
Scale-Free    100  293.00  5.86  5.08  0.04   0.06  5.00         4.27      0.854
Small-World   100  200.00  4.00  1.02  0.08   0.04  5.00         7.55      1.55
Small-World   100  300.00  6.00  1.27  0.10   0.06  4.00         5.45      1.36

Table 1: Network structural properties: topology, number of agents (N), number of edges, average degree of connection of agents (avDg), standard deviation of the degree distribution (std), clustering (clust), density (dens), τ = Log(Mean First Passage Time), diameter, and the ratio diameter/τ.

4.1. Experimental Design

Each network in the experiments is undirected and has 100 agents. We also tested other network sizes, but the conclusions were similar to those obtained with 100 agents, so we do not include them here. The structural properties of the networks are shown in Table 1. Each agent has a type (service) θ_i ∈ [0, 1] that represents the degree of ability of the agent. The degree of ability is uniformly distributed among the agents. A task ρ is generated and assigned to an agent following a uniform probability distribution. Each agent has a set of actions to choose from when it receives a task: doing the task if the difference between its ability and the task ρ is under a threshold (|θ_i − ρ| < ε), forwarding the task based on the expected reward (Formula 13), or doing nothing. The forwarding action has an associated cost c = 5. A task ρ is successfully solved when it reaches an agent whose ability is similar enough to the task (|θ_i − ρ| < ε) in less than τ steps. For the experiments, the value of τ is Log(MFPT). We use this concave transformation to obtain clear results and to illustrate how the network structure impacts this parameter. The value of the ε parameter is 0.1. We executed each experiment over 10 networks of each type, and we generated 1,000 tasks ρ in each network.
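In outline, the experimental loop can be reproduced as follows. The topology generators, the seeds, and the simplified forwarding rule (forward iff α > c, in place of the full per-agent test of Equation 13) are our own assumptions, chosen to mirror the description above:

    import networkx as nx
    import numpy as np

    rng = np.random.default_rng(7)
    N, c, epsilon, tau = 100, 5.0, 0.1, 5.0    # tau ~ Log(MFPT), cf. Table 1

    topologies = {
        "Random":      lambda: nx.gnm_random_graph(N, 200, seed=7),
        "Scale-Free":  lambda: nx.barabasi_albert_graph(N, 2, seed=7),
        "Small-World": lambda: nx.watts_strogatz_graph(N, 4, 0.1, seed=7),
    }

    def success_rate(g, alpha, n_tasks=1000):
        theta = rng.uniform(0.0, 1.0, N)
        solved = 0
        for _ in range(n_tasks):
            rho, i, steps = rng.uniform(0.0, 1.0), int(rng.integers(N)), 0
            while steps <= tau:
                if abs(theta[i] - rho) < epsilon:
                    solved += 1
                    break
                if alpha <= c:        # forwarding not profitable: do nothing
                    break
                nbrs = list(g.neighbors(i))
                if not nbrs:
                    break
                i = int(rng.choice(nbrs))
                steps += 1
        return solved / n_tasks

    for name, make in topologies.items():
        g = make()
        for alpha in (4.9999, 5.0, 5.0005):
            print(f"{name:12s} alpha={alpha}: {success_rate(g, alpha):.1%}")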


[Figure 1: two panels plotting the percentage of successful searches against the reward α, for Random, Scale-Free, and Small-World networks.]

Figure 1: Influence of α values on the percentage of successful searches in different network structures of 100 agents. (Left) Network structures with an average degree of connection of 4. (Right) Network structures with an average degree of connection of 6.

4.2. The Influence of Structural Properties and α

In this section, we analyze the influence of the network structural properties and the reward α on the search process. We consider values for α in the range [4.99975, 5.0005] in order to see the effects on the search process (see Figure 1). In this interval, we observe the effects of considering values for α that are lower than, equal to, and greater than the cost of the forwarding action (c = 5). With values of α lower than or equal to c, the success rate was around 20%. This percentage represents the number of tasks that can be solved directly by the first agent that receives the task. Values of α that are strictly greater than the cost of the forwarding action (α > c) produce an increase in the success rate of the search process (see Figure 1, Left). The structural properties of the network considered in the search process have an important influence on the success rate. We observe that there are significant differences between the results in Scale-Free, Random, and Small-World networks. Scale-Free networks provided better results than the other networks, since their structural properties increase the number of agents that can be reached. The diameter of the network is closer to τ than the diameters of the other network models (see Table 1). Another example of the influence of structural properties is the average degree of connection of the agents. As the average degree of connection increases, the number of reachable agents increases, and so does the probability of finding the required agent. Therefore, agents estimate that it is profitable to forward the task to their neighbors (see Figure 1, Right).


[Figure 2: two panels plotting the average agent profit against the reward α, for Random, Scale-Free, and Small-World networks.]

Figure 2: Influence of α values on the utility in different network structures of 100 agents. (Left) Network structures with an average degree of connection of 4. (Right) Network structures with an average degree of connection of 6.

The structural properties and the reward value α also influence the average utility obtained by an agent. In this experiment, we analyzed values of α in the range [0, 60]. We considered a wider range in order to see the values that make the average agent utility positive and how this utility evolves (see Figure 2). Values of α lower than or equal to c provided a utility equal to 0, since agents estimate that the expected reward is not enough to compensate the cost of the forwarding action. Values of α in the interval (5, 10] made some agents estimate that the forwarding action was going to be profitable. Although this value of α was enough for agents to consider forwarding tasks, the resulting utility was not always positive for all the agents. Therefore, the average utility had a negative value. The interval (5, 10] for α values could be considered risky. The average utility became positive with α values greater than 10 (see Figure 2, Left). In this experiment, the network structure also had a significant influence. The Scale-Free network provided higher values of utility than the Random or Small-World networks. This difference was also observed when we increased the average degree of connection of the agents (see Figure 2, Right).

5. Related Work

Random-walk strategies have been presented as an alternative search strategy to flooding strategies [4, 31, 17], since they reduce the traffic in the system and provide better results [19, 33]. A random-walk search algorithm selects a neighbor at random each time to forward the message to [10].


There are many search proposals that navigate networks using random walks, since they do not require specific knowledge and can be applied in several domains. Some of these works have introduced modifications, such as using random walks from multiple sources [34, 25] or adding information about routes [15, 1, 3], in an attempt to improve the search efficiency. The influence of network structural properties on random walks has also been studied. For instance, some of the properties that have been evaluated are: the mean first-passage time (MFPT) from one node to another [32, 27, 30], how the structural heterogeneity affects the nature of the diffusive and relaxation dynamics of the random walk [23], and the biased random-walk process based on preferential transition probability [8].

One of the common assumptions in network search is that all the agents have homogeneous behavior and that all of them are going to cooperate by forwarding messages. However, this does not correspond to real scenarios. In real large-scale networks, decisions are often made by each agent independently, based on that agent's preferences or objectives. Game-theoretic models are well suited to explain these scenarios [20]. Game theory studies the interaction of autonomous agents that make their own decisions while trying to optimize their goals. Game Theory provides a suite of tools that may be effectively used in modeling interactions among agents with different behaviors [29].

There are works in the area of Game Theory that focus on the routing problem in networks where there are selfish agents. Specifically, this problem has been studied in wireless and ad-hoc networks [29, 20]. Numerous approaches use reputation [13] (i.e., techniques based on monitoring the nodes' behavior from a cooperation perspective) or price-based techniques [12] (i.e., a node receives a payment for its cooperation in forwarding network messages and also pays other nodes that participate in forwarding its messages) to deal with selfish agents. One of the drawbacks of reputation systems is that nodes whose reputation values are higher than a threshold are treated equally. Therefore, a node can maintain its reputation value just above the threshold to obtain the same benefit as nodes with higher reputation levels. One of the problems of price-based techniques is that they are not fair to nodes located in regions with low traffic, which have few opportunities to earn credit. Li et al. [16] integrate both techniques and propose a game theory model for analyzing the integrated system. However, this approach does not consider the influence of the underlying structure on the emergence of cooperation.


To understand the social behavior of these systems, it is important to consider the network structure. There are several works that analyze the influence of the network structure when the agents of the network do not follow homogeneous behavior. These works study how structural parameters such as clustering or degree distribution affect the emergence and maintenance of cooperative behavior among agents [26, 24]. Hofmann et al. [11] present a critical study about the evolution of cooperation in agent societies. The authors conclude that cooperation depends on parameters such as the network topology, the interaction game, the state update rules, and the initial fraction of cooperators.

The proposal presented in this paper analyzes, through a game theory model, the problem of the emergence of cooperation in the context of decentralized search. It differs from previous approaches in several ways. First, we consider a game that fits the characteristics of decentralized search better than other games proposed in the literature, which are based on the often-studied Prisoner's Dilemma. Second, agents' decisions about cooperation are based on a utility function that takes into account the network topology properties. Moreover, the utility function also considers a limit on the number of possible steps to reach the target agent. This feature is important in distributed systems in order to avoid traffic overhead. Third, the strategy that agents follow is based on a search mechanism that is often used in network navigation and does not require specific domain knowledge. Therefore, the model can be easily applied in different search contexts. Finally, in order to promote cooperation, instead of using reputation or price-based mechanisms, we use a mechanism based on incentives provided by the system. We formally and experimentally analyze the minimum reward required for the strategy to be a Nash Equilibrium.

6. Conclusions

In this paper, we have analyzed the distributed search for resources in networks that model societies of agents. These agents offer services and interact with each other by providing and consuming these services. The actions of these agents have an associated cost, and not all of the agents have homogeneous behavior. We have proposed the use of Game Theory to formally model the interactions between the agents as a repeated game, and we have described a strategy that is based on the simple, well-known random walk. We have also established the conditions under which the random-walk strategy is a Nash Equilibrium. The proposed strategy has been extended with a constraint for contexts where the number of times a task can be forwarded is restricted. The conditions under which this extended strategy is a Nash Equilibrium have also been analyzed. Finally, we validated the proposed model and the latter strategy in different types of networks. The results show that, in order to promote cooperation among the agents of the network, the expected reward should be greater than the cost of the forwarding action. Moreover, the network structure has an important influence on the success of the search process and on the average utility of the system. Scale-Free networks facilitate the success of the search process because their structural properties increase the number of agents that can be reached. The experiments also show that even though certain values of the reward are enough to promote cooperation, these values are not enough to obtain a positive average utility value.

Acknowledgements

This work is supported by projects TIN2011-27652-C03-01 and TIN2012-36586-C03-01 and by PROMETEOII/2013/019.


[1] L. Backstrom and J. Leskovec. Supervised random walks: Predicting and recommending links in social networks. 2011.

[2] X. Ban, J. Gao, and A. van de Rijt. Navigation in real-world complex networks through embedding in latent spaces. In Meeting on Algorithm Engineering and Experiments (ALENEX 10), pages 138–148, 2010.

[3] D. O. Cajueiro. Optimal navigation in complex networks. Physical Review E, 79(4), 2009.

[4] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker. Making gnutella-like P2P systems scalable. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, SIGCOMM '03, pages 407–418. ACM, 2003.

[5] L. F. Costa, O. N. Oliveira Jr., G. Travieso, F. A. Rodrigues, P. R. Villas Boas, L. Antiqueira, M. P. Viana, and L. E. C. Rocha. Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Advances in Physics, 60(3):329–412, 2011.

[6] E. Del Val, M. Rebollo, and V. Botti. Promoting cooperation in service-oriented MAS through social plasticity and incentives. Journal of Systems and Software, 86(2):520–537, 2013.

[7] E. Del Val, M. Rebollo, and V. Botti. Enhancing decentralized service discovery in open service-oriented multi-agent systems. Journal of Autonomous Agents and Multi-Agent Systems, 28(1):1–30, 2014.

[8] A. Fronczak and P. Fronczak. Biased random walks in complex networks: The role of local navigation rules. Physical Review E, 80:016107, Jul 2009.

[9] D. Fudenberg and E. Maskin. The Folk Theorem in repeated games with discounting or with incomplete information. Econometrica, 54(3):533–554, 1986.

[10] C. Gkantsidis, M. Mihail, and A. Saberi. Random walks in peer-to-peer networks: Algorithms and evaluation. Performance Evaluation, 63(3):241–263, March 2006.

[11] L.-M. Hofmann, N. Chakraborty, and K. Sycara. The evolution of cooperation in self-interested agent societies: A critical study. In The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, AAMAS '11, pages 685–692, Richland, SC, 2011. International Foundation for Autonomous Agents and Multiagent Systems.

[12] H. Janzadeh, K. Fayazbakhsh, M. Dehghan, and M. S. Fallah. A secure credit-based cooperation stimulating mechanism for MANETs using hash chains. Future Generation Computer Systems, 25(8):926–934, Sept. 2009.

[13] J. J. Jaramillo and R. Srikant. A game theory based reputation mechanism to incentivize cooperation in wireless ad hoc networks. Ad Hoc Networks, 8(4):416–429, 2010.

[14] J. Kleinberg. Complex networks and decentralized search algorithms. In Proceedings of the International Congress of Mathematicians (ICM), 2006.

[15] S. Lee, S.-H. Yook, and Y. Kim. Searching method through biased random walks on complex networks. Physical Review E, 80, 2009.

[16] Z. Li and H. Shen. Game-theoretic analysis of cooperation incentive strategies in mobile ad hoc networks. IEEE Transactions on Mobile Computing, 11(8):1287–1303, Aug. 2012.

[17] A. L. Lopes and L. M. Botelho. Improving multi-agent based resource coordination in peer-to-peer networks. Journal of Networks, 3:38–47, 2008.

[18] M. Luck, P. McBurney, O. Shehory, and S. Willmott. Agent Technology: Computing as Interaction (A Roadmap for Agent Based Computing). AgentLink, 2005.

[19] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker. Search and replication in unstructured peer-to-peer networks. In Proceedings of the 16th International Conference on Supercomputing, ICS '02, pages 84–95, New York, NY, USA, 2002. ACM.

[20] A. B. MacKenzie and L. A. DaSilva. Game Theory for Wireless Engineers. Synthesis Lectures on Communications. Morgan & Claypool Publishers, 2006.

[21] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167–256, 2003.

[22] M. E. J. Newman. Resource letter CS-1: Complex systems. American Journal of Physics, 79(8):800–810, 2011.

[23] J. D. Noh and H. Rieger. Random walks on complex networks. Physical Review Letters, 92:118701, 2004.

[24] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak. A simple rule for the evolution of cooperation on graphs and social networks. Nature, 441(7092):502–505, 2006.

[25] C.-L. Pu and W.-J. Pei. Mixing search on complex networks. Physica A: Statistical Mechanics and its Applications, 389(3):587–594, 2010.

[26] J. M. Pujol, J. Delgado, R. Sanguesa, and A. Flache. The role of clustering on the emergence of efficient social conventions. In IJCAI, pages 965–970, 2005.

[27] A. P. Roberts and C. P. Haynes. Electrostatic approximation of source-to-target mean first-passage times on networks. Physical Review E, 83:031113, Mar 2011.

[28] C. Sierra, V. Botti, and S. Ossowski. Agreement computing. KI - Künstliche Intelligenz, 25(1):57–61, 2011.

[29] V. Srivastava, J. Neel, A. B. MacKenzie, R. Menon, L. A. DaSilva, J. E. Hicks, J. H. Reed, and R. P. Gilles. Using game theory to analyze wireless ad hoc networks. IEEE Communications Surveys & Tutorials, 7(4):46–56, 2005.

[30] V. Tejedor, O. Benichou, and R. Voituriez. Close or connected: Distance and connectivity effects on transport in networks. Physical Review E, 83(6), 2011.

[31] B. Yang and H. Garcia-Molina. Efficient search in peer-to-peer networks. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS), 2002.

[32] Z. Zhang, A. Julaiti, B. Hou, H. Zhang, and G. Chen. Mean first-passage time for random walks on undirected networks. The European Physical Journal B, 84:691–697, 2011.

[33] M. Zhong. Popularity-biased random walks for peer-to-peer search under the square-root principle. In Proceedings of the 5th International Workshop on Peer-to-Peer Systems (IPTPS), 2006.

[34] T. Zhou. Mixing navigation on networks. Physica A: Statistical Mechanics and its Applications, 387(12):3025–3032, 2008.