Generic Protocol

7/30/2019 Generic Protocol


CoFree: Relieving Live Streaming from Rational Collusions

Sonia Ben Mokhtar, CNRS - LIRIS

Lyon University, France

[email protected]

Jeremie Decouchant, Grenoble University

France

[email protected]

Vivien Quema, INP, Grenoble University

France

Email : [email protected]

Abstract: Peer-to-peer live streaming is a robust, scalable and cheap alternative to its centralized counterpart. Like many collaborative applications, peer-to-peer live streaming suffers from rational nodes, i.e., nodes that aim at downloading the stream without contributing their fair share to the system. While the problem of rational nodes that act individually has been well addressed in the literature, colluding rational nodes are still an open issue. Indeed, LiFTinG, the only existing protocol addressing this issue, yields a high ratio of false positive accusations of correct nodes, which makes it impractical. In this paper, we present CoFree, the first peer-to-peer live streaming protocol that tolerates all kinds of rational nodes while guaranteeing zero false positive accusations. A performance evaluation performed on a testbed comprising 250 nodes shows that CoFree is as efficient as protocols tolerating only individual rational behaviors.

I. INTRODUCTION

Live streaming accounts for a large proportion of traffic

over the Internet, enabling the real-time covering of social,

political or sporting events. Several systems developed in the

last decade have shown the benefits of relying on the peer-to-

peer (P2P) paradigm for live streaming [1], [2]. For instance,

P2P-based solutions are used in China to disseminate television channels to thousands of users 1. Relying on the P2P paradigm

offers robustness to failures, scalability up to hundreds of

thousands of nodes and adaptability. Indeed, P2P systems can

handle massive node arrival/departure and are highly resilient

to churn. From the point of view of content providers, relying

on a P2P system allows shifting cost (e.g., bandwidth) to

clients, and avoids the need for maintaining dedicated servers.

A major problem facing large-scale P2P systems deployed

on the public domain is the existence of rational nodes, i.e.,

nodes that aim at receiving the stream without contributing

their fair share, by forwarding stream packets to others.

Existing studies [3] have shown that the presence of even

a small portion of rational nodes significantly degrades the system's performance. In particular, nodes running the protocol

experience simultaneously a high jitter and a high overhead.

This is why a number of protocols have been devised in the last

decade to deal with the problem of rational nodes (e.g., BAR

Gossip [3], FlightPath [4], the protocol by Van Renesse et

al. [5] and LiFTinG [6]). All these protocols provide incentives

that encourage/force rational nodes to participate in the system.

1. PPLive home page. http://www.pptv.com/

However, a problem that has been underestimated is the

impact of rational nodes that collude (also called colluders

for short). A group of colluders is a group of nodes that

collaborate to exchange stream packets (also called updates)

between each other off the record. Colluders do not share

with nodes not belonging to the group the updates they receive

off the record. Most of the existing protocols simply assume

that rational nodes do not collude and are thus heavily impacted by the occurrence of rational collusions. For instance,

as further analyzed in Section III, correct nodes using the

most robust rational resilient streaming protocol, i.e., the BAR

Gossip protocol [3], receive an unusable stream in presence of

as little as 20% of colluders. Note that, as we show, colluders

do not need to belong to the same group to harm the protocol :

10 groups of colluders, each comprising 2% of the nodes of the

system, have an impact that is comparable to that of 1 group

of colluders comprising 20% of the nodes of the system.

To the best of our knowledge, the only peer-to-peer strea-

ming protocol trying to prevent collusions is the LiFTinG

protocol [6]. In this protocol, nodes log their interactions with

other nodes and perform distributed audits of each other's logs. In order to be cost effective, this protocol relies on

cryptography-free procedures, and statistical analysis of these

logs. For instance, a node is suspected of colluding with

another node if the frequency of its interactions with the

latter is greater than an expected average. Unfortunately, as

analyzed by the authors themselves, due to their statistical

nature and to message losses, the mechanisms implemented

in LiFTinG do not allow catching all rational collusions (false

negatives), and may even lead to wrong exclusion of correct

nodes (false positives). For instance, in presence of rational

nodes that aim at increasing the quality of their stream by

10%, the protocol yields 50% false-positive exclusions of correct

nodes. Consequently, it is not possible to use LiFTinG in real environments.

The question we try to answer in this paper is: how can we design a rational-resilient streaming protocol that prevents collusions from occurring and that does not wrongfully exclude non-colluding nodes? An observation one can start with is that a colluding behavior can be considered as a Byzantine behavior [?]. A legitimate question is thus whether it is possible to rely on existing techniques for Byzantine fault tolerance and Byzantine fault detection, such as Nysiad [7], PeerReview [8], Accountable Virtual Machines [9], TrInc [10], or A2M [11]. The answer is no. The reason is that these generic solutions for Byzantine fault tolerance either assume a limited proportion of faulty nodes, or assume the existence of trusted nodes or hardware. Instead, as in BAR Gossip [3], we assume in this paper that all nodes can be rational, and we do not rely on any trusted entity, whether software or hardware.

In this paper, we present CoFree, the first live streaming

protocol that tolerates an unlimited number of (possibly col-

luding) rational nodes, while guaranteeing that correct nodes

are never accused of behaving rationally and that all rational

deviations are eventually detected. To reach this objective, we

adopt a radically different approach than the one used in the

LiFTinG protocol: rather than trying to detect collusions

a posteriori, we built CoFree in such a way that it is not in the

interest of nodes to collude. We prove that property by relying

on game theory, and the concept of Nash equilibrium [?] in

particular.

Performance wise, we demonstrate using a real deployment

involving 250 nodes that CoFree is able to provide nodes with

a high-quality stream. We even show that CoFree achieves a slightly better streaming quality than BAR Gossip,

the state-of-the-art rational-resilient streaming protocol.

The remainder of this paper is structured as follows. We

first review the related works in Section II. We emphasize the

impact of colluders on live streaming systems in Section III.

We then describe the system model we consider in Section IV,

and introduce the core ideas of CoFree in Section V. We

provide a detailed presentation of CoFree in Section VI, and

prove its resilience to (colluding) rational nodes in Section VII.

We finally present the performance evaluation in Section VIII

and concluding remarks in Section IX.

II. RELATED WORKS

There exist a number of P2P live streaming protocols that

handle rational nodes. These protocols can be classified accor-

ding to the way updates are exchanged between nodes, into

two categories. The first category of protocols is composed of

symmetric protocols. These protocols force nodes to collabo-

rate, as the updates they get from a node are proportional to the

updates they have to offer (this principle is often referred to as

tit-for-tat). BAR Gossip [3] and FlightPath [4] are symmetric

protocols relying on game theory. Both provide incentives to

ensure that rational nodes do not have any interest (have a

limited interest, in the case of FlightPath) in deviating from

the protocol. In terms of robustness to rational nodes, the

BAR Gossip protocol exhibits stronger properties than the FlightPath protocol. Indeed, nodes in FlightPath are assumed

to deviate only if the benefit they get is higher than a threshold,

which is not the case in BAR Gossip. While the authors of

these two protocols point out the problem of colluding rational

nodes in [3], neither of these two protocols addresses it.

The second category of protocols is composed of asymme-

tric protocols. These protocols require nodes to altruistically

push update identifiers to other nodes, who subsequently pull

updates of interest. LiFTinG [6] and the protocol presented

in [5] are asymmetric protocols. They both introduce auditing

mechanisms to periodically verify whether a node contributed

its fair share to the system. We describe these two protocols

below.

The protocol presented in [5] aims at adapting the contri-

bution of nodes to the system according to their available

resources. Specifically, the authors propose the clustering of

nodes in groups that have the same uploading capacity. In doing

so, nodes are audited by considering the amount of data

they can contribute. The major limitation of this protocol

is that it assumes the existence of trusted auditors that run

in dedicated external nodes. We do not want to assume the

existence of trusted third parties. Finally, this protocol does

not deal with colluders.

LiFTinG is a protocol that leverages the randomness of node

selection to increase the probability of catching rational nodes.

It is the only protocol that tackles the problem of colluding

rational nodes. Specifically, LiFTinG sporadically verifies the

distribution of the interactions a given node performed with

other nodes in the system. Nodes that collude with other nodes

would break the uniform distribution of partner selection, which may result in their detection. In order to be cost

effective, LiFTinG only performs sporadic audits and relies

on non-secure logs that can contain wrong information, be

incomplete, be tampered with and, as a consequence, be

inconsistent with one another. As a result,

LiFTinG suffers from two major limitations : correct nodes

can be wrongly evicted from the system (false positives) and

a proportion of colluding rational nodes can harm the system

without being detected (false negatives).

III. IMPACT OF COLLUDERS ON STREAMING PROTOCOLS

In this section, we experimentally study the impact of

colluders on the BAR Gossip protocol, as it is the most robust rational-resilient protocol that has been proposed so

far. We do not include the LiFTinG protocol in this study as it

yields an unacceptable rate of false positives. We performed

a simulation 2 with 250 nodes and we varied the number of

colluders, as well as the size of colluding groups. As done

in [?], we use the jitter metric to measure the quality of the

stream perceived by non-colluding nodes. We compute the

jitter as the ratio of streaming windows that are not viewable

(because not enough packets have been received). An average

jitter of 1% means that each peer misses, on average, 1% of

the stream.
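The jitter metric defined above can be sketched as follows. This is a minimal illustration, not the evaluation code: the function name, the per-window packet counts and the `packets_needed` viewability threshold are assumptions made for the example.

```python
def jitter(packets_received_per_window, packets_needed):
    """Ratio of streaming windows that are not viewable.

    A window is counted as not viewable when fewer than `packets_needed`
    packets were received for it (hypothetical threshold, chosen by the caller).
    """
    if not packets_received_per_window:
        return 0.0
    missed = sum(1 for got in packets_received_per_window if got < packets_needed)
    return missed / len(packets_received_per_window)


# 100 windows, one of which received too few packets: the node misses 1% of the stream.
windows = [40] * 99 + [12]
print(jitter(windows, packets_needed=36))
```

Averaging this value over all non-colluding nodes would give the average jitter reported in this section.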

Results are depicted in Figure 1. The X axis presents

the proportion of nodes that are part of the colluding group,

while the Y axis presents the jitter experienced by nodes. We

draw the curves for both correct nodes, and rational nodes.

It can first be noticed that rational nodes always received the

whole stream. More importantly, these curves show that in

presence of a proportion of 30% of colluders organized in

a single group, correct nodes using the BAR Gossip protocol

experience a jitter greater than 10%, which makes the received

2. The parameters of the simulation are those presented in Section VIII.



stream unusable. This result is further confirmed whatever the

size of the groups of colluders as depicted in figure III. For this

figure, we made several experiments in which we distributed

30% of all the nodes in colluding groups of identical size.

Whatever the size of the colluding groups, rational nodes

experience a noticeable benefit receiving the whole stream,

and having no jitter, while correct nodes experience a high

jitter ranging from 8% to 10%. Increasing the size of colluding

groups has an amplifying effect on correct nodes' jitter. This is

due to the fact that correct nodes less frequently own updates

that colluding nodes are missing, and thus have difficulties

obtaining new updates from them in symmetric exchanges.

[Figure 1: jitter (%) as a function of the proportion of colluding nodes (%), with one curve for non-colluding nodes and one for colluding nodes.]

FIGURE 1. Average jitter experienced by nodes when a given proportion of them is in a single collusion.

We first studied the case in which all colluders belong to

the same group. Results are depicted in Figure 1. The X

axis presents the proportion of nodes that collude, while the Y axis presents the jitter experienced by nodes. We draw

the curves for both colluding and non-colluding nodes. We

can first notice that nodes have an interest in colluding :

they experience a better jitter when colluding. Actually, when

colluding, nodes receive the entire stream, whereas when

they do not collude they experience some jitter, i.e. around

1.8% 3. More importantly, we observe that, in presence of as

little as 20% of colluders organized in a single group, non-

colluding nodes experience a jitter of about 6%, which makes

the received stream unusable.

We then studied the impact of spreading colluders in mul-

tiple independent groups. More specifically, we made several

experiments in which we distributed 20% of all the nodes in colluding groups of identical size. We depict the results

in Figure 2. The X axis presents the size of colluding

groups, while the Y axis presents the jitter experienced by both

colluding and non-colluding nodes. We observe that spreading

colluders in different groups does not significantly decrease the

jitter experienced by non-colluding nodes.

3. Another interesting phenomenon is that when colluding, nodes use less CPU as they do not need to perform all the cryptographic operations required by the protocol.

[Figure 2: jitter (%) as a function of the size of colluding groups, with one curve for colluding nodes and one for non-colluding nodes.]

FIGURE 2. Average jitter experienced by nodes when 20% of rational nodes collude in independent groups of equal sizes.

The results presented in this section show that tolerating

individual rational behaviors, as done, e.g., in BAR Gossip [?],

is not enough to build robust peer-to-peer streaming protocols tolerating collusions of rational nodes. Indeed, the reliability

of these protocols is harmed as soon as many small groups of

colluding nodes are formed, which we believe is something

that could happen in real-life systems.

IV. SYSTEM MODEL

We consider two classes of nodes : correct nodes and

rational nodes. Correct nodes follow the protocol. Rational

nodes are defined as in [3] : they aim at getting a quality

stream (i.e., with the lowest possible jitter) at the lowest

possible overhead in terms of bandwidth consumption. This

means that rational nodes would deviate in any way from the

protocol, possibly by colluding with each other, as long as the deviation saves their resources while not impacting the quality

of the stream they are getting. Colluding rational nodes would

typically exchange updates off the record, and, in order to save

bandwidth, would not share the updates they obtained secretly

with nodes outside their group. It is important to note that

rational nodes are risk averse, i.e., they never deviate from

the protocol if there is any risk of being evicted from the

system. We assume that all nodes may be rational and may

organize themselves in colluding groups of arbitrary sizes.

We assume that the source of the stream is a correct node.

Moreover, we assume that the network allows every pair of

nodes to exchange messages and that a message sent from one

correct node to another is eventually received, if retransmitted sufficiently often. We also assume that hash functions are

collision resistant and that cryptographic primitives cannot be

forged.

Finally, we assume that nodes have a secure log that is used to check their correctness through its analysis. A secure log is a log that is tamper evident and append only. Many systems recently defined variants of secure logs, among which [8], [11]. We build on the secure log presented in [8]. In our protocol, the secure log is used to keep track of the communication a node has with other nodes in the system. Specifically, each log entry in the log of a node A corresponds to a message sent (resp. received) by A to (resp. from) another node B. A log entry $e_i$ is of the form $e_i = (seqno_i, h_i, message_i)$, where $seqno_i$ is a monotonically increasing sequence number, $h_i$ is a hash value linked with the previous entries in the log, and $message_i$ is the message sent (resp. received) by A. The value of $h_i$ is computed as follows: $h_i = H(h_{i-1} \,||\, seqno_i \,||\, H(message_i))$, where $h_0 = 0$, $H$ is a hash function and $||$ stands for concatenation. Each time a log entry $e_i$ is added to the log of a node A, an authenticator $\alpha_i$ is generated. This authenticator, which is a signed message $\alpha_i = (seqno_i, h_i)_{\sigma(A)}$, states that A has a log entry $e_i$ with a corresponding hash $h_i$. By sending the authenticator $\alpha_i$ to a node B, A commits to having logged the entry $e_i$ and to the content of its log before $e_i$. Any node that receives $\alpha_i$ can use it to inspect $e_i$ and all the entries preceding $e_i$ in the log of A. Upon reception of a log, any node is able to recompute the hash values it contains, according to the content of log entries, and thus to check their validity. In addition, a log entry for a received message must include a matching authenticator, implying that a node cannot invent an entry for a message it did not receive. These two properties show that logs are tamper-evident.
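The hash chaining described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the log of [8]: the class and function names are hypothetical, SHA-256 plays the role of H, the initial hash value 0 is encoded as a single zero byte, sequence numbers are assumed to increase by exactly one, and an HMAC with a per-node key stands in for the node's signature on authenticators.

```python
import hashlib
import hmac


def H(data: bytes) -> bytes:
    """Hash function H (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def chain_hash(prev_h: bytes, seqno: int, message: bytes) -> bytes:
    """Hash of the previous hash, the sequence number and H(message), concatenated."""
    return H(prev_h + seqno.to_bytes(8, "big") + H(message))


class SecureLog:
    def __init__(self, signing_key: bytes):
        self.key = signing_key   # stand-in for the node's private key
        self.entries = []        # list of (seqno, h, message)

    def append(self, message: bytes):
        """Add an entry and return the matching authenticator."""
        seqno = len(self.entries) + 1
        prev_h = self.entries[-1][1] if self.entries else b"\x00"  # initial hash is 0
        h = chain_hash(prev_h, seqno, message)
        self.entries.append((seqno, h, message))
        # HMAC replaces a real digital signature in this sketch.
        tag = hmac.new(self.key, seqno.to_bytes(8, "big") + h, hashlib.sha256).digest()
        return (seqno, h, tag)


def verify_chain(entries) -> bool:
    """Recompute every hash from the entry contents and compare with the stored one."""
    prev_h = b"\x00"
    for expected_seqno, (seqno, h, message) in enumerate(entries, start=1):
        if seqno != expected_seqno or h != chain_hash(prev_h, seqno, message):
            return False
        prev_h = h
    return True
```

Because each hash covers the previous one, changing or removing any entry invalidates the hash of every later entry, which is what an auditor detects when it recomputes the chain.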

V. PROTOCOL PRINCIPLES

As discussed in Section III, incentives used in tit-for-tat

protocols such as BAR Gossip are not sufficient to encourage

colluders to participate in the system. Moreover, the mechanisms

implemented in the LiFTinG protocol to fight collusions

may engender the wrong exclusion of a large portion of correct

nodes (false positives) on the one hand and do not catch all

rational collusions (false negatives) on the other hand. We do

thus propose a new protocol, called CoFree, that guarantees

the following two properties: (i) Accuracy: A correct node is never accused of behaving rationally, (ii) Completeness:

A rational node that deviates from the protocol in a way

that impacts the performance of correct nodes is eventually

suspected by all correct nodes.

In the remainder of this section, we describe the principles of

CoFree that allow us to guarantee the above two properties.

Protocol details are then presented in Section VI.

As previously discussed, colluders are harmful nodes because

they block the propagation of updates to correct nodes. While

the former keep receiving a quality stream from their colluding

partners, the latter observe a degradation in the quality of their

stream. Consequently, rather than trying to detect collusions

a posteriori (with possible false positives and negatives) as is done in the LiFTinG protocol, we propose, in CoFree, to avoid

collusions. In other words, CoFree is built in such a way that

it is not in the interest of nodes to collude.

The core idea underlying CoFree is to have each node store

in a secure log its interactions with other nodes in the system,

including the updates it received. Because any node can verify

the information present in the log of a node it is interacting

with, the latter will be obliged to send the updates it has and to

receive the updates it is missing. Consequently, no node will

have an interest in forming collusions. Indeed, assume that a

node p colludes with another node to receive an update u,

off the record. Node p will not be able to record update u in

its log (because the exchange was unofficial ; we explain later

how it is done). The good news for node p is that it does not

have to forward u to other nodes because u does not appear

in its log. The problem is that next time a correct node having

u in its log will interact with node p, it will send update u

to p. Consequently, p will eventually have to forward u and it

will have wasted its bandwidth because it will have received

u twice (off the record and from a correct node).

This core idea raises several questions and challenges that

we answer in the remainder of this section.

What if p chooses only colluders as the partners it will

interact with in the near future? This way, p could

accept u and arrange with its future partners so that they do not

audit its log or do not send it u a second time. While

colluders can escape audits by selecting the right partners in

the LiFTinG protocol, it is not possible in CoFree. Indeed,

in our protocol, nodes are forced to (periodically) establish

random, yet deterministically verifiable partnerships.

What if a node, p, maintains many (correct) logs? For

instance, p could have a log in which an update u appears

and another log in which the same update does not appear.

As such, if p interacts with a node that already has u, it

will show its log in which u appears (to avoid receiving

it twice). Instead, if p interacts with a node that does not

have u, it will show the log in which u does not appear (to

avoid sending it). This problem is known as equivocation,

i.e., the ability to make conflicting statements to different

participants [10]. We deal with this issue by forcing nodes

to audit their partners' logs at the beginning of each new

partnership. This audit verifies the consistency of the log of

a node as a whole. For instance, a node verifies that its new partner has established in the past the partnerships with the

nodes with whom it was supposed to communicate. If a node

maintains many sub logs, it will present one of them during its

audits. However, whatever the log it presents to an audit, the

latter will necessarily be incomplete (e.g., it will contain only

a subset of the interactions a node was supposed to perform

as the missing interactions will be stored in a different log).

Consequently, the auditing node can expose the audited node

for behaving rationally. A rational node would never take such

a risk.
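The argument above can be illustrated with a small sketch: whichever sub-log an equivocating node presents, some of the partnerships it was supposed to establish are missing from it. The function name and the dictionary-based entry format are hypothetical and only serve the example; the actual audit checks are detailed in Section VI.

```python
def missing_partnerships(expected_partners, presented_log):
    """Return the expected partners that never appear in the presented log."""
    logged_peers = {entry["peer"] for entry in presented_log}
    return sorted(set(expected_partners) - logged_peers)


# Partners the audited node was supposed to interact with (recomputable by the auditor).
expected = ["p2", "p5", "p7"]

# An equivocating node splits its interactions over two sub-logs.
sub_log_a = [{"peer": "p2", "msg": "propose"}, {"peer": "p5", "msg": "propose"}]
sub_log_b = [{"peer": "p7", "msg": "propose"}]

print(missing_partnerships(expected, sub_log_a))  # the interaction with p7 is missing
print(missing_partnerships(expected, sub_log_b))  # the interactions with p2 and p5 are missing
```

Only the complete log passes the check, so presenting any single sub-log exposes the node.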

Isn't this periodic exchange of logs a performance overkill?

It is not necessary to audit the log of nodes each

time two nodes exchange updates. Indeed, we build on the assumption that colluders, and rational nodes in general, are

risk averse. Hence, it is enough to ensure that for each step

of the protocol, a deviation has a non-zero probability to be

detected in the near future, in order to make sure that rational

nodes will not deviate. Consequently, instead of performing

audits each time nodes communicate, audits are triggered in a

random manner.

What if rational nodes and colluders exploit the fact that

audits are performed randomly, to avoid doing them (in

order to save resources or to protect colluding partners)?

Audits (from the point of view of audited nodes) must not be

predictable, because rational nodes would seize an opportunity

to deviate undetected if they could predict them. Yet they

must be verifiable (from the point of view of nodes perfor-

ming them), because nodes have to be forced to trigger this

procedure. To reach this objective, a node that starts a new

partnership with a node performs a deterministic computation

that results in a boolean telling it whether it should audit

its partner or not. As such, any node, in particular the future

partners of the latter, can rerun the deterministic computation

and verify whether this node was supposed to perform an audit

and whether it effectively did it. From the point of view of

audited nodes, the deterministic computation performed by

auditors to decide whether they should perform an audit or

not cannot be precomputed by audited nodes. As such, neither

auditors nor audited nodes will take the risk: the former of not performing

an audit, the latter of deviating from the protocol.

What if rational nodes decide not to answer correct nodes, to avoid trading updates or being audited? [Jeremie to develop.]

What if rational nodes decide to suspect rational nodes instead of interacting with them, to avoid being audited? [Jeremie to develop.]

VI . PROTOCOL DETAILS

We have presented the principles of CoFree in the previous

section. In this section, we detail the steps of the protocol. We

then prove in Section VII that rational nodes follow all the

steps of the protocol and deviate neither individually

nor as a group.

In a nutshell, CoFree divides time in rounds. At each round

the source disseminates new updates to a small set of randomly

chosen nodes. RTE 4 rounds after their release, updates expire. Upon expiration, updates are delivered to the nodes'

media application. To get updates, each node initiates, and

maintains partnerships with Fanout other nodes with whom

it exchanges updates at each round. The partners are selec-

ted using a pseudo-random number generator function, i.e.,

PRNG, seeded deterministically (e.g., with the node public

key concatenated with the round number). At the beginning

of a round, each node contacts all of its partners in order to

propose updates to them and to request updates from them.

Every Period rounds, each node updates its set of partners.

Each time a node starts a new partnership with a node, the two

nodes audit each other's log. Specifically, this audit checks the

behavior of the new partner for the last RTE rounds.

The remainder of this section describes the sub-protocols

constituting CoFree in detail, as follows. First, we present the

join protocol (Section VI-A), which allows dealing with new

nodes joining the system. Then, we present the partnership

management (Section VI-B) and the update exchange proto-

cols (Section VI-C), which allow handling the partnerships

between nodes and exchanging updates between partners,

4. RTE stands for : Rounds To Expire.

respectively. We finally present the omission failures proto-

col (Section VI-D), which allows dealing with unresponsive

nodes.

A. Join protocol

It is important that nodes participating in the system have the same membership list, i.e., the list of nodes participating in the system at a given round. To synchronize nodes' membership lists, we use the mechanism presented in [4]. This mechanism uses epochs 5 at the beginning of which new nodes can start to exchange with others. More precisely, nodes learn during epoch e+1 about the new nodes that arrived during epoch e, and start to exchange with them during epoch e+2. This way, nodes share the same knowledge of new node arrivals, and partnerships between nodes can be controlled. More precisely, assume that the source node is informed during epoch e that a new node px wants to join the streaming session. The source will send to px a signed message during epoch e+1 with the membership list of epoch e+2. Then, px logs the message sent by the source, which will later allow it to justify its recent arrival in the system and to avoid being suspected because of the small size of its log. During epoch e+2, px starts establishing new partnerships with others as prescribed by the partnership management protocol. The source also disseminates, along with propagated updates, the list of new members, so that these members will eventually be selected for new partnerships by old members.

B. Partnership management

Each node px has to maintain partnerships with Fanout other nodes, selected with the PRNG function seeded with a deterministically computed seed (e.g., the round number concatenated with px's public key). Every Period rounds, a node px breaks the partnerships it initiated with Fanout nodes. A node having an identifier id will change its partnerships during round r if (id + r) mod Period = 0. Because each node can replay the computations performed by its partners, nodes do not need to inform their partners of broken partnerships: every node knows when the relationship initiated by any of its partners should come to an end.
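The deterministic partner selection and the renewal rule can be sketched as follows. This is only a sketch under assumptions: the function names are hypothetical, the seed is the SHA-256 hash of the public key concatenated with the round number, and Python's `random.Random` stands in for the PRNG function, which the protocol does not specify here.

```python
import hashlib
import random


def select_partners(public_key: bytes, round_no: int, members: list, fanout: int) -> list:
    """Pick `fanout` partners in a way that any other node can recompute."""
    seed = hashlib.sha256(public_key + round_no.to_bytes(8, "big")).digest()
    rng = random.Random(seed)  # stand-in PRNG, seeded deterministically
    # Sort the membership list so that every node iterates over it in the same order.
    candidates = sorted(m for m in members if m != public_key)
    return rng.sample(candidates, min(fanout, len(candidates)))


def renews_partnerships(node_id: int, round_no: int, period: int) -> bool:
    """A node changes its partnerships during round r if (id + r) mod Period = 0."""
    return (node_id + round_no) % period == 0


members = [b"key-A", b"key-B", b"key-C", b"key-D", b"key-E"]
partners = select_partners(b"key-A", round_no=7, members=members, fanout=2)
# Any node holding the same membership list recomputes the same partners.
assert partners == select_partners(b"key-A", 7, members, 2)
```

Since both functions depend only on public information (public key or identifier, round number, membership list), a node can check when and with whom its partners were supposed to establish partnerships without exchanging any extra message.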

5. An epoch is a period composed of a set of rounds.

At the beginning of a partnership, a node px may trigger an in-depth audit of its new partner py, by contacting the partners py had in the RTE previous rounds, and asking them to return their own log of the last RTE rounds (including the current round). To reduce the cost of the protocol, nodes perform these audits in a random manner, i.e., each time they are in a position to perform an audit, they flip a coin and decide whether they should audit their partner or not. Nevertheless, to prevent rational nodes from hiding behind this randomness to avoid auditing their partners, we make this randomness verifiable. Specifically, each time a node px is in a position to perform an audit of a new partner py, it computes the hash of its last authenticator, denoted $\alpha_i$, concatenated with the last authenticator of py, denoted $\alpha_j$. The value of this hash modulo 2 gives a boolean that px uses to decide whether or not it should audit its new partner. Node px further logs the authenticators it used to compute the value of this boolean, in order to justify, in future audits, the reason why it performed or did not perform the audit of py. If the computed boolean indicates that the audit must take place, px contacts py's partners and asks for their logs. Upon reception of these logs, node px verifies [Sonia to check the steps below]:

(i) the consistency of the logs, by recomputing the hash values associated to log entries,

(ii) the presence of the exchanges py had to initiate, and the legitimacy of the others,

(iii) that py declared the updates it was supposed to receive from the source,

(iv) that the exchanges correspond to a correct execution of the protocol,

(v) that the past partners' logs are consistent.
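The verifiable coin flip that decides whether these checks are triggered can be sketched as follows. The function name is hypothetical, authenticators are treated as opaque byte strings, SHA-256 is an assumed choice of hash function, and mapping an odd hash value to "audit" is an arbitrary convention of this sketch.

```python
import hashlib


def must_audit(own_last_authenticator: bytes, partner_last_authenticator: bytes) -> bool:
    """Hash the two authenticators, concatenated, and take the result modulo 2."""
    digest = hashlib.sha256(own_last_authenticator + partner_last_authenticator).digest()
    return int.from_bytes(digest, "big") % 2 == 1


# px decides, then logs both authenticators so the decision can be replayed later.
alpha_i = b"last authenticator of px"
alpha_j = b"last authenticator of py"
decision = must_audit(alpha_i, alpha_j)

# A future auditor of px reruns the same computation on the logged authenticators.
assert decision == must_audit(alpha_i, alpha_j)
```

The decision depends on the partner's last authenticator, which px does not control, and on px's own last authenticator, which py cannot know in advance; this is what prevents either side from precomputing the outcome.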

Like any other node, the source also maintains partnerships and regularly changes its partners, i.e., the nodes it serves. The source follows the update exchange protocol described in Section VI-C, except that it does not send any log and it is not audited by nodes 6. To prevent nodes from not declaring the updates they received from the source, these nodes are deterministically chosen among the membership list, so that any node can check that they correctly declared the received updates. As the serving rate of the source is constant, the identifiers of the updates that are released at each round are known.

C. Update exchanges

At the beginning of each round and for the duration of

their partnership, two partners, px and py exchange updates as

depicted in Figure 3. Node px (resp. py) starts the exchange by

generating a proposition message containing the identifiers of

all the updates that appear in its log and that did not expire yet. Node px (resp. py) logs this proposition message in its log and

generates the corresponding authenticator. Then, px (resp. py)

sends the proposition message along with the corresponding

authenticator to py (i.e., for px, the Send(Propose Updates,py)

message in the figure). Upon reception of the proposition

message, which it logs, node py (resp. px) selects those

updates it is missing and replies to px (resp. py) with an update

request (i.e., for py, the Send(Request Updates,px) message in

the figure). The update request is logged at the two parties.

Finally, px (resp. py) serves the missing updates (i.e., for px,

the Send(Serve Updates,py) message in the figure) and logs

the serve message. After receiving the updates, each partner

terminates the exchange by logging the updates it received inits log.
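The message flow of one direction of the exchange (px proposes, py requests, px serves) can be sketched as follows. This is an illustrative Python sketch only: the Node class, the tuple-based log entries and the expiry representation are assumptions, and authenticators and signatures are deliberately omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # update id -> (expiry round, payload); assumed representation
    updates: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def propose(self, rnd: int) -> list:
        """Build and log a proposition with the non-expired update ids."""
        ids = sorted(u for u, (exp, _) in self.updates.items() if exp > rnd)
        self.log.append(("propose", rnd, tuple(ids)))
        return ids

    def request(self, rnd: int, sender: str, proposed: list) -> list:
        """Log the received proposition, then request the missing updates."""
        self.log.append(("recv_propose", rnd, sender, tuple(proposed)))
        missing = [u for u in proposed if u not in self.updates]
        self.log.append(("request", rnd, sender, tuple(missing)))
        return missing

    def serve(self, rnd: int, requester: str, requested: list) -> dict:
        """Log the received request and the serve message, return updates."""
        self.log.append(("recv_request", rnd, requester, tuple(requested)))
        served = {u: self.updates[u] for u in requested if u in self.updates}
        self.log.append(("serve", rnd, requester, tuple(sorted(served))))
        return served

    def receive(self, rnd: int, sender: str, served: dict) -> None:
        """Store the received updates and record them in the log."""
        self.updates.update(served)
        self.log.append(("recv_updates", rnd, sender, tuple(sorted(served))))


def exchange(px: Node, py: Node, rnd: int) -> None:
    """One direction of the exchange; call again with roles swapped."""
    proposed = px.propose(rnd)
    missing = py.request(rnd, px.name, proposed)
    py.receive(rnd, px.name, px.serve(rnd, py.name, missing))
```

Calling exchange(px, py, r) followed by exchange(py, px, r) reproduces the symmetric behavior described in the text, and leaves matching entries in both logs that an auditor could later compare.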

D. Omission failures management

6. We recall that the source is assumed to be a correct node.

FIGURE 3. Update exchanges between nodes

Handling omission failures, i.e., nodes that do not send a given message that they are expected to send, in the absence of strong synchrony assumptions is a difficult task. The question to answer is whether a node is not responding because it has experienced network delays, or because it does not want to reply to a given message (e.g., to avoid sending its log and thus being audited). In our protocol, we guarantee the following properties:

(i) A node that does not respond to a given message is eventually suspected by all correct nodes.

(ii) A suspicion is released if the suspected node eventually sends the missing message it has been suspected for.

(iii) A node that is suspected at a given round and that interacts with another node at a later round without broadcasting the missing message is accused of rational behavior.

(iv) A rational node never wrongly suspects another node.

To enforce these properties, we augment the protocol with the mechanisms we now describe.

Each node maintains a list of the nodes it suspects. Specifically, consider a node px that sent a given message to a node py and is expecting a reply from it. After a number of retransmissions of its original message, px suspects py at round r. At the following round, px includes py in its list of suspected nodes, together with the type of the missing message it was expecting from py, and signs the list. At the beginning of the next round, node px attaches its list of suspected nodes to the Send(ProposeUpdates) message it sends to its partners. After receiving the list of suspected nodes from a partner, a node adds this list to its own list. The new list will be forwarded to its future partners. Eventually, every correct node receives the list of suspected nodes and the list of corresponding expected messages. From the reception of this list onward, correct nodes do not interact with suspected nodes. To exonerate itself, a suspected node can broadcast (using the streaming protocol) the missing message. If a node is contacted by a suspected node after it has received the suspicion message, the contacted node can accuse the suspected node of misbehaving.

To make sure that a rational node never suspects a correct node in order to avoid initiating or accepting an interaction with it, we make the cost of sending a suspicion message higher than the cost of a normal interaction. Hence, unless the suspicion is genuine, a node will never suspect another node instead of initiating or accepting an interaction with it. In order not to overload the system with large suspicion messages at every forwarding step, the suspicion message is large only for the sender who initiates it. Nodes that forward a suspicion message extract the signed attestation contained in the original message and send a normal-size message.
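The bookkeeping behind suspicion lists (adding a suspicion, merging a partner's list, releasing a suspicion once the missing message is seen) can be sketched as below. This is a simplified Python illustration under stated assumptions: the Suspicion record and the SuspicionList class are hypothetical names, and signatures and the extra payload attached by the initiator are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Suspicion:
    suspect: str   # node that failed to answer
    accuser: str   # node that initiated the suspicion
    missing: str   # type of the missing message
    round: int     # round at which the suspicion was raised


@dataclass
class SuspicionList:
    entries: set = field(default_factory=set)

    def suspect(self, accuser: str, suspect: str, missing: str, rnd: int) -> None:
        """Record a new suspicion after retransmissions went unanswered."""
        self.entries.add(Suspicion(suspect, accuser, missing, rnd))

    def merge(self, other: "SuspicionList") -> None:
        """Add the list received from a partner to the local list."""
        self.entries |= other.entries

    def release(self, suspect: str, missing: str) -> None:
        """Drop suspicions once the missing message has been broadcast."""
        self.entries = {s for s in self.entries
                        if not (s.suspect == suspect and s.missing == missing)}

    def is_suspected(self, node: str) -> bool:
        """Correct nodes refuse to interact with a node for which this holds."""
        return any(s.suspect == node for s in self.entries)
```

Because merging is a set union, forwarding the same suspicion along several paths does not duplicate it, which matches the idea that forwarded suspicion messages stay of normal size.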

VII. RESILIENCE TO (COLLUDING) RATIONAL NODES

In this section, we analyze each step of the protocol and

prove why rational, possibly colluding nodes do not deviate

from the protocol :

Join protocol

Protocol step (1): Node p0 deterministically selects f partners using a pseudo-random number generator seeded with a deterministically computed seed (e.g., the round number concatenated with p0's public key).

1) What if node p0 selects nodes whose IDs differ from those computed using the PRNG function?

A rational node p0 will never select such nodes at round r, as it would risk eviction during the next RTE rounds. Indeed, p0 can be selected during one of these rounds by a correct node, say px, which will verify by examining p0's log that p0 actually selected the nodes it was supposed to interact with at round r. If px detects such a deviation, p0 will be evicted.

2) What if node p0 selects fewer than f nodes?

The same incentive as the one above holds for this deviation.

3) What if node p0 does not immediately start establishing f partnerships?

If p0 does not establish any partnerships, it will never be allowed to join the system. Indeed, no gap in partnership management is allowed. In the following rounds, any correct node interacting with p0 will check in p0's log that it established its initial partnerships, or maintained them. In addition, if p0 contacts colluding nodes, they will also risk eviction if they do not denounce it.
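The deterministic selection of step (1), and the way any other node can recompute it, can be sketched in Python as follows. The sketch assumes a seed derived by hashing the round number concatenated with the public key, and uses Python's random.Random as a stand-in PRNG; the function names and the membership representation are hypothetical. In a deployment, all nodes would need to run the same PRNG implementation for the recomputation to match.

```python
import hashlib
import random


def select_partners(node_id: str, public_key: bytes, rnd: int,
                    members: list, f: int) -> list:
    """Partners that node_id must contact at round rnd (assumed scheme)."""
    # Seed: round number concatenated with the node's public key, hashed
    # so that the seed has a fixed size.
    material = rnd.to_bytes(8, "big") + public_key
    seed = int.from_bytes(hashlib.sha256(material).digest(), "big")
    rng = random.Random(seed)
    # Sort the candidates so that every node iterates in the same order.
    candidates = sorted(m for m in members if m != node_id)
    return rng.sample(candidates, min(f, len(candidates)))


def was_expected(contacted: str, node_id: str, public_key: bytes, rnd: int,
                 members: list, f: int) -> bool:
    """Check run by a contacted or auditing node: rerun the PRNG."""
    return contacted in select_partners(node_id, public_key, rnd, members, f)
```

Since the selection depends only on public information, a deviation (other partners, or fewer than f of them) is visible to any node that recomputes it from the log.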

Protocol step (2): When node p0 proposes updates to node pi, pi checks that p0 had to contact it by rerunning the PRNG. If the check succeeds, p0 and pi will exchange updates during the next round.

1) What if node pi does not reply to a partnership request sent

by p0 ?

If p0 does not receive a reply after sending its proposition, it will suspect pi. Not doing so would expose p0 during the next RTE rounds and prevent it from interacting with any correct node. In order not to be evicted, pi will therefore answer the partnership request.

2) What if node pi does not verify that p0 had to contact it?

We distinguish two cases: (a) p0 is correct, and (b) p0 is rational and the verification pi has to perform should not pass.

(a) If p0 is correct and pi attests that p0 has passed the verification without actually performing it, there is no way to detect that pi is behaving rationally. However, as pi does not know whether p0 is correct or not, pi risks eviction along with p0. We detail in case (b) how this can happen, as the mechanism is the same whether p0 is correct or not.

(b) If p0 is rational and pi attests that p0 passed the verification without performing it, pi risks eviction. Indeed, if one of the nodes that p0 will contact, or one of the nodes that will contact p0 at round r + 1, say px, is correct and finds out through an audit that p0 behaved rationally, px can use the attestations sent by pi, which are in p0's log, to prove that pi behaved rationally. This will result in the eviction of pi. As rational nodes do not want to be evicted, they do not attest to the correctness of a node without performing the corresponding verifications.

Partnership management

Protocol step (3): Every Period rounds, node p0 stops exchanging with the Fanout partners with whom it initiated a partnership.

1) What if node p0 stops a partnership with node pi before Period rounds have elapsed?

Node pi knows when the partnership established with p0 started. If pi is correct, it will denounce p0 if p0 stops exchanging with it before Period rounds have elapsed. If pi is colluding with p0, it may not denounce it. However, any correct node seeking a partnership with p0 or pi, or exchanging with them, may discover the deviation through an audit and denounce both of them.

3) What if node p0 stops more or fewer than Fanout partnerships?

The partnerships that have to be stopped are chosen deterministically, and can be checked by any node. Any pair of nodes maintaining or breaking a partnership when they are not allowed to do so will be exposed during the next RTE rounds if they interact in any way with correct nodes.

Protocol step (4): Every Period rounds, node p0 initiates Fanout new partnerships with nodes deterministically chosen with its PRNG.

1) What if node p0 initiates new partnerships before or after Period rounds?

When trying to establish a partnership, a node may have to send its log for the last RTE rounds. This log contains the last partnership management operations of p0. Any correct node receiving a log in which the last partnership management was not done Period rounds in the past will denounce p0. If p0 proposes a partnership to a node colluding with it, any of their correct partners will denounce them. If they have no correct partners, any correct node auditing them will see the misbehavior.

2) What if node p0 does not use its PRNG to choose new

partners ?

Here again, rational nodes will use the PRNG to choose partners, for the reasons presented previously.

3) What if node p0 initiates more or fewer than Fanout partnerships?



If p0 has a correct node px among its partners, px will expect to see, at the next round, that p0 initiated Fanout partnerships. If this is not the case, px will denounce p0. In addition, any correct node interacting with p0 will check that p0 initiated Fanout new partnerships at each period during the last RTE rounds.
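The checks on steps (3) and (4) amount to verifying a schedule: exactly Fanout partnerships initiated every Period rounds, and none in between. A minimal Python sketch of such an audit is given below; the input format (a mapping from round number to the number of partnerships initiated, as read from an audited log) and the function name are assumptions made for illustration.

```python
def partnership_violations(initiated_by_round: dict, first_round: int,
                           last_round: int, period: int, fanout: int) -> list:
    """Return (round, expected, actual) for every round that deviates.

    initiated_by_round maps a round number to the number of partnerships
    the audited node initiated at that round (hypothetical log summary).
    """
    violations = []
    for rnd in range(first_round, last_round + 1):
        # Partnerships are renewed at first_round, first_round + period, ...
        expected = fanout if (rnd - first_round) % period == 0 else 0
        actual = initiated_by_round.get(rnd, 0)
        if actual != expected:
            violations.append((rnd, expected, actual))
    return violations
```

For instance, with period = 3 and fanout = 1, a log summary {0: 1, 2: 1, 6: 2} is flagged at rounds 2 (early initiation), 3 (missing initiation) and 6 (too many partnerships).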

Update exchanges

Consider node pi among the set of partners selected by p0.

Protocol step (5): Node p0 contacts pi and sends it a proposition message containing the updates it owns.

1) What if node p0 does not send a proposition message to pi?

Each correct partner of p0 expects to receive a proposition from it at the beginning of each round, and has to denounce p0 if p0 does not send it. Supposing that p0 has no correct partners, which is difficult to achieve while following the protocol, it will be evicted if a correct node audits it in the next RTE rounds, because it cannot present an acknowledgement coming from its partners.

2) What if node p0 sends an invalid proposition to a node pi?

If p0 sends an invalid proposition to node pi, it risks eviction during the following RTE rounds. Indeed, an audit of p0 by a correct node will determine which updates p0 received, and whether they were included in its proposition messages. A correct node will then denounce p0. If node pi is colluding with p0, it risks being evicted in the following rounds if an audit compares their logs.

Protocol step (7) : Node p0 serves its partner pi with the

updates it requests.

1) What if p0 sends fewer updates than those requested by pi?

Sending fewer updates than requested constitutes a proof of misbehavior that pi can use to evict p0. If pi does not denounce p0, both nodes will be evicted if a correct node interacts with them in the next RTE rounds.
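Both deviations discussed for the update exchange (an invalid proposition and an incomplete serve) reduce to set comparisons over logged identifiers. The following Python sketch illustrates this under the assumption that an auditor has already extracted the relevant identifier sets from the logs; the function names and arguments are hypothetical.

```python
def omitted_from_proposition(received: set, expired: set, proposed: set) -> set:
    """Updates a node held and that had not expired, yet did not propose."""
    return (received - expired) - proposed


def unserved(requested: set, served: set) -> set:
    """Requested updates that the serving node failed to send."""
    return set(requested) - set(served)
```

A non-empty result in either case, backed by the logged and authenticated messages, is the kind of evidence the text refers to as a proof of misbehavior.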

Omission failures

Protocol step (1): If a rational node p0 does not receive a given message from a node pi at round r, it eventually adds pi to its list of suspected nodes and encloses this list, together with a payload, in the proposition messages it sends during the following partnerships.

1) What if p0 does not suspect pi?

If p0 does not suspect pi, a node px contacted by p0 to establish a partnership will notice, while auditing p0's log, that p0 neither interacted with nor suspected one of the f nodes it was supposed to interact with. px will thus expose p0. As rational nodes do not want to be exposed, p0 will suspect pi.

2) What if p0 suspects pi even though p0 has received the message expected from pi?

p0 might be tempted to suspect pi in order to avoid interacting with it. However, as it is more costly to suspect a node than to interact with it (due to the payload that needs to be attached to the message), a rational node will always prefer to interact with a node rather than wrongly suspect it.

Protocol step (2): Upon receiving a message containing a list of suspected nodes at round r from node pi, a rational node p0 removes the payload, extracts the list of suspected nodes, and adds the extracted list to the list of nodes it suspects.

1) What if p0 does not extract the list of suspected nodes sent by pi and does not forward this list?

p0 must enclose the list of suspected nodes sent by pi in its own list of suspected nodes (even if this list is empty), as the nodes that will audit p0 during the following partnerships will verify that this list is present.

2) What if p0 modifies the list of suspected nodes sent by pi before forwarding it (e.g., makes it shorter)?

The list of suspected nodes sent by pi is a signed message: p0 cannot modify its content without being detected.

Protocol step (3): A rational node p0 refuses to interact with a suspected node.

1) What if p0 interacts with a suspected node?

The node that audits p0 at round r + 1 will notice that p0 has interacted with a node that is present in its list of suspected nodes, which constitutes a proof of misbehavior.

Incentive (5): A rational node p0 that is suspected at round r, and that did not fail, sends the missing message as part of its proposition messages during the following partnerships.

1) What if p0 does not disseminate the missing message it was suspected for?

If p0 does not disseminate the missing message, no other node will accept to interact with it in future rounds.

VIII. EVALUATION

In this section, we present an evaluation of the CoFree protocol. We start by introducing the methodology we followed. Then, we study the performance of CoFree in a real system composed of 250 nodes, and we compare it to that of BAR Gossip, the state-of-the-art rational-resilient peer-to-peer streaming protocol. Overall, our evaluation draws the following conclusion: in a real deployment involving 250 nodes, CoFree achieves a slightly better streaming quality than BAR Gossip, while their bandwidth consumption is similar. Moreover, the overhead of CoFree in terms of memory and CPU usage is reasonable.

A. Methodology

We implemented CoFree in Java and deployed it on 250 nodes of the Grid5000 platform []. Nodes are interconnected by a 1 Gb/s network. In our experiments, the source node disseminates a video stream at a rate of 300 kbps for 5 minutes, and proposes each update to 3 random nodes. To provide further tolerance to message loss (combined with retransmission), the source groups packets in windows of 40 packets, including 4 FEC-coded packets (see footnote 7). The duration of one round is set to 1 s. When executing CoFree, each node maintains one partnership, and changes it every three rounds.

As explained in Section ??, we measure the quality of

the stream, using the jitter, which is defined as the ratio of

streaming windows that are not viewable (because not enough

packets have been received). The lower the jitter, the better

the streaming quality. Regarding the overhead of CoFree, we

evaluate it using two metrics : the bandwidth consumption and

the memory consumption.
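The jitter metric defined above can be computed as in the following Python sketch. It assumes that a window of 40 packets containing 4 FEC-coded packets is viewable as soon as any 36 of its packets have been received; this threshold is typical of erasure codes but is an assumption here, as the text only says that "not enough packets" were received. The function name and parameters are hypothetical.

```python
def jitter(received_per_window: list, window_size: int = 40,
           fec_packets: int = 4) -> float:
    """Ratio of streaming windows that are not viewable.

    received_per_window holds, for each window, the number of distinct
    packets a node received. A window is assumed viewable when at least
    window_size - fec_packets packets arrived.
    """
    if not received_per_window:
        return 0.0
    needed = window_size - fec_packets
    unviewable = sum(1 for n in received_per_window if n < needed)
    return unviewable / len(received_per_window)
```

For example, a node that received 40, 36, 35 and 20 packets in four consecutive windows has two unviewable windows, i.e., a jitter of 50%.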

B. Stream quality

Figure 4 shows the reverse cumulative distribution of the

jitter that nodes experienced during this session. All nodes in

CoFree experienced a jitter lower than 2.5%, while roughly

10% of nodes in BAR Gossip suffered from a higher jitter.

FIGURE 4. Average and standard deviation of jitter with 250 nodes. [Plot not reproduced; y-axis: Jitter (%), from 0 to 2; bars: CoFree, BAR Gossip.]

C. Overhead

Figure 5 presents the cumulative distribution of the average bandwidth consumption that nodes experienced with CoFree and with BAR Gossip. These curves show that the bandwidth consumptions of BAR Gossip and CoFree are comparable, but that the consumption of CoFree is more regular than that of BAR Gossip.

To measure the memory overhead of CoFree, we averaged the size of the nodes' logs and the size of their stored partners' authenticators. Figure 6 presents the evolution of these sizes throughout the streaming session. To run CoFree, nodes need to allocate 150 Kbits of memory, of which 40 Kbits are used to store their log. Thus, the memory overhead of CoFree is very reasonable.

IX. CONCLUSION

REFERENCES

[1] T. Bonald, L. Massoulié, F. Mathieu, D. Perino, and A. Twigg, "Epidemic live streaming: optimal performance trade-offs," in Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS '08.

7. FEC stands for Forward Error Correction.

FIGURE 5. Cumulative distribution of average bandwidths with 250 nodes. [Plot not reproduced; x-axis: Average throughput (kbps), from 0 to 2000; y-axis: Percentage of nodes (cumulative distribution); curves: CoFree, BAR Gossip.]

FIGURE 6. Memory used by one node to store its log, and the authenticators it receives. [Plot not reproduced; x-axis: Time in seconds, from 0 to 1400; y-axis: Size in kbits, from 0 to 160; curves: Node log, Total.]

[2] D. Frey, R. Guerraoui, A.-M. Kermarrec, M. Monod, and V. Quéma, "Stretching Gossip with Live Streaming," in Proceedings of the 39th IEEE/IFIP International Conference on Dependable Systems and Networks, DSN '09.

[3] H. C. Li, A. Clement, E. L. Wong, J. Napper, I. Roy, L. Alvisi, and M. Dahlin, "BAR gossip," in Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7, ser. OSDI '06. Berkeley, CA, USA: USENIX Association, 2006.

[4] H. C. Li, A. Clement, M. Marchetti, M. Kapritsos, L. Robison, L. Alvisi, and M. Dahlin, "FlightPath: obedience vs. choice in cooperative services," in Proc. of the 8th USENIX Conference on Operating Systems Design and Implementation, ser. OSDI '08. Berkeley, CA, USA: USENIX Association, 2008, pp. 355-368.

[5] R. van Renesse, M. Haridasan, and I. Jansch-Porto, "Enforcing fairness in a live-streaming system," in Proc. of Multimedia Computing and Networking (MMCN), 2008.

[6] R. Guerraoui, K. Huguenin, A.-M. Kermarrec, M. Monod, and S. Prusty, "LiFTinG: lightweight freerider-tracking in gossip," in Proceedings of the ACM/IFIP/USENIX 11th International Conference on Middleware, ser. Middleware '10. Berlin, Heidelberg: Springer-Verlag, 2010, pp. 313-333.

[7] C. Ho, R. van Renesse, M. Bickford, and D. Dolev, "Nysiad: practical protocol transformation to tolerate Byzantine failures," in Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, ser. NSDI '08, 2008, pp. 175-188.

[8] A. Haeberlen, P. Kouznetsov, and P. Druschel, "PeerReview: practical accountability for distributed systems," SIGOPS Oper. Syst. Rev., vol. 41, pp. 175-188, 2007.

[9] A. Haeberlen, P. Aditya, R. Rodrigues, and P. Druschel, "Accountable virtual machines," in Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI '10), 2010.

[10] D. Levin, J. R. Douceur, J. R. Lorch, and T. Moscibroda, "TrInc: small trusted hardware for large distributed systems," in Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation, 2009.

[11] B.-G. Chun, P. Maniatis, S. Shenker, and J. Kubiatowicz, "Attested append-only memory: making adversaries stick to their word," in Proceedings of SOSP, 2007, pp. 189-204.
