An Efficient Decentralized Algorithm for the Distributed Trigger Counting (DTC) Problem

Venkatesan T. Chakravarthy (IBM Research-India), Anamitra Roy Choudhury (IBM Research-India), Vijay Garg (University of Texas at Austin), Yogish Sabharwal (IBM Research-India)

Distributed Trigger Counting (DTC) Problem

Distributed system with n processors

Each processor receives some triggers from an external source

Report to the user when the total number of triggers received across all processors equals w (in general, w >> n).

Applications of DTC Problem

Distributed monitoring:

    Traffic volume: raise an alarm if #vehicles on a highway exceeds some threshold.

    Wildlife behavior: raise an alarm if #sightings of a particular species in a wildlife region exceeds a value.

Global snapshots: the distributed system must determine whether all in-transit messages have been received in order to declare a snapshot valid. This problem reduces to the DTC problem [Garg, Garg, Sabharwal 2006].

Assumptions:

complete graph model, i.e., any processor can communicate with any other processor

no shared clock and no shared memory

processors communicate using messages

reliable message delivery

no faults in the processors.

Measure of any DTC Algorithm:

Low message complexity

Low MaxRcvLoad: the maximum number of messages received by any processor in the system.

Low MsgLoad: the maximum number of messages communicated by any processor in the system.

Trivial Algorithm

Fix one node to be Master Node

Total Deficit (w) is maintained by the Master Node

Any processor that receives a trigger informs the Master Node

The Master Node decrements the deficit

Finish when deficit reaches zero

Total messages = O(w)

MaxRcvLoad and MsgLoad also O(w)
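A minimal sketch of the trivial algorithm's cost, assuming every trigger arrives at a non-master processor so that each trigger costs exactly one message. The class and function names are hypothetical and do not come from the slides.

```python
class TrivialMaster:
    """Master node: holds the total deficit and receives one message per trigger."""

    def __init__(self, w):
        self.deficit = w            # triggers still to be counted
        self.messages_received = 0  # the master's receive load

    def on_trigger_message(self):
        """Handle one 'I received a trigger' message; return True when done."""
        self.messages_received += 1
        self.deficit -= 1
        return self.deficit == 0


def run_trivial(w):
    """Deliver w triggers, each reported to the master with one message."""
    master = TrivialMaster(w)
    finished = False
    for _ in range(w):
        finished = master.on_trigger_message()
    return finished, master.messages_received
```

Under this assumption the master receives w messages, which is why both the total message count and MaxRcvLoad grow linearly in w.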

Previous Work: any deterministic algorithm has message complexity Ω(n log(w/n)).

[Garg et al.] Centralized algorithm: message complexity O(n log w). MaxRcvLoad can be as high as O(n log w).

Tree-based algorithm: message complexity O(n log n log w). More decentralized in a heuristic sense, but MaxRcvLoad can be as high as O(n log n log w) in the worst case.

Algorithm              Message Complexity    MaxRcvLoad

Centralized            O(n log w)            O(n log w)

Tree based             O(n log n log w)      O(n log n log w)

LayeredRand            O(n log n log w)      O(log n log w)

CoinRand [IPDPS 11]    O(n log w)            O(log w)

Modifications to the trivial algorithm

Any processor sends a message (count of triggers received) to the master only after it receives B triggers.

Works in multiple rounds.

w’: deficit at the beginning of a round (initially w’ = w).

$B = w'/(2n)$

Master keeps count of the triggers reported by other processors and the triggers received by itself.

End-of-round is declared when the count reaches w’/2.

System never enters a dead state:

    Unreported triggers for each processor < B

    Count of triggers at master < w’/2

Message complexity O(n log w):

    log w rounds

    (w’/2)/B = n messages exchanged in every round.
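The round structure above can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' implementation: all n processors report to a separate master, B is rounded down to an integer, the end-of-round target is kept a multiple of B, and each end-of-round exchange is costed at 2n messages. The function name and these rounding choices are hypothetical.

```python
import random


def simulate_batched(n, w, seed=0):
    """Simulate the batched, multi-round variant with n reporting processors."""
    rng = random.Random(seed)
    deficit = w                      # w' at the start of the current round
    delivered = messages = rounds = 0
    while deficit > 0:
        rounds += 1
        batch = max(1, deficit // (2 * n))           # B = w'/(2n), rounded down
        # End-of-round target: about w'/2, kept a multiple of B (assumption).
        target = n * batch if deficit >= 2 * n else max(1, deficit // 2)
        unreported = [0] * n
        reported = 0
        while reported < target:
            p = rng.randrange(n)                     # trigger hits a random processor
            unreported[p] += 1
            delivered += 1
            if unreported[p] == batch:               # B triggers seen: report them
                messages += 1
                reported += batch
                unreported[p] = 0
        # End of round: master collects leftovers and announces the new w'
        # (costed here as 2n messages, an assumption).
        messages += 2 * n
        deficit -= reported + sum(unreported)
    return {"delivered": delivered, "messages": messages, "rounds": rounds}
```

Each round sends at most n batch reports plus the end-of-round exchange, so the message count per round stays O(n) regardless of how large w’ is.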

Main Result:

LayeredRand, a decentralized randomized algorithm.

Theorem: For any trigger pattern, the message complexity of the LayeredRand algorithm is O(n log n log w). Also, there exist constants c, d > 1 such that

Pr[MaxRcvLoad > c log n log w] < 1/n^d

LayeredRand Algorithm

n = 2^L - 1 processors arranged in L layers

l-th layer has 2^l processors, l = 0 to L-1

Algorithm proceeds in multiple rounds.

w’: initial value of a round (number of triggers yet to be received)

Threshold for the l-th layer defined as

$\tau(l) = \dfrac{w'}{4 \cdot 2^{l} \cdot \log(n+1)}$

C(x): sum of triggers received by x and some processors in layers below.
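A small sketch of the threshold computation, assuming the logarithm is base 2 (so that log(n+1) = L when n = 2^L - 1), which matches the values τ(1) = 4 and τ(2) = 2 given for w’ = 96 in the example slide. The function name is hypothetical, and how non-integer thresholds are rounded is not stated in the slides, so the sketch leaves them as floats.

```python
def layer_threshold(w_prime, L, l):
    """tau(l) = w' / (4 * 2^l * log2(n + 1)), where n = 2^L - 1 so log2(n + 1) = L."""
    if not 1 <= l <= L - 1:
        raise ValueError("thresholds are defined for non-root layers 1..L-1")
    return w_prime / (4 * 2**l * L)
```

For instance, `layer_threshold(96, 3, 1)` gives 4.0 and `layer_threshold(96, 3, 2)` gives 2.0.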

LayeredRand Algorithm (Contd.)

For a non-root processor x at layer l:

    If a trigger is received: C(x)++

    If C(x) >= τ(l): pick a processor y from layer l-1 at random and send a coin to y; C(x) := C(x) - τ(l)

    If a coin is received from layer l+1: C(x) := C(x) + τ(l+1)

Root r maintains C(r) just like the others. If C(r) > w’/2, it initiates the end-of-round procedure:

    gets the total number of triggers received in this round

    broadcasts the new value of w’ for the next round.
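The per-processor rules can be sketched as a single-round simulation in which triggers arrive one at a time. This is an illustrative reading, not the authors' code: it assumes base-2 logarithms, re-checks the threshold after a coin is received as well as after a trigger, and delivers triggers to uniformly random processors. All names are hypothetical.

```python
import random


def simulate_round(L, w_prime, seed=0):
    """Run one LayeredRand round with triggers arriving one at a time."""
    rng = random.Random(seed)
    n = 2**L - 1
    tau = {l: w_prime / (4 * 2**l * L) for l in range(1, L)}  # log2(n+1) = L
    C = [[0.0] * 2**l for l in range(L)]      # C[l][i]: counter of i-th processor in layer l
    coins_received = [[0] * 2**l for l in range(L)]
    stats = {"delivered": 0, "coins": 0}

    def add(l, i, amount):
        """Credit a processor, then push coins upward while it is over threshold."""
        C[l][i] += amount
        while l > 0 and C[l][i] >= tau[l]:
            C[l][i] -= tau[l]
            j = rng.randrange(2**(l - 1))     # random processor y in layer l-1
            stats["coins"] += 1
            coins_received[l - 1][j] += 1
            add(l - 1, j, tau[l])             # y adds tau(l) on receiving the coin

    while C[0][0] <= w_prime / 2:             # root ends the round once C(r) > w'/2
        p = rng.randrange(n)                  # trigger lands on a random processor
        l = (p + 1).bit_length() - 1          # layer of processor p in heap order
        add(l, p + 1 - 2**l, 1)
        stats["delivered"] += 1

    stats["root"] = C[0][0]
    stats["total_count"] = sum(sum(layer) for layer in C)
    stats["max_coins_received"] = max(max(layer) for layer in coins_received)
    return stats
```

Because every coin moves exactly τ(l) units from one counter to another, the sum of all counters always equals the number of triggers delivered, which is what the end-of-round procedure relies on in the sequential setting.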

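Example

[Figure: seven processors in three layers, with root G, layer-1 processors E and F, and leaves A, B, C, D. w’ = 96, τ(1) = 4, τ(2) = 2. The round ends when the root counter reaches 49 (> w’/2 = 48). w’ for next round = 96 - 53 = 43.]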

Analysis

Correctness: the system does not stall in the middle of a round once all the triggers have been delivered.

Message complexity is O(n log n log w).

MaxRcvLoad is bounded by O(log n log w) with high probability.

Correctness

Consider the state of the system in the middle of any round.

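x: any non-root processor at layer l, so $C(x) \le \tau(l) - 1$.

Summed over all non-root processors, the count held back is at most $\sum_{l=1}^{L-1} 2^{l}(\tau(l) - 1) < w'/4$.

A dead state (all w’ triggers delivered but the round not ended) thus implies C(r) > 3w’/4, contradicting C(r) <= w’/2.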

Message Complexity O(n log n log w)

log w rounds

Every coin sent from layer l to layer l-1 means that at least τ(l) triggers have been received at layer l in this round.

#coins sent from layer l to layer l-1 is at most w’/τ(l)

#coins sent in a particular round:

$\le \sum_{l=1}^{L-1} \dfrac{w'}{\tau(l)} = \sum_{l=1}^{L-1} 4 \cdot 2^{l} \log(n+1) = 4(n-1)\log(n+1) = O(n \log n)$

O(n) message exchanges for every end-of-round procedure.
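As a quick numerical check of the per-round coin bound, the sketch below sums w’/τ(l) over the non-root layers, assuming base-2 logarithms so that log(n+1) = L. Note that w’ cancels out of each term. The function name is hypothetical.

```python
def coins_per_round_bound(L):
    """Upper bound on coins per round: sum over l = 1..L-1 of w'/tau(l)."""
    n = 2**L - 1
    # w'/tau(l) = 4 * 2^l * log2(n + 1), and log2(n + 1) = L; w' cancels out.
    per_layer = [4 * 2**l * L for l in range(1, L)]
    return n, per_layer, sum(per_layer)
```

For the seven-processor example (L = 3) this gives at most 24 coins out of layer 1 and 48 out of layer 2, or 72 in total, which equals 4(n-1)·log(n+1). Multiplying the O(n log n) coins and the O(n) end-of-round messages by log w rounds gives the stated O(n log n log w).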

MaxRcvLoad O(log n log w) w.h.p.

Prob[MaxRcvLoad exceeds c log n log w] < n^(-(c-1)), for any constant c >= 48

In any given round, #coins received by layer l <= w’/τ(l+1) = 4 · 2^(l+1) · log n (here log n is shorthand for log(n+1))

Each coin is sent uniformly and independently at random to one of the 2^l processors occupying layer l.

Mx: r.v. denoting the number of coins received by x

E[Mx] <= 8 log n log w (at most 8 log n per round, over log w rounds)

Prob[Mx > 8a log n log w] < 2^(-8a log n log w) < n^(-8a), for a >= 6

The result above follows by applying the union bound over all n processors, with c = 8a.
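The tail inequality above has the shape of the standard Chernoff bound Pr[X >= R] <= 2^(-R) for R >= 6·E[X]; that identification is an assumption, since the slides do not name the bound. The sketch below checks it exactly for one processor in one round, modelling the coins entering layer l as independent uniform choices. Function names and the chosen parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_upper_tail(m, p, r):
    """Exact Pr[X >= r] for X ~ Binomial(m, p), computed with rationals."""
    return sum(comb(m, k) * p**k * (1 - p)**(m - k) for k in range(r, m + 1))


def one_round_check(L, l, a):
    """Compare the exact tail with 2^-R for one processor in layer l, one round."""
    coins = 4 * 2**(l + 1) * L          # coins entering layer l (log2(n+1) = L)
    p = Fraction(1, 2**l)               # each coin picks x with probability 2^-l
    mean = coins * p                    # = 8 * L
    R = int(a * mean)
    return binomial_upper_tail(coins, p, R), Fraction(1, 2**R)
```

Calling `one_round_check(5, 3, 6)` compares the exact probability that a layer-3 processor among n = 31 receives at least 240 of 320 coins against 2^(-240). The slides apply the same style of bound to the total over all log w rounds.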

Concurrency

We assumed that the triggers are delivered one at a time, and that all the processing required for handling a trigger is completed before the next trigger arrives.

Relax that assumption.

[Figure: the seven-processor example (leaves A, B, C, D; layer 1: E, F; root G) with w’ = 96, τ(1) = 4, τ(2) = 2. End of round is declared when the root reaches 49, but ΣC(x) = 53 instead of 55.]

Handling Concurrency

Triggers and coins received during a round are placed in a queue and processed one at a time.

Additional features for handling end-of-round:

Default queue and priority queue

    Unprocessed triggers and coins are placed in the default queue

    End-of-round messages are placed in the priority queue

    Default queue is serviced only when the priority queue is empty

Counters C(x), D(x) and RoundNum

    D(x): triggers processed by x since the beginning

    C(x): reset after every round
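The two-queue discipline can be sketched as follows. This is a minimal illustration with hypothetical names, showing only the service order; it does not model the suspension of the default queue during the end-of-round procedure.

```python
from collections import deque


class ProcessorQueues:
    """Two FIFO queues; the default queue is served only if priority is empty."""

    def __init__(self):
        self.default = deque()    # unprocessed triggers and coins
        self.priority = deque()   # end-of-round messages

    def enqueue(self, message, end_of_round=False):
        (self.priority if end_of_round else self.default).append(message)

    def next_message(self):
        """Return the next message to process, or None if both queues are empty."""
        if self.priority:
            return self.priority.popleft()
        if self.default:
            return self.default.popleft()
        return None
```

An end-of-round message enqueued after a trigger and a coin is still returned first, so a processor reacts to the end of a round without first draining its backlog.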

Thank You!!

Backup

End-of-round procedure

Processors arranged in a tree. Four phases.

First Phase: root initiates RoundReset message

A processor x on receiving RoundReset:

    suspends processing of the default queue until end of round, i.e., D(x) value is not modified further till the new round

    Non-leaf processor forwards it to its children

    Leaf processor initiates the second phase

[Figure: the seven-processor example (w’ = 96) at end of round; one processor is annotated with D(x) = 2.]

Second Phase

Leaf processor initiates Reduce message containing its D(x) value

A processor x on receiving Reduce from its children:

    Non-root processor adds its D(x) value to the sum and forwards it to its parent

    Root processor computes the new w’, leading to termination or the next round.

[Figure: the seven-processor example at end of round. ΣD(x) = 55, so new w’ = 96 - 55 = 41.]
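The Reduce phase amounts to a bottom-up sum over the tree. The sketch below assumes the processors are stored in heap order (children of index i at 2i+1 and 2i+2) and that the D(x) values cover exactly the triggers not yet subtracted from w’, as in the first round of the example. The individual D values used in the usage note are made up; only the total of 55 and the resulting w’ = 41 come from the slides.

```python
def reduce_sum(D, i=0):
    """Sum D over the subtree rooted at index i (children of i are 2i+1 and 2i+2)."""
    total = D[i]
    for child in (2 * i + 1, 2 * i + 2):
        if child < len(D):
            total += reduce_sum(D, child)   # child's Reduce message carries its subtree sum
    return total


def new_deficit(w_prime, D):
    """Root's computation: returns (sum of D, new w', whether counting is finished)."""
    total = reduce_sum(D)
    remaining = w_prime - total
    return total, remaining, remaining <= 0
```

With the hypothetical values `D = [3, 8, 7, 10, 9, 2, 16]`, `new_deficit(96, D)` returns `(55, 41, False)`, so the root would start another round with w’ = 41.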

Third Phase

Root broadcasts the new w’ by Inform message.

Every non-leaf processor forwards it to its children.

Leaf processors on receiving Inform message initiate the fourth phase.

[Figure: the seven-processor example at end of round; the root broadcasts new w’ = 41, and τ(1) = 2.]

Fourth Phase

Each processor in this phase performs the following:

    RoundNum is incremented. This signifies a new round, i.e., the processor does not process any coin from the previous rounds.

    C(x) is reset to zero.

    InformAck message is sent to its parent.

    Processing of the default queue is resumed.

System (all processors) enters next round when root receives InformAck

[Figure: the seven-processor example entering the next round with w’ = 41; a coin from the previous round is discarded.]
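One way to realize "does not process any coin from the previous rounds" is to tag each coin with the sender's RoundNum and compare on receipt. The slides do not say how coins are identified, so the tagging scheme and all names below are assumptions.

```python
class RoundState:
    """Per-processor state touched by the fourth phase."""

    def __init__(self):
        self.round_num = 0
        self.C = 0

    def start_next_round(self):
        """Fourth phase: bump RoundNum and reset C(x)."""
        self.round_num += 1
        self.C = 0

    def on_coin(self, coin_round, value):
        """Apply a coin only if it was sent in the current round; else discard it."""
        if coin_round != self.round_num:
            return False          # stale coin from a previous round
        self.C += value
        return True
```

Discarding a stale coin loses no triggers in this reading, because the triggers it stood for were already counted through the D(x) values summed in the Reduce phase.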
