
Distributed Computing Group

Distributed Asymmetric Verification in Computational Grids

Michael Kuhn, Stefan Schmid, Roger Wattenhofer

IPDPS 2008, Miami, Florida, USA

Michael Kuhn, ETH Zurich @ IPDPS 2008

Grid Computing

• Goal: Use idle computing resources worldwide
  – Examples: seti@home, folding@home, ...


Model

[Figure: a server and its clients (participants); #clients: 10^4 to 10^6. The server sends work-units (WU) to the clients; each client returns a result.]


• Why should people participate?
  – Incentives: honour („user of the day“), money, ...
• But: Incentives attract cheaters!

The Problem of Cheaters


• Verification required
  – Today: Redundancy
    – E.g. seti@home: Send the same task to 3 participants
  – This paper: Distributed asymmetric checking
    – Asymmetry: Verification is often cheaper than computation
    – Distributed: Participants check each other

[Figure: diagram with the label "Random value"]


Contributions

• Distributed checking algorithm integrated in BOINC
  – Faster than redundancy if asymmetric checking function exists
  – Better guarantees than typically used redundancy schemes
• Resistant against
  – Dominance of seemingly fast clients
  – Lazy checking
  – Sybil attacks
• Proof-of-concept implementation for discrete logarithm problem
  – Asymmetric checking function for Pollard-rho algorithm


Related Work

• Cheating is a problem [Kahney, Wired Magazine, Feb. 2001]
  – Seti@home: more than 50% of resources spent on cheating!
• Ringer scheme (precompute selected results) [Golle and Mironov, CT-RSA '01], [Szajda et al., SP '03]
  – Additional work on the server (precomputing ringers)
• Commitment scheme with Merkle tree [Du et al., ICDCS '04]
  – Additional work on the server (recompute some work-units)
• Cryptographic protocols, e.g. [Aiello et al., ICALP '00], [Cachin et al., EUROCRYPT '99]
  – Often computationally too expensive in practice


Challenges

• Cheaters seem to be much faster than ordinary participants
  – 1% cheaters can submit >99% of the results
• Cheaters stay honest for a long time, and only then start to cheat
  – Opens many possibilities for cheaters
  – Never trust a participant
• Lazy checking
  – Cheaters can calculate everything correctly but cheat during verification (i.e. simply say the result was correct)

[Figure: chart comparing honest participants and cheaters by share of clients and share of results; axis from 0 to 100]
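The claim that 1% cheaters can submit more than 99% of the results follows from simple arithmetic. The sketch below assumes a model that is not stated on the slide: every client is active equally long, and a cheater fabricates results `speedup` times faster than an honest client computes real ones. The function name and the `speedup` parameter are illustrative.

```python
def cheater_result_share(cheater_fraction: float, speedup: float) -> float:
    """Fraction of all submitted results that come from cheaters.

    Assumed model (not from the slides): an honest client submits 1 result
    per time unit, a cheater submits `speedup` fabricated results per time unit.
    """
    cheater_output = cheater_fraction * speedup
    honest_output = (1.0 - cheater_fraction) * 1.0
    return cheater_output / (cheater_output + honest_output)


if __name__ == "__main__":
    for speedup in (1, 100, 10_000):
        share = cheater_result_share(0.01, speedup)
        print(f"speedup {speedup:>6}: cheaters submit {share:.2%} of results")
```

Under this model, 1% cheaters exceed a 99% share as soon as fabricating a result is more than 99 * 99 = 9801 times faster than computing one.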


Cheater Characterization

• Fraction of cheating clients: p
  – Problem: Sybil attacks
• Fraction of incorrect results in the system: r
  – Problem: Random results can be computed very fast
  – If no countermeasures are taken: r = 1
• Fraction of computing power of cheater(s): q
  – Computing power is expensive => q is limited!
  – Goal of cheaters: Pretend to have worked more than what is possible with the available computing resources

[Figure: diagram relating r, p and q]


Asymmetric Verification

• Performance property: It is much cheaper to verify the correctness of a result than to calculate the result
  – Asymmetric
• Fingerprint property: A verifier calculates a fingerprint rather than a boolean result
  – Server compares fingerprints
  – Only honestly computed checks can lead to a positive result
  – Observe: Collusions still possible
• Uniqueness property: Results are either inherently unique, or their dependence on the input values can be verified
  – Prevents replay attacks

Example:
  Task: Find prime factors of x = 10829; Solution: {7, 7, 13, 17}
  Checking input: {7, 7, 13, 17}; Fingerprint: 7 * 7 * 13 * 17 = 10829
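The factoring example can be written down as a minimal sketch. The function names are hypothetical and not taken from the BOINC integration; the point is only that the verifier returns a computed value (the fingerprint) and the server does the comparison, so a verifier cannot simply answer "correct".

```python
from math import prod


def compute_fingerprint(checking_input: list[int]) -> int:
    """Verifier side: multiply the submitted factors to obtain the fingerprint."""
    return prod(checking_input)


def server_accepts(task_x: int, reported_fingerprint: int) -> bool:
    """Server side: compare the verifier's fingerprint with the expected value."""
    return reported_fingerprint == task_x


if __name__ == "__main__":
    fingerprint = compute_fingerprint([7, 7, 13, 17])
    print(fingerprint, server_accepts(10829, fingerprint))
```

Multiplying four numbers is far cheaper than factoring 10829, which is the performance property in miniature.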


Distributed Verification: Algorithm

• Prerequisites
  – Fraction of cheaters is limited and considerably smaller than 50% (e.g. p ≤ 10%) => details later
  – Punishment is possible
• Check each result, until a clear decision is possible
  – Result good if „considerably more“ positive than negative checks (and vice versa)
  – As p is limited, high probability of correct decision (see paper for details)
  – Punish cheaters (including colluders) and remove all their pending results
• Assign checks uniformly at random among active clients
  – Fast clients (often cheaters) cannot dominate checking
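The checking loop can be sketched as follows. This is an illustration under stated assumptions, not the rule from the paper: "considerably more" is modelled as a fixed lead of `margin` checks (a hypothetical parameter), and a cheating checker is assumed to always report the wrong outcome.

```python
import random


def decide_result(result_is_correct, checker_is_cheater, margin, rng):
    """Assign checks one at a time to uniformly random active clients and
    stop once positive and negative checks differ by `margin`."""
    positive = negative = 0
    while abs(positive - negative) < margin:
        # Uniform choice over clients, independent of how fast a client is
        cheater = rng.choice(checker_is_cheater)
        # Assumed worst case: a cheating checker reports the opposite outcome
        vote = result_is_correct != cheater
        if vote:
            positive += 1
        else:
            negative += 1
    return "good" if positive > negative else "bad"


if __name__ == "__main__":
    clients = [True] * 10 + [False] * 90   # p = 10% cheaters
    print(decide_result(True, clients, margin=5, rng=random.Random(1)))
```

Because the checker is drawn uniformly from the client list, a client that returns results quickly is not chosen more often than any other client.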


Lifecycle of a Task

• Server creates WU (input x)
• Client 1 computes result and adds fingerprint
• Server chooses client 2 uniformly at random and stores fingerprint
• Client 2 computes result and adds fingerprint
• Server chooses client 3 uniformly at random and stores fingerprint
• Result good, as a „large majority“ of fingerprints match the original one. Save result in DB
• Checks are assigned one after the other, to mitigate collusion

[Figure: task lifecycle with steps numbered 1 to 5; messages labelled x, f(x) and c(x, f(x))]


Preventing Sybil Attacks

• Problem: Zero cost identity
  – Solution: Don't assign identity for free!
• Idea: Couple p (#clients) to q (computing resources)
  – New client has to perform some work without getting credits => buys identity
  – Goal: make the number of incorrect results a cheater can deliver before being detected lower than the price to buy the identity
  – Observe: For honest participants the price is low (as they only have to „pay“ once)
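The stated goal amounts to a simple inequality. The sketch below measures everything in work-units; the function and its parameters are illustrative assumptions, not part of the presented system.

```python
def sybil_net_gain(identities, entry_price_wu, bad_results_before_detection):
    """Net gain (in work-units) of an attacker who buys `identities` identities.

    Assumptions for illustration: each identity costs `entry_price_wu` units
    of uncredited real work, and each identity gets credit for at most
    `bad_results_before_detection` fabricated results before it is punished.
    """
    gain_per_identity = bad_results_before_detection - entry_price_wu
    return identities * gain_per_identity


if __name__ == "__main__":
    # Free identities: every additional Sybil identity pays off
    print(sybil_net_gain(100, entry_price_wu=0, bad_results_before_detection=5))
    # Entry price above the detection bound: every identity is a loss
    print(sybil_net_gain(100, entry_price_wu=20, bad_results_before_detection=5))
```

Once the entry price exceeds the number of bad results an identity can deliver, creating more identities only increases the attacker's loss.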


Analysis (Simulation)

• Number of checks vs. number of results (p = 10%)
  – Asymmetry: Checking is 50 times faster than calculation
  – Fastest clients 100 times faster than slowest

Fast clients do not dominate checking!


Analysis (Simulation) (2)

• Queue lengths
  – Number of pending checks for different confidence values


Implementation in BOINC


ECC Challenge

• Task: Break large discrete logarithm on elliptic curve
  – Currently: 130-bit
  – Reward: 20,000 USD
• Discrete Logarithm
  – Given a group with generator g, as well as a group element h: Find x, such that g^x = h
• Best known algorithm: Pollard-Rho
  – Well suited for parallelization and use in grids
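To make the discrete logarithm problem and Pollard-Rho concrete, here is a toy sketch. It deliberately departs from the challenge setting: it works in a small prime-order subgroup of the integers modulo p instead of an elliptic curve, and it uses Floyd cycle detection on a single machine rather than the distinguished-point variant used for parallel runs. The parameters p = 2039, n = 1019, g = 4 are chosen only for illustration.

```python
import random


def pollard_rho_dlog(g, h, p, n, rng):
    """Return x with pow(g, x, p) == h, where g has prime order n modulo p."""

    def step(x, a, b):
        # Invariant: x == g^a * h^b (mod p). The three-way split makes the
        # walk behave roughly like a random function.
        if x % 3 == 0:
            return (x * x) % p, (2 * a) % n, (2 * b) % n
        if x % 3 == 1:
            return (x * g) % p, (a + 1) % n, b
        return (x * h) % p, a, (b + 1) % n

    while True:
        a0, b0 = rng.randrange(n), rng.randrange(n)
        start = ((pow(g, a0, p) * pow(h, b0, p)) % p, a0, b0)
        # Floyd cycle detection: the hare moves two steps per tortoise step
        tortoise, hare = step(*start), step(*step(*start))
        while tortoise[0] != hare[0]:
            tortoise = step(*tortoise)
            hare = step(*step(*hare))
        _, a1, b1 = tortoise
        _, a2, b2 = hare
        # g^a1 * h^b1 == g^a2 * h^b2  =>  x * (b1 - b2) == a2 - a1 (mod n)
        diff = (b1 - b2) % n
        if diff != 0:
            return ((a2 - a1) * pow(diff, -1, n)) % n
        # Useless collision: retry from a fresh random starting point


if __name__ == "__main__":
    p, n, g = 2039, 1019, 4          # toy parameters, g has order n modulo p
    h = pow(g, 777, p)
    x = pollard_rho_dlog(g, h, p, n, random.Random(0))
    print(x, pow(g, x, p) == h)
```

The modular inverse via `pow(diff, -1, n)` requires Python 3.8 or later.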


Pollard-Rho (Sketch)

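[Figure: walks starting at x0, f(x0), ... and x1, f(x1), ... pass through normal points and end in distinguished points d1,1, d1,2, d1,3, d2,1, d2,2, d3,1, d3,2; a collision is marked at d1,3 = d2,2. Legend: normal point, distinguished point]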


Asymmetric Verification (Sketch)

• Not every point possesses a predecessor
  – Backward iteration has high probability to fail after a certain number of steps
• Finding a distinguished point together with the required parameters is asymptotically as expensive as forward iteration
• Checking function: Report the x-th predecessor
  – Verifier can forward iterate x steps and check whether the distinguished point is found

[Figure: probability (0 to 1) plotted against iterations (0 to 250); annotation: P(length > 50) < 10%]
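The predecessor check can be sketched with a toy walk. Everything concrete here is an assumption made for illustration: the iteration function z -> z^2 + 1 mod P stands in for the real Pollard-Rho step, a point counts as distinguished when its four lowest bits are zero, and the reported predecessor lies K = 20 steps back.

```python
from collections import deque

P = 1_000_003   # toy modulus (assumption, not the ECC challenge group)
K = 20          # how many steps back the reported predecessor lies


def step(z):
    """Toy iteration function standing in for the real Pollard-Rho walk."""
    return (z * z + 1) % P


def is_distinguished(z):
    """Toy criterion: the four lowest bits are zero."""
    return z % 16 == 0


def walk_to_distinguished(start, max_steps=100_000):
    """Client: iterate forward, return (K-th predecessor, distinguished point)."""
    trail = deque([start], maxlen=K + 1)   # the last K + 1 points of the walk
    z = start
    for _ in range(max_steps):
        z = step(z)
        trail.append(z)
        if len(trail) == K + 1 and is_distinguished(z):
            return trail[0], z
    return None   # no distinguished point found within max_steps


def check(predecessor, claimed_point):
    """Verifier: K forward steps from the predecessor must hit the claimed point."""
    z = predecessor
    for _ in range(K):
        z = step(z)
    return z == claimed_point and is_distinguished(claimed_point)


if __name__ == "__main__":
    found = next(r for r in map(walk_to_distinguished, range(1, 200)) if r)
    print(found, check(*found))
```

The verifier performs only K steps regardless of how long the client's walk was. In this toy function a point z has a predecessor only if z - 1 is a square modulo P, which mirrors the observation that not every point possesses a predecessor.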


Conclusions

• Algorithm for distributed verification in volunteer computing, which is resistant against:
  – Seemingly fast clients (uniform selection of verifier among all active clients)
  – Lazy checking (fingerprint property)
  – Replay attacks (uniqueness property)
  – Sybil attacks (don't assign identity for free)
• Downside: Strong assumption on the verification function
  – But: such verification functions exist (Pollard-Rho)
• Future: More generic approaches


Thanks for your Interest

• Questions?