Distributed Computing Group
Distributed Asymmetric Verification in Computational Grids
Michael Kuhn, Stefan Schmid, Roger Wattenhofer
IPDPS 2008, Miami, Florida, USA
Michael Kuhn, ETH Zurich @ IPDPS 2008
Grid Computing
• Goal: Use idle computing resources worldwide
  – Examples: seti@home, folding@home, ...
Model
• Server: sends work-units (WU) to the clients
• Clients (participants): return results to the server
• #clients: 10^4 to 10^6
The Problem of Cheaters
• Why should people participate?
  – Incentives: honour ("user of the day"), money, ...
• But: Incentives attract cheaters!
The Problem of Cheaters
• Verification required
  – Today: Redundancy
    – E.g. seti@home: Send the same task to 3 participants
  – This paper: Distributed asymmetric checking
    – Asymmetry: Verification is often cheaper than computation
    – Distributed: Participants check each other
[Figure label: Random value]
Contributions
• Distributed checking algorithm integrated in BOINC
  – Faster than redundancy if asymmetric checking function exists
  – Better guarantees than typically used redundancy schemes
• Resistant against
  – Dominance of seemingly fast clients
  – Lazy checking
  – Sybil attacks
• Proof-of-concept implementation for discrete logarithm problem
  – Asymmetric checking function for Pollard-rho algorithm
Related Work
• Cheating is a problem [Kahney, Wired Magazine, Feb. 2001]
  – Seti@home: more than 50% of resources spent on cheating!
• Ringer scheme (precompute selected results) [Golle and Mironov, CT-RSA'01], [Szajda et al., SP'03]
  – Additional work on the server (precomputing ringers)
• Commitment scheme with Merkle-tree [Du et al., ICDCS'04]
  – Additional work on the server (recompute some work-units)
• Cryptographic protocols, e.g. [Aiello et al., ICALP'00], [Cachin et al., EUROCRYPT'99]
  – Often computationally too expensive in practice
Challenges
• Cheaters seem to be much faster than ordinary participants
  – 1% cheaters can submit >99% of the results
• Cheaters stay honest for a long time, and only then start to cheat
  – Opens many possibilities for cheaters
  – Never trust a participant
• Lazy checking
  – Cheaters can calculate everything correctly but cheat during verification (i.e. simply say the result was correct)
[Figure: bar chart comparing the shares of honest participants and cheaters among clients and among results; axis from 0 to 100]
Cheater Characterization
• Fraction of cheating clients: p
  – Problem: Sybil attacks
• Fraction of incorrect results in the system: r
  – Problem: Random results can be computed very fast
  – If no countermeasures are taken: r = 1
• Fraction of computing power of cheater(s): q
  – Computing power is expensive => q is limited!
  – Goal of cheaters: Pretend to have worked more than what is possible with the available computing resources
[Figure: diagram relating r, p and q]
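How far r can drift away from p is a matter of simple arithmetic. The following is a minimal sketch, assuming that every cheater returns (random) results k times as fast as an honest client computes real ones. The speed-up factor k and the function name are illustrative assumptions and do not appear in the talk.

```python
def incorrect_result_fraction(p, k):
    """Fraction r of incorrect results in the system.

    p: fraction of cheating clients (as on the slide)
    k: assumed speed-up of a cheater over an honest client (hypothetical)
    """
    cheater_rate = p * k        # results per time unit submitted by cheaters
    honest_rate = 1.0 - p       # results per time unit submitted by honest clients
    return cheater_rate / (cheater_rate + honest_rate)


# Same speed: the share of incorrect results equals the share of cheaters.
r_equal = incorrect_result_fraction(p=0.01, k=1)

# Returning random values 10,000 times faster: 1% of the clients
# account for more than 99% of all results.
r_fast = incorrect_result_fraction(p=0.01, k=10_000)
```

Under this assumption r exceeds 99% for p = 1% as soon as k is larger than 9801, and it approaches 1 as k grows, which matches the "r = 1" case without countermeasures.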
Asymmetric Verification
• Performance property: It is much cheaper to verify the correctness of a result than to calculate the result
  – Asymmetric
• Fingerprint property: A verifier calculates a fingerprint rather than a boolean result
  – Server compares fingerprints
  – Only honestly computed checks can lead to a positive result
  – Observe: Collusions still possible
• Uniqueness property: Results are either inherently unique, or their dependence on the input values can be verified
  – Prevents replay attacks
Example:
  – Task: Find prime factors of x = 10829; Solution: {7, 7, 13, 17}
  – Checking input: {7, 7, 13, 17}; Fingerprint: 7 * 7 * 13 * 17 = 10829
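The factoring example can be written down directly. This is a minimal sketch with hypothetical function names (`fingerprint`, `server_accepts`). It only shows the division of labour: the verifier returns a number, and the server, not the verifier, makes the decision.

```python
from math import prod


def fingerprint(claimed_factors):
    """Verifier side: multiply the claimed factors and return the product.

    The verifier returns a value, not a yes/no verdict, so a lazy verifier
    cannot simply answer "correct" without doing the multiplication.
    """
    return prod(claimed_factors)


def server_accepts(x, reported_fingerprint):
    """Server side: compare the reported fingerprint with the task input x."""
    return reported_fingerprint == x


x = 10829
fp = fingerprint([7, 7, 13, 17])   # 7 * 7 * 13 * 17
accepted = server_accepts(x, fp)
```

As on the slide, the sketch checks only the product. Whether each factor is actually prime would need a separate test, which is not part of the slide's example.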
Distributed Verification: Algorithm
• Prerequisites
  – Fraction of cheaters is limited and considerably smaller than 50% (e.g. p ≤ 10%) => details later
  – Punishment is possible
• Check each result, until a clear decision is possible
  – Result good if "considerably more" positive than negative checks (and vice versa)
  – As p is limited, high probability of correct decision (see paper for details)
  – Punish cheaters (including colluders) and remove all their pending results
• Assign checks uniformly at random among active clients
  – Fast clients (often cheaters) cannot dominate checking
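The slide leaves the meaning of "considerably more" to the paper. The sketch below substitutes a hypothetical fixed lead (`margin`) of one vote count over the other, and assumes the worst case that every cheating verifier reports the wrong outcome. The names `decide`, `margin` and `cheater_flags` are illustrative assumptions, not the authors' decision rule.

```python
import random


def decide(result_correct, cheater_flags, producer, margin=3, rng=None):
    """Check one result repeatedly until one side leads by `margin` votes.

    result_correct: whether the submitted result is actually correct
    cheater_flags:  cheater_flags[i] is True if client i cheats
    producer:       index of the client that produced the result
    Returns True if the server accepts the result.
    """
    rng = rng or random.Random()
    # Every active client except the producer is an equally likely verifier,
    # no matter how fast it appears to be.
    candidates = [i for i in range(len(cheater_flags)) if i != producer]
    positive = negative = 0
    while abs(positive - negative) < margin:
        verifier = rng.choice(candidates)          # uniform at random
        # Honest verifiers report the truth; cheaters report the opposite.
        vote = result_correct != cheater_flags[verifier]
        if vote:
            positive += 1
        else:
            negative += 1
    return positive > negative


# 100 clients, 10% of them cheaters (p = 10%, as in the example on the slide).
flags = [True] * 10 + [False] * 90
accepted = decide(result_correct=True, cheater_flags=flags, producer=50,
                  rng=random.Random(0))
```

With p well below 50% the vote difference drifts towards the correct side, so a small margin already makes a wrong decision unlikely. A larger margin trades additional checks for more confidence.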
Lifecycle of a Task
• Server creates WU (input x)
• Client 1 computes result f(x) and adds fingerprint
• Server chooses client 2 uniformly at random and stores fingerprint
• Client 2 computes result and adds fingerprint c(x, f(x))
• Server chooses client 3 uniformly at random and stores fingerprint
  – Checks are assigned one after the other, to mitigate collusion
• Result good, as a "large majority" of fingerprints match the original one
• Save result in DB
Preventing Sybil Attacks
• Problem: Zero cost identity
  – Solution: Don't assign identity for free!
• Idea: Couple p (#clients) to q (computing resources)
  – New client has to perform some work without getting credits => buys identity
  – Goal: make the number of incorrect results a cheater can deliver before being detected lower than the price to buy the identity
  – Observe: For honest participants the price is low (as they only have to "pay" once)
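The pricing argument can be restated as two small calculations. This is a sketch under stated assumptions: the price and the results are both measured in work-units, and the function and parameter names are hypothetical and not taken from the talk.

```python
def sybil_unprofitable(entry_work_units, bad_results_before_detection):
    """Goal from the slide: a fresh identity must cost more work than the
    number of incorrect results it can deliver before being detected."""
    return bad_results_before_detection < entry_work_units


def honest_overhead(entry_work_units, lifetime_work_units):
    """Fraction of an honest participant's total work spent on the entry
    price, which is paid only once."""
    return entry_work_units / (entry_work_units + lifetime_work_units)


# Illustrative numbers only: identity costs 20 uncredited work-units,
# a cheater gets at most 5 incorrect results through before detection.
ok = sybil_unprofitable(entry_work_units=20, bad_results_before_detection=5)

# An honest client that later computes 1980 credited work-units
# spends 1% of its total work on the identity.
overhead = honest_overhead(entry_work_units=20, lifetime_work_units=1980)
```

The numbers are chosen only to make the two effects visible: the cheater pays the price again for every new identity, while the honest participant amortises it over a long lifetime.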
Analysis (Simulation)
• Number of checks vs. number of results (p = 10%)
  – Asymmetry: Checking is 50 times faster than calculation
  – Fastest clients 100 times faster than slowest
Fast clients do not dominate checking!
Analysis (Simulation) (2)
• Queue lengths
  – Number of pending checks for different confidence values
Implementation in BOINC
ECC Challenge
• Task: Break large discrete logarithm on elliptic curve
  – Currently: 130-bit
  – Reward: 20,000 USD
• Discrete Logarithm
  – Given a group with generator g, as well as a group element h: Find x, such that g^x = h
• Best known algorithm: Pollard-Rho
  – Well suited for parallelization and use in grids
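The discrete logarithm itself already shows the asymmetry between solving and checking. The following toy sketch uses the multiplicative group modulo the prime 101 as a stand-in for the elliptic curve group of the challenge, and brute force instead of Pollard-Rho. The group, the numbers and the function names are assumptions chosen for illustration.

```python
def discrete_log_bruteforce(g, h, p):
    """Find x with g^x = h (mod p) by trying x = 0, 1, 2, ...

    Takes up to p - 1 multiplications. Returns None if no such x exists.
    """
    value = 1
    for x in range(p - 1):
        if value == h:
            return x
        value = (value * g) % p
    return None


def verify(g, h, p, x):
    """Check a claimed solution with a single modular exponentiation."""
    return pow(g, x, p) == h


p, g = 101, 2            # 2 generates the multiplicative group mod 101
h = pow(g, 57, p)        # group element whose logarithm is sought
x = discrete_log_bruteforce(g, h, p)
```

Finding x costs a number of steps that grows with the group size, while `verify` needs only one call to `pow`. At 130 bits the gap is what makes checking by other participants affordable.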
Pollard-Rho (Sketch)
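[Figure: iteration trails starting at x0 and x1 under the iteration function f; points are marked as normal points or distinguished points (d1,1, d1,2, d1,3, d2,1, d2,2, d3,1, d3,2); two trails meet at the distinguished point d1,3 = d2,2]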
Asymmetric Verification (Sketch)
• Not every point possesses a predecessor
  – Backward iteration has high probability to fail after a certain number of steps
• Finding a distinguished point together with the required parameters is asymptotically as expensive as forward iteration
• Checking function: Report the x-th predecessor
  – Verifier can forward iterate x steps and check whether the distinguished point is found
[Figure: probability (0 to 1) vs. iterations (0 to 250); annotation: P(length > 50) < 10%]
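The checking function can be illustrated with a toy iteration. In this sketch the map v -> v^2 + 1 (mod 1009) stands in for the Pollard-Rho iteration on the curve, and "divisible by 8" stands in for the distinguished-point criterion. Both choices, and all names, are assumptions made for illustration and are not the functions used in the implementation.

```python
MODULUS = 1009   # toy modulus (assumption)


def step(v):
    """One forward iteration of the toy map."""
    return (v * v + 1) % MODULUS


def is_distinguished(v):
    """Toy distinguished-point criterion: the three low bits are zero."""
    return v % 8 == 0


def walk_to_distinguished(start, max_steps=10_000):
    """Worker: iterate forward until a distinguished point is reached."""
    trail = [start]
    while not is_distinguished(trail[-1]):
        if len(trail) > max_steps:
            raise RuntimeError("no distinguished point found")
        trail.append(step(trail[-1]))
    return trail


def check_fingerprint(predecessor, x):
    """Verifier: iterate x steps forward from the reported x-th predecessor
    and return the point reached. The server compares it with the claimed
    distinguished point."""
    v = predecessor
    for _ in range(x):
        v = step(v)
    return v


trail = walk_to_distinguished(2)       # 15 steps, ends at 952
x = 3
predecessor = trail[-1 - x]            # the x-th predecessor: 801
fp = check_fingerprint(predecessor, x)
```

The worker pays for the whole trail, the verifier only for x steps. In this toy map a value v has a predecessor only if v - 1 is a square modulo 1009, which mirrors the observation that backward iteration from a distinguished point tends to fail quickly.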
Conclusions
• Algorithm for distributed verification in volunteer computing, which is resistant against:
  – Seemingly fast clients (uniform selection of verifier among all active clients)
  – Lazy checking (fingerprint property)
  – Replay attacks (uniqueness property)
  – Sybil attacks (don't assign identity for free)
• Downside: Strong assumption on the verification function
  – But: such verification functions exist (Pollard-Rho)
• Future: More generic approaches
Thanks for your Interest
• Questions?