Transcript
Page 1: Accountable systems  or how to catch a liar?

Accountable systems or how to catch a liar?

Jinyang Li (with slides from authors of SUNDR and PeerReview)

Page 2: Accountable systems  or how to catch a liar?

What we have learnt so far

• Use BFT replication to handle (a few) malicious servers
• Strengths:
  – Transparent (service goes on as if no faults)
  – Minimal client modifications
• Weaknesses:
  – Breaks down completely if >1/3 servers are bad

Page 3: Accountable systems  or how to catch a liar?

Hold (bad) servers accountable

• What is possible when >1/3 servers fail?
  – E.g. 1 out of 1 servers fail?
• Let us consider a bad file server
  – Can it delete a client's data?
  – Can it leak the content of a client's data?
  – Can it show client A garbage and claim it's from B?
  – Can it show client A old data of client B?
  – Can it show A (op1, op2) and B (op2, op1)?

Page 4: Accountable systems  or how to catch a liar?

Hold (bad) servers accountable

• SUNDR's observations:
  – Cannot prevent a bad server from misbehaving at least once
  – Trustworthy clients can detect inconsistencies due to server misbehavior!
• Useful?
  – Fool me once, shame on you; fool me twice, shame on …

Page 5: Accountable systems  or how to catch a liar?

Case study I: SUNDR

• What's SUNDR?
  – A (single server) network file system
  – Handles potential Byzantine server behavior
  – Can run on an untrusted server
• Useful properties:
  – Tamper-evident
    • Unauthorized operations will be immediately detected
  – Can detect past misbehavior
    • If server drops operations, it can be caught eventually

Page 6: Accountable systems  or how to catch a liar?

Ideal File system semantics

• Represent FS calls as fetch/modify operations
  – Fetch: client downloads new data
  – Modify: client makes new change visible to others
• Ideal (sequential) consistency:
  – A fetch reflects the sequence of modifications that happen before it
  – Impossible when the server is malicious
    • A is only aware of B's latest modification via the server
• Goal:
  – get as close to ideal consistency as possible

Page 7: Accountable systems  or how to catch a liar?

Strawman File System

[Figure: userA and userB exchange a log of signed operations with the file server.]

Log held by the file server:
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  A  Modify f2  sig3
  B  Fetch  f2  sig4

A: echo "A was here" >> /share/aaa
B: cat /share/aaa

Page 8: Accountable systems  or how to catch a liar?

Log specifies the total order

A's latest log (LogA):
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  A  Modify f2  sig3

B's latest log (LogB):
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  A  Modify f2  sig3
  B  Fetch  f2  sig4

The total order: LogA ≤ LogB iff LogA is a prefix of LogB
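The prefix order on logs can be sketched in a few lines. This is only an illustration: the tuple layout of a log entry and the function names are assumptions, and signatures are left out.

```python
from typing import Sequence, Tuple

# Hypothetical log entry: (user, operation, file). Signatures are omitted.
Op = Tuple[str, str, str]


def is_prefix(a: Sequence[Op], b: Sequence[Op]) -> bool:
    """Return True iff log a is a prefix of log b, i.e. a <= b."""
    return len(a) <= len(b) and list(b[:len(a)]) == list(a)


def comparable(a: Sequence[Op], b: Sequence[Op]) -> bool:
    """Two logs are ordered iff one of them is a prefix of the other."""
    return is_prefix(a, b) or is_prefix(b, a)


log_a = [("A", "Modify", "f1"), ("B", "Fetch", "f4"), ("A", "Modify", "f2")]
log_b = log_a + [("B", "Fetch", "f2")]
# A log in which the server hid A's last modification from B.
forked_b = log_a[:2] + [("B", "Fetch", "f2")]
```

With these logs, `is_prefix(log_a, log_b)` holds, while `comparable(log_a, forked_b)` does not: neither log is a prefix of the other, which is how a fork shows up.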

Page 9: Accountable systems  or how to catch a liar?

Detecting attacks by the server

[Figure: the file server conceals A's latest modification from B, so B's fetch returns a stale result.]

A: echo "A was here" >> /share/aaa
B: cat /share/aaa (stale result!)

Log seen by A:
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  A  Modify f2  sig3

Log seen by B:
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  B  Fetch  f2  sig3

Page 10: Accountable systems  or how to catch a liar?

Detecting attacks by the server

A's latest log (LogA):
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  A  Modify f2  sig3a

B's latest log (LogB):
  A  Modify f1  sig1
  B  Fetch  f4  sig2
  B  Fetch  f2  sig3b

A's log and B's log can no longer be ordered: LogA ≰ LogB, LogB ≰ LogA

Page 11: Accountable systems  or how to catch a liar?

What Strawman has achieved

• High overhead, no concurrency
• Tamper-evident
  – A bad server can't make up ops users didn't do
• Achieves fork consistency
  – A bad server can conceal users' ops from each other, but such misbehavior could be detected later

Page 12: Accountable systems  or how to catch a liar?

Fork Consistency: A tale of two worlds

[Figure: the file server presents two diverging histories, A's view and B's view.]

Page 13: Accountable systems  or how to catch a liar?

Fork consistency is useful

• Best possible alternative to sequential consistency
  – If server lies only once, it has to lie forever
• Enables detection of misbehavior:
  – users periodically gossip to check violations
  – or deploy a trusted online "timestamp" box

Page 14: Accountable systems  or how to catch a liar?

SUNDRā€™s tricks to make strawman practical

1. Store FS as a hash tree
   – No need to reconstruct FS image from log
2. Use version vector to totally order ops
   – No need to fetch entire log to check for misbehavior

Page 15: Accountable systems  or how to catch a liar?

Trick #1: Hash tree

[Figure: hash tree. Data blocks D0 to D12 are covered by hashes h0 to h12, with h0 at the root.]

Key property: h0 verifies the entire tree of data
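A minimal sketch of the key property, under simplifying assumptions: data sits only at the leaves (the slide's tree also hangs data off interior nodes), the fanout of 3 is arbitrary, and SHA-1 is used only because it yields 20-byte digests like those on the later slides. The function names are hypothetical.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """20-byte digest (SHA-1 is an assumption made for this sketch)."""
    return hashlib.sha1(data).digest()


def tree_root(blocks: List[bytes], fanout: int = 3) -> bytes:
    """Hash the blocks, then repeatedly hash groups of child hashes
    until a single root hash remains."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [h(b) for b in blocks]
    while len(level) > 1:
        level = [h(b"".join(level[i:i + fanout]))
                 for i in range(0, len(level), fanout)]
    return level[0]


blocks = [f"D{i}".encode() for i in range(13)]
root = tree_root(blocks)

# Changing any single block changes the root.
tampered = list(blocks)
tampered[7] = b"garbage"
tampered_root = tree_root(tampered)
```

A client that holds only `root` can therefore notice a modified block anywhere in the tree, since `tampered_root` differs from `root`.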

Page 16: Accountable systems  or how to catch a liar?

Trick #2: version vector

• Each client keeps track of its version #
• Server stores the (freshest) version vector
  – Version vector orders all operations
    • E.g. (0, 1, 2) ≤ (1, 1, 2)
• A client remembers the latest version vector given to it by the server
  – If new vector does not succeed old one, detect order violation!
    • E.g. (0, 1, 2) ≰ (0, 2, 1)
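The comparison can be sketched as follows. The function names are assumptions for illustration; a vector is modeled as a tuple with one version number per client.

```python
from typing import Sequence


def vv_leq(a: Sequence[int], b: Sequence[int]) -> bool:
    """a <= b iff every entry of a is <= the matching entry of b."""
    if len(a) != len(b):
        raise ValueError("version vectors must have the same length")
    return all(x <= y for x, y in zip(a, b))


def order_violation(old: Sequence[int], new: Sequence[int]) -> bool:
    """A client flags a violation when the new vector does not succeed
    the one it remembered."""
    return not vv_leq(old, new)
```

Using the slide's examples, `vv_leq((0, 1, 2), (1, 1, 2))` is true, while (0, 1, 2) and (0, 2, 1) are incomparable in both directions, so `order_violation((0, 1, 2), (0, 2, 1))` is true.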

Page 17: Accountable systems  or how to catch a liar?

SUNDR architecture

• block server stores blocks retrievable by content-hash
• consistency server orders all events

[Figure: userA and userB each run a SUNDR client, which talks over an untrusted network to the SUNDR server side (consistency server and block server).]

Page 18: Accountable systems  or how to catch a liar?

SUNDR data structures: hash tree

• Each file is writable by one user or group
• Each user/group manages its own pool of inodes
  – A pool of inodes is represented as a hash tree

Page 19: Accountable systems  or how to catch a liar?

Hash files

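• Blocks are stored and indexed by hash on the block server

[Figure: a 20-byte file handle identifies the i-node. The i-node holds metadata plus H(data1), H(data2) and H(iblk1); the indirect block iblk1 holds H(data3) and H(data4); each hash names the corresponding block (data1 to data4).]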
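A content-addressed block store can be sketched as below. The class and function names are hypothetical, an in-memory dict stands in for the block server, and SHA-1 is assumed only because it gives 20-byte hashes as on the slide.

```python
import hashlib


class BlockStore:
    """Toy stand-in for an untrusted block server, indexed by hash."""

    def __init__(self) -> None:
        self.blocks = {}

    def put(self, data: bytes) -> bytes:
        key = hashlib.sha1(data).digest()  # 20-byte handle
        self.blocks[key] = data
        return key

    def get(self, key: bytes) -> bytes:
        return self.blocks[key]


def fetch_verified(store: BlockStore, key: bytes) -> bytes:
    """Client-side fetch: recompute the hash and reject a mismatch."""
    data = store.get(key)
    if hashlib.sha1(data).digest() != key:
        raise ValueError("block does not match its hash")
    return data
```

Because the key is the hash of the content, the client does not need to trust the server: any block it returns either matches the requested hash or is rejected by `fetch_verified`.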

Page 20: Accountable systems  or how to catch a liar?

Hash a pool of i-nodes

• Hash all files writable by each user/group
• From this digest, client can retrieve and verify any block of any file

[Figure: a 20-byte digest covers the i-table, which maps each i-num (2, 3, 4) to the 20-byte hash of the corresponding i-node (i-node 2, i-node 3, i-node 4).]

Page 21: Accountable systems  or how to catch a liar?

SUNDR FS

SUNDR state: one digest per user or group (Superuser, UserA, UserB, …, GroupG), each covering an i-table that maps i-nums (2, 3, 4) to 20-byte i-node hashes.

How to fetch "/share/aaa"?
1. Lookup "/": dir entry (share, Superuser, 3)
2. Lookup "/share": dir entry (aaa, UserA, 4)
3. Fetch "/share/aaa"

Page 22: Accountable systems  or how to catch a liar?

SUNDR data structures: version vector

• Server orders users' fetch/modify ops
• Client signs version vector along with digest
• Version vectors will expose ordering failures

Page 23: Accountable systems  or how to catch a liar?

Version structure

• Each user has its own version structure (VST)
• Consistency server keeps latest VSTs of all users
• Clients fetch all other users' VSTs from server before each operation and cache them
• We order VSTA ≤ VSTB iff all the version numbers in VSTA are less than or equal to those in VSTB

Example: VSTA ≤ VSTB
  VSTA: user A, Digest A, versions (A-1, B-1, G-1), Signature A
  VSTB: user B, Digest B, versions (A-1, B-2, G-2), Signature B

Page 24: Accountable systems  or how to catch a liar?

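Update VST: An example

A: echo "A was here" >> /share/aaa
B: cat /share/aaa

[Figure: VSTs held by the consistency server as A and B run the two operations above.]
  Initially:       VSTA = (A-0, B-0) with DigA, signed by A; VSTB = (A-0, B-1) with DigB, signed by B
  After A's write: VSTA = (A-1, B-1) with DigA, signed by A
  After B's read:  VSTB = (A-1, B-2) with DigB, signed by B

VSTA ≤ VSTB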

Page 25: Accountable systems  or how to catch a liar?

Detect attacks

A: echo "A was here" >> /share/aaa
B: cat /share/aaa (stale!)

[Figure: the consistency server hides A's update from B.]
  A's latest VST: (A-1, B-1) with DigA, signed by A
  B's latest VST: (A-0, B-2) with DigB, signed by B

A's latest VST and B's can no longer be ordered: VSTA ≰ VSTB, VSTB ≰ VSTA

Page 26: Accountable systems  or how to catch a liar?

Support concurrent operations

• Clients may issue operations concurrently
  – While first client is still working on his operation, how does second client know what vector to sign?
• Idea: If operations don't conflict, include first user's forthcoming version number in VST
• Solution: Pre-declare operations in signed updates
  – Server returns latest VSTs and all pending updates, thereby ordering them before current operation
  – User computes new VST including pending updates
  – User signs and commits new VST

Page 27: Accountable systems  or how to catch a liar?

Concurrent update of VSTs

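A: echo "A was here" >> /share/aaa
B: cat /share/bbb

[Figure: A and B pre-declare their operations at the consistency server as signed updates (update: A-1, update: B-2) and then commit new VSTs.]
  Initially: VSTA = (A-0, B-0) with DigA, signed by A; VSTB = (A-0, B-1) with DigB, signed by B
  Committed: VSTA = (A-1, B-1) with DigA, signed by A, listing update A-1
             VSTB = (A-1, B-2) with DigB, signed by B, listing updates A-1, B-2

VSTA ≤ VSTB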

Page 28: Accountable systems  or how to catch a liar?

SUNDR is practical

[Figure: bar chart of running time in seconds (y-axis 0 to 12) for Create (1K), Read (1K) and Unlink, comparing NFSv2, NFSv3, SUNDR and SUNDR/NVRAM.]

Page 29: Accountable systems  or how to catch a liar?

Case study II: PeerReview [SOSP07]

Page 30: Accountable systems  or how to catch a liar?

Motivations for PeerReview

• Large distributed systems consist of many nodes
• Some nodes become Byzantine
  – Software compromise
  – Malicious/careless administrator
• Goal: detect past misbehavior of nodes
  – Apply to more general apps than FS

Page 31: Accountable systems  or how to catch a liar?

Challenges of general fault detection

• How to detect faults?
• How to convince others that a node is (not) faulty?

Page 32: Accountable systems  or how to catch a liar?

Overall approach

• Fault := Node deviates from expected behavior
• Obtain a signature for every action from each node
• If a node misbehaves, the signature works as a proof-of-misbehavior against the node

Page 33: Accountable systems  or how to catch a liar?

Can we detect all faults?

• No
  – E.g. faults affecting a node's internal state
• Detect observable faults
  – E.g. bad nodes send a message that correct nodes would not send

[Figure: nodes A and C, with a message X shown as a bit string.]

Page 34: Accountable systems  or how to catch a liar?

Can we always get a proof?

• No
  – A said it sent X
  – B said A didn't
  – C: did A send X?
• Generate verifiable evidence:
  – a proof of misbehavior (A sent wrong X)
  – a challenge (C asks A to send X again)
• Nodes not answering challenges are suspects

[Figure: A claims "I sent X!", B claims "I never received X!", and C cannot tell which of them is telling the truth.]

Page 35: Accountable systems  or how to catch a liar?

PeerReview Overview

• Treat each node as a deterministic state machine
• Nodes sign every output message
• A witness checks that another node outputs correct messages
  – using a reference implementation and signed inputs

Page 36: Accountable systems  or how to catch a liar?

PeerReview architecture

• All nodes keep a log of their inputs & outputs
  – Including all messages
• Each node has a set of witnesses, who audit its log periodically
• If the witnesses detect misbehavior, they
  – generate evidence
  – make the evidence available to other nodes
• Other nodes check evidence, report fault

[Figure: nodes A to E. A sends message M to B, A and B each keep a log (A's log, B's log), and A's witnesses audit A's log.]

Page 37: Accountable systems  or how to catch a liar?

PeerReview detects tampering

• What if a node modifies its log entries?
• Log entries form a hash chain
• Signed hash is included with every message
  – Node commits to having received all prior messages

[Figure: B's log holds the entries Send(X), Recv(Y), Send(Z), Recv(M), linked by the hash chain H0 to H4. The message and ACK exchanged between A and B each carry Hash(log).]
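A minimal sketch of a hash-chained log. The exact hash input is an assumption (the previous hash concatenated with the hash of the entry), SHA-256 is an arbitrary choice, and signatures are omitted; the function names are hypothetical.

```python
import hashlib
from typing import List, Sequence

H0 = b"\x00" * 32  # assumed initial value of the chain


def extend(prev_hash: bytes, entry: bytes) -> bytes:
    """Next chain hash: covers the previous hash and the new entry."""
    return hashlib.sha256(prev_hash + hashlib.sha256(entry).digest()).digest()


def chain(entries: Sequence[bytes], h0: bytes = H0) -> List[bytes]:
    """Return [H0, H1, ..., Hn] for a log of n entries."""
    hashes = [h0]
    for entry in entries:
        hashes.append(extend(hashes[-1], entry))
    return hashes


def verify(entries: Sequence[bytes], hashes: Sequence[bytes]) -> bool:
    """Check that the published hashes match the log entries."""
    if not hashes:
        return False
    return chain(entries, hashes[0]) == list(hashes)


log = [b"Send(X)", b"Recv(Y)", b"Send(Z)", b"Recv(M)"]
hashes = chain(log)

# Rewriting an old entry breaks every later hash.
tampered = [b"Send(X)", b"Recv(Q)", b"Send(Z)", b"Recv(M)"]
```

Since each hash covers everything before it, a node that has already handed out a signed H4 cannot later change the Recv(Y) entry: `verify(tampered, hashes)` fails.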

Page 38: Accountable systems  or how to catch a liar?

PeerReview detects inconsistency

• What if a node
  – keeps multiple logs?
  – forks its log?
• Witness checks if signed hashes form a single chain

[Figure: a hash chain H0 to H4 forks into a second branch H3', H4', giving "View #1" and "View #2" that disagree on the results of operations such as Create X, Read X and Read Z.]

Page 39: Accountable systems  or how to catch a liar?

PeerReview detects faults

• Witness audits a node:
  – Replay inputs on a reference implementation of the state machine
  – Check outputs against the log

[Figure: inputs from the log are replayed through the state machine (Module A, Module B); the resulting outputs are compared (=?) with the outputs in the log, and a mismatch (≠) exposes a fault.]
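The audit step can be sketched as a replay loop. Everything here is illustrative: `reference_step` is a made-up deterministic state machine (a counter that acknowledges each message), and the log is modeled as a list of (input, logged output) pairs.

```python
from typing import Callable, List, Optional, Tuple

State = int
Step = Callable[[State, str], Tuple[State, str]]


def reference_step(state: State, msg: str) -> Tuple[State, str]:
    """Hypothetical reference implementation: count and acknowledge."""
    state += 1
    return state, f"ack {state}: {msg}"


def audit(log: List[Tuple[str, str]],
          step: Step = reference_step,
          initial: State = 0) -> Optional[int]:
    """Replay logged inputs on the reference implementation.
    Return the index of the first entry whose logged output differs
    from the replayed output, or None if the log checks out."""
    state = initial
    for i, (inp, logged_out) in enumerate(log):
        state, expected = step(state, inp)
        if expected != logged_out:
            return i
    return None


honest = [("put a", "ack 1: put a"), ("put b", "ack 2: put b")]
faulty = [("put a", "ack 1: put a"), ("put b", "ack 7: put b")]
```

Here `audit(honest)` returns `None` and `audit(faulty)` returns 1. The replay only works because the node is treated as deterministic, so the same inputs must always yield the same outputs.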

Page 40: Accountable systems  or how to catch a liar?

PeerReview's guarantees

• Faults will be detected (eventually)
  – If a node is faulty:
    • Its witness has a proof of misbehavior
    • Its witness generates a challenge that it cannot answer
  – If a node is correct:
    • No proof of misbehavior
    • It can answer any challenge

Page 41: Accountable systems  or how to catch a liar?

PeerReview applications

• App #1: NFS server
  – Tampering with files
  – Lost updates
  – Metadata corruption
  – Incorrect access control
• App #2: Overlay multicast
  – Freeloading
  – Tampering with content
• App #3: P2P email
  – Denial of service
  – Dropping emails

Page 42: Accountable systems  or how to catch a liar?

PeerReview's performance penalty

• Cost increases w/ # of witnesses per node W

[Figure: bar chart of average traffic (Kbps/node, y-axis 0 to 100) against the number of witnesses (Baseline, 1 to 5), with the baseline traffic marked.]

Page 43: Accountable systems  or how to catch a liar?

What have we learnt?

• Put constraints on what faulty servers can do
  – Clients sign data, bad SUNDR server cannot fake data
  – Clients sign version vector, bad server cannot hide past inconsistency
• Fault detection
  – Need proof of misbehavior (by signing actions)
  – Use challenges to tell slow nodes apart from bad ones

