richard-woods
Secure Cloud Database
Introduction
• Cloud computing
– IT as a service from a third-party service provider
• Security in cloud environment
– Adversary corrupts the service provider?
– Goal: protect sensitive data
Related Work
• Encryption approach
– NetDB2, IBM (outsourced database)
– Relational Cloud, CryptDB (MIT, CIDR 2011)
• TrustedDB using secure hardware (VLDB 2011 demo, Radu Sion)
• Fully homomorphic encryption (STOC 2009)
• Secure multi-party computation approach
– ShareMind
NetDB2
DB:
Tuple 1: xxx yyy
Tuple 2: aaa bbb
Encrypted DB (value-level encryption, simple deterministic encryption):
Tuple 1: !a4 a3g
Tuple 2: L%j m*K
Query rewriting: SELECT * WHERE value = 'xxx' becomes SELECT * WHERE value = '!a4'
Encrypted DB + partition information:
Tuple 1: P2 P2
Tuple 2: P1 P1
Partition: P1: < 'm'; otherwise P2
Query rewriting: SELECT * WHERE value < 'xxx' becomes SELECT * WHERE value in [P1, P2]
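The partition-based rewriting above can be sketched in a few lines of Python. This is only an illustration of the two-partition example on the slide (P1: < 'm'; otherwise P2); the function names are hypothetical and are not part of NetDB2.

```python
# Illustrative sketch of the slide's two-partition example; names are hypothetical.
BOUNDARY = "m"  # P1 holds values < 'm', P2 holds everything else

def partition(value):
    """Partition label stored next to the encrypted value."""
    return "P1" if value < BOUNDARY else "P2"

def partitions_for_less_than(bound):
    """Partitions that may contain values < bound.

    The result is a superset of the true answer, so the client still has to
    decrypt and filter the returned tuples.
    """
    return ["P1"] if bound <= BOUNDARY else ["P1", "P2"]

# Tuple 1 holds 'xxx' and 'yyy', tuple 2 holds 'aaa' and 'bbb'.
labels = {v: partition(v) for v in ("xxx", "yyy", "aaa", "bbb")}
rewritten = partitions_for_less_than("xxx")
```

Here `value < 'xxx'` has to touch both partitions, which matches the rewritten query `value in [P1, P2]`.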
CryptDB
• Onion encryption: multiple layers of encryption applied to one data item
Original data: 10
E1(10) = A*65h (OPES: numeric comparisons)
E2(A*65h) = BB647 (deterministic encryption: equality can be done)
E3(BB647) = %j@9G (non-deterministic encryption: no computation is feasible)
If the user wants more computation power, decrypt to the desired level (one way!)
Weakness of encryption approach
• Functions supported are not generic
– For example:
• Supported (OPES): SELECT * WHERE SALARY > 6000
• Not supported: SELECT * WHERE SALARY + BONUS > 6000
TrustedDB
• Provides generic functionality
– Owner puts its keys in secure hardware
– The hardware is given to the service provider
– When there is computation on sensitive data, it can be done by the secure hardware
• Weakness
– Processing power limited by the secure hardware
– Hardware management by the owner
Figure: IBM 4764 PCI-X Cryptographic Coprocessor
Fully homomorphic encryption
• Property (E: encryption function)
– E(x) + E(y) = E(x + y): XOR gate for {0, 1}
– E(x) · E(y) = E(xy): AND gate for {0, 1}
• Conceptually supports any computation that can be represented by circuits
– Difference: no branch operation (if-then-else)
• Weakness
– Does not naturally support the select statement
– Poor efficiency for large circuits so far
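The two gate identities can be checked on plain bits, with no encryption involved: addition modulo 2 behaves like XOR and multiplication like AND. A minimal Python sketch, with helper names chosen here purely for illustration:

```python
from itertools import product

def add_bits(x, y):
    """Addition over {0, 1}, i.e. addition modulo 2."""
    return (x + y) % 2

def mul_bits(x, y):
    """Multiplication over {0, 1}."""
    return x * y

# Compare against Python's bitwise operators on all four input pairs.
xor_ok = all(add_bits(x, y) == (x ^ y) for x, y in product((0, 1), repeat=2))
and_ok = all(mul_bits(x, y) == (x & y) for x, y in product((0, 1), repeat=2))
```

A homomorphic scheme applies the same two operations to ciphertexts, which is why any circuit built from XOR and AND gates can, in principle, be evaluated.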
ShareMind
• Key: Secret sharing + recursive processing
Figure: DB is split into shares A, B, C, one per service provider (Service Provider 1, 2, 3), with DB = A + B + C. For a query, the providers produce result shares D, E, F, with D + E + F = Result.
Properties of ShareMind
• Generic operations
– Recursive processing: the result of one computation can be the input of another computation; both result and input are hidden in shares
• Weakness
– Requires multiple non-colluding parties
– Owner has no control (no key), giving a poor sense of security
Objective
• Two party problem: owner and service provider (SP)
• The owner keeps a 'key'
• SP keeps an encrypted database
• Functions to be supported: generic selection
• Efficient operations
Overview of our approach
• MPC supports generic operations
– Data hidden in shares
• In other words: we encrypt by secret sharing
• This raises the following questions:
– How to encrypt exactly?
– How to compute queries?
Our approach
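Figure (MPC-based approach): DB is split into shares A, B, C held by SP1, SP2, SP3.
Figure (owner keeps a share): DB is split into A, kept by the owner, and B, C, held by SP1 and SP2. The owner keeps a copy, but it is large and the owner has to be involved in query computation.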
Our approach
• Owner keeps a small share A (small storage)
• Without A, SP cannot recover DB (similar security strength as MPC)
• Owner has minimal involvement in MPC (low cost)
Figure: DB is split into share A (owner) and share B (SP).
Our Model
Share compression
Message compression
Functionality generality inherits from MPC
Background
Secret sharing (around 1980)
Secret: 10
Shares: 4 (Alice), 6 (Bob)
6 + 4 = 10
What is the secret value?
Would Alice's share be 5? 20? -3?
The secret is recovered only when the two parties exchange their shares
Secret sharing
• General case
Secret: s
Shares: s1, s2, …, sn
The secret can be divided among n parties, for any n
s = g(s1, s2, …, sn)
Examples of g:
– Sum of all shares (modular)
– Bitwise XOR of all shares
– Product, string concatenation, etc.
Security requirement: given k < n shares, it is hard to recover s
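Two of the recombination functions g listed above (modular sum and bitwise XOR) can be sketched as follows. The function names, the modulus and the bit width are assumptions made for this illustration only.

```python
import secrets
from functools import reduce

def share_additive(secret, n, modulus):
    """Split `secret` into n shares whose sum is `secret` (mod modulus)."""
    shares = [secrets.randbelow(modulus) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares

def recover_additive(shares, modulus):
    """g = modular sum of all shares."""
    return sum(shares) % modulus

def share_xor(secret, n, bits):
    """Split `secret` into n shares whose bitwise XOR is `secret`."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    # The last share is the secret XORed with all the random shares.
    shares.append(reduce(lambda a, b: a ^ b, shares, secret))
    return shares

def recover_xor(shares):
    """g = bitwise XOR of all shares."""
    return reduce(lambda a, b: a ^ b, shares, 0)

additive_shares = share_additive(10, 3, 2**32)
xor_shares = share_xor(10, 4, 8)
```

In both cases any n - 1 shares are uniformly random on their own, so the secret only appears once all n shares are combined.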
To design a generic secure database
How secure? The security model
• Negative result
– Ideal security:
• Querying workflow: user issues query => service providers compute result and return it to the user
• Knowledge gained by service providers: NONE. Not even anything about the query and the result!
– A solution achieving ideal security is not more efficient than a non-outsourcing solution (not using the cloud)
Knowledge gained by service provider
• Output space of a simple selection query: varies from no tuple to the entire database
– Even larger space if we consider joins
• Example knowledge gain
– If the output size is small, the service provider knows that the query does not select the entire table
• To hide the above information, each returned query result should be at least as large as the entire table
Our security model
• Provides adequate security for practical use
– Level 1 model: An attacker observes an instance of the encrypted database but no other values. Security is said to be enforced if the attacker cannot recover the original database
• Example: hack into the cloud server and copy the instance
– Level 2 model: An attacker observes an instance of the encrypted database and knows the original values of some of the tuples. Security is said to be enforced if the attacker cannot recover the values of other tuples
• Example: a hacker plants adversary programs on the SP and observes the encrypted value and
• Similar to chosen-plaintext attack (CPA)
Which level to use?
• Check which model fits!
• Example:
– Names of 40 students in a class
• Domain size is small and is assumed to be public
• Easy to map to the encrypted tuples
• Level 2 is recommended
– Account balances in banks
• Values are not known to the public
• Level 1 should be good enough
• At the same time, we will try to hide as much other information as possible
Information revealed to SP
• The service provider can observe
– Query content
• The tables that are related to the query
• Number of conditions, types of conditions, attributes that are related
– Query answer
• The set of shares of tuples in some query answer
Example query
• SELECT Name
FROM Employer
WHERE Salary > 6000
• To one service provider, the transformed query may look like:
SELECT ATTRIBUTE_7
FROM TABLE_A
WHERE ATTRIBUTE_3 - X > 0
WITH PARAM_X = [1234, 3335, 222, 1119]
WITH PARAM_CMP_X = [335, 17778]
Some basic design – level 2 model
• To hide the database, we use secret sharing: DB = A + B
• In our case, we use multiplicative secret sharing
– To store value v, we have ab = v (mod D)
• D: domain size
• The shares are a, b
Figure: DB is split into share A (owner) and share B (SP).
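A minimal sketch of two-party multiplicative sharing, ab = v (mod D). It assumes D is prime so that every nonzero share has an inverse; the value D = 101 and the function names are hypothetical and do not come from the slides.

```python
import secrets

D = 101  # hypothetical prime modulus, so every nonzero share is invertible

def share_mul(v, modulus=D):
    """Return (a, b) with a * b == v (mod modulus); a is random and nonzero."""
    a = secrets.randbelow(modulus - 1) + 1       # uniform in [1, modulus - 1]
    b = (v * pow(a, -1, modulus)) % modulus      # b = v * a^-1 (mod modulus)
    return a, b

def recover_mul(a, b, modulus=D):
    """Recombine the owner's share a and the SP's share b."""
    return (a * b) % modulus

a, b = share_mul(42)
```

The modular inverse via `pow(a, -1, modulus)` needs Python 3.8 or later.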
Sharing the sign bit
• We separate the sign bit from the magnitude
• Example sharing
– The sign bit can be recovered by multiplying the shares together
Value   Owner copy   SP copy
+1      -1           -1
-1      +2           -2
+2      +2           +1
Magnitude domain size: 3
The shares here are randomly generated
Share Compression (Ignore sign bit now)
• The shares of the DB are generated randomly
• Who decides the random shares? Let's use a pseudo-random function
Share compression function
• Input:
– Key (secret to owner)
– Tuple ID
• Requirements:
– Support generic functionality (shown later)
– Secure (note: now considering level 2)
$f(\mathrm{ID}) = m \cdot \mathrm{ID}^k \bmod n$
k, m: secret key; n: public key
Example with k = 2, m = 1:
ID   X    Share A (kept by owner)   Share B (kept by SP)
1    18   1                         18
2    20   4                         5
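The share compression example (k = 2, m = 1, X = 18 and 20) can be reproduced with a short sketch. The public modulus n = 101 is an assumption made here so that modular inverses exist; the slides do not give a concrete n.

```python
N = 101        # hypothetical public prime modulus n
K, M = 2, 1    # owner's secret key (k, m) from the example

def f(tuple_id, k=K, m=M, n=N):
    """Owner's share for a tuple: m * ID^k mod n."""
    return (m * pow(tuple_id, k, n)) % n

def sp_share(value, tuple_id, k=K, m=M, n=N):
    """Share stored at the SP: value * f(ID)^-1 mod n."""
    return (value * pow(f(tuple_id, k, m, n), -1, n)) % n

column_x = {1: 18, 2: 20}                                   # ID -> real value
share_a = {i: f(i) for i in column_x}                       # regenerated by owner
share_b = {i: sp_share(v, i) for i, v in column_x.items()}  # stored at SP
```

The owner never stores `share_a`: it can be recomputed from (k, m) and the tuple ID whenever it is needed.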
Storage cost
• Linear in the number of columns
– Assuming the IDs run from 1 to t
• Just need to remember t
• Note on the random function:
– To make the input look random, we have $f(\mathrm{ID}) = m \cdot h(\mathrm{ID})^k \bmod n$
• h: any one-way hash
• Storage part is easy; how about computation?
Example share table, with $f(\mathrm{ID}) = m \cdot \mathrm{ID}^k \bmod n$:
ID   Share
1    1
2    4
…    …
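A sketch of the hashed variant, where the tuple ID is passed through a one-way hash h before exponentiation. SHA-256 as h, the reduction of the digest into [1, n - 1], and the prime n = 101 are all assumptions for illustration; the slides only require "any one-way hash".

```python
import hashlib

def h(tuple_id, n):
    """Hash the tuple ID into [1, n - 1] (nonzero, so it stays invertible mod prime n)."""
    digest = hashlib.sha256(str(tuple_id).encode()).digest()
    return int.from_bytes(digest, "big") % (n - 1) + 1

def f_hashed(tuple_id, k, m, n):
    """Owner's share: m * h(ID)^k mod n."""
    return (m * pow(h(tuple_id, n), k, n)) % n

# The owner stores only (k, m) per column plus the ID range 1..t,
# and regenerates any share on demand.
t = 5
owner_shares = [f_hashed(i, 2, 1, 101) for i in range(1, t + 1)]
```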
How to do multiplication?
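• Column-column multiplication
– The two values are both in share format
Example for the tuple with ID = 2:
             A                   B
Real value   10                  20
Owner        10 (k = 1, m = 5)   4 (k = 2, m = 1)
SP           1                   5
C = A × B = 200
– SP multiplies its shares: 1 × 5 = 5
– Owner's share becomes 10 × 4 = 40 (k = 3, m = 5), since $m_1 m_2 x^{k_1} x^{k_2} = m_1 m_2 x^{k_1 + k_2}$
– Resharing to k = 2, m = 1: the owner's share becomes 4 and the SP's share becomes 5 × 10 = 50; the resharing factor has k = 1, m = 5, i.e. $m \cdot \mathrm{ID}^k = 10$
In general: A = a1a2, B = b1b2, C = (a1b1)(a2b2)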
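The numbers in the column-column multiplication example (ID = 2, A = 10, B = 20) can be re-derived with exact rational arithmetic. This sketch drops the modulus to mirror the slide figures, and the variable names are illustrative; it checks the arithmetic rather than modelling the message flow between owner and SP.

```python
from fractions import Fraction

def f(tuple_id, k, m):
    """Owner's compressed share m * ID^k (exact rationals, no modulus)."""
    return Fraction(m) * Fraction(tuple_id) ** k

tuple_id = 2
key_a, key_b = (1, 5), (2, 1)                        # (k, m) per column
a1, b1 = f(tuple_id, *key_a), f(tuple_id, *key_b)    # owner shares: 10 and 4
a2, b2 = Fraction(10) / a1, Fraction(20) / b1        # SP shares: 1 and 5

c2 = a2 * b2                                         # SP multiplies locally: 5
# The owner's implied share has key (k1 + k2, m1 * m2) = (3, 5).
key_c = (key_a[0] + key_b[0], key_a[1] * key_b[1])
c1 = f(tuple_id, *key_c)                             # 40

# Resharing to the target key (k = 2, m = 1): the factor is old share / new share.
key_new = (2, 1)
key_factor = (key_c[0] - key_new[0], Fraction(key_c[1], key_new[1]))  # (1, 5)
factor = f(tuple_id, *key_factor)                    # 10
c1_new = f(tuple_id, *key_new)                       # 4
c2_new = c2 * factor                                 # 50
```

Since the factor is itself of the form m · ID^k, it can be described by a key rather than by one value per tuple.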
Recap: operations at the parties
Owner (keys only): A (k = 1, m = 5), B (k = 2, m = 1), C (k = 2, m = 1)
SP (share columns):
A    B    C
1    5    50
2    8    …
10   9    …
…    …    …
Column-constant multiplication
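Example for the tuple with ID = 2 and constant B = 20:
             A
Real value   10
Owner        2 (k = 1, m = 1)
SP           5
C = A × B = 200
– SP's share stays 5
– Owner's share becomes 40 (k = 1, m = 20)
– Resharing to k = 2, m = 5: the owner's share becomes 20 and the SP's share becomes 5 × 2 = 10; the resharing factor has k = -1, m = 4, i.e. $m \cdot \mathrm{ID}^k = 2$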
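The column-constant example (ID = 2, A = 10, constant 20) follows the same pattern: the constant is folded into the owner's key, and resharing moves it to the SP's side. A sketch with exact rationals and no modulus, using illustrative names:

```python
from fractions import Fraction

def f(tuple_id, k, m):
    """Owner's compressed share m * ID^k (exact rationals, no modulus)."""
    return Fraction(m) * Fraction(tuple_id) ** k

tuple_id, constant = 2, 20
a1 = f(tuple_id, 1, 1)             # owner share of A: 2
a2 = Fraction(10) / a1             # SP share of A: 5

# Multiplying by a constant only changes the owner's key: (k, m) -> (k, m * constant).
c1 = f(tuple_id, 1, 1 * constant)  # 40, key (k = 1, m = 20)

# Resharing to key (k = 2, m = 5): factor key is (1 - 2, 20 / 5) = (-1, 4).
factor = f(tuple_id, -1, 4)        # 2
c1_new = f(tuple_id, 2, 5)         # 20
c2_new = a2 * factor               # 10
```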
Column-column addition
• A = a1a2
• B = b1b2
– C = A + B => a1a2 + b1b2
– Goal: C = c1c2 = a1a2 + b1b2
$c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Owner: a1, b1; SP: a2, b2
The coefficients $a_1 c_1^{-1}$ and $b_1 c_1^{-1}$ are kept by the owner
Column-column addition
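• $c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Example for the tuple with ID = 2, with $f(\mathrm{ID}) = m \cdot \mathrm{ID}^k$:
             A                   B
Real value   10                  20
Owner        10 (k = 1, m = 5)   4 (k = 2, m = 1)
SP           1                   5
C = A + B = 30
– Owner's share of C: 8 (k = 2, m = 2)
– Owner's coefficient for A: k = -1, m = 2.5, i.e. 1.25
– Owner's coefficient for B: k = 0, m = 0.5, i.e. 0.5
– SP's share of C: 1.25 × 1 + 0.5 × 5 = 3.75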
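The addition example (ID = 2, A = 10, B = 20) can be checked the same way. The sketch below uses exact rationals without a modulus, as the slide does, and only verifies that c1 · c2 equals A + B; the names are illustrative.

```python
from fractions import Fraction

def f(tuple_id, k, m):
    """Owner's compressed share m * ID^k (exact rationals, no modulus)."""
    return Fraction(m) * Fraction(tuple_id) ** k

tuple_id = 2
a1, b1 = f(tuple_id, 1, 5), f(tuple_id, 2, 1)    # owner shares: 10 and 4
a2, b2 = Fraction(1), Fraction(5)                # SP shares
c1 = f(tuple_id, 2, 2)                           # owner's chosen share of C: 8

# Owner-side coefficients a1 / c1 and b1 / c1, themselves of the form m * ID^k.
coef_a = f(tuple_id, 1 - 2, Fraction(5, 2))      # k = -1, m = 2.5 -> 1.25
coef_b = f(tuple_id, 2 - 2, Fraction(1, 2))      # k = 0,  m = 0.5 -> 0.5

# SP-side share of C: c2 = (a1 / c1) * a2 + (b1 / c1) * b2.
c2 = coef_a * a2 + coef_b * b2                   # 3.75
```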
Column-constant addition
• Add a constant to each tuple
– Becomes column-column addition
Example: column A is extended with a constant column Z
A    Z
10   1
20   1
30   1
45   1
Take care of the sign bit
• Secret sharing
• How to generate the shares at the owner?
– Again, use share compression
Sign sharing by multiplication (mod 3):
Owner   SP     Value
1(+)    1(+)   1(+)
1(+)    2(-)   2(-)
2(-)    1(+)   2(-)
2(-)    2(-)   1(+)
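The sign-sharing table can be reproduced with multiplication modulo 3, using the encoding 1 = positive and 2 = negative. In this sketch the owner's share is drawn at random rather than from the share compression function, and the function names are illustrative.

```python
import secrets

POS, NEG = 1, 2   # sign encoding; all arithmetic is mod 3

def share_sign(sign):
    """Split a sign into (owner, sp) with owner * sp == sign (mod 3)."""
    owner = secrets.choice((POS, NEG))
    sp = (sign * pow(owner, -1, 3)) % 3
    return owner, sp

def recover_sign(owner, sp):
    """Multiply the two shares together to recover the sign."""
    return (owner * sp) % 3

# All four rows of the table: (owner, sp) -> value.
table = {(o, s): recover_sign(o, s) for o in (POS, NEG) for s in (POS, NEG)}
```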
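Multiplication with sign bit
Example for the tuple with ID = 2 (sign shares, arithmetic mod 3):
             A's sign                    B's sign
Real value   -10                         20
Owner        4 => (1, +), k = 2, m = 1   2 => (2, -), k = 1, m = 1
SP           2, -                        2, -
C = A × B = -200
– SP multiplies its sign shares: 2 × 2 = 4 => 1, +
– Owner's sign share becomes 8 => (2, -) with k = 3, m = 1, since $m_1 m_2 x^{k_1} x^{k_2} = m_1 m_2 x^{k_1 + k_2}$
– Resharing to k = 2, m = 2: the owner's sign share is 8 => (2, -); the resharing factor has k = 1, m = 2^{-1} = 2 (mod 3), which evaluates to 1, +; the SP's sign share stays 1, +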
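The sign-bit multiplication example (ID = 2, A = -10, B = 20) can be verified with modular arithmetic mod 3. The helper name `f_sign` is hypothetical; the keys and share values are the ones shown in the example.

```python
MOD = 3  # sign domain: 1 = positive, 2 = negative

def f_sign(tuple_id, k, m):
    """Owner's compressed sign share m * ID^k mod 3."""
    return (m * pow(tuple_id, k, MOD)) % MOD

tuple_id = 2
a1, b1 = f_sign(tuple_id, 2, 1), f_sign(tuple_id, 1, 1)   # owner: 1 (+) and 2 (-)
a2, b2 = 2, 2                                             # SP: both 2 (-)

c2 = (a2 * b2) % MOD              # SP multiplies locally: 1 (+)
c1 = f_sign(tuple_id, 3, 1)       # owner's implied share, key (3, 1): 2 (-)

# Resharing to key (k = 2, m = 2): factor key is (3 - 2, 1 * 2^-1 mod 3) = (1, 2).
factor = f_sign(tuple_id, 1, pow(2, -1, MOD))   # 1 (+)
c1_new = f_sign(tuple_id, 2, 2)                 # 2 (-)
c2_new = (c2 * factor) % MOD                    # 1 (+)
```

The recombined sign `c1_new * c2_new % 3` is 2, i.e. negative, matching C = -200.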
Addition with sign bit
• The math is the same
• A = a1a2
• B = b1b2
– C = A + B => a1a2 + b1b2
– Goal: C = c1c2 = a1a2 + b1b2
$c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Addition with sign bit
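• $c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Example for the tuple with ID = 2, with $f(\mathrm{ID}) = m \cdot \mathrm{ID}^k$:
               A                           B
Real value     -10                         20
Owner value    10 (k = 1, m = 5)           4 (k = 2, m = 1)
Owner sign     4 => (1, +), k = 1, m = 2   8 => (2, -), k = 2, m = 2
SP value       1                           5
SP sign        2, -                        2, -
C = A + B = 10
– Owner's share of C: value 8 (k = 2, m = 2); sign 4 => (1, +), k = 1, m = 2
– Owner's coefficient for A: value k = -1, m = 2.5, i.e. 1.25; sign k = 0, m = 1 => 1 (1, +)
– Owner's coefficient for B: value k = 0, m = 0.5, i.e. 0.5; sign k = 1, m = 1 => 2 (2, -)
– SP's share of C: (+1.25) × (-1) + (-0.5) × (-5) = 1.25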
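The arithmetic of the signed addition example (A = -10, B = 20, ID = 2) can be checked as follows. This sketch only verifies the numbers; it does not model which party learns which sign, and the helper names are assumptions.

```python
from fractions import Fraction

SIGN = {1: +1, 2: -1}   # sign encoding: 1 = positive, 2 = negative

def sign_mul(s1, s2):
    """Multiply two encoded signs (mod 3)."""
    return (s1 * s2) % 3

# Owner-side coefficients (magnitude, encoded sign) and SP-side shares.
coef_a, coef_a_sign = Fraction(5, 4), 1    # 1.25, +
coef_b, coef_b_sign = Fraction(1, 2), 2    # 0.5,  -
a2, a2_sign = 1, 2                         # SP share of A: 1, -
b2, b2_sign = 5, 2                         # SP share of B: 5, -

term_a = SIGN[sign_mul(coef_a_sign, a2_sign)] * coef_a * a2   # -1.25
term_b = SIGN[sign_mul(coef_b_sign, b2_sign)] * coef_b * b2   # +2.5
c2 = term_a + term_b                                          # 1.25

c1, c1_sign = 8, 1                         # owner's share of C: 8, +
result = SIGN[c1_sign] * c1 * c2           # 10
```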
Implication with numeric operations
• Addition on {0, 1} represents an XOR gate
• Multiplication on {0, 1} represents an AND gate
• The above operations can be applied repeatedly
– Any function that can be expressed as a circuit can be computed, theoretically
• Branch operation, i.e., comparison?
Comparison operation
• Target: Column > 0
– General comparisons can be transformed to the above by additions and multiplications
• Example: SELECT * WHERE X*X + Y*Y – 2*X*Y – 10 > 0
• The above can be obtained by looking at the sign bit, shared by the owner and SP
Building logic gate
• Note 1: we represent positive as 1; negative as 2; 0 has no meaning
• The share compression function is the same for sign bit and magnitude. Multiplication and addition can be done on sign bits.
Logical operation
• Multiplying by 2 (at SP): NOT gate
Sign 1   Result
1(+) T   2(-) F
2(-) F   1(+) T
• Multiplying two sign bits: XNOR gate (applying NOT to the result gives an XOR gate)
Sign 1   Sign 2   Result
1(+) T   1(+) T   1(+) T
1(+) T   2(-) F   2(-) F
2(-) F   1(+) T   2(-) F
2(-) F   2(-) F   1(+) T
Logical operation
• (sign1 – 1) * (sign2 – 1) + 1 = sign1 sign2 – sign1 – sign2 + 2
sign1    sign2    (s1 – 1)(s2 – 1)   Result
1(+) T   1(+) T   0                  1(+) T
1(+) T   2(-) F   0                  1(+) T
2(-) F   1(+) T   0                  1(+) T
2(-) F   2(-) F   1                  2(-) F
OR gate (applying NOT to the result gives a NOR gate)
Logical operation
• (2sign1 – 1) * (2sign2 – 1) + 1 = 4 sign1 sign2 – 2sign1 – 2sign2 + 2 (all arithmetic mod 3)
sign1    sign2    2sign1   2sign2   2s1 – 1   2s2 – 1   (2s1 – 1)(2s2 – 1)   Result
1(+) T   1(+) T   2        2        1         1         1                    2(-) F
1(+) T   2(-) F   2        1        1         0         0                    1(+) T
2(-) F   1(+) T   1        2        0         1         0                    1(+) T
2(-) F   2(-) F   1        1        0         0         0                    1(+) T
NAND gate (applying NOT to the result gives an AND gate)
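The gate constructions on sign bits can be collected into one sketch and checked against Python's boolean operators. The encoding (1 = positive = true, 2 = negative = false) and the formulas come from the slides; the function names are illustrative, and all arithmetic is taken mod 3.

```python
T, F = 1, 2   # sign encoding: 1 = positive/true, 2 = negative/false

def g_not(s):
    """Multiplying by 2 swaps 1 and 2 (mod 3)."""
    return (2 * s) % 3

def g_xnor(s1, s2):
    """Multiplying two sign bits."""
    return (s1 * s2) % 3

def g_xor(s1, s2):
    return g_not(g_xnor(s1, s2))

def g_or(s1, s2):
    """(s1 - 1)(s2 - 1) + 1."""
    return ((s1 - 1) * (s2 - 1) + 1) % 3

def g_nand(s1, s2):
    """(2 s1 - 1)(2 s2 - 1) + 1."""
    return ((2 * s1 - 1) * (2 * s2 - 1) + 1) % 3

def g_and(s1, s2):
    return g_not(g_nand(s1, s2))

def as_bool(s):
    return s == T

pairs = [(x, y) for x in (T, F) for y in (T, F)]
or_table = [g_or(x, y) for x, y in pairs]
nand_table = [g_nand(x, y) for x, y in pairs]
```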
Summary
• Logical operations supported
– Example
• SELECT * WHERE X > 100 AND Y < 200
• The final predicate result is revealed to the SP (the owner sends the SP its own share function for the final boolean value)
• Corresponding tuples are sent back to the owner