Carnegie Mellon Databases
Simultaneous Scalability and Security for Data-Intensive Web Applications
Amit Manjhi*, Anastassia Ailamaki*, Bruce M. Maggs*†, Todd C. Mowry*‡, Christopher Olston*§, Anthony Tomasic*
* Carnegie Mellon University † Akamai Technologies ‡ Intel Research Pittsburgh § Yahoo! Research
Provisioning for Web applications is difficult
Dynamic data-intensive Web applications need a scalability service
[Figure: clients and a home server consisting of a Web server, an app server, and a database]
• Need on-demand scalability
• A scalability service can provide on-demand scalability
  • Example: CDN for static content
Distributed Scalability Service Architecture
[Figure: clients and DSSP nodes]
Shared Database Scalability Service Provider (DSSP)
How to guarantee security of data?
A simple solution for guaranteeing security
• Outsource database scalability
• Home server: holds master copies of all data and handles updates directly
• No query execution on the DSSP
• DSSP caches query results, kept consistent by invalidation
• All data passing through the DSSP can be encrypted: queries, updates, and query results
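The caching arrangement on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `HomeServer`, `DsspNode`, and the dictionary cache are hypothetical names, encryption is left out, and the node clears its whole cache on every update (the most conservative invalidation).

```python
class HomeServer:
    """Holds the master copy of the data and executes every statement."""

    def __init__(self, toys):
        self.toys = dict(toys)  # toy_id -> toy_name

    def toy_ids_by_name(self, toy_name):
        return sorted(i for i, n in self.toys.items() if n == toy_name)

    def delete_toy(self, toy_id):
        self.toys.pop(toy_id, None)


class DsspNode:
    """Caches query results; it never executes a query itself."""

    def __init__(self, home):
        self.home = home
        self.cache = {}  # (template, parameters) -> query result
        self.misses = 0

    def query(self, toy_name):
        key = ("SELECT toy_id FROM toys WHERE toy_name=?", (toy_name,))
        if key not in self.cache:
            # Cache miss: forward the query to the home server.
            self.misses += 1
            self.cache[key] = self.home.toy_ids_by_name(toy_name)
        return self.cache[key]

    def update(self, toy_id):
        self.home.delete_toy(toy_id)  # the home server applies the update
        self.cache.clear()            # conservative: invalidate everything


home = HomeServer({11: "Barbie", 15: "GI Joe"})
node = DsspNode(home)
first = node.query("GI Joe")   # miss, answered by the home server
second = node.query("GI Joe")  # hit, answered from the cache
node.update(5)                 # clears the cache
third = node.query("GI Joe")   # miss again
```

Clearing the whole cache is always correct but wasteful, which is why the precision of invalidation matters for scalability.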
A Simple Example
Home server database: toys (toy_id, toy_name)
  11  Barbie
  15  GI Joe
Q1: SELECT toy_id FROM toys WHERE toy_name='GI Joe'
Result cached at the DSSP node for Q1: toy_id=15
U1: DELETE FROM toys WHERE toy_id=5
• Nothing is encrypted: no invalidations
• Results are encrypted: the cached Q1 result is invalidated
More encryption can lead to more invalidations
Challenge: providing scalability while guaranteeing security
Security-scalability tradeoff
• When updates occur, the DSSP must invalidate “affected” cache entries for correctness
• Invalidations depend on what data is not encrypted:
  • Encrypt everything → conservative invalidation, poor scalability
  • Encrypt nothing → more precise invalidation, poor security
Opportunity for managing the tradeoff
Not all data is equally sensitive
Data sensitivity        Example                  Stance
Extremely sensitive     Credit card information  Secure at all costs
Moderately sensitive    Inventory records        Care, but worried about scalability impact
Completely insensitive  Bestsellers list         Don’t care
But for most data, it is nontrivial to assess:
1. Data sensitivity
2. Scalability impact of securing the data
Managing the security-scalability tradeoff
[Plot: scalability versus security, with points labeled "Encrypt sensitive data", "Encrypt sensitive and moderately sensitive data", and "Our approach"; data is marked as extremely sensitive or moderately sensitive]
• Encrypt data not useful for invalidation
• Tradeoff has to be managed only over remaining data
Key insight: Queries and updates can only be instantiations of templates
Given templates:
Q1: SELECT cust_name FROM customers WHERE cust_id=?
U1: DELETE FROM toys WHERE toy_id=?
Can identify data not useful for invalidation:
• Parameters and results are not useful for invalidation
• Encrypting them has no scalability overhead
Example instance of Q1:
SELECT cust_name FROM customers WHERE cust_id=123
  template: SELECT cust_name FROM customers WHERE cust_id=?
  parameter: 123
  query result: cust_name = John
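The split of a statement into a template and its parameters can be illustrated with a small sketch. In practice the application already holds the template and attaches parameters to it; the reverse split below, the helper name `split_statement`, and its simple literal-matching rule are assumptions made only to show the decomposition.

```python
import re

# Matches single-quoted strings or standalone integers (a deliberately simple rule).
_LITERAL = re.compile(r"'[^']*'|\b\d+\b")


def split_statement(statement):
    """Return (template, parameters) for one SQL statement."""
    params = []

    def _swap(match):
        # Record the literal and leave a placeholder in the template.
        params.append(match.group(0).strip("'"))
        return "?"

    template = _LITERAL.sub(_swap, statement)
    return template, tuple(params)


template, params = split_statement(
    "SELECT cust_name FROM customers WHERE cust_id=123"
)
```

Here `template` is the Q1 template and `params` holds the single parameter `"123"`; only the template would need to stay visible to the DSSP for this query.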
Outline
• Security-scalability tradeoff
• Four operating points in the tradeoff space
• Identifying data not useful for invalidation
• Evaluation results
• Related work and summary
Invalidation Strategies: Overview
DSSP node cache:
  Key: (query template, query parameters)
  Value: query result
• An update (update template, update parameters) reaching the DSSP node triggers invalidations
• Data not encrypted → invalidations
• Four natural invalidation strategies: View, Statement, Template, Blind
Invalidation Strategies: View
No data is encrypted
Update: DELETE FROM toys WHERE toy_id=5
DSSP node cache: (Template, Parameters) → Query result
Q1 SELECT toy_id FROM toys WHERE toy_name=?
Q2 SELECT qty FROM toys WHERE toy_id=?
Q3 SELECT cust_name FROM customers WHERE cust_id=?
• Invalidate all Q1 results with toy_id=5, all Q2 results with toy_id=5
Invalidation Strategies: Statement
Query results are encrypted
Update: DELETE FROM toys WHERE toy_id=5
DSSP node cache: (Template, Parameters) → Result (encrypted)
Q1 SELECT toy_id FROM toys WHERE toy_name=?
Q2 SELECT qty FROM toys WHERE toy_id=?
Q3 SELECT cust_name FROM customers WHERE cust_id=?
• Invalidate all Q1 results, all Q2 results with toy_id=5
Invalidation Strategies: Template
Results and parameters are encrypted
Update seen by the DSSP node: DELETE FROM toys WHERE toy_id=? (the parameter 5 is encrypted)
DSSP node cache: (Template, Parameters (encrypted)) → Result (encrypted)
Q1 SELECT toy_id FROM toys WHERE toy_name=?
Q2 SELECT qty FROM toys WHERE toy_id=?
Q3 SELECT cust_name FROM customers WHERE cust_id=?
• Invalidate all Q1 results, all Q2 results
Invalidation Strategies: Blind
All data are encrypted
Update: template and parameter (5) are both encrypted
DSSP node cache: (Template, Parameters) → Result, all encrypted
Q1 SELECT toy_id FROM toys WHERE toy_name=?
Q2 SELECT qty FROM toys WHERE toy_id=?
Q3 SELECT cust_name FROM customers WHERE cust_id=?
• Invalidate all Q1 results, all Q2 results, all Q3 results
Invalidation Strategies: Summary
U1 DELETE FROM toys WHERE toy_id=5
Q1 SELECT toy_id FROM toys WHERE toy_name=?
Q2 SELECT qty FROM toys WHERE toy_id=?
Q3 SELECT cust_name FROM customers WHERE cust_id=?
Accessible by DSSP?
Strategy   Template  Parameters  Query result  Invalidations
View       Yes       Yes         Yes           Q1 with toy_id=5, Q2 with toy_id=5
Statement  Yes       Yes         No            All Q1, Q2 with toy_id=5
Template   Yes       No          No            All Q1, Q2
Blind      No        No          No            All Q1, Q2, Q3
Scalability increases toward View; security increases toward Blind
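The four strategies can be compared on the running example with a short sketch. It is a simplification, not the authors' algorithm: it assumes single-table queries with one equality predicate and a single-attribute delete, and the names `QueryTemplate`, `CacheEntry`, `invalidates`, the "Robot" toy, and the cached values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    name: str
    table: str
    select_attr: str  # attribute in the WHERE clause
    result_attr: str  # attribute returned by the query


@dataclass(frozen=True)
class CacheEntry:
    template: QueryTemplate
    param: object
    result: tuple  # cached values of result_attr


Q1 = QueryTemplate("Q1", "toys", "toy_name", "toy_id")
Q2 = QueryTemplate("Q2", "toys", "toy_id", "qty")
Q3 = QueryTemplate("Q3", "customers", "cust_id", "cust_name")


def invalidates(strategy, entry, table, attr, value):
    """Should DELETE FROM table WHERE attr=value invalidate this entry?"""
    if strategy == "blind":
        return True  # nothing is visible, so invalidate everything
    if entry.template.table != table:
        return False  # templates show the delete cannot affect this query
    if strategy == "template":
        return True
    # Statement level: parameters are visible.
    if entry.template.select_attr == attr and entry.param != value:
        return False
    if strategy == "statement":
        return True
    # View level: the cached result is visible too.
    if entry.template.result_attr == attr and value not in entry.result:
        return False
    return True


cache = [
    CacheEntry(Q1, "GI Joe", (15,)),
    CacheEntry(Q1, "Robot", (5,)),   # hypothetical toy with toy_id=5
    CacheEntry(Q2, 5, (3,)),
    CacheEntry(Q2, 15, (8,)),
    CacheEntry(Q3, 123, ("John",)),
]

# U1: DELETE FROM toys WHERE toy_id=5
counts = {
    s: sum(invalidates(s, e, "toys", "toy_id", 5) for e in cache)
    for s in ("view", "statement", "template", "blind")
}
```

On this made-up cache the number of invalidated entries grows from View to Blind, which mirrors the summary: the less the DSSP can see, the more it has to invalidate.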
Sometimes invalidation strategies have same invalidation behavior
Invalidation behavior characterization: find template pairs for which different invalidation strategies have the same invalidation behavior
Q1: SELECT cust_name FROM customers WHERE cust_id=?
U1: DELETE FROM toys WHERE toy_id=?
• Template and View have same behavior
• Parameters and results can be encrypted
Applications can expose (not encrypt) on a per-template basis
Invalidation Matrix
Columns (query exposure): Nothing; Template; Template, parameters; Template, parameters, result
Rows (update exposure): Nothing; Template; Template, parameters
Encrypt data as long as invalidations do not increase for any template pair
Benchmark Applications
Auction (RUBiS, from Rice)
Bulletin board (RUBBoS, from Rice)
Bookstore (TPC-W, from UW-Madison)
Evaluation Methodology
[Figure: users, CDN and DSSP, and home server, with link latencies of 5 ms and 100 ms]
• Sensitive data determined by California privacy law
• Scalability: maximum number of concurrent users with acceptable response times
• Security: number of templates with encrypted results
Magnitude of Security-Scalability tradeoff
[Bar chart: scalability (number of concurrent users supported, axis 0 to 900) for the Auction, Bboard, and Bookstore benchmark applications under the Blind, Template, Statement, and View strategies]
1. Blanket encryption (Blind) hurts scalability
2. View has the best scalability
Security Results
Additional query data that can be encrypted using our approach, without hurting scalability
[Figure: per-application breakdown of query templates by the additional data that can be encrypted (parameters and result; result; nothing). Numbers denote the number of query templates. Counts as extracted: Auction and Bboard 18, 6, 4, 17, 7, 12; Bookstore 14, 7, 7]
Can encrypt results for over 50% of the templates
Security Results in Detail
Auction: The historical record of user bids was not exposed
Bboard: The rating users give one another based on the quality of their posting
Bookstore: Book purchase association rules discovered by the vendor – customers who purchase book A also purchase book B
Bookstore benchmark: security-scalability results
[Plot: scalability (number of concurrent users supported, axis 0 to 900) versus security (number of query templates with encrypted results, axis 0 to 30), with points labeled "Encrypt only sensitive data", "Our approach", and "Full encryption"]
Related Work
Outsource database: [Hacigumus+ 2002], [Hacigumus+ 2002], [Agrawal+ 2004]
Outsource database scalability: DBCache [Luo+ 2002, Altinel+ 2003], DBProxy [Amiri+ 2003], NEC cache portal [Li+ 2003]
View invalidation strategies: [Levy and Sagiv 1993], [Candan+ 2002], [Choi and Luo 2004]
Summary
• Security-scalability tradeoff in the presence of a DSSP
• Shortcut to manage the tradeoff
  • Static analysis of database templates
  • Find data not useful for invalidation
  • Tradeoff has to be managed only over remaining data
• Evaluation on three application benchmarks
  • Blanket encryption hurts scalability
  • Data identified by our approach is moderately sensitive
Back-up slides
Key insight: Set of queries and updates can be determined by inspecting the code
function get_toy_id($toy_name) {
    // The SQL template is a fixed string in the application code.
    $template = "SELECT toy_id FROM toys WHERE toy_name=?";
    // Attach the runtime parameter to the template, then run the query.
    $query = attach_to_template($template, $toy_name);
    execute($query);
    …
}
Given templates: statically identify data not useful for invalidation
Summary of Our Approach
Privacy law → Initial list of encrypted data (highly sensitive) → Static analysis of templates → Final list of encrypted data
Static analysis of templates:
1. For each (query template, update template) pair, construct an invalidation matrix (IM). Use the IM characterization results to check whether Blind=Template, Template=Statement, and Statement=View in each case
2. Use a greedy algorithm to find all data that is not useful for invalidation
Tradeoff needs to be managed over reduced data
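One way to picture the outcome of the static analysis is the sketch below. It is a simplified stand-in, not the greedy algorithm from the talk: it looks only at query-side exposure, and it assumes the three equivalence flags per template pair (`statement_eq_view`, `template_eq_statement`, `blind_eq_template`) have already been produced by the characterization results. All names are hypothetical.

```python
from enum import IntEnum


class Exposure(IntEnum):
    BLIND = 0      # nothing exposed
    TEMPLATE = 1   # template exposed
    STATEMENT = 2  # template and parameters exposed
    VIEW = 3       # template, parameters, and result exposed


def needed_level(pair):
    """Lowest exposure whose invalidations match View for one template pair."""
    level = Exposure.VIEW
    for flag, lower in (
        ("statement_eq_view", Exposure.STATEMENT),
        ("template_eq_statement", Exposure.TEMPLATE),
        ("blind_eq_template", Exposure.BLIND),
    ):
        if not pair[flag]:
            break  # stepping down further would add invalidations
        level = lower
    return level


def query_exposure(pairs):
    """A query template must satisfy every update template it is paired with."""
    return max(needed_level(p) for p in pairs)


# Flags for pairing each query template with U1: DELETE FROM toys WHERE toy_id=?
q1_u1 = {"statement_eq_view": False, "template_eq_statement": False,
         "blind_eq_template": True}   # the result (toy_id) helps invalidation
q2_u1 = {"statement_eq_view": True, "template_eq_statement": False,
         "blind_eq_template": True}   # the parameter (toy_id) helps
q3_u1 = {"statement_eq_view": True, "template_eq_statement": True,
         "blind_eq_template": False}  # U1 is ignorable for Q3

levels = {
    "Q1": query_exposure([q1_u1]),
    "Q2": query_exposure([q2_u1]),
    "Q3": query_exposure([q3_u1]),
}
```

Under these assumed flags, Q3 needs only its template exposed, so its parameters and results can be encrypted without adding invalidations, while Q1 must stay fully exposed to keep View-level precision.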
Flow of Invalidations
[Figure: flow of queries, updates, and invalidations among the CDN, the DSSP cache (untrusted), and the home organization; queries reach the home organization upon a miss]
Template Exposure Levels
Four levels of how much data is exposed per template
Level      Data exposed
blind      Nothing
template   Template
statement  Template, Parameters
view       Template, Parameters, Result
• From blind toward view: greater exposure (more help for invalidation)
• From view toward blind: greater security
Control the security-scalability tradeoff by controlling exposure levels
View Invalidation Strategies
For each class:
• correct: at least as many invalidations as “required”
• minimal: fewer invalidations than any strategy in its class
Query      Update     Strategy
blind      blind      Blind
template   template   Template-Inspection
statement  statement  Statement-Inspection
view       statement  View-Inspection
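The Query/Update/Strategy table above can be written as a small lookup. The four listed rows come from the slide; how mixed exposure levels combine (taking the less exposed side, and treating "statement" as full exposure for an update because an update has no result) is an assumption of this sketch, as is the function name `strategy_for`.

```python
QUERY_LEVELS = ["blind", "template", "statement", "view"]
UPDATE_LEVELS = ["blind", "template", "statement"]
STRATEGIES = [
    "Blind",
    "Template-Inspection",
    "Statement-Inspection",
    "View-Inspection",
]


def strategy_for(query_exposure, update_exposure):
    """Pick the invalidation strategy the DSSP can apply for one pair."""
    q = QUERY_LEVELS.index(query_exposure)
    u = UPDATE_LEVELS.index(update_exposure)
    # An update has no result, so "statement" is its full exposure;
    # treat it as not limiting the query side (assumption).
    if u == 2:
        u = 3
    return STRATEGIES[min(q, u)]


table = [
    strategy_for(q, u)
    for q, u in [
        ("blind", "blind"),
        ("template", "template"),
        ("statement", "statement"),
        ("view", "statement"),
    ]
]
```

With this rule, exposing a query fully does not help if the update side exposes only its template: the DSSP can then do no better than template inspection.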
Invalidation Matrix
Application can expose on a per-template basis (not encrypted == exposed)
Rows: update exposure; columns: query exposure
                      Nothing  Template  Template, parameters  Template, parameters, result
Nothing               Blind    Blind     Blind                 Blind
Template              Blind    Template  Template              Template
Template, parameters  Blind    Template  Statement             View
Simple Examples
DELETE FROM toys WHERE toy_id=5
SELECT cust_name FROM customers WHERE cust_id=?
If View and Template have the same invalidation behavior,
parameters and query result need not be exposed.
If Template and Blind have the same invalidation behavior,
template need not be exposed.
DELETE FROM toys WHERE toy_id=5
SELECT qty FROM toys WHERE toy_id=?
Hierarchy of Invalidation Strategies
[Figure: hierarchy relating the correct and minimal variants of the blind, template-inspection, statement-inspection, and view-inspection strategies]
Query and Update Classification?Query and Update Classification?
Symbol Meaning
S (UT) Attributes used in selection predicates
M (UT) Attributes modified
S (QT) Attributes used in selection predicates or order-by constructs
P (QT) Attributes retained in the result
Ignorable: M (U^T) \cap (S (Q^T)
Query and Update classification (1/2)
Query: selection S(Q) and preserved attributes P(Q)
SELECT toy_id FROM toys WHERE toy_name=?
  preserved attributes: toy_id; selection attributes: toy_name
Update: selection S(U) and modified attributes M(U)
UPDATE customers SET cust_name=? WHERE cust_id=?
  modified attributes: cust_name; selection attributes: cust_id
Query and Update classification (2/2)
Ignorable update for a query: M(U) ∩ (S(Q) ∪ P(Q)) = { }
SELECT toy_id FROM toys WHERE toy_name=?
UPDATE customers SET cust_name=? WHERE cust_id=?
No instance of the update ever invalidates the result of any instance of the query
Result-unhelpful: S(U) ∩ P(Q) = { }
SELECT toy_id FROM toys WHERE toy_name=?
UPDATE customers SET cust_name=? WHERE cust_id=?
The result is not helpful in ruling out invalidations
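Both classifications are plain set tests, so they can be checked mechanically. The sketch below assumes attribute sets are supplied by hand as table-qualified names (for example `toys.toy_id`); the function names and the qualification scheme are illustrative choices, not part of the talk.

```python
def is_ignorable(modified_u, selection_q, preserved_q):
    """True when M(U) and (S(Q) union P(Q)) have no attribute in common."""
    return not (set(modified_u) & (set(selection_q) | set(preserved_q)))


def is_result_unhelpful(selection_u, preserved_q):
    """True when S(U) and P(Q) have no attribute in common."""
    return not (set(selection_u) & set(preserved_q))


# Query: SELECT toy_id FROM toys WHERE toy_name=?
S_Q = {"toys.toy_name"}
P_Q = {"toys.toy_id"}

# Update A: UPDATE customers SET cust_name=? WHERE cust_id=?
M_A = {"customers.cust_name"}
S_A = {"customers.cust_id"}

# Update B: UPDATE toys SET toy_id=? WHERE toy_id=?
M_B = {"toys.toy_id"}
S_B = {"toys.toy_id"}

a_ignorable = is_ignorable(M_A, S_Q, P_Q)
a_unhelpful = is_result_unhelpful(S_A, P_Q)
b_ignorable = is_ignorable(M_B, S_Q, P_Q)
b_unhelpful = is_result_unhelpful(S_B, P_Q)
```

Update A touches only the customers table, so it is ignorable and result-unhelpful for this query; update B modifies and selects on `toy_id`, which the query returns, so it is neither.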
Blind vs. Template?
• Blind: always invalidates
• Template: always invalidates if not ignorable
If update is not ignorable, then Blind=Template
Example:
SELECT toy_id FROM toys WHERE toy_name=?
DELETE FROM toys WHERE toy_id=5
Template vs. Statement?
If ignorable, then neither template nor statement invalidates
If not ignorable, and selection predicates of query and update don’t overlap, then both template and statement invalidate
SELECT toy_id FROM toys WHERE toy_name=?
UPDATE toys SET toy_id=? WHERE toy_id=?
Assumptions rule out updates like
UPDATE toys SET toy_id=5 WHERE toy_id=5
Statement vs. View?
If the update is result-unhelpful, then Statement=View
If the update is an insertion and the query is an SPJ (select-project-join) query with conjunctive selection predicates and equality as the join operator, then Statement=View
Significant contribution
Simple Example
DELETE FROM toys WHERE toy_id=5
SELECT toy_name FROM toys WHERE qty>?
If View and Template have the same invalidation behavior,
parameters and query result need not be exposed
View: Minimal View-Inspection Strategy
Template: Minimal Template-Inspection Strategy
1. Whenever Template invalidates, View also invalidates:
2. When View does not invalidate, Template does not invalidate:
DELETE FROM toys WHERE toy_id=5
SELECT cust_name FROM customers WHERE cust_id=?
Scalability-conscious security
Web applications have templates:
SELECT toy_id FROM toys WHERE toy_name=?
Initial list of encrypted data (highly sensitive) → Static analysis of templates → Final list of encrypted data
1. Not all data is useful for invalidation purposes
2. Such data can be found by statically analyzing the templates
Benefits:
1. Data encrypted for “free”; a lot of it is moderately sensitive data
2. Managing the tradeoff becomes simpler: manage over substantially reduced data
Security without hurting scalability
Security Conscious Scalability Approach
• Data not needed for invalidation can be secured “for free” (without hurting scalability)
• As a result, the tradeoff has to be managed only over the remaining data