Compressed Accessibility Map: Efficient Access Control for XML
Ting Yu: University of Illinois
Divesh Srivastava: AT&T Labs
Laks V.S. Lakshmanan: University of British Columbia
H.V. Jagadish: University of Michigan
Information Sharing in Business over the Internet
  XML as a standard information exchange/sharing format
  Direct access to XML documents
    Offer advantages in terms of cost, accuracy and timeliness
  Security is crucial
    Nature of selective access in this context is complex
Access Control for XML
  Fine-grained access control
    Business relationship is sophisticated
    Constraints on tag/attribute level instead of only on document level
  Complex access control rules
    Efficient evaluation of data’s accessibility is desired
    Focus of this talk
An Example XML Document with Access Control Info.
<division name="security" access="public">
  <about_div>
    <member> … </member>
    <member> … </member>
  </about_div>
  <res_activity>
    <description access="internal"> The purpose of … </description>
    <project access="public" type="system">
      <name access="internal">Access Control</name>
      <fund access="internal">…</fund>
      <report access="internal" code="R1-99"> … </report>
    </project>
    <project access="public" type="theory"> … </project>
  </res_activity>
  …..
</division>
*based on examples in [Damiani et al. 2000]
Two Potential Approaches
  Approach 1: use access control rules directly
    Pros: Flexible
    Cons: Time-inefficient
  Approach 2: fully materialized accessibility map (access control list)
    Pros: Time-efficient
    Cons: Space-inefficient
Our Approach
  Compressed Accessibility Map (CAM)
    Take advantage of structural locality of accessibility
    Index accessibility information in a compressed way
    Both time-efficient and space-efficient
Structural Locality of Accessibility
  Data items grouped together have similar accessibility properties
  Common in hierarchically-structured data like XML [Bertino et al. 1999][Damiani et al. 2000]
    Declarative authorization rules based on hierarchical structures
    Accessibility propagation and overriding
Compressed Accessibility Map (CAM)
  Essentially an accessibility index
  Maintain a CAM for each user and access type
  Identify “crucial” data items and store extra accessibility information on them
  Other data items’ accessibility can be inferred efficiently
Identify Crucial Data Items
[Figure: example tree with nodes A through J, each marked as accessible or inaccessible; alongside it, nodes A and B are shown with the labels (d+,s+) and (d-,s+).]
Ancestor Accessibility and Unit Regions
  If a node is accessible, so are its ancestors
  A unit region is a maximal subgraph of an XML database such that ancestor accessibility holds
  Easy to partition an XML database into unit regions
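One way to picture the partition step is the following minimal sketch. It assumes that a unit region starts at the root and at every accessible node whose parent is inaccessible, since that is exactly where ancestor accessibility would otherwise break. This reading is an assumption, not something the slides spell out. The function name `region_starts`, the `children` dictionary and the example tree are hypothetical.

```python
def region_starts(children, root, accessible):
    """Return the nodes that start a unit region (assumed rule, see lead-in).

    children:   dict mapping a node to the list of its children
    accessible: set of accessible nodes
    """
    starts = [root]  # the root always opens the first region
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            # An accessible child under an inaccessible parent would violate
            # ancestor accessibility, so it must open a new unit region.
            if child in accessible and node not in accessible:
                starts.append(child)
            stack.append(child)
    return sorted(starts)


# Hypothetical tree and accessibility assignment, for illustration only.
children = {"A": ["B", "G"], "B": ["C", "D"], "D": ["E", "F"],
            "G": ["H"], "H": ["I", "J"]}
accessible = {"A", "B", "E", "H", "I"}
print(region_starts(children, "A", accessible))  # ['A', 'E', 'H']
```

A single traversal of the tree is enough under this assumption, which is consistent with the claim that partitioning is easy.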
Unit Region Partition
[Figure: the example tree with nodes A through J, marked as accessible or inaccessible and partitioned into unit regions.]
CAM for Unit Regions
  Allowed labels in a unit region CAM: (d+,s+), (d-,s+) and (d-,s-)
  Inference rules
    Label on a node is most specific, thus overrides other inferences
    Ancestor accessibility overrides descendants’ inference
    Nearest labeled ancestor overrides other labeled ancestors
Valid CAM
[Figure: an example tree with nodes A through M, shown once with accessibility unknown and once with nodes marked as accessible or inaccessible, together with a valid CAM whose labeled nodes carry (d+,s+) or (d-,s+).]
CAM Lookup Algorithm
  Given a node e, look up the CAM
    If e is labeled, check the sign of its self label s
    If e has labeled descendants, e is accessible
    Otherwise, get e’s nearest labeled ancestor f; e’s accessibility is determined by the sign of f’s label d
  Complexity: proportional to the product of the depth of e in the XML tree and the log of the size of the CAM
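The three lookup rules can be sketched as follows for a single unit region. This is an illustrative sketch, not the authors' implementation: the names `number_tree` and `lookup`, the representation of a label as a `(d, s)` pair of sign strings, the use of preorder numbers to test for labeled descendants, and the fallback to "inaccessible" when no labeled ancestor exists are all assumptions.

```python
from bisect import bisect_right


def number_tree(children, root):
    """Return parent links and preorder intervals (pre, last) for each node."""
    parent, pre, last = {root: None}, {}, {}
    counter = 0

    def visit(node):
        nonlocal counter
        pre[node] = counter
        counter += 1
        for child in children.get(node, []):
            parent[child] = node
            visit(child)
        last[node] = counter - 1  # largest preorder number in node's subtree

    visit(root)
    return parent, pre, last


def lookup(node, cam, labeled_pre, parent, pre, last):
    """Decide accessibility of `node` using the three rules from the slide."""
    # Rule 1: a label on the node itself decides, via its self sign s.
    if node in cam:
        return cam[node][1] == "+"
    # Rule 2: a labeled node strictly inside the subtree makes the node accessible.
    i = bisect_right(labeled_pre, pre[node])
    if i < len(labeled_pre) and labeled_pre[i] <= last[node]:
        return True
    # Rule 3: otherwise the d sign of the nearest labeled ancestor decides.
    ancestor = parent[node]
    while ancestor is not None:
        if ancestor in cam:
            return cam[ancestor][0] == "+"
        ancestor = parent[ancestor]
    return False  # assumption: no labeled ancestor means inaccessible


# Hypothetical tree and CAM, for illustration only.
children = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": ["G"]}
cam = {"A": ("-", "+"), "D": ("+", "+")}
parent, pre, last = number_tree(children, "A")
labeled_pre = sorted(pre[n] for n in cam)
print([n for n in "ABCDEFG"
       if lookup(n, cam, labeled_pre, parent, pre, last)])  # ['A', 'B', 'D', 'G']
```

In this sketch B is found accessible only through rule 2 (its descendant D is labeled), and G only through rule 3 (D carries d+). The ancestor walk visits at most depth-of-e nodes; the dictionary probes used here stand in for the logarithmic CAM searches that the stated complexity refers to.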
Optimal Unit Region CAM
  CAM with minimum size
    Space-efficient
    Also reduces lookup time
  Build optimal CAM
    Assign labels to each data node in a bottom-up way
    Remove redundant labels
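Redundant Labels: Induced Labels
  Labels that are the same as what is inferred from the labels of the node’s ancestors
[Figure: tree with nodes A through E and labels (d+,s+), (d+,s+), (d-,s+), (d-,s-), (d-,s-); accessible and inaccessible nodes are marked and the induced labels are flagged as redundant.]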
Redundant Labels: Upward Redundant Labels
  Labels that can be inferred from the labels of the node’s descendants
[Figure: tree with nodes A through F and labels (d-,s+), (d+,s+), (d-,s+); accessible and inaccessible nodes are marked and a label is flagged as redundant.]
Build Optimal CAM
  Assign labels in a bottom-up way
    Accessible leaf (d+,s+), inaccessible leaf (d-,s-)
    Internal nodes’ labels are assigned according to children’s labels
  Remove redundant labels
    First remove induced labels
    Then remove upward redundant labels
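The first removal step can be sketched as a single top-down pass. The sketch assumes one particular reading of "induced": a label (d,s) is induced when both signs equal the d sign of the nearest labeled ancestor that is kept. That reading, the function name `remove_induced` and the tree shape in the example are assumptions; only the five label values echo the induced-labels figure. The bottom-up assignment and the removal of upward redundant labels are not covered here.

```python
def remove_induced(children, root, labels):
    """Drop labels that merely repeat what the nearest kept ancestor implies.

    labels: dict mapping a node to a (d, s) pair of sign strings.
    Returns a new dict with the induced labels removed.
    """
    kept = {}
    # Each stack entry carries the d sign inherited from the nearest kept
    # labeled ancestor (None at the root, which has no ancestor).
    stack = [(root, None)]
    while stack:
        node, inherited = stack.pop()
        label = labels.get(node)
        if label is not None and label != (inherited, inherited):
            kept[node] = label     # the label adds information, keep it
            inherited = label[0]   # its d sign now governs the subtree
        for child in children.get(node, []):
            stack.append((child, inherited))
    return kept


# Hypothetical chain-shaped tree; label values taken from the figure's list.
children = {"A": ["B"], "B": ["C"], "C": ["D", "E"]}
labels = {"A": ("+", "+"), "B": ("+", "+"), "C": ("-", "+"),
          "D": ("-", "-"), "E": ("-", "-")}
print(remove_induced(children, "A", labels))
# {'A': ('+', '+'), 'C': ('-', '+')}
```

Under this reading, B is dropped because A's d+ already implies it, and D and E are dropped because C's d- already implies them.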
Build Optimal CAM
[Figure: the example tree with nodes A through M, marked as accessible or inaccessible, with labels (d+,s+), (d-,s+) and (d-,s-) assigned to the nodes, including one shown as (d?,s+).]
CAM for Multi Unit Regions
  Only need to mark out those nodes (marker nodes) that start a unit region
  Build optimal CAM for each unit region
  Combine the CAMs of the unit regions
  Lookup algorithm is almost the same, but needs to take marker nodes into consideration; complexity remains the same
Further Compression in CAM for Multiple Unit Regions
[Figure: the example tree with nodes A through J, with node H and the label (d+,s+) called out.]
Experimental Verification
  Metric: compression ratio
    Size of CAM / size of fully materialized accessibility map
  Synthetic data set
    Generated by IBM XML generator
    Study accessibility locality’s impact on compression ratio of CAM
  Real data set
    Large file systems with real access control data
Impact of Accessibility Locality
[Plot: compression ratio (y-axis, 0 to 1) versus accessibility ratio (x-axis, 0 to 1).]
Compression ratio when accessible nodes are uniformly distributed in the XML tree
Impact of Accessibility Locality
[Plot: compression ratio (y-axis, 0 to 1) versus accessibility ratio (x-axis, 0 to 1).]
Compression ratio when accessibility locality is high
Conclusion
  Compressed accessibility map as an efficient way to evaluate access control data for XML documents
    Time-efficient and space-efficient
  Future work
    Better support for incremental CAM updates
    Take advantage of commonalities of users’ access rights and globally optimize CAM