VOX: Order-sensitive View Maintenance of Materialized XQuery Views
ER 2003, October 14th 2003
Katica Dimitrova*, Maged El-Sayed and Elke Rundensteiner
Worcester Polytechnic Institute (*now at Microsoft)
Motivation
Views in general
  Information integration
  Access control, privacy, etc.
  Data warehouses
XML views (extra useful)
  Information inter-portability
  Crossing gaps between different data models
Materialized views
  Fast access over complex views
  Increased availability
  Query optimization
[Figure: a view, defined by a view definition query, over RDB, XML and other sources]
Maintaining Materialized Views
When sources are updated, the materialized view may become inconsistent.
Methods of view maintenance
  Recomputation: recompute the view from scratch from the base data
  Incremental view maintenance: compute changes to the view in response to changes to the base sources
Incremental view maintenance is usually cheaper than full recomputation.
[Figure: a view, defined by a view definition query, over Source 1, Source 2 and Sources 3..n; an update to a source leads to an update of the view]
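The two maintenance methods can be contrasted in a few lines of Python. This is only a minimal sketch under stated assumptions: the view is a plain list, the update is modeled as the insertion of a whole book record, and all function names are hypothetical. It also ignores order, which is the problem the rest of the deck addresses.

```python
def view_definition(books):
    """Hypothetical view definition: titles of books priced under 60."""
    return [b["title"] for b in books if b["price"] < 60]

def recompute(books):
    """Recomputation: rebuild the whole view from the base data."""
    return view_definition(books)

def maintain_incrementally(view, inserted_books):
    """Incremental maintenance: evaluate the definition on the change only
    and add the result to the existing view."""
    return view + view_definition(inserted_books)

source = [
    {"title": "Advanced Programming in the Unix environment", "price": 65.95},
    {"title": "Data on the Web", "price": 39.95},
]
view = recompute(source)

# Assumption: the source update is modeled as one inserted book record.
delta = [{"title": "TCP/IP Illustrated", "price": 55.48}]
source = source + delta
view = maintain_incrementally(view, delta)
```

The incremental path touches only the inserted record, while recomputation rescans every book.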
Goal
Incrementally maintaining XQuery views
Why is it a challenge?
  XML features
    Hierarchical
    Optional elements
    Self-typed
    IDRefs
    Ordered
  Expressiveness of the XQuery language
    Complex operations: tagging, unnesting, aggregation, ..
  Expected large auxiliary information
[Figure: a view, defined by an XQuery view definition, over several XML sources]
Basics of VOX Approach: Algebraic
General approaches to view maintenance
  Algorithmic: a fixed procedure exists for a fixed view type
  Algebraic: update propagation rules for each algebra operator and each update type
[Figure: the XQuery definition over the XML sources is translated into an algebra tree that produces the XML view. At execution time an operator maps D1 to D2; at view maintenance time it maps a D1 update to a D2 update.]
Example
View definition query: list all books that cost less than $60
  <result>
    for $b in document("bib.xml")/bib/book
    where $b/price/text() < 60
    return <cheap_book> $b/title </cheap_book>
  </result>
Bib.xml
  <bib>
    <book>
      <price> 65.95 </price>
      <title> Advanced Programming in the Unix environment </title>
    </book>
    <book>
      <title> TCP/IP Illustrated </title>
    </book>
    <book>
      <price>39.95</price>
      <title> Data on the Web </title>
    </book>
  </bib>
View extent
  <result>
    <cheap_book>
      <title>Data on the Web</title>
    </cheap_book>
  </result>
Update: insert element <price>55.48</price> into second book
Fragment added to the view extent by the update
  <cheap_book> <title>TCP/IP Illustrated</title> </cheap_book>
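The running example can be replayed with Python's standard `xml.etree.ElementTree`. This is a hedged sketch, not the VOX implementation: it simply re-evaluates the view before and after the source update (that is, by recomputation), and the helper names `cheap_books` and `titles` are made up for illustration.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>"""

def cheap_books(bib):
    """Evaluate the view: one <cheap_book> with the title of each book under $60."""
    result = ET.Element("result")
    for book in bib.findall("book"):
        price = book.find("price")
        if price is not None and float(price.text) < 60:
            cheap = ET.SubElement(result, "cheap_book")
            title = ET.SubElement(cheap, "title")
            title.text = book.find("title").text
    return result

def titles(view):
    """Titles in the view, in document order."""
    return [t.text for t in view.iter("title")]

bib = ET.fromstring(BIB)
before = titles(cheap_books(bib))

# Source update: insert <price>55.48</price> into the second book.
second = bib.findall("book")[1]
price = ET.Element("price")
price.text = "55.48"
second.insert(0, price)
after = titles(cheap_books(bib))
```

After the update the new `<cheap_book>` appears before the existing one, because the second book precedes the third in document order.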
Background on XML Algebra XAT
XQuery is translated into an XAT algebra tree [ZR02]
XAT operators:
  XAT SQL operators: Select, Project, …
  XAT XML operators: Navigate Unnest, Navigate Collection, Tagger, Combine, ..
XAT tree for the view definition query, from the source (bottom) to the view (top):
  Source S: "bib.xml" -> $s6
  Navigate: $s6, /book -> $b
  Navigate: $b, title -> $col3
  Navigate: $b, price/text() -> $col5
  Select: ($col5 < 60.0)
  Tagger T: <cheap_book>$col3</cheap_book> -> $col2
  Combine C: $col2
  Tagger T: <result>$col2</result> -> $col1
  view: $col1
Background on XML Algebra XAT – Data Model
XAT data model (XAT table)
  Order-sensitive table of tuples
  Columns denote user-specified or internally generated variable bindings
  A cell in a tuple holds an XML node or a sequence of XML nodes
The XAT algebra has ordered bag semantics
Example: Navigate $b, price/text() -> $col5
Input
  $b | $col3
  <book> <price> 65.95 </price> <title>Advanc ..</title> </book> | <title> Advanc ..</title>
  <book> <title> TCP/IP …</title> </book> | <title> TCP/IP …</title>
  <book> <price> 39.95 </price> <title> Data on ..</title> </book> | <title> Data on ..</title>
Output
  $col3 | $col5
  <title>Advanc ..</title> | 65.95
  <title>TCP/IP …</title> | (empty)
  <title> Data on ..</title> | 39.95
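An XAT table can be pictured as an ordered list of tuples whose cells hold XML nodes or sequences of nodes. The Python sketch below is an illustration under assumptions, not the XAT implementation: tuples are dicts, `navigate` is a hypothetical helper, and an empty navigation result is kept as an empty cell, as in the output table above. ElementTree has no `text()` step, so the price text is read from the `price` element instead.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib>"
    "<book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>"
    "<book><title>TCP/IP Illustrated</title></book>"
    "<book><price>39.95</price><title>Data on the Web</title></book>"
    "</bib>"
)

# An XAT table sketched as an ordered list of tuples (column name -> cell).
table = [{"$b": book} for book in bib.findall("book")]

def navigate(rows, col, path, new_col):
    """Add new_col holding the sequence of nodes reached from col via path.
    Simplification: an empty sequence is kept as an empty cell."""
    return [{**row, new_col: row[col].findall(path)} for row in rows]

with_titles = navigate(table, "$b", "title", "$col3")
with_prices = navigate(with_titles, "$b", "price", "$col5")

# Text content of each $col5 cell, in tuple order.
prices = [[p.text for p in row["$col5"]] for row in with_prices]
```

The list order of `with_prices` is what makes this table order-sensitive: the position of each tuple carries the document order of the books.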
Order in XAT Context: View Maintenance
Input of Select ($col5 < 60.0)
  $col3 | $col5
  <title> Advanced Prog… </title> | 65.95
  <title> TCP/IP Illustrated </title> | (empty; 55.48 after the update)
  <title> Data on the Web </title> | 39.95
Output ($col3) before the update
  <title> Data on the Web </title>
Output after the update, non order-sensitive
  <title> Data on the Web </title>
  <title> TCP/IP Illustrated </title>
Output after the update, order-sensitive
  <title> TCP/IP Illustrated </title>
  <title> Data on the Web </title>
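The difference between the two outputs can be reproduced with a small Python sketch. It is an illustration only: the integer source position is an assumed stand-in for whatever order information the system keeps, and `select_cheap` is a hypothetical helper.

```python
# Rows of the Select input as (source position, title, price); None means no price.
rows = [
    (0, "Advanced Prog…", 65.95),
    (1, "TCP/IP Illustrated", None),
    (2, "Data on the Web", 39.95),
]

def select_cheap(rows):
    """Select ($col5 < 60.0), keeping the source position with each title."""
    return [(pos, title) for pos, title, price in rows
            if price is not None and price < 60.0]

old_output = select_cheap(rows)

# After the update the second book has a price of 55.48.
delta = select_cheap([(1, "TCP/IP Illustrated", 55.48)])

# Non order-sensitive maintenance: append the delta to the old output.
appended = [title for _, title in old_output + delta]

# Order-sensitive maintenance: place the delta according to source order.
ordered = [title for _, title in sorted(old_output + delta)]
```

Appending yields the non order-sensitive result, while sorting by source position yields the order-sensitive one.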
Our Approach to Maintaining Order
Use node identity
Why?
  Already present as a concept in XQuery
  Can be a reference to the base XML data set
  Can encode structure and order
Lexicographical Keys: LexKeys
Multi-level lexicographical keys
Comparison
  b.h < b.t
  bab < bd.cc
  b.b < b.b.c
Advantages
  It is always possible to generate a key between two keys
  The deletion of a LexKey in a sequence does not affect other LexKeys
Example: LexKeys of the nodes of bib.xml
  bib b
    book b.h
      price b.h.k
      title b.h.r
    book b.n
      price b.n.f (key generated in front of title b.n.m)
      title b.n.m
    book b.t
      price b.t.k
      title b.t.r
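Both properties of LexKeys can be tried out in Python. The sketch below is one possible scheme, not necessarily the one used in VOX: levels are compared as tuples of strings, and `between` is a hypothetical midpoint rule over the letters a-z that assumes its upper bound does not end in 'a'.

```python
def parse(key):
    """Split a LexKey such as 'b.h.r' into its levels."""
    return tuple(key.split("."))

def less(k1, k2):
    """Level-by-level lexicographic comparison; a prefix sorts first."""
    return parse(k1) < parse(k2)

def between(lo, hi):
    """Return a one-level key strictly between lo and hi (letters a-z only).
    Assumes lo < hi and that hi does not end in 'a'; lo may be ''."""
    out, i, bounded = [], 0, True
    while True:
        a = ord(lo[i]) - 97 if i < len(lo) else 0
        # hi only constrains us while the output is still a prefix of hi.
        b = ord(hi[i]) - 97 if bounded and i < len(hi) else 26
        if b - a > 1:
            out.append(chr(97 + (a + b) // 2))
            return "".join(out)
        out.append(chr(97 + a))
        bounded = bounded and a == b
        i += 1

def sibling_between(parent, lo, hi):
    """LexKey for a new child of parent placed between siblings lo and hi."""
    return parent + "." + between(lo, hi)

new_book = sibling_between("b", "h", "n")   # a book between b.h and b.n
```

Because a new key is always found by extending or splitting the existing range, no existing key has to be renumbered on insertion, and deleting a key leaves the remaining keys valid.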
LexKeys - References to source XML nodes
Navigate: $b, title -> $col3
Cells holding the XML nodes themselves
  Input ($b)
    <book> <price> 65.95 </price> <title>Advanc ..</title> </book>
    <book> <title> TCP/IP …</title> </book>
    <book> <price> 39.95 </price> <title> Data on ..</title> </book>
  Output ($b | $col3)
    <book> <price> 65.95 </price> <title>Advanc ..</title> </book> | <title> Advanc ..</title>
    <book> <title> TCP/IP …</title> </book> | <title> TCP/IP …</title>
    <book> <price> 39.95 </price> <title> Data on ..</title> </book> | <title> Data on ..</title>
Cells holding LexKeys that reference nodes in the Storage Manager
  Input ($b)
    b.h
    b.n
    b.t
  Output ($b | $col3)
    b.h | b.h.r
    b.n | b.n.m
    b.t | b.t.r
Storage Manager: bib.xml with LexKeys
  bib b
    book b.h (price b.h.k, title b.h.r)
    book b.n (title b.n.m)
    book b.t (price b.t.k, title b.t.r)
LexKeys - References to constructed nodes
Tagger T: <cheap_book>$col3</cheap_book> -> $col2
  Input ($col3)
    b.n.m
    b.t.r
  Output ($col2)
    y.c
    y.b
Constructed nodes in the Storage Manager (LexKey | skeleton)
  y.b | cheap_book containing b.t.r
  y.c | cheap_book containing b.n.m
Order Among XAT Tuples
Notion: designate an order schema for XAT tables
Ordering by the LexKeys in the columns of the order schema yields the correct tuple order.
Comparison operation '<' on tuples.
Example: Navigate $b, title -> $col3
  Input ($b)
    b.h
    b.n
    b.t
  Output ($b | $col3)
    b.h | b.h.r
    b.n | b.n.m
    b.t | b.t.r
  b.h < b.n < b.t
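Ordering tuples through an order schema amounts to sorting on the LexKeys found in the designated columns. The following Python sketch assumes tuples are dicts and that the order schema is a list of column names; both choices, and the function names, are illustrative.

```python
def lexkey(key):
    """Split a LexKey into levels so that tuple comparison orders it."""
    return tuple(key.split("."))

def sort_by_order_schema(rows, order_schema):
    """Order tuples by the LexKeys in the order-schema columns."""
    return sorted(rows, key=lambda row: tuple(lexkey(row[c]) for c in order_schema))

# Tuples kept in an arbitrary physical order.
rows = [
    {"$b": "b.n", "$col3": "b.n.m"},
    {"$b": "b.h", "$col3": "b.h.r"},
    {"$b": "b.t", "$col3": "b.t.r"},
]

ordered = sort_by_order_schema(rows, ["$b"])
books_in_order = [row["$b"] for row in ordered]
```

Since the order can always be recovered this way, the physical position of a tuple in the table no longer matters.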
Order Schema Computation
Calculated in a postorder traversal of the tree
Schema computation rules
  Operator op(R) | Order schema OS_Q, Q = op(R)
  Tagger T pattern $col' (R) | OS_R
Order Among Nodes in a Cell
Most collections of XML nodes are in document order: Navigate Collection, XML Union, …
Combine creates a collection in which nodes may be in an order different from the one encoded in node identity
Concept of overriding order: LexKey with overriding order
  Key (LexKey): node identity part, by default also represents order
  Overriding order (LexKey): optional, only represents order when present
Notation: key [order]
Examples
  b.c.b [h]
  b.c.b
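The key [order] notation can be parsed and used for sorting in a few lines of Python. This is a sketch under assumptions: the textual syntax is taken from the examples above, and `parse` and `sort_cell` are hypothetical helper names.

```python
import re

def parse(ref):
    """Parse 'key [order]' into (key, order); the order defaults to the key."""
    m = re.fullmatch(r"\s*([\w.]+)\s*(?:\[\s*([\w.]+)\s*\])?\s*", ref)
    if m is None:
        raise ValueError(f"not a LexKey reference: {ref!r}")
    key, order = m.group(1), m.group(2)
    return key, order if order is not None else key

def sort_cell(refs):
    """Order the nodes of a cell by overriding order when present, else by key."""
    return sorted(refs, key=lambda ref: tuple(parse(ref)[1].split(".")))

# Constructed nodes whose own keys (y.b, y.c) do not reflect the source order.
cell = ["y.b[b.t]", "y.c[b.n]"]
in_order = sort_cell(cell)
```

Here y.c sorts before y.b because its overriding order b.n precedes b.t, even though the keys alone would order them the other way.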
The Impact of Using LexKeys on View Maintenance
XML algebra now has (non-ordered) bag semantics
Gained distributiveness with regard to bag union and difference
Compact intermediate results
Example: Select ($col5 < 60.0)
  Input ($b | $col3 | $col5)
    b.h | b.h.r | b.h.k.m
    b.n | b.n.m | (empty; b.n.f.m after the update)
    b.t | b.t.r | b.t.k.m
  Output ($b | $col3) before the update
    b.t | b.t.r
  Output ($b | $col3) after the update
    b.t | b.t.r
    b.n | b.n.m
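Distributiveness over bag union and difference can be checked with Python's `collections.Counter` as a bag. This is a simplified sketch: the update is modeled as the insertion of a whole tuple, and the third field holds the price value directly instead of a LexKey that would be dereferenced.

```python
from collections import Counter

def select(bag, predicate):
    """Select over a bag (Counter of tuples), keeping multiplicities."""
    return Counter({t: n for t, n in bag.items() if predicate(t)})

def cheap(t):
    """The example predicate: price below 60.0."""
    return t[2] < 60.0

# Tuples are ($b, $col3, price value); LexKeys carry the order, so bags suffice.
r_old = Counter([("b.h", "b.h.r", 65.95), ("b.t", "b.t.r", 39.95)])
delta = Counter([("b.n", "b.n.m", 55.48)])
r_new = r_old + delta

# Bag union: select(R + dR) equals select(R) + select(dR).
union_lhs = select(r_new, cheap)
union_rhs = select(r_old, cheap) + select(delta, cheap)

# Bag difference: select(R - dR) equals select(R) - select(dR).
diff_lhs = select(r_new - delta, cheap)
diff_rhs = select(r_new, cheap) - select(delta, cheap)
```

Only `select(delta, cheap)` has to be evaluated at maintenance time; the old output is reused as is.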
Update Propagation Strategy
[Figure: an update XQuery applied to one of the XML sources is propagated through the XAT tree to the XML view; Rainbow and its Storage Manager]
Update Propagation Rules
Use distributiveness with regard to bag union
Reuse rules from relational view maintenance for XAT SQL operators
Provide rules for XAT XML operators
[Figure: at execution time an operator maps R to Q; at view maintenance time it maps an update to R into an update to Q]
Update Propagation Rules Example - Navigate Unnest on Insertion of Tuples
Q_{old} = \mathrm{NavUnnest}^{\$col'}_{\$col,\,path}(R_{old})
R_{new} = R_{old} + \Delta R
Q_{new} = \mathrm{NavUnnest}^{\$col'}_{\$col,\,path}(R_{old} + \Delta R)
        = \mathrm{NavUnnest}^{\$col'}_{\$col,\,path}(R_{old}) + \mathrm{NavUnnest}^{\$col'}_{\$col,\,path}(\Delta R)
        = Q_{old} + \Delta Q
+ represents bag union
[Figure: at execution time Navigate Unnest maps R to Q; at view maintenance time it maps u(R) to u(Q)]
Propagate u(Q)
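The insertion rule can be exercised on the running example in Python. This is a minimal sketch under assumptions: bags are `Counter` objects, a tuple is a sequence of (column, LexKey) pairs, and `TITLES` is a hypothetical stand-in for looking up child nodes in the storage manager.

```python
from collections import Counter

# Hypothetical storage lookup: LexKey of a book -> LexKeys of its title children.
TITLES = {"b.h": ["b.h.r"], "b.n": ["b.n.m"], "b.t": ["b.t.r"]}

def titles_of(book_key):
    return TITLES.get(book_key, [])

def navigate_unnest(bag, col, path_fn, new_col):
    """One output tuple per node reached from the cell in col; bag in, bag out."""
    out = Counter()
    for row, n in bag.items():
        for node in path_fn(dict(row)[col]):
            out[row + ((new_col, node),)] += n
    return out

r_old = Counter([(("$b", "b.h"),), (("$b", "b.t"),)])
delta_r = Counter([(("$b", "b.n"),)])          # inserted tuples

q_old = navigate_unnest(r_old, "$b", titles_of, "$col3")
delta_q = navigate_unnest(delta_r, "$b", titles_of, "$col3")   # propagated update
q_new = navigate_unnest(r_old + delta_r, "$b", titles_of, "$col3")
```

Applying the operator to the inserted tuples alone gives exactly the tuples that have to be added to the old output.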
View Maintenance Example
Update: insert element <price>55.48</price> into second book
Storage Manager: bib.xml
  bib b
    book b.h (price b.h.k with text 65.95 b.h.k.m, title b.h.r)
    book b.n (title b.n.m; inserted price b.n.f with text 55.48 b.n.f.m)
    book b.t (price b.t.k with text 39.95 b.t.k.m, title b.t.r)
Storage Manager: constructed XDOMs (LexKey | skeleton)
  y.b | cheap_book containing b.t.r
  y.c | cheap_book containing b.n.m (created by the update)
  x | result containing y.b[b.t], and y.c[b.n] after the update
Intermediate XAT tables in Rainbow
  pid | tid | $b
  1 | 1 | b.h
  1 | 2 | b.n
  1 | 3 | b.t

  tid | $b | $col3 | $col5
  1 | b.h | b.h.r | b.h.k.m
  2 | b.n | b.n.m | (empty; b.n.f.m after the update)
  3 | b.t | b.t.r | b.t.k.m

  Output of Select ($col5 < 60.0): tid 3, $b = b.t; after the update also tid 2, $b = b.n
Source update
  +b.n.f, book[b.n]/price[b.n.f] b
Updates propagated up the XAT tree, from the source to the view
  u (+b.n.f, book[b.n]/price[b.n.f] b, $s6, 1)
  u (+b.n.f, price[b.n.f] b.n, $b, 2)
  u (c, $col5, 2) | c = b.n.f.m
  u (s) | s = (b.n, b.n.m)
  u (s) | s = (b.n, y.c)
  u (c, $col2, 1) | c = y.c[b.n]
  u (+y.c[b.n], result[1]/$col2 x, $col1, 1)
View Maintenance Example
After propagation, the view root $col1 references the constructed node x
  x | result containing y.b[b.t] and y.c[b.n]
  y.c[b.n] | cheap_book containing b.n.m (title: TCP/IP Illustrated)
  y.b[b.t] | cheap_book containing b.t.r (title: Data on the Web)
Resulting view, in the order given by the overriding order (b.n < b.t)
  result
    cheap_book
      title: TCP/IP Illustrated
    cheap_book
      title: Data on the Web
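The last step of the example, turning the stored keys back into the view document, can be sketched in Python. The dictionaries below are hypothetical in-memory stand-ins for the storage manager, filled with the keys of the running example; `materialize` is an illustrative helper, not part of Rainbow.

```python
# Hypothetical stand-ins for the stored nodes of the running example.
SOURCE_TEXT = {"b.n.m": "TCP/IP Illustrated", "b.t.r": "Data on the Web"}
SKELETONS = {"y.b": ("cheap_book", "b.t.r"), "y.c": ("cheap_book", "b.n.m")}

# Content of the result node x as (key, overriding order) pairs.
RESULT_CELL = [("y.b", "b.t"), ("y.c", "b.n")]

def lexkey(key):
    """Split a LexKey into levels so that tuple comparison orders it."""
    return tuple(key.split("."))

def materialize(cell):
    """Dereference the constructed nodes in overriding order and serialize them."""
    parts = []
    for key, _order in sorted(cell, key=lambda ref: lexkey(ref[1])):
        tag, title_key = SKELETONS[key]
        parts.append(f"<{tag}><title>{SOURCE_TEXT[title_key]}</title></{tag}>")
    return "<result>" + "".join(parts) + "</result>"

view = materialize(RESULT_CELL)
```

The new node y.c was added to the cell last, yet it is serialized first because its overriding order b.n precedes b.t.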
Experimental Evaluation
Implemented in Java on top of the Rainbow system
Basic performance comparison: varying size of insert
  637 elements of interest
  Selectivity 50%
[Figure: time (ms), on an axis from 0 to 4500, for Recomputation and View Maintenance as the size of the update goes from 400% down to 1.56% of the original data size]
Related work
Relational
  [GMS93] Survey
  [GL95] Algebraic approach to maintain relational views with duplicates
  [BLT86], [CW91], [ZGHW95], [Q96], [MK00], [PSCP02], …
Object-Oriented
  [KR96] MultiView. Object algebra, exploits OO features like inheritance, path indexes.
  [AFP02] Algebraic approach. Stores OIDs rather than actual data.
XML-like data models
  [ZM98] Select-Project graph structured views as collections of objects.
  [AMRVW98] Semistructured data model OEM, query language LOREL. Only atomic updates. Does not handle order.
  [QLR02] Dynamic web data. Based on XPath. Maintains path index structure.
  [LD00] Hierarchical semistructured data. View defined with WHAX-QL. Does not handle order.
  [EWDR02] Motivation for this work. Algebraic approach. Does not handle order. Large intermediate results.
Conclusions
Proposed order-encoding scheme that migrates XML algebra from ordered to non-ordered bag semantics
Gave first solution to order-sensitive XQuery view maintenance
Handles core of XQuery
Handles complex updates
Proved correctness of approach
Implemented the solution within Rainbow
Experimental evaluation confirms feasibility of solution
For more information
The Rainbow project: http://davis.wpi.edu/dsrg/rainbow/
Related publications
  K. Dimitrova, M. El-Sayed and E. Rundensteiner. Order-sensitive View Maintenance of Materialized XQuery Views. Technical Report WPI-CS-TR-03-17, May 2003.
  M. El-Sayed, K. Dimitrova and E. Rundensteiner. Efficiently Supporting Order in XML Query Processing. WIDM'03, New Orleans, Nov. 2003.
  X. Zhang, K. Dimitrova, L. Wang, B. Pielech, L. Ding, B. Murphy, M. El-Sayed and E. Rundensteiner. RainbowII: Multi-XQuery Optimization Using Materialized XML Views. SIGMOD Demo, Jun. 2003.
  M. Sayed, L. Wang, L. Ding and E. Rundensteiner. An Algebraic Approach for Incremental Maintenance of Materialized XQuery Views. In Proceedings of WIDM02, page 88, 2002.
  X. Zhang, B. Pielech and E. Rundensteiner. Honey, I Shrunk the Xquery! An XML Algebra Optimization Approach. In Proceedings of WIDM02, 2002.
  X. Zhang, M. Mulchandani, S. Christ, B. Murphy and E. Rundensteiner. Rainbow: Mapping-Driven XQuery Processing System. In Proceedings of SIGMOD02, Demo Session, page 614, 2002.
Thank you!