
VOX: Order-sensitive View Maintenance of Materialized XQuery Views


ER 2003 October 14th 2003

Katica Dimitrova*, Maged El-Sayed and Elke Rundensteiner

Worcester Polytechnic Institute *Now at Microsoft

Motivation

  • Views in general
      • Information integration
      • Access control, privacy, etc.
      • Data warehouses
  • XML views (extra useful)
      • Information inter-portability
      • Crossing gaps between different data models
  • Materialized views
      • Fast access over complex views
      • Increased availability
      • Query optimization

[Figure: a view, defined by a view definition query, over RDB, XML, and other sources]

Maintaining Materialized Views

When sources are updated, a materialized view may become inconsistent.

Methods of view maintenance:

  • Recomputation: recompute the view from scratch from the base data
  • Incremental view maintenance: compute the changes to the view in response to changes to the base sources

Incremental view maintenance is usually cheaper than full recomputation.

[Figure: an update to one of sources 1..n is propagated through the view definition query to the view]
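The two maintenance methods can be contrasted with a minimal sketch. It assumes a toy selection view over a list of rows; the names `recompute`, `maintain_on_insert`, and the row layout are hypothetical and not part of VOX or Rainbow. Order is deliberately ignored here: new view rows are simply appended.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def recompute(source: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Recomputation: rebuild the whole view from the base data."""
    return [row for row in source if predicate(row)]


def maintain_on_insert(view: List[Row], inserted: List[Row],
                       predicate: Callable[[Row], bool]) -> List[Row]:
    """Incremental maintenance: only the inserted rows are examined."""
    return view + [row for row in inserted if predicate(row)]


def cheap(row: Row) -> bool:
    return row["price"] < 60


source = [{"title": "A", "price": 65.95}, {"title": "B", "price": 39.95}]
view = recompute(source, cheap)

# A new row arrives at the source; the view is refreshed from the delta only.
delta = [{"title": "C", "price": 55.48}]
source = source + delta
view = maintain_on_insert(view, delta, cheap)
```

For this append-only case the incrementally maintained view coincides with a full recomputation, while touching only the inserted rows.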

Goal

Incrementally maintaining XQuery views.

Why is it a challenge?

  • XML features
      • Hierarchical
      • Optional elements
      • Self-typed
      • IDRefs
      • Ordered
  • Expressiveness of the XQuery language
      • Complex operations: tagging, unnesting, aggregation, ...
  • Expected large auxiliary information

[Figure: an XML view defined by an XQuery view definition over several XML sources]

Basics of VOX Approach: Algebraic

General approaches to view maintenance:

  • Algorithmic: a fixed procedure exists for a fixed view type
  • Algebraic: update propagation rules for each algebra operator and each update type

[Figure: the XQuery definition is compiled into an algebra tree between the XML sources and the XML view. At execution time an operator maps D1 to D2; at view maintenance time it maps an update on D1 to an update on D2.]

Example

View definition query (list all books that cost less than $60):

    <result>{
      for $b in document("bib.xml")/bib/book
      where $b/price/text() < 60
      return <cheap_book>{ $b/title }</cheap_book>
    }</result>

Source document bib.xml:

    <bib>
      <book>
        <price>65.95</price>
        <title>Advanced Programming in the Unix environment</title>
      </book>
      <book>
        <title>TCP/IP Illustrated</title>
      </book>
      <book>
        <price>39.95</price>
        <title>Data on the Web</title>
      </book>
    </bib>

View extent before the update:

    <result>
      <cheap_book>
        <title>Data on the Web</title>
      </cheap_book>
    </result>

Update: insert element <price>55.48</price> into the second book.

View extent after the update:

    <result>
      <cheap_book>
        <title>TCP/IP Illustrated</title>
      </cheap_book>
      <cheap_book>
        <title>Data on the Web</title>
      </cheap_book>
    </result>
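The effect of the update on the view extent can be reproduced with a small sketch. It uses Python's standard `xml.etree.ElementTree` instead of an XQuery engine, and it refreshes the view by recomputation, so it only illustrates the expected extents, not the VOX technique. The helper `cheap_books` is a hypothetical stand-in for the view definition query.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>"""


def cheap_books(bib: ET.Element, limit: float = 60.0) -> list:
    """Titles of all books with a price below `limit`, in document order."""
    titles = []
    for book in bib.findall("book"):  # direct children, document order
        price = book.find("price")
        if price is not None and float(price.text) < limit:
            titles.append(book.find("title").text)
    return titles


bib = ET.fromstring(BIB)
before = cheap_books(bib)

# The update: insert <price>55.48</price> into the second book.
second_book = bib.findall("book")[1]
new_price = ET.Element("price")
new_price.text = "55.48"
second_book.insert(0, new_price)

after = cheap_books(bib)
```

The newly qualifying book appears before "Data on the Web" because the view follows the document order of bib.xml.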

Background on XML Algebra XAT

  • XQuery → XAT algebra tree [ZR02]
  • XAT operators
      • XAT SQL operators: Select, Project, ...
      • XAT XML operators: Navigate Unnest, Navigate Collection, Tagger, Combine, ...

XAT algebra tree for the example view, listed from the source (bottom) to the view (top):

    Source    S "bib.xml" → $s6
    Navigate  $s6, /book → $b
    Navigate  $b, title → $col3
    Navigate  $b, price/text() → $col5
    Select    ($col5 < 60.0)
    Tagger    T <cheap_book>$col3</cheap_book> → $col2
    Combine   C $col2
    Tagger    T <result>$col2</result> → $col1
    View      $col1

Background on XML Algebra XAT - Data Model

XAT data model (XAT table):

  • Order-sensitive table of tuples
  • Columns denote user-specified or internally generated variable bindings
  • A cell in a tuple holds an XML node or a sequence of XML nodes

The XAT algebra has ordered bag semantics.

Example: the operator Navigate $b, price/text() → $col5 of the example view.

Input:

    $b                                                              | $col3
    <book> <price>65.95</price> <title>Advanc ..</title> </book>    | <title>Advanc ..</title>
    <book> <title>TCP/IP …</title> </book>                          | <title>TCP/IP …</title>
    <book> <price>39.95</price> <title>Data on ..</title> </book>   | <title>Data on ..</title>

Output:

    $col3                     | $col5
    <title>Advanc ..</title>  | 65.95
    <title>TCP/IP …</title>   |
    <title>Data on ..</title> | 39.95

Order in XAT Context: View Maintenance

Select ($col5 < 60.0) is applied to the table below. The update inserts the price 55.48 into the second tuple.

    $col3                             | $col5
    <title>Advanced Prog…</title>     | 65.95
    <title>TCP/IP Illustrated</title> | (55.48 inserted)
    <title>Data on the Web</title>    | 39.95

Output before the update:

    $col3
    <title>Data on the Web</title>

Output after the update, non-order-sensitive:

    $col3
    <title>Data on the Web</title>
    <title>TCP/IP Illustrated</title>

Output after the update, order-sensitive:

    $col3
    <title>TCP/IP Illustrated</title>
    <title>Data on the Web</title>
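The difference between the two outputs can be sketched as follows. The sketch assumes each tuple carries an explicit source position, which is a hypothetical simplification for illustration. Appending the delta gives the non-order-sensitive result, while inserting it at its sorted position gives the order-sensitive one.

```python
import bisect

# Each row is (position in source, title, price); the position is an assumption
# made for this illustration only.
rows = [
    (0, "Advanced Programming in the Unix environment", 65.95),
    (1, "TCP/IP Illustrated", None),
    (2, "Data on the Web", 39.95),
]


def select_cheap(table):
    """Select ($col5 < 60.0), keeping the position and the title."""
    return [(pos, title) for pos, title, price in table
            if price is not None and price < 60.0]


view = select_cheap(rows)

# The update gives the second book a price of 55.48; propagate only the delta.
delta = select_cheap([(1, "TCP/IP Illustrated", 55.48)])

# Non-order-sensitive maintenance: append the delta to the view.
non_order_sensitive = view + delta

# Order-sensitive maintenance: place each delta tuple at its sorted position.
order_sensitive = list(view)
for row in delta:
    bisect.insort(order_sensitive, row)
```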

Our Approach to Maintaining Order

Use node identity.

Why?

  • Already present as a concept in XQuery
  • Can be a reference to the base XML data set
  • Can encode structure and order

Lexicographical Keys: LexKeys

Multi-level lexicographical keys, shown here on the nodes of bib.xml:

    bib [b]
      book [b.h]
        price [b.h.k]
        title [b.h.r]
      book [b.n]
        price [b.n.f]   (the price inserted by the example update)
        title [b.n.m]
      book [b.t]
        price [b.t.k]
        title [b.t.r]

Comparison:

  • b.h < b.t
  • bab < bd.cc
  • b.b < b.b.c

Advantages:

  • It is always possible to generate a key between two keys
  • The deletion of a LexKey in a sequence does not affect other LexKeys
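A minimal sketch of multi-level lexicographical keys follows. It assumes keys are dot-separated levels of lowercase letters and represents them as tuples of strings, so that Python's built-in tuple comparison gives the level-by-level order. The generation scheme in `level_between` is a hypothetical midpoint scheme chosen for illustration; it is not claimed to be the one used by VOX, and it will not necessarily reproduce the keys shown on the slide.

```python
from typing import Optional, Tuple

LexKey = Tuple[str, ...]


def parse(key: str) -> LexKey:
    """'b.n.f' -> ('b', 'n', 'f'); tuples compare level by level."""
    return tuple(key.split("."))


def level_between(lo: str, hi: Optional[str]) -> str:
    """Return a string strictly between lo and hi (hi=None means unbounded).

    Assumptions: only letters a-z, lo < hi, and hi does not end with 'a'.
    Generated strings never end with 'a', so the scheme can be reapplied.
    """
    out = []
    i = 0
    while True:
        a = ord(lo[i]) - ord("a") if i < len(lo) else 0
        b = ord(hi[i]) - ord("a") if hi is not None and i < len(hi) else 26
        if b - a > 1:  # room for a letter strictly in between
            out.append(chr(ord("a") + (a + b) // 2))
            return "".join(out)
        out.append(chr(ord("a") + a))  # copy lo's letter and go one level deeper
        if b > a:  # already below hi, so hi no longer constrains
            hi = None
        i += 1


def sibling_between(parent: LexKey, lo: Optional[LexKey],
                    hi: Optional[LexKey]) -> LexKey:
    """Key for a new child of `parent` placed between siblings lo and hi."""
    lo_last = lo[-1] if lo else ""
    hi_last = hi[-1] if hi else None
    return parent + (level_between(lo_last, hi_last),)


# A new first child of book b.n, placed before title b.n.m.
new_price = sibling_between(parse("b.n"), None, parse("b.n.m"))
```

Because a new key is derived only from its neighbours, inserting or deleting a node leaves every other key unchanged.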

LexKeys - References to Source XML Nodes

The nodes of bib.xml are kept in the Storage Manager under their LexKeys, and the cells of XAT tables hold these LexKeys as references to the source XML nodes.

Operator: Navigate $b, title → $col3

Input, with XML nodes in the cells:

    $b
    <book> <price>65.95</price> <title>Advanc ..</title> </book>
    <book> <title>TCP/IP …</title> </book>
    <book> <price>39.95</price> <title>Data on ..</title> </book>

Input, with LexKeys in the cells:

    $b
    b.h
    b.n
    b.t

Output, with XML nodes in the cells:

    $b                                                              | $col3
    <book> <price>65.95</price> <title>Advanc ..</title> </book>    | <title>Advanc ..</title>
    <book> <title>TCP/IP …</title> </book>                          | <title>TCP/IP …</title>
    <book> <price>39.95</price> <title>Data on ..</title> </book>   | <title>Data on ..</title>

Output, with LexKeys in the cells:

    $b  | $col3
    b.h | b.h.r
    b.n | b.n.m
    b.t | b.t.r

LexKeys - References to Constructed Nodes

Operator: Tagger T <cheap_book>$col3</cheap_book> → $col2

Input:

    $col3
    b.n.m
    b.t.r

Output:

    $col2
    y.c
    y.b

Constructed nodes:

    LexKey | Skeleton
    y.b    | cheap_book containing b.t.r
    y.c    | cheap_book containing b.n.m

[Figure: the algebra tree of the example view next to the Storage Manager, which holds bib.xml with its LexKeys]

Order Among XAT Tuples

  • Notion: assign an order schema to XAT tables
  • Ordering by the LexKeys in the columns of the order schema yields the correct tuple order
  • Comparison operation '<' on tuples

Example: Navigate $b, title → $col3. Since b.h < b.n < b.t, comparing the LexKeys in $b gives the order of the tuples.

    $b  | $col3
    b.h | b.h.r
    b.n | b.n.m
    b.t | b.t.r
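The idea of an order schema can be sketched as follows. The sketch assumes the order schema of the example table is the single column `$b` (an assumption read off the comparison b.h < b.n < b.t), stores the tuples in an arbitrary physical order, and recovers the tuple order by comparing LexKeys. The helper names are hypothetical.

```python
def lexkey(text: str) -> tuple:
    """'b.n.m' -> ('b', 'n', 'm'), compared level by level."""
    return tuple(text.split("."))


# Physical order of the tuples is arbitrary.
table = [
    {"$b": "b.n", "$col3": "b.n.m"},
    {"$b": "b.h", "$col3": "b.h.r"},
    {"$b": "b.t", "$col3": "b.t.r"},
]

# Assumed order schema for this table: order by the LexKeys in column $b.
order_schema = ["$b"]


def tuple_order_key(row: dict, schema: list) -> tuple:
    """LexKeys of the order-schema columns, in schema order."""
    return tuple(lexkey(row[col]) for col in schema)


def tuple_less(r1: dict, r2: dict, schema: list) -> bool:
    """The comparison operation '<' on tuples."""
    return tuple_order_key(r1, schema) < tuple_order_key(r2, schema)


ordered = sorted(table, key=lambda row: tuple_order_key(row, order_schema))
```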

Order Schema Computation

Schema computation rules, calculated in a postorder traversal of the tree:

    Operator op(R)                  | Order schema OS_Q, Q = op(R)
    Tagger T_{pattern}^{$col'}(R)   | OS_R

Order Among Nodes in a Cell

  • Most collections of XML nodes are in document order: Navigate Collection, XML Union, ...
  • Combine creates a collection in which nodes may be in an order different from the one encoded in node identity

LexKey with Overriding Order

Concept of overriding order:

  • Key (LexKey): the node identity part; by default it also represents order
  • Overriding order (LexKey): optional; it represents only order and, when present, overrides the order given by the key

Notation: key [order]

Examples:

  • b.c.b [h]
  • b.c.b
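The key [order] notation can be sketched with a small data type. The class `OrderedKey` and the parser below are hypothetical helpers for illustration: a node is identified by its key, and it is ordered by the overriding order when one is present and by the key otherwise. The sample cell uses the constructed nodes y.b[b.t] and y.c[b.n] from the running example.

```python
from typing import NamedTuple, Optional, Tuple


class OrderedKey(NamedTuple):
    key: Tuple[str, ...]                      # node identity
    order: Optional[Tuple[str, ...]] = None   # overriding order, if any

    def sort_key(self) -> Tuple[str, ...]:
        """Order by the overriding order when present, else by identity."""
        return self.order if self.order is not None else self.key


def parse(text: str) -> OrderedKey:
    """Parse 'b.c.b [h]' or 'b.c.b' into an OrderedKey."""
    text = text.strip()
    if text.endswith("]"):
        key_part, order_part = text[:-1].split("[")
        return OrderedKey(tuple(key_part.strip().split(".")),
                          tuple(order_part.strip().split(".")))
    return OrderedKey(tuple(text.split(".")))


# A cell holding two constructed nodes with overriding order.
cell = [parse("y.b[b.t]"), parse("y.c[b.n]")]

by_identity = sorted(cell, key=lambda k: k.key)   # ignores overriding order
by_order = sorted(cell, key=OrderedKey.sort_key)  # honours overriding order
```

Sorting by identity alone would place y.b first, whereas the overriding order places y.c (ordered by b.n) before y.b (ordered by b.t).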

The Impact of Using LexKeys on View Maintenance

  • The XML algebra now has (non-ordered) bag semantics
  • Gained distributiveness with regard to bag union and difference
  • Compact intermediate results

Example: Select ($col5 < 60.0), with b.n.f.m inserted into $col5 of the second tuple.

Input:

    $b  | $col3 | $col5
    b.h | b.h.r | b.h.k.m
    b.n | b.n.m | (b.n.f.m inserted)
    b.t | b.t.r | b.t.k.m

Output before the update:

    $b  | $col3
    b.t | b.t.r

Output after the update (the new tuple is added to the bag; its position follows from its LexKeys):

    $b  | $col3
    b.t | b.t.r
    b.n | b.n.m
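Distributiveness with regard to bag union and difference can be checked on the example with a small sketch. It models a bag as a `collections.Counter` and, as a simplifying assumption, stores the price value directly in the third column instead of a LexKey that references it. The cell update is modelled as the deletion of the old tuple followed by the insertion of the new one.

```python
from collections import Counter


def select(table: Counter, predicate) -> Counter:
    """Bag selection: keep qualifying tuples with their multiplicities."""
    return Counter({row: n for row, n in table.items() if predicate(row)})


def cheap(row) -> bool:
    return row[2] is not None and row[2] < 60.0


# Rows are ($b, $col3, price value); the value stands in for the $col5 LexKey.
R_old = Counter({
    ("b.h", "b.h.r", 65.95): 1,
    ("b.n", "b.n.m", None): 1,
    ("b.t", "b.t.r", 39.95): 1,
})
deleted = Counter({("b.n", "b.n.m", None): 1})
inserted = Counter({("b.n", "b.n.m", 55.48): 1})

R_new = R_old - deleted + inserted

# Distributiveness: apply the operator to the deltas only, then combine.
Q_old = select(R_old, cheap)
Q_new_incremental = Q_old - select(deleted, cheap) + select(inserted, cheap)
Q_new_recomputed = select(R_new, cheap)
```

Because the result is a bag, no tuple position has to be tracked while propagating the deltas.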

Update Propagation Strategy

[Figure: within Rainbow, an update XQuery against an XML source is propagated as an update through the XAT algebra tree to the XML view; the Storage Manager holds the data]

Update Propagation Rules

  • Use distributiveness with regard to bag union
  • Reuse rules from relational view maintenance for XAT SQL operators
  • Provide rules for XAT XML operators

[Figure: at execution time an operator maps its input R to its output Q; at view maintenance time it maps an update to R into an update to Q]

Update Propagation Rules Example - Navigate Unnest on Insertion of Tuples

\[ Q_{old} = \phi^{\$col'}_{\$col,\,path}(R_{old}) \]

\[ R_{new} = R_{old} + \Delta R \]

\[ Q_{new} = \phi^{\$col'}_{\$col,\,path}(R_{old} + \Delta R)
           = \phi^{\$col'}_{\$col,\,path}(R_{old}) + \phi^{\$col'}_{\$col,\,path}(\Delta R)
           = Q_{old} + \Delta Q \]

Here \phi^{\$col'}_{\$col,\,path} denotes Navigate Unnest and + represents bag union.

At view maintenance time the operator receives the update u(R) and propagates the update u(Q).
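The insertion rule can be sketched on bags of tuples. The sketch assumes rows are tuples of LexKey strings, models the path as a lookup function, and uses a hypothetical scenario in which a whole book tuple is inserted. The function `navigate_unnest` is an illustrative stand-in, not Rainbow's operator.

```python
from collections import Counter
from typing import Callable, Iterable


def navigate_unnest(table: Counter, col: int,
                    path: Callable[[str], Iterable[str]]) -> Counter:
    """For each tuple, emit one extended tuple per node reached via `path`."""
    out = Counter()
    for row, count in table.items():
        for node in path(row[col]):
            out[row + (node,)] += count
    return out


# Hypothetical path lookup: the title children of each book, by LexKey.
TITLES = {"b.h": ["b.h.r"], "b.n": ["b.n.m"], "b.t": ["b.t.r"]}


def title_of(book: str) -> list:
    return TITLES.get(book, [])


R_old = Counter([("b.h",), ("b.t",)])
delta_R = Counter([("b.n",)])          # inserted tuples

Q_old = navigate_unnest(R_old, 0, title_of)
delta_Q = navigate_unnest(delta_R, 0, title_of)  # operator applied to the delta only

Q_new = Q_old + delta_Q                # bag union
```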

View Maintenance Example

Update: insert element <price>55.48</price> into the second book. In the Storage Manager the new price node has the LexKey b.n.f and its text 55.48 has the LexKey b.n.f.m.

Table after the Navigate operators:

    tid | $b  | $col3 | $col5
    1   | b.h | b.h.r | b.h.k.m (65.95)
    2   | b.n | b.n.m | b.n.f.m (55.48, inserted)
    3   | b.t | b.t.r | b.t.k.m (39.95)

Output of the Select ($col5 < 60.0):

    tid | $b
    3   | b.t
    2   | b.n (added by the update)

Updates propagated through the algebra tree, from the source towards the view:

    +b.n.f, book[b.n]/price[b.n.f] b
    u (+b.n.f, book[b.n]/price[b.n.f] b, $s6, 1)
    u (+b.n.f, price[b.n.f] b.n, $b, 2)
    u (c, $col5, 2) | c = b.n.f.m
    u (s) | s = (b.n, b.n.m)
    u (s) | s = (b.n, y.c)
    u (c, $col2, 1) | c = y.c[b.n]
    u (+y.c[b.n], result[1]/$col2 x, $col1, 1)

Constructed XDOMs:

    LexKey | Skeleton
    x      | result containing y.b[b.t]
    y.b    | cheap_book containing b.t.r
    y.c    | cheap_book containing b.n.m (created by the update)

View Maintenance Example (continued)

After propagation, the result node x holds y.b[b.t] and y.c[b.n]. With b.n < b.t as the overriding order, the refreshed view is:

    result [x]
      cheap_book [y.c[b.n]]
        title [b.n.m]: TCP/IP Illustrated
      cheap_book [y.b[b.t]]
        title [b.t.r]: Data on the Web

Experimental Evaluation

  • Implemented in Java on top of the Rainbow system
  • Basic performance comparison: varying size of insert

[Figure: time (ms, axis from 0 to 4500) against size of update as % of original data size (400%, 200%, 100%, 50%, 25%, 12.5%, 6.25%, 3.13%, 1.56%), for Recomputation and View Maintenance; 637 elements of interest, selectivity 50%]

Related Work

  • Relational
      • [GMS93] Survey.
      • [GL95] Algebraic approach to maintain relational views with duplicates.
      • [BLT86], [CW91], [ZGHW95], [Q96], [MK00], [PSCP02], ...
  • Object-oriented
      • [KR96] MultiView. Object algebra; exploits OO features like inheritance and path indexes.
      • [AFP02] Algebraic approach. Stores OIDs rather than actual data.
  • XML-like data models
      • [ZM98] Select-Project graph-structured views as collections of objects.
      • [AMRVW98] Semistructured data model OEM, query language LOREL. Only atomic updates. Does not handle order.
      • [QLR02] Dynamic web data. Based on XPath. Maintains path index structure.
      • [LD00] Hierarchical semistructured data. View defined with WHAX-QL. Does not handle order.
      • [EWDR02] Motivation for this work. Algebraic approach. Does not handle order. Large intermediate results.

Conclusions

Proposed order-encoding scheme that migrates XML algebra from ordered to non-ordered bag semantics

Gave first solution to order-sensitive XQuery view maintenance

Handles core of XQuery

Handles complex updates

Proved correctness of approach

Implemented the solution within Rainbow

Experimental evaluation confirms feasibility of solution

For More Information

The Rainbow project: http://davis.wpi.edu/dsrg/rainbow/

Related publications:

  • K. Dimitrova, M. El-Sayed and E. Rundensteiner. Order-sensitive View Maintenance of Materialized XQuery Views. Technical Report WPI-CS-TR-03-17, May 2003.
  • M. El-Sayed, K. Dimitrova and E. Rundensteiner. Efficiently Supporting Order in XML Query Processing. WIDM'03, New Orleans, Nov. 2003.
  • X. Zhang, K. Dimitrova, L. Wang, B. Pielech, L. Ding, B. Murphy, M. El-Sayed and E. Rundensteiner. RainbowII: Multi-XQuery Optimization Using Materialized XML Views. SIGMOD Demo, Jun. 2003.
  • M. Sayed, L. Wang, L. Ding and E. Rundensteiner. An Algebraic Approach for Incremental Maintenance of Materialized XQuery Views. In Proceedings of WIDM02, page 88, 2002.
  • X. Zhang, B. Pielech and E. Rundensteiner. Honey, I Shrunk the Xquery! - An XML Algebra Optimization Approach. In Proceedings of WIDM02, 2002.
  • X. Zhang, M. Mulchandani, S. Christ, B. Murphy and E. Rundensteiner. Rainbow: Mapping-Driven XQuery Processing System. In Proceedings of SIGMOD02, Demo Session, page 614, 2002.

Thank you!