26
11/08/2002 WIDM2002 1 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A. Rundensteiner DSRG Lab, Computer Science Dept. Worcester Polytechnic Institute

11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

Embed Size (px)

Citation preview

Page 1: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 1

An Algebraic Approach For Incremental Maintenance

of Materialized XQuery Views

Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A. Rundensteiner

DSRG Lab, Computer Science Dept.

Worcester Polytechnic Institute

Page 2: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 2

Materialized ViewsWhy Views?

Data warehousesInformation IntegrationInformation Inter-portability

Why Materialized?Speeding up data retrieval Query optimization

Focus on materialized views built using XQuery

RDB XML other Sources

View

Query

Page 3: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 3

Updating Materialized Views

Updates to sources are commonViews need to be maintained to be consistent with sources.How to update views:

Re-computation Incremental update

Page 4: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 4

Related WorkRelational Model: [GM95], [ZGMHW95], [AASY98], [ZR00], [KR02]

Data updates, schema changes, relevant updates, algebraic-based propagation, concurrent updates, extended SQL, batch updates.

XML model ARGOS [QCR01] :

XQL, local cash indexes, relevant updates

Semi-structured data [AMRVW98] :

Lorel query language

Page 5: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 5

Goal Efficient update propagation for

updates on XML viewsTo handle different updates type :

Element: insert, delete, and change

Attribute: insert, delete, and change

To handle any XML view defined by XQuery

Page 6: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 6

Our Approach: XPROPUses XAT algebra [Rainbow] Uses update primitives to update XMLAlgebraic-rules for updates:

For each algebraic update For each update primitive

Propagate only relevant updates

XML Source XML Source XML Source

XML View

Update

Update

Algebra Tree

XQuery Definition

Page 7: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 7

Background: XML AlgebraXAT Algebraic Operators: [Rainbow]

XML Operators: ex. Navigate SQL Operators: ex. Select

Special Operators: ex. Function

XAT Data Model: [Rainbow]

XAT table can hold:Atomic valuesElements/attributes Collection of elements

$s2, entry $b

S “reviews.xml” $s2

$s $b

<reviews>r</reviews>

<entry>e1</entry>

<reviews>r</reviews>

<entry>e2</entry>

<reviews>r</reviews>

<entry>e3</entry>

$s2

<reviews>r</reviews>

Page 8: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 8

Example<bib>

<book year="1992">

<title>Advanced Programming in the Unix environment</title>

<author>Darcy Gerbarg</author>

<publisher>Addison-Wesley</publisher>

</book>

<book year="2000">

<title>Data on the Web</title>

<author>Serge Abiteboul</author>

<publisher>Morgan Kaufmann Publishers</publisher>

</book>

<book year="1994">

<title>TCP/IP Illustrated</title>

<author>W. Stevens</author>

<publisher>Addison-Wesley</publisher>

</book>

</bib>

<reviews>

<entry>

<title>Data on the Web</title>

<review> A very good discussion of semi-structured database systems and XML</review>

</entry>

<entry>

<title>Advanced Programming in the Unix environment</title>

<review>A clear and detailed discussion of UNIX programming</review>

</entry>

<entry>

<title>TCP/IP Illustrated</title>

<review>One of the best books on TCP/IP</review>

</entry>

</reviews>

reviews.xml

bib.xml

Page 9: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 9

XQuery ExampleRetrieve book title for all books published by

“Morgan Kaufmann Publishers” and their respective reviews

<Result>

FOR $a IN document("bib.xml")/book,

$b IN document("reviews.xml")/entry

WHERE $a/title = $b/title and

$a/publisher = "Morgan

Kaufmann Publishers"

RETURN

<Book_Review>

$a/title, $b/review

</Book_Review>

</Result>

<Result>

<Book_Review>

<title>Data on the Web</title>

<review> A very good discussion of semi-structured database systems and XML

</review>

</Book_Review>

</Result>

View Definition Materialized View

Page 10: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 10

From XQuery to Algebra Tree

S “bib.xml” $s1 S “reviews.xml” $s2

$s1, book $a $s2, entry $b

$b, review $col12$a, title $col11

$b, title $col5

J (($col11 == $col5) AND ($col7 == "Morgan Kaufmann Publishers")) $col1, $col12

T<Book_Review>$col11$col12</Book_Review> $col13

Agg $col13

$a, publisher $col7

T<Result>$col13</ Result > $col14

Operators

S : Source

:Navigate

J : Join

T : Tagger

Agg: aggregate

Page 11: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 11

Sample XAT Execution

$s2, entry $b

S “reviews.xml” $s2

$b

<entry>e1</entry>

<entry>e2</entry>

<entry>e3</entry>

$b, review $col12

$b $col12

<entry>e1</entry>

<review>r1</review>

<entry>e2</entry>

<review>r2</review>

<entry>e3</entry>

<review>r3</review>

$ col12 $col5

<review>r1</review> <title>t1</title>

<review>r2</review> <title>t2</title>

<review>r3</review> <title>t3</title>$b, title

$col5

$s2

<reviews>r</reviews>

<reviews>

<entry>

<title>Data on the Web</title>

<review> A very good discussion of semi-structured database systems and XML</review>

</entry>

<entry>

<title>Advanced Programming in the Unix environment</title>

<review>A clear and detailed discussion of UNIX programming</review>

</entry>

<entry>

<title>TCP/IP Illustrated</title>

<review>One of the best books on TCP/IP</review>

</entry>

</reviews>

reviews.xml

Page 12: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 12

XPROP Update Primitives XML Update primitives (xmlup): applies to XML

documents AddAtt(att, valu, pos)DeleteAtt(pos)ChangeAtt(valu, pos)AddEle(el, pos)DeleteEle(pos)ChangeEle(el, pos)

XAT Update primitives (xatup): applies to XAT tablesInsertTuple (tup, ord)DeleteTuple (id)

ChangeTuple (xmlup, ucol, id)

Page 13: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 13

Update Propagation in XAT Table

XAT Update Primitives:InsertTuple (tup, ord)

DeleteTuple (id)

ChangeTuple (xmlup, ucol, id)

Example:

ChangeTuple (DeleteEle(Author[2]:book), $col1, 2)

ID P $col1

1 1 <Book>b1</Book>

2 1 <Book> <Author>.. </Author> <Author>.. </Author> </Book>

posXmlup ucol id

Page 14: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 14

XPROP Update Primitives

XML Source XML Source XML Source

XML ViewUpdate

XAT

XQuery Definition

XQuery

xatup

xmlup

Page 15: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 15

Update Propagation Example“Change publisher element of first book element to

"Morgan Kaufmann Publishers"

<Result>

<Book_Review>

<title>Advanced Programming in the Unix environment</title>

<review>A clear and detailed discussion of UNIX programming</review>

</Book_Review>

<Book_Review>

<title>Data on the Web</title>

<review> A very good discussion of semi-structured database systems and XML</review>

</Book_Review>

</Result>

ChangeEle(<publisher>Morgan Kaufmann Publishers</publisher>, book[1].publisher[1] : bib)

Update to the source

Effect on View

bib.xml reviews.xml

XAT

Page 16: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 16

$s1, book $a $s2, entry $b

S “bib.xml” $s1 S “reviews.xml” $s2

ID P $b

1 1 <entry>e1</entry>

2 1 <entry>e2</entry>

3 1 <entry>e3</entry>

ID P $a

1 1 <book>b1</book>

2 1 <book>b2</book>

3 1 <book>b2</book>

$b, review $col12$a, title $col11

ID P $a $col11

1 1 <book>b1</book> <title>t1</title>

2 2 <book>b2</book> <title>t2</title>

3 3 <book>b3 </book> <title>t3</title>

ID P $b $col12

1 1 <entry>e1</entry>

<review>r1</review>

2 2 <entry>e2</entry>

<review>r2</review>

3 3 <entry>e3</entry>

<review>r3</review>

$b, title $col5$a, publisher $col7

ID $s2

1 <reviews>r</reviews>

ID $s1

1 <bib>b</bib>

bib.xml reviews.xmlChangeEle (<publisher> Morgan Kaufmann Publishers</publisher>, book[1].publisher[1]:bib)

1

ChangeTuple(ChangeEle (<publisher> Morgan Kaufmann Publishers</publisher> , book[1].publisher[1]:bib), $s1, 1)

2

ChangeTuple(ChangeEle (<publisher> Morgan Kaufmann Publishers</publisher> ,publisher[1]:book), $a, 1)

3

ChangeTuple(ChangeEle (<publisher> Morgan Kaufmann Publishers</publisher>, publisher[1]:book), $a, 1)

4

Update Propagation in XAT (I)

Page 17: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 17

ID P $ col11 $col7

1 1 <title>t1</title>

<publisher>p1</publisher>

2 2 <title>t2</title>

<publisher>p2</publisher>

3 3 <title>t3</title>

<publisher>p3</publisher>

J (($col11 == $col5) AND $col7 == ("Morgan Kaufmann Publishers")) $col1, $col12

ID P $ col12 $col5

1 1 <review>r1</review> <title>t1</title>

2 2 <review>r2</review> <title>t2</title>

3 3 <review>r3</review> <title>t3</title>

ID P $ col11 $ col12

2 {1 2}

<title>t1</title>

<review>r2</ review >

1 {2 1}

<title>t2</title>

<review>r1</ review >

T<Book_Review>$col11$col12</Book_Review> $col13

ID P $col13

2 2 <Book_Review> <title>t1</title> <review>r2</review><Book_Review>

1 1 <Book_Review> <title>t2</title> <review>r1</review><Book_Review>

ChangeTuple( ChangeEle (<publisher> Morgan Kaufmann Publishers </publisher> ,publisher), $col7, 1)

InsertTuple({2 ,{1, 2}, <title>t1</title>, <review>t2</ review >}, 1)

InsertTuple({2 ,2, <Book_Review> <title>t1</title> <review>r2</review> </Book_Review>}, 1)

5

6

7

$b, title $col5$a, publisher $col7

Agg $col13

Update Propagation in XAT (II)

Page 18: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 18

Agg $col13

ID P $col13

1 {2, 1}

<Book_Review> <title>t2</title> <review>r1</review><Book_Review>

T<Result>$col13</ Result > $col14

ID P $col14

1 1 <Result>

<Book_Review> <title>t2 </title> <review>r1</review> <Book_Review></Result>

ChangeTuple(AddtEle( <Book_Review> <title>t1</title><review>r2</review> </Book_Review>, 1), $col13, 1)

<Book_Review> <title>t1</title> <review>r2</review><Book_Review>

<Book_Review> <title>t1</title> <review>r2</review> <Book_Review>

ChangeTuple(AddtEle (<Result> <Book_Review>.. </Book_Review> <Result>, 1: Result), $col14, 1)

8

9

Update Propagation in XAT (III)XML ViewAddtEle (<Result><Book_Review>.. </Book_Review> <Result>, 1: Result)

10

Page 19: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 19

System Architecture

XML Source XML Source XML Source

XML View

XML Query Manager

Propagation Rules

Update Handle

Update

Maintainer .

IntermediateMaterialization

XAT

XPROP*

RAINBOW

System

Module

Storage

Data

Legend

*Our system is built as an extension to the Rainbow system developed at WPI

Page 20: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 20

Experimental Result

0

4000

8000

12000

16000

20000

10 20 30 40 50 60 70 80 90 100

XML Documnt size (Elements/file)

Time (ms)

re-compute XPROP

Page 21: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 21

Summery One of the first algebraic incremental view maintenance approaches for XMLProposed a set of update primitive for updating XML documentsHandle general updates on XML Proposed propagation rules to handle updates incrementally

Page 22: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 22

Future WorkUse references in XAT tablesBatch updates XAT tree rewriteExtensive Experiments

Page 23: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 23

Rainbow Project:

http://davis.wpi.edu/dsrg/rainbow/index.html

XPROP Extension: http://davis.wpi.edu/dsrg/xprop/index.html

Page 24: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 24

Extension to XAT Table We need extra information:

Tuple ID (ID)

Tuple parent ID (P)

ID P $col1 $col2

1 1 <Book>b1</Book> <Author>a1</ Author >

2 1 <Book>b1</Book> <Author>a2</ Author >

3 2 <Book>b2</Book> <Author>a3</ Author >

$cols, Author $col2

ID P $col1

1 1 <Book>b1</Book>

2 1 <Book>b1</Book>

$s1, book $col1

S “bib.xml” $s1

ID $s1

1 <bib>b</bib>

bib.xml

Page 25: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 25

Update PositionUpdate Position (pos): [Niagara]

Entry point part (EP)

Relative forward part (RF)Example: book[2].author[1] : bib

EP can be: element name : ex. book[2].author[1] : bib

XML fragment order: ex. author[2] : 1 Null: ex. Null : Null

RF can be: set of elements/attributes with order: ex. book[2].author[1] : bib

Null : ex. Null : Book

Page 26: 11/08/2002WIDM20021 An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A

11/08/2002 WIDM2002 26

Update Position and XAT

$col1

<Book> <Author> <Last> … <Last> <Author><Book>

$col1

{<Author> <Last> … <Last> <Author> <Author> <Last> … <Last> <Author>}

$col1

55

Element/attribute

Collection

Atomic Value Null : Null

Author[1].last[1] : Book

last[1] : 2

Data type in XAT Example Update position