September 2000 XML Workshop, IIT Bombay Indexing of XML Data Raghuraman Rangarajan KReSIT, IIT Bombay.



Plan of Talk

• Why is indexing needed?
• Queries and Indexes in Traditional DBMS
• Querying in XML
• Indexes: Path, Value
• Conclusion


Why is Indexing Needed?

• An index allows fast access to data by replicating portions of the data in special-purpose structures.
• Despite their additional cost (storage, maintenance and complexity), indexes have been shown to be useful in evaluating queries.


Queries and Indexes in Traditional DBMS

Databases     Query               Example
Relational    Associative         SELECT name FROM account WHERE acctNo = 14
OO            Path Expressions    SELECT X.name FROM dept.empl X


An XML Fragment (with leaf values omitted)

[Figure: tree of part, subpart, supplier, name and address elements]


Queries in XML

1. SELECT X
   FROM part._*.supplier.name X

2. SELECT X
   FROM part._*.supplier: {name X, address: “Mumbai”}
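To make the two queries concrete, here is a minimal sketch of evaluating them without any index, by walking the whole tree. It assumes that `_*` matches any sequence of zero or more labels, and that query 2 returns the names of suppliers whose address is “Mumbai”. The `Node` class, the helper names and the leaf values are hypothetical, since the slide omits leaf values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    value: Optional[str] = None  # set only on leaf nodes


def matches(steps, labels):
    """True if the label sequence `labels` matches the path pattern `steps`."""
    if not steps:
        return not labels
    head, rest = steps[0], steps[1:]
    if head == "_*":
        # Assumed meaning: skip zero or more labels, then match the rest.
        return any(matches(rest, labels[i:]) for i in range(len(labels) + 1))
    return bool(labels) and labels[0] == head and matches(rest, labels[1:])


def evaluate(root, expr):
    """Return, in document order, the nodes whose root path matches `expr`."""
    steps = expr.split(".")
    results = []

    def walk(node, path):
        path = path + [node.label]
        if matches(steps, path):
            results.append(node)
        for child in node.children:
            walk(child, path)

    walk(root, [])
    return results


def names_of_suppliers_in(root, city):
    """Query 2: names of suppliers that have an address equal to `city`."""
    names = []
    for supplier in evaluate(root, "part._*.supplier"):
        if any(c.label == "address" and c.value == city for c in supplier.children):
            names += [c.value for c in supplier.children if c.label == "name"]
    return names


# Hypothetical data: the leaf values are invented for illustration.
doc = Node("part", [
    Node("name", value="engine"),
    Node("supplier", [Node("name", value="ABC"), Node("address", value="Pune")]),
    Node("subpart", [
        Node("name", value="piston"),
        Node("supplier", [Node("name", value="XYZ"), Node("address", value="Mumbai")]),
    ]),
])

query1 = [n.value for n in evaluate(doc, "part._*.supplier.name")]
query2 = names_of_suppliers_in(doc, "Mumbai")
```

Every query here visits every node of the document, which is the cost that the indexes discussed next try to avoid.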


Indexes for XML

• Path indexes: regular path expressions
• Value indexes: locating atomic objects


Building A Path Index

[Figure: the XML fragment alongside the path index built from it; the index has nodes part, subpart, supplier, name and address, with labels h1 to h7]


Path Index

• Index summarises path information
• Each entry: list of pointers to data nodes

[Figure: the path index over part, subpart, supplier, name and address, with labels h1 to h7]
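The two points above can be sketched as follows, assuming tree-shaped data: each distinct root-to-node label path becomes one index entry, and the entry holds the list of data nodes reached by that path. This is an illustrative sketch only, not the Lore or XSet implementation; the `Node` class, `build_path_index` and the leaf values are hypothetical.

```python
from collections import defaultdict


class Node:
    def __init__(self, label, *children, value=None):
        self.label = label
        self.children = list(children)
        self.value = value  # set only on leaf nodes


def build_path_index(root):
    """Map each distinct label path (a tuple) to the data nodes it reaches."""
    index = defaultdict(list)
    stack = [(root, (root.label,))]
    while stack:
        node, path = stack.pop()
        index[path].append(node)  # a "pointer" to the data node
        # Push children in reverse so nodes are visited in document order.
        for child in reversed(node.children):
            stack.append((child, path + (child.label,)))
    return dict(index)


# Hypothetical data: the leaf values are invented for illustration.
doc = Node(
    "part",
    Node("name", value="engine"),
    Node("supplier", Node("name", value="ABC"), Node("address", value="Pune")),
    Node("supplier", Node("name", value="DEF"), Node("address", value="Delhi")),
    Node(
        "subpart",
        Node("name", value="piston"),
        Node("supplier", Node("name", value="XYZ"), Node("address", value="Mumbai")),
    ),
)

path_index = build_path_index(doc)
supplier_names = [n.value for n in path_index[("part", "supplier", "name")]]
```

The sample document has 13 nodes but only 10 distinct label paths, so the index is smaller than the data whenever sibling structure repeats, which is the sense in which it summarises path information.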


Using Path Index for Regular Path Expressions

(R1) part.name
(R2) part.supplier.name
(R3) _*.supplier.name
(R4) part._*.subpart.name

[Figure: the path index over part, subpart, supplier, name and address, with labels h1 to h7]
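One way to answer R1 to R4 from a path index is to match the expression against the index paths only and then collect the pointer lists of the matching entries. The sketch below assumes that `_*` stands for any sequence of zero or more labels and that labels contain no dots; it translates each expression into a Python regular expression. The index contents are hypothetical, and leaf values stand in for node pointers.

```python
import re


def compile_path(expr):
    """Translate a dotted path expression into a regex over 'a.b.c.' strings."""
    parts = []
    for step in expr.split("."):
        if step == "_*":
            parts.append(r"(?:[^.]+\.)*")  # assumed: zero or more arbitrary labels
        else:
            parts.append(re.escape(step) + r"\.")
    return re.compile("".join(parts))


def lookup(index, expr):
    """Union the pointer lists of every index path that matches `expr`."""
    pattern = compile_path(expr)
    hits = []
    for path, pointers in index.items():
        if pattern.fullmatch(path + "."):
            hits.extend(pointers)
    return hits


# Hypothetical path index (only the entries ending in `name` are shown).
path_index = {
    "part.name": ["engine"],
    "part.supplier.name": ["ABC", "DEF"],
    "part.subpart.name": ["piston"],
    "part.subpart.supplier.name": ["XYZ"],
}

r1 = lookup(path_index, "part.name")
r2 = lookup(path_index, "part.supplier.name")
r3 = lookup(path_index, "_*.supplier.name")
r4 = lookup(path_index, "part._*.subpart.name")
```

R1 and R2 name a single index path, so a direct dictionary lookup would do; R3 and R4 require scanning the index paths, which is still cheaper than scanning the data as long as the index is much smaller than the document.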


Path Indexes

• XSet project (Berkeley)
• DataGuides (Lore, Stanford)


Value Index

• Useful for comparisons (=, <, etc.)
• Example: find the supplier whose name is “XYZ”.

[Figure: VIndex(name) pointing to the name nodes with values “XYZ” and “ABC” in a tree of part, subpart, supplier, name and address elements]
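A value index such as VIndex(name) can be sketched as a sorted list of (value, node) pairs searched with binary search, which supports both equality and range comparisons. This is one possible layout, not necessarily the one used by Lore; the `ValueIndex` class, the node identifiers and the parent map are hypothetical.

```python
import bisect


class ValueIndex:
    """Sorted (value, node id) pairs for all nodes carrying one label."""

    def __init__(self, pairs):
        self._entries = sorted(pairs)
        self._keys = [value for value, _ in self._entries]

    def equal(self, value):
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return [node for _, node in self._entries[lo:hi]]

    def less_than(self, value):
        hi = bisect.bisect_left(self._keys, value)
        return [node for _, node in self._entries[:hi]]


# Hypothetical node ids: n3 and n7 are name nodes, s1 and s2 their suppliers.
vindex_name = ValueIndex([("XYZ", "n7"), ("ABC", "n3")])
parent = {"n3": "s1", "n7": "s2"}

# "Find the supplier whose name is XYZ": look up the name node, then step
# up to its parent instead of scanning the document.
suppliers_named_xyz = [parent[n] for n in vindex_name.equal("XYZ")]
names_before_b = vindex_name.less_than("B")
```

A hash table would serve equality lookups equally well; the sorted layout is chosen here because the slide also lists `<` among the supported comparisons.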


Other Indexes

• Text indexes: information-retrieval-style keyword search. Example: find the suppliers in Mumbai (“address”).
• Text indexes also support search features such as AND, OR and NEAR.
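As a rough illustration of such a text index, the sketch below builds an inverted index from words to the nodes and word positions where they occur, and implements AND, OR and a simple NEAR on top of it. The function names, the node identifiers and the address texts are hypothetical, and NEAR is assumed to mean "within a given number of words".

```python
from collections import defaultdict


def build_text_index(texts):
    """texts: node id -> text. Returns word -> {node id: [word positions]}."""
    index = defaultdict(dict)
    for node_id, text in texts.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(node_id, []).append(pos)
    return index


def search_and(index, a, b):
    return set(index.get(a, {})) & set(index.get(b, {}))


def search_or(index, a, b):
    return set(index.get(a, {})) | set(index.get(b, {}))


def search_near(index, a, b, distance):
    """Nodes where `a` and `b` occur at most `distance` words apart."""
    hits = set()
    for node_id in search_and(index, a, b):
        positions_a, positions_b = index[a][node_id], index[b][node_id]
        if any(abs(i - j) <= distance for i in positions_a for j in positions_b):
            hits.add(node_id)
    return hits


# Hypothetical address nodes and their text content.
addresses = {
    "a1": "12 MG Road Pune",
    "a2": "IIT Bombay Powai Mumbai",
    "a3": "Andheri East Mumbai",
}
text_index = build_text_index(addresses)

in_mumbai = set(text_index.get("mumbai", {}))
powai_and_mumbai = search_and(text_index, "powai", "mumbai")
pune_or_mumbai = search_or(text_index, "pune", "mumbai")
andheri_near_mumbai = search_near(text_index, "andheri", "mumbai", 2)
```

Finding the suppliers in Mumbai then amounts to looking up the keyword and moving from each matching address node to its supplier.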


Conclusion

• Performance improves significantly when indexing is used for query processing (Lore).
• Performance of the path indexes depends on the type of queries.


References

• The Lore Project (www-db.stanford.edu/lore)
• Work done by Dan Suciu (www.research.att.com/~suciu/)
• Serge Abiteboul et al., Data on the Web
