September 2000 XML Workshop, IIT Bombay
Indexing of XML Data
Raghuraman Rangarajan, KReSIT, IIT Bombay.
Plan of Talk
• Why is indexing needed?
• Queries and indexes in traditional DBMS
• Querying in XML
• Indexes: path, value
• Conclusion
Why is Indexing Needed?
• An index allows fast access to data by replicating portions of the data in special-purpose structures.
• Despite the additional cost (storage, maintenance and complexity), indexes have been shown to be useful in evaluating queries.
Queries and Indexes in Traditional DBMS
Database     Query              Example
Relational   Associative        SELECT name FROM account WHERE acctNo = 14
OO           Path expressions   SELECT X.name FROM dept.empl X
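The associative query above can be illustrated with a minimal sketch, assuming a small in-memory table. The rows and the helper names (`scan_names`, `build_index`, `indexed_names`) are hypothetical; the slide only names the table and its columns. The index replicates the acctNo column in a lookup structure, so the query no longer has to test every row.

```python
# Hypothetical rows; the slide only gives the table and column names.
ACCOUNTS = [
    {"acctNo": 12, "name": "Asha"},
    {"acctNo": 14, "name": "Ravi"},
    {"acctNo": 17, "name": "Meera"},
]

def scan_names(rows, acct_no):
    """Answer the query without an index: test every row."""
    return [row["name"] for row in rows if row["acctNo"] == acct_no]

def build_index(rows, column):
    """Replicate one column into a lookup structure: value -> matching rows."""
    index = {}
    for row in rows:
        index.setdefault(row[column], []).append(row)
    return index

def indexed_names(index, acct_no):
    """Answer the same query with a single dictionary lookup."""
    return [row["name"] for row in index.get(acct_no, [])]

ACCT_INDEX = build_index(ACCOUNTS, "acctNo")
```

Both `scan_names(ACCOUNTS, 14)` and `indexed_names(ACCT_INDEX, 14)` return the same names; the index trades extra storage and maintenance for the faster lookup.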
An XML Fragment (with leaf values omitted)
[Figure: tree of part, subpart, supplier, name and address elements]
Queries in XML
1. SELECT X FROM part._*.supplier.name X
2. SELECT X FROM part._*.supplier: {name X, address: “Mumbai”}
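As a rough illustration of what the two queries ask for, here is a minimal sketch that evaluates them without any index by walking the whole tree. The tree, its leaf values and the function names are made up for this example (the slides omit leaf values); `_*` is read as "any sequence of zero or more labels".

```python
# Hypothetical data: each node is (label, text, children).
TREE = ("part", None, [
    ("name", "engine", []),
    ("supplier", None, [("name", "ABC", []), ("address", "Delhi", [])]),
    ("subpart", None, [
        ("name", "piston", []),
        ("supplier", None, [("name", "XYZ", []), ("address", "Mumbai", [])]),
    ]),
])

def descendants(node):
    """Yield the node itself and every node below it (what `_*` ranges over)."""
    yield node
    for child in node[2]:
        yield from descendants(child)

def children(node, label):
    """Return the direct children of node that carry the given label."""
    return [c for c in node[2] if c[0] == label]

def suppliers(root):
    """All nodes reached by part._*.supplier."""
    if root[0] != "part":
        return []
    return [s for d in descendants(root) for s in children(d, "supplier")]

def query1(root):
    """SELECT X FROM part._*.supplier.name X"""
    return [n[1] for s in suppliers(root) for n in children(s, "name")]

def query2(root, city):
    """SELECT X FROM part._*.supplier: {name X, address: city}"""
    return [n[1] for s in suppliers(root)
            if any(a[1] == city for a in children(s, "address"))
            for n in children(s, "name")]
```

Every call visits every node of the tree, which is the cost that the path and value indexes are meant to avoid.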
Indexes for XML
• Path indexes: regular path expressions
• Value indexes: locating atomic objects
Building a Path Index
[Figure: the XML fragment tree and the path index built from it, with index entries h1 to h7]
Path Index
• Index summarises path information
• Each entry: list of pointers to data nodes
[Figure: path index with entries h1 to h7]
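One simple way to realise these two points is sketched below, assuming the index is a map from each distinct label path to the list of data nodes reached by that path. This is an illustration in the spirit of a DataGuide, not necessarily the exact structure drawn on the slide; the tree, its values and `build_path_index` are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: each node is (label, text, children).
TREE = ("part", None, [
    ("name", "engine", []),
    ("supplier", None, [("name", "ABC", []), ("address", "Delhi", [])]),
    ("subpart", None, [
        ("name", "piston", []),
        ("supplier", None, [("name", "XYZ", []), ("address", "Mumbai", [])]),
    ]),
    ("subpart", None, [
        ("name", "valve", []),
        ("supplier", None, [("name", "ABC", []), ("address", "Delhi", [])]),
    ]),
])

def build_path_index(root):
    """Map every distinct root-to-node label path to the nodes it reaches."""
    index = defaultdict(list)

    def visit(node, prefix):
        path = prefix + (node[0],)
        index[path].append(node)      # the entry holds pointers to data nodes
        for child in node[2]:
            visit(child, path)

    visit(root, ())
    return dict(index)

PATH_INDEX = build_path_index(TREE)
```

The index has one entry per distinct path, however many data nodes share that path, so it is usually much smaller than the data it summarises.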
Using Path Index for Regular Path Expressions
(R1) part.name
(R2) part.supplier.name
(R3) _*.supplier.name
(R4) part._*.subpart.name
[Figure: path index with entries h1 to h7]
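A minimal sketch of how expressions such as (R1) to (R4) could be answered from a path index follows. It assumes the index maps label paths to node identifiers and that `_*` stands for zero or more labels; the index contents, the identifiers n1 to n8 and the helpers `to_regex` and `lookup` are hypothetical. Each expression is translated into a regular expression and matched against the index entries only, never against the data itself.

```python
import re

# Hypothetical path index: label path -> identifiers of the data nodes it reaches.
PATH_INDEX = {
    ("part",): ["n1"],
    ("part", "name"): ["n2"],
    ("part", "supplier"): ["n3"],
    ("part", "supplier", "name"): ["n4"],
    ("part", "subpart"): ["n5"],
    ("part", "subpart", "name"): ["n6"],
    ("part", "subpart", "supplier"): ["n7"],
    ("part", "subpart", "supplier", "name"): ["n8"],
}

def to_regex(expr):
    """Translate a dotted path expression into a regex over 'label.' sequences."""
    parts = []
    for step in expr.split("."):
        if step == "_*":
            parts.append(r"(?:[^.]+\.)*")        # zero or more arbitrary labels
        else:
            parts.append(re.escape(step) + r"\.")
    return re.compile("".join(parts))

def lookup(index, expr):
    """Collect the nodes of every index entry whose path matches expr."""
    pattern = to_regex(expr)
    hits = []
    for path, nodes in index.items():
        encoded = "".join(label + "." for label in path)
        if pattern.fullmatch(encoded):
            hits.extend(nodes)
    return hits
```

For example, `lookup(PATH_INDEX, "_*.supplier.name")` inspects eight index entries and returns the nodes of the two that match.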
Path Indexes
• XSet project (Berkeley)
• DataGuides (Lore, Stanford)
Value Index
• Useful for comparisons (=, <, etc.)
• Example: find the supplier whose name is “XYZ”
[Figure: VIndex(name) with entries “XYZ” and “ABC” over the part, subpart and supplier tree]
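A value index such as VIndex(name) can be sketched as a sorted list of (value, node) pairs searched by bisection, which supports both equality and range comparisons. This is only an illustration under that assumption; the class `ValueIndex`, its methods and the node identifiers are hypothetical.

```python
import bisect

class ValueIndex:
    """Sorted (value, node) pairs for one label, e.g. all name elements."""

    def __init__(self, pairs):
        self._entries = sorted(pairs)
        self._keys = [value for value, _ in self._entries]

    def equal(self, value):
        """Nodes whose value equals the argument."""
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return [node for _, node in self._entries[lo:hi]]

    def less_than(self, value):
        """Nodes whose value is strictly smaller than the argument."""
        hi = bisect.bisect_left(self._keys, value)
        return [node for _, node in self._entries[:hi]]

# Hypothetical entries: the value of each name element and the node holding it.
VINDEX_NAME = ValueIndex([("XYZ", "supplier2"), ("ABC", "supplier1")])
```

The example query then becomes `VINDEX_NAME.equal("XYZ")`, which locates the matching node without scanning the data.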
Other Indexes
• Text indexes: information-retrieval-style keyword search
• Example: find the suppliers in Mumbai (“address”)
• Text indexes also support search features like AND, OR, NEAR, etc.
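Keyword search of this kind is commonly built on an inverted index, sketched below under the assumption that each address element is a short text string. The addresses, node identifiers and function names are made up for illustration, and NEAR is left out because it would also need word positions.

```python
from collections import defaultdict

# Hypothetical text of the address element of each supplier node.
ADDRESSES = {
    "s1": "12 MG Road, Mumbai",
    "s2": "Link Road, Delhi",
    "s3": "Hill Road, Mumbai",
}

def build_text_index(texts):
    """Inverted index: lower-cased keyword -> set of nodes containing it."""
    index = defaultdict(set)
    for node_id, text in texts.items():
        for word in text.lower().split():
            index[word.strip(".,")].add(node_id)
    return index

def search_and(index, *words):
    """Nodes containing every keyword."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()

def search_or(index, *words):
    """Nodes containing at least one keyword."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.union(*sets) if sets else set()

TEXT_INDEX = build_text_index(ADDRESSES)
```

The example "find the suppliers in Mumbai" becomes `search_and(TEXT_INDEX, "Mumbai")`, and AND/OR map directly to set intersection and union.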
Conclusion
• Performance improves significantly when indexing is used for query processing (Lore).
• Performance of the path indexes depends on the type of queries.
References
• The Lore Project (www-db.stanford.edu/lore)
• Work done by Dan Suciu (www.research.att.com/~suciu/)
• Data on the Web: Serge Abiteboul, et al.