On Views and XML Serge Abiteboul INRIA PODS 1999


On Views and XML

Serge Abiteboul, INRIA, PODS 1999


05/99 Views and XML - Serge Abiteboul 2

Organization

Introduction
XML
View := Query
  +:= Change Control
  +:= Objects
  +:= Structured & Semistructured Data
  +:= Active Features
  +:= Incomplete Information
  +:= more...

Warning

This is not a survey on database views
This is not a tutorial on XML
This is about the use of XML & e-commerce as excuses to survey some works on views cast in a fashionable context: O2Views, views of OEM, ActiveViews, Lorel/Ozone... (and also to motivate future works)


Executive Summary: Database folks should be interested in XML Views and more and more are

Footnote: this is a great way to recycle your old results on views, incomplete information, deductive databases, universal instance assumption, dependency theory, etc.


Introduction: XML in short

Document mark-up language; descendant of SGML

Standard for data exchange on the Web

We are interested here in data exchange and not in document editing and retrieval

EXAMPLE: EDI (Electronic Data Interchange)

Standard for business data exchange
2 standards:
  ANSI X12 in US -- all B2G by end 1999
  EDIFACT in world -- UN committee

translate EDI transmit

<!DOCTYPE Book-Order PUBLIC "-//Editor//DTD Book Order Message//EN">
<Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in">
  <title>Editor Lite-EDI Book Ordering</title>
  <Order-No>967634</Order-No>
  <Message-Date>19961002</Message-Date>
  <Buyer-EAN>5412345000176</Buyer-EAN>
  <Order-Line Reference-No="0528837">
    <ISBN>0316907235</ISBN>
    <Author-Title>Labaln, Brian/Chrome</Author-Title>
    <Quantity>2</Quantity>
  </Order-Line>
  <Order-Line Reference-No="0528838">
    <ISBN>0856674427</ISBN>
    <Author-Title>Parry, Linda (ed)/William Morris</Author-Title>
    <Quantity>1</Quantity>
  </Order-Line>
  <input type="checkbox" name="partial" value="allowed"/>
  <text>Tick here if a delayed/partial supply of order is acceptable</text>
  <input type="checkbox" name="confirmation" value="requested"/>
  <text>Tick here if Confirmation of Acceptance of Order is to be returned by e-mail</text>
  <input type="checkbox" name="DeliveryNote" value="required"/>
  <text>Tick here if e-mail Delivery Note is required to confirm details of delivery</text>
  <E-Address>E-mail address: <input name="e-address" size="25"></input></E-Address>
  <Language>Please respond in:
    <select name="response-language">
      <option value="EN" selected="selected">English</option>
      <option value="FR">Français</option>
      <option value="DE">Deutsch</option>
      <option value="ES">Espagnol</option>
      <option value="IT">Italian</option>
    </select>
  </Language>
  <input type="submit" value="Press here to send completed form to supplier"/>
</Book-Order>

data in XML/EDI


I personally prefer:

XML

Some noise and confusion
Is the syntax important? No
What is XML?
  the means to exchange tree/graph data on the Web
  an object-oriented API for it
  more

A (simplified) model for XML

XML-tree :- list(node)
node :- string | element | ref node
element :- label list(att : string) list(node)
label :- string
att :- string
an attribute occurs at most once
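The grammar above can be transcribed almost directly into data types. The following is a minimal sketch in Python, not part of the original slides; the class names `Element` and `Ref` and the helper `labels` are illustrative assumptions.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Ref:
    """The 'ref node' case: a pointer to another node."""
    target: "Node"


@dataclass
class Element:
    """element :- label list(att : string) list(node)"""
    label: str
    # A dict keyed by attribute name enforces "an attribute occurs at most once".
    attributes: dict[str, str] = field(default_factory=dict)
    children: list["Node"] = field(default_factory=list)


# node :- string | element | ref node
Node = Union[str, Element, Ref]


def labels(node: Node) -> list[str]:
    """Collect element labels in document order, without following refs."""
    if isinstance(node, Element):
        found = [node.label]
        for child in node.children:
            found.extend(labels(child))
        return found
    return []


person = Element("person", {}, [
    Element("name", {}, ["Serge Abiteboul"]),
    "PODS invited speaker",
    Element("address", {}, [
        Element("city", {}, ["Le Chesnay"]),
        Element("zip", {}, ["92310"]),
    ]),
])
```

An XML-tree is then simply a Python list of such nodes; references make the structure a graph rather than a tree, which is why `labels` deliberately does not follow them.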

XML in short

<person>
  <name>Serge Abiteboul</name>
  PODS invited speaker
  <a xml:link="simple" href="gif/serge.gif">old picture</a>
  <address> <city>Le Chesnay</city> <zip>92310</zip> </address>
  <a xml:link="simple" href="www-rocq.inria.fr/~abitebou">Web</a>
</person>

DTD: grammar
DCD: some typing
DOM: object API
RDF: meta data
XPOINTER/XLINK ...

XML Views

[Figure: architecture diagram. Labels: Web browsers (three); View server; Query; Publish & subscribe; Crawler & filter engine; Security manager; Request broker; Business intelligence; Output/report/delivery; Information repository; Data warehouse; OLAP; Image/video; Reports.]


What databases can bring to XML is query optimization and query rewriting

View := Query

View = Query

as for the relational model
  use of query optimization techniques
  use of query rewriting techniques
  processing queries using views
main issue: virtual vs. materialized
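To make the virtual vs. materialized distinction concrete, here is a small Python sketch. It is an illustration under assumptions, not the design of any system mentioned in the talk: the classes `VirtualView` and `MaterializedView`, the sample catalog and the product names `Gismo7` and `Gismo9` are all made up.

```python
class VirtualView:
    """Re-evaluates the query on every read: always fresh, nothing stored."""

    def __init__(self, base, query):
        self.base, self.query = base, query

    def read(self):
        return self.query(self.base)


class MaterializedView:
    """Stores the query result: cheap reads, but stale until refreshed."""

    def __init__(self, base, query):
        self.base, self.query = base, query
        self.rows = query(base)

    def read(self):
        return self.rows

    def refresh(self):
        self.rows = self.query(self.base)


def cheap(rows):
    """The view definition: products under 500."""
    return [r for r in rows if r["price"] < 500]


catalog = [{"name": "Gismo223", "price": 1200}, {"name": "Gismo7", "price": 300}]
virtual = VirtualView(catalog, cheap)
materialized = MaterializedView(catalog, cheap)

# The base data changes after both views were defined.
catalog.append({"name": "Gismo9", "price": 100})
```

After the append, `virtual.read()` sees the new product while `materialized.read()` does not until `refresh()` is called; recomputing from scratch in `refresh()` is the naive strategy that incremental maintenance tries to avoid.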

B2C: Comparative Shopping

http://www.addall.com
24 bookstores searched in about 10 seconds
prices between $42 and $78: that's why people will use them!


What DB can bring to XML is the control of changes

View +:= Change Control

Some of the most studied problems for relational views

update propagation: incremental updates
view update problem

D2V: Incremental Updates

a customer has loaded portions of the catalog
some prices change
no need to reload the entire catalog
many such examples on the Web

[Figure label: updates]
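The catalog scenario above can be sketched as a delta applied to a local copy. This is a simplified illustration, assuming a view that is just a dictionary from product code to price; the function `apply_delta` and the codes `A1` and `Z9` are hypothetical.

```python
def apply_delta(local_copy, delta):
    """Apply price changes only to the products the customer has loaded.

    Returns the number of local entries that were updated.
    """
    updated = 0
    for code, new_price in delta.items():
        if code in local_copy:      # changes to unloaded products are ignored
            local_copy[code] = new_price
            updated += 1
    return updated


# The customer loaded two products; the server ships only the changed prices.
local_copy = {"F2GHYYRF": 1200.0, "A1": 50.0}
delta = {"F2GHYYRF": 1100.0, "Z9": 10.0}
updated = apply_delta(local_copy, delta)
```

The cost is proportional to the size of the delta rather than to the size of the catalog, which is the point of incremental propagation.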

V2D: View Update

Sometimes considered less of an issue: the Web is read only!
Many Web applications involve updates
We may be able to annotate the products of the catalog
  some of the data is in read mode
  some data is not visible (this is only a view!)
  some data may be updated

Example: Change Detection

A customer (self) is in a department (self.department) and may want to see only the current promotions of products in this department (MyPromotions)

let MyPromotions be
select I.*
from I in Catalog.promotions.item
where I.department = self.department
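Read as a function of the customer, the MyPromotions view is a selection over the promotion items. The Python sketch below mirrors the query; the `Item` and `Customer` classes and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    department: str
    price: float


@dataclass
class Customer:
    department: str


def my_promotions(promotion_items, me):
    """select I.* from I in Catalog.promotions.item
    where I.department = self.department"""
    return [i for i in promotion_items if i.department == me.department]


promotion_items = [
    Item("Gismos78", "electronic", 234.0),
    Item("Novel", "book", 12.0),          # made-up item
]
me = Customer(department="electronic")
view = my_promotions(promotion_items, me)
```

Because the view depends on `self`, two customers in different departments see different contents for the same view definition.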

Query Subscription: Changes [from Chawathe's thesis]

Changes in labeled graphs: as in DOEM

[Figure: labeled graph. Vertex and edge labels: Catalog, promotion, item, name, department, price, description, self. Values: Gismos78, electronic, £234, £278, "super sale". Dates: 99/02/01, 01/05/03.]

Query Subscription: Changes

Change of the value of an atomic vertex
Creation of a new vertex
Addition/removal of an edge
Change of the label on an edge: add/remove
Move a vertex: add/remove
annotations on edges and vertexes

Query Subscription: Queries

select P.code, P.description
from P in Catalog.product
where P.price <changed>Q                      (vertex annotation)
where P.<added>description                    (edge annotation)
where P.price <changed <old=Q', date T>>Q     (data in annotation)
      and Q - Q' > 100 and T > "99/04/03"
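The last form of the query, which reads data stored in the annotation, can be imitated in Python by keeping a list of change records next to each price. This is only a sketch of the idea: it is not the DOEM data model, and the names `Changed`, `set_price` and `big_recent_increases` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Changed:
    """A <changed> annotation: the old value Q' and the date T of the change."""
    old: float
    when: date


@dataclass
class Product:
    code: str
    description: str
    price: float
    price_changes: list = field(default_factory=list)

    def set_price(self, new_price, when):
        # Record the annotation before overwriting the current value Q.
        self.price_changes.append(Changed(old=self.price, when=when))
        self.price = new_price


def big_recent_increases(products, min_increase, after):
    """Q - Q' > min_increase and T > after, with Q the current price."""
    return [
        (p.code, p.description)
        for p in products
        if any(p.price - c.old > min_increase and c.when > after
               for c in p.price_changes)
    ]


p1 = Product("F2GHYYRF", "Gismo223", 234.0)
p1.set_price(278.0, date(1999, 2, 1))     # small, early change
p2 = Product("X1", "made-up product", 100.0)
p2.set_price(250.0, date(1999, 5, 1))     # large, recent change
result = big_recent_increases([p1, p2], 100, date(1999, 4, 3))
```

Dates are modeled with `datetime.date` instead of the string "99/04/03" so that the comparison `T > ...` is well defined.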

Query Subscription: Examples

On the first of each month, send me the list of all products in my interest list whose price increased by more than 10%
Each time there are ten new employees, send me their names and departments
Notify me if the price of this house decreases
similarity with: on event, when condition, do action
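The "on event, when condition, do action" pattern can be sketched for the second example (ten new employees). The `Subscription` class below is a hypothetical illustration, not an API of any system named in the talk.

```python
class Subscription:
    """Buffers events; when the condition holds, runs the action on the batch."""

    def __init__(self, condition, action):
        self.condition, self.action = condition, action
        self.pending = []

    def on_event(self, event):
        self.pending.append(event)                # on event
        if self.condition(self.pending):          # when condition
            self.action(list(self.pending))       # do action
            self.pending.clear()


sent = []   # stands in for "send me their names and departments"
sub = Subscription(condition=lambda events: len(events) >= 10,
                   action=sent.append)

for i in range(25):
    sub.on_event({"name": f"employee{i}", "department": "sales"})
```

With 25 hires, two batches of ten are delivered and five events remain buffered until the next batch fills up.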


XML +:= World of Objects

The underlying model for XML is object-based and XML views should be based on OO(DB) technology

Views +:= World of objects

API for XML: Document Object Model
Views XML as object-oriented
Allows designing C++ or Java applications
E.g.:
  use subclass Promotion of XMLNode
  Catalog.promotions is only a set of virtual elements
  the list of promotions is generated on demand based on the nature of customers

Views in OODB: O2Views

Virtual values, as for relational views
  entirely virtual XML document, e.g., view of relational data
  virtual attributes
    e.g., product: code, name, price, …
    alternatives = the set of products that are "similar" and are on promotion
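A virtual attribute is computed when it is read instead of being stored. The Python sketch below expresses `alternatives` as a property on a view object. It is not O2Views syntax, and the notion of "similar" used here (same subcategory) is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Product:
    code: str
    name: str
    price: float
    subcategory: str
    on_promotion: bool = False


class ProductView:
    """Exposes the stored attributes plus one virtual attribute."""

    def __init__(self, product, catalog):
        self._p, self._catalog = product, catalog

    def __getattr__(self, attr):
        # Stored attributes (code, name, price, ...) are delegated.
        return getattr(self._p, attr)

    @property
    def alternatives(self):
        # Virtual attribute: recomputed on every access, never stored.
        return [q for q in self._catalog
                if q is not self._p
                and q.subcategory == self._p.subcategory   # assumed similarity
                and q.on_promotion]


gismo = Product("F2GHYYRF", "Gismo223", 1200.0, "sound")
other = Product("A1", "Gismo7", 900.0, "sound", on_promotion=True)   # made up
lamp = Product("B2", "Lamp", 50.0, "light", on_promotion=True)       # made up
view = ProductView(gismo, [gismo, other, lamp])
```

Clients of `view` cannot tell `alternatives` apart from a stored attribute, which is what makes the attribute "virtual".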

Views in OODB: O2Views

Virtual class: a set of database objects that are grouped together and as such acquire a new interface
  catalog1/DTD1, …, catalog17/DTD17
  products are represented differently in each catalog
  a unique DTD makes it possible to view all products
  each product can be "viewed" with that DTD


Views in OODB: O2Views

Imaginary class: groups objects that are all virtual, e.g., join of two relations

For more: see Souza’s thesis


XML data/views +:= semistructured + structured data

XML should also allow the exchange of structured data as in relational/ODMG models


Semistructured + Structured Data

If we know about the structure of data, not using it may damage performance

The use of structure facilitates the programming of applications, e.g., in Java

Structure may be useful to explain data to users

For more: see Lahiri’s thesis [and Ozone = OQL + Lorel ]

Web catalog - continued

Product-basic (all products)
  category=electronic, subcategory=sound, name=Gismo223, code=F2GHYYRF, selling-price=1200FF
Product-specific (for Gismos only)
  voltage=list(110,220), Gismo-norm=GHTF333
External resources
  description=http://m.ec.fr/cat/Gismo
  reviews=http://reviews.com/Gismo
Private data
  buying-price=100$, quantity-in-stock=20000, supplier=Sears, authorized-discount=30%

This data in XML

<product>
  <basic>
    <cat>electronic <subcat>sound</subcat></cat>
    <n>Gismo223</n> <c>F2GHYYRF</c>
    <sp currency="French-franc">1200</sp>
  </basic>
  <specific>
    <v>110</v> <v>220</v>
    <Gismo-norm>GHTF333</Gismo-norm>
  </specific>
  <external> … </external>
  <private>
    <bp currency="dollar">100</bp>
    <qis>20000</qis> <s>Sears</s> <ad>30</ad>
  </private>
</product>

What is such data exactly?

A mix of structured and semistructured data with pointers between two worlds
Purely XML. Then use a relation as a materialized view
  Product(name, code, category, subcategory, price, rest)
  Index on name and subcategory
  select P.name, P.price
  from P in Product
  where P.subcategory = "sound"
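One way to try the "relation as a materialized view" idea is to extract the structured fields into a relational table and keep everything else as text in `rest`. The sketch below uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules; the choice of SQLite, the index names and the `materialize` function are assumptions for illustration, and the sample document is a shortened version of the product above.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<product><basic><cat>electronic<subcat>sound</subcat></cat>
<n>Gismo223</n><c>F2GHYYRF</c><sp currency="French-franc">1200</sp></basic>
<specific><v>110</v><v>220</v></specific></product>"""


def materialize(conn, xml_docs):
    """Build Product(name, code, category, subcategory, price, rest)."""
    conn.execute("CREATE TABLE Product "
                 "(name, code, category, subcategory, price, rest)")
    conn.execute("CREATE INDEX product_by_name ON Product (name)")
    conn.execute("CREATE INDEX product_by_subcat ON Product (subcategory)")
    for text in xml_docs:
        root = ET.fromstring(text)
        basic = root.find("basic")
        cat = basic.find("cat")
        # Everything outside <basic> stays semistructured, stored as XML text.
        rest = "".join(ET.tostring(e, encoding="unicode")
                       for e in root if e.tag != "basic")
        conn.execute(
            "INSERT INTO Product VALUES (?, ?, ?, ?, ?, ?)",
            (basic.findtext("n"), basic.findtext("c"), cat.text.strip(),
             cat.findtext("subcat").strip(), int(basic.findtext("sp")), rest))


conn = sqlite3.connect(":memory:")
materialize(conn, [DOC])
rows = conn.execute(
    "SELECT name, price FROM Product WHERE subcategory = ?", ("sound",)
).fetchall()
```

The final query is the SQL counterpart of the select on the slide and can use the index on `subcategory`; the currency attribute is dropped here for brevity.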

Digression: storage of XML

as blobs
generic mapping: ignore the structure
specific mapping
  relational
  object
hybrid

As blobs

<product> <basic> <cat>electronic <subcat>sound</subcat></cat> <n>Gismo223</n><c>F2GHYYRF</c> <sp currency="French-franc">1200</sp> </basic> <specific> <v>110</v><v>220</v> <Gismo-norm>GHTF333</Gismo-norm> </specific> <external> … </external> <private> <bp currency="dollar">100</bp> <qis>20000</qis> <s>Sears</s> <ad>30</ad></private></product>

+ full-text index

Generic mapping

Edges (source, label, target):
root  product  o1
o1    basic    o2
o2    cat      o3
o2    subcat   o4
o2    n        o5
o2    c        o6
o2    sp       o7
...

Values (object, value):
o3  electronic
o4  sound
o5  Gismo223
o6  F2GHYYRF
o7  1200
...

Attributes (object, name, value):
o7   currency  French-franc
o12  currency  dollar
...
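A generic mapping of this kind can be produced by a single traversal that ignores what the labels mean. The Python sketch below shreds a document into three lists shaped like the tables above; the function name `shred` and the preorder numbering of object identifiers are assumptions, so the identifiers do not match the slide exactly.

```python
import itertools
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return (edges, values, attrs) for one XML document."""
    ids = (f"o{i}" for i in itertools.count(1))
    edges, values, attrs = [], [], []

    def visit(parent_id, elem):
        oid = next(ids)
        edges.append((parent_id, elem.tag, oid))      # (source, label, target)
        for name, value in elem.attrib.items():
            attrs.append((oid, name, value))          # (object, name, value)
        text = (elem.text or "").strip()
        if text:
            values.append((oid, text))                # (object, value)
        for child in elem:
            visit(oid, child)

    visit("root", ET.fromstring(xml_text))
    return edges, values, attrs


edges, values, attrs = shred(
    '<product><basic><cat>electronic</cat><subcat>sound</subcat>'
    '<sp currency="French-franc">1200</sp></basic></product>')
```

The same three tables can hold any document whatever its DTD, which is what "ignore the structure" means; the price is that every path query turns into a chain of self-joins on the edge table.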

Specific

class Product type tuple(
  cat: string; subcat: set(string);
  n: string; c: string; price: Price; specific: OEM;
  external: list(tuple(label: string; val: URL));
  private pr: tuple(bp: Price; qis: integer; supplier: Company))

type Price: tuple(sum: int; currency: Currency);

What is better? Hybrid?

Need for comparative studies
My feeling/common sense?:
  Use structure for very structured portions of data
  Use semistructured for less structured portions or portions with rapidly evolving structures
  Use blobs for components accessed mostly via full-text indexing, e.g., paragraphs in a document

Views +:= Active Features

Active Views

System developed at INRIA
Long term goals:
  Declarative specification of data intensive applications with cooperation between partners
  Ease of use and fast deployment
  (Automatic) verification

Architecture

[Figure: architecture diagram. Labels: O2; XML repository; Java Client; Java RMI; Web Browser; O2 Notification; JAVA AVApi; Java application; DOM.]

Motivations

Database Applications:
  passive behavior
  closed systems
  persistence, concurrency, access control
New needs
  interactions between clients: e.g., notification
  change control
  reactive behavior
  E.g.: e-Commerce, cooperative work

Illustration of Interactions: Notification

In the vendor view:

when Customer.entersDept(dept)
if dept = self.dept
then notify me
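The vendor rule above can be imitated with a small publish/subscribe loop. The Python classes below are toy stand-ins that only borrow the names of the slide; they are not the ActiveViews implementation, and all method names are assumptions.

```python
class Vendor:
    """A vendor view attached to one department."""

    def __init__(self, dept):
        self.dept = dept
        self.inbox = []

    def notify(self, customer, dept):
        self.inbox.append((customer, dept))


class AVServer:
    """Routes customer events to the vendor views whose rule fires."""

    def __init__(self):
        self.vendors = []

    def register(self, vendor):
        self.vendors.append(vendor)

    def enters_dept(self, customer, dept):       # when Customer.entersDept(dept)
        for vendor in self.vendors:
            if dept == vendor.dept:              # if dept = self.dept
                vendor.notify(customer, dept)    # then notify me


server = AVServer()
book_vendor, music_vendor = Vendor("book"), Vendor("music")
server.register(book_vendor)
server.register(music_vendor)
server.enters_dept("customer-1", "book")
```

Only the vendor of the book department is notified; the customer's client never addresses a vendor directly, since the rule is evaluated by the server on behalf of each vendor view.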

Notification

[Figure: an AVClient (customer) sends entersDept(book) to the AVServer, which sends notify to the AVClient of the vendor in the book dept.]

Illustration of Interaction: Change Control

In the customer view

let monitored MyPromotions be
select I.name, I.price
from I in Catalog.promotions.item
where I.department = self.department

read, write, append, monitored, refresh, deferred…
simpler case: monitoring of the catalog

Change control

[Figure: message sequence between two AVClients and the AVServer: 1 Read, 2 Read, 3 Modification, 4 Write, 5 Notification, 6 Notification, 7 Read.]

Choices

All XML
  XML repository
  XML query language
  XML views
Declarative specification
  almost no code to write
  compilation to an executable application
  active rules

Important Aspects

workflow
  e.g., customization: to search for a biblio ref, look first in my own files, otherwise look in dblp, otherwise look…
activities (search, buy, accounting, chat…)
active rules
logical traces
notifications
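The customization example (look first in my own files, otherwise in dblp, otherwise elsewhere) is an ordered fallback over sources. A minimal Python sketch follows, with in-memory dictionaries standing in for the real sources; the function `search_in_order` and all keys and entries are made up.

```python
def search_in_order(key, sources):
    """Try each (name, lookup) pair in turn and return the first hit."""
    for name, lookup in sources:
        hit = lookup(key)
        if hit is not None:
            return name, hit
    return None


# Dictionaries stand in for the user's files and for dblp.
my_files = {"ref-a": "entry A"}
dblp = {"ref-b": "entry B"}
sources = [("my files", my_files.get), ("dblp", dblp.get)]
```

The order of `sources` is the workflow: changing the customization means reordering or extending the list, not rewriting the search code.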


View +:= Incomplete Information

Use something like Imielinski-Lipski tables

Example: portal

Q1: gismo vendors
  { V | ∃P sell(V,gismo,P) }
  Q1 = v1, v2, v3, v4, v5
Q2: price for each vendor
  { V, P | sell(V,gismo,P) }
Q3: cheap gismo vendors
  { V | ∃P (sell(V,gismo,P) and P<80) }

Q1        Q2              Q3
comp      comp  price     comp  price  cond
v1        v1    109       v2    X      X<80
v2        v2    X         v5    Y      Y<80
v3        v3    99
v4        v4    89
v5        v5    Y
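The Q3 table can be derived mechanically from Q2 by carrying a condition along with each row whose price is unknown. The Python sketch below is a simplified reading of conditional tables, not the full Imielinski-Lipski formalism; representing unknown prices as variable names ("X", "Y") and the function `cheap_vendors` are assumptions.

```python
def cheap_vendors(q2, limit):
    """Evaluate P < limit over a table that may contain unknown prices.

    Returns (comp, price, cond) rows; cond is None when the row certainly
    qualifies, and a condition such as "X<80" when it only possibly does.
    """
    rows = []
    for comp, price in q2:
        if isinstance(price, str):          # unknown price, named by a variable
            rows.append((comp, price, f"{price}<{limit}"))
        elif price < limit:                 # known price that qualifies
            rows.append((comp, price, None))
        # known prices at or above the limit are dropped
    return rows


q2 = [("v1", 109), ("v2", "X"), ("v3", 99), ("v4", 89), ("v5", "Y")]
q3 = cheap_vendors(q2, 80)
```

Here `q3` contains exactly the two conditional rows of the table above: no vendor is a certain answer, and v2 and v5 are possible answers, each guarded by its own condition.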

Example: more portal

Load all electronic products
expiration: e.g. to recover storage space
  for all products loaded before May 1st, discard images and text of annotations
give me the gismos that have been annotated by Jeff Ullman and the annotations


View +:= workspace, distribution, cache...

Just to say, there is much more to it...


Conclusion

Some Challenges: Semistructured Data Processing

XML storage under non generic form
XML query language & optimization
XML bulk loading
data conversion, integration
incomplete information

Some Challenges: Change Control and View Interaction

update detection
incremental propagation
temporal XML: versions, DOEM...
rule and trigger management
management of large number of user active views (personalized)

Some Challenges: Workflow

workflow management: task sequencing
declarative specification of applications
program verification


Conclusion

Database folks should be interested in XML Views and more and more are...