EXCHANGING INTENSIONAL XML DATA
Tova Milo INRIA & Tel-Aviv U. ; Serge Abiteboul INRIA ; Bernd Amann Cedric-CNAM ; Omar Benjelloun INRIA ; Fred Dang Ngoc INRIA
H. GÜL ÇALIKLI 2002700743 ; MURAT KORAŞ 2002700797


Page 1: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL XML DATA

Tova Milo INRIA & Tel-Aviv U. ; Serge Abiteboul INRIA ; Bernd Amann Cedric-CNAM ; Omar Benjelloun INRIA ; Fred Dang Ngoc INRIA

H. GÜL ÇALIKLI 2002700743

MURAT KORAŞ 2002700797

Page 2: EXCHANGING INTENSIONAL XML DATA

INTRODUCTION

The emergence of Web Services as a standard means of publishing and accessing data on the web introduced a new class of XML documents called "intensional documents".

Intensional Documents: XML documents where
  some parts of the document are defined explicitly,
  some parts are defined by programs that generate data.

Page 3: EXCHANGING INTENSIONAL XML DATA

INTRODUCTION

Materialisation: the process of evaluating some of the programs included in an XML document and replacing them by their results.

GOAL of this PAPER:
  Study the new issues raised by the exchange of intensional XML documents between applications.
  Decide which data should be materialised before it is sent and which should not.

Page 4: EXCHANGING INTENSIONAL XML DATA

INTRODUCTION

CONSIDERATIONS for MATERIALISATION

Performance:
  current system load
  cost of communication

Capabilities:
  inability to handle intensional parts of a document
  lack of access rights (to a particular service)

Security:
  invoking service calls from an untrusted party may cause severe security violations

Functionalities:
  confidentiality reasons
  calling services may involve fees to be paid

Page 5: EXCHANGING INTENSIONAL XML DATA

INTRODUCTION

[Figure: Data exchange scenario for intensional documents. A sender and a receiver, each with its own capabilities, ACL, cost, ..., exchange intensional documents (trees containing function nodes such as f, g, q, r) that must conform to a Data Exchange Schema.]

Page 6: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE INTENSIONAL XML:
Model intensional XML documents as labelled trees consisting of two types of nodes:
  Data nodes
  Function nodes (these correspond to "Service Calls")

Assume the existence of some disjoint domains:
  $\mathcal{N}$ : domain of NODES
  $\mathcal{L}$ : domain of LABELS
  $\mathcal{F}$ : domain of FUNCTION NAMES
  $\mathcal{D}$ : domain of DATA VALUES

Page 7: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE INTENSIONAL XML (cont'd)

DEFINITION 1: An intensional document d is an expression $(T, \lambda)$ where:
  $T = (N, E, <)$ is an ordered tree:
    $N \subseteq \mathcal{N}$ : finite set of nodes (drawn from the domain of nodes)
    $E \subseteq N \times N$ : edges
    $<$ : associates with each node in N a total order on its children.
  $\lambda : N \to \mathcal{L} \cup \mathcal{F} \cup \mathcal{D}$ is a labeling function for the nodes.

NOTE: only leaf nodes may be assigned data values from $\mathcal{D}$.

Page 8: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE INTENSIONAL XML (cont'd)

Nodes with a label in $\mathcal{L} \cup \mathcal{D}$ are called Data Nodes.
Nodes with a label in $\mathcal{F}$ are called Function Nodes.
The children subtrees of a function node are the Function Parameters.
When the function is called:
  These subtrees are passed to it.
  The return value replaces the function node in the document.
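The replacement of a function node by its return value can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the `materialise` helper, the `invoke` parameter and the stand-in service `get_temp` are all hypothetical names, and the service result is hard-coded instead of being fetched over the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Node:
    label: str                       # a label, a function name, or a data value
    kind: str = "data"               # "data" or "function"
    children: List["Node"] = field(default_factory=list)


Service = Callable[[List[Node]], List[Node]]


def materialise(node: Node, services: Dict[str, Service], invoke: Set[str]) -> List[Node]:
    """Return the forest that replaces `node` after invoking the chosen functions."""
    # Parameters are handled first, so a call always receives rewritten subtrees.
    params = [out for child in node.children
              for out in materialise(child, services, invoke)]
    if node.kind == "function" and node.label in invoke:
        # The return value of the call replaces the function node.
        return services[node.label](params)
    return [Node(node.label, node.kind, params)]


def get_temp(params: List[Node]) -> List[Node]:
    # Hypothetical stand-in for a remote weather service.
    return [Node("temp", children=[Node("16 ºC")])]


doc = Node("newspaper", children=[
    Node("title", children=[Node("The Sun")]),
    Node("date", children=[Node("04/10/2002")]),
    Node("Get_Temp", "function", [Node("city", children=[Node("Paris")])]),
    Node("TimeOut", "function", [Node("Exhibits")]),
])

# Materialise Get_Temp only; TimeOut stays intensional.
(sent,) = materialise(doc, {"Get_Temp": get_temp}, invoke={"Get_Temp"})
print([child.label for child in sent.children])
```

Passing the set of functions to invoke as a parameter mirrors the question studied in the paper: which calls to materialise before sending and which to leave in the document.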

Page 9: EXCHANGING INTENSIONAL XML DATA

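THE MODEL and THE PROBLEM

[Figure: an intensional document drawn as a tree.]

newspaper
  title
    "The Sun"
  date
    "04/10/2002"
  Get_Temp (function node)
    city
      "Paris"
  TimeOut (function node)
    "Exhibits"

Invoking Get_Temp replaces the function node with its result:

  temp
    "16 ºC"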

Page 10: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA:

DEFINITION 2: A document schema s is an expression $(L, F, \tau)$ where:
  $L \subseteq \mathcal{L}$ : finite set of labels
  $F \subseteq \mathcal{F}$ : finite set of function names
  $\tau$ : function that maps
    each label name $l \in L$ to a regular expression over $L \cup F$ or to the keyword data,
    each function name $f \in F$ to a pair of expressions called
      $\tau_{in}(f)$ : input type of f
      $\tau_{out}(f)$ : output type of f

Page 11: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd)

Example of a Schema:
data:
  $\tau$(newspaper) = title.date.(Get_Temp|temp).(TimeOut|exhibit)
  $\tau$(title) = data
  $\tau$(date) = data
  $\tau$(temp) = data
  $\tau$(city) = data
  $\tau$(exhibit) = data

Page 12: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd)

Example of a Schema (cont'd):
functions:
  $\tau_{in}$(Get_Temp) = city
  $\tau_{out}$(Get_Temp) = temp
  $\tau_{in}$(TimeOut) = data
  $\tau_{out}$(TimeOut) = (exhibit|performance)
  $\tau_{in}$(Get_Date) = title
  $\tau_{out}$(Get_Date) = date

Page 13: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 3: An intensional document t is an instance of a schema $s = (L, F, \tau)$ if:
  for each data node $n \in t$ with label $l \in L$, the labels of n's children form a word in $lang(\tau(l))$;
  the same holds for each function node with label f, with $\tau_{in}(f)$ in place of $\tau(l)$.

Here $lang(\tau(l))$ denotes the regular language defined by $\tau(l)$.

Page 14: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 3 (cont'd):
  f : a function name
  $t_1, \ldots, t_n$ : a sequence of intensional trees
  IF the labels of the roots of $t_1, \ldots, t_n$ form a word in $lang(\tau_{in}(f))$ (respectively $lang(\tau_{out}(f))$) AND all the trees are instances of s,
  THEN $t_1, \ldots, t_n$ is an input instance (respectively output instance) of f.

Note: every subtree conforms to the same schema as the whole document.
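The instance test of Definition 3 can be sketched with ordinary regular expressions. The encoding below is an assumption made for illustration only: trees are `(label, children)` tuples, data values are plain strings, each label in a word is followed by a comma so that Python's `re` module can match whole labels, and the keyword `data` is read as "exactly one data value child".

```python
import re

# tau maps labels to regular expressions over child labels (each label is
# followed by "," so that words can be matched with the `re` module).
TAU = {
    "newspaper": r"title,date,(Get_Temp,|temp,)(TimeOut,|exhibit,)",
    "title": "data", "date": "data", "temp": "data",
    "city": "data", "exhibit": "data",
}
# Input types of the functions of the example schema.
TAU_IN = {"Get_Temp": r"city,", "TimeOut": "data"}


def child_word(children):
    # Data values are plain strings; they contribute a marker no label uses.
    return "".join(("#value" if isinstance(c, str) else c[0]) + "," for c in children)


def is_instance(tree, tau=TAU, tau_in=TAU_IN):
    if isinstance(tree, str):            # a data value needs no further check
        return True
    label, children = tree
    expected = tau_in.get(label, tau.get(label))
    if expected is None:                 # label unknown to the schema
        return False
    if expected == "data":
        return len(children) == 1 and isinstance(children[0], str)
    return (re.fullmatch(expected, child_word(children)) is not None
            and all(is_instance(c, tau, tau_in) for c in children))


NEWSPAPER = ("newspaper", [
    ("title", ["The Sun"]),
    ("date", ["04/10/2002"]),
    ("Get_Temp", [("city", ["Paris"])]),
    ("TimeOut", ["Exhibits"]),
])
print(is_instance(NEWSPAPER))
```

The recursion checks every node against the same schema, which is the point of the note above: every subtree conforms to the same schema as the whole document.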

Page 15: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 4 (about Rewritings):
  t, t' : trees
  IF t' is obtained from t by
    selecting a function node v in t with some label f, and
    replacing it by an arbitrary output instance of f,
  THEN we say that $t \xrightarrow{v} t'$.

Page 16: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 4 (about Rewritings) (cont'd):
  IF $t \xrightarrow{v_1} t_1 \xrightarrow{v_2} t_2 \cdots \xrightarrow{v_n} t_n$
  THEN we say that $t \xrightarrow{*} t_n$ (t rewrites into $t_n$).
  The nodes $v_1, \ldots, v_n$ are called a rewriting sequence.
  The set of all trees t' such that $t \xrightarrow{*} t'$ is denoted ext(t).

Page 17: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 5 (about Rewritings): Let t be a tree and s be a schema.
  1. IF ext(t) contains some instance of s, THEN t possibly rewrites into s.
  2. IF either t is already an instance of s, or there exists some node v in t such that all trees t' where $t \xrightarrow{v} t'$ safely rewrite into s, THEN we say that t safely rewrites into s.

Page 18: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

SIMPLE SCHEMA (cont'd):

DEFINITION 6: Let s be a schema and r a distinguished label called the root label.
  IF all the instances t of s with root label r rewrite safely into instances of s',
  THEN we say that s safely rewrites into s'.

Page 19: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

A Richer Data Model:

Function Patterns:

The schemas we have seen so far specify that a particular function, identified by its name, may appear in the document.

But sometimes, one does not know in advance which functions will be used at a given place.

A common intensional schema for such documents should not require the use of a particular function, but rather allow for a set of functions which have a proper signature.

Page 20: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

To specify such a set of functions we use Function Patterns.

Function Patterns: A function belongs to the pattern if its name satisfies the boolean predicate and its signature is the same as the required one.

EX:
  $\tau_{name}$(Forecast) = UDDIF $\wedge$ InACL
  $\tau_{in}$(Forecast) = city
  $\tau_{out}$(Forecast) = temp
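A function pattern can be sketched as a name predicate paired with a required signature. In this illustration the class `FunctionPattern`, the method `admits`, and the two sets standing in for the UDDIF and InACL predicates are hypothetical; the slides only state that the name predicate is boolean and that the signature must match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FunctionPattern:
    name_ok: Callable[[str], bool]   # boolean predicate on the function name
    input_type: str                  # required input type
    output_type: str                 # required output type

    def admits(self, name: str, input_type: str, output_type: str) -> bool:
        # A function belongs to the pattern if its name satisfies the
        # predicate and its signature equals the required one.
        return (self.name_ok(name)
                and input_type == self.input_type
                and output_type == self.output_type)


# Hypothetical stand-ins for the UDDIF and InACL predicates.
UDDI_REGISTERED = {"Get_Temp", "Weather_Now"}
ACCESS_LIST = {"Get_Temp"}

forecast = FunctionPattern(
    name_ok=lambda name: name in UDDI_REGISTERED and name in ACCESS_LIST,
    input_type="city",
    output_type="temp",
)

print(forecast.admits("Get_Temp", "city", "temp"))
print(forecast.admits("Weather_Now", "city", "temp"))
```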

Page 21: EXCHANGING INTENSIONAL XML DATA

THE MODEL and THE PROBLEM

A Richer Data Model (cont'd):

Restricted Service Invocations:

We assumed so far that all the functions appearing in a document may be invoked in a rewriting, in order to match a given schema.

This is not always the case, for reasons such as security, cost, access rights, etc.

THUS, function names/patterns in the schema can be partitioned into two disjoint groups of invocable and noninvocable ones.

A legal rewriting is then one that invokes only invocable functions.

Page 22: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL DATA

Rewriting Process:

1. Safe Rewriting:
  Check if t safely rewrites to s.
  If so, find a rewriting sequence.
  Rewriting sequence: a sequence of functions that need to be invoked to transform t into the required structure.
  Preferred rewriting sequence: the shortest/cheapest one.

Page 23: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL DATA

Rewriting Process (cont'd):

2. Possible Rewriting:
  IF a safe rewriting does not exist, check whether t may at least possibly rewrite to s.
  IF it is acceptable to do so (the sender accepts that the rewriting may fail), try to find a successful rewriting sequence if one exists.
  Preferred rewriting sequence: the one with the least cost.

Page 24: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL DATA

Rewriting Process (cont'd):

3. Mixed Approach:
  In the mixed approach, one could first invoke some function calls, then attempt from there to find safe rewritings.

Page 25: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL DATA

Rewriting Process (cont'd):

DEFINITION 7: For a rewriting sequence $t \xrightarrow{v_1} t_1 \cdots \xrightarrow{v_n} t_n$,
  IF $v_j \in t_i$ but $v_j \notin t_{i-1}$,
  THEN we say that function node $v_j$ depends on function node $v_i$.
  IF the dependency graph among the nodes contains no paths of length greater than k,
  THEN we say that the rewriting sequence is of depth k.

Page 26: EXCHANGING INTENSIONAL XML DATA

EXCHANGING INTENSIONAL DATA

RESTRICTION:

"Consider only k-depth left-to-right rewritings."

Page 27: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

Algorithm for k-depth left-to-right safe rewriting. The algorithm is decomposed into three parts:

1. Rewriting Function Parameters:
  To invoke a function, its parameters should be of the right type; if not, they should be rewritten to fit that type.
  When rewriting the parameters, the functions in them can be invoked ONLY IF their own parameters are of, or can be rewritten into, the expected input type.

Page 28: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The algorithm is decomposed into three parts (cont'd)

1. Rewriting Function Parameters (cont'd)
  For the deepest functions: verify that their parameters are instances of the corresponding input types. If not, rewriting fails.
  Move upward (repeat until all functions in the tree (forest) are done): for each function f, try to safely rewrite f's own parameters into the required structure. If this is not possible, rewriting fails.

Page 29: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The algorithm is decomposed into three parts (cont'd)

2. Top Down Traversal:
  In each iteration of the recursive procedure "Rewriting Function Parameters", the parameters of the outermost functions of the tree (forest) are handled.
  In this part, safely rewrite the tree (forest) by invoking only these outermost functions.
  THUS: traverse the tree (forest) top down. At each step treat a single node and its children.

Page 30: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The algorithm is decomposed into three parts (cont'd)

2. Top Down Traversal (cont'd)
  Consider a node n with children whose labels form a word w.
  The subtree rooted at node n can be rewritten into the target schema $s = (L, F, \tau)$ IF and ONLY IF:
    1. w can be safely rewritten into a word in $lang(\tau(label(n)))$, AND
    2. each of n's children can be safely rewritten into an instance of s.

Page 31: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The algorithm is decomposed into three parts (cont'd)

3. Rewriting the children of a node n:
  Given: w, a word (the sequence of labels of n's children).
  Goal: rewrite w so that it becomes a word in the regular language $R = \tau(label(n))$.
  The process of rewriting involves:
    choosing some functions in w and replacing them by a possible output,
    then choosing some other functions (which might have been returned by previous calls) and replacing them by their output,
    and so on, up to the depth k.

Page 32: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

Safe Rewriting Algorithm:
  Given:
    a word w
    the output types $R_{f_1}, \ldots, R_{f_n}$ of the available functions
    a target regular language R
  Purpose of the algorithm:
    to test if w can be safely rewritten into a word in R
    if so, to find a safe rewriting sequence

Page 33: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

Safe Rewriting Algorithm:
  Note: For illustration purposes we use the newspaper document.
    w = title.date.Get_Temp.TimeOut (the word formed by the children labels)
    R = title.date.temp.(TimeOut|exhibit*) (we look for a safe rewriting of the above word into a word in R)

The Algorithm:
  1) Build the finite state automata for the following regular languages:
    1.1) An automaton $A_w$ accepting w as a single word.

Page 34: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd)
    1.2) Build automata $A_{f_i}$, i = 1, ..., n, each accepting the regular language $R_{f_i}$.
    1.3) Build an automaton $\bar{A}$ accepting the complement of the regular language R. The automaton should be deterministic and complete.
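Step 1.3 can be illustrated on the running example. The sketch below assumes the target language is already given as a (possibly partial) deterministic automaton; the tuple layout, the helper names `complement` and `accepts`, the sink name and the state numbers are illustrative choices, not taken from the paper. Completing the automaton with a sink state and flipping the accepting set yields a deterministic, complete automaton for the complement.

```python
def complement(states, alphabet, trans, start, accepting, sink="sink"):
    """Complete a partial DFA with a sink state, then flip acceptance."""
    all_states = set(states) | {sink}
    full = dict(trans)
    for s in all_states:
        for a in alphabet:
            full.setdefault((s, a), sink)   # missing moves go to the sink
    return all_states, full, start, all_states - set(accepting)


def accepts(dfa, word):
    _, trans, state, accepting = dfa
    for a in word:
        state = trans[(state, a)]
    return state in accepting


ALPHABET = {"title", "date", "temp", "TimeOut", "exhibit", "performance", "Get_Temp"}
# R = title.date.temp.(TimeOut | exhibit*), written as a partial DFA.
R_TRANS = {(0, "title"): 1, (1, "date"): 2, (2, "temp"): 3,
           (3, "TimeOut"): 4, (3, "exhibit"): 5, (5, "exhibit"): 5}
not_R = complement(range(6), ALPHABET, R_TRANS, 0, {3, 4, 5})

print(accepts(not_R, ["title", "date", "temp", "performance"]))
print(accepts(not_R, ["title", "date", "temp", "exhibit", "exhibit"]))
```

Completeness matters here: every word over the alphabet must end in some state, otherwise words that "fall off" the automaton would be missed by the complement.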

Page 35: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

[Figure: The complement automaton $\bar{A}$ for the schema $\tau'$(newspaper) = title.date.temp.(TimeOut|exhibit*), with states p0 to p6; edges labelled title, date, temp, TimeOut and exhibit, and edges labelled * for all other symbols.]

Page 36: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd)
  2) Let $A_w^k := A_w$.
  3) For j = 1, ..., k:
    Consider all the edges e = (v, u) in $A_w^k$ that are labelled by a function name $f_i$ and were not treated in previous iterations.
    3.1) Extend $A_w^k$ by attaching a copy of the automaton $A_{f_i}$, with its initial and final states linked to v and u respectively by $\varepsilon$ moves.
    3.2) Denote v as a fork node (for the edge e).
    3.3) The two fork options of v are e itself and the new outgoing $\varepsilon$ edge.
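The expansion of steps 2 and 3 can be sketched as below. This is an illustrative construction under stated assumptions: output types are given as small deterministic automata (the `DFA` tuple is a hypothetical representation), states are plain integers, and the output type of TimeOut is taken to be (exhibit|performance)* so that its copy consists of a single looping state.

```python
from collections import namedtuple
from itertools import count

DFA = namedtuple("DFA", "start accepting trans")   # trans: {(state, label): state}


def expand(word, outputs, k):
    """Build A_w^k: returns (labelled edges, epsilon edges, fork options)."""
    edges = [(i, a, i + 1) for i, a in enumerate(word)]     # A_w itself
    eps, forks = [], []
    fresh = count(len(word) + 1)                            # new state numbers
    frontier = list(edges)                                  # edges not yet treated
    for _ in range(k):
        added = []
        for v, a, u in frontier:
            if a not in outputs:
                continue                                    # not a function edge
            f, names = outputs[a], {}

            def copy(s):
                # Rename the states of A_f so that each copy is disjoint.
                if s not in names:
                    names[s] = next(fresh)
                return names[s]

            invoke = (v, copy(f.start))                     # epsilon into the copy
            eps.append(invoke)
            for (s, b), s2 in f.trans.items():
                added.append((copy(s), b, copy(s2)))
            eps.extend((copy(s), u) for s in f.accepting)   # epsilon back to u
            forks.append({"node": v, "keep": (v, a, u), "invoke": invoke})
        edges.extend(added)
        frontier = added                                    # treat new edges next
    return edges, eps, forks


OUTPUTS = {
    "Get_Temp": DFA(0, {1}, {(0, "temp"): 1}),
    "TimeOut": DFA(0, {0}, {(0, "exhibit"): 0, (0, "performance"): 0}),
}
edges, eps, forks = expand(["title", "date", "Get_Temp", "TimeOut"], OUTPUTS, k=1)
print([f["node"] for f in forks])
print(eps)
```

Each fork records both options explicitly: keeping the function-labelled edge (the call is not invoked) or taking the ε edge into the copy of the output automaton (the call is invoked).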

Page 37: EXCHANGING INTENSIONAL XML DATA

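SAFE REWRITING

[Figure: The 1-depth automaton $A_w^1$ for the word w = title.date.Get_Temp.TimeOut, with states q0 to q7. Main path: q0 (title) q1 (date) q2 (Get_Temp) q3 (TimeOut) q4. Copies of the output automata are attached by $\varepsilon$ moves: q5 and q6 (edge temp) for Get_Temp, q7 (edges exhibit, performance) for TimeOut. q2 and q3 are fork nodes: the $\varepsilon$ edge represents the choice of invoking the function, the function-labelled edge represents the choice of not invoking it.]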

Page 38: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd)
  4) Construct the cartesian product automaton $A_\times = A_w^k \times \bar{A}$.
    The fork nodes and fork options in $A_\times$ reflect those of $A_w^k$:
    4.1) The fork nodes are the nodes $[q\ p] \in A_\times$ where q was a fork node in $A_w^k$.
    4.2) A fork option in $A_\times$ consists of all edges originating from one fork option edge in $A_w^k$.

Page 39: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

[Figure 6: The cartesian product automaton $A_\times = A_w^k \times \bar{A}$. States are pairs [q, p], starting at [q0, p0]; edges are labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and $\varepsilon$.]

Page 40: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd):
  5) Mark nodes in $A_\times$:
    5.1) Mark states that are accepting states in both $A_w^k$ and $\bar{A}$.
    5.2) Iteratively mark:
      non-fork (regular) nodes: IF one of their outgoing edges points to a marked node;
      fork nodes: IF both of their fork options (for some $f_i$) contain an edge that points to a marked node.

Page 41: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

[Figure 6 (shown again): The cartesian product automaton $A_\times = A_w^k \times \bar{A}$, used to illustrate the marking of step 5.]

Page 42: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd):
  6) Try to obtain a SAFE REWRITING.
    "A safe rewriting exists IFF the initial state is not marked."
    6.1) Follow a non-marked path (corresponding to w) starting from the initial state of $A_\times$ to a state [q p] where q is an accepting state of $A_w^k$.
      6.1.1) Non-marked fork options on the path determine the rewriting choices (i.e. which functions to call).
      6.1.2) When a function is invoked, we continue the path with the new rewritten word rather than the word w.

Page 43: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

The Algorithm (cont'd):
    6.2) To minimize the rewriting cost, choose a path with minimal number/cost of function invocations.

  EXIT % End of the algorithm
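The decision part of the algorithm (steps 1 to 6, without extracting the rewriting sequence) can be sketched for depth k = 1 as below. This is an illustration under assumptions, not the authors' code: the `DFA` tuple, the state naming and the function `safely_rewrites` are hypothetical, the target is given as a partial deterministic automaton that is completed on the fly with a `DEAD` sink, the output automata are assumed to be non-empty and trimmed, and the output type of TimeOut is taken exactly as written on the schema slide.

```python
from collections import namedtuple

DFA = namedtuple("DFA", "start accepting trans")   # trans: {(state, label): state}
DEAD = "dead"                                       # sink completing the target DFA


def safely_rewrites(word, outputs, target):
    """1-depth, left-to-right test: can `word` always be rewritten into `target`?"""
    n = len(word)
    edges, forks = {}, set()          # edges: state -> [(label or None, state)]
    for i, a in enumerate(word):
        edges[("w", i)] = [(a, ("w", i + 1))]                   # option: keep a
        if a in outputs:                                        # a is a function
            f = outputs[a]
            forks.add(("w", i))
            edges[("w", i)].append((None, ("f", i, f.start)))   # option: invoke
            for (s, b), s2 in f.trans.items():
                edges.setdefault(("f", i, s), []).append((b, ("f", i, s2)))
            for s in f.accepting:
                edges.setdefault(("f", i, s), []).append((None, ("w", i + 1)))

    p_states = ({target.start, DEAD} | {s for s, _ in target.trans}
                | set(target.trans.values()))

    def successors(q, p):
        # Product edges: epsilon moves (label None) leave the target state alone.
        return [(q2, p if b is None else target.trans.get((p, b), DEAD))
                for b, q2 in edges.get(q, [])]

    # Step 5.1: the word is finished and the complement of the target accepts.
    marked = {(("w", n), p) for p in p_states if p not in target.accepting}
    changed = True
    while changed:                    # step 5.2, iterated to a fixpoint
        changed = False
        for q in edges:
            for p in p_states:
                if (q, p) in marked:
                    continue
                hits = [s in marked for s in successors(q, p)]
                # Fork: both options must be bad. Otherwise one bad edge suffices.
                if (all(hits) if q in forks else any(hits)):
                    marked.add((q, p))
                    changed = True
    # Step 6: a safe rewriting exists iff the initial state is not marked.
    return (("w", 0), target.start) not in marked


OUTPUTS = {
    "Get_Temp": DFA(0, {1}, {(0, "temp"): 1}),
    "TimeOut": DFA(0, {1}, {(0, "exhibit"): 1, (0, "performance"): 1}),
}
W = ["title", "date", "Get_Temp", "TimeOut"]
# title.date.temp.(TimeOut | exhibit*)
R1 = DFA(0, {3, 4, 5}, {(0, "title"): 1, (1, "date"): 2, (2, "temp"): 3,
                        (3, "TimeOut"): 4, (3, "exhibit"): 5, (5, "exhibit"): 5})
# title.date.temp.exhibit*
R2 = DFA(0, {3}, {(0, "title"): 1, (1, "date"): 2, (2, "temp"): 3,
                  (3, "exhibit"): 3})
print(safely_rewrites(W, OUTPUTS, R1), safely_rewrites(W, OUTPUTS, R2))
```

For the first target the rewriter can invoke Get_Temp and keep TimeOut, so no answer of a service can spoil the result. For the second target TimeOut must be invoked and may return a performance, so the initial state gets marked.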

Page 44: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

[Figure 7: The complement automaton $\bar{A}$ for the schema $\tau''$(newspaper) = title.date.temp.exhibit*; edges labelled title, date, temp and exhibit, and edges labelled * for all other symbols.]

Page 45: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

[Figure 8: The cartesian product automaton $A_\times = A_w^k \times \bar{A}$. States are pairs [q, p], starting at [q0, p0]; edges are labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and $\varepsilon$.]

Page 46: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

Complexity of the Algorithm:
  $s_0$ : schema of the sender
  $s$ : agreed data exchange schema
  Complexity is determined by the size of the cartesian product automaton:
    1. Construct the cartesian product.
    2. Traverse and mark the nodes of the resulting product.
  THUS complexity is bounded by:
    $O(|A_\times|^2) = O((|A_w^k| \times |\bar{A}|)^2)$

Page 47: EXCHANGING INTENSIONAL XML DATA

SAFE REWRITING

Complexity of the Algorithm (cont'd):
  $O(|A_\times|^2) = O((|A_w^k| \times |\bar{A}|)^2)$
  Maximum size: $O((|s_0| + |w|)^k)$
  Complexity is polynomial in the size of the schemas $s$ and $s_0$ (with the exponent determined by k).

Page 48: EXCHANGING INTENSIONAL XML DATA

POSSIBLE REWRITING

The Algorithm
  1. Build finite state automata for the following languages:
    1.1. An automaton $A_w^k$.
    1.2. An automaton A accepting the regular language R.

Page 49: EXCHANGING INTENSIONAL XML DATA

POSSIBLE REWRITING

[Figure 10: An automaton A for the schema $\tau''$(newspaper) = title.date.temp.exhibit*, with states p0 to p4; edges labelled title, date, temp and exhibit.]

Page 50: EXCHANGING INTENSIONAL XML DATA

POSSIBLE REWRITING

The Algorithm (cont'd)
  2. Construct the cartesian product automaton $A_\times = A_w^k \times A$.

[Figure 11: The cartesian product automaton for possible rewriting. States are pairs [q, p], starting at [q0, p0]; edges are labelled title, date, temp, exhibit and $\varepsilon$.]

Page 51: EXCHANGING INTENSIONAL XML DATA

POSSIBLE REWRITING

The Algorithm (cont'd)
  3. Mark all nodes in $A_\times$ having some outgoing path leading to a final state.
  4. IF the initial state is marked THEN a rewriting may exist. To obtain such a rewriting:
    Follow a marked path from the initial state of $A_\times$ to a final one, with the fork options on the path determining the rewriting choices.
    Backtrack when a call returns a value that does not allow the path to continue to an accepting state.
    To minimize the rewriting cost, choose a path with the minimal number/cost of function invocations.
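The existence test of steps 3 and 4 can be sketched for depth k = 1 as below. Note the swapped-in technique: instead of marking backwards from the final states, the sketch searches forwards from the initial state of the product, which answers the same question (is a final state reachable from the initial one). The `DFA` tuple and the function name `possibly_rewrites` are hypothetical, and no actual service is called, so backtracking on real answers is not shown.

```python
from collections import deque, namedtuple

DFA = namedtuple("DFA", "start accepting trans")   # trans: {(state, label): state}


def possibly_rewrites(word, outputs, target):
    """1-depth test: is there some choice of calls and results that lands in target?"""
    n = len(word)

    def successors(q, p):
        kind, i = q[0], q[1]
        if kind == "w" and i < n:
            a = word[i]
            if (p, a) in target.trans:                     # keep the symbol
                yield ("w", i + 1), target.trans[(p, a)]
            if a in outputs:                               # invoke the function
                yield ("f", i, outputs[a].start), p
        elif kind == "f":
            f, s = outputs[word[i]], q[2]
            for (s1, b), s2 in f.trans.items():
                if s1 == s and (p, b) in target.trans:     # one possible answer
                    yield ("f", i, s2), target.trans[(p, b)]
            if s in f.accepting:                           # the call has returned
                yield ("w", i + 1), p

    start = (("w", 0), target.start)
    seen, queue = {start}, deque([start])
    while queue:
        q, p = queue.popleft()
        if q == ("w", n) and p in target.accepting:
            return True
        for nxt in successors(q, p):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


OUTPUTS = {
    "Get_Temp": DFA(0, {1}, {(0, "temp"): 1}),
    "TimeOut": DFA(0, {1}, {(0, "exhibit"): 1, (0, "performance"): 1}),
}
W = ["title", "date", "Get_Temp", "TimeOut"]
# title.date.temp.exhibit*
R2 = DFA(0, {3}, {(0, "title"): 1, (1, "date"): 2, (2, "temp"): 3,
                  (3, "exhibit"): 3})
print(possibly_rewrites(W, OUTPUTS, R2))
```

Here a rewriting may exist because TimeOut can return an exhibit, but it is not guaranteed: if the real call returns a performance, the sender has to backtrack or give up.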

Page 52: EXCHANGING INTENSIONAL XML DATA

POSSIBLE REWRITING

[Figure 11 (shown again): The cartesian product automaton for possible rewriting.]

Page 53: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

The implementation was performed in the Schema Enforcement Module of ActiveXML.

We'll describe:
  how the intensional document and schema model map to XML, XML Schema, SOAP and WSDL;
  ActiveXML and the Schema Enforcement Module.

Page 54: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

In the implementation, an intensional XML document is a syntactically well-formed XML document.

To distinguish intensional parts from the rest of the document, the namespace http://www.activexml.com/ns/int is used. This namespace is defined for function (service) calls.
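The role of the namespace can be illustrated with Python's standard XML parser. The namespace URI is the one given above; everything else in the sample document is an assumption for illustration: the element name `fun`, the attribute names `endpointURL`, `methodName` and `namespaceURI`, and the example.org-style URLs are hypothetical placeholders for the three pieces of information (server URL, method name, associated namespace) a call needs.

```python
import xml.etree.ElementTree as ET

INT_NS = "http://www.activexml.com/ns/int"

# Hypothetical serialisation of the newspaper document.
DOC = """\
<newspaper xmlns:int="http://www.activexml.com/ns/int">
  <title>The Sun</title>
  <date>04/10/2002</date>
  <int:fun endpointURL="http://forecast.example/soap"
           methodName="Get_Temp" namespaceURI="urn:example-weather">
    <city>Paris</city>
  </int:fun>
  <int:fun endpointURL="http://timeout.example/soap"
           methodName="TimeOut" namespaceURI="urn:example-timeout">Exhibits</int:fun>
</newspaper>
"""

root = ET.fromstring(DOC)
# ElementTree spells qualified names as "{namespace}local-name".
calls = [child for child in root if child.tag == f"{{{INT_NS}}}fun"]
data_nodes = [child.tag for child in root if child not in calls]

print(data_nodes)
print([call.get("methodName") for call in calls])
```

Because the intensional parts live in their own namespace, a receiver that does not understand them can still parse the document as ordinary well-formed XML.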

Page 55: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

[Figure: the newspaper document as a tree: newspaper with children title ("The Sun"), date ("04/10/2002"), Get_Temp (with parameter city "Paris") and TimeOut (with parameter "Exhibits").]

Page 56: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

[XML listing of the newspaper document; the listing itself is not preserved, only its annotations.]
  Namespace defined for function (service) calls
  Data nodes title and date
  Three attributes of the function nodes provide the necessary information to call the SOAP service:
    1. URL of the server
    2. Method name
    3. Associated namespace

Page 57: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

[XML listing, continued; only the annotations are preserved.]
  Function TimeOut:
    1. URL of the server
    2. Method name
    3. Associated namespace

Page 58: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

XML Representation of Function Attributes

  id attribute: identifies the function attributes
  Attributes: designate the SOAP function that implements the boolean predicate used for the function pattern
  The "contents" detail the function signature, i.e. the expected types of the input parameters and of the result of function calls

Page 59: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

Function Pattern "Forecast"
  Captures any function with one input parameter of element type "city"
  Returns an element of type "temp"

Page 60: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

Newspaper element with structure title.date.(Forecast|temp).(TimeOut|exhibit*)

Page 61: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

ActiveXML System:
  ActiveXML is a peer-to-peer system centered around intensional XML documents.
  Each peer:
    contains a repository of intensional documents;
    provides active features to enrich them by automatically triggering the function calls they contain;
    also provides some Web Services, defined declaratively as queries/updates on top of the repository documents.
  All exchanges between the ActiveXML peers and with Web Service providers/consumers use the SOAP protocol.

Page 62: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

The Role of the Schema Enforcement Module:
  1. to verify whether the call parameters conform to the $\mathrm{WSDL}_{int}$ description of the service;
  2. if not, to try to rewrite them into the required structure;
  3. if 2 fails, to report an error.

NOTE: Similarly, before an ActiveXML service returns its answer, the Schema Enforcement Module performs the same three steps on the returned data.
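The three steps can be sketched as a small wrapper. This is a toy illustration, not the ActiveXML code: `enforce`, `SchemaEnforcementError`, and the stand-in `conforms` and `rewrite` functions are hypothetical, and the "schema" used in the example simply forbids the function name Get_Temp in a word of labels.

```python
class SchemaEnforcementError(Exception):
    """Raised when data neither conforms nor can be rewritten."""


def enforce(data, conforms, rewrite):
    # Step 1: verify whether the data already has the required structure.
    if conforms(data):
        return data
    # Step 2: try to rewrite it; `rewrite` returns None when it gives up.
    rewritten = rewrite(data)
    if rewritten is not None and conforms(rewritten):
        return rewritten
    # Step 3: report an error.
    raise SchemaEnforcementError(f"cannot rewrite {data!r} into the required structure")


# Toy stand-ins: the required structure forbids the function name Get_Temp.
def conforms(word):
    return "Get_Temp" not in word


def rewrite(word):
    # Pretend to invoke Get_Temp and splice in its result.
    return ["temp" if label == "Get_Temp" else label for label in word]


print(enforce(["title", "date", "Get_Temp"], conforms, rewrite))
```

The same wrapper can be applied in both directions, to call parameters before a service is invoked and to the answer before it is returned.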

Page 63: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

Implementation of the Schema Enforcement Module:
  The parser uses a standard SAX parser.
  It does not cover all the features of XML Schema, but implements the important ones, such as:
    complex types
    element/type references
    schema import
  It does not check simple types, inheritance and keys, but these could easily be added to the code.

Page 64: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

Unlike the proposed algorithm, the implementation builds the automaton in a lazy mode:
  start from the initial state and construct only the needed parts.

The construction is pruned whenever a node can be marked directly without looking at the remaining, unexplored branches.

Main ideas that guide this process:
  1. Sink nodes: once you get there, you cannot get out.
  2. Marked nodes.

Page 65: EXCHANGING INTENSIONAL XML DATA

IMPLEMENTATION

[Figure 12: The pruned automaton. A pruned version of the cartesian product automaton, with states [q, p] starting at [q0, p0] and edges labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and $\varepsilon$.]

Page 66: EXCHANGING INTENSIONAL XML DATA

CONCLUSION and RELATED WORK

XML documents with embedded calls to Web services are already present in several existing products.

WHAT'S NEW? However, the proposed extension of XML Schema with function types is a first step towards a more precise description of XML documents embedding computation.

MAIN PROBLEM: whether Safe Rewriting remains decidable when the k-depth restriction is removed.