Keys For XML Peter Buneman Susan Davidson Wenfei Fan Carmem Hara Wang Chiew Tan


Keys For XML

Peter Buneman, Susan Davidson, Wenfei Fan, Carmem Hara, Wang Chiew Tan


Overview

  • Motivation
  • Definition of Keys
  • Examples of Keys
  • Value Equality
  • Relative Keys
  • Examples of Relative Keys
  • Stronger Keys
  • Examples of Stronger Keys
  • Advantages
  • Disadvantages
  • Conclusion


Motivation

Keys are important for citing parts of a document.

Defects of XPath:
1. Complex
2. Technical problems
3. Questions about the equivalence of XPath expressions


In the absence of keys, the only way to identify a tuple is to give the entire tuple:

<db>
  <student>
    <name> Smith </name>
    <course> Math2 </course>
  </student>
  <student>
    <name> Jones </name>
    <course> Math2 </course>
  </student>
</db>


Definition of Keys

A key specification is a pair (Q, {P1, ..., Pn}), where Q is a path expression and {P1, ..., Pn} is a set of simple path expressions.

The path expression Q identifies a set of nodes, called the target set, on which the key constraint is to hold.

The set {P1, ..., Pn} is referred to as the key paths.

Example: (person.employees, {name.firstname, name.lastname})


Formal Definition. A node n satisfies a key specification (Q, {P1, ..., Pk}) if, for any n1, n2 in n[[Q]], the following holds: if for every i (1 <= i <= k) there exist z1 in n1[[Pi]] and z2 in n2[[Pi]] such that z1 =v z2, then n1 = n2.

Here =v stands for value equality, while n1 = n2 means that n1 and n2 are the same node.


Value Equality

  • Stands for equality of the "values" associated with nodes.
  • In XML Schema, nodes may have a complex structure.
  • Example: name may have a complex structure consisting of first-name and last-name subelements.
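Value equality on structured nodes can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions, not the paper's formal definition: the helper names value and value_equal are hypothetical, surrounding whitespace in text is ignored, text that follows a child element is not compared, and the sample names are made up.

```python
import xml.etree.ElementTree as ET

def value(node):
    """Nested tuple describing a node's label, text, attributes and children."""
    return (
        node.tag,
        (node.text or "").strip(),              # assumption: ignore surrounding whitespace
        tuple(sorted(node.attrib.items())),     # attributes compared as an unordered set
        tuple(value(child) for child in node),  # children compared in document order
    )

def value_equal(a, b):
    """True when two (possibly distinct) nodes carry the same value."""
    return value(a) == value(b)

# Hypothetical sample data: separate name nodes, two of them with the same content.
a = ET.fromstring("<name><first-name>Ann</first-name><last-name>Lee</last-name></name>")
b = ET.fromstring("<name>\n  <first-name>Ann</first-name>\n  <last-name>Lee</last-name>\n</name>")
c = ET.fromstring("<name><first-name>Ann</first-name><last-name>Ray</last-name></name>")

print(value_equal(a, b), a is b)  # a and b are value-equal but are different nodes
print(value_equal(a, c))          # a and c differ in the last-name subelement
```

The distinction matters for keys: value equality compares the content reached by the key paths, while the conclusion of the key constraint (n1 = n2) is about node identity.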


Examples of Keys

(_*.person, {id})
Any person element, if it has id subelements, is uniquely identified by the values of the id's.

(person, {ε})
Any two person nodes immediately under the root have different values (ε is the empty path).


(employees, {})
An empty key. This means that the path employees, if it exists, is unique at the root. That is, there is at most one employees node immediately under the root.

(_*, {id})
Any element that has id subelements is uniquely identified by the values of the id's.
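The example keys above can be checked mechanically. The following Python sketch follows the formal definition directly; it is an illustration, not the authors' implementation. Assumptions: paths are dotted strings, "_*" matches any sequence of child steps (including none), the empty string plays the role of the empty path, the element passed in acts as the root, and the helper names eval_path, value and satisfies_key are hypothetical.

```python
import xml.etree.ElementTree as ET

def eval_path(node, path):
    """Nodes reached from `node` by a dotted path; the empty string is the empty path."""
    current = [node]
    for step in filter(None, path.split(".")):
        if step == "_*":   # any sequence of child steps, including none
            reached = [d for n in current for d in n.iter()]
        else:              # a single child step with the given label
            reached = [c for n in current for c in n if c.tag == step]
        seen, current = set(), []
        for r in reached:  # drop duplicates while keeping document order
            if id(r) not in seen:
                seen.add(id(r))
                current.append(r)
    return current

def value(node):
    """Value of a node as a nested tuple (assumed notion of value equality)."""
    return (node.tag, (node.text or "").strip(),
            tuple(sorted(node.attrib.items())),
            tuple(value(c) for c in node))

def satisfies_key(n, target, key_paths):
    """Weak key check: two distinct target nodes must not agree on every key path."""
    targets = eval_path(n, target)
    for i, n1 in enumerate(targets):
        for n2 in targets[i + 1:]:
            shares_all = all(
                any(value(z1) == value(z2)
                    for z1 in eval_path(n1, p) for z2 in eval_path(n2, p))
                for p in key_paths
            )
            if shares_all:   # distinct nodes that agree on all key paths
                return False
    return True

# Hypothetical sample document; the db element plays the role of the root.
doc = ET.fromstring(
    "<db><person><id>1</id></person>"
    "<dept><person><id>2</id></person></dept>"
    "<person/><employees/></db>"
)
print(satisfies_key(doc, "_*.person", ["id"]))  # persons anywhere, keyed by id
print(satisfies_key(doc, "person", [""]))       # persons under the root differ in value
print(satisfies_key(doc, "employees", []))      # at most one employees node
print(satisfies_key(doc, "person", []))         # fails: two person nodes under the root
```

Note that with an empty set of key paths the condition on key paths holds trivially, so any two distinct target nodes violate the key. This is why (employees, {}) says that there is at most one employees node.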


Relative Keys

A document satisfies a relative key specification (Q, (Q', S)) if for all nodes n in [[Q]], n satisfies the key (Q', S).

(Q, K) is a relative key if K is a key for every "sub-document" rooted at a node in [[Q]].


Examples of Relative Keys

(bible.book.chapter, (verse, {number}))
A verse number uniquely identifies a verse within a chapter.

(bible.book, (chapter, {number}))
Chapter numbers uniquely identify a chapter within a book.

(bible, (book, {name}))
If there is only one bible node immediately under the root, this is the same as specifying the key (bible.book, {name}).

(ε, (bible, {}))
There is at most one bible node immediately under the root.
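A relative key can be checked by running an ordinary key check at every node selected by the context path. The Python sketch below illustrates this under assumptions: dotted path strings with "_*" as the wildcard, the empty string as the empty path, a wrapper element standing in for the document root, and hypothetical helper names (eval_path, value, satisfies_key, satisfies_relative_key). The sample data is made up.

```python
import xml.etree.ElementTree as ET

def eval_path(node, path):
    """Nodes reached from `node` by a dotted path; the empty string is the empty path."""
    current = [node]
    for step in filter(None, path.split(".")):
        if step == "_*":   # any sequence of child steps, including none
            reached = [d for n in current for d in n.iter()]
        else:              # a single child step with the given label
            reached = [c for n in current for c in n if c.tag == step]
        seen, current = set(), []
        for r in reached:  # drop duplicates while keeping document order
            if id(r) not in seen:
                seen.add(id(r))
                current.append(r)
    return current

def value(node):
    """Value of a node as a nested tuple (assumed notion of value equality)."""
    return (node.tag, (node.text or "").strip(),
            tuple(sorted(node.attrib.items())),
            tuple(value(c) for c in node))

def satisfies_key(n, target, key_paths):
    """Weak key check at node n for the key (target, key_paths)."""
    targets = eval_path(n, target)
    for i, n1 in enumerate(targets):
        for n2 in targets[i + 1:]:
            if all(any(value(z1) == value(z2)
                       for z1 in eval_path(n1, p) for z2 in eval_path(n2, p))
                   for p in key_paths):
                return False
    return True

def satisfies_relative_key(root, context, target, key_paths):
    """(context, (target, key_paths)): the key must hold at every context node."""
    return all(satisfies_key(n, target, key_paths)
               for n in eval_path(root, context))

# Hypothetical sample: chapter number 1 occurs in both books.
doc = ET.fromstring(
    "<root><bible>"
    "<book><name>Genesis</name>"
    "<chapter><number>1</number></chapter><chapter><number>2</number></chapter></book>"
    "<book><name>Exodus</name><chapter><number>1</number></chapter></book>"
    "</bible></root>"
)
print(satisfies_relative_key(doc, "bible.book", "chapter", ["number"]))  # holds within each book
print(satisfies_key(doc, "bible.book.chapter", ["number"]))              # fails across books
```

The two calls show why relative keys are needed: chapter numbers identify a chapter only within its book, not across the whole document.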


Notation for relative keys

The basic syntactic form is

Q1{P1, ..., Pk1}.Q2{P1, ..., Pk2}. ... .Qn{P1, ..., Pkn}

Example:

bible{}.book{name}.chapter{number}.verse{number}


This specifies:

  • (ε, (bible, {}))
  • (bible, (book, {name}))
  • (bible.book, (chapter, {number}))
  • (bible.book.chapter, (verse, {number}))
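The expansion of the compact notation into individual relative keys can be sketched as a small parser. This is an illustrative Python sketch, not a tool from the paper: the function name expand and the regular expression are assumptions, key paths inside the braces are taken to be comma-separated, and the empty path is written as an empty string.

```python
import re

# One segment of the notation: an optional leading dot, a target path, then {key paths}.
SEGMENT = re.compile(r"\.?([^{}]+)\{([^{}]*)\}")

def expand(spec):
    """Expand Q1{...}.Q2{...}. ... into a list of (context, (target, key_paths))."""
    keys = []
    context = []   # target paths of the segments seen so far
    pos = 0
    for m in SEGMENT.finditer(spec):
        if m.start() != pos:   # text between segments that is not part of the syntax
            raise ValueError(f"unexpected text at position {pos}")
        pos = m.end()
        target = m.group(1).strip()
        key_paths = [p.strip() for p in m.group(2).split(",") if p.strip()]
        keys.append((".".join(context), (target, key_paths)))
        context.append(target)
    if pos != len(spec) or not keys:
        raise ValueError("malformed key specification")
    return keys

for context, key in expand("bible{}.book{name}.chapter{number}.verse{number}"):
    print(repr(context), key)
```

Each segment contributes one relative key whose context path is the concatenation of all earlier target paths, which is how the single line for the bible expands into four relative keys.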


Stronger Keys

The definition of keys adopted in the paper is quite weak. To mirror the requirements imposed by a key in relational databases, a stronger definition requires:

1. Uniqueness of a key
2. Equality of key values


Definition. A node n satisfies a key specification (Q, {P1, ..., Pk}) if:

1. for all n' in n[[Q]] and for all Pi (1 <= i <= k), Pi is unique at n'; and
2. for any n1, n2 in n[[Q]], if n1[[Pi]] =v n2[[Pi]] for all i (1 <= i <= k), then n1 = n2.


Examples of Stronger Keys

(_*.person, {id})
Any two person elements, no matter where they occur, have unique id subelements and differ on those elements.

(person, {})
The interpretation of this key remains unchanged under a strong key semantics.


(employees, {})
Again, the semantics of this key is the same with respect to the strong and weak key specifications.

(_*, {k})
This requires that every element has a key k, including any element whose name is k.
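The strong semantics can be sketched in the same style. The Python example below is an illustration under assumptions: "Pi is unique at n'" is read as "the path Pi reaches exactly one node from n'", paths are dotted strings with "_*" as the wildcard, and the helper names eval_path, value and satisfies_strong_key are hypothetical.

```python
import xml.etree.ElementTree as ET

def eval_path(node, path):
    """Nodes reached from `node` by a dotted path; the empty string is the empty path."""
    current = [node]
    for step in filter(None, path.split(".")):
        if step == "_*":   # any sequence of child steps, including none
            reached = [d for n in current for d in n.iter()]
        else:              # a single child step with the given label
            reached = [c for n in current for c in n if c.tag == step]
        seen, current = set(), []
        for r in reached:  # drop duplicates while keeping document order
            if id(r) not in seen:
                seen.add(id(r))
                current.append(r)
    return current

def value(node):
    """Value of a node as a nested tuple (assumed notion of value equality)."""
    return (node.tag, (node.text or "").strip(),
            tuple(sorted(node.attrib.items())),
            tuple(value(c) for c in node))

def satisfies_strong_key(n, target, key_paths):
    """Strong key check: each key path is unique, and key values identify the node."""
    key_values = []
    for t in eval_path(n, target):
        vals = []
        for p in key_paths:
            reached = eval_path(t, p)
            if len(reached) != 1:   # key path missing or not unique at this node
                return False
            vals.append(value(reached[0]))
        key_values.append(tuple(vals))
    # No two target nodes may share the same tuple of key values.
    return len(set(key_values)) == len(key_values)

# Hypothetical sample documents.
ok = ET.fromstring("<db><person><id>1</id></person><person><id>2</id></person></db>")
missing = ET.fromstring("<db><person><id>1</id></person><person/></db>")
print(satisfies_strong_key(ok, "_*.person", ["id"]))       # every person has one distinct id
print(satisfies_strong_key(missing, "_*.person", ["id"]))  # fails: a person has no id
```

The second document would satisfy the weak definition, because a person without id subelements can never agree with another person on id. Under the strong definition it is rejected, since id must exist and be unique at every person.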


Advantages

  • More generic than XML Schema.
  • There is no direct notion of a relative key in XML Schema, but it is covered in this paper.
  • The paper covers alternative XML representations:
    1. Tags expressed as attributes
    2. Introducing a new type


<db>
  <parts>
    <widget>
      <id> 123 </id>
      <weight> 1.5 </weight>
    </widget>
    <widget>
      <id> 234 </id>
      <weight> 2.5 </weight>
    </widget>
  </parts>
</db>


Disadvantages

  • Definition of the target set: in XML Schema it is defined from an arbitrary point, whereas in this paper it is defined from a specific point.
  • Definition of key paths: there is no general method of checking whether two such specifications are equivalent in the proposal.


In defining a key (Q,{P1, ..., Pn}), the language used to describe the target path Q needs to be the same as the language used to define the key paths P1, ..., Pn. One could choose a simpler language for key paths that is a sublanguage of the language for target paths.


Conclusion

  • The paper offers a more generic way of representing keys.
  • The paper addresses the shortcomings of XPath.