XML Structures for Relational Data
Wenyue Du, Mong Li Lee, Tok Wang Ling
Department of Computer Science
School of Computing
National University of Singapore
{duwenyue, leeml, lingtw}@comp.nus.edu.sg
Contents
1. Introduction
– Motivation
– Related Works
– Our Approach
2. Background
– XML
– XML DTD
– Semantic Enrichment
3. Proposed Relational to XML Translation
4. Comparison
5. Conclusion
1. Introduction
Outline
– Motivation
– Related Works
– Our Approach
Motivation
XML is emerging as a standard for information publishing on the World Wide Web. However, the underlying data is often stored in traditional relational databases. Some mechanism is needed to translate the relational data into XML data.
Motivation (cont.)
The translation mechanism should:
– Generate XML structures that describe the semantics and structure of the underlying relational database.
– Produce properly structured XML data without unnecessary redundancies or a proliferation of disconnected XML elements.
Related Works
• [1, 5, 6] focus on translating a single relation. To handle a set of related relations, the relations are first denormalized into one single relation.
– The resulting flat XML structure does not show the structure of the data well.
– It causes a lot of redundancy.
Relations:
Dept(D#, Dname)
Employee(E#, Ename, JoinDate, D#)
Maps to
<!ELEMENT Results (Employee*)>
<!ELEMENT Employee EMPTY>
<!ATTLIST Employee
  E# CDATA #REQUIRED
  Ename CDATA #IMPLIED
  JoinDate CDATA #IMPLIED
  D# CDATA #REQUIRED
  Dname CDATA #IMPLIED>
Related Works (cont.)
• [7] developed a method to generate a hierarchical DTD for XML data from a relational schema.
– It lacks semantic enrichment, so it cannot handle more complex situations.
Relations:
Dept(D#, Dname)
Employee(E#, Ename, JoinDate, D#)
Maps to
<!ELEMENT Results (Employee*)>
<!ELEMENT Employee (Dept)>
<!ATTLIST Employee
  E# ID #REQUIRED
  Ename CDATA #IMPLIED
  JoinDate CDATA #IMPLIED>
<!ELEMENT Dept EMPTY>
<!ATTLIST Dept … >
Is JoinDate an attribute of the object or of the relationship?
Our Approach
XML structures for relational data can be obtained by the following steps:
1. Semantic Enrichment: Relational Schema → Semantically Enriched Relational Schema
2. Translation Rules: Semantically Enriched Relational Schema → ORA-SS Schema Diagram
3. ORA-SS to XML-Schema Algorithm: ORA-SS Schema Diagram → XML-Schema
2. Background
Outline
– XML
– XML DTD
– Semantic Enrichment
XML
Basic constructs of XML:
– Element
– Attribute
– Reference (link): a relationship between resources (e.g. elements). It is specified by attaching specific attributes or sub-elements.
XML DTD
A Document Type Definition (DTD) describes the structure of an XML document.
XML document:
<RESULTS>
  <CUSTOMER CID="C980054Z">
    <CNAME>J. Tan</CNAME>
    <AGE>36</AGE>
  </CUSTOMER>
  …
</RESULTS>
Corresponding DTD:
<!ELEMENT RESULTS (CUSTOMER*)>
<!ELEMENT CUSTOMER (CNAME, AGE)>
<!ATTLIST CUSTOMER
  CID ID #REQUIRED>
<!ELEMENT CNAME (#PCDATA)>
<!ELEMENT AGE (#PCDATA)>
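To make the element/attribute split in the CUSTOMER example concrete, here is a minimal Python sketch that reads the sample document with the standard library. The function name read_customers is an illustrative assumption, the elided rows ("…") are left out, and ElementTree only checks well-formedness; it does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

# Sample document from the slide, with the elided rows left out.
DOC = """
<RESULTS>
  <CUSTOMER CID="C980054Z">
    <CNAME>J. Tan</CNAME>
    <AGE>36</AGE>
  </CUSTOMER>
</RESULTS>
"""

def read_customers(xml_text):
    """Return one dict per CUSTOMER: the CID attribute plus the CNAME and AGE sub-elements."""
    root = ET.fromstring(xml_text)
    customers = []
    for cust in root.findall("CUSTOMER"):
        customers.append({
            "CID": cust.get("CID"),           # declared with ATTLIST, so read as an attribute
            "CNAME": cust.findtext("CNAME"),  # declared with ELEMENT, so read as a child
            "AGE": cust.findtext("AGE"),
        })
    return customers

print(read_customers(DOC))
```

CID is reached with get() because the DTD declares it as an attribute, while CNAME and AGE are reached with findtext() because they are declared as sub-elements.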
Semantic Enrichment
• Semantic enrichment is a process that upgrades the semantics of a database by making explicit the semantics that are implicit in the data, such as the various relationship types, cardinality constraints, etc.
Extra information needed:
• Functional dependencies (FDs) and keys
• Inclusion dependencies (INDs)
e.g. STUDENT(S#, SNAME)
HOBBIES(S#, HOBBY)
HOBBIES[S#] ⊆ STUDENT[S#]
• Semantic dependencies (SDs) (T.W. Ling & M.L. Lee, 1995)
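An inclusion dependency such as the one from HOBBIES[S#] into STUDENT[S#] can be checked against a database instance. The following Python sketch shows one way to do so; the function name holds_ind and the sample tuples are assumptions made for illustration, since the slides give only the schemas.

```python
def holds_ind(child_rows, child_attr, parent_rows, parent_attr):
    """True if every child_attr value in child_rows also occurs as parent_attr in parent_rows."""
    child_values = {row[child_attr] for row in child_rows}
    parent_values = {row[parent_attr] for row in parent_rows}
    return child_values <= parent_values  # subset test

# Hypothetical sample tuples (not from the slides).
STUDENT = [{"S#": "s1", "SNAME": "Ann"}, {"S#": "s2", "SNAME": "Bob"}]
HOBBIES = [{"S#": "s1", "HOBBY": "chess"}, {"S#": "s1", "HOBBY": "tennis"}]

print(holds_ind(HOBBIES, "S#", STUDENT, "S#"))  # every hobby row belongs to a known student
print(holds_ind(STUDENT, "S#", HOBBIES, "S#"))  # but not every student has a hobby row
```

A check of this kind only shows that the dependency holds on the given instance; whether it is a constraint of the schema still has to be confirmed by the designer.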
Semantic Dependencies
EMPLOYEE(E#, ENAME, JOINDATE, D#)
– JOINDATE is functionally dependent on E# alone.
– Assume that JOINDATE is the date on which an employee assumes duty with the department. We then say that JOINDATE is semantically dependent on {E#, D#}.
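The functional dependency in the EMPLOYEE example can be tested mechanically on an instance, as the Python sketch below shows. The function name holds_fd and the sample tuples are illustrative assumptions. The semantic dependency on {E#, D#} reflects the intended meaning of JOINDATE, which this check does not detect.

```python
def holds_fd(rows, lhs, rhs):
    """True if rows that agree on all lhs attributes also agree on all rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this key fixes the expected value; later rows must match it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample tuples (not from the slides).
EMPLOYEE = [
    {"E#": "e1", "ENAME": "Tan", "JOINDATE": "2000-01-03", "D#": "d1"},
    {"E#": "e2", "ENAME": "Lim", "JOINDATE": "2001-06-15", "D#": "d1"},
]

print(holds_fd(EMPLOYEE, ["E#"], ["JOINDATE"]))  # E# determines JOINDATE on this instance
print(holds_fd(EMPLOYEE, ["D#"], ["JOINDATE"]))  # D# alone does not
```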
Semantic Enrichment using SD together with FD and IND
To obtain:
– Object relations and object attributes, which represent regular and weak entity types and their properties.
– Relationship relations and relationship attributes, which represent various relationship types such as binary, n-ary, recursive and ISA (inheritance), and their properties.
– Mix-type relations, which need to be split into object relations and relationship relations.
– Fragments of object relations or relationship relations, which represent multi-valued attributes of entity types or relationship types.
– Cardinality constraints.
An Original Relational Schema
COURSE (CODE, TITLE)
DEPT (D#, DNAME)
STUDENT (S#, SNAME)
TUTORIAL (T#, TUTORIALTITLE)
HOBBIES(S#, HOBBY)
STUDENTDEPT (S#, D#)
C_S (CODE, S#, GRADE)
ATTEND (CODE, T#, S#)
COURSEMEETING (CODE, S#, MEETINGHISTORY)
The Semantically Enriched Schema
Object Relations:
COURSE (CODE, TITLE)
DEPT (D#, DNAME)
STUDENT (S#, SNAME)
TUTORIAL (T#, TUTORIALTITLE)
Fragment of Object Relations:
HOBBIES (S#, HOBBY)
Relationship Relations:
STUDENTDEPT (S#, D#)
C_S (CODE, S#, GRADE)
ATTEND (CODE, T#, S#)
Fragment of Relationship Relations:
COURSEMEETING (CODE, S#, MEETINGHISTORY) // fragment of C_S
3. Proposed Relational to XML Translation
Outline
– ORA-SS Model
– Relational Schema to ORA-SS Translation
– ORA-SS to XML Schema Translation
ORA-SS Model
ORA-SS (Object-Relationship-Attribute model for Semi-Structured data)
G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, National Univ. of Singapore, 2001
Concepts of ORA-SS (cont.)
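[Figure: example ORA-SS schema diagram. Object classes: COURSE (CODE, TITLE), STUDENT (S#, SNAME), TUTORIAL (T#, TUTORIALTITLE). COURSE is the parent of STUDENT1 through the binary relationship C_S (2, 1:n, 1:n), which has the relationship attribute GRADE. STUDENT1 is the parent of TUTORIAL1 through the ternary relationship ATTEND (3, 1:n, 1:n). STUDENT1 and TUTORIAL1 hold the references C_S_Ref and T_Ref. Legend: object class, binary relationship, ternary relationship, identifier, reference, relationship attribute.]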
Enriched Relational Schema to ORA-SS Schema Translation
Objectives:
• Identify object classes and their attributes from object relations
• Identify relationship types and their attributes from relationship relations
• Identify hierarchical structure
• Generate ORA-SS schema
Overview of Translation Rules
1. Object relation rules: to translate object relations
2. Relationship relation rules: to translate relationship relations
3. Combination rule: applied to the results of the object and relationship relation rules to generate the final ORA-SS schema.
Rule O1: Mapping object relations
STUDENT(S#, SNAME)
Maps to
[Figure: object class STUDENT with single-valued attributes S# and SNAME.]
Rule O2: Mapping fragments of object relations
STUDENT(S#, SNAME)
HOBBIES(S#, HOBBY)
Maps to
[Figure: object class STUDENT with attributes S# and SNAME and the multivalued attribute HOBBY*.]
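At the data level, Rule O2 amounts to folding the fragment's tuples into a multivalued attribute of the object. The Python sketch below illustrates this on hypothetical sample tuples; the function name apply_rule_o2 is an assumption, and the sketch assumes the inclusion dependency from HOBBIES[S#] into STUDENT[S#] holds, so every fragment tuple finds its object.

```python
def apply_rule_o2(object_rows, fragment_rows, key, multivalued_attr):
    """Attach the fragment's values to each object as a list-valued attribute."""
    # Start every object with an empty list for the multivalued attribute.
    objects = {row[key]: {**row, multivalued_attr: []} for row in object_rows}
    for frag in fragment_rows:
        # Raises KeyError if the inclusion dependency is violated.
        objects[frag[key]][multivalued_attr].append(frag[multivalued_attr])
    return list(objects.values())

# Hypothetical sample tuples (not from the slides).
STUDENT = [{"S#": "s1", "SNAME": "Ann"}, {"S#": "s2", "SNAME": "Bob"}]
HOBBIES = [{"S#": "s1", "HOBBY": "chess"}, {"S#": "s1", "HOBBY": "tennis"}]

for student in apply_rule_o2(STUDENT, HOBBIES, "S#", "HOBBY"):
    print(student)
```

A student without any HOBBIES tuple keeps an empty list, which matches the zero-or-more reading of HOBBY*.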
Rule R1: Mapping 1-m/1-1 relationship relations
Objectives:
– Reduce disconnected elements: use a parent-child structure.
– Avoid unnecessary redundancies: use references.
Example:
ADVISOR(STAFF#, POSITION) // object relation
STUDENT(S#, SNAME) // object relation
STU_ADV(S#, STAFF#) // 1-m relationship relation
Rule R1: Mapping 1-m/1-1 relationship relations (cont.)
Case 1: All the objects (instances) of STUDENT participate in the relationship type STU_ADV.
Use a parent-child structure.
Maps to
[Figure: ADVISOR is the parent of STUDENT through the relationship STU_ADV (2, 0:n, 1:1).]
Rule R1: Mapping 1-m/1-1 relationship relations (cont.)
Case 2:
1. Not all the objects of STUDENT participate in STU_ADV, or
2. STUDENT is already a child object and all the objects of ADVISOR participate in STU_ADV.
Use a parent-child structure.
Maps to
[Figure: STUDENT is the parent of ADVISOR through the relationship STU_ADV (2, 0:1, 1:n).]
Rule R1: Mapping 1-m/1-1 relationship relations (cont.)
Case 3: There exist objects of both STUDENT and ADVISOR that do not participate in STU_ADV.
Use a reference.
Maps to
[Figure: two alternatives. Either STUDENT is the parent of ADVISOR1 through STU_ADV (2, *, ?), with ADVISOR1 holding the reference A_Ref to ADVISOR; or ADVISOR is the parent of STUDENT1 through STU_ADV (2, *, ?), with STUDENT1 holding the reference S_Ref to STUDENT.]
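The three cases of Rule R1 can be read as a decision on participation. The Python sketch below is one simplified reading, not the authors' algorithm: the function name and return strings are assumptions, Case 2 is taken to require that all ADVISOR objects participate (so that it does not overlap with Case 3), and the "STUDENT is already a child object" condition of Case 2 is deliberately left out.

```python
def choose_r1_structure(all_students_participate, all_advisors_participate):
    """Pick a structure for the 1-m relationship STU_ADV (one ADVISOR, many STUDENTs)."""
    if all_students_participate:
        # Case 1: every STUDENT has an ADVISOR, so STUDENT can be a child of ADVISOR.
        return "parent-child: ADVISOR is the parent of STUDENT"
    if all_advisors_participate:
        # Case 2 (simplified): some STUDENTs have no ADVISOR, but every ADVISOR has a STUDENT.
        return "parent-child: STUDENT is the parent of ADVISOR"
    # Case 3: objects on both sides may be unrelated, so link them with a reference.
    return "reference"

print(choose_r1_structure(True, False))
print(choose_r1_structure(False, True))
print(choose_r1_structure(False, False))
```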
Rule R2: Mapping m-n binary relationship relations
COURSE(CODE, TITLE)
C_S(S#, CODE, GRADE)
STUDENT(S#, SNAME)
Three ways to map:
[Figure: three alternative ORA-SS diagrams. (1) C_S with attribute GRADE shown together with COURSE (CODE, TITLE) and STUDENT (S#, SNAME). (2) COURSE is the parent of STUDENT1 through C_S (2, 1:n, 1:n), with relationship attribute GRADE; STUDENT1 holds the reference C_S_REF to STUDENT. (3) STUDENT is the parent of COURSE1 through C_S (2, 1:n, 1:n), with relationship attribute GRADE; COURSE1 holds the reference C_S_REF to COURSE. One of the diagrams is labelled "Preferred Mapping".]
Other relationship relation rules
– A fragment of a relationship relation is translated in the same way as a fragment of an object relation.
– An n-ary relationship relation is translated using reference structures. The level of each referencing object may be determined by the aggregations.
– If B ISA A, then B is mapped to a child object class (OB) of OA.
Combination Rule: applied to the results of the object and relationship relation rules to generate the final ORA-SS schema.
Example:
PERSON(SSNO, RACE) // object relation
STUDENT(S#, SSNO, MAJOR) // object relation
DEPT(D#, DNAME) // object relation
STU_DEPT(S#, D#) // relationship relation
STUDENT ISA PERSON, and one DEPT has many STUDENTs.
In this case, STUDENT potentially has multiple parents (i.e., DEPT and PERSON).
Combination Rule
Current solution:
Use references (K. Williams, et al., January 2001)
-- It causes too many disconnected elements.
<!ELEMENT Results (PERSON*, STUDENT*, DEPT*)>
<!ELEMENT PERSON EMPTY>
<!ATTLIST PERSON
  SSNO ID #REQUIRED
  RACE CDATA #IMPLIED
  STU_REF1 IDREF #REQUIRED>
<!ELEMENT STUDENT EMPTY>
<!ATTLIST STUDENT
  S# ID #REQUIRED
  MAJOR CDATA #IMPLIED>
<!ELEMENT DEPT EMPTY>
<!ATTLIST DEPT
  D# ID #REQUIRED
  DNAME CDATA #IMPLIED
  STU_REF2 IDREFS #REQUIRED>
Combination Rule (cont.)
This rule is used to avoid or reduce potential multiple parents.
Our approach:
Translations are carried out sequentially in descending order of priority, so the translation with the lowest priority is carried out last.
The priorities of translations (in descending order):
1. ISA and other semantic relationship relations and their fragments // high semantic cohesion among the participating object classes
2. 1-1 and 1-m relationship relations and their fragments // potentially represented as a hierarchy (parent-child) structure
3. m-1 relationship relations and their fragments // potentially represented as a hierarchy structure; preferably viewed as 1-m
4. m-n and n-ary relationship relations and their fragments
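The priority ordering can be sketched as a stable sort over the relationship relations. In the Python sketch below, the PRIORITY table mirrors the four levels on the slide, while the kind labels, the function name translation_order and the (name, kind) encoding are assumptions made for illustration.

```python
# Priority levels from the slide; a smaller number is translated earlier.
PRIORITY = {"ISA": 1, "1-1": 2, "1-m": 2, "m-1": 3, "m-n": 4, "n-ary": 4}

def translation_order(relationships):
    """Sort (name, kind) pairs so that higher-priority kinds are translated first."""
    # sorted() is stable, so relationships of equal priority keep their input order.
    return sorted(relationships, key=lambda rel: PRIORITY[rel[1]])

# The PERSON / STUDENT / DEPT example: ISA outranks the 1-m relationship STU_DEPT.
rels = [("STU_DEPT", "1-m"), ("STUDENT ISA PERSON", "ISA")]
print(translation_order(rels))
```

With this ordering STUDENT is placed under PERSON first, so when STU_DEPT is translated afterwards STUDENT already has a parent and the DEPT side has to fall back on a reference.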
Combination Rule (cont.)
We map STUDENT to the child object class of PERSON first, then map DEPT according to the 1-m relationship relation rule. Thus, we may get the following result.
[Figure: PERSON (SSNO, RACE) is the parent of STUDENT (S#, MAJOR) through ISA (2, 1:?, 1:1). DEPT (D#, DNAME) is the parent of STUDENT1, which holds the reference D_S_REF to STUDENT.]
<!ELEMENT OurSolution (PERSON*, DEPT*)>
<!ELEMENT PERSON (STUDENT)>
<!ATTLIST PERSON
  SSNO ID #REQUIRED
  RACE CDATA #IMPLIED>
<!ELEMENT STUDENT EMPTY>
<!ATTLIST STUDENT
  S# ID #REQUIRED
  MAJOR CDATA #IMPLIED>
<!ELEMENT DEPT EMPTY>
<!ATTLIST DEPT
  D# ID #REQUIRED
  DNAME CDATA #IMPLIED
  D_S_REF IDREFS #REQUIRED>
A possible ORA-SS schema diagram derived from the university database
[Figure: ORA-SS schema diagram. COURSE (CODE, TITLE) is the parent of STUDENT1 through C_S (2, 1:n, 1:n), with relationship attributes GRADE and MEETINGHISTORY* on C_S. STUDENT1 is the parent of TUTORIAL1 through ATTEND (3, 1:n, 1:n). DEPT (D#, DNAME) is the parent of STUDENT2 through STUDENTDEPT (2, 0:n, 1:1). STUDENT (S#, SNAME, HOBBY*) and TUTORIAL (T#, TUTORIALTITLE) are separate object classes. STUDENT1, STUDENT2 and TUTORIAL1 hold the references C_S_REF, D_S_REF and T_REF respectively.]
Algorithm: Mapping ORA-SS Schema Diagram to XML DTD
Input: an ORA-SS schema diagram SD
Output: an XML DTD
Begin
Start from the top of SD and proceed downward. For each object class O encountered do:
Step 1. For the sub-object classes of O, generate <!ELEMENT O (subelementsList)>.
Step 2. For each attribute A of O:
  Case (1) A is a single-valued simple attribute: generate <!ATTLIST O A type>.
  Case (2) A is a single-valued composite attribute: replace A with its components and add them to <!ATTLIST O attributeName type>.
  Case (3) A is a multivalued simple attribute: generate <!ELEMENT A (#PCDATA)>.
  Case (4) A is a multivalued composite attribute: generate <!ELEMENT A EMPTY> and, for A's components, <!ATTLIST A componentName type>.
Step 3. For each relationship attribute A under O, add A to subelementsList in <!ELEMENT O (subelementsList)>.
  Case (1) A is a simple attribute: generate <!ELEMENT A (#PCDATA)>.
  Case (2) A is a composite attribute: generate <!ELEMENT A EMPTY> and, for A's components, <!ATTLIST A componentName type>.
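The top-down traversal of the algorithm can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the classes Attribute and ObjectClass and the function to_dtd are hypothetical names, only simple attributes are handled (the composite cases are omitted), every sub-object class is emitted with a * multiplicity, and multivalued attributes are added to the sub-element list.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    dtd_type: str = "CDATA #IMPLIED"  # text that follows the name inside ATTLIST
    multivalued: bool = False

@dataclass
class ObjectClass:
    name: str
    attributes: list = field(default_factory=list)
    rel_attributes: list = field(default_factory=list)  # relationship attributes under this class
    children: list = field(default_factory=list)        # sub-object classes

def to_dtd(obj):
    """Return the DTD declarations for obj and, recursively, its sub-object classes."""
    single = [a for a in obj.attributes if not a.multivalued]
    multi = [a for a in obj.attributes if a.multivalued]
    # Steps 1 and 3: collect the sub-elements of O.
    subelements = [a.name + "*" for a in multi]
    subelements += [a.name + ("*" if a.multivalued else "") for a in obj.rel_attributes]
    subelements += [c.name + "*" for c in obj.children]
    content = "(" + ", ".join(subelements) + ")" if subelements else "EMPTY"
    lines = ["<!ELEMENT %s %s>" % (obj.name, content)]
    # Step 2, case (1): single-valued simple attributes become XML attributes.
    if single:
        decls = " ".join("%s %s" % (a.name, a.dtd_type) for a in single)
        lines.append("<!ATTLIST %s %s>" % (obj.name, decls))
    # Step 2, case (3) and Step 3, case (1): these attributes become #PCDATA elements.
    for a in multi + obj.rel_attributes:
        lines.append("<!ELEMENT %s (#PCDATA)>" % a.name)
    # Proceed downward through the schema diagram.
    for child in obj.children:
        lines.extend(to_dtd(child))
    return lines

student = ObjectClass("STUDENT", [
    Attribute("S#", "ID #REQUIRED"),
    Attribute("SNAME"),
    Attribute("HOBBY", multivalued=True),
])
print("\n".join(to_dtd(student)))
```

The attribute types are passed in as ready-made strings such as "ID #REQUIRED" because the algorithm as stated leaves open how each type is chosen.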
Algorithm: Mapping ORA-SS Schema Diagram to XML DTD
The obtained XML structures (DTD)
<!ELEMENT UNIVERSITY (COURSE*, STUDENT*, DEPT*, TUTORIAL*)>
<!ELEMENT COURSE (STUDENT1*)>
<!ATTLIST COURSE
  CODE ID #REQUIRED
  TITLE CDATA #IMPLIED>
<!ELEMENT STUDENT1 (MEETINGHIS*, TUTORIAL1*)>
<!ATTLIST STUDENT1
  C_S_REF IDREF #REQUIRED
  GRADE CDATA #IMPLIED>
<!ELEMENT MEETINGHIS (#PCDATA)>
<!ELEMENT TUTORIAL1 EMPTY>
<!ATTLIST TUTORIAL1
  T_REF IDREF #REQUIRED>
<!ELEMENT STUDENT (HOBBIES*)>
<!ATTLIST STUDENT
  S# ID #REQUIRED
  SNAME CDATA #IMPLIED>
<!ELEMENT HOBBIES (#PCDATA)>
<!ELEMENT DEPT (STUDENT2*)>
<!ATTLIST DEPT
  D# ID #REQUIRED
  DNAME CDATA #IMPLIED>
<!ELEMENT STUDENT2 EMPTY>
<!ATTLIST STUDENT2
  D_S_REF IDREF #IMPLIED>
<!ELEMENT TUTORIAL EMPTY>
<!ATTLIST TUTORIAL
  T# ID #REQUIRED
  TUTORIAL_TITLE CDATA #IMPLIED>
4. Comparison
Richly structured and represents the real world accurately:
– Yes: [7], this paper
– Partially: [3]
– No: [1, 5, 6]
Representation of various relationship types and their attributes:
– Yes: this paper
– Partially: [7]
– No: [1, 3, 5, 6]
Number of disconnected elements:
– Few: [7], this paper
– Many: naïve approaches
Unnecessary redundancies:
– Avoidable: this paper
– Partially: [3, 7]
– Many: [1, 5, 6]
5. Conclusion
The method proposed in this paper makes possible:
– Generation of semantically sound XML structures for relational data
– Generation of properly structured XML data without unnecessary redundancies or a proliferation of disconnected XML elements
References
[1] S. Banerjee, et al., “Oracle 8i – The XML Enabled Data Management System”, Proc. 16th Int’l Conf. on Data Engineering, 2000
[2] G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, NUS, 2001
[3] D.W. Lee, M. Mani, F. Chiu, W.W. Chu, “Nesting-based Relational-to-XML Schema Translation”, Proc. 4th Int’l Workshop on Web and Databases, 2001
[4] T.W. Ling, M.L. Lee, “Relational to Entity-Relationship Schema Translation Using Semantic and Inclusion Dependencies”, Journal of Integrated Computer-Aided Engineering, pages 125-145, 1995
[5] SYBASE, “Using XML with the Sybase Adaptive Server SQL Databases, A Technical Whitepaper”, http://www.sybase.com, 2000
[6] V. Turau, “Making Legacy Data Accessible for XML Applications”, http://www.informatik.fh-wiesbaden.de/~turau/veroeff.html, 1999
[7] K. Williams, et al., “XML Structures for Existing Databases”, http://www-106.ibm.com/developerworks/library/x-struct/, January 2001
[8] W.Y. Du, M.L. Lee, T.W. Ling, “XML Structures for Relational Data”, Proc. 2nd Int’l Conf. on Web Information Systems Engineering (WISE), IEEE Computer Society, 2001