CMPT 354, Simon Fraser University, Fall 2008, Martin Ester


Database Systems I

The Semistructured Data Model


The Web Today

• HTML documents
    • generated by humans or by applications,
    • consumed by humans only,
    • easy access: across platforms, across organizations,
    • only layout, no semantic information.
• Limited application interoperability
    • HTML: not understood by applications; at most, some heuristic rules.
    • Database technology: SQL is a standard, but implementations still have many vendor-specific aspects.


XML Data Exchange Format

• A standard from the W3C (World Wide Web Consortium, http://www.w3.org).
• The mission of the W3C: “. . . developing common protocols that promote its evolution and ensure its interoperability . . .”.
• Basic ideas
    • XML = data,
    • XML generated by applications,
    • XML consumed by applications,
    • easy access: across platforms, organizations.


Paradigm Shift on the Web

• For web search engines:
    • from documents (HTML) to data (XML),
    • from document management to document understanding (e.g., question answering),
    • from information retrieval to data management.
• For database systems:
    • from the relational (structured) model to semistructured data,
    • from data processing to data / query translation,
    • from storage to transport.


The Semistructured Data Model

Developed by the DBS community to address the following emerging issues:

• Data sets with non-rigid structure
    • biological data: sequence data, 3D data, text data . . . and their relationships,
    • Web data.
• Integration of heterogeneous sources
    • not only, but especially, for Web data and biological data.


The Semistructured Data Model

• Data is self-describing, i.e. the data description is integrated with the data itself rather than kept in a separate schema.
• The database is a collection of nodes and arcs (a directed graph).
• Leaf nodes represent data of some atomic type (atomic objects, such as numbers or strings).
• Interior nodes represent complex objects consisting of components (child nodes), which are connected to the node by arcs.
• Arcs are directed and connect two nodes.


The Semistructured Data Model

• Arc labels indicate the relationship between the two corresponding nodes.
• The root node is the only interior node without in-arcs; it represents the entire database.
• All database objects are children of the root node.
• Every node must be reachable from the root.
• A general graph structure is possible, i.e. the graph need not be a tree.
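
The definition above can be sketched in a few lines of Python. This is only an illustration and not part of the lecture: the Node class and the reachable helper are hypothetical names, and the tiny database loosely follows the bibliography example used in the following slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of the data graph: atomic if `value` is set, complex otherwise."""
    oid: str
    value: object = None                        # atomic value for leaf nodes
    arcs: list = field(default_factory=list)    # outgoing (label, child) pairs

    def add(self, label, child):
        """Add a directed, labeled arc from this node to `child`."""
        self.arcs.append((label, child))
        return child


def reachable(root):
    """Return the oids of all nodes reachable from `root` along directed arcs."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node.oid in seen:
            continue
        seen.add(node.oid)
        stack.extend(child for _, child in node.arcs)
    return seen


# Hypothetical mini database: one paper that references a book.
root = Node("&o1")
book = root.add("book", Node("&o24"))
paper = root.add("paper", Node("&o29"))
paper.add("author", Node("&o52", "Abiteboul"))
paper.add("references", book)    # second in-arc for the book: not a tree

print(sorted(reachable(root)))
```

Because the book node has two incoming arcs, the structure is a general graph rather than a tree, while every node remains reachable from the root.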


Graphical Representation

[Figure: the bibliography database Bib drawn as a directed graph. The root &o1 has outgoing arcs labeled paper, book and paper leading to the complex objects &o12, &o24 and &o29. Interior nodes are complex objects; leaf nodes such as “Serge”, “Abiteboul”, 1997, “Victor”, “Vianu”, 122 and 133 are atomic objects. Arc labels include author, title, year, http, publisher, references, page, firstname, lastname, first and last.]


Textual Representation

Example:

Bib: &o1 { paper: &o12 { … },
           book: &o24 { … },
           paper: &o29 {
               author: &o52 “Abiteboul”,
               author: &o96 { firstname: &o243 “Victor”,
                              lastname: &o206 “Vianu” },
               title: &o93 “Regular path queries with constraints”,
               references: &o12,
               references: &o24,
               pages: &o25 { first: &o64 122, last: &o92 133 } } }

The notation uses nested tuples, set values and object identifiers (oids).


Textual Representation

Simplified textual representation: the oids can be omitted.

{ paper: { author: “Abiteboul”,
           author: { firstname: “Victor”, lastname: “Vianu” },
           title: “Regular path queries …”,
           page: { first: 122, last: 133 } } }
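
As a rough sketch of how this notation maps to an in-memory structure, the Python snippet below encodes a complex object as a list of (label, value) pairs. A list is used instead of a dictionary because a label such as author may occur several times. The helper names children and to_text are hypothetical and chosen only for this illustration.

```python
# A complex object is a list of (label, value) pairs, because a label such as
# "author" may occur more than once; an atomic object is a plain str or int.
paper = [
    ("author", "Abiteboul"),
    ("author", [("firstname", "Victor"), ("lastname", "Vianu")]),
    ("title", "Regular path queries …"),
    ("page", [("first", 122), ("last", 133)]),
]
db = [("paper", paper)]


def children(obj, label):
    """Return all values reached from complex object `obj` via arcs named `label`."""
    return [value for name, value in obj if name == label]


def to_text(obj, indent=0):
    """Render an object in the simplified textual notation (oids omitted)."""
    if not isinstance(obj, list):
        return f'"{obj}"' if isinstance(obj, str) else str(obj)
    pad = " " * (indent + 2)
    parts = [f"{pad}{name}: {to_text(value, indent + 2)}" for name, value in obj]
    return "{\n" + ",\n".join(parts) + "\n" + " " * indent + "}"


print(len(children(paper, "author")))   # number of author arcs
print(to_text(db))
```

The two author values have different shapes (a string and a nested object), which is exactly the kind of irregularity the model permits.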


Comparison with Relational Model

Characteristics of semistructured data:

• Missing attributes
• Additional attributes
• Multiple attribute values (set-valued attributes)
• Objects as attribute values
• No global schema

Only the first characteristic (missing attributes, via NULL values) is supported by the relational model; the others are not.


Comparison with Relational Model

Semistructured data        Relational DB
Self-describing            Separate schema
Irregular data             Regular data
No a-priori structure      A-priori structure


Comparison with Relational Model

Example: a relation and its semistructured representation.

name    phone
John    3634
Sue     6343
Dick    6363

{ row: { name: “John”, phone: 3634 },
  row: { name: “Sue”, phone: 6343 },
  row: { name: “Dick”, phone: 6363 } }

[Figure: the same data as a tree. The root has three arcs labeled row; each row node has a name arc and a phone arc leading to the atomic values.]
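
The encoding of a relation as semistructured data can be sketched as follows. The function names are hypothetical, and a complex object is modeled as a list of (label, value) pairs, which is an assumption of this sketch rather than something the slides prescribe.

```python
def relation_to_semistructured(columns, rows):
    """Encode each tuple as a ("row", [...]) arc with one (column, value) arc per attribute."""
    return [("row", [(col, val) for col, val in zip(columns, row)]) for row in rows]


def semistructured_to_relation(db, columns):
    """Decode back to tuples; a missing attribute becomes None (a NULL)."""
    table = []
    for label, obj in db:
        if label != "row":
            continue
        fields = dict(obj)          # assumes at most one value per column
        table.append(tuple(fields.get(col) for col in columns))
    return table


columns = ("name", "phone")
rows = [("John", 3634), ("Sue", 6343), ("Dick", 6363)]
db = relation_to_semistructured(columns, rows)
print(db[0])
print(semistructured_to_relation(db, columns) == rows)
```

Going from a relation to semistructured data is always possible. The reverse direction only works cleanly when the data happens to be regular, which is why the decoder has to assume single-valued columns and fill gaps with None.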


XML

• A W3C standard for an Extensible Markup Language.
• Origins: structured text, SGML (Standard Generalized Markup Language).
• Motivation
    • HTML describes presentation only; XML describes content and its meaning (semantics).
    • HTML is a fixed language; XML allows you to define your own markup languages.

[Figure: diagram relating SGML, XML and HTML.]


From HTML to XML

HTML describes the presentation / layout


From HTML to XML

HTML example

<h1> Bibliography </h1>
<p> <i> Foundations of Databases </i>
    Abiteboul, Hull, Vianu
    <br> Addison Wesley, 1995
<p> <i> Data on the Web </i>
    Abiteboul, Buneman, Suciu
    <br> Morgan Kaufmann, 1999


From HTML to XML

XML example

<bibliography>
  <book>
    <title> Foundations… </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
  …
</bibliography>

XML describes the content


Elements

• Tags: book, title, author, …
    • start tag: <book>, end tag: </book>,
    • defined by the user / programmer (different from HTML!).
• Elements: <book>…</book>, <author>…</author>
    • An element consists of a matching start and end tag and the enclosed content.
    • Elements can be nested, i.e. the content of one element can consist of a sequence of other elements.
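
To see the nesting of elements from a program's point of view, the following sketch parses a small bibliography with Python's standard xml.etree.ElementTree module. The document text is adapted from the slides, and the outline helper is a hypothetical name introduced only here.

```python
import xml.etree.ElementTree as ET

doc = """
<bibliography>
  <book>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(doc)


def outline(element, depth=0):
    """Return one indented line per element, showing the nesting of tags."""
    lines = ["  " * depth + element.tag]
    for child in element:               # iterating yields the nested elements
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(root)))
authors = [a.text for a in root.iter("author")]
print(authors)
```

Each element object exposes its tag, its text content and its child elements, which mirrors the definition of an element as a start tag, an end tag and the enclosed content.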


Attributes

Attributes can be associated with any element.

Provide additional information about elements.

Attributes can have only one value.

Example

<book price="55" currency="USD">
  <title> Foundations of Databases </title>
  <author> Abiteboul </author>
  <year> 1995 </year>
</book>

Attributes can also be used to connect elements.


Non-tree-like XML

• So far: only tree-like XML documents, i.e. each element is nested within at most one other element.
• Attributes can also be used to create non-tree XML documents.
• Attributes with a domain of ID serve as primary keys of elements.
• Attributes with a domain of IDREF serve as foreign keys referencing the ID of another element.


Non-tree-like XML

Example of a non-tree structure

<persons>
  <person personid="o555">
    <name> Jane </name>
  </person>
  <person personid="o456">
    <name> Mary </name>
    <children refs="o123 o555"></children>
  </person>
  <person personid="o123" mother="o456">
    <name> John </name>
  </person>
</persons>
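
A program has to follow such references itself. The sketch below uses Python's standard xml.etree.ElementTree and treats personid as an ID and refs / mother as IDREFS / IDREF purely by convention, since this parser does not read attribute types from a DTD. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = """
<persons>
  <person personid="o555"><name>Jane</name></person>
  <person personid="o456">
    <name>Mary</name>
    <children refs="o123 o555"/>
  </person>
  <person personid="o123" mother="o456"><name>John</name></person>
</persons>
"""

root = ET.fromstring(doc)

# Index every person by its ID-like attribute (the "primary key").
by_id = {p.get("personid"): p for p in root.iter("person")}
if len(by_id) != len(list(root.iter("person"))):
    raise ValueError("duplicate personid")


def name_of(person):
    """Return the text of the person's <name> child."""
    return person.findtext("name").strip()


def children_of(person):
    """Follow the IDREFS-like attribute `refs` to the referenced person elements."""
    node = person.find("children")
    refs = node.get("refs", "").split() if node is not None else []
    return [by_id[r] for r in refs]      # KeyError signals a dangling reference


mary = by_id["o456"]
print([name_of(c) for c in children_of(mary)])
print(name_of(by_id[by_id["o123"].get("mother")]))
```

The dictionary lookup plays the role of a foreign-key join: John is nested only inside persons, yet he is reachable both from the root and through Mary's refs attribute.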


Namespaces

• An XML document can involve tags that come from multiple sources.
• One and the same tag can appear in more than one source.

<table>
  <tr> <td>Apples</td> <td>Bananas</td> </tr>
</table>

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>


Namespaces

Name conflicts can be resolved by prefixing tag names according to their source.

<h:table>
  <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
</h:table>

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

• When using prefixes in XML, a namespace for the prefix must be defined.
• The namespace must be referenced (via a URI) in the start tag of an enclosing element.


Namespaces

<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr> . . . </h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    . . .
  </f:table>
</root>

Or alternatively:

<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
  <h:table> . . . </h:table>
  <f:table> . . . </f:table>
</root>
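
The effect of namespace declarations on a parser can be observed with Python's standard xml.etree.ElementTree, as in the sketch below. The document is assembled from the two table examples in the slides; the prefixes html and furn in the lookup mapping are arbitrary choices made for this illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
  <h:table>
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
"""

root = ET.fromstring(doc)

# The parser replaces each prefix by its URI: tags become "{uri}localname".
tags = [child.tag for child in root]

# For lookups, supply your own prefix-to-URI mapping; prefixes are arbitrary.
ns = {"html": "http://www.w3.org/TR/html4/",
      "furn": "http://www.w3schools.com/furniture"}
cells = [td.text for td in root.findall("html:table/html:tr/html:td", ns)]
furniture_name = root.findtext("furn:table/furn:name", namespaces=ns)
print(tags)
print(cells, furniture_name)
```

The two table elements end up with different qualified names, so the conflict disappears. Only the URI string matters for identity; the prefix used in the document and the prefix used in the query need not agree.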


Namespaces

A URI is a Uniform Resource Identifier, typically a URL.

The document referenced by the URI describes the meaning of the tags in the namespace.

This description is informal and is not used by the XML parser.

The description can even be empty.


Well-Formed XML

A well-formed XML document satisfies the following conditions:

Begins with a declaration that it is XML.

Has a single root element that encloses the whole document.

Consists of properly nested elements, i.e. start and end tag of an element are within the same enclosing element.

standalone="yes" states that the document has no DTD.

In this mode, you can invent your own tags, as in the semistructured data model.


Well-Formed XML

<?xml version="1.0" standalone="yes"?>
<bibliography>
  <book>
    <title> Foundations… </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
  <book>
    <title> … </title>
    . . .
  </book>
  …
</bibliography>


Well-Formed XML

HTML browsers will display documents with errors (like missing end tags).

The W3C XML specification states that a program should stop processing an XML document if it finds an error.

The main reason is that XML is consumed by programs rather than by humans (as HTML is).

W3C provides a validator that checks whether an XML document is well-formed.
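
The strict error handling is easy to observe with any XML parser. As an illustration, the sketch below uses Python's standard xml.etree.ElementTree, which raises ParseError at the first well-formedness error. The helper is_well_formed and the sample strings are made up for this example, and the check covers well-formedness only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, None) if `text` parses, else (False, error message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:        # raised at the first error encountered
        return False, str(err)


good = "<book><title>Foundations of Databases</title></book>"
missing_end_tag = "<book><title>Foundations of Databases</book>"
two_roots = "<book/><book/>"

for sample in (good, missing_end_tag, two_roots):
    print(is_well_formed(sample))
```

The second sample violates proper nesting and the third has no single root element, so both are rejected outright instead of being repaired the way a browser would repair HTML.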


Valid XML

The validator can also check whether an XML document is valid, i.e. conforms to a Document Type Definition (DTD).

A DTD specifies the allowable tags and how they can be nested.

XML with a DTD is no longer semistructured (self-describing).

However, a DTD is less rigid than the schema of a relational DB. E.g., a DTD allows missing and multiple attributes / elements.


Document Type Definitions

• Document Type Definition (DTD): a set of rules (a grammar) specifying elements, attributes and all other aspects of XML documents.
• For each element, specify its name and content type.
• The content type can be, e.g.,
    • #PCDATA (character string),
    • other elements,
    • a regular expression made of the above content types:
        *  zero or more occurrences
        ?  zero or one occurrence
        +  one or more occurrences
        ,  sequence of elements


Document Type Definitions

<!ELEMENT Book (title, author*)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (name, address, age?)>
<!ATTLIST Book id ID #REQUIRED>
<!ATTLIST Book pub IDREF #IMPLIED>

Specification of an element type:
    “<!ELEMENT” <Name> <Content> “>”

Specification of attributes:
    “<!ATTLIST” <ElementName> <AttributeName> <Content> <Type> “>”

The attribute type is either #REQUIRED or #IMPLIED (optional).
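
Since an element's content type is essentially a regular expression over child element names, the check a validator performs can be approximated with an ordinary regex engine. The sketch below is a simplified illustration and not a DTD validator: content_model_to_regex and children_match are hypothetical helpers, #PCDATA and mixed content are not handled, and Python's standard library does not itself validate against DTDs.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate a DTD element-content model into a regex over 'tag ' tokens."""
    out = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|*?+]", model):
        if token == ",":
            continue                        # sequence = plain concatenation
        elif token == "(":
            out.append("(?:")
        elif token in ")|*?+":
            out.append(token)               # same meaning in both notations
        else:
            out.append(f"(?:{re.escape(token)} )")   # one child element
    return re.compile("".join(out))


def children_match(element, model):
    """Check the sequence of child tags of `element` against `model`."""
    tags = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(tags) is not None


book_model = "(title, author*)"
ok = ET.fromstring("<Book><title/><author/><author/></Book>")
bad = ET.fromstring("<Book><author/><title/></Book>")
print(children_match(ok, book_model), children_match(bad, book_model))
```

The second Book is rejected because the content model fixes the order: title must come first, followed by any number of author elements.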


Document Type Definitions

• ID: domain with unique values within the given document.
• IDREF: references one ID.
• IDREFS: references a list of IDs.

Example

<Book id="book1" pub="book5" . . .>
. . .
<Book id="book5" pub="book4" . . .>


Document Type Definitions

Document type contains all corresponding element types:

    “<!DOCTYPE” <Name> “[” <ElementTypes> “]>”

Use of a DTD by some document:

• reference the DTD in the document opening lines,
• standalone="no".

Example

<?xml version="1.0" standalone="no"?>
<!DOCTYPE Book SYSTEM "Book.dtd">


Example DTD: Product Catalog

<!DOCTYPE CATALOG [
<!ELEMENT CATALOG (PRODUCT+)>
<!ELEMENT PRODUCT (SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)>
<!ATTLIST PRODUCT NAME CDATA #IMPLIED
          CATEGORY (HandTool|Table|Shop-Professional) "HandTool"
          PARTNUM CDATA #IMPLIED
          PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago"
          INVENTORY (InStock|Backordered|Discontinued) "InStock">
<!ELEMENT SPECIFICATIONS (#PCDATA)>
<!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED
          POWER CDATA #IMPLIED>
<!ELEMENT OPTIONS (#PCDATA)>
<!ATTLIST OPTIONS FINISH (Metal|Polished|Matte) "Matte"
          ADAPTER (Included|Optional|NotApplicable) "Included"
          CASE (HardShell|Soft|NotApplicable) "HardShell">
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST PRICE MSRP CDATA #IMPLIED
          WHOLESALE CDATA #IMPLIED
          STREET CDATA #IMPLIED
          SHIPPING CDATA #IMPLIED>
<!ELEMENT NOTES (#PCDATA)>
]>


XML Schema

The successor of DTDs to specify a schema for XML documents.

A W3C standard.

Includes and extends functionality of DTDs.

In particular, XML Schemas support data types. This makes it easier to validate the correctness of data and to work with data from a database.

XML Schemas are written in XML. You don't have to learn a new language and can use your XML parser to parse your Schema files.


Simple Elements

Simple elements contain only text.

They can have one of the built-in datatypes: xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time.

Example

<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>


Simple Elements

Restrictions allow you to further constrain the content of simple elements.

<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
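
What a schema validator checks for this age element can be approximated by hand. The sketch below is a simplified stand-in, not a schema validator: check_age is a hypothetical function, and the lexical pattern only approximates xs:integer (an optional sign followed by decimal digits, with surrounding whitespace ignored).

```python
import re


def check_age(text, min_inclusive=0, max_inclusive=120):
    """Approximate the xs:integer restriction above for the element content `text`."""
    text = text.strip()                     # xs:integer collapses whitespace
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        return False                        # not in the lexical space of xs:integer
    return min_inclusive <= int(text) <= max_inclusive


for sample in ("42", "0", "120", "121", "-1", "4.5", "forty"):
    print(sample, check_age(sample))
```

The check has two stages, mirroring the schema: first the base type (is the content an integer at all?), then the facets minInclusive and maxInclusive that narrow its value range.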


Attributes

Attributes can be specified using the attribute element:

<xs:attribute name="xxx" type="yyy"/>

Attribute declarations are nested within the complexType of the element with which they are associated.

By default, attributes are optional.

To make an attribute mandatory, use

<xs:attribute name="lang" type="xs:string" use="required"/>

Attributes can have the same built-in datatypes as simple elements.


Complex Elements

Complex elements can contain other elements and can have attributes.

Nested elements need to occur in the order specified.

The number of repetitions of an element is controlled by the attributes minOccurs and maxOccurs. The default for both is 1.

A complex element with an attribute:

<xs:element name="product">
  <xs:complexType>
    <xs:attribute name="prodid" type="xs:positiveInteger"/>
  </xs:complexType>
</xs:element>


Complex Elements

A complex element containing a sequence of nested (simple) elements:

<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>


Complex Elements

If you name the complex type, other elements can reference and reuse it:

<xs:complexType name="persontype">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:element name="person" type="persontype"/>


XML Document With Schema

An XML document that uses a schema has to reference the schema in the schemaLocation attribute of its root element:

<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>


Example XML Schema

<schema version="1.0" xmlns="http://www.w3.org/1999/XMLSchema">
  <element name="author" type="string" />
  <element name="date" type="date" />
  <element name="abstract">
    <type> … </type>
  </element>
  <element name="paper">
    <type>
      <attribute name="keywords" type="string"/>
      <element ref="author" minOccurs="0" maxOccurs="*" />
      <element ref="date" />
      <element ref="abstract" minOccurs="0" maxOccurs="1" />
      <element ref="body" />
    </type>
  </element>
</schema>

Note: this example uses the namespace and syntax of a 1999 working draft. The final W3C Recommendation uses the namespace http://www.w3.org/2001/XMLSchema and maxOccurs="unbounded" instead of "*".


XML vs. Semistructured Data

• Both are described best by a graph.
• Both are schema-less and self-describing (XML without DTD / XML Schema).
• XML is ordered, semistructured data is not.
• XML can mix text and elements:

<talk> Making Java easier to type and easier to type
  <speaker> Phil Wadler </speaker>
</talk>

• XML has lots of other stuff: attributes, entities, processing instructions, comments.
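
Mixed content and ordering show up directly in a parser's API. The sketch below uses Python's standard xml.etree.ElementTree on the talk example; the trailing text "at 10am" is a made-up addition so that there is also text after the nested element.

```python
import xml.etree.ElementTree as ET

# The trailing "at 10am" is a made-up addition to show text after a child element.
doc = ("<talk>Making Java easier to type and easier to type "
       "<speaker>Phil Wadler</speaker> at 10am</talk>")
talk = ET.fromstring(doc)
speaker = talk.find("speaker")

print(repr(talk.text))       # text before the first child element
print(repr(speaker.text))    # content of the nested element
print(repr(speaker.tail))    # text that follows the child inside <talk>

# Document order is preserved, so the pieces can be reassembled in sequence.
pieces = list(talk.itertext())
print("".join(pieces))
```

In the graph-based semistructured model there is no counterpart to the tail text or to the order of the pieces, which is one reason a faithful XML data model needs more than labeled arcs.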


Summary

• Due to their variable and complex structure, Web documents cannot naturally be modeled using the relational model.
• The Semistructured Data Model is a self-describing data model providing sufficient flexibility for representing Web documents.
• One of the weaknesses of the Web is that (HTML) documents cannot be processed automatically.
• The purpose of XML is to provide a way of recording the semantics of Web documents and their components. To this end, XML allows you to define your own application-specific tags.


Summary

• XML documents are lists of elements and attributes.
• Elements can be nested to form tree-like structures.
• Non-hierarchical structures are also possible.
• Document type definitions (DTDs) are similar to but less restrictive than DB schemas, specifying rules that corresponding XML documents have to satisfy.
• XML schemas are a more recent and more DB-like extension of DTDs.