eXtensible Stylesheet Language
Department of Computer Science, Institute for System Architecture, Chair for Computer Networks

Scenario of document generation
id  name     first name  title      room
1   Hawking  Stephen     Professor  42
2   Newton   Isaac       Sir        43
3   Kepler   Johannes    Professor  71
…   …        …           …          …
[Figure: processing chain with the stages data extraction, data representation, transformation, transfer and interpretation; a web server delivers XHTML. Content + Style / Layout = Presentation]

Layer Model
• The process can be viewed as a layer model
• Data in the storage layer is extracted, transformed into a presentation format by the processing layer, and forwarded to the user
Storage Layer: RDBMS, file system, XML repository, …
Processing Layer: eXtensible Markup Language, style sheets, logic
Presentation Layer: (X)HTML, PDF, PostScript, WML, RTF, … (shown by a browser, printer, PDF reader, mobile application, …)

Extensible Stylesheet Language
• The eXtensible Stylesheet Language (XSL) is a set of W3C recommendations for XML transformation and presentation that is widely used to generate arbitrary documents
• It consists of:
  1. XML Path Language (XPath): an expression language for addressing parts of an XML document
  2. XSL Transformations (XSLT): a language for transforming one XML representation into another
  3. XSL Formatting Objects (XSL-FO): an XML vocabulary for specifying formatting semantics
[Figure: in the processing layer, XML is used as the integrative data representation, together with XSL stylesheets, XSLT and XSL-FO processors, XPath, …]

XML Path Language
• The XML Path Language (XPath) is a non-XML syntax for addressing parts of an XML document
• It uses path expressions to navigate in XML trees and to select nodes or sets of nodes
• A location step consists of three parts:
    Step = AxisName '::' NodeTest ( '[' Expression ']' )*
  – Axis: specifies the tree relationship between the nodes selected by the location step and the context node (e.g. “child”, “parent”, “attribute”, “self”)
  – Node test: specifies the node type or name of the selected nodes
  – Predicates: zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step
• Examples:
  – child::person[child::name] selects all “person” children of the context node that have a child of type “name”
  – /descendant::person[position()=23] selects the twenty-third “person” element in the document (absolute location path)

XML Path Language
• Besides the expanded syntax there is an abbreviated form
• Some important abbreviated location path expressions are:
  nodename
      Selects all child elements named “nodename” of the current node
      Example: “title” selects all “title” child elements
  nodename1/nodename2
      Selects the “nodename2” children of the “nodename1” children
      Example: “document/title” selects all “title” child nodes of “document” elements
  ../nodename
      Selects the “nodename” children of the parent of the current node
      Example: “../title” moves to the parent of the current node and selects its child element “title”
  nodename/@attr
      Selects the attribute “attr” of the children named “nodename”
      Example: “document/@id” selects the attribute “id” of the element “document”
  .
      Selects the current node
  .//nodename
      Selects all elements named “nodename” that are descendants of the current node (at arbitrary depth in the XML tree)
      Example: “.//title” selects all “title” elements that are descendants of the current node (e.g. “document”)
  *
      Selects all child elements of the current node
  /path
      A leading “/” introduces an absolute path that starts at the root node
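The abbreviated expressions can be tried out with Python's standard library module xml.etree.ElementTree, which implements only a subset of XPath. The following is a minimal sketch under that assumption; the sample document and all variable names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML = """
<library>
  <document id="d1">
    <title>XSL Basics</title>
    <chapter><title>XPath</title></chapter>
    <chapter><title>XSLT</title></chapter>
  </document>
  <document id="d2">
    <title>XML Primer</title>
  </document>
</library>
"""

root = ET.fromstring(XML)            # context node: the <library> element
first_doc = root.find("document")    # "nodename": first matching child element

# "nodename1/nodename2": title children of the document children
doc_titles = [t.text for t in root.findall("document/title")]

# ".//nodename": title elements at any depth below the context node
all_titles = [t.text for t in root.findall(".//title")]

# "*": all child elements of the context node
child_tags = [c.tag for c in first_doc.findall("*")]

# Predicate: keep only the documents that have a chapter child
with_chapters = [d.get("id") for d in root.findall("document[chapter]")]

# "nodename/@attr": ElementTree cannot return attribute nodes from a path,
# so the attribute is read from each selected element instead
doc_ids = [d.get("id") for d in root.findall("document")]

# Position predicate, comparable to chapter[position()=2]
second_chapter = first_doc.find("chapter[2]/title").text
```

A full XPath processor would additionally accept the expanded axis syntax (e.g. child::title), which ElementTree does not support.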

XML Path Language
• XPath is the conceptual basis of further W3C specifications (XSLT, XQuery, XPointer, XLink):
  – XQuery: language for querying information in XML documents
    Example: fn:count(//book)
    Counts all “book” elements in an XML file
  – XML Pointer Language (XPointer): language for pointing to specific parts of an XML document
    Example: xlink:href="list.xml#element(/1/2)"
    Selects the second child element of the root element (the root element is the first element) in the file “list.xml”
  – XML Linking Language (XLink): language for creating hyperlinks in an XML document
    Example:
    <anchor xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.w3.org/list.xml#id('msc').child(5,item)">text</anchor>
    Defines a simple link that points via XPointer to the fifth item in a list with the unique id “msc” in the file list.xml located at www.w3.org

XSL Transformation
• XSLT is an XML-based language for transforming one XML tree into another XML tree (or, e.g., into another text-based representation)
• It is based on a separation of content and style of the resulting tree:
  – Content is available as XML data
  – Style is available as an XSLT stylesheet file (well-formed XML)
• An XSLT processor takes the XML data as input and generates the output file whose structure is described in the XSLT stylesheet
[Figure: the XSLT processor reads the XML input file and the XSLT declarations (stylesheet) and writes the XML output file]

XSL Transformation
• There are two main areas of application for XSLT:
  1. Communication-oriented transformation
     – The XML output file is a format for machine communication
     – Often used in the context of message-oriented middleware or service-oriented architectures
     – One common target format is SOAP
     [Figure: XML data and a SOAP style are transformed by XSLT into a SOAP envelope that is exchanged over the network between service requestor and service provider]
  2. Presentation Oriented Publishing (POP)
     – The output format is generated for the purpose of presentation (e.g. as PDF or website)
     – This is the content of this lecture

XSL Transformation
• The XSLT stylesheet is a collection of templates
• Each template defines which actions (e.g. generating XML tags) are performed when a particular node is examined
• The XML source tree is processed recursively, beginning with the root node
• If the processing order is not changed explicitly, the children of the current node are traversed from left to right
• XSLT uses XPath to specify nodes (e.g. in “match” or “select” attributes)
[Figure: the source tree (root node, document, title, chapter, paragraph, text nodes) is transformed into the result (XHTML) tree (root node, html, head, title, body, h1, p, text nodes)]

XSL Transformation
• A particular template is invoked by the “apply-templates” instruction, which selects a new set of nodes to process:
    <xsl:apply-templates select="chapter"/>
    . . .
    <xsl:template match="chapter">
      <html:h1><xsl:value-of select="title"/></html:h1>
      <!-- one p element per paragraph child -->
      <xsl:for-each select="paragraph">
        <html:p><xsl:value-of select="."/></html:p>
      </xsl:for-each>
    </xsl:template>
    . . .
[Figure: source tree and result (XHTML) tree as on the previous slide; each “chapter” node yields an h1 element followed by p elements]

XSL Transformation example
XML input:
<?xml version="1.0" ?>
<staff>
  <staffMember>
    <name>Hawking</name>
    <firstname>Stephen</firstname>
    <title>Professor</title>
    <room>42</room>
  </staffMember>
  <staffMember>
    <name>Newton</name>
    <firstname>Isaac</firstname>
    <title>Sir</title>
    <room>43</room>
  </staffMember>
</staff>
XSLT stylesheet (excerpt):
<xsl:template match="staff">
  <table><xsl:apply-templates/></table>
</xsl:template>
<xsl:template match="staffMember">
  <tr><xsl:apply-templates/></tr>
</xsl:template>
<xsl:template match="name">
  <td><xsl:value-of select="."/></td>
</xsl:template>
<xsl:template match="firstname">
  <td><xsl:value-of select="."/></td>
</xsl:template>
<xsl:template match="title">
. . .
Result:
<table>
  <tr>
    <td>Hawking</td><td>Stephen</td><td>Professor</td><td>42</td>
  </tr>
  <tr>
    <td>Newton</td><td>Isaac</td><td>Sir</td><td>43</td>
  </tr>
</table>
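The recursive template mechanism of this example can be imitated in a few lines of Python. This is only a sketch of the idea, not an XSLT processor: the names TEMPLATES, apply_templates, wrap and cell are invented here, and each dictionary entry plays the role of one xsl:template rule.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

STAFF_XML = """<staff>
  <staffMember>
    <name>Hawking</name><firstname>Stephen</firstname>
    <title>Professor</title><room>42</room>
  </staffMember>
  <staffMember>
    <name>Newton</name><firstname>Isaac</firstname>
    <title>Sir</title><room>43</room>
  </staffMember>
</staff>"""


def apply_templates(node):
    """Process all child elements of node in document order."""
    return "".join(TEMPLATES[child.tag](child) for child in node)


def wrap(tag):
    """Build a rule that wraps the results of the children in <tag>."""
    return lambda node: f"<{tag}>{apply_templates(node)}</{tag}>"


def cell(node):
    """Rule for leaf elements, comparable to <td><xsl:value-of select="."/></td>."""
    return f"<td>{escape(node.text or '')}</td>"


# One rule per element name, comparable to <xsl:template match="...">.
TEMPLATES = {
    "staff": wrap("table"),
    "staffMember": wrap("tr"),
    "name": cell,
    "firstname": cell,
    "title": cell,
    "room": cell,
}


def transform(xml_text):
    """Start at the root element and let the rules recurse through the tree."""
    root = ET.fromstring(xml_text)
    return TEMPLATES[root.tag](root)


html = transform(STAFF_XML)
```

As in the stylesheet, no rule knows the overall document structure; the output order follows from the traversal of the source tree.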

XSLT language constructs
• Besides the simple template mechanism, XSLT features many powerful language constructs that make it an expressive programming language
• Some of these constructs are:
  – <xsl:for-each select="XPath expression">
    Iterates over every node of the selected node-set
  – <xsl:if test="XPath expression">
    Applies a conditional test against the content of the XML file
  – <xsl:sort select="XPath expression"/>
    Sorts the selected nodes in alphabetical or numeric order
  – <xsl:variable name="somename" select="XPath expression"/>
    Declares a variable that is initialised with the value of the XPath expression and can be accessed as $somename
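For readers who know a general-purpose language better than XSLT, the four constructs have rough counterparts in Python. The sketch below is an analogy only; the staff data and the variable names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical staff data, invented for this illustration.
staff = ET.fromstring(
    "<staff>"
    "<staffMember><name>Newton</name><title>Sir</title></staffMember>"
    "<staffMember><name>Hawking</name><title>Professor</title></staffMember>"
    "<staffMember><name>Kepler</name><title>Professor</title></staffMember>"
    "</staff>"
)

# xsl:variable: bind the result of a path expression to a name
members = staff.findall("staffMember")

# xsl:sort: order the selected nodes alphabetically by their <name> child
by_name = sorted(members, key=lambda m: m.findtext("name"))

professors = []
for member in by_name:                            # xsl:for-each
    if member.findtext("title") == "Professor":   # xsl:if
        professors.append(member.findtext("name"))
```

One difference is worth noting: an XSLT variable cannot be reassigned after its declaration, whereas a Python name can.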

XHTML generation
• A frequent application of XSLT is the generation of XHTML documents
• In this way a website can be generated dynamically from XML content and thus be adapted to the characteristics of the requesting client
• The XSLT stylesheet describes the resulting XHTML tree, whereas Cascading Style Sheets (CSS) describe the layout of the XHTML document in the web browser
[Figure: XSLT transforms the XML document with the XSL stylesheet into an XHTML document; the XHTML document together with a CSS document yields the presentation layout]

XHTML generation
• There are three different possibilities for the XHTML generation process:
  1. The document has been generated before a client request occurs: the web server stores XHTML data and sends it to the web browser
  2. Transformation done by the server: the web server generates XHTML from XML data and an XSLT stylesheet and sends the XHTML data to the web browser
  3. Transformation done by the client: the web server sends the XML data and the XSLT stylesheet to the web browser, which generates the XHTML

XHTML generation on server
1. The web browser sends a request for a dynamically generated document to the web server
2. Web-accessible logic (e.g. script code in the form of JavaServer Pages or Active Server Pages) is executed on the application server
3. The XML and XSL data are loaded into memory
4. The XML and XSL data are passed to an XSLT processor
5. The XSLT processor generates the XHTML file
6. The XHTML file is sent to the web browser
[Figure: web browser, web server and application server (script code, XML, XSL, XSLT processor, XHTML file) with the numbered steps 1 to 6]

XHTML generation on client
• Alternative 1: the browser requests an XML document that contains a reference to an XSL stylesheet:
    <?xml-stylesheet type="text/xsl" href="pathtofile.xsl"?>
1. The web browser requests the XML document, and the web server delivers it
2. The web browser determines the reference to the XSL file that is included in the XML document
3. The web browser requests the XSL document, and the web server delivers it
4. The browser’s XSLT processor generates the XHTML document from the XML and XSL data
[Figure: web server and web browser (XML document, XSL document, XSLT processor, XHTML) with the numbered steps 1 to 4]

XHTML generation on client
• Alternative 2: the requested document contains JavaScript code that organises the requests for the XML and XSL files
• This method works in non-XSLT-aware browsers
1. The web browser requests a document, and the web server delivers it
2. The web browser executes the JavaScript code
3. The JavaScript code loads the XML data
4. The JavaScript code loads the XSL data
5. The transformation of the XML data to XHTML tags is done inside the JavaScript engine
6. The generated XHTML tags are included in the document, which can finally be displayed
[Figure: web server and web browser (<html>JavaScript code</html>, JavaScript engine, XML, XSL, XHTML tags) with the numbered steps 1 to 6]

Transformation on client
• The following code fragment shows one possibility for a JavaScript-based transformation on the client
• In real applications it is necessary to detect the client’s web browser and, depending on this information, to instantiate the right XML parser
<html>
<body>
<script type="text/javascript">
// Instantiate the Microsoft XML parser (ActiveX, Internet Explorer only)
var xml = new ActiveXObject("Microsoft.XMLDOM");
// Load the XML data synchronously, i.e. wait until loading has finished
xml.async = false;
xml.load("data.xml");
// Load the XSL data in the same way
var xsl = new ActiveXObject("Microsoft.XMLDOM");
xsl.async = false;
xsl.load("style.xsl");
// Start the transformation and write the result into the document
document.write(xml.transformNode(xsl));
</script>
</body>
</html>

Client-dependent transformation
• Depending on the requesting client, a categorizer chooses the necessary stylesheet and forwards it to the XSLT procedure that generates the client-adapted document
[Figure: a regular web browser or a mobile client sends a request to the web server; the categorizer chooses the XHTML scheme or the WML scheme for the XSLT workflow, and the adapted document is returned as the reply]
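A categorizer of this kind could, for example, inspect the HTTP request headers. The following sketch is an assumption about how such a component might look: the stylesheet file names, the User-Agent markers and both function names are hypothetical, while text/vnd.wap.wml is the registered media type for WML.

```python
# Hypothetical stylesheet file names and User-Agent markers.
STYLESHEETS = {"mobile": "wml.xsl", "desktop": "xhtml.xsl"}
MOBILE_MARKERS = ("wap", "mobile", "midp")


def categorize(user_agent, accept=""):
    """Classify the requesting client from its HTTP request headers."""
    if "text/vnd.wap.wml" in accept.lower():
        return "mobile"          # the client explicitly accepts WML
    agent = user_agent.lower()
    if any(marker in agent for marker in MOBILE_MARKERS):
        return "mobile"
    return "desktop"


def choose_stylesheet(user_agent, accept=""):
    """Return the stylesheet that is handed to the XSLT step."""
    return STYLESHEETS[categorize(user_agent, accept)]
```

Header sniffing of this kind is fragile, so a real categorizer would probably need a maintained device database or content negotiation.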

XSL Formatting Objects
• XSLT can only transform one XML tree into another or into text-based formats; it is therefore not suited to generating arbitrary result documents, in particular page-oriented representations
• To specify complex page layouts for a document, the W3C has released the XSL Formatting Objects (XSL-FO) standard
• XSL-FO is an XML-based markup language that describes the formatting of XML data for output to screen, printer or other media
• An XSL-FO file describes in great detail what the pages look like and where the content has to be placed
• Further developments of the CSS specification (CSS Level 3) show an increasing convergence between CSS and XSL-FO, though XSL-FO still offers more expressive power for page-oriented layout
• Alternative for generating PDF files: iText (http://itextpdf.com/)
[Figure: an XSL-FO processor reads the XSL-FO file and external resources (images etc.) and produces PDF, RTF, bitmap, …]

XSL Formatting Objects
• Different page layouts can be defined inside the “layout-master-set”
<fo:layout-master-set>
  <fo:simple-page-master
      master-name="example"
      page-width="210mm"
      page-height="297mm"
      margin-top=""
      margin-bottom=""
      margin-left=""
      margin-right="">
    <fo:region-body margin="2cm"/>
    <fo:region-before extent="2cm"/>
    <fo:region-after extent="2cm"/>
    <fo:region-start extent="1cm"/>
    <fo:region-end extent="1cm"/>
  </fo:simple-page-master>
</fo:layout-master-set>
[Figure: page with region-before at the top, region-after at the bottom, region-start on the left, region-end on the right and region-body in the centre]

XSL Formatting Objects
• The content is embedded into text blocks inside the page flow
<?xml version="1.0" encoding="iso-8859-1"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- definition of layout-master -->
  </fo:layout-master-set>
  <fo:page-sequence master-reference="example">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-family="Arial" font-size="12pt">
        The more you sweat in training,
      </fo:block>
      <fo:block font-family="Verdana" font-size="18pt">
        the less you'll bleed in battle.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
Rendered output:
The more you sweat in training,
the less you'll bleed in battle.

Generating XSL-FO
• In practice an XSL-FO file is generated dynamically from an XML document by XSLT
• The input file’s markup has to be replaced entirely by XSL-FO markup, except for a few allowed objects that can be embedded into the XSL-FO data (e.g. vector graphics)
[Figure: an XSLT processor turns the XML input file and the XSLT declarations (style sheet) into XSL-FO data; an XSL-FO processor combines this data with external resources (images etc.) and produces RTF, bitmap, …]
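The replacement of the input markup by XSL-FO markup can be illustrated without an XSLT processor. The sketch below uses Python's xml.etree.ElementTree as a stand-in for XSLT, which is an assumption of this example; the function names and the staff input are invented, and only the fo: element and attribute names come from the XSL-FO vocabulary.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)   # serialise with the usual "fo" prefix


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def staff_to_fo(xml_text):
    """Build an XSL-FO tree with one fo:block per staffMember element."""
    source = ET.fromstring(xml_text)
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "example",
        "page-width": "210mm",
        "page-height": "297mm",
    })
    ET.SubElement(page, fo("region-body"), {"margin": "2cm"})
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "example"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})
    for member in source.findall("staffMember"):
        block = ET.SubElement(flow, fo("block"), {"font-size": "12pt"})
        block.text = (f'{member.findtext("firstname")} '
                      f'{member.findtext("name")}, room {member.findtext("room")}')
    return ET.tostring(root, encoding="unicode")


# Hypothetical input, shortened from the staff example.
fo_text = staff_to_fo(
    "<staff>"
    "<staffMember><name>Hawking</name><firstname>Stephen</firstname><room>42</room></staffMember>"
    "<staffMember><name>Newton</name><firstname>Isaac</firstname><room>43</room></staffMember>"
    "</staff>"
)
```

None of the source element names survive in the result; only their text content is carried over into fo:block elements.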

Generating XML
• In general, the content to be presented is not stored as XML and must first be converted to XML
• The conversion of data from a relational database to XML is done in three steps:
  1. Query for the data by SQL
  2. Storage of the data in a temporary data structure (e.g. a Java object)
  3. Transformation of this data structure to XML by using a language-specific Application Programming Interface (e.g. the Simple API for XML, SAX)
[Figure: logic queries the database by SQL (1), holds the data in a Java object (2) and converts it through an API into XML data (3)]
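The three steps can be sketched with Python's standard library. This is an illustration under stated assumptions: an in-memory SQLite database stands in for the relational database, a list of dictionaries replaces the Java object, and xml.etree.ElementTree is used instead of SAX. The table layout and function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def fetch_staff(conn):
    """Steps 1 and 2: query by SQL and keep the rows as dictionaries."""
    cursor = conn.execute(
        "SELECT name, firstname, title, room FROM staff ORDER BY id")
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


def staff_to_xml(rows):
    """Step 3: turn the temporary data structure into XML."""
    root = ET.Element("staff")
    for row in rows:
        member = ET.SubElement(root, "staffMember")
        for column, value in row.items():
            ET.SubElement(member, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical demo database with two rows of the staff table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, "
             "firstname TEXT, title TEXT, room INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?, ?, ?, ?)", [
    (1, "Hawking", "Stephen", "Professor", 42),
    (2, "Newton", "Isaac", "Sir", 43),
])

xml_text = staff_to_xml(fetch_staff(conn))
```

Keeping the query and the XML generation in separate functions mirrors the separation between the temporary data structure and the XML API.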

XSL-FO based document generation
• A (Java) Servlet can function as the central point for delivering a document to a client
• The following image shows a reference infrastructure for XSL-FO based document generation (e.g. PDF files) in the World Wide Web
[Figure: client (web browser, external viewer) and server (Servlet, program logic, SQL, XML representation of data, XSLT stylesheet, XSLT processor, XSL-FO processor) connected by HTTP request and HTTP response, with the numbered steps 1 to 9]

XSL-FO based document generation
1. The web browser sends an HTTP request to the Servlet
2. The Servlet invokes the program logic for accessing the database and for generating the document
3. Data is extracted via SQL from a relational database
4. The data is transferred to an XML representation
5. An XSLT processor generates the pure XSL-FO data from the XML data and the XSLT stylesheet
6. The XSL-FO data is forwarded to an XSL-FO processor, which generates the document
7. The document is passed as a byte stream to the Servlet
8. The Servlet sets the content type (“MIME type”) of the HTTP response to a specific value (e.g. application/pdf or image/png) and sends the document to the client
9. If the document cannot be presented directly in the web browser, an external application is invoked for this purpose

Useful target formats
• Besides the document types already described, some further useful target formats for an XSL transformation are:
  – DocBook
    Open document standard that is used to generate Unix man pages and computer documentation in general, but also HTML and PDF files
  – WordML / SpreadsheetML
    XML-based markup languages that can be used by current office products such as Microsoft Word and Microsoft Excel
  – Java2D / Abstract Window Toolkit (AWT)
    An XSL-FO description can be transformed into a window that displays the individual pages
  – Really Simple Syndication (RSS) / Atom
    Popular formats for web feeds, used in weblogs etc.

XSLT / XSL-FO processors
• Demands on XSLT / XSL-FO processors:
  – Implementation of at least large parts of the XSL specification (because of the complexity of the spec., a full implementation is rare)
  – Availability for the desired programming language
  – Adequate documentation
  – Adequate set of example applications
  – Support for additional features
  – Modular structure that enables simple integration into one’s own software
  – Acceptable output generation speed
  – Support for the desired XSLT version
• Special demand on XSLT processors:
  – Validation of input and output XML trees
• Special demand on XSL-FO processors:
  – Support for a variety of output formats
[Figure: an XML validator checks the input and the output of the XSLT processor against a schema definition and reports valid / not valid]

XSLT processors
• In addition to the XSLT processors embedded in web browsers, there are many free or commercial implementations of the XSLT specifications
• Examples of XSLT processors are:
  – XALAN
    • Open-source software library for Java and C++ that is part of the Apache project
  – SAXON
    • Open-source basic processor and commercial schema-aware edition that supports input/output validation
    • Includes an XQuery processor
  – .NET class System.Xml.Xsl.XslCompiledTransform
    • XSLT processor that is integrated in the Microsoft .NET Framework

XSL-FO processors
• XSL-FO processors differ greatly in how much of the XSL-FO spec. they implement and in the output types they support
• Examples of XSL-FO processors are:
  – XSL Formatter
    • Commercial processor with full XSL-FO spec. support and many additional features
    • Offers interfaces for various programming languages (Java, C++, …)
  – Apache Formatting Objects Processor (FOP)
    • Open-source Java application that supports different output formats and the redirection of the output directly to a printer

Conclusion
Storage Layer: RDBMS, file system, XML repository, …
Processing Layer: temporary data structure, XML API, XML data, XSLT stylesheet, XSL-FO style, XSLT processor, XSL-FO processor, resources (images, …)
Presentation Layer: XHTML, XML, text, PDF, PNG, PostScript, … (shown by a browser, printer, PDF reader, mobile application, …)

References
Links at W3C:
XSL homepage  http://www.w3.org/Style/XSL/
XSL spec.  http://www.w3.org/TR/xsl11/
XSLT 1.0 spec.  http://www.w3.org/TR/xslt
XSLT 2.0 spec.  http://www.w3.org/TR/xslt20/
XPath 1.0 spec.  http://www.w3.org/TR/xpath
XPath 2.0 spec.  http://www.w3.org/TR/xpath20/
Further links:
FOP  http://xmlgraphics.apache.org/fop/
SAX  http://www.saxproject.org/
SAXON  http://saxon.sourceforge.net/
XALAN  http://xalan.apache.org/
XSL Formatter  http://www.antennahouse.com/