Linguistic markup and processing of transclusion in XML documents (Notes)

mFiL 2015 1

Linguistic markup and processing of transclusion in XML documentsSimon Dew BA MISTC6 November 2015

Copyright © Simon Dew 2015.This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Hello. My name is Simon Dew. The presentation I’m going to give today grew out of my work at Stanley Black & Decker Innovations from 2006 to 2015, where I was required to deliver single source documentation in multiple languages.

I’m going to describe the linguistic problems I encountered with transclusion in XML documentation, and the solutions I developed to help solve them.

mFiL 2015 2

Transclusion

So, let’s define our terms. What is transclusion?

mFiL 2015 3

Transclusion

• Theodor Holm Nelson, 1981: Literary Machines• The inclusion of an electronic document, or part of a document, in

the rendering of another document.• The main document does not contain a copy of the transcluded

text, but only a reference to it.• The software used to render the document obtains the transcluded

material and incorporates it into the main work.

Ted Nelson photo by DgiesLicensed under CC BY-SA 3.0

Transclusion is a term invented by Ted Nelson, the originator of the concept of hypertext, in his 1981 work, Literary Machines [Nelson].

It refers to the inclusion of an electronic document, or part of a document, in the rendering of another document.

The main document does not contain a copy of the transcluded text, but only a reference to it.

The software used to render the document obtains the transcluded material and incorporates it into the main work.

Transclusion is a hugely important concept for the presentation of content on the WWW and in electronic publication workflows.

mFiL 2015 4

Transclusion

This presentation focuses on transclusion in XML (Extensible Markup Language) documents, including, but not limited to:

• DocBook• DITA• TEI• XHTML

This presentation focuses on transclusion in XML, the Extensible Markup Language [XML]. I’m assuming that everyone is familiar with XML. It's widely used in the digital humanities.

There are several XML standards for documentation.

The techniques I’m going to talk about were originally developed for DocBook XML [DocBook], but they are generally applicable for any XML documentation standard, including the Darwin Information Typing Architecture [DITA], the Text Encoding Initiative [TEI] and XHTML [HTML].

mFiL 2015 5

Transclusion

Transclusion can be large scale / context-free:

Transclusion works on two scales. It can be large scale, or context-free, which is when you build large documents by reusing smaller chunks of content. For example, using a DITAMAP or a DocBook assembly.

mFiL 2015 6

Transclusion

Transclusion can be small scale / parametrised:

Transclusion can also be small scale or parametrised. This is where you inject small snippets of text into a larger text flow. These snippets of text may be proper nouns such as product names or company names, or they may be common nouns such as product categories.

mFiL 2015 7

Transclusion


• General entities

Definition:

<!ENTITY device "Euro 500">

Reference:

<title>Configuring the &device;</title>

Result:

<title>Configuring the Euro 500</title>

In an XML document, parametrised transclusion might be realised using general entities ...

mFiL 2015 8

Transclusion


• General entities• XInclude

Definition:

<phrase xml:id="device">Euro 500</phrase>

Reference:

<title>Configuring the <xi:include xpointer="xpath(id('device')/node())"/></title>

Result:

<title>Configuring the Euro 500</title>

… generic mechanisms such as XInclude [XInclude] …

mFiL 2015 9

Transclusion


• General entities• XInclude• Specific transclusion mechanisms, e.g. DITA conref

Definition:

<ph id="device">Euro 500</para>

Reference:

<title>Configuring the <ph conref="device"/></title>

Result:

<title>Configuring the <ph>Euro 500</ph></title>

… or specific transclusion mechanisms such as conref in DITA.

The examples within this presentation show a specific transclusion mechanism that I developed for Stanley Black & Decker Innovations, but the linguistic techniques that I'm going to describe should work with any kind of parametrised transclusion in any XML document.

mFiL 2015 10

Transclusion

Transcluded content may vary.

The important thing to note about parametrised transclusion — and this is where the term transclusion has changed its meaning since Ted Nelson invented it — is that the referred content may vary. You may not know what the referred content will be until publication time.

You might want to re-use one topic in the documentation for several different products. Or you might need to change all the product names in a single document for different brandings.

mFiL 2015 11

Transclusion


1. Local redefinition

This variation might be achieved in several ways. You might locally redefine the parametrised terms when you pull a topic into a larger work;

mFiL 2015 12

Transclusion


1. Local redefinition

2.Conditional processing:

• Conditional profiling — DocBook• DITAVAL files — DITA

<xsl:param name="profile.vendor" select="'ACME'"/>

<val> <prop action="include" att="product" val="ACME"/> <prop action="exclude" att="product" val="Yoyodyne"/></val>

Or you might use conditional processing, when you mark up a range of alternatives within the transcluded text, and then include or exclude them as required in a separate step.

If you’re familiar with DocBook, you’d achieve this using conditional profiling attributes. If you're familiar with DITA, you’d use a DITAVAL file.

mFiL 2015 13

Linguistic consequences

Now, small scale, parametrised transclusion can have linguistic consequences.

mFiL 2015 14


A different form of the transcluded word or phrase may be required depending on the environment into which it is placed:

• Orthography, e.g. writing systems with upper case• Syntactic case• Definiteness• Number• Others, e.g. initial consonant mutation

<title>_____ Details</title>

organisational unit[TITLE CASE]

Firstly, you may require a different form of the transcluded word or phrase, depending on the environment into which it is placed.

To give a simple example: for writing systems with orthographic case, you may require the transcluded term to be capitalised at the start of a sentence or in a title.

mFiL 2015 15


A different form of the transcluded word or phrase may be required depending on the environment into which it is placed:

• Orthography, e.g. writing systems with upper case• Syntactic case• Definiteness• Number• Others, e.g. initial consonant mutation

<para>Om nödvändigt, välj _____.</para>

organisationsenhet[+DEFINITE]

Or, depending on the language, you may require the transcluded term to take a different form depending on syntactic case, definiteness, number, or other features.

So in this Swedish example, we want to transclude a term into the sentence but we want the definite form of the noun.

The second linguistic consequence relates to government and binding. If the transcluded term is the head of a phrase, it may demand agreement from dependent words.

To take an obvious example from English: the indefinite article can take one of two forms depending on the phonetic environment. So if you have an indefinite article followed by a transcluded term, the form of the indefinite article will vary depending on whether the transcluded term starts with a vowel or a consonant.

mFiL 2015 16


If the transcluded word or phrase is the head of a phrase, it may demand agreement from dependent words.

• Phonetics• Gender• Number• Case• Definiteness

<para>Configuring a _____ Server</para>

Oz 500[_V]

Furthermore, in many languages, dependent words may also vary according to the syntactic environment. For example, the transcluded term may have varying grammatical gender, so dependent articles and adjectives will also have to change in agreement with the transcluded term.

These changes can often be avoided by careful wording and translation. In some circumstances, though, they may be unavoidable.

mFiL 2015 17


If the transcluded word or phrase is the head of a phrase, it may demand agreement from dependent words.

• Phonetics• Gender• Number• Case• Definiteness

<para>Pour configurer le _____ auqel le modem est connecté :  </para>

tablette[_C] [FEM] [SING]

mFiL 2015 18

Principles

At Stanley Black & Decker Innovations I developed a publication toolchain which attempts to solve these linguistic problems, among other things.

Internally, we referred to this publication toolchain as PACBook. I’ll keep using this term for convenience.

Within PACBook, the linguistic solution that I developed has two facets. I’ll give an overview of the principles, and then I'll go into more detail.

mFiL 2015 19

Principles

1. Linguistic markup scheme

Defining transcluded term:

• Mark up all forms of term to be transcluded• Mark up features which affect dependent words

Where transcluded term required:

• Mark up required form• Mark up dependent words

First, I developed a linguistic markup scheme for use in XML documentation.

There are basically four things to be marked up:

● When you’re defining terms to be transcluded, you must mark up all the possible syntactic forms of the word, and mark up any linguistic features which affect dependent words.

● When you’re marking up the locations where a transcluded term is required, you must mark up the form of the word that’s required, and mark up any words or phrases that depend on the transcluded terms.

Secondly, I developed a set of XSLT stylesheets to perform linguistic pre-processing on documentation [XSLT]. (XSLT is a computer language for applying transformations to XML.) This pre-processing toolchain carries out a set of steps, like so:

1. Resolve parametrised transclusion;2. Select the correct form of head words;3. Perform conditional profiling;4. Select the correct form of dependent words with the

help of a syntactic dictionary;5. Finally, select the correct orthographic case.

The output is then passed on to the next step in the publication process.

mFiL 2015 20

Principles

2. Linguistic pre-processing

We automated the process using build tools like Apache Ant [Ant] and XProc [XProc]; this is outside the scope of this presentation.

mFiL 2015 21

Principles

2. Linguistic pre-processing

mFiL 2015 22

Markup

Let’s go into more detail on the linguistic markup. I’ll propose a set of XML attributes which extend the document’s markup schema.

mFiL 2015 23

Markup

XML attributes

• Extend markup schema

• Wrapper element:DocBook <phrase>DITA <ph>HTML <span>

• Namespace:http://stanleysecurity.github.io/PACBook/ns/linguistics

• Prefix:ling

These XML attributes can be added to any element which contains a run of text. To mark up a word or phrase, an author would most likely add linguistic markup attributes to a semantically empty wrapper element such as <phrase> in DocBook, <ph> in DITA or <span> in HTML.

The linguistic markup that I’m proposing has its own XML namespace:

http://stanleysecurity.github.io/PACBook/ns/linguistics

Authors can use any prefix they like to refer to this namespace, but I suggest the prefix ling.

mFiL 2015 24

Markup

ling:pron Phonetic environment. (V, C, ...)

ling:num Grammatical number.(sg, pl, ...)

ling:case Grammatical case.(nom, gen, dat, acc, ...)

ling:gen Grammatical gender.(c, m, f, n, ...)

ling:class Definiteness / inflectional class.(strong, weak, mixed, ind, def, ...)

ling:orth Orthographic case.(upper, lower, title, sentence)

ling:type head — form of a head word;depend — dependent word.

For the linguistic markup I propose these XML attributes:

● ling:pron, the phonetic environment governed by a term;● ling:num, used to mark up grammatical number;● ling:case, used to mark up grammatical case;● ling:gen, used to mark up grammatical gender;● ling:class, used to mark up definiteness or inflectional class;● ling:orth, used to mark up orthographic case;● ling:type, used to mark up whether a term is a form of a head

word or a dependent word.

The first five attributes can contain any value, but you must use consistent values within each language — you’ll see why later.

Obviously this is not an exhaustive list of all possible linguistic features; they’re the features I needed for the languages that PACBook supports. But the scheme is designed to be extensible to support further linguistic features if necessary.

So, let’s look at some examples of markup. First of all I’m going to show how to mark up the linguistic features that demand agreement when you define a resource for transclusion. This example is in English. You can see that there are two different forms of the product name, depending on the branding. I’ve used the ling:pron attribute to mark up the phonetic environment that these brand names govern: Euro is pronounced with an initial consonant and Oz is pronounced with an initial vowel.

mFiL 2015 25

Markup

Resource — features of head noun that demand agreement

<resource xl:label="Product_Name">

<phrase vendor="ACME" ling:pron="C">Euro 500</phrase>

<phrase vendor="Yoyodyne" ling:pron="V">Oz 500</phrase>

</resource>

Phonetic environment:

⟨Euro⟩ / j ə ə /ˈ ʊ ɹ ʊ _C

⟨Oz⟩ / z /ˈɒ _V

Here’s an example showing how to mark up all the possible grammatical variants of a word when you’re defining that word for transclusion.

This is an example in Swedish. It’s the markup for a term called Org_Unit. You can see that the outer <phrase> wrapper shows that this term has common gender and singular number.

The inner <phrase> elements are marked with ling:type="head" to show that these are grammatical variants of a head word. Each variant is marked up for case and definiteness and I’ve marked up all the possible variants.

mFiL 2015 26

Markup

Resource — all possible forms of head noun:

<resource xl:label="Org_Unit">

<phrase ling:gen="c" ling:num="sg">

<phrase ling:type="head" ling:case="nom"

ling:class="ind">organisationsenhet</phrase>

<phrase ling:type="head" ling:case="gen"

ling:class="ind">organisationsenhets</phrase>

<phrase ling:type="head" ling:case="nom"

ling:class="def">organisationsenheten</phrase>

<phrase ling:type="head" ling:case="gen"

ling:class="def">organisationsenhetens</phrase>

</phrase>

</resource>

mFiL 2015 27

Markup

Document — mark up required form of transcluded term

<para>Om nödvändigt, välj <phrase ling:class="def"

content:ref="Org_Unit"/>.</para>

<title><phrase ling:orth="title"

content:ref="Org_Unit"/> Details</title>

So that’s how to mark up linguistic features when defining terms for transclusion.

Now let’s see how to indicate the required form of a transcluded term in the main body of the document.

The first example is Swedish again. It shows that we want to transclude the term called Org_Unit and that we want the definite form of the word. The grammatical case isn’t specified, which means nominative is assumed.

The second example is in English; it shows that we want to transclude the term called Org_Unit, but because this is a title, we want the term to be output in title case, i.e. with initial capital letters.

mFiL 2015 28

Markup

Document — mark up dependent words in text

<title>Configuring <wordasword ling:type="depend">a</wordasword>

<phrase content:ref="Product_Name"/> Server</title>

<para>Wenn

<phrase>

<wordasword ling:type="depend">ein</wordasword>

<phrase content:ref="Device"/>

</phrase>

konfiguriert wird, werden die Details

<phrase>

<wordasword ling:type="depend">der</wordasword>

<phrase content:ref="Device" ling:case="gen"/>

</phrase>

auf der Weboberfläche angezeigt.</para>

Finally, here are some examples showing how to mark up dependent terms in the text.

The first example is English. I’ve used the <wordasword> element, which in DocBook is semantically empty, to mark up the word “a” as dependent on the following transcluded term.

The next example is German. You can see that I’ve marked up the two articles as dependent words, but because there are two head words in this run of text, I’ve had to wrap each head word together with its dependent article in a semantically empty <phrase> element. It’s very similar to building a phrase structure diagram!

mFiL 2015 29

Dictionary

So how does PACBook select the correct form of a dependent word? Well, as I mentioned in the overview, it uses a syntactic dictionary.

mFiL 2015 30

Dictionary

Complies with dictionaries module of the TEI.

<entry n="a">

<form>

<gramGrp><usg value="C"/></gramGrp>

<orth>a</orth>

</form>

<form>

<gramGrp><usg value="V"/></gramGrp>

<orth>an</orth>

</form>

</entry>

The syntactic dictionary must comply with the dictionaries module of the Text Encoding Initiative (TEI). Currently PACBook has ten dictionaries in development, one for each of the languages that PACBook supports.

You can see here an entry from the English dictionary. In fact, it’s the only entry in the English dictionary. It shows the two different forms of the indefinite article, and the environment in which each form is used.

mFiL 2015 31

Dictionary

<usg> Phonetic environment. (V, C, ...)

<num> Grammatical number.(sg, pl, ...)

<case> Grammatical case.(nom, gen, dat, acc, ...)

<gen> Grammatical gender.(c, m, f, n, ...)

<oVar> Definiteness / inflectional class.(strong, weak, mixed, ind, def, ...)

<orth> Output.

In the syntactic dictionaries, linguistic features are marked up using these TEI elements. You can see how they match up with the ling attributes:

● <usg>, the phonetic environment in which this form is used;● <num>, the grammatical number;● <case>, the grammatical case;● <gen>, the grammatical gender;● <oVar>, the definiteness or inflectional class;● <orth>, used to mark up the output form.

The first five elements can contain any values you like, but they must match the values used in the ling attributes in your document.

Again, this isn’t an exhaustive list of all possible linguistic features; they’re the features I needed. The scheme could be extended.

mFiL 2015 32

Software

So that’s the markup scheme. The other facet of PACBook is the software which carries out the transformations.

mFiL 2015 33

Transformational stylesheets

PACBook XSLT transformations:

• LingHead.xsl — select the required declension of head nouns.• LingDepend.xsl — inflect dependent words.● LingCasing.xsl — sets the orthographic case of specified text.

As I mentioned in the overview, PACBook contains an entire suite of XSLT stylesheets for document publication. The important ones from a linguistic perspective are:

● LingHead.xsl — you apply this transformation after parametrised transclusion. It selects the required grammatical form of any transcluded head words based on the markup in the document.

● LingDepend.xsl — this looks at the grammatical features of transcluded head words and selects the correct form of dependent words using the syntactic dictionary for the current language.

● LingCasing.xsl — this is very straightforward. It uses functions from the standard XSLT library to set the orthographic case of the specified text to upper case, lower case, sentence case or title case. Title case is English-only.

mFiL 2015 34

Transformational stylesheets

PACBook XSLT transformations:

• LingHead.xsl — select the required declension of head nouns.• LingDepend.xsl — inflect dependent words.• LingCasing.xsl — sets the orthographic case of specified text.

Licence:GNU Lesser General Public License (LGPL) v3

Repository:https://github.com/STANLEYSecurity/PACBook

These stylesheets are free software. SBD Innovations made the source code available under version 3.0 of the LGPL.

The stylesheets are available on GitHub at this URI.

https://github.com/STANLEYSecurity/PACBook

This GitHub repository also contains the syntactic dictionaries, schemas which enable you to validate the linguistic markup, and full documentation.

mFiL 2015 35

Limitations

● Only noun phrases.● Only tested with small handful of languages.● Linguistic markup different for translated texts.● Linguistic markup can be complex for authors.

Obviously this solution has limitations.

It can only inflect the head nouns and dependent words within a noun phrase. It can’t (yet) conjugate verbs, for instance.

Secondly, the solution has been developed for and tested with ten European languages.

Thirdly, inline markup can be quite different for translated texts. I’ve worked out a correspondence between the linguistic markup and the XLIFF translation standard [XLIFF]; see the GitHub website for details.

Finally, the linguistic markup is simple for authors or translators writing in English, but can become complex in other languages.

mFiL 2015 36

Related work

● Various linguistic markup schemas / ontologies● Internationalisation markup● Nothing else?● What should we call this?

The problems I’ve described are linguistically trivial, and the solution is computationally trivial. That being so, has anyone done any work in this area before?

There are lots of linguistic markup schemas and ontologies, e.g. the GOLD ontology [GOLD], or ISOcat [ISOcat]. None of them seemed concise enough to use in an existing document schema.

There are also various markup schemas for internationalisation. International Components for Unicode have developed MessageFormat, which enables you to mark up the singular or plural forms of a word for a locale [ICU].

But as far as I can tell, nothing has ever been developed which solves the problem that I’ve attempted to solve. In fact I'm not even sure what you’d call this problem. Perhaps we need a name for it, in the spirit of Ted Nelson...

mFiL 2015 37

Collaboration

● Dictionary — Wiktionary.● Testing and improving.● Integrating with other publication workflows.

Development fork:https://github.com/janiveer/PACBook

I’d be very interested to make contact with people who are working in similar areas and would be interested in collaboration. I need help with:

● Extending the dictionaries — perhaps programmatically, using the huge amount of linguistic data available on Wiktionary;

● Testing and improving the software;● Integrating with other publication workflows.

One final thing to say: Stanley Black and Decker Innovations was wound down at the end of July 2015. I’ve cloned the GitHub repository at the URI shown here, so that development can continue:

https://github.com/janiveer/PACBook

If anyone would like to contribute please contact me.

mFiL 2015 38

Examples

Here’s an example of how the markup and the transformations work together.

mFiL 2015 39

Example

Resource:

<resource xl:label="Doc"> <phrase outputformat="PDF" ling:gen="n" ling:num="sg"> <phrase ling:type="head" ling:case="nom">Dokument</phrase> <phrase ling:type="head" ling:case="acc">Dokument</phrase> <phrase ling:type="head" ling:case="gen">Dokuments</phrase> <phrase ling:type="head" ling:case="dat">Dokument</phrase> </phrase> <phrase outputformat="CHM" ling:gen="f" ling:num="sg"> <phrase ling:type="head" ling:case="nom">Hilfedatei</phrase> <phrase ling:type="head" ling:case="acc">Hilfedatei</phrase> <phrase ling:type="head" ling:case="gen">Hilfedatei</phrase> <phrase ling:type="head" ling:case="dat">Hilfedatei</phrase> </phrase></resource>

Imagine that set of help topics can be delivered as a print file or as online help.

So, when the author wants to refer to the document itself, she marks up the two possible document types with profiling attributes which effectively say, “for the print output format, this is a document; for the online output format, this is a help file”.

Then, because our author is German, she marks up all the possible singular forms of the words for “document” and “help file”, and adds further linguistic attributes to signal the number and gender.

mFiL 2015 40

Example

Document:

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dies</wordasword><phrase content:ref="Doc" ling:case="dat"/>nicht enthalten.</para>

In the document, the author marks up all the points where the term will be inserted and specifies which syntactic case is required.

Then, she runs the document build process...

mFiL 2015 41

Example

After transclusion:

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dies</wordasword><phrase ling:case="dat"> <phrase outputformat="PDF" ling:gen="n" ling:num="sg"> <phrase ling:type="head" ling:case="nom">Dokument</phrase> <phrase ling:type="head" ling:case="acc">Dokument</phrase> <phrase ling:type="head" ling:case="gen">Dokuments</phrase> <phrase ling:type="head" ling:case="dat">Dokument</phrase> </phrase> <phrase outputformat="CHM" ling:gen="f" ling:num="sg"> <phrase ling:type="head" ling:case="nom">Hilfedatei</phrase> <phrase ling:type="head" ling:case="acc">Hilfedatei</phrase> <phrase ling:type="head" ling:case="gen">Hilfedatei</phrase> <phrase ling:type="head" ling:case="dat">Hilfedatei</phrase> </phrase></phrase>nicht enthalten.</para>

After parametrised transclusion, all possible grammatical forms of all possible document types are included in the document, at every point where the document type is required.

mFiL 2015 42

Example

After head transformation:

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dies</wordasword><phrase ling:case="dat"> <phrase outputformat="PDF" ling:gen="n" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Dokument</phrase> </phrase> <phrase outputformat="CHM" ling:gen="f" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Hilfedatei</phrase> </phrase></phrase>nicht enthalten.</para>

After the head transformation, the correct grammatical form of the term is kept at every point where the term is required, and the other grammatical forms are removed.

mFiL 2015 43

Example

After conditional processing:

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dies</wordasword><phrase ling:case="dat"> <phrase outputformat="PDF" ling:gen="n" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Dokument</phrase> </phrase></phrase>nicht enthalten.</para>

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dies</wordasword><phrase ling:case="dat"> <phrase outputformat="CHM" ling:gen="f" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Hilfedatei</phrase> </phrase></phrase>nicht enthalten.</para>

After conditional processing, only one of the alternative forms is selected at every point where the product name is required, and the rest are removed.

mFiL 2015 44

Example

After dependent transformation:

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">diesem</wordasword><phrase ling:case="dat"> <phrase outputformat="PDF" ling:gen="n" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Dokument</phrase> </phrase></phrase>nicht enthalten.</para>

<para>Die Einstellung der IP-Adresse ist in<wordasword ling:type="depend">dieser</wordasword><phrase ling:case="dat"> <phrase outputformat="CHM" ling:gen="f" ling:num="sg"> <phrase ling:type="head" ling:case="dat">Hilfedatei</phrase> </phrase></phrase>nicht enthalten.</para>

Finally, all dependent terms are transformed to match their syntactic and phonetic environment as required.

mFiL 2015 45

Questions?

Thank you very much for listening.

Are there any questions?

mFiL 2015 46

References● [Nelson] Theodor Holm Nelson. 1981. Literary Machines. Mindful Press, Sausalito, California.

● [XML] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau, editors. 26 November 2008. Extensible Markup Language (XML) 1.0 (Fifth Edition). World Wide Web Consortium (W3C).

● [DocBook] DocBook Technical Committee. 1 November 2009. The DocBook Schema Version 5.0. Organization for the Advancement of Structured Information Standards (OASIS).

● [DITA] OASIS DITA Technical Committee. 1 December 2010. Darwin Information Typing Architecture (DITA) Version 1.2. Organization for the Advancement of Structured Information Standards (OASIS).

● [TEI] TEI Consortium, eds. 20 January 2014. TEI P5: Guidelines for Electronic Text Encoding and Interchange, 2.6.0. TEI Consortium.

● [HTML] Ian Hickson, Robin Berjon, Steve Faulkner, Travis Leithead, Erika Doyle Navara, Edward O’Connor, Silvia Pfeiffer, editors. 28 October 2014. HTML5. World Wide Web Consortium (W3C).

● [XInclude] Jonathan Marsh, David Orchard, and Daniel Veillard, editors. 15 November 2006. XML Inclusions (XInclude) Version 1.0 (Second Edition). World Wide Web Consortium (W3C).

● [XSLT] James Clark, editor. 16 November 1999. XSL Transformations (XSLT) Version 1.0. World Wide Web Consortium (W3C).

● [Ant] Stephane Bailliez, et al. December 29, 2013. Apache Ant™ 1.9.3 Manual. The Apache Software Foundation.

● [XProc] Norman Walsh, Alex Milowski, and Henry S. Thompson, editors. 11 May 2010. XProc: An XML Pipeline Language. World Wide Web Consortium (W3C).

● [XLIFF] OASIS XLIFF Technical Committee. 1 February 2008. XML Localisation Interchange File Format (XLIFF) Version 1.2. Organization for the Advancement of Structured Information Standards (OASIS).

● [GOLD] Scott Farrar and D. Terence Langendoen. 2003. A linguistic ontology for the Semantic Web. GLOT International. 7 (3), pp.97-100.

● [ISOcat] M. Kemps-Snijders, M.A. Windhouwer, P. Wittenburg, S.E. Wright. November 2009. ISOcat: Remodeling Metadata for Language Resources. International Journal of Metadata, Semantics and Ontologies (IJMSO), 4(4), pp 261-276.

● [ICU] ICU Project Management Committee. 7 October 2015. ICU 56. ICU — International Components for Unicode.

(References)

Technology

Linguistic markup and processing of transclusion in XML documents (Notes)