1– 1 DITA, XLIFF and Translation: Truths, Myths and Misconceptions www.oasis-open.org Andrzej Zydroń MBCS CITP CTO XML-INTL Leaders in Translation Technology September 2008 [email protected]

DITA, XLIFF and Translation: Truths, Myths and Misconceptions


Page 1: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA, XLIFF and Translation: Truths, Myths and Misconceptions

www.oasis-open.org

Andrzej Zydroń MBCS CITP

CTO

XML-INTL

Leaders in Translation Technology

September 2008

[email protected]

Page 2: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA is cheaper to translate: Truths

Separation of content and format

Component based architecture

You only translate changed topics

Page 3: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Translation is the main cost of localization: Myths

Page 4: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA Granularity: a double-edged sword

DITA without a CMS: you must be NUTS!

You DO NOT NEED a native XML CMS

Links, links, links, links…

Translate topics as soon as they are available

Increased project management costs

Necessitates web-services-based exchange

Need to establish long-term relationships with Localization Service Providers

Page 5: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA Translation Pitfalls

DITA comes ready packed with some very dangerous options

Translatable acronyms

Beware the CONREF for it may TRIPLE your translation costs

Specialize if you dare

DITA Translation TC Best Practices

Page 6: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA + XLIFF: Only part of the picture

XML 1.0

Unicode 5.0

XML Vocabulary, e.g. DITA

xml:tm

Author Memory

Translation Memory

SRX

GMX

W3C ITS

Unicode TR29

XLIFF

TMX

Page 7: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

OAXAL: OASIS Reference Architecture TC

xml:tm

Unicode TR 29

SRX

W3C ITS

GMX-V

DITA/XML

TMX

XLIFF

Page 8: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

OAXAL TC

http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=oaxal

Page 9: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

How does it work?

DocZone.com example

Write/edit XML content

Create graphics

Store in DocZone

Link with XML content

Check content into DocZone server

Collaborative review and approval

Pre-translate content and alert translators

Vendor localizes XML content with DocZone translation tool

Publish to all output formats for all markets (PDF, HTML)

Page 10: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

OAXAL: Why is DITA + XLIFF not enough?

Process Automation

50% of translation costs are process management

Matching

Strange commercial model for translation companies

Automation, automation, automation

Page 11: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA + OAXAL putting it together:

DITA/XML+

xml:tm

Unicode TR 29

SRX

W3C ITS

DITA/XML

Page 12: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm namespace

Example of the use of tm namespace in an XML document:

<document xmlns:tm="urn:xml-intl-tm" te="9">
  <tm:tm>
    <section>
      <para>
        <tm:te id="e1">
          <tm:tu id="u1.1">Namespace is very flexible.</tm:tu>
          <tm:tu id="u1.2">It is very easy to use.</tm:tu>
        </tm:te>
      </para>
    </section>
  </tm:tm>
</document>
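
A minimal sketch of reading the text units back out of a document marked up this way, using Python's standard `xml.etree.ElementTree`. The namespace URI and element names come from the slide example; the closing tags in the sample and the helper name `text_units` are assumptions added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

TM_NS = "urn:xml-intl-tm"  # namespace URI as shown in the slide example

# The slide fragment, closed off (assumption) so it is well-formed XML.
SAMPLE = """<document xmlns:tm="urn:xml-intl-tm" te="9">
  <tm:tm>
    <section>
      <para>
        <tm:te id="e1">
          <tm:tu id="u1.1">Namespace is very flexible.</tm:tu>
          <tm:tu id="u1.2">It is very easy to use.</tm:tu>
        </tm:te>
      </para>
    </section>
  </tm:tm>
</document>"""


def text_units(xml_text):
    """Return (id, text) pairs for every tm:tu element, in document order."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter(f"{{{TM_NS}}}tu"):
        # Collapse line breaks and indentation inside the unit to single spaces.
        text = " ".join("".join(tu.itertext()).split())
        units.append((tu.get("id"), text))
    return units


units = text_units(SAMPLE)
for unit_id, text in units:
    print(unit_id, text)
```

Because the tm elements live in their own namespace, a tool that only understands the host vocabulary can ignore them, while a translation tool can address every sentence by its identifier.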

Page 13: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm namespace

[Figure: two views of the same document. The source document view shows the doc tree (title, section, para, text). The source document tm namespace view shows the same tree under tm, with each text element (te) divided into text units (tu), one per sentence.]

Page 14: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm namespace

Author memory: maintain memory of source text, authoring statistics, authoring tool input

Translation memory: automatic alignment, maintain exact link of source and target text, reduce translation costs

Page 15: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm differencing

Original Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="5", tu id="6"

DOM Differencing

Updated Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8"

Change labels shown in the diagram: deleted, modified, new
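
The differencing step can be sketched roughly as follows. This is not the xml:tm algorithm itself, which works on the DOM; it is a simplified illustration over flat lists of sentences, assuming that unchanged text keeps its identifier, that changed or added text receives a fresh identifier, and that "modified" versus "new" is decided by a similarity threshold. The function name, the threshold of 0.75 and the sample sentences are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def diff_text_units(original, updated_texts, threshold=0.75):
    """Classify the text units of an updated document.

    original:      dict of unit id -> text from the previous version
    updated_texts: list of unit texts in the updated version, in order
    Returns (units, deleted_ids); units holds (id, status, text) tuples.
    Simplification: identical sentences occurring twice are not told apart.
    """
    by_text = {text: uid for uid, text in original.items()}
    kept = {by_text[t] for t in updated_texts if t in by_text}
    # Original units whose exact text no longer appears anywhere.
    leftovers = {uid: t for uid, t in original.items() if uid not in kept}
    next_id = max(original, default=0) + 1

    units = []
    for text in updated_texts:
        if text in by_text:
            units.append((by_text[text], "unchanged", text))
            continue
        # Closest leftover unit decides between "modified" and "new".
        best = max(leftovers, key=lambda u: similarity(leftovers[u], text),
                   default=None)
        if best is not None and similarity(leftovers[best], text) >= threshold:
            del leftovers[best]
            units.append((next_id, "modified", text))
        else:
            units.append((next_id, "new", text))
        next_id += 1
    return units, sorted(leftovers)


original = {
    1: "Open the cover.",
    2: "Remove the old filter.",
    3: "Insert the new filter.",
    4: "Close the cover.",
    5: "Press the reset button for two seconds.",
    6: "The indicator turns green.",
}
updated = [
    "Open the cover.",
    "Insert the new filter.",
    "Close the cover.",
    "Press the reset button for five seconds.",
    "The indicator turns green.",
    "Record the date in the service log.",
]
units, deleted = diff_text_units(original, updated)
for unit in units:
    print(unit)
print("deleted:", deleted)
```

The sample data mirrors the slide: units 1, 3, 4 and 6 keep their identifiers, the edited sentence becomes unit 7, the added sentence becomes unit 8, and unit 2 is reported as deleted.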

Page 16: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm author memory

Namespace aware DOM differencing

Identify changes from the previous version

Unique text unit identifiers are maintained

Modification history

Text units can be loaded into a database

Authoring environment integration
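
As an illustration of loading text units into a database and keeping a modification history, here is a small sketch using Python's built-in `sqlite3`. The table layout, the `origid` column (the identifier a modified unit was derived from) and the function names are assumptions made for this example, not part of xml:tm.

```python
import sqlite3


def open_author_memory():
    """Create an in-memory database with a hypothetical text_unit table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE text_unit (
               unit_id INTEGER PRIMARY KEY,
               origid  INTEGER,
               text    TEXT NOT NULL
           )"""
    )
    return conn


def load_units(conn, units):
    """Insert (unit_id, origid, text) rows; origid is None for brand-new units."""
    conn.executemany(
        "INSERT INTO text_unit (unit_id, origid, text) VALUES (?, ?, ?)", units
    )
    conn.commit()


def history(conn, unit_id):
    """Follow origid links backwards, newest version first."""
    chain = []
    while unit_id is not None:
        row = conn.execute(
            "SELECT origid, text FROM text_unit WHERE unit_id = ?", (unit_id,)
        ).fetchone()
        if row is None:
            break
        chain.append((unit_id, row[1]))
        unit_id = row[0]
    return chain


conn = open_author_memory()
load_units(conn, [
    (5, None, "Press the reset button for two seconds."),
    (7, 5, "Press the reset button for five seconds."),
])
trail = history(conn, 7)
print(trail)
```

An authoring tool could query the same table to suggest existing sentences to writers, which is one way to read "authoring environment integration".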

Page 17: DITA, XLIFF and Translation: Truths, Myths and Misconceptions


Page 18: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Page 19: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

XLIFF + xml:tm:

DITA/XML+

xml:tm

GMX-V

XLIFF

Page 20: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA/OAXAL to XLIFF

Original Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="5", tu id="6"

XLIFF File: trans-unit id="1", trans-unit id="2", trans-unit id="3", trans-unit id="4", trans-unit id="5", trans-unit id="6"

Translated Target Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="5", tu id="6"
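
The one-to-one mapping from text units to XLIFF trans-units can be sketched as below. The element and attribute names follow XLIFF 1.2; the function name, the file name `topic.dita`, the language codes and the sample sentences are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(units, original, source_lang="en", target_lang="pl"):
    """Build an XLIFF 1.2 document with one trans-unit per text unit.

    units: list of (unit_id, source_text) pairs.
    """
    ET.register_namespace("", XLIFF_NS)  # serialize with a default namespace
    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(xliff, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "xml",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, text in units:
        # The trans-unit id reuses the tu id, so targets can be merged back.
        tu = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit",
                           {"id": str(unit_id)})
        ET.SubElement(tu, f"{{{XLIFF_NS}}}source").text = text
    return ET.tostring(xliff, encoding="unicode")


xliff_text = build_xliff(
    [(1, "Open the cover."), (2, "Remove the old filter."),
     (3, "Insert the new filter.")],
    original="topic.dita",
)
print(xliff_text)
```

Keeping the identifiers identical on both sides is what allows the translated text to be merged back into the right place without re-aligning source and target.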

Page 21: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm exact matching

Updated Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8"

Change labels shown in the diagram: deleted, modified, new

Exact Matching

Matched Target Document: tu id="1", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8"

Exact match: tu id="1", tu id="3", tu id="4", tu id="6"

Requires translation: tu id="7", tu id="8"
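
Exact matching by identifier amounts to a lookup. In this sketch, which assumes the previous translation is available as a mapping from tu id to target text, any unit whose identifier survived differencing reuses its translation and everything else is flagged for translation. The function name and the placeholder target strings are hypothetical.

```python
def exact_match(updated_ids, previous_target):
    """Pair each unit of the updated source with its earlier translation.

    updated_ids:     unit ids of the updated source document, in order
    previous_target: dict of unit id -> target text from the last translation
    Returns (id, status, target_text) tuples; target_text is None when
    the unit still has to be translated.
    """
    matched = []
    for unit_id in updated_ids:
        if unit_id in previous_target:
            matched.append((unit_id, "exact match", previous_target[unit_id]))
        else:
            matched.append((unit_id, "requires translation", None))
    return matched


# Placeholder targets stand in for real translations of units 1 to 6.
previous_target = {i: f"<target text of tu {i}>" for i in range(1, 7)}
updated_ids = [1, 3, 4, 7, 6, 8]  # unit ids of the matched target on the slide

result = exact_match(updated_ids, previous_target)
for row in result:
    print(row)
```

Because the match is made on the identifier rather than on the text alone, the reused translation comes from the same position in the same document, so it does not need to be re-checked in context.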

Page 22: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm matching

Updated Source Document: tu id="1", tu id="2" (non trans), tu id="3", tu id="4", tu id="7", tu id="6", tu id="8" (new:same), tu id="9"

Change labels shown in the diagram: modified, new:same, non trans

Matched Target Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8", tu id="9"

Match types shown in the diagram:

Exact match (four units)

Fuzzy match, origid="5"

Doc leveraged match

DB leveraged match (tu id="9", from the DB)

Non translatable (tu id="2", requires no translation)

Remaining status labels: requires translation, requires proofing
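
The match types on this slide can be read as a cascade. The sketch below is one possible ordering, not the documented xml:tm behaviour: non translatable text first, then exact match by identifier, then a match on identical text elsewhere in the previous document, then a translation memory database hit, then a fuzzy match that records the original identifier, and finally "requires translation". The function name, the letter-based test for non translatable text, the 0.75 threshold and the sample data are all assumptions.

```python
from difflib import SequenceMatcher


def classify(unit_id, text, prev_doc, tm_db, threshold=0.75):
    """Return (match_type, suggested_target, origid) for one text unit.

    prev_doc: dict of unit id -> (source, target) from the previous version
    tm_db:    dict of source text -> target text (translation memory database)
    """
    if not any(ch.isalpha() for ch in text):
        return ("non translatable", None, None)  # e.g. part numbers
    if unit_id in prev_doc and prev_doc[unit_id][0] == text:
        return ("exact match", prev_doc[unit_id][1], None)
    for source, target in prev_doc.values():
        if source == text:  # same text elsewhere in the document
            return ("doc leveraged match", target, None)
    if text in tm_db:
        return ("DB leveraged match", tm_db[text], None)
    best_id, best_ratio = None, 0.0
    for old_id, (source, _) in prev_doc.items():
        ratio = SequenceMatcher(None, source, text).ratio()
        if ratio > best_ratio:
            best_id, best_ratio = old_id, ratio
    if best_id is not None and best_ratio >= threshold:
        return ("fuzzy match", prev_doc[best_id][1], best_id)
    return ("requires translation", None, None)


prev_doc = {
    1: ("Open the cover.", "<target 1>"),
    5: ("Press the reset button for two seconds.", "<target 5>"),
    6: ("Close the cover.", "<target 6>"),
}
tm_db = {"Switch off the power.": "<target from DB>"}
updated = [
    (1, "Open the cover."),
    (2, "4711-20"),
    (7, "Press the reset button for five seconds."),
    (8, "Close the cover."),
    (9, "Switch off the power."),
    (10, "Record the date in the service log."),
]
results = {uid: classify(uid, text, prev_doc, tm_db) for uid, text in updated}
for uid, outcome in results.items():
    print(uid, outcome)
```

Only the last category costs a full translation; leveraged and fuzzy matches go to the translator as suggestions to proof or adapt, which is where most of the saving comes from.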

Page 23: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm translated document

[Figure: two views of the translated document. The translated document view shows the same doc tree (title, section, para) with translated text ("tekst"). The translated document tm namespace view shows the same te and tu structure as the source, with each text unit now holding a translated sentence ("zdanie").]

Page 24: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Translation without OAXAL:

source text → extract → extracted text → tm process → prepared text → translate → translated text → merge → target text → QA

Page 25: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

OAXAL in action

xml:tm source text → extract → extracted text → tm process → XLIFF file → translate → merge → xml:tm target text → QA

Matching: exact matching, leveraged matching

Translation and QA: web browser, over the Internet

Automated Workflow

Page 26: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Page 27: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Normal DITA document

Page 28: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

DITA Document with xml:tm namespace

Page 29: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

xml:tm version encoded

DITA Document with xml:tm namespace embedded as a Base64 encoded Processing Instruction
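
A rough sketch of the idea: the xml:tm version of the document is Base64 encoded and carried inside a processing instruction, so the DITA file itself stays free of tm markup. The processing-instruction target `tm-encoded`, its placement after the root element, the function names and the sample topic are assumptions for illustration; the slide does not show the actual format.

```python
import base64
import re
import xml.etree.ElementTree as ET

PI_TARGET = "tm-encoded"  # hypothetical processing-instruction target


def embed(dita_text, tm_text):
    """Append the xml:tm version to the DITA document as a Base64 PI."""
    payload = base64.b64encode(tm_text.encode("utf-8")).decode("ascii")
    # The Base64 alphabet cannot contain "?>", so the PI is never cut short.
    return f"{dita_text}\n<?{PI_TARGET} {payload}?>"


def extract(document_text):
    """Recover the xml:tm version, or None if no such PI is present."""
    match = re.search(
        r"<\?" + re.escape(PI_TARGET) + r"\s+([A-Za-z0-9+/=]+)\?>",
        document_text,
    )
    if match is None:
        return None
    return base64.b64decode(match.group(1)).decode("utf-8")


dita = '<topic id="t1"><title>Filter</title><body><p>Open the cover.</p></body></topic>'
tm_version = (
    '<topic xmlns:tm="urn:xml-intl-tm" id="t1"><tm:tm><title>Filter</title>'
    '<body><p><tm:te id="e1"><tm:tu id="u1.1">Open the cover.</tm:tu>'
    '</tm:te></p></body></tm:tm></topic>'
)

combined = embed(dita, tm_version)
restored = extract(combined)
print(combined)
```

Tools that do not know about the processing instruction ignore it, so the file can still be handled as ordinary DITA, while an xml:tm aware tool can decode the payload and recover the text unit identifiers.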

Page 30: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

XLIFF File version after matching

Page 31: DITA, XLIFF and Translation: Truths, Myths and Misconceptions

Contact Details

Postal address: PO Box 2167, Gerrards Cross, Bucks SL9 8XF, United Kingdom

Phone: +44 1753 480 467

Fax: +44 1753 480 465

Andrzej Zydroń: [email protected]