DITA, XLIFF and Translation: Truths, Myths and Misconceptions

  • 1 1

    DITA, XLIFF and Translation: Truths, Myths and Misconceptions

    www.oasis-open.org

    Andrzej Zydroń MBCS CITP

    CTO, XML-INTL

    Leaders in Translation Technology

    September 2008

    azydron@xml-intl.com

  • 1 2

    DITA is cheaper to translate: Truths

    Separation of content and format

    Component based architecture

    You only translate changed topics

  • 1 3

    Translation is the main cost of localization: Myths

  • 1 4

    DITA Granularity: a double edged sword

    DITA without a CMS: you must be NUTS!

    You DO NOT NEED a native XML CMS

    Links, links, links, links...

    Translate topics as soon as they are available

    Increased project management costs

    Necessitates web services based exchange

    Need to establish a long term relationship with Localization Service Providers

  • 1 5

    DITA Translation Pitfalls

    DITA comes ready packed with some very dangerous options

    Translatable acronyms

    Beware the CONREF, for it may TRIPLE your translation costs

    Specialize if you dare

    DITA Translation TC Best Practices

  • 1 6

    DITA + XLIFF: Only part of the picture

    XML 1.0

    Unicode 5.0

    XML Vocabulary, e.g. DITA

    xml:tm

    Author Memory

    Translation Memory

    SRX

    GMX

    W3C ITS

    Unicode TR29

    XLIFF

    TMX

  • 1 7

    OAXAL: OASIS Reference Architecture TC

    xml:tm

    Unicode TR 29

    SRX

    W3C ITS

    GMX-V

    DITA/XML

    TMX

    XLIFF

  • 1 8

    OAXAL TC

    http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=oaxal

  • 1 9

    DocZone.com example

    How does it work?

    Write/edit XML content

    Create graphics

    Store in DocZone

    Link with XML content

    Check content into DocZone server

    Collaborative review and approval

    Pre-translate content and alert translators

    Vendor localizes XML content with DocZone translation tool

    Publish to all output formats for all markets (PDF, HTML)

  • 1 10

    OAXAL: Why is DITA + XLIFF not enough?

    Process Automation

    50% of translation costs are process management

    Matching

    Strange commercial model for translation companies

    Automation, automation, automation

  • 1 11

    DITA + OAXAL putting it together:

    DITA/XML+

    xml:tm

    Unicode TR 29

    SRX

    W3C ITS

    DITA/XML

  • 1 12

    xml:tm namespace

    Example of the use of tm namespace in an XML document:

    Namespace is very flexible.

    It is very easy to use.
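    The example image on this slide did not survive extraction. As a minimal sketch of the idea, the Python below wraps the text of each leaf element in text element (te) and text unit (tu) markup from a tm namespace. The element names te and tu come from the diagrams in this deck; the namespace URI, the ID scheme and the naive sentence splitter are assumptions made for illustration, and a real implementation would segment with SRX rules rather than a regular expression.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URI; check the xml:tm specification before relying on it.
TM_NS = "urn:xml-intl-tm"
ET.register_namespace("tm", TM_NS)


def add_tm_markup(root):
    """Wrap the text of each leaf element in tm:te / tm:tu elements.

    Hypothetical helper: mixed content and inline markup are ignored.
    """
    te_count = 0
    for elem in list(root.iter()):  # snapshot, since children are added below
        text = (elem.text or "").strip()
        if len(elem) or not text:
            continue  # skip container elements and empty elements
        te_count += 1
        elem.text = None
        te = ET.SubElement(elem, f"{{{TM_NS}}}te", id=f"e{te_count}")
        # Naive sentence split standing in for SRX-based segmentation.
        sentences = re.split(r"(?<=[.!?])\s+", text)
        for n, sentence in enumerate(sentences, start=1):
            tu = ET.SubElement(te, f"{{{TM_NS}}}tu", id=f"u{te_count}.{n}")
            tu.text = sentence
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        "<doc><title>Intro</title>"
        "<para>Namespace is very flexible. It is very easy to use.</para></doc>"
    )
    print(ET.tostring(add_tm_markup(doc), encoding="unicode"))
```

    The original element structure is left untouched; the tm markup only adds a layer inside it, which is what lets the same file serve as both the document and its own memory.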

  • 1 13

    xml:tm namespace

    [Diagram: the same source document shown in two parallel views.]

    Source document view: doc > title, section > para > text

    Source document tm namespace view: tm > te > tu > sentence

  • 1 14

    xml:tm namespace

    Author memory

    Maintain memory of source text

    Authoring statistics

    Authoring tool input

    Translation memory

    Automatic alignment

    Maintain exact link of source and target text

    Reduce translation costs

  • 1 15

    xml:tm differencing

    Original Source Document: tu id=1, tu id=2, tu id=3, tu id=4, tu id=5, tu id=6

    DOM Differencing

    Updated Source Document: tu id=1, tu id=2 (deleted), tu id=3, tu id=4, tu id=7 (modified), tu id=6, tu id=8 (new)
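    The differencing step can be sketched in a few lines of Python. This is not the namespace aware DOM differencing the deck refers to: it substitutes a plain sequence comparison from the standard difflib module over a flat list of text units, and the function name and tuple layout are hypothetical. It does show the key behaviour: unchanged units keep their IDs, modified and new units get fresh IDs, and a modified unit remembers the ID it came from.

```python
from difflib import SequenceMatcher


def diff_text_units(old_units, new_texts):
    """Compare two versions of a document's text units.

    old_units: list of (id, text) from the previous version.
    new_texts: list of text strings from the updated version.
    Returns (result, deleted_ids), where result is a list of
    (id, text, status, origid) tuples in updated document order.
    """
    old_texts = [text for _, text in old_units]
    next_id = max((uid for uid, _ in old_units), default=0) + 1
    result, deleted = [], []
    matcher = SequenceMatcher(a=old_texts, b=new_texts, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged units keep their identifiers.
            for uid, text in old_units[i1:i2]:
                result.append((uid, text, "unchanged", None))
        elif tag == "delete":
            deleted.extend(uid for uid, _ in old_units[i1:i2])
        else:  # "replace" or "insert"
            olds = old_units[i1:i2]
            news = new_texts[j1:j2]
            for k, text in enumerate(news):
                origid = olds[k][0] if k < len(olds) else None
                status = "modified" if origid is not None else "new"
                result.append((next_id, text, status, origid))
                next_id += 1
            # Old units with no counterpart in the replacement are deleted.
            deleted.extend(uid for uid, _ in olds[len(news):])
    return result, deleted


if __name__ == "__main__":
    old = [(1, "A."), (2, "B."), (3, "C."), (4, "D."), (5, "E."), (6, "F.")]
    new = ["A.", "C.", "D.", "E, revised.", "F.", "G."]
    for row in diff_text_units(old, new)[0]:
        print(row)
```

    With the six-unit example from the slide, unit 2 is reported as deleted, unit 5 reappears as modified unit 7, and the appended sentence becomes new unit 8.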

  • 1 16

    xml:tm author memory

    Namespace aware DOM differencing

    Identify changes from the previous version

    Unique text unit identifiers are maintained

    Modification history

    Text units can be loaded into a database

    Authoring environment integration

  • 1 18

  • 1 19

    XLIFF + xml:tm:

    DITA/XML+

    xml:tm

    GMX/V

    XLIFF

  • 1 20

    DITA/OAXAL to XLIFF

    Original Source Document: tu id=1, tu id=2, tu id=3, tu id=4, tu id=5, tu id=6

    XLIFF File: Trans-unit id=1, Trans-unit id=2, Trans-unit id=3, Trans-unit id=4, Trans-unit id=5, Trans-unit id=6

    Translated Target Document: tu id=1, tu id=2, tu id=3, tu id=4, tu id=5, tu id=6
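    The one-to-one mapping between text units and XLIFF trans-units can be illustrated with a short Python sketch. It builds a bare XLIFF 1.2 skeleton in which each trans-unit carries the ID of the text unit it came from, so translated targets can be matched back by ID. The helper names, the file name and the choice of attributes are assumptions for illustration; the deck does not show the actual extraction code.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)


def build_xliff(units, source_lang, target_lang, original):
    """Create an XLIFF 1.2 tree with one trans-unit per (id, text) pair."""
    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", version="1.2")
    file_el = ET.SubElement(
        xliff,
        f"{{{XLIFF_NS}}}file",
        {
            "original": original,
            "source-language": source_lang,
            "target-language": target_lang,
            "datatype": "xml",
        },
    )
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for uid, text in units:
        # The trans-unit id mirrors the text unit id.
        tu = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", id=str(uid))
        ET.SubElement(tu, f"{{{XLIFF_NS}}}source").text = text
    return xliff


def collect_targets(xliff):
    """Return {trans-unit id: target text} for every translated unit."""
    targets = {}
    for tu in xliff.iter(f"{{{XLIFF_NS}}}trans-unit"):
        target = tu.find(f"{{{XLIFF_NS}}}target")
        if target is not None and target.text:
            targets[tu.get("id")] = target.text
    return targets


if __name__ == "__main__":
    tree = build_xliff([(1, "Open the lid."), (2, "Press start.")],
                       "en", "pl", "topic.dita")  # hypothetical file name
    print(ET.tostring(tree, encoding="unicode"))
```

    Because the IDs survive the round trip, merging the translation back into the target document is a lookup by ID rather than a text alignment problem.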

  • 1 21

    xml:tm exact matching

    Updated Source Document: tu id=1, tu id=2 (deleted), tu id=3, tu id=4, tu id=7 (modified), tu id=6, tu id=8 (new)

    Exact Matching

    Matched Target Document:

    tu id=1: Exact match

    tu id=3: Exact match

    tu id=4: Exact match

    tu id=7: requires translation

    tu id=6: Exact match

    tu id=8: requires translation
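    Exact matching as shown in this diagram reduces to a lookup by text unit ID: a unit whose ID is unchanged since the previous version reuses its existing translation, and everything else is queued for translation. The Python below is a minimal sketch under that assumption; the function name, data shapes and placeholder target strings are hypothetical.

```python
def exact_match(updated_source, previous_target):
    """Pair each updated source unit with a reusable translation, if any.

    updated_source: list of (id, source_text) in document order.
    previous_target: {id: target_text} from the last translated version.
    Returns a list of (id, text, status) tuples.
    """
    matched = []
    for uid, text in updated_source:
        if uid in previous_target:
            # Same ID means the unit was not touched: reuse the translation.
            matched.append((uid, previous_target[uid], "exact match"))
        else:
            # New or modified units carry fresh IDs and have no translation.
            matched.append((uid, text, "requires translation"))
    return matched


if __name__ == "__main__":
    previous = {n: f"target {n}" for n in range(1, 7)}
    updated = [(1, "s1"), (3, "s3"), (4, "s4"), (7, "s7"), (6, "s6"), (8, "s8")]
    for row in exact_match(updated, previous):
        print(row)
```

    Run on the slide's example, units 1, 3, 4 and 6 come back as exact matches and units 7 and 8 require translation.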

  • 1 22

    xml:tm matching

    Updated Source Document: tu id=1, tu id=2 (non trans), tu id=3, tu id=4, tu id=7 (modified), tu id=6, tu id=8 (new: same), tu id=9

    Matched Target Document:

    tu id=1: Exact match

    tu id=2: non translatable, requires no translation

    tu id=3: Exact match

    tu id=4: Exact match

    tu id=7: fuzzy match origid="5", requires translation

    tu id=6: Exact match

    tu id=8: doc leveraged match, requires proofing

    tu id=9: DB leveraged match (DB), requires proofing
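    The match types named on this slide can be read as a cascade that is tried in order for each text unit. The Python sketch below is one possible reading, not the deck's implementation: the order of the checks, the "contains no letters" test for non-translatable text, the 0.7 similarity threshold and the use of difflib for fuzzy comparison are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def classify(unit, prev_source, prev_target, tm_db, fuzzy_threshold=0.7):
    """Return (match type, suggested translation or None) for one text unit.

    unit: dict with "id", "text" and optional "origid".
    prev_source / prev_target: {id: text} for the previous version.
    tm_db: {source text: target text}, standing in for a TM database.
    """
    uid, text, origid = unit["id"], unit["text"], unit.get("origid")
    if not any(ch.isalpha() for ch in text):
        return "non translatable", text  # e.g. numbers or punctuation only
    if uid in prev_target and prev_source.get(uid) == text:
        return "exact match", prev_target[uid]
    for old_id, old_text in prev_source.items():
        if old_text == text and old_id in prev_target:
            return "doc leveraged match", prev_target[old_id]
    if text in tm_db:
        return "DB leveraged match", tm_db[text]
    if origid in prev_source and origid in prev_target:
        ratio = SequenceMatcher(None, prev_source[origid], text).ratio()
        if ratio >= fuzzy_threshold:
            return "fuzzy match", prev_target[origid]
    return "requires translation", None


if __name__ == "__main__":
    prev_source = {1: "Open the lid.", 3: "Press start.", 5: "Wait ten minutes."}
    prev_target = {1: "Otwórz pokrywę.", 3: "Naciśnij start.",
                   5: "Odczekaj dziesięć minut."}
    tm_db = {"Close the lid.": "Zamknij pokrywę."}
    unit = {"id": 7, "text": "Wait twenty minutes.", "origid": 5}
    print(classify(unit, prev_source, prev_target, tm_db))
```

    Only the exact match is trusted without review; in the slide's terms, leveraged matches go to proofing and fuzzy matches go to the translator with the earlier translation as a starting point.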

  • 1 23

    xml:tm translated document

    [Diagram: the translated document shown in the same two parallel views; the example target text is Polish.]

    Translated document view: doc > title, section > para > tekst

    Translated document tm namespace view: tm > te > tu > zdanie

  • 1 24

    Translation without OAXAL:

    source text -> extract -> extracted text -> tm process -> prepared text -> translate -> translated text -> merge -> target text -> QA

  • 1 25

    OAXAL in action

    Automated Workflow: xml:tm source text -> extract -> extracted text -> tm process (exact matching, leveraged matching) -> XLIFF file

    Internet: translate (web browser), QA (web browser)

    Automated Workflow: merge -> xml:tm target text

  • 1 26

  • 1 27

    Normal DITA document

  • 1 28

    DITA Document with xml:tm namespace

  • 1 29

    xml:tm version encoded

    DITA Document with xml:tm namespace embedded as a Base64 encoded Processing Instruction
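    The slide image is not available, so the exact layout of the processing instruction is unknown. Purely as an illustration of the idea, the Python below packs a string of tm markup into a Base64 payload inside a processing instruction and unpacks it again. The PI target name xml-tm and the payload layout are hypothetical and are not taken from the xml:tm specification.

```python
import base64

PI_TARGET = "xml-tm"  # hypothetical processing instruction target


def encode_pi(tm_markup: str) -> str:
    """Wrap tm markup in a Base64 encoded processing instruction."""
    payload = base64.b64encode(tm_markup.encode("utf-8")).decode("ascii")
    return f"<?{PI_TARGET} {payload}?>"


def decode_pi(pi: str) -> str:
    """Recover the tm markup from a PI produced by encode_pi."""
    prefix, suffix = f"<?{PI_TARGET} ", "?>"
    if not (pi.startswith(prefix) and pi.endswith(suffix)):
        raise ValueError("not an xml-tm processing instruction")
    return base64.b64decode(pi[len(prefix):-len(suffix)]).decode("utf-8")


if __name__ == "__main__":
    markup = '<tm:tu id="u1.1">Zdanie po polsku.</tm:tu>'
    pi = encode_pi(markup)
    print(pi)
    print(decode_pi(pi))
```

    The appeal of this form is that the document stays valid against the unmodified DITA DTD or schema, since processing instructions are ignored by validators and by tools that do not know about them.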

  • 1 30

    XLIFF File version after matching

  • 1 31

    Contact Details

    Postal address: PO Box 2167, Gerrards Cross, Bucks SL9 8XF, United Kingdom

    Phone: +44 1753 480 467

    Fax: +44 1753 480 465

    Andrzej Zydroń

    azydron@xml-intl.com