Can I Have a Word: Managing Shared Glossaries and References to Terms With DITA Eliot Kimber Contrext Tekom 2017

Can I Have a Word: Managing Shared Glossaries and References to Terms With DITA

Eliot Kimber, Contrext

Tekom 2017

About the Author

• Independent consultant focusing on DITA analysis, design, and implementation

• Doing SGML and XML for cough 30 years cough

• Founding member of the DITA Technical Committee

• Founding member of the XML Working Group

• Co-editor of HyTime standard (ISO/IEC 10744)

• Primary developer and founder of the DITA for Publishers project

• Author of DITA for Practitioners, Vol 1 (XML Press)

Agenda

• DITA glossary markup

• Glossary challenges

• Managing and using glossary entries

• Glossary processing

INTRODUCTION TO GLOSSARIES

Glossary is…

• Terms and their definitions

• For presentation to readers

• May include definitions of acronyms and abbreviations

• May include lexicographic details: part of speech, etc.

• Source for use-by-reference of <term> elements in content

Glossary is not…

• Formal term list as used in terminology management tools like Congree or Acrolinx

– Terminology management is a separate concern from glossary authoring and presentation

General Requirements

• Provide glossary of terms in publications

• Get terms by reference in content (mentions of terms)

• Links from uses of terms to their glossary entries

• Show expansions of acronyms and abbreviations on first use

• Reuse glossary entries in multiple publications

• Publish master glossary with links to it from other publications

GLOSSARY MARKUP

<glossentry>

• Topic type for glossary entries

• Captures:

– Term

– Definition

– Abbreviated forms

– Parts of speech

– Surface form

– Other details
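
A minimal sketch of a <glossentry> topic that captures several of the items listed above. The element names come from the DITA 1.3 glossary entry vocabulary; the id, term, and definition are assumptions made up for this illustration and are not taken from the sample files.

```xml
<!-- Illustrative entry: the id and wording are assumptions for this example -->
<glossentry id="gloss-abs" xml:lang="en">
  <!-- Term -->
  <glossterm>anti-lock braking system</glossterm>
  <!-- Definition -->
  <glossdef>A braking system that keeps the wheels from locking during hard braking.</glossdef>
  <glossBody>
    <!-- Part of speech -->
    <glossPartOfSpeech value="noun"/>
    <!-- Surface form: the text to show when the term is introduced -->
    <glossSurfaceForm>anti-lock braking system (ABS)</glossSurfaceForm>
    <!-- Abbreviated form -->
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
```

Only <glossterm> is required; the definition and everything inside <glossBody> can be added as the entry matures.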

<glossgroup>

• Topic type for grouping glossary entries together into one source document

• Allows nested <glossentry> elements
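
A small sketch of a <glossgroup> document holding two entries. The ids, terms, and definitions are placeholders invented for illustration.

```xml
<!-- Illustrative group document: ids and content are assumptions -->
<glossgroup id="glossary-a">
  <title>A</title>
  <glossentry id="gloss-apple">
    <glossterm>apple</glossterm>
    <glossdef>The edible fruit of the apple tree.</glossdef>
  </glossentry>
  <glossentry id="gloss-apricot">
    <glossterm>apricot</glossterm>
    <glossdef>A small orange stone fruit.</glossdef>
  </glossentry>
</glossgroup>
```

With this layout, a key for an individual entry has to address the nested topic by id, for example href="glossary-a.dita#gloss-apple" (the file name is hypothetical).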

<glossref>

• Topicref type for referring to glossary topics

• DO NOT USE

• Sets @toc to “no”

• Sets @print to “no”

– Nobody knows why

• Requires @keys attribute

<abbreviated-form>

• Reference to a glossary entry

– Specialization of <term>

• Intended to produce abbreviation and expansion on “first use”

• Produces just abbreviation on other occurrences

• Challenge: When is a use the “first use”?
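
A sketch of how <abbreviated-form> might appear in content. It assumes the key gloss-abs (a made-up name) is bound to a glossary entry that provides a <glossSurfaceForm> and a <glossAcronym>; the rendering described in the comments is the intended behavior, and what a given processor actually produces depends on how it decides "first use".

```xml
<!-- Assumes key "gloss-abs" resolves to a glossentry with
     <glossSurfaceForm>anti-lock braking system (ABS)</glossSurfaceForm>
     and <glossAcronym>ABS</glossAcronym> -->
<p>The <abbreviated-form keyref="gloss-abs"/> prevents wheel lock-up.
  <!-- intended on first use: "anti-lock braking system (ABS)" -->
  When the <abbreviated-form keyref="gloss-abs"/> engages, the pedal may pulse.
  <!-- intended on later uses: "ABS" -->
</p>
```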

<term>

• Can use @keyref to use a glossary term by reference

• Reflects the term if no local content

• Should be a link to the glossary entry

• Example:

<p>The <term keyref="gloss-framitz"/> …</p>

<sort-as>

• Can be used in topic prolog to provide sorting key

– Often required for Japanese

– May be required for Simplified Chinese

– Other languages, terms with special characters, etc.
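
A minimal sketch of <sort-as> in the prolog of a Japanese glossary entry, where the term is written in kanji and the sort key gives its kana reading. The id and the choice of term are assumptions for illustration.

```xml
<!-- Illustrative entry: the kana reading supplies the sort key for the kanji term -->
<glossentry id="gloss-kanji" xml:lang="ja">
  <glossterm>漢字</glossterm>
  <glossdef>Chinese-derived characters used in written Japanese.</glossdef>
  <prolog>
    <sort-as>かんじ</sort-as>
  </prolog>
</glossentry>
```

Whether the sort key is honored depends on the processor; it is only a hint that sorting code can use in place of the displayed term.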

MANAGING AND USING GLOSSARIES

Glossary Entries as Resources

• Manage glossentry topics as individual docs

– Typical DITA practice for topics in general

• Must have associated keys

• Challenges:

– Where to define the keys?

– Defining naming conventions for keys

Maps for Glossaries

• Glossary entries MUST be part of the publication navigation tree

• <keydef> is either not appropriate or not sufficient

– <keydef> has processing role of “resource-only”

– Does not put referenced topic in the navigation tree

• Need normal-role topicrefs to glossary entries
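
To make the contrast concrete, here is a sketch of the two kinds of reference side by side. The keys and paths are hypothetical; the two references point at different topics so that each key is defined only once.

```xml
<map>
  <title>Sample publication</title>
  <!-- Resource-only: defines the key, but the topic is NOT in the navigation tree -->
  <keydef keys="gloss-banana" href="glossary/banana-gloss.dita"/>
  <!-- Normal role: defines the key AND places the topic in the navigation tree,
       so a <term keyref="gloss-apple"/> has a published entry to link to -->
  <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
</map>
```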

Grouping Entries

• Obvious approach is to use topicheads to group entries:

<topichead>
  <topicmeta>
    <navtitle>Glossary</navtitle>
  </topicmeta>
  <topichead>
    <topicmeta>
      <navtitle>A</navtitle>
    </topicmeta>
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
    …
  </topichead>
  …
</topichead>

• Doesn’t always work the way you might expect

Topichead Chunking Rule

• @chunk="to-content" on <topichead> makes the topichead act like a reference to a title-only topic

– DITA Spec: Clause 2.4.5.1 “Using the @chunk attribute”

• Unfortunately, includes all child topics in the resulting chunk

– Probably not what you want for glossaries

– Have to specify @chunk on each subordinate topicref

– Very annoying

• Bugs in Open Toolkit as of 2.5.4 produce incorrect results in both HTML and PDF

Workaround for Grouping

• Create title-only topics for what would otherwise be topicheads

– Glossary top-level topic

– Each group

• Will need these for each language-specific group for localized glossaries

• Easy enough to generate

– Could do as extension to Open Toolkit preprocessing
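
A sketch of what the workaround could look like: a title-only topic per group, and a map branch that references those topics where topicheads would otherwise go. All file names and ids here are hypothetical.

```xml
<!-- File glossary/group-a.dita: a title-only topic standing in for a topichead -->
<topic id="group-a">
  <title>A</title>
</topic>

<!-- Map branch: title-only topics replace the topicheads -->
<topicref href="glossary/glossary-title.dita">
  <topicref href="glossary/group-a.dita">
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
  </topicref>
</topicref>
```

Because every node in the branch is now an ordinary topic reference, the topichead chunking rule no longer comes into play.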

Challenge: How to Define Glossaries in Maps?

• Two basic options:

1. Use normal-role topicrefs only

2. Use both resource-only topicrefs and normal-role topicrefs that refer to the resource-only topicrefs by key

• Depends on your reuse requirements

Map Organization Option 1: Just Normal-Role Topicrefs

• Publication map has normal topicrefs to the glossary entries

• Can have a single reusable submap

• Or can author separately for each publication

• Advantage: Keeps it simple

• Disadvantage: May have redundant or duplicate authoring in different publications

Map Organization Option 2: Keydefs + Normal Topicrefs

• Have a master map that uses <keydef> to refer to glossary entry topics

– These <keydef> keys are NOT to be used as target of <term> and <abbreviated-form> elements

– Reflects “exactly one topicref with URI reference to a given topic” policy

• In each publication:

– Grouping topicrefs

– Normal-role topicrefs with keys and keyref to <keydef> keys

• Advantage: Makes reuse easier to manage

• Disadvantages:

– Two keys where there was one

– May still have per-publication navigation structures for glossaries
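
A sketch of Option 2 under assumed names: the "-topic" key suffix, the file names, and the map titles are conventions invented for this example, not something the DITA standard prescribes.

```xml
<!-- File glossary-master.ditamap: the only topicrefs with a URI reference to each entry -->
<map>
  <title>Master glossary keys</title>
  <keydef keys="gloss-apple-topic" href="glossary/apple-gloss.dita"/>
  <keydef keys="gloss-banana-topic" href="glossary/banana-gloss.dita"/>
</map>

<!-- File publication.ditamap: normal-role topicrefs that define the keys authors use -->
<map>
  <title>Sample publication</title>
  <mapref href="glossary-master.ditamap"/>
  <topicref href="glossary/glossary-title.dita">
    <!-- "gloss-apple" is the key that <term> and <abbreviated-form> elements reference;
         it reaches the topic indirectly through the master key -->
    <topicref keys="gloss-apple" keyref="gloss-apple-topic"/>
  </topicref>
</map>
```

If an entry's file moves, only the <keydef> in the master map needs to change; each publication keeps control over which entries appear and how they are grouped.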

Master Glossaries

• Separate publication that is just the glossary

• Cross-deliverable links from other publications to glossary entries

• Cross-deliverable links are always a challenge

• DITA 1.3 provides cross-deliverable linking feature

– Probably not implemented in your tools as of November 2017

• Can use deliverable-specific topicrefs

– Requires that you know how glossary entries will be delivered

– Would expect to generate them automatically
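
A sketch of the DITA 1.3 cross-deliverable pattern as it might apply to a master glossary: a peer-scope reference to the glossary's root map, with a key scope that prefixes its keys. The paths, the scope name "glossary", and the key name are assumptions, and processor support should be verified before relying on this.

```xml
<!-- In the publication's root map: peer-scope reference to the master glossary map -->
<map>
  <title>Product guide</title>
  <mapref href="../glossary/master-glossary.ditamap"
          scope="peer"
          keyscope="glossary"/>
  <!-- ... the publication's own topicrefs ... -->
</map>

<!-- In a topic of the publication: scope-qualified key reference -->
<p>Store each <term keyref="glossary.gloss-apple"/> in a cool place.</p>
```

How such a reference becomes a working link in the output is left to the processor, since it must know where the glossary deliverable is published.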

GLOSSARY PROCESSING

Processing Challenges

• Determining “first use” for abbreviated form references

• Automatic grouping and sorting

• Producing minimum glossary for a given publication

First Use Problem

• What is the scope?

– Single topic?

– “Chapter”?

– Entire publication?

• Scope may be different for different deliverable types

• May have different editorial rules

• Difficult to have a general solution

Automated Grouping and Sorting

• Nothing in standard-defined map markup that says unambiguously “this branch of the map is a glossary”

• Need locale-specific configuration for grouping

• Need locale-specific configuration for sorting

• Simplified Chinese needs special support

– DITA Community i18n project provides necessary features

– Somebody needs to implement Open Toolkit plugin for doing glossary sorting

Generating Glossary Based on Terms Used

• Possible to generate a glossary that reflects only those terms actually used in the topics included in a publication

• Requires synthesizing normal-role topicrefs so key references will work properly

• Could be implemented as an extension to Open Toolkit preprocessing

• Could be a separate process that generates otherwise-normal map and topic components
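
As a rough illustration of the output such a process might produce, assume a publication whose topics reference only the key gloss-apple, while the master glossary defines further entries. The generated branch could then look like this; the file names and keys are hypothetical, and this shows only the shape of the result, not an Open Toolkit extension.

```xml
<!-- Hypothetical generated submap: only terms actually referenced in the publication -->
<map>
  <title>Generated glossary</title>
  <topicref href="glossary/glossary-title.dita">
    <topicref href="glossary/group-a.dita">
      <!-- Synthesized normal-role topicref, so keyref="gloss-apple" resolves
           to an entry that is part of the navigation tree -->
      <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
    </topicref>
    <!-- No topicref is synthesized for entries that no <term>
         or <abbreviated-form> in the publication references -->
  </topicref>
</map>
```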

Demo

If time permits

Questions?

Resources

• Me: [email protected]

• DITA specification: http://docs.oasis-open.org/dita/dita/v1.3/dita-v1.3-part0-overview.html

• DITA Community i18n project: https://github.com/dita-community/org.dita-community.i18n

• Sample files: https://github.com/dita-community/dita-test-cases/tree/master/glossaries/realistic-glossary/wipo-glossary
