Planning and implementing conversion of legacy files to XML/DITA compliancy
Bernard Aschwanden, www.publishingsmarter.com, @publishsmarter
Migrating to XML with FrameMaker Conversion Tables

Moving unstructured FrameMaker content to structure


Page 1: Moving unstructured FrameMaker content to structure

Planning and implementing conversion of legacy files to XML/DITA compliancy

Bernard Aschwanden
www.publishingsmarter.com
bernard@publishingsmarter.com

Migrating to XML with FrameMaker Conversion Tables

@publishsmarter

Page 2: Moving unstructured FrameMaker content to structure

The agenda

- Convert content from unstructured to structured
- EDD, conversion table, and a structured template
- Using basic examples to get you started, this session will:
  - Convert files with content such as character tags and paragraph tags
  - Add support for images and tables
  - Demo converting unstructured to structured using conversion tables
- Samples are easy to recreate, but complex and powerful in functionality

Page 3: Moving unstructured FrameMaker content to structure

Housekeeping and note taking

- Not all slides or topics are equally weighted
- Use some, discard others
- Slide speed varies as this is a QUICK session
- Questions? Ask along the way!

I’d love to claim errors/typos is on purpose… they isn’t, ain’t, and weren’t never; however, I’ll fix ‘em as I can…

Page 4: Moving unstructured FrameMaker content to structure

About your speaker

- Publishing Smarter: President
- Content strategist, publishing technologies expert, author, and geek-enough
- Certified Technical Trainer
  - DITA
  - Content management
  - Topic-based writing
- Society for Technical Communication (www.stc.org)
  - President
  - STC Associate Fellow

Page 5: Moving unstructured FrameMaker content to structure

Standard disclaimer

- In the interest of brevity I will make some blanket statements to keep it simple
- It’s not all 100% “the truth”, but I’ll stay close
- Purists may complain
  - And they are wrong! (except when they are right)

Page 6: Moving unstructured FrameMaker content to structure

Major disclaimer

- This is a quick session
- There are LOTS of samples in slides or FrameMaker
  - Simple samples
  - Still complex ideas
  - Tricky to set things up
- Happy to share files
- To review/apply this
  - Watch the recording
  - Jot down “time stamps”
    - Cool item at 17:23
    - Excel formulas 18:57
    - Word updates 26:33
  - Then watch it again
    - Pause it, rewind, try it
    - Do this at your own pace
    - Slowly test with your content

Page 7: Moving unstructured FrameMaker content to structure

Before you structure content

Page 8: Moving unstructured FrameMaker content to structure

Legacy content and document review

- Include analysis of legacy files
  - Identify what can stay and what needs to go
  - Approach with flexibility
- What structure to use?
  - Decide on the overall structural environment you want to work with
  - Could include S1000D, DocBook, and DITA
  - Can also build your own
- Develop your FrameMaker support materials
  - EDD, conversion table, and a template at the least

Page 9: Moving unstructured FrameMaker content to structure

ID a rule set

- Use existing rules
  - If rules already exist, you have a solid starting point
  - Learn the rules and adapt your content to them
- Build your own rules
  - If no rules exist, you can set your own from the start
  - Learn how to create the rules and build all the components
- Hybrid approach
  - If you see a set of rules that look promising, learn about them
  - Find out how you need to adapt your content to match the rules
  - If that does not work, then consider adapting the rules

Page 10: Moving unstructured FrameMaker content to structure

Create structure from chaos

Not chaos, but it’s at least unstructured and needs work

Page 11: Moving unstructured FrameMaker content to structure

Method 1: Manually, element by element

- Apply structural rules to your content
- Manually wrap content such as text ranges and tables
- Continue to manually wrap contents of paragraphs together in Para elements
- Then wrap sequences of Head and Para elements in Section elements
- And so on until the entire document is wrapped in a single highest-level element

Page 12: Moving unstructured FrameMaker content to structure

Method 2: Automatically

- Similar to adding structure manually
  - Apply rules to document objects below paragraph level
  - Then at paragraph level and through successively higher levels
  - Stops at the root element, or when no more rules exist
- Automatic wrapping requires a conversion table
  - Provides a table of mappings to automate the task of adding structure to unstructured documents
  - Uses paragraph and character tags, and object types (such as equations or footnotes), to identify how to wrap document components in elements
  - Also specifies how to wrap child elements in parent elements

Page 13: Moving unstructured FrameMaker content to structure

Conversion tables

Let’s dive into it

Page 14: Moving unstructured FrameMaker content to structure

Conversion Table: Overview

- Conversion table: rules for mapping content in unstructured files to structured content
- Can be split up into several tables with text or graphics in between for comments
- The document cannot contain any tables other than conversion tables
- Must be saved at least once before it can be used
  - Allows for iterative testing though
- Can be in a structured or unstructured document

Page 15: Moving unstructured FrameMaker content to structure

Conversion Table: Organization

- Organization of conversion table:
  - Regular table, with at least 3 columns and 1 body row
  - Additional columns and heading/footing rows can hold comments
  - Each body row holds 1 rule
- Column 1: specifies document object, child element, or sequence to wrap
- Column 2: specifies element in which to wrap
- Column 3: specifies optional qualifier (“nickname”) to use as temporary label

Page 16: Moving unstructured FrameMaker content to structure

Conversion Table: Sample

Wrap this object    In this element    With this qualifier
P:Bullet            Item               Unordered
P:Numbered          Item               Ordered

- Column 1: specifies document object, child element, or sequence to wrap
- Column 2: names the element in which to wrap
- Column 3: specifies optional qualifier (“nickname”) to use as temporary label
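To picture what one body row does, here is a minimal Python sketch. It is not FrameMaker code: the `Rule` record and the `wrap_paragraphs` helper are hypothetical names made up for illustration, and the sketch only handles paragraph (P:) rules.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One body row of a conversion table (hypothetical model)."""
    wrap: str            # Column 1: object to wrap, e.g. "P:Bullet"
    element: str         # Column 2: element to wrap it in
    qualifier: str = ""  # Column 3: optional temporary label


# The two rows from the sample slide.
RULES = [
    Rule("P:Bullet", "Item", "Unordered"),
    Rule("P:Numbered", "Item", "Ordered"),
]


def wrap_paragraphs(
    paragraphs: List[Tuple[str, str]], rules: List[Rule]
) -> List[Tuple[Optional[str], str, str]]:
    """Map (format tag, text) pairs to (element, qualifier, text) triples.

    Paragraphs with no matching rule come back with element None.
    """
    by_object = {rule.wrap: rule for rule in rules}
    wrapped = []
    for format_tag, text in paragraphs:
        rule = by_object.get(f"P:{format_tag}")
        if rule is None:
            wrapped.append((None, "", text))
        else:
            wrapped.append((rule.element, rule.qualifier, text))
    return wrapped
```

Both rows produce the same element (Item); the qualifier is what keeps the two kinds of Item apart until a later rule wraps them.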

Page 17: Moving unstructured FrameMaker content to structure

Ways to create conversion tables

Manually or automatically

Page 18: Moving unstructured FrameMaker content to structure

Conversion Table Production: Manual

You have full control. No automatically inserted content. All the rules are specific to what you tell the system. However, you have to be explicit. (I am not a fan)

Wrap this object    In this element    With this qualifier
P:Head1             Head1
P:Head2             Head2
P:Body              Body
P:Code              Code
SV:Current Date     CurrentDate
C:Code              cCode
TC:                 CELL
TR:                 ROW

Page 19: Moving unstructured FrameMaker content to structure

Conversion Table Production: Automatic

Autogenerated content, then develop more rules or tweak as needed. Rules based on content used in source files. (I like this a lot more)

- Use if you already have an unstructured document
- Scans body page flows to ID every object that can be structured
- Lists object type and format tag (if any) used in document
- Maps object to element
  - Element tag named same as format tag
  - If object does not have format, element tag is a default name, for example: CELL or BODY
  - Removes parentheses and other characters to create valid element tag
  - Object type identifier in lowercase is prepended to duplicate tags
- Developer adds additional rules to:
  - Wrap elements in higher-level elements
  - Set attributes as elements are created
  - Wrap all elements in root element (by using root RE or by making elements wrap up properly)
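The naming behavior described above can be sketched in Python. This is a rough illustration, not FrameMaker's implementation: `element_tag` and `generate_rules` are hypothetical helpers, the exact set of characters FrameMaker strips is assumed, and the BODY default for TB: is a guess based on the "CELL or BODY" example.

```python
import re
from typing import List, Set, Tuple

# Default element tags for objects with no format tag. CELL and ROW appear in
# this deck's samples; BODY for TB: is an assumption.
DEFAULT_TAGS = {"TC": "CELL", "TR": "ROW", "TB": "BODY"}


def element_tag(code: str, format_tag: str, used: Set[str]) -> str:
    """Derive an element tag from an object code and its format tag."""
    if format_tag:
        # Assumed cleanup: keep letters, digits, '.', '_' and '-'; drop the rest
        # (spaces, parentheses, and so on).
        tag = re.sub(r"[^A-Za-z0-9._-]", "", format_tag)
    else:
        tag = DEFAULT_TAGS.get(code, code)
    if tag in used:
        # Duplicate tag: prepend the object type identifier in lowercase.
        tag = code.lower() + tag
    used.add(tag)
    return tag


def generate_rules(objects: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Build (Column 1, Column 2) pairs for (code, format tag) objects."""
    used: Set[str] = set()
    return [(f"{code}:{fmt}", element_tag(code, fmt, used)) for code, fmt in objects]
```

With P:Code and C:Code both in a document, the second one comes out as cCode, which matches the manual sample earlier in the deck.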

Page 20: Moving unstructured FrameMaker content to structure

Number of conversion tables you need

Based on types of high level elements and amount/quality of content

- If documents are clear and short, with a single highest level
  - Create unique conversion table for each document type and convert in bulk
  - For example, if your documents are already clearly defined as task, reference, or concept, you can apply one of three conversion tables to groups of files
- If documents are clear, but long and with multiple highest level
  - Create a single conversion table that covers as much as possible and then divide up content as required, or
  - Reorganize first, then you have clear, short files with one highest level
- If documents are scattered with content
  - Create a single conversion table that does initial work and then manually rework the structure as needed, or
  - Rewrite and reorganize first to have clear, short files with one highest level

Page 21: Moving unstructured FrameMaker content to structure

Generate conversion tables

Page 22: Moving unstructured FrameMaker content to structure

Your first conversion table

1. Open document with objects you want to structure
2. Structure Tools > Generate Conversion Table
3. From the Generate Conversion Table dialog box, select Generate New Conversion Table
4. Click Generate

Page 23: Moving unstructured FrameMaker content to structure

Expected results

Unnamed conversion table appears with rules based on objects in document and element tags based on format tags (tags used in the file, not all in catalog)

Wrap this object    In this element    With this qualifier
P:Title             Title
P:Body              Body
P:Heading1          Heading1
P:Heading2          Heading2
P:Heading3          Heading3
C:Emphasis          Emphasis
X:See Heading       SeeHeading
M:Index             Index
M:Cross-Ref         Cross-Ref

Page 24: Moving unstructured FrameMaker content to structure

Update a conversion table

Do so for a more complete list of objects (for example, after a chapter is parsed, a more complete one is found)

1. Open document with objects you want to structure
2. Structure Tools > Generate Conversion Table
3. From the Generate Conversion Table dialog box, select Update Conversion Table
4. From Update Conversion Table popup menu, choose a previously saved and open conversion table to update
5. Click Generate

Page 25: Moving unstructured FrameMaker content to structure

Rules to be aware of

Page 26: Moving unstructured FrameMaker content to structure

Rule Syntax: Character Restrictions

- Case-Sensitivity in Tags
  - Format and element tags must be specified as defined in catalogs
  - Qualifier tags are case-sensitive; two occurrences of one qualifier must match exactly
- Special characters in Tags include ( ) & | , * + ? % [ ] : \
  - In format tags and qualifier tags: allowed but must be preceded by backslash (\) in table
  - In element tags: not allowed
- A space character in tags does not need to be preceded with backslash (you can write tag Format A)
- Wildcard character (%) in Tags
  - Use % in a format or element tag to match zero, one, or more characters (similar to * in general rule)
  - For example, P:%Body matches paragraphs with format tag Body, FirstBody, or BulletBody
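A quick way to sanity check these character rules before typing them into a table is to model them in Python. This is only a sketch under the assumptions on this slide (backslash escapes a special character, % matches zero or more characters); `escape_tag`, `pattern_to_regex`, and `tag_matches` are hypothetical helper names, not FrameMaker functions.

```python
import re

# Special characters listed on the slide; each needs a backslash in
# format tags and qualifier tags.
SPECIAL = set("()&|,*+?%[]:\\")


def escape_tag(tag: str) -> str:
    """Return the tag as it would be typed in the table, with escapes."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in tag)


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a table tag pattern into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match it literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "%":
            # Wildcard: zero, one, or more characters.
            parts.append(".*")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def tag_matches(pattern: str, tag: str) -> bool:
    """True when the whole tag matches the pattern."""
    return pattern_to_regex(pattern).fullmatch(tag) is not None
```

For example, `tag_matches("%Body", "FirstBody")` is true while `tag_matches("%Body", "BodyText")` is not, because the wildcard only sits in front of Body.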

Page 27: Moving unstructured FrameMaker content to structure

Rule Syntax: Specifying What to Wrap

In Column 1 of the Conversion Table
- 1 or 2 letter code to ID item type
- Type format name to narrow definitions

Object            Code   Additional Info After Code                Example
Paragraph         P:     Paragraph format tag                      P:Body
Text range        C:     Character format tag                      C:Emphasis
Table             T:     Table format tag                          T:Format A
Table title       TT:    (none)                                    TT:
Table heading     TH:    (none)                                    TH:
Table body        TB:    (none)                                    TB:
Table row         TR:    (none)                                    TR:
Table cell        TC:    (none)                                    TC:
System variable   SV:    Variable format name                      SV:Current Date
User variable     UV:    Variable format name                      UV:CompanyName
Graphic           G:     (none)                                    G:
Footnote          F:     Location of footnote                      F:Flow
Marker            M:     Marker type                               M:Index
Cross-reference   X:     Cross-reference format                    X:Heading Only
Text inset        TI:    (none)                                    TI:
Equation          Q:     Equation size: Small, Medium, or Large    Q:Medium
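If you keep your rules in a spreadsheet or script before building the table, a small lookup like this can catch typos in Column 1. It is a sketch only: `OBJECT_CODES` simply restates the table on this slide, and `parse_object` is a hypothetical helper that does not handle E: sequences or backslash-escaped colons.

```python
from typing import Tuple

# Object codes as listed on this slide.
OBJECT_CODES = {
    "P": "Paragraph",
    "C": "Text range",
    "T": "Table",
    "TT": "Table title",
    "TH": "Table heading",
    "TB": "Table body",
    "TR": "Table row",
    "TC": "Table cell",
    "SV": "System variable",
    "UV": "User variable",
    "G": "Graphic",
    "F": "Footnote",
    "M": "Marker",
    "X": "Cross-reference",
    "TI": "Text inset",
    "Q": "Equation",
}


def parse_object(cell: str) -> Tuple[str, str]:
    """Split a Column 1 entry into (object type, info after the code)."""
    code, colon, detail = cell.partition(":")
    if not colon or code not in OBJECT_CODES:
        raise ValueError(f"Unknown object code in {cell!r}")
    return OBJECT_CODES[code], detail
```

Usage: `parse_object("SV:Current Date")` gives the object type plus the variable format name, and an unknown code such as `Z:Foo` raises a `ValueError`.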

Page 28: Moving unstructured FrameMaker content to structure

Rule Syntax: Specifying the Wrapper

In Column 2 of the conversion table
- Type object identifier E: (optional)
- Followed by element tag

Wrap this object             In this element    With this qualifier
P:Body                       Para
C:ReportName                 Report
T:Format Part                PartsTable
TT:                          TableTitle
TH:                          TableHeading
TB:                          TableBody
TR:                          PartsRow
TC:                          PartName
SV:Current Date \(Long\)     Date
UV:Customer                  Customer
G:                           Graphic
F:Flow                       Footnote
M:Index                      IndexEntry
X:ElemNumTextPage            XRef
TI:                          Para
Q:Large                      EQ

Page 29: Moving unstructured FrameMaker content to structure

Rule Syntax: Specifying a Qualifier

In Column 3, type qualifier (optional) for new element tag

Wrap this object    In this element    With this qualifier
P:Bullet            Item               Bullet
P:StepRestart       Item               Step1
P:Step              Item               Step

Page 30: Moving unstructured FrameMaker content to structure

Rule Syntax: Identifying Sequence to Wrap

In Column 1 of the conversion table
- Type E: for element, then the element tag
- Type qualifier (optional) in brackets
- Add more element tags with code identifiers and connectors (as in EDD)

Symbol               Meaning
Plus sign (+)        Item is required and can occur more than once
Question mark (?)    Item is optional and can occur once
Asterisk (*)         Item is optional and can occur more than once
Comma (,)            Items must occur in order given
Vertical bar (|)     Any one of items in sequence can occur
Parentheses          Beginning and end of sequence

Wrap this object                In this element    With this qualifier
P:Bullet                        Item               Bullet
P:StepRestart                   Item               Step1
P:Step                          Item               Step
E:Item[Bullet]+                 List
E:Item[Step1], E:Item[Step]+    List
E:Head, (Para | List)+          Section
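The connectors behave a lot like regular expression operators, so one way to test a sequence rule on paper is to translate it into a regex. The sketch below does that in Python. It is an analogy, not how FrameMaker works internally: `rule_to_regex` and `sequence_matches` are hypothetical helpers, escaped characters in qualifiers are not handled, and it assumes a tag written without a qualifier only matches elements that carry no qualifier.

```python
import re
from typing import List, Tuple

# An element reference: optional "E:", element tag, optional [qualifier].
TOKEN = re.compile(r"(?:E:)?([A-Za-z][\w.-]*)(?:\[([^\]]+)\])?")


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Column 1 sequence rule into a regular expression."""
    out = []
    pos = 0
    while pos < len(rule):
        ch = rule[pos]
        if ch in " ,":
            # Comma means "in the order given", which is plain concatenation.
            pos += 1
        elif ch in "()|+?*":
            # These connectors mean the same thing in a regex.
            out.append(ch)
            pos += 1
        else:
            match = TOKEN.match(rule, pos)
            if match is None:
                raise ValueError(f"Cannot parse rule at position {pos}: {rule!r}")
            name, qualifier = match.group(1), match.group(2) or ""
            out.append("(?:" + re.escape(f"{name}[{qualifier}];") + ")")
            pos = match.end()
    return re.compile("".join(out))


def sequence_matches(rule: str, children: List[Tuple[str, str]]) -> bool:
    """True when the (tag, qualifier) children satisfy the whole rule."""
    encoded = "".join(f"{name}[{qualifier}];" for name, qualifier in children)
    return rule_to_regex(rule).fullmatch(encoded) is not None
```

For instance, `sequence_matches("E:Item[Step1], E:Item[Step]+", [("Item", "Step1"), ("Item", "Step")])` is true, while a lone Step1 item fails because the plus sign requires at least one Step item after it.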

Page 31: Moving unstructured FrameMaker content to structure

Rule Syntax: Adding Attributes to Elements

Optional in Column 2 of the Conversion Table
- Type attribute name and value in brackets after element tag
- Separate name and value with equal sign, and enclose value in double quotation marks

Wrap this object                In this element              With this qualifier
P:Bullet                        Item                         Bull
P:StepRestart                   Item                         Step1
P:Step                          Item                         Step
E:Item[Bull]+                   List [Type = “Bulleted”]
E:Item[Step1], E:Item[Step]+    List [Type = “Numbered”]
E:Head, (Para | List)+          Section
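To see how a Column 2 entry breaks down into an element tag plus an attribute, here is a small Python sketch. `parse_wrapper` is a hypothetical helper, and it only covers the single bracketed name/value form shown on this slide; it accepts both straight and curly double quotes because slides and FrameMaker documents often use curly ones.

```python
import re
from typing import Dict, Tuple

# One bracketed attribute: [Name = "Value"], with straight or curly quotes.
ATTRIBUTE = re.compile(r'\[\s*([\w.-]+)\s*=\s*["“]([^"”]*)["”]\s*\]')


def parse_wrapper(cell: str) -> Tuple[str, Dict[str, str]]:
    """Split a Column 2 entry into (element tag, attributes)."""
    attributes = {m.group(1): m.group(2) for m in ATTRIBUTE.finditer(cell)}
    tag = ATTRIBUTE.sub("", cell).strip()
    if tag.startswith("E:"):
        # The E: identifier is optional in Column 2.
        tag = tag[2:]
    return tag, attributes
```

So `List [Type = "Numbered"]` comes back as the tag List with a Type attribute set to Numbered, and a plain `Para` comes back with no attributes.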

Page 32: Moving unstructured FrameMaker content to structure

Rule Syntax: Promoting Anchored Object

- When user adds structure to document, table or graphic becomes child of paragraph with anchor
- FrameMaker can break table or graphic out of its paragraph and promote element to be sibling of paragraphs
- In Column 2:
  - Type element tag for table or graphic
  - Add keyword “promote” in parentheses after element tag

Wrap this object    In this element             With this qualifier
T:Format A          ProcedureTable (promote)
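What "promote" does to the tree can be shown with Python's standard `xml.etree.ElementTree`. This is a sketch of the idea only: the `promote` function is hypothetical, the element names come from the sample row, and it assumes the promoted element lands right after the paragraph that held its anchor.

```python
import xml.etree.ElementTree as ET


def promote(parent: ET.Element, tag: str) -> None:
    """Move children named `tag` out of each paragraph under `parent`,
    making them siblings placed right after that paragraph."""
    for para in list(parent):
        anchored = [child for child in para if child.tag == tag]
        insert_at = list(parent).index(para) + 1
        for offset, child in enumerate(anchored):
            para.remove(child)
            parent.insert(insert_at + offset, child)


# Before: the table is a child of the paragraph that holds its anchor.
section = ET.fromstring(
    "<Section><Para>See the table.<ProcedureTable/></Para><Para>Next</Para></Section>"
)
promote(section, "ProcedureTable")
# After: the table sits between the two paragraphs as their sibling.
```

Without the promote keyword the table would stay nested inside the first Para, which many structures (DITA included) will flag as invalid.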

Page 33: Moving unstructured FrameMaker content to structure

Rule Syntax: Flagging Format Overrides

Identifies elements whose content was formatted with the Paragraph or Character Designer without saving the change to the catalog format. This adds an attribute called Override with value Yes.

In Column 1:
- Add rule “flag paragraph format overrides”
- Add rule “flag character format overrides”

Wrap this object                   In this element    With this qualifier
flag paragraph format overrides
flag character format overrides

Page 34: Moving unstructured FrameMaker content to structure

Rule Syntax: Wrapping Untagged Text

To wrap untagged formatted text:
- In Column 1, add rule “untagged character formatting”
- In Column 2, add element tag

Wrap this object                 In this element    With this qualifier
untagged character formatting    UntaggedText

Page 35: Moving unstructured FrameMaker content to structure

Converting files

Structuring a file (or set of files) with a conversion table

Page 36: Moving unstructured FrameMaker content to structure

Procedure: Structuring Current Unstructured Docs

1. Open conversion table and unstructured document
2. In unstructured doc, import element definitions from existing structured template or EDD
   - Makes elements available in Element Catalog
   - If you do not perform this step, next steps produce elements in Element Catalog defined by rules specified in conversion table
   - Can always import element definitions after generating structure
3. In unstructured file, StructureTools > Utilities > Structure Current Document
4. From Conversion Table Document popup menu, choose open conversion table file
5. Click Add Structure. A new document appears with content wrapped into elements as defined in rules of conversion table
6. Validate, correct errors, save file

Page 37: Moving unstructured FrameMaker content to structure

Procedure: Structuring Group of Unstructured Files

1. Place files to convert in separate directory
2. Open a conversion table file
3. StructureTools > Utilities > Structure Documents and the Structure Documents dialog box appears
4. From Conversion Table Document popup menu, choose the conversion table
5. Under Input Unstructured Files, set directory to structure
6. Optionally, if files have unique extension, in Suffix text box, type extension (otherwise, all files in directory will be structured)
7. Under Output Structured Files, set directory to write to

Page 38: Moving unstructured FrameMaker content to structure

Procedure: Structuring Group of Unstructured Files (continued)

8. Turn on Allow Existing Files to Be Overwritten
   - As documents are structured, resulting files might have same names as some existing files in directory specified for storing structured files
   - When on, overwrites older versions
   - When off, skips over files with existing matching filenames and presents log file
9. Click Add Structure
10. When the “Operation completed normally” alert appears, click OK to dismiss alert (structured files appear in output directory with filenames matching those in input directory)
11. Open each file and import element definitions from any existing structured template or EDD (makes elements in Element Catalog match those in structured template or EDD)
12. Validate, correct errors, save files
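Before running a batch, it can help to predict which files the overwrite setting will touch. The Python sketch below mimics the skip/overwrite and suffix behavior described in this procedure using only the file system. It does not call FrameMaker; `plan_batch` and its parameters are hypothetical names for illustration.

```python
from pathlib import Path
from typing import List, Tuple


def plan_batch(
    input_dir, output_dir, suffix: str = "", overwrite: bool = False
) -> Tuple[List[str], List[str]]:
    """Return (files that would be structured, files that would be skipped).

    suffix:    only consider files with this extension (e.g. ".fm");
               an empty suffix means every file in the input directory.
    overwrite: mirrors the Allow Existing Files to Be Overwritten setting.
    """
    to_structure, skipped = [], []
    for source in sorted(Path(input_dir).iterdir()):
        if not source.is_file():
            continue
        if suffix and source.suffix != suffix:
            continue
        target = Path(output_dir) / source.name
        if target.exists() and not overwrite:
            # Same filename already in the output directory: skip and report.
            skipped.append(source.name)
        else:
            to_structure.append(source.name)
    return to_structure, skipped
```

The skipped list plays the role of the log file: with the setting off, anything in it needs to be renamed, moved, or rerun with overwriting allowed.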

Page 39: Moving unstructured FrameMaker content to structure

Converting books

Structuring book documents with a conversion table

Page 40: Moving unstructured FrameMaker content to structure

Procedure: Structuring Unstructured Book

1. Open saved conversion table file
2. Open unstructured book
3. In unstructured book, import element definitions from any structured template or EDD
   - Makes elements available in Element Catalog
   - If you do not perform this step, next steps produce elements in Element Catalog defined by rules specified in conversion table
   - Can always import element definitions after generating structure
4. Select StructureTools > Utilities > Structure Current Book (the Structure Book dialog box appears)

Page 41: Moving unstructured FrameMaker content to structure

Procedure: Structuring Unstructured Book (continued)

5. From Conversion Table Document popup menu, choose saved conversion table file
6. In Output Directory text box, type directory for saving structured files or choose from Browse
7. Turn on Allow Existing Files to Be Overwritten
   - As you add structure to documents, resulting files might have same names as some existing files in specified directory for storing structured files
   - When on, overwrites older versions
   - When off, skips over files with existing matching filenames and presents log file
8. Click Add Structure (structured book and files appear in output directory with filenames matching those in input directory)
9. Validate, correct errors, save

Page 42: Moving unstructured FrameMaker content to structure

Conclusion and contact

Summing up the discussion, and options to continue it.

Page 43: Moving unstructured FrameMaker content to structure

About this session

- Convert content from unstructured to structured
- EDD, conversion table, and a structured template
- Using basic examples to get you started, this session will:
  - Convert files with content such as character tags and paragraph tags
  - Add support for images and tables
  - Demo converting unstructured to structured using conversion tables
- Samples are easy to recreate, but complex and powerful in functionality

Page 44: Moving unstructured FrameMaker content to structure

My request

- Please suggest this session to others
- If there are any problems with slides, please let me know
- Remember my disclaimer at the beginning
  - Not all slides are equal: Use some, discard others
  - In the interest of brevity I make some blanket statements
  - It’s not all 100% “the truth”, but I’ll stay close
  - Purists may complain
    - And they are wrong! (except when they are right)

Page 45: Moving unstructured FrameMaker content to structure

Solving business problems through communication

Page 46: Moving unstructured FrameMaker content to structure

Follow up contact information

905 833 8448 (Eastern Time)

[email protected]

www.linkedin.com/in/bernardaschwanden

@publishsmarter

www.publishingsmarter.com