Slide 1 Copyright 2013 Inera Incorporated. All Rights Reserved
XML Publication Workflows for Standards
Presented by
Bruce D. Rosenblum
CEO
Inera Incorporated
SES, 17 April 2013
Slide 2
The Publisher’s Conundrum
Gutenberg: (Surveying the Frankfurt Book Fair)
“This commodity must be as precious as gold!”
Gates: “Cheap as dirt, actually. And on its way out.
It’s called print. You invented it, or so history claims”
John Updike
Print: A Dialog (1995)
Slide 3
Remember When…
Slide 4
This Is Not Your Father’s Book
Slide 5
Transformative Technologies…
Slide 6
…And Expanding Device Array…
Slide 7
…Demand New Product Features…
Automatically reflowable text
Richly hyperlinked
Dynamically updated
Accessible for visually impaired
Reading a standard PDF on a small touch-screen device is not good enough!
Slide 8
… And A New Publication Foundation
XML:
Foundation for ePub, HTML5, Kindle
Automatic metadata sharing
PDF generation
Yuck!
(We know you’ve been trying to avoid it for years)
(Sorry… you can’t hide any longer)
Slide 9
XML Facilitates Multi-Platform Publishing
Slide 10
XML Is Not Easy
XML does not "happen"
XML requires
New publication workflow
New software tools
New production training
XML requires deliberate and thoughtful choices
Slide 11
Agenda
XML overview
ISO case study
XML workflow choices
XML quality
XML standard DTDs
Where to go next
Slide 12
What Is XML?
http://www.w3.org/TR/REC-xml/
Subset of ISO 8879 (SGML)
Describes a method to mark up document structure
Independent of document format
Important companion standards
Unicode: http://www.unicode.org/
Standard for representing text characters in all languages
MathML: http://www.w3.org/Math/
Standard for markup of mathematics
Slide 13
What Does XML Look Like?
It looks like HTML
Because XHTML, the strict form of HTML, is one "flavor" of XML
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph includes a Greek beta: β.</p>
</body>
</html>
Slide 14
HTML Code and View
Slide 15
XML Versus HTML
HTML is mostly about format
What does the web page "look like"
Few semantic tags, and none for publishing concepts such as terms and definitions
XML: "eXtensible"
Define your own tags (Document Type Definition)
Semantic markup (e.g. terms and definitions)
Define your own business rules
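As a rough illustration of what "define your own tags" buys you, the sketch below parses a small document marked up with semantic elements and pulls out its terms and definitions. The tag names (`standard`, `term-entry`, `term`, `definition`) and the sample content are invented for this example; they are not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup: tag names and content are made up for illustration.
SAMPLE = """<standard>
  <term-entry id="t1">
    <term>widget</term>
    <definition>first sample definition</definition>
  </term-entry>
  <term-entry id="t2">
    <term>gadget</term>
    <definition>second sample definition</definition>
  </term-entry>
</standard>"""


def extract_glossary(xml_text):
    """Return {term: definition} for every <term-entry> in the document."""
    root = ET.fromstring(xml_text)
    glossary = {}
    for entry in root.iter("term-entry"):
        term = (entry.findtext("term") or "").strip()
        definition = (entry.findtext("definition") or "").strip()
        glossary[term] = definition
    return glossary


glossary = extract_glossary(SAMPLE)
for term, definition in glossary.items():
    print(f"{term}: {definition}")
```

Because the markup says what each piece of text is, rather than how it looks, the same lookup would be much harder against purely presentational HTML.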
Slide 16
Who Uses XML?
Journal publishers
Widely used (> 80% of journals)
Huge growth from 1995 to 2005
Book publishers
More recent adoption
Key drivers: ePub, production efficiency
Standards publishers
Not yet widely adopted for standards publication
Slide 17
Case Study: ISO
ISO before 2012
ISO STD template for authoring
In-house "clean-up" of STD template use
Word-to-PDF conversion
Slide 18
ISO PDF Workflow Issues
ISO STD template
Varying quality of use: 0 to 100% application
Increasing Word security issues
Editorial and production
Extensive time to format for PDF conversion
Word is not a page-layout application
Slow publication times
Published product
Not suitable for eReaders
No rich hyperlinks
Slide 19
ISO XML Initiatives: 2012
XML-based production workflow
XML technology in-house
XML back-catalog conversion
Outsourced
XML-driven Online Browsing Platform (OBP)
https://www.iso.org/obp/ui/
Slide 20
ISO OBP: Search
Slide 21
ISO OBP: Results
Slide 22
ISO OBP: Look Inside
Slide 23
ISO OBP: Limited Content Access
Slide 24
ISO OBP: Hyperlinked Cites
Slide 25
ISO OBP: Link Out
Slide 26
ISO OBP: Search Results
Slide 27
ISO OBP: View Snippets
Slide 28
ISO OBP Value
Better user experience
Faceted search
More information
Richly hyperlinked
Broader use of standards through better access
Slide 29
Value of XML to ISO
Catalyst for positive change
Business rules codified
Refocuses team on high-value content editing
Permits some production outsourcing
Faster publication times
Significant cost savings
Slide 30
But How Do I Get XML?
OBP is cool
But how do I get there?
…rewind
Re-engineer publication workflow
Slide 31
Where In The Workflow?
You can introduce XML at:
Authoring
Before editing
Before composition
Post-publication
Each point has pros and cons
Slide 32
The Original XML Dream
Authors create XML manuscripts
Editors edit XML manuscripts
XML single-source publication
eBooks
Derivative products
Slide 33
The Author Reality
Authors use
Microsoft Word
WordPerfect
LaTeX
Slide 34
The Author Reality
Most Authors
Do not think structure
Do not like production tasks
Outside Authors
Brilliant subject matter experts
Hard to train and support
Even harder to control
Can’t get IT to install XML editing tools
Slide 35
The Word Macro Problem
Word macros are cool, but…
Hard to write
Smart authors out-smart idiot-proof macros
Macros grow ever-larger to plug holes
Larger macros require more training
Hard to support
Multiple versions of Word
Hard to install
Ever-greater IT security requirements
Slide 36
Post-Publication XML
Author submits Word manuscript
Edited in Word
Typeset (InDesign/Quark/FrameMaker)
Proof and typeset corrections
Publish print and PDF
Create XML/ePub from PDF
Slide 37
Post-Publication XML Issues
Advantages
No workflow changes
Disadvantages
Quality of XML unchecked
Extra production time and cost
Errors discovered in XML creation
It’s not an integrated workflow
Almost essential to outsource
Slide 38
Managing Outsource XML Vendors
Develop XML markup standards
Test several vendors
Compare the vendors' results from tagging the same standard
Select on quality, not cost
Provide vendors with QA tools
Recheck ongoing vendor work
Slide 39
Post-Publication ePub Issues
ePub created from PDF often lacks
Rich metadata of XML file
Internal hyperlinks to footnotes, references, etc.
Section 508 accessibility compliance (float positions, table scope attributes, etc.)
And often contains broken hyperlinks
Especially when text is extracted from multi-column layouts
Slide 40
XML First Workflow
Accept Word manuscript from author
Convert manuscript to XML
Edit XML manuscript
Typeset XML
Proof and typeset corrections in XML
Create final PDF, ePub, etc.
Slide 41
Advantages and Disadvantages
Advantages
Only one file conversion
File is continually validated to DTD
Disadvantages
Requires XML editing software for all editors
Training is expensive
Freelance editors not practical
Editors work amid raw XML tags, or else the XML editor needs expensive customization
Slide 42
XML “Middle” Workflow
Accept Word manuscript from author
Clean up manuscript and style paragraphs
Edit in Microsoft Word
Convert Word to XML
Typeset from XML
Proof and typeset corrections
Create final PDF, ePub, etc.
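The "Convert Word to XML" step depends on the paragraph styles applied during clean-up. The following is a minimal sketch of that idea, assuming the manuscript has already been reduced to (style name, text) pairs; the style names, the element names, and the `paragraphs_to_xml` helper are all hypothetical and stand in for whatever a real conversion tool does.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from Word paragraph style names to XML element names.
# Both sides of the mapping are invented for this example.
STYLE_TO_TAG = {"Heading 1": "title", "Body Text": "p", "Note": "note"}


def paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs into a flat <sec> element.

    Paragraphs with an unmapped style are rejected, which is how a styled
    manuscript lets structure be enforced before final pages are made.
    """
    sec = ET.Element("sec")
    for style, text in paragraphs:
        if style not in STYLE_TO_TAG:
            raise ValueError(f"unknown paragraph style: {style!r}")
        ET.SubElement(sec, STYLE_TO_TAG[style]).text = text
    return ET.tostring(sec, encoding="unicode")


manuscript = [("Heading 1", "Scope"), ("Body Text", "Sample paragraph.")]
print(paragraphs_to_xml(manuscript))
```

A paragraph left in an unmapped style (for example Word's default "Normal") raises an error here instead of silently producing untagged text.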
Slide 43
Advantages and Disadvantages
Advantages
Editors work in Microsoft Word
Lower training costs
Freelance editors are practical
Structure enforced prior to final pages
Disadvantages
Requires running a conversion application in-house to create the XML
Slide 44
XML Quality
XML is not free
Neither is XML quality
PDF: Create it, proof it, publish it
XML: Create it, proof it, publish from it
XML-first and XML-middle facilitate XML quality
PDF created from XML
Slide 45
XML Quality +
What's between the tags is important
What's not visible (metadata) is more important
Create proofing tools
False color proofing
Schematron scripts
Check valid year
Check valid designation
Check copyright element not empty
Check anything that can be checked
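The checks listed above are normally written as Schematron rules. As a stand-in, the sketch below expresses the same three checks (valid year, valid designation, non-empty copyright) in plain Python. The element names (`meta`, `year`, `designation`, `copyright`), the year bounds, and the designation pattern are assumptions made for this example, not rules from any real schema.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern for designations such as "ISO 12083:1994" or "ISO/IEC 27001:2013".
DESIGNATION = re.compile(r"^[A-Z]+(?:/[A-Z]+)* [0-9]+(?:-[0-9]+)*:[0-9]{4}$")
MIN_YEAR, MAX_YEAR = 1900, 2100  # assumed plausible publication years


def check_metadata(xml_text):
    """Return a list of rule violations; an empty list means every check passed."""
    meta = ET.fromstring(xml_text).find("meta")
    if meta is None:
        return ["meta element is missing"]
    errors = []

    # Check valid year: exactly four digits and inside the assumed range.
    year = (meta.findtext("year") or "").strip()
    if not (re.fullmatch(r"[0-9]{4}", year) and MIN_YEAR <= int(year) <= MAX_YEAR):
        errors.append(f"invalid year: {year!r}")

    # Check valid designation against the assumed pattern.
    designation = (meta.findtext("designation") or "").strip()
    if not DESIGNATION.match(designation):
        errors.append(f"invalid designation: {designation!r}")

    # Check copyright element not empty.
    if not (meta.findtext("copyright") or "").strip():
        errors.append("copyright element is empty or missing")

    return errors


GOOD = ("<standard><meta><year>1994</year>"
        "<designation>ISO 12083:1994</designation>"
        "<copyright>Copyright ISO 1994</copyright></meta></standard>")
BAD = ("<standard><meta><year>19994</year>"
       "<designation>ISO12083</designation>"
       "<copyright/></meta></standard>")

print(check_metadata(GOOD))
print(check_metadata(BAD))
```

Running a script like this on every file, including files returned by vendors, is one way to catch metadata errors that are invisible on the page.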
Slide 46
XML Quality Procedures
XML requires constant quality checks
Develop tools
Run tools on every XML file
Provide tools to third party vendors
"Trust but verify"
Slide 47
A Quick Word on DTDs
There is no standard DTD for standards publication
Four main choices for "text" documents
TEI
DITA
DocBook
JATS and BITS (aka NLM DTD)
How do I choose?
Slide 48
DTD Selection Criteria
Your content
Content structures
Metadata
Sections, paragraphs, annexes, notes, bibliography
Lists, tables, figures, equations, footnotes
Special: Normative references, terms and definitions
Current vs. historical content
Your XML use-cases
Tools you may want to use
Slide 49
TEI DTD
Origins: Academic community (Brown University)
Widely used in humanities
Great for historical materials
E.g. preserving line break/pagination information
Poetry
Least-known by suppliers
Weakest commercial tool support
Slide 50
DITA DTD
Origins: IBM
Widely used in corporations
DITA is designed to write discrete, typed topics for reuse in multiple publications
Great for proposals, internal documentation
Somewhat known by suppliers
Excellent commercial tool support
OASIS standard
Slide 51
DocBook DTD
Origins: Technical publication (O’Reilly)
Great for technical and trade books
Very good commercial tool support
FrameMaker, ArborText
Well-known by suppliers
OASIS standard
Slide 52
JATS and BITS
Origins: Scholarly journal archiving & publication
Widely used by journal publishers
Great for scholarly journals and books
Well-known by suppliers
Very good commercial tool support
NISO standard
Slide 53
DTD Commonalities
Any of these DTDs works well for simple standards
All of these DTDs are designed for customization
Any of these DTDs will need customization for use by standards publishers
Slide 54
ISO DTD
Reviewed TEI, DocBook, JATS, DITA
Adopted JATS and modified
Metadata for standards
TBX namespace for terms and definitions
Links to standards
http://www.iso.org/schema/isosts/
Slide 55
A Few Other Important Points
Figures
Math
Inter-standard links
Slide 56
Figures in an XML Workflow
Graphics are separate from the XML
Image files referenced from XML
Figure standardization
Format (type, quality, size)
Automated proofing tools
Names (e.g. ISO-12083-1994-F1)
Keep figures outside of Word
Quality problems with extracted figures
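Figure naming is one of the easier things to proof automatically. The sketch below scans an XML file for image references and reports any whose file name does not follow a convention modeled on the `ISO-12083-1994-F1` example. The `graphic` element, its `href` attribute, and the exact pattern are assumptions for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Assumed convention modeled on "ISO-12083-1994-F1":
# organization, number, optional part, year, then F plus the figure number.
FIGURE_NAME = re.compile(r"^[A-Z]+-\d+(?:-\d+)?-\d{4}-F\d+$")


def misnamed_figures(xml_text):
    """Return the href of every <graphic> whose file name breaks the convention."""
    root = ET.fromstring(xml_text)
    bad = []
    for graphic in root.iter("graphic"):
        href = graphic.get("href", "")
        stem = href.rsplit(".", 1)[0]  # drop the file extension, if any
        if not FIGURE_NAME.match(stem):
            bad.append(href)
    return bad


SAMPLE = ('<doc>'
          '<fig><graphic href="ISO-12083-1994-F1.png"/></fig>'
          '<fig><graphic href="figure2.png"/></fig>'
          '</doc>')

print(misnamed_figures(SAMPLE))
```

The same loop could be extended to check that each referenced file exists and meets format and size rules.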
Slide 57
Math in an XML Workflow
Author with MathType
Word 2007+ "Equation Builder"
Convert to MathType
Gets most consistent XML results
Include MathML 2 in XML
And also math images with XML reference
Allows easy math rendering on any platform
Slide 58
Inter-Publisher Links
Slide 59
Creating Links
Bi-lateral agreements
Imagine creating an agreement with every SDO or publisher your standards might cite
Technical implementation: n unique APIs
This is a lot of work
But wait… there’s a better solution
Slide 60
Digital Object Identifier
ISO 26324
Persistent discoverable authoritative links
Example for a journal
Thompson, M., et al.: The International Harmonized Protocol for the Proficiency Testing of Analytical Chemistry Laboratories (IUPAC Technical Report); Pure Appl. Chem., Vol. 78, No. 1, pp. 145-196, 2006
doi:10.1351/pac200678010145
Resolved at http://dx.doi.org/10.1351/pac200678010145
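Turning a DOI into a link is a matter of appending it to the resolver address, which is why one resolver can replace n publisher-specific APIs. A minimal sketch follows; the `doi_to_url` helper is hypothetical, and the default resolver is the one shown on this slide.

```python
from urllib.parse import quote


def doi_to_url(doi, resolver="http://dx.doi.org/"):
    """Build a resolver URL from a DOI, accepting an optional 'doi:' prefix."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the "10." directory indicator and has a prefix/suffix split.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return resolver + quote(doi, safe="/")


print(doi_to_url("doi:10.1351/pac200678010145"))
```

For the example citation this produces the same address given on the slide, so the link can be generated from the reference metadata alone.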
Slide 61
Resolved DOI
Slide 62
DOI Advantages
Supported objects
Journals
Books
Conferences
Standards
Reports
Data
Single source for links
Publisher controlled
Guides reader to authoritative version
More information at CrossRef: www.crossref.org
Slide 63
Where To Go Next…
Evaluate and set business goals
Business requirements drive technical decisions
Learn more about XML
Mulberry Technologies (http://www.mulberrytech.com/)
Talk to XML-savvy publishing organizations
Hire an XML expert or consultant
Don't reinvent the wheel
Slide 64
XML Project Startup
Develop and document XML markup standards
Based on business goals
Select an XML workflow
Based on business goals
Build XML QA tools
Based on business rules
Start a pilot project
Re-evaluate results and fine-tune
Start XML workflow
Slide 65
XML For Standards Publishers
Brings new production efficiencies
Brings new products to customers
Brings new business opportunities
XML solutions are improving daily
Now is the time for XML
Slide 66
Questions?
Bruce Rosenblum
Inera Incorporated
+1 (617) 932-1932
www.inera.com