Presented by Eliot Kimber at Documentation and Training East 2008, October 29-November 1, 2008 in Burlington, MA. XML applications for publishers have largely failed to realize the full potential inherent in the technology. While larger publishers could make the investment necessary to realize significant return on the use of XML technology, smaller enterprises simply could not, for a number of reasons, but fundamentally because the startup costs and ongoing costs of ownership were simply too high. The DITA standard fundamentally changes the equation, bringing several unique features that, together, serve to lower both the startup cost and ongoing costs, making the use of XML for publishers much more affordable than it has ever been before. At the same time, advances in supporting technologies important to Publishers, such as improved support for XML in Adobe Creative Suite and Microsoft Office, powerful new XML search and retrieval systems such as MarkLogic, and a new generation of lower-cost XML editors, all serve to make the use of XML for Publishing applications more attractive than it ever has been before.
Sustainable XML for Publishing Applications:
DITA Makes It Possible
Eliot Kimber, Really Strategies, Inc.
DocTrain East 2008
Preliminaries
DITA for Publishers DocTrain East 2008
Who Is This Talk For?
Publishers who want to implement XML-based solutions
Publishers who have XML-based solutions that need to be enhanced, refined, or upgraded
Creators of XML-aware tools applicable to publishing use cases
Service providers who support the development and use of XML-based solutions for Publishers:
  Integrators
  Data conversion houses
  Consultants
About Me
Senior Concept Prover at Really Strategies Inc.
20+ years experience with generalized markup (GML, SGML, XML, etc.)
Career focus on large-scale hyperdocument creation and management
Focus for last 8+ years on Publishing use cases around XML-based publishing workflows
Active member of the DITA Technical Committee
Founding member of the XML Working Group
Long-time member of the XSL-FO Working Group
Co-editor of the ISO/IEC HyTime standard
Audience Survey
Who is here?
  Publishing for profit?
  Publishing as a cost?
  Technical Documentors?
  Service providers?
  People just interested in DITA?
DITA knowledge:
  No idea what DITA is?
  Know about DITA a little?
  Familiar with DITA concepts and details?
  Using DITA now or implementing DITA-based solution?
Brief Overview of DITA
What Is DITA?
OASIS Open Standard: Darwin Information Typing Architecture
An XML architecture standard for representing human-consumed information
Some distinguishing aspects of DITA as an XML architecture:
  Formal mechanism for controlled definition of new vocabularies (“specialization”)
  Optimized for information modularity (“topics”, “maps”) and blind interchange
  Standardized document type implementation design patterns
  Growing off-the-shelf processing infrastructure
  Sophisticated hyperlinking features (“relationship tables”)
Currently at version 1.1, version 1.2 in final stages of review and approval
Key DITA Concepts Briefly Explained
Topics
  Topic content is paragraphs and stuff
  Standalone units of information
  Topics may directly contain other topics
Maps
  Hierarchical sets of links to topics
  Establish organizational hierarchies for sets of topics
  May have many maps over the same topics
  Can impose metadata onto topics
  Can impose topic-to-topic hyperlinks (relationship tables)
Specialization
  New element types are “subclasses” of existing types
Oooh, A Picture
[Figure: two maps, Map One and Map Two, shown alongside a set of topics, Topic A through Topic F]
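Output Results
[Figure: Map One and Map Two each go through a "Map to PDF" step, giving two different outlines]
First outline:
  I. Topic C
    1.1 Topic B
      1.1.1 Topic A
      1.1.2 Topic D
    1.2 Topic F
Second outline:
  I. Topic F
    1.1 Topic B
    1.2 Heading
      1.1.1 Topic A
      1.1.2 Topic E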
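The idea that several maps can impose different hierarchies on the same topics can be sketched in a few lines of Python. This is only an illustration, not how any DITA processor is implemented: the file names, the TOPICS table, and the outline() helper are all assumptions made up for the example. Only the <map> and <topicref href="..."> markup comes from DITA itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical pool of topics, keyed by file name (illustrative names only).
TOPICS = {
    "a.dita": "Topic A",
    "b.dita": "Topic B",
    "c.dita": "Topic C",
    "d.dita": "Topic D",
    "f.dita": "Topic F",
}

# Two minimal maps over the same topics. Nested <topicref> elements
# impose the hierarchy; the topics themselves are not changed.
MAP_ONE = """
<map>
  <topicref href="c.dita">
    <topicref href="b.dita">
      <topicref href="a.dita"/>
      <topicref href="d.dita"/>
    </topicref>
    <topicref href="f.dita"/>
  </topicref>
</map>
"""

MAP_TWO = """
<map>
  <topicref href="f.dita">
    <topicref href="b.dita"/>
    <topicref href="a.dita"/>
  </topicref>
</map>
"""


def outline(map_xml, topics):
    """Return (depth, title) pairs in document order for one map."""
    result = []

    def walk(element, depth):
        for ref in element.findall("topicref"):
            result.append((depth, topics[ref.get("href")]))
            walk(ref, depth + 1)

    walk(ET.fromstring(map_xml), 0)
    return result


if __name__ == "__main__":
    for name, map_xml in (("Map One", MAP_ONE), ("Map Two", MAP_TWO)):
        print(name)
        for depth, title in outline(map_xml, TOPICS):
            print("  " * (depth + 1) + title)
```

Each map yields its own outline, while Topic A, Topic B, and Topic F are reused unchanged in both.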
Specialization
DITA standard defines a set of base element types:
  Topic, map, section, paragraph, figure, table, phrase, data
All other elements based on these base types
Establishes a formal class hierarchy for all element types in any DITA document
Every element type maps back to some standard-defined base type
Declaration mechanism is formal and simple:
  Uses element attributes (“class=”)
  Can be processed by almost any XML tool, including CSS selectors
  Even works for DTD-less documents
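A minimal sketch of class-based processing, assuming a DTD-less document in which the class attributes are written out explicitly. The "recipe" and "ingredient" element types are hypothetical specializations invented for this example, and is_a() is an illustrative helper, not part of any DITA tool.

```python
import xml.etree.ElementTree as ET

# DTD-less document: every element carries its class attribute explicitly.
# "recipe" and "ingredient" are hypothetical specialized element types.
DOC = """
<recipe class="- topic/topic recipe/recipe ">
  <title class="- topic/title ">Pancakes</title>
  <body class="- topic/body ">
    <p class="- topic/p ">Mix and fry.</p>
    <ingredient class="- topic/p recipe/ingredient ">Flour</ingredient>
  </body>
</recipe>
"""


def is_a(element, base_type):
    """True if the element's class attribute lists base_type, e.g. 'topic/p'."""
    return base_type in element.get("class", "").split()


root = ET.fromstring(DOC)

# Generic processing: anything that is a kind of topic/p is handled as a
# paragraph, whatever its tag name happens to be.
paragraph_like = [e.tag for e in root.iter() if is_a(e, "topic/p")]

if __name__ == "__main__":
    print(paragraph_like)
```

A tool that only knows the base type topic/p can still process the specialized ingredient element. In CSS, an attribute selector such as *[class~="topic/p"] would make the same kind of match.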
DITA Compared to Other XML Options
Compare DITA With…
DocBook
  Book-focused (not inherently modular)
    Can use XInclude to manage information in modular fashion
    No facility comparable to DITA maps
  Mature standard
  Very large tag set reflecting union of wide set of requirements
  No formal vocabulary extension mechanism
  Blind interchange not really possible
  Deep off-the-shelf infrastructure
Compare DITA With…
NLM
  Optimized for journals not books
  No formal vocabulary extension mechanism
  Little off-the-shelf infrastructure
Compare DITA With…
PRISM/PAM
  Essentially XHTML with sophisticated metadata
  Optimized for serialization, not authoring and archiving
  Little off-the-shelf processing infrastructure
Compare DITA With…
Custom XML application
  Expensive to develop and maintain
  Can be optimized for local requirements
  Processing infrastructure must be built from scratch
    Content management
    Authoring tool configuration and customization
    Publishing pipelines
    Interchange transforms
  No blind interchange possible
About That…XML Application Development Costs
Information requirements analysis is always required
Using a standard XML application still requires that you determine how to apply it to your requirements
All useful standard XML applications…
  …Provide more stuff than you need
  …Fail to provide some things specific to your requirements
Amount of analysis required reflects your business problem, not standard chosen
Thus: cost of analysis is essentially invariant regardless of implementation choice
Main variable is cost of system implementation:
  Implementation of XML document types (DTDs)
  Implementation of management and processing
XML System Cost Analysis
Three distinct cost domains:
  Initial system development
  Cost of use (training, skills required, cost of tools)
  Maintenance and refinement over long time scale
Ideal implementation base minimizes all costs:
  Low cost of acquisition and implementation
  Low cost of use, skills and knowledge common in user population, tools are appropriately priced
  Low cost of refinement, extension, interchange, management
Cost evaluated in terms of value:
  Short term: ability to meet immediate requirements with lowest initial cost
  Long term: ability to support new requirements with lowest cost of maintenance and extension
DITA Largely Meets the Ideal
Lowest possible cost of initial solution development
  Implementing custom doctypes very low cost
  Many off-the-shelf tools “just work” with little or no customization or configuration
  Large and growing body of use-case-specific DITA modules
Large and growing body of DITA knowledge
  Standard is well written
  Many service providers with solid DITA knowledge
  Growing body of published DITA how-to information
Controlled extension (“specialization”) means:
  Knowledge about one DITA application transfers to all other DITA applications
  Extensibility and interchange are optimized
  Implementations can optimize their own modularity and flexibility
My Assertion: DITA Is Almost Always Best Fit
DITA can be easily and practically applied to almost all documentation use cases (not just tech docs)
DITA’s unique features minimize initial cost of ownership and implementation
DITA’s unique features optimize interchange of content
DITA’s unique features maximize flexibility and stability of supporting tools
Therefore:
  DITA provides maximum value compared with other alternatives
  Main cost is acceptance of a few constraints that enable DITA’s value
DITA Myths Busted
DITA Myth One: DITA Is Only For Tech Docs
DITA is a layered, flexible standard
Originally driven by technical documentation requirements…
…but, core features are completely generic
DITA has been used for:
  Government reports
  Financial standards
  Test preparation books
  Travel guides
No inherent restrictions on the kind of publications DITA will work well for
Forest for the Trees: It’s Still Just XML
DITA has lots of cool features, some quite sophisticated
This sophistication can be scary
But…
…It’s still just XML
You don’t have to use any particular feature of DITA
Users don’t necessarily need to know it’s DITA
If it being DITA-based doesn’t help at the moment, don’t talk about it
To the non-DITA-aware it looks like any other custom XML application
DITA Myth Two: DITA Requires Topic-Based Writing
DITA standard is optimized for modularity
But it does not require that content be stored or written as modules
  Use of DITA maps is entirely optional
  Topics can physically contain other topics
  An entire book could be marked up as a single XML document consisting of one root topic and many child topics
  Such a topic would be indistinguishable from any other similar XML document (e.g., an NLM article, a DocBook document)
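A minimal sketch of the single-document case, assuming a short book held in one root topic with nested child topics and no map at all. The ids, titles, and the titles() helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# One XML document: a root topic that physically contains its chapters.
# All ids and titles here are invented for the example.
BOOK = """
<topic id="book">
  <title>A Short Novel</title>
  <topic id="ch1">
    <title>Chapter One</title>
    <body><p>It was a dark and stormy night.</p></body>
    <topic id="ch1-scene1">
      <title>A Scene</title>
      <body><p>The rain fell in torrents.</p></body>
    </topic>
  </topic>
  <topic id="ch2">
    <title>Chapter Two</title>
    <body><p>Morning came.</p></body>
  </topic>
</topic>
"""


def titles(topic, depth=0):
    """Yield (depth, title) for a topic and all topics nested inside it."""
    yield depth, topic.findtext("title")
    for child in topic.findall("topic"):
        yield from titles(child, depth + 1)


toc = list(titles(ET.fromstring(BOOK)))

if __name__ == "__main__":
    for depth, title in toc:
        print("  " * depth + title)
```

Read this way, the document is handled like any other hierarchical XML document. Nothing in the processing depends on maps or on the content having been written as separate modules.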
DITA Myth Three: DITA Is Hard
DITA has lots of features, some quite sophisticated
Making full use of all these features requires understanding those features, of course
But at its simplest, DITA is just like any other XML document type for publications: sections, paragraphs, lists, figures, tables, and inlines.
Thus, a DITA application need only be as sophisticated as you need it to be to satisfy your specific requirements
Complexity and “difficulty” of DITA is concentrated in the data processing requirements, not in authoring
Ability to easily define custom vocabularies means you can optimize markup names and structures to reflect local culture and practice
In Short: Why DITA? Why Not DITA?
DITA can be applied where any other applicable XML standard can be applied
  At lower absolute cost
  With greater flexibility
  With greater potential value
Cost of using DITA at worst no greater than using XML generally
So why not use it?
Yeah, But…
I’ve said a lot of stuff
What do you need from me to be convinced?
DITA As Applied to Publishing
Publishing-Specific Challenges
Existing vendor solutions and community knowledge focused on tech doc requirements
  Many vendors don’t really get DITA
  Many tools don’t yet fully support specialization
  Many older tools limited by architecture and implementation choices made years ago
  Many service providers still building understanding of DITA
Publishing requirement for high-quality composition always a challenge for any XML-based solution
Publishers have different business drivers from tech doc
Note to Vendors and Service Providers
Potential market for DITA as a publishing solution orders of magnitude larger than potential market for DITA as a tech doc solution
Many more publishers and units of publication than tech doc producers
Tech doc is a cost center
Publishing is a profit center
In many ways, the value of DITA to publishing is more compelling than it is for tech doc
Just saying…
Publishing: Open Toolkit Alone Won't Cut It
Pages are and will always be important
Need a path from DITA XML to publishing tools
  InDesign
  Quark
  Etc.
No technical barrier to a generic DITA-to-InDesign process
Products like Typefi could add significant value
Several advantages for Publishers:
  Uses existing layout design skills, tools, and methods
  Can be 100% automatic or include human tweaking
  Can leverage Toolkit for preprocessing
Where Could Publishers Go From Here?
DITA as specialized in the DITA spec not always appropriate for Publishers
  Too constrained in some areas
  Needs more ways to capture format intent
  May not match existing publishing practice or conventions well
Might be useful to define a separate publishing-specific specialization family rooted at DITA topic rather than at concept/task/reference
Can a novel be single topic or small set of topics? [Yes]
  Does that help? Does it hurt?
Business Process Improvement Implications
Base cost of using XML is essentially unchanged
  Same challenges for legacy conversion
  Applying XML at end of process
  Using XML as input to revision process
Cost of developing initial markup design can be significantly lower
Can have more generic, reusable processing components
DITA encourages and enables small modules
  Makes recombination at small granularity possible and manageable
  Adapts well to delivery to portable devices
Business Improvement Implications (Cont.)
Incremental cost of DITA-based systems should go down over time as infrastructure accretes
Enables local optimization of markup without impeding interchange within an enterprise or across enterprises
Provides controlled, formal framework for defining common components used across parts of enterprise or communities of interest
Enables use of more sophisticated features as needed
Potential Information Economy Improvements
Reduce tight coupling between suppliers and consumers (aggregators, republishers, etc.)
No need to agree on rigid, overly-general standards to enable interchange
Supplier need not have full publishing infrastructure in order to supply high-quality content
Information consumers can apply a generic DITA processing infrastructure to content from many suppliers
Increases value of information in DITA form
Reduces impedance of interchange
Wrap Up
In Conclusion
DITA has lots of unique goodness of direct and compelling value to publishers
DITA can be used in simple ways to good effect with low cost of entry
DITA’s low cost and strong features represent a compelling value for almost all XML-based publishing use cases
Full DITA infrastructure still being developed
  Off-the-shelf DITA-to-InDesign processes
  Standard publishing-specific DITA modules
  Community knowledge of how best to apply DITA to publishing use cases
Questions?