Journals and Magazines and Books, Oh My! A Look at ACS' Use of NLM Tagsets

American Chemical Society

Journals and Magazines and Books, Oh My!

A Look at ACS' Use of NLM Tagsets

Dan O'Brien, ACS Publications

Presented at JATS-Con, 1-Nov-2010

What We'll Cover

•Intro

– ACS, Products, Processes

– Framework & terminology for discussing customizations

•ACS Pubs' Use of NLM Tagsets

– Overall Approach

– Journals

– Books

– Magazine

• Successes & Lessons Learned

Character Introductions

• ACS & ACS Pubs

• Journals

• Books

• Magazine

• Processes

• Terminology

Introductions: ACS

•Professional membership organization

– Chartered by U.S. Congress in 1876

– Non-profit

– Over 160,000 members

•ACS Publications Division ("ACS Pubs")

– Journals

– Magazine

– Books

– On a quest

Introductions: ACS Journals

•40 peer-reviewed titles

•300,000 annual published pages

•~50% volume published weekly

•Among highest ISI impact factors

•"King" of publishing forest

Introductions: Books

•Symposium Series

•Around 30 titles published annually

•Around 25 chapters per book

•Hard covers, rigid content format

Introductions: C&EN Magazine

•Chemical & Engineering News

•Weekly Print & Web issues

•Daily Online News

•"BusinessWeek" for chemists

•Flexible format, loose content definitions

•More than meets the eye

Introductions, cont.

•Pressure for product innovation: Wicked Witch of the West

•NLM Tagsets have the answers: Wizard of Oz

Introductions: Processes

•Journals & Books:

– Standard scholarly publishing model

– XML-first article/chapter based production

• Automated Pre-Editing (Inera AutoRedact)

• Technical Editing

• Automated Post-Editing & Validations

– Article ASAP publication (Journals)

– Issue/Book publication (Journals & Books)

•Magazine:

– Staff writers vs. authors

– Feature articles, Thematic issues

– Story Online News? Issue?

– Edit-to-Fit

Introductions: Journal Process

Introductions: Books Process

Introductions: Magazine Process

Terminology

•Tag – a bit of XML markup: an element, attribute, etc.

•Tag Definition – the coding (in DTD or XSD syntax) that declares the tag name and what it's allowed to do.

•Module – a way of logically organizing tag definitions, allowing reuse for multiple schemas.

•Tagset – a collection of related tag definitions forming a complete vocabulary, usually stored within a set of interrelated modules

•Schema – an application of a tagset to form a specific content model
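As a rough illustration of how these five terms nest, the sketch below models them as plain Python data classes. The class names, field names, module names, and content-model strings are all illustrative assumptions; none of them come from the NLM or ACS tagsets.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class TagDefinition:
    """Declares a tag name and what it is allowed to contain."""
    name: str
    content_model: str  # DTD-style string, illustrative only


@dataclass
class Module:
    """Logical grouping of tag definitions, reusable by several schemas."""
    name: str
    definitions: list[TagDefinition] = field(default_factory=list)


@dataclass
class Tagset:
    """Complete vocabulary, stored as a set of interrelated modules."""
    name: str
    modules: dict[str, Module] = field(default_factory=dict)

    def add(self, module: Module) -> None:
        self.modules[module.name] = module


@dataclass
class Schema:
    """One application of a tagset: a chosen selection of its modules."""
    name: str
    tagset: Tagset
    module_names: list[str]

    def tag_names(self) -> set[str]:
        # Collect every tag name declared by the selected modules.
        return {
            definition.name
            for module_name in self.module_names
            for definition in self.tagset.modules[module_name].definitions
        }


# Hypothetical tagset: two modules, and two schemas that reuse them.
tagset = Tagset("demo-tagset")
tagset.add(Module("common", [TagDefinition("p", "(#PCDATA)"),
                             TagDefinition("sec", "(title?, p+)")]))
tagset.add(Module("layout", [TagDefinition("page-break", "EMPTY")]))

interchange = Schema("interchange", tagset, ["common"])
page_layout = Schema("page-layout", tagset, ["common", "layout"])

print(sorted(interchange.tag_names()))
print(sorted(page_layout.tag_names()))
```

Both schemas share the "common" module, which is the reuse that the Module definition describes.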

Terminology

[Diagram: a Tagset drawn as a group of Modules; each Module holds tag definitions (A, B, C, D) linked by tag definition dependencies; each Schema (DTD, XSD, etc.) draws on Modules from the Tagset.]

Terminology – "Customization Levels"

A spectrum of six levels. At one end the tagset is used "As-Is" without customizations; at the other end the tagset is not directly used and just "informs" your approach.

• As-Is: public version is used without changes or modification

– Example: <xyz> <a/> <b/> </xyz>

– XML is compatible

• Extended: superset of public tagset is used

– Example: <xyz> <a/> <b/> <c/> </xyz>

– Compatibility depends on direction (Public vs. Custom)

• Reduced: subset of the public schema is used

– Example: <xyz> <a/> <b/> </xyz>

– Compatibility depends on direction (Public vs. Custom)

• Customized: combo of Extensions + Reductions

– Example: <xyz> <a/> <b/> <c/> </xyz>

– XML not compatible?

• Built From: substantial changes: renamed tags, altered tag hierarchies, etc.

– Example: <abc> <a> <b/> </a> </abc>

– XML not compatible!

• Informed By: only the design philosophy of public tagset is used

– Example: <abc> <aa/> <bb/> <cc/> </abc>

– XML not compatible!
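The first four levels come down to set relationships between the public and the custom tag vocabularies. The sketch below classifies a customization by comparing tag names only. It is a simplification under stated assumptions: the function name is hypothetical, content models and attributes are ignored, and name comparison alone cannot distinguish "Built From" from "Informed By".

```python
def classify_customization(public_tags, custom_tags):
    """Classify a custom vocabulary relative to a public one, by tag names only."""
    public, custom = set(public_tags), set(custom_tags)
    if custom == public:
        return "As-Is"
    if custom > public:   # strict superset: tags were only added
        return "Extended"
    if custom < public:   # strict subset: tags were only removed
        return "Reduced"
    if custom & public:   # some tags added and some removed
        return "Customized"
    # No shared names: renamed or redesigned vocabulary.
    return "Built From or Informed By"


print(classify_customization(["a", "b"], ["a", "b", "c"]))
print(classify_customization(["a", "b"], ["aa", "bb", "cc"]))
```

A real comparison would also have to look at tag hierarchies, since a "Built From" tagset can keep some public names while changing where they are allowed.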

Terminology – "Customization Implementation Methods"

• Overrides: custom definitions override the public ones, leaving the original public tag definitions intact

• Mixed: a mix of overrides and modifications

• Modified: the original public tag definitions are modified directly

[Diagram: the Public Schema (DTD, XSD, etc.) and the Custom Schema (DTD, XSD, etc.) both draw on Modules from the Tagset; each Module holds tag definitions (A, B, C, D) linked by tag definition dependencies.]
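The difference between the two ends of this spectrum can be sketched by analogy in Python. Here tag definitions are plain dictionary entries; the tag names, content-model strings, and function names are illustrative assumptions, and the code is an analogy rather than how a DTD or XSD processor works.

```python
from collections import ChainMap

# Stand-in for the public tag definitions (illustrative content models).
PUBLIC_DEFS = {
    "fig": "(label?, caption?, graphic+)",
    "sec": "(title?, p+)",
}


def customize_by_override(public_defs, overrides):
    """Layer custom definitions over the public ones; the public set is untouched."""
    return ChainMap(overrides, public_defs)


def customize_by_modification(public_defs, changes):
    """Edit the definitions in place; the original wording is lost."""
    public_defs.update(changes)
    return public_defs


custom_fig = "(label?, caption?, (graphic | chart)+)"

# Overrides: lookups hit the custom layer first, then fall back to public.
public = dict(PUBLIC_DEFS)
effective = customize_by_override(public, {"fig": custom_fig})
print(effective["fig"])   # custom definition wins
print(public["fig"])      # public definition still intact

# Modified: the working copy of the public definitions is rewritten.
working_copy = dict(PUBLIC_DEFS)
customize_by_modification(working_copy, {"fig": custom_fig})
print(working_copy["fig"])
```

With overrides, a new release of the public definitions can be dropped in underneath the custom layer; with modifications, the changes have to be reapplied to the new release by hand.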

Terminology – "Customization Profile"

A grid that places a tagset by its Customization Level (rows) and its Customization Implementation Method (columns):

• Columns: Overrides, Mixed, Modifications

• Rows: As-is, Extended, Reduced, Customized, Built from, Informed by

The Journey: ACS Pubs' Use of NLM Tagsets

ACS Pubs' Use of NLM Tagsets – Overview & Approach

•Leverage a public schema, or develop one from scratch?

•If using a public schema, would customization be needed? (i.e., where on the "Customization Levels" spectrum?)

– Product drivers !!

– Process drivers !!

– ACS Terminology !?

•If customization would be needed:

– How much customization is needed? (scoping)

– What customizations are needed? (details)

– How to implement the customizations? (i.e., where on the "Implementation Methods" spectrum?)

ACS Journals' Use of NLM Tagsets

• Production vs. Delivery

• What we use and why

• Customization Profile

• Highlights of Customizations

ACS Journal Production: What we use

•Custom-built DTD based loosely on NLM Journal Archiving & Interchange v2.2

•~2005, as NLM tagset was beginning to increase in prominence for STM publishing

•Pre-2010: Monolithic tagset & schema used for editing, page composition, interchange with web delivery and 3rd parties

•Late 2010: New version of tagset supporting multiple schema flavors:

– "X" – External & Delivery Interchange

– "P" – Internal Production

– "L" – Page Layout

ACS Journal Production: What we use

ACS Journal v1.03 Tagset and ACS Journal v1.03 DTDs:

• Core tagset modules: External/Interchange DTD

• Production-specific tagset features extend core modules: Production DTD

• Page layout specific tagset features extend production-specific modules: Layout DTD

• Overrides of tag definitions

ACS Journal Production: Why

•No public tagset met the minimum requirements for

– ACS Journal Product – without undesirable product limitations

– ACS Journal Process – without increasing costs

– Allowing ACS Pubs Terminology

• Without significant staff training & documentation updates

• Without risking rejection

•NLM's Journal tagset came closest

– Could have used massive extensions?

– ACS Pubs Terminology pushed us into "Built From"

ACS Journal Production: Customization Profile

Customization Levels (rows) by Customization Implementation Methods (columns: Overrides, Mixed, Modifications):

• As-is

• Extended

• Reduced

• Customized

• Built from: ACS Journal Production

• Informed by

ACS Journal Production: Customizations – Terminology

NLM                               ACS
<fig> with @fig-type              <fig>, <chart>, <scheme>
<abstract> with @abstract-type    <abstract>, <synopsis>, <dek>
<graphic> with @content-type      <abstract-graphic>, <toc-graphic>, <title-page-graphic>, <bio-pic>
<media>                           <weo>, <toc-weo>
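Going from the ACS vocabulary back to the NLM one means folding each dedicated ACS element into the generic NLM element plus a typing attribute. The sketch below shows the idea for a few of the pairs listed above. The mapping table, the attribute values, and the function name are assumptions made for illustration; the actual ACS conversion rules are not given in the slides, and `<weo>` is left out because no NLM attribute is listed for it.

```python
import xml.etree.ElementTree as ET

# ACS tag -> (NLM tag, typing attribute, attribute value).
# The attribute values are assumed; only the tag pairs come from the slide.
ACS_TO_NLM = {
    "chart": ("fig", "fig-type", "chart"),
    "scheme": ("fig", "fig-type", "scheme"),
    "synopsis": ("abstract", "abstract-type", "synopsis"),
    "dek": ("abstract", "abstract-type", "dek"),
    "toc-graphic": ("graphic", "content-type", "toc-graphic"),
    "bio-pic": ("graphic", "content-type", "bio-pic"),
}


def acs_to_nlm(root):
    """Rename ACS-specific elements in place to their generic NLM form."""
    for elem in root.iter():
        mapping = ACS_TO_NLM.get(elem.tag)
        if mapping is not None:
            nlm_tag, attr, value = mapping
            elem.tag = nlm_tag
            elem.set(attr, value)
    return root


SAMPLE = (
    '<sec>'
    '<chart id="cht1"><label>Chart 1</label></chart>'
    '<fig id="fig1"/>'
    '<scheme id="sch1"/>'
    '</sec>'
)

converted = acs_to_nlm(ET.fromstring(SAMPLE))
print(ET.tostring(converted, encoding="unicode"))
```

Elements that already use the NLM name, such as the plain `<fig>` in the sample, pass through unchanged.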

ACS Journal Production: Customizations – Process

NLM                          ACS
<article>                    <document>
  <front>                      <metadata>
     <journal-meta>               <journal-meta>
     <article-meta>               <document-meta>
                                  <processing-meta>
  <body>                       <body>
  <back>                       <back>
<sub-article>, <response>    <sec> beefed up to act as quasi "sub-article"

ACS Journal Production: Customizations – Product

NLM                                                ACS
<nlm-citation> (v2.3), <element-citation> (v3.0)   <acs-titles>, <acs-no-titles>, <acs-biochem>
n/a                                                <chemical-name>, <chemical-process>, <caution>
                                                   <live-change> and related tags
                                                   <tie-bar-start/>, <tie-bar-end/>

ACS Journal Production: Customizations – Product, cont.

NLM    ACS
n/a    MathML 2 extensions:
         <ACS:marker>
         <object-group>
         (now available in MathML 3)
n/a    CALS Table extensions:
         @row-type = list of types to receive special handling
         @indent-left = amount + unit
         @indent-left-style = {"full", "first-line", "hanging"}
         @spacing-before, @spacing-after

ACS Journal Delivery: What we use

• Online delivery system: based on Literatum from Atypon

• Literatum speaks "NLM Journal Archive & Interchange"

• Common base tagset ≠ XML content compatibility

– Differing schemas

– Differing tagging expectations

...see Figure <xref rid="xfca3"/>.

vs.

...see Figure <xref rid="xfca3">4</xref>.
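The `<xref>` pair above shows how two systems built on the same base tagset can still disagree about content. A conversion step that bridges the two might fill each empty cross-reference with the number of its target, roughly as sketched below. The slide does not say which system expects which form, so the direction of the conversion, the function name, and the assumption that each target carries a `<label>` such as "Figure 4" are all illustrative.

```python
import xml.etree.ElementTree as ET


def fill_xref_labels(root):
    """Put the target's number inside every empty <xref rid="...">."""
    # Map each labelled element's id to the last token of its label,
    # e.g. "Figure 4" -> "4" (assumed label convention).
    numbers = {}
    for elem in root.iter():
        label = (elem.findtext("label") or "").strip()
        if elem.get("id") and label:
            numbers[elem.get("id")] = label.split()[-1]

    for xref in root.iter("xref"):
        is_empty = len(xref) == 0 and not (xref.text or "").strip()
        if is_empty and xref.get("rid") in numbers:
            xref.text = numbers[xref.get("rid")]
    return root


SAMPLE = (
    '<body>'
    '<p>...see Figure <xref rid="xfca3"/>.</p>'
    '<fig id="xfca3"><label>Figure 4</label></fig>'
    '</body>'
)

filled = fill_xref_labels(ET.fromstring(SAMPLE))
print(ET.tostring(filled.find("p"), encoding="unicode"))
```

Cross-references that already contain text, or whose target cannot be found, are left as they are.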

ACS Journal Delivery: What we use

• Two-part content interface

1. Production system: "ACS-Delivery-Prep" (export)

2. Delivery system: "ACS2NLM" lexer (import)

Both advantages & disadvantages

+ Insulates Production developers from Delivery intricacies

+ Delivery system tagging can evolve without Production

- Occasional failure point

- New products, production tagging changes = ACS2NLM lexer changes

ACS Journal Delivery: Customization Profile

Customization Levels (rows) by Customization Implementation Methods (columns: Overrides, Mixed, Modifications):

• As-is

• Extended: ACS Journal Delivery System

• Reduced

• Customized

• Built from: ACS Journal Production

• Informed by

ACS Books' Use of NLM Tagsets

• What we use and why

• Customization Profile

• Highlights of Customizations

ACS Books: What we use and why - Drivers

•Delivery System: Leverage our new Literatum-based delivery platform.

•Composition: Leverage Arbortext Publishing Engine for highly-automated XML-based page composition.

•Like Journals: Don't re-invent the XML wheel.

•Unlike Journals: Books had unique product characteristics of their own; different type of wheel.

•Book + Chapter production:

– Individual Chapter level: production editing and some composition

– Whole Book level: final book composition, indexing

– Delivery: combination of both book and chapter XML & PDF deliverables.

ACS Books: What we use and why - Answers

•Delivery System:

– Literatum already supported an Extended version of NLM Book v2.3

– Production & Delivery could share a common tagset!

•Composition: Extended NLM Book v2.3 fit the bill

•Like Journals:

– Extended NLM Book v2.3 had CALS table model

– Many elements & structures were similar to ACS Journal tagset, easing adoption

•Unlike Journals: Extended NLM Book v2.3 addressed almost all book-specific metadata & processing needs

•Book + Chapter production: gap! Solution: XInclude

– Allows "link book to chapter" instead of "copy chapter into book"

ACS Books: Customization Profile

Customization Levels (rows) by Customization Implementation Methods (columns: Overrides, Mixed, Modifications):

• As-is

• Extended: ACS Journal & Book Delivery System

• Reduced

• Customized: ACS Book Production

• Built from: ACS Journal Production

• Informed by

ACS Books: Customization Highlights

•Addition of XInclude

– Allows a chapter XML to be processed both as stand-alone document AND within context of entire book

•Use of OASIS Table Model

(instead of default XHTML Table model)

•Addition of DocBook <index> Model

•Addition of <book-series-meta> section

(similar to <journal-meta>)

ACS Books: Customization Highlights - XInclude

"Copy chapter into book": Book XML (Book DTD)

<book> <book-series-meta>… <book-meta>… <body> <book-part>… <book-part>… <book-part>…

"Link book to chapter": Book XML (Book DTD)

<book> <book-series-meta>… <book-meta>… <body> <xi:include href="ch1.xml"/> <xi:include href="ch2.xml"/> <xi:include href="ch3.xml"/>

Chapter XMLs (one per included file):

<book-part>…

<book-part>…

<book-part>…
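The linking arrangement can be tried out with the XInclude support in Python's standard library. The sketch below is a minimal stand-in, not the ACS production setup: the chapter content is held in an in-memory dictionary in place of files, the element content is made up, and `load_chapter` and `assemble_book` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-ins for the chapter files; each is a complete document on its own.
CHAPTERS = {
    "ch1.xml": '<book-part id="ch1"><title>Chapter 1</title></book-part>',
    "ch2.xml": '<book-part id="ch2"><title>Chapter 2</title></book-part>',
}

# The book links to its chapters instead of containing copies of them.
BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <book-meta/>
  <body>
    <xi:include href="ch1.xml"/>
    <xi:include href="ch2.xml"/>
  </body>
</book>"""


def load_chapter(href, parse, encoding=None):
    """Loader callback: return the parsed chapter for an xi:include href."""
    if parse != "xml":
        raise ValueError("only XML inclusion is handled in this sketch")
    return ET.fromstring(CHAPTERS[href])


def assemble_book(book_xml):
    """Resolve every xi:include so the whole book can be processed at once."""
    book = ET.fromstring(book_xml)
    ElementInclude.include(book, loader=load_chapter)
    return book


book = assemble_book(BOOK)
chapter_ids = [part.get("id") for part in book.iter("book-part")]
print(chapter_ids)
```

Each chapter string still parses as a stand-alone document, which is what lets the same chapter XML be edited individually and composed as part of the whole book.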

ACS C&EN Magazine's Use of NLM Tagsets

• What we use and why

• Customization Profile

• Highlights of Customizations

ACS Magazine: What we use and why

•What: A customized version of the ACS Journal Tagset

– (Which was "informed by" NLM Journal Tagset)

•Drivers:

– Ability to archive a "content of record" that is format independent

– Ability to serve as technology-neutral "content interchange format"

• Automated web delivery

• External content syndication

•Other contenders: DITA for Publications, DocBook, EPUB, PRISM, NewsML,

ACS Magazine: Customization Profile

Customization Levels (rows) by Customization Implementation Methods (columns: Overrides, Mixed, Modifications):

• As-is

• Extended: ACS Journal & Book Delivery System

• Reduced

• Customized: ACS Book Production

• Built from: ACS C&EN Magazine, ACS Journal Production

• Informed by

ACS Magazine: Customization Highlights

•Amorphous, modular content structures: XInclude

– Same content produced as

• Single article in print

• Several distinct pages online

– Web-only articles & article components

– Blur between articles & subarticles

– Graphics, tables, media have separate production lifecycles, joined later

•Non-contiguous Pagination

•Ads

ACS Magazine: Customization Highlights

•Flexible, recursive categorization model

– Print/web name, internal code, source/type

• "CO2 Sequestration" vs. "Carbon Dioxide Sequestration"

– RSS feeds

– Alternate topic-oriented TOCs

•Special content constructs

– Dek

– Eyebrow

– Pull quotes

ACS Magazine: Customization Highlights

ACS Pubs' Use of NLM Tagsets – Summary

Tagset Lineage & Content Interchange Map

Successes & Lessons Learned

• Tagging & Technology

• People

Successes & Lessons - Technical

1. Monolithic vs. specialized schemas

2. Use of XInclude for Books & Magazine

Successes & Lessons - Technical

3. ACS Pubs' hosted "Validations service"

• Internal staff

• Internal systems

• External vendors

4. Use of XML for ACS Mobile

Successes & Lessons - People

1. Busting the NLM DTD "compatibility" myth

2. "XML as a product" mentality

Successes & Lessons - People

3. Specifying XML requirements via "Three-legged stool" or package:

a) XML DTD/Schema

b) Documentation: Tagging Conventions & Rendering Expectations

c) XML Samples

Q & A