05/01/2023 Contrext, LLC 1
No Ki Magic
Or: Hyperdocument Authoring Link Management Using Git and XQuery in Service of an Abstract Hyperdocument Management Model Applied to DITA Hyperdocuments
Eliot Kimber, Contrext, LLC
Balisage 2015
WHAT AM I TALKING ABOUT?
Link Management and Configuration Management
DITA
Solution Implementation
LINK AND CONFIGURATION MANAGEMENT
The Problems
• As an author: What can I link to and how do I address it?
• As an authoring tool: What does this indirect address point to?
• As a deliverable producer: What is the set of resources I require in order to produce a deliverable from the input publication source?
• As a manager: What is the version-specific configuration of this publication in a specific repository access context?
The Essential Issue
• Given a collection of source components with links among them and managed through asynchronous revision processes, what is the time-specific configuration of those components at any moment in time as viewed by a given agent for a specific purpose?
• In DITA terms: When I process a map in a specific access context, what do I see and what can I see?
Interlude: A (bit of a) Poem
Time present and time past
Are both perhaps present in time future,
And time future contained in time past.
If all time is eternally present
All time is unredeemable.
What might have been is an abstraction
Remaining a perpetual possibility
Only in a world of speculation.
What might have been and what has been
Point to one end, which is always present.
…
—T.S Eliot, "Four Quartets 1: Burnt Norton"
BACKGROUND
Aikido
• A defensive martial art based on blending with an attacker's energy, capturing their balance, and redirecting their energy in order to return them to harmony
• The goal of Aikido is ultimately universal peace and harmony
• There is no one true way to do Aikido
– Aikidoka are expected to develop their own expression and interpretation of Aikido as they develop their skills
• It's all about connection
DITA
• A standard XML application architecture for human-consumed documents
• Optimized for interchange and interoperation of content, processing, and DITA-specific knowledge
• Distinguishing architectural features:
– Specialization: enables controlled extension from base DITA markup vocabulary
– Use by reference: Content components can be used in multiple contexts (DITA maps, content reference)
– Indirect addressing: keys and key references
– Designed to work entirely from a file system
• DITA is all about connection
Another Poem
If you have not
Linked yourself
To true emptiness,
You will never understand
The Art of Peace.
—Morihei Ueshiba, The Art of Peace, translated by John Stevens.
Direct vs. Indirect Addressing
Indirect addressing
• Blend and redirect to appropriate target
• Harder to learn and execute but more effective
• Many options at time of action
• Death does not result
Direct addressing
• Quick, effective, fragile
• Relatively easy to learn and execute
• Predetermined response to a given attack
• Death results
Indirection Is Necessary For Survival
• Direct addressing is preferred for delivery
– Fast, uncomplicated, reliable
• Indirect addressing is required for authoring
– Flexible, robust, complicated
– The link must live to link another day
• Allows binding same address to different targets in different use contexts
• Without indirection many authoring and configuration use cases cannot be satisfied
• Prefer a standard, interoperable way to do indirect addressing
Different Use Contexts
• Same component used multiple times in the same hyperdocument
• Same component used in different hyperdocuments
• Same component used in different versions in time of a given hyperdocument
[Figure: three diagrams with nodes labeled Map 1, Topic A, Topic A; Map 1, Topic A, Map 2; Map 1 V1, Topic A V1, Map 1 V2]
DITA Maps and Topics
• Topics: XML documents that contain content
– All content is contained by topics
– Topics are intended to be more-or-less context independent
• Maps: XML documents containing nothing but links
– Links to other maps
– Links to topics
– Links to non-DITA things
[Figure: diagram with nodes labeled Map 1, Topic A, Topic B]
DITA Keys (No Magic)
• Keys are defined in maps
• Key definition binds a key name to a resource
• Resource can be a topic, a map, or a non-DITA thing (image, Web site, etc.)
• Same key name can have different bindings in different maps
• A key reference can be used any place a direct URI reference is allowed
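A minimal sketch of the binding idea, in Python rather than the XQuery used by the actual implementation. The map names, key names, and paths are all made up for illustration, and key definitions are modeled as plain dictionaries instead of being read from DITA maps.

```python
# Hypothetical key definitions: each map binds key names to resources.
KEY_DEFINITIONS = {
    "map-1.ditamap": {
        "install": "topics/install-linux.dita",
        "logo": "images/logo.png",
    },
    "map-2.ditamap": {
        "install": "topics/install-windows.dita",
    },
}


def link_target(link, root_map):
    """Return the resource a link addresses when processed under root_map.

    A link is a dict holding either a "keyref" (indirect address) or an
    "href" (direct address). Keys with no binding resolve to None.
    """
    if "keyref" in link:
        return KEY_DEFINITIONS[root_map].get(link["keyref"])
    return link.get("href")


indirect = {"keyref": "install"}
direct = {"href": "topics/faq.dita"}

# The same key reference lands on a different topic in each map,
# while the direct reference is the same everywhere.
for root_map in KEY_DEFINITIONS:
    print(root_map, link_target(indirect, root_map), link_target(direct, root_map))
```

The point of the sketch is only that the key reference itself never changes; what changes is the map it is resolved against.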
ABSTRACT VERSION AND LINK MANAGEMENT MODEL
Snapshot-Based Configuration Management (SnapCM)
• First formulated around 1999 by Heintz, Kimber, et al.
• Combines Heintz' version management insights with Kimber's hyperdocument representation and management insights
• Driven in large part by experience with legislative document management workflows and business requirements (bill drafting)
– Arguably the hardest set of requirements one could have
Branches and Snapshots
• A Repository contains Resources
• Resources have Versions
• A Repository has one or more Branches
• A Branch is a linear sequence of Snapshots
• A Snapshot points to zero or more Versions
– Constraint: no two Versions have the same Resource
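One way to picture these objects is as a few small Python classes. This is an illustrative sketch only: the class and field names follow the slide, but the `number` field and the `add` method are assumptions, not part of any published SnapCM API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    resource: str  # identifier of the Resource this is a version of
    number: int    # illustrative version label (an assumption)


@dataclass
class Snapshot:
    versions: list = field(default_factory=list)

    def add(self, version):
        # Constraint from the slide: no two Versions of the same Resource.
        if any(v.resource == version.resource for v in self.versions):
            raise ValueError(f"snapshot already has a version of {version.resource}")
        self.versions.append(version)


@dataclass
class Branch:
    name: str
    snapshots: list = field(default_factory=list)  # linear sequence, oldest first


@dataclass
class Repository:
    branches: dict = field(default_factory=dict)


repo = Repository()
repo.branches["main"] = Branch("main")
snap = Snapshot()
snap.add(Version("topic-a.dita", 1))
snap.add(Version("map-1.ditamap", 1))
repo.branches["main"].snapshots.append(snap)

# A second version of the same resource in one snapshot is rejected.
try:
    snap.add(Version("topic-a.dita", 2))
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```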
Configuration Management
• By default, can only see versions on Branch
– Current Snapshot
– Earlier Snapshots
• A link to a Resource is resolved using a "resolution policy"
• Default policy is "on Snapshot"
• Thus, a Snapshot represents a version-specific configuration of a set of Resources
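A rough Python sketch of policy-driven resolution. The branch data is invented, and the second policy (`latest_on_branch`) is a hypothetical alternative added only to show that the policy is pluggable; the slides name only the default "on Snapshot" policy.

```python
# Each branch is a list of snapshots, oldest first. Each snapshot maps a
# resource name to the version it points to. All data here is illustrative.
BRANCHES = {
    "main": [
        {"map-1.ditamap": 1, "topic-a.dita": 1},
        {"map-1.ditamap": 2, "topic-a.dita": 1, "topic-b.dita": 1},
    ],
    "draft": [
        {"map-1.ditamap": 1, "topic-a.dita": 3},
    ],
}


def on_snapshot(branch, index, resource):
    """Default policy: resolve against exactly one snapshot on the branch."""
    return BRANCHES[branch][index].get(resource)


def latest_on_branch(branch, index, resource):
    """Hypothetical alternative: fall back to earlier snapshots on the branch."""
    for snapshot in reversed(BRANCHES[branch][: index + 1]):
        if resource in snapshot:
            return snapshot[resource]
    return None


def resolve(branch, index, resource, policy=on_snapshot):
    """Resolve a link to a resource as seen from one snapshot of one branch."""
    return policy(branch, index, resource)


print(resolve("main", 1, "map-1.ditamap"))
print(resolve("draft", 0, "topic-a.dita"))
```

Because a snapshot holds at most one version per resource, the answer for a given branch and snapshot is unambiguous, which is what makes the snapshot a version-specific configuration.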
SOLUTION: DITA FOR SMALL TEAMS
DITA for Small Teams (DFST)
• Show how open-source tools can be combined to create a reasonable DITA authoring and production support system
• Four main parts:
– Versioned content storage: git, mercurial, etc.
– Authoring: oXygen XML, etc.
– Production and delivery: Continuous integration + DITA Open Toolkit
– Link Management: Under development
• Link management is the one missing piece
• I'm implementing link management for use in the DFST context
Git-Based DFST
[Figure: architecture diagram with components labeled Authoring Environment, Git Repository, Git Hooks, Link Management Processing, Link Management Repository, Link Management Web App, Deliverable Production (CI Server, Git Repository, DITA OT), Git push, and three Deliverables]
Git As the Repository
• Git's versioning model is a close match to the abstract model
• Does not, by itself, provide branch-specific access control
• Can get the effect by having multiple clones with different branches exposed
• Git hooks feed updates to Link Manager
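As a sketch of the hook side, the Python below parses the input that git passes to a post-receive hook (one "old new refname" line per updated ref). How the DFST hooks actually hand updates to the link manager is not shown in the slides, so `notify_link_manager` is a placeholder that only prints, and the sample SHAs are made up.

```python
BRANCH_PREFIX = "refs/heads/"


def parse_post_receive(lines):
    """Turn post-receive input lines into (branch, old_sha, new_sha) tuples.

    Tags and other non-branch refs are skipped, as are malformed lines.
    """
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        old, new, ref = parts
        if ref.startswith(BRANCH_PREFIX):
            updates.append((ref[len(BRANCH_PREFIX):], old, new))
    return updates


def notify_link_manager(branch, new_sha):
    # Placeholder (assumption): the real transport to the link manager is
    # not described in the slides, so this only reports what would be sent.
    print(f"update link index: branch={branch} commit={new_sha}")


# Sample input in the shape git uses; the SHAs are invented.
sample = [
    "1111111 2222222 refs/heads/master\n",
    "3333333 4444444 refs/tags/v1.0\n",
    "5555555 6666666 refs/heads/feature/keys\n",
]
updates = parse_post_receive(sample)
for branch, _old, new in updates:
    notify_link_manager(branch, new)
```

In an installed hook script, `sys.stdin` would be passed to `parse_post_receive` in place of `sample`.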
Link Management
• DITA-specific XQuery application: BaseX, XQuery 3.1
• Maintains where-used index based on links in the source documents
• Implements DITA key space construction and key resolution
• Fundamentally just data processing
• Some tricky bits due to DITA features:
– Map trees
– Conditional key definitions and map references
– Key scopes (DITA 1.3)
– Branch filtering (DITA 1.3)
Git For Versioning Model
• Git branch = SnapCM Branch
• Git commit = SnapCM Snapshot
• Link management repository mirrors git repository/branch organization
• Current implementation only reflects current commit
– Could reflect any commits, just costs storage
• Git atomic commit of multiple objects allows consistent link management state
No Key Magic: Link Management (LM) Database
• XQuery database (BaseX)
– Heavy dependence on XPath 3.1 (maps)
– Would be much less convenient without maps
• One top-level collection per git repo/branch pair
• Parallel link metadata database with link management "index"
• Functions to encapsulate the git nature of the database organization
Where-Used Index
• Each target doc has a directory in the LM database
• Directory contains one or more use records recording details of the linking element
• Where-used query:
– Is there a directory for the target doc?
• No: Not used
• Yes: get use records
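The lookup logic can be sketched with an in-memory dictionary standing in for the LM database, one entry per target document playing the role of its directory. This is Python for illustration, not the BaseX/XQuery implementation; the document paths are invented and the record fields borrow the `usingDoc` and `linkType` names from the use record format.

```python
from collections import defaultdict

# target document -> list of use records (one "directory" per target doc)
where_used_index = defaultdict(list)


def record_use(target_doc, using_doc, link_type):
    """Add a use record under the target document's entry."""
    where_used_index[target_doc].append(
        {"usingDoc": using_doc, "linkType": link_type}
    )


def where_used(target_doc):
    """Return the use records for a target, or an empty list if unused."""
    # Membership test first: indexing a defaultdict would create an empty
    # entry, which would make an unused document look like it has a directory.
    if target_doc not in where_used_index:
        return []
    return list(where_used_index[target_doc])


record_use("docs/topic-01.dita", "docs/pub-01.ditamap", "topicref")
record_use("docs/topic-01.dita", "docs/pub-02.ditamap", "topicref")

print(where_used("docs/topic-01.dita"))
print(where_used("docs/topic-99.dita"))
```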
Direct Links
• Find all links: //*[@href]
• Resolve the addresses
• Record use records
<dfst:useRecord xmlns:dfst="http://dita-for-small-teams.org"
  resourceKey="bL1LeEVFr4lAgv77oEaECA==^1.2"
  targetDoc="dfst^dfst-sample-project^master/docs/topic-01.dita"
  usingDoc="dfst^dfst-sample-project^master/docs/pub-02.ditamap"
  linkType="topicref"
  linkClass="- map/topicref "
  linkContext="navtree"
  format="dita"
  scope="local">
  <title>Publication Two</title>
</dfst:useRecord>
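The three steps can be approximated in Python with the standard library. This is a sketch under assumptions, not the XQuery implementation: paths are plain repository-relative strings rather than the `dfst^...` collection URIs shown in the use record, a missing `scope` attribute is treated as "local", and only a few of the use record fields are filled in.

```python
import posixpath
import xml.etree.ElementTree as ET

# Hypothetical map content for illustration.
SAMPLE_MAP = """<map>
  <title>Publication Two</title>
  <topicref href="topic-01.dita"/>
  <topicref href="../shared/intro.dita#intro/p1"/>
  <topicref href="http://example.org/" scope="external" format="html"/>
</map>"""


def direct_use_records(using_doc, xml_text):
    """Find elements with @href, resolve local addresses, build use records."""
    base_dir = posixpath.dirname(using_doc)
    records = []
    for elem in ET.fromstring(xml_text).iter():  # same elements as //*[@href]
        href = elem.get("href")
        if not href or elem.get("scope", "local") != "local":
            continue
        path = href.split("#", 1)[0]  # drop any fragment identifier
        if not path:
            continue  # same-document reference, no other document is used
        target = posixpath.normpath(posixpath.join(base_dir, path))
        records.append(
            {"targetDoc": target, "usingDoc": using_doc, "linkType": elem.tag}
        )
    return records


records = direct_use_records("docs/pub-02.ditamap", SAMPLE_MAP)
for record in records:
    print(record)
```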
Indirect Links
• Find all maps: /*[contains(@class, ' map/map ')]
• Generate resolved maps that reflect directly-referenced submaps and store them in the LM database
• Construct key space documents from resolved maps
• Use generated IDs to correlate key definitions in resolved maps and key spaces to content key definitions
• Find all indirect links: //*[@keyref]
• Resolve indirect links to targets
• Record use records
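A simplified Python sketch of key space construction and key reference resolution for a single, already-resolved map. It assumes the first definition of a key in document order is the effective one and ignores key scopes, conditional definitions, and submap precedence, which are the tricky parts. The map and topic content are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical resolved map: "prod-name" is defined twice.
RESOLVED_MAP = """<map class="- map/map ">
  <keydef keys="prod-name" href="topics/name-a.dita"/>
  <keydef keys="prod-name" href="topics/name-b.dita"/>
  <keydef keys="license legal" href="topics/license.dita"/>
</map>"""

TOPIC = """<topic id="t1"><body>
  <p><xref keyref="prod-name"/> and <xref keyref="legal/terms"/> and
  <xref keyref="missing"/></p>
</body></topic>"""


def build_key_space(resolved_map_xml):
    """Map each key name to the href of its first definition."""
    key_space = {}
    for elem in ET.fromstring(resolved_map_xml).iter():
        # @keys may list several space-separated key names.
        for name in elem.get("keys", "").split():
            key_space.setdefault(name, elem.get("href"))  # first one wins
    return key_space


def resolve_keyrefs(topic_xml, key_space):
    """Return (keyref, target or None) for each element with @keyref."""
    results = []
    for elem in ET.fromstring(topic_xml).iter():  # same as //*[@keyref]
        keyref = elem.get("keyref")
        if keyref is None:
            continue
        key_name = keyref.split("/", 1)[0]  # keyref may be "key/elementId"
        results.append((keyref, key_space.get(key_name)))
    return results


key_space = build_key_space(RESOLVED_MAP)
resolved = resolve_keyrefs(TOPIC, key_space)
for keyref, target in resolved:
    print(keyref, target)
```

Each resolved pair could then be stored as a use record in the same way as a direct link; unresolved keys (target `None`) are worth reporting rather than dropping.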
Link Management Web App
• RESTXQ Web app
– Web pages
– REST API
– Quick and easy to implement
• Report on whatever is interesting about the link nature of the content
– Where is something used?
– What are the links?
– Map structures
– Dependencies emanating from a given object
Demo
• Oops, out of time
CONCLUSIONS AND FUTURE WORK
What Was Easy?
• Git for versioned hyperdocument source management: direct match to SnapCM model
• BaseX: Easy to set up and use for DITA content
– Direct support for XML catalogs
– RESTXQ implementation
– Lightweight installation
• Direct address resolution
• DITA map resolution (ignored harder bits for now)
What Was Hard
• Key space construction
– I struggled to work with XQuery 3.1 maps
– No code authoring support for complex maps
• I miss my Java IDE (I am weak and feeble from my dependence on strongly-typed language programming)
– Scoped keys add data processing complexity aggravated by my weak map fu
• XQuery update does not allow naïve approaches to LM database population
Future Work
• Finish out DITA key space construction (key scopes, branch filtering, dynamic conditional processing)
• Finish out basic link management reporting features
• Implement basic REST API for accessing link management information
• Docker container packaging for ease of deployment
• Tighter integration with authoring tools
• Better error reporting for link management data processing
• Oh, yeah, documentation…
Questions?
Resources
• DITA for Small Teams: https://github.com/dita-for-small-teams
• Me: [email protected], http://contrext.com