New Publishing and the Semantic Web
Building an Open Publishing Architecture
Mid-Year Postgraduate Research Conference 2006
School of Global Studies, Social Science & Planning, RMIT University
Liam Magee and Gus Gollings
Background
ARC Linkage Project – impact of new technologies on publishing
Partners: RMIT, Australian Government, FujiXerox
Gus and Liam – researchers, doing PhDs on technology, information, classification systems and society…
One aspect of research – impact of new technologies on information communities (producers, brokers and consumers of information)
Hypothesis of an information architecture for new publishing – ‘Open Publishing Architecture’
Building an Open Publishing Architecture
To define a case for open document publishing - basic form of infrastructure for:
• individuals
• corporations
• communities
• government
To construct a high-level architecture that supplies:
• longevity
• openness
• metadata capacity
across the publishing activities
The Argument and Philosophy
Publishing is an important social practice: formalizing and firming ideas.
It is currently in the hands of gatekeepers.
It is largely agnostic of new technological horizons.
Can publishing really be open?
Might simply using an open document format itself limit innovation, since the standard constrains functionality?
Threats to ‘Open Information’ in a Digital Age
• Cost of participation in digital content creation
• Regulation / control of content production (content creation tools)
• Regulation / control of content distribution (historical consolidation of publishers)
• Regulation / control of content consumption (search engines, browsers, devices)
Threats to ‘Open Information’ in a Digital Age
• Obsolescence of content
• Interoperability of content and content descriptions (metadata)
• Endless replication of digital divide – superseded technologies ‘trickle down’ to developing communities
• Perceived dilution of quality from ‘mass vanity publishing’ (blogging)
• Loss of historicity of textual production with digital production / reproduction
What do we mean by ‘Architecture’?
‘Architecture’ as metaphor – used to describe both:
• an information system
• a technology system
That is, how information is created, transmitted, aggregated, edited, composed, retrieved, presented, disseminated (social practices)…
And what technical mechanisms are used to accomplish this (technical practices)…
Project straddles social and computer sciences…
Building an Open Publishing Architecture
Holy trinity of:
- standards (language constraints),
- social practice (how you do it),
- tools (software/hardware)
Anyone is free to pursue this architecture:
- companies
- communities
- individuals
- 3rd party operators
Arriving at an Open Publishing Architecture
History: By looking back at how publishing and knowledge management have been organized in the past and present
Technology: By looking at the technological horizons for extensive networked computers
History: Publishing & Knowledge Mastery
Desire to Know Everything
– Aristotle's argument with Plato: that knowledge is empirical not abstract.
– Massive documentation projects from the past: Aquinas, Diderot, Samuel Johnson's Dictionary, Project Gutenberg, "Thesaurus Linguae Graecae", "Thesaurus Linguae Latinae", Buddhist Pali Canon, Emperor's Library and Chinese Classics
History: Publishing & Knowledge Mastery
– Paul Otlet, 1935:
Man would no longer need documentation if he were assimilated into an omniscient being - as with God himself. But to a less ultimate degree, a technology will be created acting at a distance and combining radio, X-rays, cinema and microscopic photography. Everything in the universe, and everything of man, would be registered at a distance as it was produced. In this way a moving image of the world will be established, a true mirror of his memory. From a distance, everyone will be able to read text, enlarged and limited to the desired subject, projected on an individual screen. In this way, everyone from his armchair will be able to contemplate creation, as a whole or in certain of its parts. (p. 390)
History: Publishing & Knowledge Mastery
Tim Berners-Lee and Robert Cailliau: 1992-4 World Wide Web
As we experience it today, the internet is approaching one billion fixed-line users.
Technology: Network Transmission
The Web has attracted a spectacular amount of content
Managing that content is hard (problems which Google tackles)
Another approach to these problems is an arcane technology: the Semantic Web
Technology: Network Transmission
The Semantic Web…
…is an extension of the current web, where information is given well-defined meaning
…enables people and computers to work in cooperation
…where the current web is a web of pages, the Semantic Web is a web of data
Technology: Network Transmission
The current World Wide Web has HTML to mark up its pages
The Semantic Web has markup technologies of its own, which allow for computer-readable descriptions of the relationships between things
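One way to picture such a computer-readable description is as a set of three-part statements (subject, relationship, object). The sketch below is illustrative only and is not taken from the project: the example.org identifiers are made up, the Literal helper is a hypothetical convenience, and the output is a simplified rendering in the style of the N-Triples format. The title and creator relationships use the Dublin Core terms vocabulary.

```python
# Illustrative sketch: describing a publication as subject-relationship-object
# statements. The example.org identifiers are assumptions made up for this example.

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms vocabulary


class Literal(str):
    """A plain text value, as opposed to an identifier (URI)."""


BOOK = "http://example.org/book/1"      # hypothetical identifier
AUTHOR = "http://example.org/person/a"  # hypothetical identifier

triples = [
    (BOOK, DCTERMS + "title", Literal("An Example Book")),
    (BOOK, DCTERMS + "creator", AUTHOR),
]


def to_ntriples(triple):
    """Render one statement as a single line in N-Triples style."""
    subject, predicate, obj = triple
    if isinstance(obj, Literal):
        # Text values are quoted; backslashes and quotes are escaped.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    else:
        # Identifiers are wrapped in angle brackets.
        rendered = f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered} ."


for triple in triples:
    print(to_ntriples(triple))
```

Each printed line is one statement that both a person and a program can read: the book has a title, and the book has a creator who is itself an identifiable thing.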
Technology: Network Transmission
Whereas the existing Web simply defines ‘links to’ generic resources, the Semantic Web specifies the meaning of the relationships between, and the nature of, resources.
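The contrast can be sketched in a few lines. This is a toy model under stated assumptions, not an implementation of any Semantic Web standard: the Link and Statement types, the relation names ("reviews", "wrote") and the example.org addresses are all hypothetical.

```python
# Toy contrast between an untyped hyperlink and a typed relationship.
# All names and addresses below are assumptions made up for illustration.
from typing import NamedTuple


class Link(NamedTuple):
    """Existing Web: a page points at another resource, with no stated meaning."""
    source: str
    target: str


class Statement(NamedTuple):
    """Semantic Web style: the relationship itself is named."""
    subject: str
    relation: str
    obj: str


BOOK = "http://example.org/book/1"
REVIEW = "http://example.org/review/9"
PERSON = "http://example.org/person/a"

# A crawler can see that the review links to the book, but not why.
links = [Link(REVIEW, BOOK)]

# With named relationships, a program can tell a review from an author.
statements = [
    Statement(REVIEW, "reviews", BOOK),
    Statement(PERSON, "wrote", BOOK),
]


def related(statements, relation, obj):
    """Return every subject that stands in the given relation to obj."""
    return [s.subject for s in statements if s.relation == relation and s.obj == obj]


print(related(statements, "wrote", BOOK))
print(related(statements, "reviews", BOOK))
```

The question "who wrote this book?" cannot be answered from the list of links alone, but it is a simple lookup once the relationships carry meaning.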
What does an architecture look like?
Document: blueprints for a digital publishing environment:
Description of business processes – best practices
Outline of components – tools for content creation, management, archival, recomposition, distribution
Recommendations for standards
Goals of an Architecture (‘SCAM’ Principles)
Content data, and data about content (metadata) to be transparent and human-readable (Principle of Standardisation)
Components to be easy to use, low-cost and replaceable (Principle of Commoditisation)
Content to be storable and retrievable (Principle of Archivability)
Content to be editable (Principle of Manipulation)
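As a rough illustration of how these principles might look in practice, the sketch below keeps content and its metadata together in a plain-text record. It is an assumption-laden example, not the project's design: JSON is chosen here only because it is widely supported and readable, and the field names ("content", "metadata", "title", "creators") and the function names are hypothetical.

```python
# Sketch: a content record that is human-readable, storable, retrievable
# and editable. Field names and the choice of JSON are assumptions.
import json

record = {
    "content": "Publishing is an important social practice.",
    "metadata": {
        "title": "New Publishing and the Semantic Web",
        "creators": ["Liam Magee", "Gus Gollings"],
        "year": 2006,
    },
}


def archive(record):
    """Serialise to indented plain text that a person can read directly."""
    return json.dumps(record, indent=2, sort_keys=True, ensure_ascii=False)


def retrieve(text):
    """Recover the record from its archived text form."""
    return json.loads(text)


def edit_title(record, new_title):
    """Return an edited copy, leaving the original record untouched."""
    return {**record, "metadata": {**record["metadata"], "title": new_title}}


stored = archive(record)
print(stored)

restored = retrieve(stored)
revised = edit_title(restored, "Building an Open Publishing Architecture")
print(revised["metadata"]["title"])
```

Because the archived form is ordinary text in a common format, any tool that reads that format could replace the functions above, which is the kind of substitutability the commoditisation principle asks for.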
Building an Open Publishing Architecture
Questions:
- Fledgling theory (ready to be shot down…)
- Architecture – necessary metaphor? (connotations of ‘weighty’, ‘unwieldy’, ‘top-heavy’)
- Is this a critical problem (or a solution in search of a problem)?
- How will such an architecture be devised? Disseminated? Implemented? Monitored?