TEI - why we need to keep it simple
The experience of the Diplomatarium Danicum project
Mogens Devantier & Thomas Hansen, Society for Danish Language and Literature
Diplomatarium Danicum
Goal: Publish all documents pertaining to Denmark, AD 789-1450
Currently: 3-year Carlsberg Foundation project aiming at the development of
• Textbank - archive with standardized texts
• Web application - consumer of standardized texts
Future: Textbank leverages
• publication of documents 1413-1450 - approx. 8,500 texts
• transformed material 1401-1412 - approx. 3,000 texts [http://diplomatarium.dk/]
• digitized annotated material 789-1400 - approx. 15,000 texts
Why TEI?
Two reasons
1. The most popular way of communicating data that are
  o portable
  o fine-grained and structured
2. The XML modus operandi
  o Specialization
  o Standardization
  o Routinization
TEI - it gets complicated
Standardization "lite" - TEI has guidelines, not specifications, so
• Specialization needed at all levels
• Routinization difficult
When routinization is obstructed, portability is compromised
• Format inconsistency
• Tag abuse
• Missing information - non-existent or undetermined?
Serious problems
• querying - low precision, low recall
• rendering - maintaining stylesheets is difficult
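The querying problem can be made concrete with a small sketch. The four fragments below are invented for illustration (they are not Diplomatarium Danicum data, and the TEI namespace is omitted for brevity): three were issued in 1405 but encode the date in three different ways, and a fourth only mentions 1405 in passing. A structured query on the `when` attribute is precise but misses the inconsistently tagged documents; a full-text fallback finds them all but also returns the false hit.

```python
import xml.etree.ElementTree as ET

# Invented fragments: doc1-doc3 were issued in 1405, doc4 only mentions it.
DOCS = {
    "doc1": '<p>Issued <date when="1405-03-12">12 March 1405</date>.</p>',
    "doc2": '<p>Issued <date>anno domini 1405</date>.</p>',  # no normalized value
    "doc3": '<p>Issued <hi rend="date">1405</hi>.</p>',      # tag abuse
    "doc4": '<p>Issued <date when="1410-01-01">1 January 1410</date>, '
            'confirming a grant of 1405.</p>',
}
RELEVANT = {"doc1", "doc2", "doc3"}  # documents actually issued in 1405


def query_by_attribute(year):
    """Structured query: relies on a normalized when attribute on <date>."""
    hits = set()
    for doc_id, xml in DOCS.items():
        for date in ET.fromstring(xml).iter("date"):
            if date.get("when", "").startswith(year):
                hits.add(doc_id)
    return hits


def query_by_fulltext(year):
    """Fallback query: plain substring search over the text content."""
    return {
        doc_id
        for doc_id, xml in DOCS.items()
        if year in "".join(ET.fromstring(xml).itertext())
    }


def precision_recall(hits, relevant):
    """Return (precision, recall) for a set of hits against the relevant set."""
    true_positives = len(hits & relevant)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


for query in (query_by_attribute, query_by_fulltext):
    precision, recall = precision_recall(query("1405"), RELEVANT)
    print(f"{query.__name__}: precision={precision:.2f} recall={recall:.2f}")
```

With uniform markup (every issue date carried in a `date` element with a `when` value) the structured query alone would reach both full precision and full recall on this toy set.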
Simplify: Controlling input with a TEI user interface
Standardize! - develop your own standard and map it to TEI
Make it operable with
• schema
• stylesheet
Make it intelligible with
• documentation
Make documentation transparent and accessible with
• URIs
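As a minimal sketch of what "develop your own standard and map it to TEI" could look like in practice: the house element names below (`charter`, `para`, `person`, `place`, `issued`) and the `iso` attribute are hypothetical, not the project's actual vocabulary. The TEI element names and the TEI namespace are real. Anything outside the small house vocabulary is rejected, which is one way of controlling input.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Hypothetical house vocabulary mapped to TEI element names.
HOUSE_TO_TEI = {
    "charter": "div",
    "para": "p",
    "person": "persName",
    "place": "placeName",
    "issued": "date",
}
# Hypothetical house attributes, per element, mapped to TEI attribute names.
ATTR_TO_TEI = {"issued": {"iso": "when"}}


def to_tei(house_elem):
    """Recursively convert a house-standard element into its TEI equivalent."""
    tag = house_elem.tag
    if tag not in HOUSE_TO_TEI:
        raise ValueError(f"<{tag}> is not part of the house standard")
    tei_elem = ET.Element(f"{{{TEI_NS}}}{HOUSE_TO_TEI[tag]}")
    allowed = ATTR_TO_TEI.get(tag, {})
    for name, value in house_elem.attrib.items():
        if name not in allowed:
            raise ValueError(f"attribute {name!r} is not allowed on <{tag}>")
        tei_elem.set(allowed[name], value)
    # Preserve mixed content: text before children and text after the element.
    tei_elem.text = house_elem.text
    tei_elem.tail = house_elem.tail
    for child in house_elem:
        tei_elem.append(to_tei(child))
    return tei_elem


house = ET.fromstring(
    "<charter><para>Given at <place>Roskilde</place> on "
    '<issued iso="1405-03-12">12 March 1405</issued>.</para></charter>'
)
print(ET.tostring(to_tei(house), encoding="unicode"))
```

Because the mapping is a plain lookup table, it can double as documentation: each house element has exactly one TEI target, so queries and stylesheets downstream only ever see one encoding per phenomenon.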
Simple uniform resources are strategic
Immediate advantages in terms of
• usability
• management - segmentation of work, enriching markup
• easier implementation
• support
Short-term advantages in terms of
• preservation - attainable and should be promoted
Long-term advantages in terms of
• interoperability - essential to the final vision, but not always attainable right now
Short-term advantages of simple
Indications of an emerging market for text resources:
• Centralization - more resources in fewer repositories
  o EU-CLARIN
  o National research infrastructures
• Maximization - more texts, more consumers, more tools
• Specialization - producers, preservers, consumers
Markets depend on standards in order to compare the goods - therefore, most infrastructure projects implement TEI
Long-term advantages of simple
Given the fact that
• no single archive will ever hold all resources, and
• no single XML markup schema will ever be imposed on all resources
- users will, at some point, depend on interoperation between different archives and resources
Interoperation requires standardization - a set of shared semantics implemented by a service that may function as a single point of access to distributed resources
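One way to picture "shared semantics implemented by a service" is the sketch below. Everything in it is assumed for illustration: the two archives, their local field names, and the `Record` type are hypothetical. Each archive keeps its own format, a small adapter translates it into the shared record, and a single `search` function acts as the point of access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Shared semantics: the minimal fields every archive must supply."""
    archive: str
    identifier: str
    year: int
    title: str


# Two hypothetical archives, each with its own local field names.
ARCHIVE_A = [
    {"id": "A-1", "issued": "1405-03-12", "heading": "Grant of land"},
]
ARCHIVE_B = [
    {"sig": "B-7", "aar": 1405, "titel": "Royal letter"},
    {"sig": "B-8", "aar": 1388, "titel": "Testament"},
]


def from_archive_a(row):
    """Adapter: archive A stores an ISO date string; keep only the year."""
    return Record("A", row["id"], int(row["issued"][:4]), row["heading"])


def from_archive_b(row):
    """Adapter: archive B already stores the year as an integer."""
    return Record("B", row["sig"], row["aar"], row["titel"])


SOURCES = [(ARCHIVE_A, from_archive_a), (ARCHIVE_B, from_archive_b)]


def search(year):
    """Single point of access: query every archive through its adapter."""
    results = []
    for rows, adapter in SOURCES:
        for row in rows:
            record = adapter(row)
            if record.year == year:
                results.append(record)
    return results


for record in search(1405):
    print(record.archive, record.identifier, record.title)
```

The simpler the shared record, the cheaper each adapter is to write and maintain, which is the practical argument for keeping the standard simple.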
Conclusion - We need to keep it simple because...
if the standard is observed
• users will have immediate access to more resources
• the resources will be better preserved, and
• services will have easier access to more resources
After all... "a complex system that works is always based on a simple system that works" - freely adapted from John Gall, Systemantics, 1978
Contact: [email protected]