Automatic Evaluation of Migration Quality in Distributed Networks of Converters. ECDL 05 Doctoral Consortium. Miguel Ferreira [email protected] Supervisors Ana Alice Baptista José Carlos Ramalho. 2005-09-21. Contents. Introductory concepts Research problems Proposed system - PowerPoint PPT Presentation
Automatic Evaluation of Migration Quality in Distributed Networks of Converters
Miguel Ferreira [email protected]
Supervisors: Ana Alice Baptista, José Carlos Ramalho
ECDL 05 Doctoral Consortium
2005-09-21
Contents
• Introductory concepts
• Research problems
• Proposed system
• Methodology
• Topics for discussion
Introductory concepts
• Digital preservation
  – The set of processes and activities that ensure continued access to information and all kinds of cultural heritage existing in digital formats
• Digital object
  – An information object, of any type of information or any format, that is expressed in digital form
  – Text documents, digital photos, vector graphics, databases, Web pages, software
Strategies for digital preservation
• Emulation
  – Reproduction of the behaviour of a hardware/software platform in a different technological environment
• Encapsulation
  – Storing information about how the objects should be interpreted
• Migration
  – Periodic transfer of digital materials from one hardware/software configuration to another
• Others
  – Computer museums, viewers, Universal Virtual Computer
Migration
• Advantages
  – Updated formats that users can read and edit
• Disadvantages
  – Requires continuous diligence
  – Data loss
• Variants
  – Migration on request
  – Normalisation
  – Distributed migration
Distributed migration
• A network of remote conversion services supported by a semantic layer [Hunter et al.]
• Advantages
  – Platform independent
  – Redundancy
  – Multiple migration paths
  – Cost reduction
  – Compatible with other migration strategies
• Disadvantages
  – Bandwidth
  – Slow
• Examples
  – PANIC
  – MyMorph (NLMed)
  – TOM
[Figure: example migration network. Formats: A, B, C, D, E. Conversion services: A-C, B-C, C-E, A-E.]
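The "multiple migration paths" advantage can be illustrated with the conversions named in the diagram. The sketch below is only an illustration: it assumes that each conversion is a one-way service, and the function names are hypothetical rather than part of PANIC, MyMorph or TOM.

```python
from collections import defaultdict

# Conversion services named in the diagram: A-C, B-C, C-E and A-E.
# Treating each one as a one-way edge is an assumption of this sketch.
CONVERSIONS = [("A", "C"), ("B", "C"), ("C", "E"), ("A", "E")]


def build_graph(conversions):
    """Map each source format to the formats it can be converted into."""
    graph = defaultdict(list)
    for source, target in conversions:
        graph[source].append(target)
    return graph


def migration_paths(graph, source, target, visited=()):
    """Return every loop-free chain of formats leading from source to target."""
    visited = visited + (source,)
    if source == target:
        return [list(visited)]
    paths = []
    for next_format in graph.get(source, []):
        if next_format not in visited:
            paths.extend(migration_paths(graph, next_format, target, visited))
    return paths


graph = build_graph(CONVERSIONS)
paths_a_to_e = sorted(migration_paths(graph, "A", "E"), key=len)
print(paths_a_to_e)  # [['A', 'E'], ['A', 'C', 'E']]
```

Format A reaches Format E either directly or through Format C, so the network offers both redundancy and a choice between paths.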
How to choose a preservation strategy?
• Many preservation alternatives
• Lack of universal acceptance
• Distinct preservation requirements
  – Satisfaction of the designated community
  – Characteristics of the collection
  – Budget
• Framework for evaluating preservation strategies [Rauber]
  – Utility Analysis
Evaluation of preservation strategies
1. Definition of objective tree
2. Assignment of measurement units (e.g. millimetre, Mb, Euro)
3. Identification of preservation alternatives
4. Execution of preservation alternatives and evaluation of the outcome
5. Weighting of criteria in the objective tree
6. Calculation of partial and total values
7. Ranking of alternatives
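Steps 5 to 7 can be sketched as a weighted sum over the leaf criteria of the objective tree. The criteria, weights and scores below are made up for illustration, and the common 0 to 5 scale for measured outcomes is an assumption of this sketch, not a detail taken from the slides.

```python
# Hypothetical leaf criteria of an objective tree with their weights (step 5).
# The weights are illustrative and sum to 1.
WEIGHTS = {"content_preserved": 0.5, "layout_preserved": 0.3, "cost": 0.2}

# Hypothetical outcomes of step 4, assumed to be already mapped onto a
# common 0 to 5 scale so that different measurement units are comparable.
ALTERNATIVES = {
    "migrate to Format C": {"content_preserved": 5, "layout_preserved": 3, "cost": 4},
    "migrate to Format E": {"content_preserved": 4, "layout_preserved": 5, "cost": 3},
    "emulation": {"content_preserved": 5, "layout_preserved": 5, "cost": 0},
}


def total_value(scores, weights):
    """Step 6: sum the partial values (weight times score) of every criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(alternatives, weights):
    """Step 7: order the alternatives from highest to lowest total value."""
    totals = {name: total_value(scores, weights) for name, scores in alternatives.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank(ALTERNATIVES, WEIGHTS)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

With these made-up numbers the migration to Format C ranks first; changing the weights to reflect another designated community can change the order.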
Objective tree (example)
Research problems
• Automation of preservation processes
• Authenticity issues
• Cost management
• Evaluation of preservation alternatives
Research questions
• Is it feasible to design and implement a system that is able to automatically:
  – determine the amount of data loss incurred in a migration and generate detailed migration reports for inclusion in the objects’ preservation metadata?
  – provide recommendations of migration paths or target formats that will best suit users’ requirements?
Proposed system
[Figure: architecture of the proposed system]
• Components: User, MetaConverter, Migration Evaluator, Migration Advisor, Migration Knowledge Base (MKB), Migration Network
• Requests: Request migration [Source object]; Invoke migration [Source object]; Evaluate migration [Original object] [Migrated object] [Process metrics]; Store [Migration report] [Migration data]; Request advice [Criteria]; Query MKB [Parameters]
• Responses: [Migrated object], [Migration report], [Migration advice]
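The message flow in the diagram can be read as the following minimal sketch. Everything in it is an assumption made for illustration: digital objects are modelled as plain dictionaries of properties, data loss is taken to be the fraction of source properties missing or altered after conversion, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationReport:
    path: str
    data_loss: float  # fraction of source properties missing or altered


@dataclass
class KnowledgeBase:
    """Stands in for the Migration Knowledge Base (MKB)."""
    reports: list = field(default_factory=list)

    def store(self, report):
        self.reports.append(report)

    def query(self, max_loss):
        return [r for r in self.reports if r.data_loss <= max_loss]


def evaluate(original, migrated, path):
    """Evaluator: compare original and migrated objects property by property."""
    if not original:
        return MigrationReport(path, 0.0)
    lost = [key for key, value in original.items() if migrated.get(key) != value]
    return MigrationReport(path, len(lost) / len(original))


def request_migration(source, path, converter, mkb):
    """MetaConverter: invoke a converter, get the result evaluated, store the report."""
    migrated = converter(source)
    report = evaluate(source, migrated, path)
    mkb.store(report)
    return migrated, report


def request_advice(mkb, max_loss):
    """Advisor: recommend the stored path with the lowest loss within the criterion."""
    candidates = mkb.query(max_loss)
    return min(candidates, key=lambda r: r.data_loss).path if candidates else None


# Hypothetical converters standing in for services on the migration network.
def drops_fonts(obj):
    return {key: value for key, value in obj.items() if key != "fonts"}


def keeps_everything(obj):
    return dict(obj)


mkb = KnowledgeBase()
document = {"text": "hello", "fonts": "Arial", "images": 2, "links": 1}
_, report_c = request_migration(document, "A-C", drops_fonts, mkb)
_, report_e = request_migration(document, "A-E", keeps_everything, mkb)
advice = request_advice(mkb, max_loss=0.5)
print(report_c.data_loss, report_e.data_loss, advice)  # 0.25 0.0 A-E
```

Keeping the evaluation reports in a shared knowledge base is what lets later advice requests be answered from earlier migrations.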
Methodology - proof of concept
The concepts
1. Automatic quantification of the data loss incurred in a migration, and generation of preservation metadata
2. Automatic recommendation of migration strategies and target formats
The proof (empirical validation)
1. Evaluator versus human experts
2. Advisor versus evaluation framework
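One possible way to carry out the "Evaluator versus human experts" comparison is to check how closely the two sets of scores agree. The slides do not name a metric, so the rank correlation and mean absolute difference used here are assumptions of this sketch, and the scores are invented.

```python
def ranks(values):
    """Rank positions (1 = smallest value); assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equally long, tie-free score lists."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def mean_absolute_difference(xs, ys):
    """Average gap between the automatic and the human score per migration."""
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


# Invented data-loss scores for five migrations (0 = no loss, 1 = total loss).
evaluator_loss = [0.10, 0.40, 0.25, 0.80, 0.05]
expert_loss = [0.15, 0.35, 0.30, 0.70, 0.10]

rho = spearman(evaluator_loss, expert_loss)
gap = mean_absolute_difference(evaluator_loss, expert_loss)
print(f"rank correlation: {rho:.2f}, mean absolute difference: {gap:.2f}")
```

A high rank correlation would mean the evaluator orders migrations the way the experts do, even where the absolute scores differ.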
Key contributions
• For individual preservers, digital archives and libraries:
  – Outsourcing and automation of digital preservation
  – Generation of preservation metadata (authenticity)
  – Ranking of migration alternatives
• For designers and programmers of converters:
  – Possibility of publishing their converters as services
• For metadata creators and users:
  – Increase adoption
  – Help to improve future versions
  – Accelerate the development of XML bindings
Round-up
• Service-oriented architecture (SOA)
  – Automatic quantification of data loss
  – Provides recommendations on which migration paths or target formats are best suited for each user
  – Simplifies the creation of preservation metadata
  – Based on migration
• Methodology
  – Proof of concept with empirical validation
    • Evaluator versus human experts
    • Advisor versus evaluation framework
Topics for discussion
• Relevance of research
• Research methodology
• System architecture
• Format registry vocabulary
  – e.g. MIME types, TOM type descriptors, Global Digital Format Registry, PRONOM, etc.
• Preservation metadata schema
  – e.g. PREMIS data dictionary (event entity)