Evolution Management for Preservation PRELIDA Consolidation Workshop 17.10.2014 Giorgos Flouris (FORTH) [email protected]

DIACHRON Preservation: Evolution Management for Preservation


DESCRIPTION

by Giorgos Flouris (FORTH), presented at the 3rd PRELIDA Consolidation and Dissemination Workshop, Riva, Italy, October 17, 2014. More information about the workshop at: prelida.eu


Page 1: DIACHRON Preservation: Evolution Management for Preservation

Evolution Management for Preservation

PRELIDA Consolidation Workshop 17.10.2014

Giorgos Flouris (FORTH) [email protected]

Page 2: DIACHRON Preservation: Evolution Management for Preservation

Evolution Management Problem

Preservation ↔ Evolution

Page 3: DIACHRON Preservation: Evolution Management for Preservation

Change Detection

• Change detection for evolution management

– Identifying changes between versions

• Challenges (in DIACHRON)

1. Diverse data models

2. Dynamic datasets

3. Recoverable versions

4. Changes as first-class citizens

5. Cross-snapshot queries

Page 4: DIACHRON Preservation: Evolution Management for Preservation

Evolution in DIACHRON

[Figure: a pilot dataset and its DIACHRON representation, shown for Version 1 and Version 2]

Page 5: DIACHRON Preservation: Evolution Management for Preservation

Change Types: Motivation

What a naïve diff will report

Add (Rec, diachron:subject, EFO_001927)

Add (Rec, diachron:hasRecordAttribute, rAtt1)

Add (rAtt1, diachron:predicate, rdfs:subClassOf)

Add (rAtt1, diachron:object, ObsoleteClass)

What the pilot expects

Add_SuperClass (EFO_001927, ObsoleteClass)
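The gap between the two reports can be illustrated with a minimal Python sketch that groups the four reified additions into the single change the pilot expects. This is only a stand-in for illustration: the slides state that DIACHRON detects changes with SPARQL queries, and the function name `detect_add_superclass` and the tuple encoding of triples are assumptions made here, not part of DIACHRON.

```python
# Property names taken from the naive diff on the slide.
SUBJECT = "diachron:subject"
HAS_ATTR = "diachron:hasRecordAttribute"
PREDICATE = "diachron:predicate"
OBJECT = "diachron:object"


def detect_add_superclass(added):
    """Group reified low-level additions into Add_SuperClass changes.

    `added` is a set of (subject, predicate, object) tuples. The function
    is hypothetical and only mimics what a detection query might match.
    """
    # Index the added triples by (subject, predicate) for direct lookups.
    index = {}
    for s, p, o in added:
        index.setdefault((s, p), set()).add(o)

    changes = []
    for (record, p), subclasses in index.items():
        if p != SUBJECT:
            continue
        # Follow record -> attribute -> (predicate, object).
        for attr in index.get((record, HAS_ATTR), ()):
            if "rdfs:subClassOf" not in index.get((attr, PREDICATE), ()):
                continue
            for superclass in index.get((attr, OBJECT), ()):
                for subclass in subclasses:
                    changes.append(("Add_SuperClass", subclass, superclass))
    return sorted(changes)


naive_diff = {
    ("Rec", SUBJECT, "EFO_001927"),
    ("Rec", HAS_ATTR, "rAtt1"),
    ("rAtt1", PREDICATE, "rdfs:subClassOf"),
    ("rAtt1", OBJECT, "ObsoleteClass"),
}
print(detect_add_superclass(naive_diff))
```

The four low-level additions collapse into one tuple naming the subclass and the new superclass, which is the granularity the pilot works with.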

Page 6: DIACHRON Preservation: Evolution Management for Preservation

Change Hierarchy: Low-level (1/3)

• Low-level changes

– DIACHRON model, for internal use

– Fixed: Add, Delete

– Just additions and deletions of triples

– Simple set difference
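As a rough illustration of "simple set difference", the low-level delta between two versions can be sketched in Python with triples modelled as plain tuples. The function name `low_level_delta` and the sample label triple are assumptions made for this example.

```python
def low_level_delta(old, new):
    """Return (added, deleted) triples between two versions.

    Each version is a set of (subject, predicate, object) tuples.
    """
    added = new - old      # triples present only in the new version
    deleted = old - new    # triples present only in the old version
    return added, deleted


# Hypothetical sample data: the label triple is invented for illustration.
v1 = {("EFO_001927", "rdfs:label", "example term")}
v2 = v1 | {("EFO_001927", "rdfs:subClassOf", "ObsoleteClass")}

added, deleted = low_level_delta(v1, v2)
print(added, deleted)
```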

Page 7: DIACHRON Preservation: Evolution Management for Preservation

Change Hierarchy: Simple (2/3)

• Pilot terminology:

– Add_SuperClass

– Add_Dimension

• Fixed, pre-defined

• Composed of low-level changes

• Partitioning is perfect

– Complete and unambiguous
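One way to read "complete and unambiguous" is that every low-level change is consumed by exactly one simple change. The Python sketch below checks that property for a given grouping. It is an interpretation for illustration only: the function name `is_perfect_partition` and the encoding of low-level changes as ("Add", triple) pairs are assumptions.

```python
def is_perfect_partition(low_level, simple_changes):
    """Check that the simple changes partition the low-level delta.

    `simple_changes` maps a simple-change label to the set of low-level
    changes it consumes. Returns True only if no low-level change is
    consumed twice (unambiguous) and none is left over (complete).
    """
    seen = set()
    for consumed in simple_changes.values():
        if seen & consumed:          # overlap between two simple changes
            return False
        seen |= consumed
    return seen == set(low_level)    # everything consumed, nothing extra


delta = {
    ("Add", ("Rec", "diachron:subject", "EFO_001927")),
    ("Add", ("Rec", "diachron:hasRecordAttribute", "rAtt1")),
    ("Add", ("rAtt1", "diachron:predicate", "rdfs:subClassOf")),
    ("Add", ("rAtt1", "diachron:object", "ObsoleteClass")),
}
grouping = {"Add_SuperClass(EFO_001927, ObsoleteClass)": set(delta)}
ok = is_perfect_partition(delta, grouping)
print(ok)
```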

Page 8: DIACHRON Preservation: Evolution Management for Preservation

Change Hierarchy: Complex (3/3)

• Pilot terminology:

– Add_Synonym, Mark_As_Obsolete

• Totally custom, pilot-specific (defined at run-time)

Page 9: DIACHRON Preservation: Evolution Management for Preservation

Using Changes for Evolution Management

• DIACHRON data model contains all versions

• Detection based on SPARQL queries

– Provided at deployment time (for simple)

– Generated at creation time (for complex)

• Recoverability

– Allows moving back and forth between versions
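Recoverability can be sketched at the low level: if the added and deleted triples between two versions are stored, either version can be rebuilt from the other. The Python below is a minimal sketch under that assumption; `apply_delta` and `revert_delta` are hypothetical names, and the sample triples are invented for illustration.

```python
def apply_delta(version, added, deleted):
    """Move forward: turn the old version into the new one."""
    return (version - deleted) | added


def revert_delta(version, added, deleted):
    """Move backward: turn the new version into the old one."""
    return (version - added) | deleted


# Hypothetical sample versions as sets of (subject, predicate, object) tuples.
v1 = {
    ("EFO_001927", "rdfs:label", "example term"),
    ("EFO_001927", "rdfs:subClassOf", "SomeClass"),
}
v2 = {
    ("EFO_001927", "rdfs:label", "example term"),
    ("EFO_001927", "rdfs:subClassOf", "ObsoleteClass"),
}

added, deleted = v2 - v1, v1 - v2
forward = apply_delta(v1, added, deleted)
backward = revert_delta(v2, added, deleted)
print(forward == v2, backward == v1)
```

Because the delta is a plain set difference, applying and then reverting it returns exactly the starting version.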

Page 10: DIACHRON Preservation: Evolution Management for Preservation

Representation Requirements

• Interesting queries

– Return the simple changes that dataset X underwent between versions V1 and V2

– Return the changes that resource X underwent in the first semester of 2014

– Give me all resources of type X that underwent change Y

– Return all countries for which the unemployment rate of their capital city increased at a rate higher than the average increase of the country as a whole, between versions V1 and V2

• Access to both the changes and the data is required

– Changes are first-class citizens

– Allowing preservation
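To make "changes as first-class citizens" concrete, the sketch below stores changes as ordinary records and answers two of the queries listed above with plain Python filters. In DIACHRON such queries would run against the stored data rather than in-memory objects; the `Change` class, its fields, the convention that the first parameter is the affected resource, and the Add_Synonym parameters are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    kind: str          # e.g. "Add_SuperClass"
    level: str         # "simple" or "complex"
    old_version: str
    new_version: str
    params: tuple      # assumption: params[0] is the affected resource


def simple_changes_between(changes, old_version, new_version):
    """Simple changes recorded between two given versions."""
    return [
        c for c in changes
        if c.level == "simple"
        and c.old_version == old_version
        and c.new_version == new_version
    ]


def resources_with_change(changes, kind):
    """Resources that underwent a change of the given kind."""
    return sorted({c.params[0] for c in changes if c.kind == kind})


# Hypothetical change log; the synonym value is invented for illustration.
log = [
    Change("Add_SuperClass", "simple", "V1", "V2", ("EFO_001927", "ObsoleteClass")),
    Change("Add_Synonym", "complex", "V1", "V2", ("EFO_001927", "example synonym")),
]

print(simple_changes_between(log, "V1", "V2"))
print(resources_with_change(log, "Add_Synonym"))
```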

Page 11: DIACHRON Preservation: Evolution Management for Preservation

[Figure: DIACHRON data and the changes ontology. Schema level: Change, Simple_Change, Complex_Change, Add_SuperClass, Add_Synonym, prov:Activity, diachron:Entity. Data level: change C1 (Add_SuperClass), versions V1 and V2 (old_version, new_version), parameters asc_p1 and asc_p2, resources EFO_001927 and ObsoleteClass.]

Page 12: DIACHRON Preservation: Evolution Management for Preservation

Conclusion

• Main DIACHRON message

– (Linked) data preservation is related to evolution management

• DIACHRON challenges

1. Diverse data models

2. Dynamic datasets

3. Recoverable versions

4. Changes as first-class citizens

5. Cross-snapshot queries

• Solutions

– DIACHRON data model (#1)

– Appropriate change definition and detection (#2, #3)

– Changes and data represented at the same level (#4, #5)