co-funded by the European Union
Work Package 2: Interoperability Infrastructure
DM2E Final Event, December 11th, 2014, Pisa
Kai Eckert
DM2E Architecture
DM2E Final Event: Work Package 2 2 11.12.2014
WP 1
WP 2
WP 3
WP2 Infrastructure
Access to the data
Search and browse the data
View and access the data
Data Model
DM2E Model
The DM2E Model is an application profile refining the Europeana Data Model.
Reused vocabularies: Bibliographic Ontology, FaBiO, Publishing Roles Ontology, VIVO Ontology, VoID.
edm:NonInformationResource
edm:Place edm:PhysicalThing
bibo:Book
dm2e:Manuscript
dm2e:Page
…
edm:Event skos:Concept
fabio:Chapter
dm2e:Work
…
edm:TimeSpan edm:Agent
foaf:Organization
foaf:Person
DM2E Model: Metalevel
• Levels of Abstraction in DM2E
Class Uplink Metadata
edm:ProvidedCHO ore:isAggregatedBy About the content
ore:Aggregation ore:isDescribedBy About the provided metadata, provider's perspective, record level
ore:ResourceMap dm2e:DataResource foaf:Document
void:inDataset
void:Dataset (Named Graph)
About the RDF data, DM2E perspective
Metalevel, managed by DM2E Infrastructure
Core data, created by provider mappings
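The uplink chain on this slide can be sketched in a few lines of Python. This is only an illustration: the `ex:` URIs and the `climb` helper are hypothetical, and only the class and property names are taken from the slide.

```python
# Hypothetical example data; only the class and property names come from the slide.
TRIPLES = [
    ("ex:cho/1", "rdf:type", "edm:ProvidedCHO"),
    ("ex:cho/1", "ore:isAggregatedBy", "ex:aggregation/1"),
    ("ex:aggregation/1", "rdf:type", "ore:Aggregation"),
    ("ex:aggregation/1", "ore:isDescribedBy", "ex:resourcemap/1"),
    ("ex:resourcemap/1", "rdf:type", "ore:ResourceMap"),
    ("ex:resourcemap/1", "void:inDataset", "ex:dataset/1"),
    ("ex:dataset/1", "rdf:type", "void:Dataset"),
]

# The three uplink properties listed in the "Uplink" column.
UPLINKS = ("ore:isAggregatedBy", "ore:isDescribedBy", "void:inDataset")


def climb(start, triples):
    """Follow uplink properties from a resource up to its dataset."""
    path = [start]
    current = start
    while True:
        # First uplink target of the current resource, if any.
        nxt = next(
            (o for s, p, o in triples if s == current and p in UPLINKS),
            None,
        )
        # Stop at the top of the chain, or if the data contains a cycle.
        if nxt is None or nxt in path:
            return path
        path.append(nxt)
        current = nxt


print(" -> ".join(climb("ex:cho/1", TRIPLES)))
```

Starting from a CHO, the walk passes the aggregation and the resource map and ends at the dataset, which mirrors the step from core data to metalevel.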
Center of the Infrastructure
• Core: DM2E Model
• OMNOM and TYPES vocabulary:
– Describe transformation and contextualisation workflows.
• Specifications for
– URI schemes,
– the organisation of CHOs on different levels,
– methods to link to external authority data.
onto.dm2e.eu
Iterative Process (Ingestion Model)
1. Issues are tracked based on validation and test reports.
2. Changes to the model are collected and included in the current draft version of the model.
3. Providers adjust their mappings.
4. A new draft is published on a regular basis for additional feedback.
5. Based on the feedback, a new version of the model is released.
[Diagram labels: Automatic validation, Ingestion Tests, Mapping creation]
Evaluation
• The model evaluation took place in April/May 2014
• Basis:
– 10 datasets
– Delivered by eight data providers
– Mapped by six different institutions
– Altogether 61,365,146 triples
Evaluation: Some Insights
• Many classes and properties are not mapped
– For example: edm:Event, dm2e:misattributed, edm:happenedAt, skos:hiddenLabel (some of these were asked for by providers!)
– Conclusion: Unused classes and properties could be removed to simplify the model
• Different providers have different mapping styles
– Conclusion: Mapping recommendations are important!
Evaluation: Property Usage
• A few properties were used very often
• Most properties were rarely used (Long tail phenomenon)
• About a third of all properties were never used
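A usage count of this kind could be computed roughly as follows. This is a minimal sketch, not the evaluation code used in DM2E: the sample triples, the small property set, and the `property_usage` helper are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical excerpt of the model's properties and of the ingested triples.
MODEL_PROPERTIES = {"dc:title", "dc:creator", "edm:happenedAt", "skos:hiddenLabel"}
TRIPLES = [
    ("ex:cho/1", "dc:title", "First title"),
    ("ex:cho/2", "dc:title", "Second title"),
    ("ex:cho/3", "dc:title", "Third title"),
    ("ex:cho/1", "dc:creator", "ex:agent/1"),
]


def property_usage(triples, model_properties):
    """Return (used, unused): used as (property, count) pairs, most frequent first."""
    counts = Counter(p for _, p, _ in triples)
    used = [
        (p, counts[p])
        for p in sorted(model_properties, key=lambda p: (-counts[p], p))
        if counts[p] > 0
    ]
    unused = sorted(p for p in model_properties if counts[p] == 0)
    return used, unused


used, unused = property_usage(TRIPLES, MODEL_PROPERTIES)
print("used:", used)
print("never used:", unused)
```

Sorting by descending count makes the long tail visible, and the unused list corresponds to candidates for removal from the model.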
RDF Application Profiles and Validation
• Important questions beyond the limits of DM2E.
• Initiation of a task group within the Dublin Core Metadata Initiative.
• Currently around 30 participants from 11 countries.
• Collaboration with W3C.
• Goals:
– Establish RDF Application Profiles to provide combinations and refinements of existing vocabularies or application profiles globally, but with a local context.
– Develop mechanisms to support the access to data using different application profiles.
– Support constraint definitions and automatic validation of RDF data.
• http://wiki.dublincore.org/index.php/RDF-Application-Profiles
Ingestion
The DM2E Data Bridge
This is YOUR data.
This is the void:Dataset in DM2E.
Some more links are actually available...
Workflow Example: XSLT-Transformation
• XML → XSLT → RDF/XML → DM2E Store
MINT (NTUA)
• XSLT mapping editor, alignment to DM2E data model
• User support by context-sensitive lists of available elements:
– Appropriate classes for resources
– Consistency checks using domain and range specifications
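A consistency check based on domain and range specifications might look like the sketch below. It is not MINT's implementation: the `ex:writtenBy` property, the example resources, and the `check_statement` helper are hypothetical, and the check deliberately treats domain and range as constraints without any subclass reasoning.

```python
# Hypothetical, simplified schema: property -> (domain class, range class).
SCHEMA = {
    "ex:writtenBy": ("bibo:Book", "foaf:Person"),
}

# Hypothetical type assertions for the resources used in a mapping.
TYPES = {
    "ex:book/1": {"bibo:Book"},
    "ex:person/1": {"foaf:Person"},
    "ex:org/1": {"foaf:Organization"},
}


def check_statement(subject, prop, obj, schema, types):
    """Return a list of domain/range problems for one statement."""
    problems = []
    if prop not in schema:
        # No specification for this property, so nothing to check.
        return problems
    domain, range_ = schema[prop]
    if domain not in types.get(subject, set()):
        problems.append(f"{subject} is not a {domain}")
    if range_ not in types.get(obj, set()):
        problems.append(f"{obj} is not a {range_}")
    return problems


print(check_statement("ex:book/1", "ex:writtenBy", "ex:person/1", SCHEMA, TYPES))
print(check_statement("ex:book/1", "ex:writtenBy", "ex:org/1", SCHEMA, TYPES))
```

An editor can use the same lookup in the other direction, offering only those classes that satisfy the domain or range of the property being mapped.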
Open Workflows
• Distributed infrastructure to ingest and create data in DM2E.
• 100% RDF, 100% REST
• Components:
– Input services (File services, D2R instances, OAI-PMH, ...)
– Transformation services (Generic XSLT, MINT, R2R)
– Ingestion services (Output of an ingestion pipeline)
– Contextualization services (Silk)
– Configuration services (MINT and Silk act as editors)
OmNom User Interface
• OmNom UI: Orchestration of web services
– Mapping and transformation
– Contextualisation
UI Integration
MINT
Silk
OmNom
Contextualisation
Contextualisation
• Silk: Link Discovery Framework (UMA)
• Definition of linkage rules to create links between Linked Data resources.
• http://context.dm2e.eu
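The idea behind a linkage rule can be illustrated with a small Python sketch. This is not Silk's rule language or API: the label comparison via `difflib`, the threshold, the `owl:sameAs` link type, and all URIs and labels are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def generate_links(source, target, threshold=0.9):
    """Link every source resource to target resources with a similar label."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, "owl:sameAs", t_uri))
    return links


# Hypothetical resources: local agents and an external authority file.
SOURCE = {"ex:agent/1": "Thomas Harriot"}
TARGET = {"ext:person/a": "thomas harriot", "ext:person/b": "Thomas Hobbes"}

for link in generate_links(SOURCE, TARGET):
    print(link)
```

Lowering the threshold yields a more recall-oriented linkset, raising it a more precision-oriented one.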
Integration of Silk
• Silk is integrated in OmNom as a web service
Use generated linkage rules
Generate links
Access to Contextualisation Results
• Contextualisation results (Linksets) are kept separate from ingested data.
• Linksets are further described and versioned (like datasets).
• Additional linkset properties:
– Automatically created,
– Manually created,
– Recall-oriented (exploratory, but with wrong links),
– Precision-oriented (incomplete, but high quality),
– ...
Contextualisation Resources
[Diagram: contextualisation resources Geonames, GND, LCSH, DBpedia, Freebase, DDC and Linked Geodata, with the categories Places, Subjects and Agents]
Information for Contextualisation
• Structured data, often shallow
• Rich, but unstructured data
Contextualisation of Structured Data
[Figure: bar chart per dataset, scale 0% to 100%; datasets: ONB Codices, ONB ABO, MPIWG Rara, MPIWG Harriot, UBFFM Sammlungen, GEI Digital, BBAW DTA, UIB WAB, UBER DINGLER; series: New, Baseline, Potential]
Statement-level Provenance
• Generally, all statements for a resource like a CHO stem from the same data provider in DM2E and from one single data ingestion.
• But: Data about contextualisation resources (agents, places, subjects) are combined from different sources and contain additional links from the contextualisation process.
• We therefore have to deal with the provenance differently for them.
The "Oh, yeah?" Button
At the toolbar (menu, whatever) associated with a document there is a button marked "Oh, yeah?". You press it when you lose that feeling of trust. It says to the Web, "so how do I know I can trust this information?". The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons.
Source: http://www.w3.org/DesignIssues/UI.html
Statement-level Provenance
• The "Oh, yeah?" button points you to the ingested dataset or linkset containing the statement.
• It also points you to the contextualisation resources that are linked to the same external resource. This matters because the information cannot be provided under the URI of the external resource: you can't add data to its representation.
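Both lookups can be sketched over a list of quads, where the fourth element names the graph (dataset or linkset) a statement came from. The data, the `owl:sameAs` link property, and the helper names are hypothetical; this is not the DM2E implementation.

```python
# Hypothetical quads: (subject, predicate, object, named graph).
QUADS = [
    ("ex:agent/1", "skos:prefLabel", "Example Person", "ex:dataset/provider-a/1"),
    ("ex:agent/1", "owl:sameAs", "ext:person/42", "ex:linkset/silk/1"),
    ("ex:agent/2", "owl:sameAs", "ext:person/42", "ex:linkset/manual/1"),
]


def sources_of(statement, quads):
    """Named graphs (datasets or linksets) that contain the statement."""
    return sorted({g for s, p, o, g in quads if (s, p, o) == statement})


def co_linked(resource, quads, link_property="owl:sameAs"):
    """Other resources that are linked to the same external resource."""
    externals = {
        o for s, p, o, _ in quads if s == resource and p == link_property
    }
    return sorted(
        {
            s
            for s, p, o, _ in quads
            if p == link_property and o in externals and s != resource
        }
    )


print(sources_of(("ex:agent/1", "owl:sameAs", "ext:person/42"), QUADS))
print(co_linked("ex:agent/1", QUADS))
```

Because linksets live in their own named graphs, the first lookup also tells you whether a link came from an ingestion or from a contextualisation run.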
Great, where do I get it?
• Access to our data: http://data.dm2e.eu
• Documentation, Downloads: http://dm2e.eu
• DM2E in a Box:
– Virtual Machine Image (Virtual Box).
– Provides the full DM2E stack.
– Load your own data, browse it, annotate it.
Thank you.