iLastic: Linked Data Generation Workflow and User Interface for iMinds Scholarly Data

iLastic: Linked Data Generation

Workflow & User Interface for iMinds Scholarly Data

SAVE-SD 2017

Anastasia Dimou, Gerald Haesendonck, Martin Vanbrabant, Laurens De Vocht, Ruben Verborgh, Steven Latré, Erik Mannens

Anastasia.Dimou@ugent.be ● @natadimou

Ghent University – IDLab – imec

Publication is archived & published by the

event organizers where it was presented

publisher who publishes the proceedings

authors who co-authored it

organization(s) the authors are affiliated with

Publication

Dimou A. et al. (2015) Assessing & Refining Mappings to RDF to Improve Dataset Quality. In: Arenas M. et al. (eds) The Semantic Web - ISWC 2015. Lecture Notes in Computer Science, vol 9367. Springer, Cham

Publication is archived & published by the

event ISWC2015

http://iswc2015.semanticweb.org/sites/iswc2015.semanticweb.org/files/93670111.pdf

Publication is archived & published by the

event ISWC2015

publisher LNCS, Springer

https://link.springer.com/chapter/10.1007/978-3-319-25010-6_8

Publication is archived & published by the

event ISWC2015

publisher LNCS, Springer

authors (multiplied by 8)

https://ruben.verborgh.org/publications/dimou_iswc_2015a/

http://jens-lehmann.org/files/2015/iswc_rml_rdfunit.pdf

Publication is archived & published by the

event ISWC2015

publisher LNCS, Springer

authors (multiplied by 8)

organization(s) (multiplied by 5)

https://biblio.ugent.be/publication/8030828

Publication is archived & published 15 times!!
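
The count of 15 follows from adding up the agents named on the preceding slides. A minimal sketch of that arithmetic, with the dictionary keys chosen here purely for illustration:

```python
# Number of agents of each kind that archive and publish the ISWC 2015 paper,
# as listed on the preceding slides.
agents = {
    "event": 1,          # ISWC2015
    "publisher": 1,      # LNCS, Springer
    "authors": 8,
    "organizations": 5,
}

# Each agent keeps its own copy, so the copies add up.
copies = sum(agents.values())
print(copies)  # 15
```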

Dimou A. et al. (2015) Assessing & Refining Mappings to RDF to Improve Dataset Quality. In: Arenas M. et al. (eds) The Semantic Web - ISWC 2015. Lecture Notes in Computer Science, vol 9367. Springer, Cham

Publication is published 15 times...

… if all agents publish its scholarly data as Linked (Open) Data

Publication is published N times...

… if N agents publish its scholarly data as Linked (Open) Data

Linked (Open) Data is generated in N different ways

Semantic Publishing

enhances the meaning of publications by enriching them with metadata

Semantic Publishing: ad-hoc solutions

different agents own overlapping or complementary scholarly data

use their own ad-hoc solutions to generate and publish their own Linked (Open) Data

Semantic Publishing: fragmented datasets

different agents own overlapping or complementary scholarly data

focus on metadata or content, rarely on both

content annotations are rarely published as datasets

Semantic Publishing: currently leading to...

duplicate efforts for Linked (Open) Data generation:

(re-)implementing from scratch

non-negligible implementation & maintenance costs

Semantic Publishing: current

effort for Linked (Open) Data generation:

implementation & maintenance ↗

Semantic Publishing: our approach

effort for Linked (Open) Data generation:

implementation & maintenance ↘

model, semantic annotations, integration & cleansing ↗

How can we reduce implementation costs and increase Linked Data quality?

Semantic Publishing: our approach

general-purpose Linked (Open) Data generation and publication workflow

adjusted to each agent’s scholarly data

integrates metadata & content annotations

Semantic Publishing: iLastic

general-purpose Linked (Open) Data generation and publication workflow, based on our modular RML tool chain

adjusted to the iMinds & Ghent University repository: overlapping and complementary scholarly data

integrates metadata & content annotations, based on the RML tool chain & text enricher alignment

iLastic Workflow

RDF generation & publication service

Enrichment service

iLastic Workflow

RDF generation & publication service

general-purpose tool: distinct mapping rules definition & execution

Enrichment service

Mapping Module

Processor

Extraction Module

mapping rules
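
The separation of mapping rules definition from execution can be sketched in a few lines. This is a simplified illustration and not RML or the RML Processor: the rule format, the `generate_triples` function, the `example.org` IRIs and the sample record are all hypothetical. The rules are plain data, and a generic processor applies them to whatever records it is given.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping rules: plain data, kept apart from the code that runs them.
MAPPING_RULES: Dict = {
    "subject_template": "http://example.org/publication/{id}",
    "class": "http://purl.org/ontology/bibo/Document",
    "predicate_object": [
        ("http://purl.org/dc/terms/title", "title"),
        ("http://purl.org/dc/terms/issued", "year"),
    ],
}


def generate_triples(rules: Dict, records: Iterable[Dict[str, str]]) -> List[Triple]:
    """Apply the mapping rules to every record and return the resulting triples."""
    triples: List[Triple] = []
    for record in records:
        subject = rules["subject_template"].format(**record)
        triples.append((subject, RDF_TYPE, rules["class"]))
        for predicate, field in rules["predicate_object"]:
            value = record.get(field)
            if value is not None:  # skip fields the record does not have
                triples.append((subject, predicate, value))
    return triples


# Made-up input record, for illustration only.
records = [{"id": "42", "title": "An example paper", "year": "2015"}]
triples = generate_triples(MAPPING_RULES, records)
for s, p, o in triples:
    print(s, p, o)
```

Adjusting the workflow to another agent's data then means editing the rules, while the processor code stays untouched.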

iLastic Workflow

RDF generation & publication service

general-purpose tool: distinct mapping rules definition & execution

execution: RML Processor

Enrichment service

https://github.com/RMLio/RML-Processor

iLastic Workflow

RDF generation & publication service

general-purpose tool: distinct mapping rules definition & execution

execution: RML Processor

definition: RML language

Enrichment service

A. Dimou et al. (2014) RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), Seoul, Korea.

http://rml.io

iLastic Workflow

RDF generation & publication service

general-purpose tool: distinct mapping rules definition & execution

execution: RML Processor

definition: RML Editor

Enrichment service

Heyvaert P. et al. (2016) RMLEditor: A Graph-Based Mapping Editor for Linked Data Mappings. In The Semantic Web. Latest Advances and New Domains. ESWC 2016. LNCS, vol 9678. Springer, Cham

https://www.youtube.com/watch?v=0lPDaghlZoQ

iLastic Workflow

RDF generation & publication service

general-purpose tool:

execution: RML Processor

definition: RML Editor

validation: RML Validator

Enrichment service

Dimou A. et al. (2015) Assessing and Refining Mappings to RDF to Improve Dataset Quality. In: Arenas M. et al. (eds) The Semantic Web - ISWC 2015. Lecture Notes in Computer Science, vol 9367. Springer, Cham

iLastic Workflow

RDF generation & publication service

Enrichment service

PDF Extraction: CERMINE

http://cermine.ceon.pl/

iLastic Workflow

RDF generation & publication service

Enrichment service

PDF Extraction: CERMINE

NER: DBpedia Spotlight

https://github.com/dbpedia-spotlight/dbpedia-spotlight
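
The enrichment step can be pictured as turning recognized entities into annotations on a publication. The sketch below is a minimal illustration. It assumes a response shaped like the JSON returned by DBpedia Spotlight's annotation service (a `Resources` list whose items carry `@URI`, `@surfaceForm` and `@similarityScore` as strings). The sample response, the publication IRI, the choice of `dcterms:subject` and the score threshold are illustrative assumptions, not taken from iLastic.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]

DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"

# Hand-written sample, shaped like a DBpedia Spotlight annotation response (assumed).
SAMPLE_RESPONSE = json.dumps({
    "@text": "We generate Linked Data with RDF.",
    "Resources": [
        {"@URI": "http://dbpedia.org/resource/Linked_data",
         "@surfaceForm": "Linked Data", "@similarityScore": "0.99"},
        {"@URI": "http://dbpedia.org/resource/Resource_Description_Framework",
         "@surfaceForm": "RDF", "@similarityScore": "0.42"},
    ],
})


def annotations_to_triples(publication_iri: str, response_json: str,
                           min_score: float = 0.5) -> List[Triple]:
    """Keep entities scoring at least min_score and link them to the publication."""
    resources = json.loads(response_json).get("Resources", [])
    return [
        (publication_iri, DCTERMS_SUBJECT, resource["@URI"])
        for resource in resources
        if float(resource["@similarityScore"]) >= min_score  # scores arrive as strings
    ]


# Hypothetical publication IRI, for illustration only.
annotations = annotations_to_triples("http://example.org/publication/42", SAMPLE_RESPONSE)
for triple in annotations:
    print(triple)
```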

iLastic Dataset

59,462 entities

12,472 researchers

22,728 publications

81 organizations

3,295 projects

765,603 triples

iLastic Workflow

RDF generation & publication service

data dumps

Linked Data Fragments

Enrichment service

http://linkeddatafragments.org/
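
A Triple Pattern Fragments interface answers requests for a single triple pattern and leaves the evaluation of full queries to the client. The sketch below shows that selector over an in-memory list. It is not the Linked Data Fragments server software, and the `match_pattern` function, the sample triples and the `example.org` IRIs are made up for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Made-up triples, for illustration only.
DATASET: List[Triple] = [
    ("http://example.org/publication/1", "http://purl.org/dc/terms/creator",
     "http://example.org/researcher/a"),
    ("http://example.org/publication/2", "http://purl.org/dc/terms/creator",
     "http://example.org/researcher/a"),
    ("http://example.org/publication/2", "http://purl.org/dc/terms/creator",
     "http://example.org/researcher/b"),
]


def match_pattern(triples: List[Triple],
                  subject: Optional[str] = None,
                  predicate: Optional[str] = None,
                  obj: Optional[str] = None) -> List[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# All publications created by researcher "a".
by_a = match_pattern(DATASET, obj="http://example.org/researcher/a")
print(len(by_a))  # 2
```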

iLastic Workflow

RDF generation & publication service

data dumps

Linked Data Fragments

SPARQL endpoint - Virtuoso

Enrichment service

https://github.com/openlink/virtuoso-opensource

iLastic Workflow

RDF generation & publication service

data dumps

Linked Data Fragments

SPARQL endpoint - Virtuoso

The DataTank

Enrichment service

http://thedatatank.com/

iLastic User Interface

https://www.youtube.com/watch?v=ZxGrHnOuSvw

iLastic: Linked Data Generation

Workflow & User Interface for iMinds Scholarly Data

Anastasia.Dimou@ugent.be ● @natadimou
