LOD2 Plenary Meeting Vienna – 2012/03/21 – Page 1 http://lod2.eu
Creating Knowledge out of Interlinked Data
Freie Universität Berlin
Robert Isele
WP4: Reuse, Interlinking and Knowledge Fusion
LOD2 Plenary Meeting 2012, Vienna
WP4 Goals
Translate heterogeneous data from the Web of Linked Data into a clean local target representation
Provide open-source software components for:
– Link Generation
– Vocabulary Mapping
– Linked Data Quality Assessment
– Linked Data Fusion
WP4 in the LOD Stack
Task 4.1: Semi-Automatic Data Interlinking
Partners: ULEI, NUIG, FUB, KAIST
Goals:
– Develop a Linking Assist, which guides the knowledge engineer through the linking process (FUB, ULEI).
– (New) Provide a platform for automatic linking with Korean, Chinese, and Japanese RDF resources (KAIST).
Task 4.1: Progress
First Linking Assist/Silk Workbench (D4.1.1) was delivered in February 2012
– Define data sources (e.g. SPARQL endpoint, RDF dump)
– Specify the types of resources which should be interlinked
– Build linkage rules supported by machine learning
– Evaluate the quality of linkage rules
Preliminary work on Korean Resource Linking Assist
– Transformed test datasets into RDF.
– This data will be an input to the Korean resource linking module.
– Finished preliminary design of the Korean resource linking module
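The workflow above (define sources, build a linkage rule, evaluate it) can be illustrated with a minimal sketch. It is not Silk's rule language or API: the rule is simply a label-similarity threshold built on Python's `difflib`, and the URIs, labels, and reference links are hypothetical toy data.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def generate_links(source, target, threshold):
    """Link every source/target pair whose labels are at least `threshold` similar."""
    return {
        (s_uri, t_uri)
        for s_uri, s_label in source.items()
        for t_uri, t_label in target.items()
        if similarity(s_label, t_label) >= threshold
    }

def evaluate(links, reference):
    """Precision and recall of generated links against known reference links."""
    correct = len(links & reference)
    precision = correct / len(links) if links else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical toy data sources: URI -> label
source = {"a:Berlin": "Berlin", "a:Vienna": "Vienna"}
target = {"b:1": "BERLIN", "b:2": "Viena", "b:3": "Bern"}
# Hypothetical reference links used to judge the rule
reference = {("a:Berlin", "b:1"), ("a:Vienna", "b:2")}

links = generate_links(source, target, threshold=0.7)
precision, recall = evaluate(links, reference)
print(sorted(links), precision, recall)
```

A loose threshold finds all reference links but also links "Berlin" to "Bern"; raising the threshold trades such false positives against the risk of missing misspelled labels, which is the kind of trade-off the evaluation step makes visible.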
Task 4.1: Improving Silk Workbench (1/2)
Use active learning to reduce the manual effort and required expertise to interlink data sources
– Automates the generation of a linkage rule.
– The user only confirms or declines a set of example links.
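The confirm/decline loop can be sketched as follows. This is a deliberately simplified stand-in, not the learning algorithm used in the Silk Workbench: the "rule" is a single similarity threshold, the query strategy is plain uncertainty sampling, and it assumes correct links score higher than incorrect ones. The candidate links, scores, and the `oracle` callback that plays the user are all hypothetical.

```python
def learn_threshold(candidates, oracle, max_queries=5):
    """Learn a similarity threshold from a few confirm/decline answers.

    candidates: dict mapping a candidate link to its similarity score.
    oracle: callable returning True (confirm) or False (decline) for a link.
    """
    low, high = 0.0, 1.0            # threshold is assumed to lie in (low, high]
    unlabeled = dict(candidates)
    for _ in range(max_queries):
        if not unlabeled:
            break
        threshold = (low + high) / 2
        # Uncertainty sampling: ask about the link closest to the current threshold
        link = min(unlabeled, key=lambda l: abs(unlabeled[l] - threshold))
        score = unlabeled.pop(link)
        if oracle(link):
            high = min(high, score)  # confirmed: the rule must accept this score
        else:
            low = max(low, score)    # declined: the rule must reject this score
    return (low + high) / 2

# Hypothetical candidate links with precomputed similarity scores
candidates = {
    ("a:1", "b:1"): 0.95,
    ("a:2", "b:2"): 0.80,
    ("a:3", "b:9"): 0.55,
    ("a:4", "b:7"): 0.30,
    ("a:5", "b:5"): 0.70,
}
# Hypothetical ground truth, standing in for the user's answers
true_links = {("a:1", "b:1"), ("a:2", "b:2"), ("a:5", "b:5")}

threshold = learn_threshold(candidates, lambda link: link in true_links, max_queries=3)
accepted = {link for link, score in candidates.items() if score >= threshold}
print(threshold, sorted(accepted))
```

Asking only about the most uncertain links is what keeps the number of questions small: here three answers are enough to separate the five candidates.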
Task 4.1: Improving Silk Workbench (2/2)
Improving the usability based on user feedback
First results for the Y2 review meeting
Final deliverable D4.1.2 (Second Linking Assist Release) in February 2013
Task 4.2: Data Interlinking Environment
Partners: NUIG
Goals:
– To research and develop LATC well beyond 2012 into 2014
– Interlinking recommendations
– Interaction with data linkage validator from WP3
Progress:
– First version of Data Interlinking Environment (D4.2.1) submitted in December 2011
– Combines Analytics Graph produced from Sindice data sources and the Silk Link Discovery Framework
Task 4.2: Silk Workbench Extension
New Sindice data source for the linking of datasets.
Dataset suggestion based on keywords, classes, and datasets.
Autocompletion for data types when executing linking tasks.
A retrieval method for entity properties to also aid in the execution of linking tasks.
Dataset suggestion
Task 4.3: Linked Data Quality Assessment
Partners: FUB, NUIG, ULEI, SWCG
Goals:
– Research into recent advances in quality assessment of Linked Data
– Develop design metrics for quality assessment
– Release a Linked Data Quality Assessment Component
Task 4.3 Progress
Survey on the State of the Art in Mapping, Quality Assessment and Data Fusion (D4.3.1) finished in February 2011
Conceptual Design and Implementation of Metrics (D4.3.2) finished in February 2012
Released first prototype of Sieve, a Linked Data Quality Assessment and Fusion framework
– Allows Web data to be filtered according to different data quality assessment policies
– Provides for fusing Web data according to different conflict resolution methods.
– http://sieve.wbsg.de
– D4.3.2: Release of the data quality assessment tool (August 2012)
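The two steps named above, filtering by a quality policy and fusing by a conflict resolution method, can be sketched in a few lines. This does not use Sieve's configuration format or API. The recency-based score, the "keep the value from the best-scored source" resolution, the graph names, and all values are assumptions made for illustration.

```python
from datetime import date

def recency_score(last_updated, today, max_age_days=365):
    """Quality metric: 1.0 for data updated today, falling linearly to 0.0."""
    age = (today - last_updated).days
    return max(0.0, 1.0 - age / max_age_days)

def filter_by_quality(statements, scores, min_score):
    """Quality policy: drop statements whose source scores below `min_score`."""
    return [s for s in statements if scores[s["source"]] >= min_score]

def fuse_keep_best(statements, scores):
    """Conflict resolution: per (subject, property), keep the best-scored source's value."""
    best = {}
    for s in statements:
        key = (s["subject"], s["property"])
        if key not in best or scores[s["source"]] > scores[best[key]["source"]]:
            best[key] = s
    return {key: s["value"] for key, s in best.items()}

# Hypothetical source graphs and their last update dates
today = date(2012, 3, 21)
updated = {
    "graphA": date(2012, 1, 21),
    "graphB": date(2011, 6, 21),
    "graphC": date(2010, 1, 1),
}
scores = {g: recency_score(d, today) for g, d in updated.items()}

# Hypothetical conflicting statements about the same entity
statements = [
    {"subject": "ex:city1", "property": "ex:population", "value": 3500000, "source": "graphA"},
    {"subject": "ex:city1", "property": "ex:population", "value": 3400000, "source": "graphB"},
    {"subject": "ex:city1", "property": "ex:population", "value": 3000000, "source": "graphC"},
    {"subject": "ex:city1", "property": "ex:areaKm2", "value": 892, "source": "graphB"},
]

kept = filter_by_quality(statements, scores, min_score=0.2)
fused = fuse_keep_best(kept, scores)
print(fused)
```

Keeping the score computation separate from the policy and the resolution function mirrors the idea that different policies and methods can be swapped in without touching the data.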
Task 4.4: Schema Mapping Publication and Discovery
Partners: FUB, ULEI, OGL, SWCG, UEP
Goals:
– Specification of the vocabulary mapping publication and discovery language
– Implementation of the Vocabulary Mapping Component
Task 4.4 Progress
Specification of the Mapping Publication and Discovery Language (D4.4.1) finished in June 2011
Implementation of the Mapping Publication and Discovery Framework (D4.4.2) finished in February 2012.
– Adapted the R2R Framework based on the use cases in LOD2.
– Conducted various experiments to demonstrate the performance and scaling behavior for translating data sets (http://www.assembla.com/spaces/ldif/wiki/Benchmark)
– Implementation published under the terms of the Apache License
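As a rough illustration of what translating data between vocabularies involves, the sketch below maps source properties to target properties and applies a value transformation along the way. It is not R2R mapping syntax or the framework's API; the `src:` and `tgt:` property names and the record are hypothetical.

```python
def translate(record, mapping):
    """Translate a record into the target vocabulary.

    mapping: source property -> (target property, value transformation).
    Properties without a mapping are dropped.
    """
    out = {}
    for prop, value in record.items():
        if prop in mapping:
            target_prop, transform = mapping[prop]
            out[target_prop] = transform(value)
    return out

# Hypothetical mapping: rename properties and convert centimetres to metres
mapping = {
    "src:name": ("tgt:label", str.strip),
    "src:heightCm": ("tgt:heightM", lambda cm: cm / 100),
}

# Hypothetical source record
record = {"src:name": " Example ", "src:heightCm": 180, "src:internalId": 42}

translated = translate(record, mapping)
print(translated)
```

Publishing such mappings as data, rather than hard-coding them, is what allows them to be discovered and reused by other consumers of the same vocabularies.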
Task 4.4: Future Work
Integration of the Mapping Publication and Discovery Framework into the LOD2 stack (D4.4.3)
Deadline: February 2013
Task 4.4a: Schema Mapping Robust to Modeling Style
Partners: UEP
Goal: Extend the methods and tools of schema matching discovery (from the original Task 4.4) with ontology transformation methods implemented within the (enhanced) PatOMat framework
Start: March 2012
First deliverable in December 2012
Task 4.5: Linked Data Fusion
Partners: FUB, ULEI
Goal:
– Build a Data Fusion Component which fuses data from multiple sources
– Fuse multiple entities representing the same real-world object into a single, consistent and clean representation
First deliverable:
– Initial release of Data Fusion Component (D4.5.1).
– Deadline: 31.08.12
– Integrating the data quality assessment module (Sieve) developed in Task 4.3 with a data fusion module.
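Fusing "multiple entities representing the same real-world object" first requires grouping the URIs that identity links connect. The sketch below does this with a small union-find and then merges the descriptions of each group by taking the union of values, leaving conflict resolution to a later step. It is an illustration under assumptions, not the Data Fusion Component: the URIs, links, and descriptions are hypothetical.

```python
def cluster_entities(same_as_links):
    """Group URIs connected by identity links, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_links:
        parent[find(a)] = find(b)

    clusters = {}
    for uri in parent:
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())

def merge_cluster(cluster, descriptions):
    """Merge the property values of all URIs in a cluster into one description."""
    merged = {}
    for uri in sorted(cluster):
        for prop, values in descriptions.get(uri, {}).items():
            merged.setdefault(prop, set()).update(values)
    return merged

# Hypothetical identity links between three datasets
same_as = [
    ("en:Seoul", "ko:Seoul"),
    ("ko:Seoul", "ext:seoul-1"),
    ("en:Busan", "ko:Busan"),
]
# Hypothetical per-URI descriptions: property -> set of values
descriptions = {
    "en:Seoul": {"label": {"Seoul"}},
    "ko:Seoul": {"label": {"서울"}, "country": {"ex:KR"}},
}

clusters = cluster_entities(same_as)
seoul = next(c for c in clusters if "en:Seoul" in c)
merged = merge_cluster(seoul, descriptions)
print(sorted(seoul), merged)
```

Clustering transitively matters here: `en:Seoul` and `ext:seoul-1` are never linked directly, yet they end up in the same group through `ko:Seoul`.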
Task 4.5a: Multilingual Linked Data Fusion
Involved: KAIST, ULEI
Goal: Fusion of multilingual datasets
– DBpedia dataset as the pivot multilingual dataset, since it is extracted for many different languages
– First step: Bilingual fusion between the Korean DBpedia and the English DBpedia
– Next: Include other languages such as Chinese and Japanese
First deliverable in February 2013: Korean Data Fusion Assistant
– The component will support Korean data fusion into English LOD by combining Deliverable 4.5.1 with the fused dataset of English and Korean DBpedia.
Task 4.6: Tools for Cleansing Entity Data and Crowdsourcing of Cleansing
Involved: Zemanta
Goals:
– Adapt Google Refine for Linked Open Data based on the existing DERI plugin
– Integrate crowdsourcing services such as Amazon Mechanical Turk for LOD data cleansing.
Progress:
– D4.6.1 (M18) Release of an LOD-Enabled Version of Google Refine submitted.
Next deliverable:
– D4.6.2 (M30) Release of Documentation and Software Infrastructure for Using Google Refine along with Amazon Mechanical Turk
WP 4 Summary (M12 - M18)
5 deliverables submitted in the last 6 months:
– ULEI and FUB submitted the First Linking Assist (D4.1.1)
– NUIG submitted the first version of the Data Linking Environment Release (D4.2.1)
– FUB finished the Conceptual Design and Implementation of Quality Assessment Metrics (D4.3.2)
– FUB finished the Implementation of the Mapping Publication and Discovery Framework (D4.4.2)
– Zemanta submitted the first release of the LOD-enabled version of Google Refine for review (D4.6.1)
Contact
Address
Freie Universität Berlin
School of Business & Economics
Web-based Systems Group
Garystr. 21
14195 Berlin
Germany
Presenter
Robert Isele
[email protected]