Enrichment of Library Authority Files by Linked Open Data Sources
Gerd Zechmeister, Semantic Web Company

Slide 1

Enrichment of Library Authority Files by Linked Open Data Sources
Gerd Zechmeister
Semantic Web Company
http://www.semantic-web.at

Slide 2
Presentation agenda
1. About us
2. LOD2 Project
3. Demonstration Scenario
4. Process & Results
5. Summary & Outlook

Slide 3
About us
Based in Vienna (privately held)
20 specialists from several fields
Focus: Semantic (web) technologies & search applications
1st project based on semantic technologies in 2001
Foundation of Semantic Web School in 2004
Semantic Web Company GmbH since 2008
PoolParty development started in 2007, on the market since 2009

Slide 4
[Slide without text]

Slide 5
PP Thesaurus Manager
1. Each concept in one or many concept schemes
2. Each concept has one URI
3. Each concept has one or more labels
4. (Poly-)hierarchical and non-hierarchical relations
5. Matching between concepts from various sources

Slide 6
SKOSsy
Select DBpedia categories
Choose extraction depth, data to extract and format (TTL, TriG etc.)
Extract it and import it into PoolParty as Seed Thesaurus

Slide 7
FP7 project (2010-2014)
15 partners (technology researchers, companies and service providers) from 11 European countries plus 1 associated partner from Korea
Coordinated by the AKSW research group at the University of Leipzig

Slide 8
LOD Life-Cycle Management
Extraction of RDF from text, XML and SQL
Querying and Exploration using SPARQL
Authoring of Linked Data using a Semantic Wiki
Semi-automatic link discovery between Linked Data sources
Knowledge-base Enrichment and Repair

Slide 9
Demonstration Scenario
Alignment
    Example Data vs LOD resources in SKOS
    Identification of matching concepts
Enrichment
    Addition of matches to Example Data dump

Slide 10
Demonstration Scenario: Applied tools and frameworks
Tool/Framework | Function
Using SKOS thesauri as graph/SPARQL endpoint
Creating example data as graph/SPARQL endpoint
Comparing data to detect matching concepts
Extracting categories from DBpedia to import them as a thesaurus into PoolParty

Slide 11
Demonstration Scenario: Example Data
Schlagwortnormdatei (SWD = keyword authority file) from DNB data dump
166,414 concepts in German with alignments to LCSH, RAMEAU etc.
Expressed in SKOS (hierarchical and associative relations)

Slide 12
Demonstration Scenario: SKOS vocabularies for alignment
Standard Thesaurus Economy (STW): 6520 concepts with English/German prefLabel
European Union Thesaurus (EUROVOC): 6797 concepts with multilingual prefLabel
Extracted concepts from DBpedia via SKOSsy (Economy): 13294 concepts in German

Slide 13
Process & Results: Preparatory steps
1. Download of SWD data dump from DNB server
2. Evaluation of SKOS compatibility
3. Transformation of SWD data into a SPARQL endpoint
4. Vocabulary selection: focus on Economy vocabularies

Slide 14
Process & Results: Alignment
Specification in SILK workbench
Define data sources: SWD & EUROVOC
Define tasks: compare all skos:prefLabels and deliver all matching links
Initiate process and create output file

Slide 15
SILK Workbench
Alignment SWD vs EUROVOC

Slide 16
SILK Workbench
Alignment SWD vs EUROVOC

Slide 17
Process & Results: Alignment
SWD: 166414 concepts
STW: 6520 concepts
EUROVOC: 6797 concepts
DBpedia Wirtschaft (Economy): 13294 concepts
Matching links: 3440 / 2169 / 1318

Slide 18
Process & Results: Enrichment
Upload of exact matches to the SWD graph in Virtuoso

Slide 19
Process & Results: Enrichment
Subject | Predicate | Object

Slide 20
SWD, DBpedia, EUROVOC, STW

Slide 21
Process & Results: Enrichment
"Weiter als im Gabler definiert, auch für öffentliche Abfallwirtschaft" (Broader than defined in Gabler, also for public waste management)
(DE-588)040001075
(DE-588c)4000107-6
Abfallwirtschaft (waste management)

Slide 22
Summary & Outlook
Playground for future scenarios
    Linked Open Library Data
    LOD2 technology stack components
Further applications
    Executing tasks for regular updates
    Link exchange with LOD providers
    Integration of data and cross-media (e.g. geo-references, images, AV files)
    Expansion of authority files for cataloguing (e.g. multilingual searches)

Slide 23
Get in contact!
Semantic Web Company GmbH
Mariahilfer Strasse 70/8
1070 Vienna - Austria
http://www.semantic-web.at/
http://poolparty.biz/
http://twitter.com/semwebcompany
Gerd Zechmeister
Research & Development Manager
[email protected]
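
Appendix sketch: the alignment and enrichment steps described in the "Process & Results" slides (compare all skos:prefLabels, deliver matching links, add them as exact matches) can be illustrated with a small Python example. This is not the SILK workbench configuration used in the talk, only a minimal sketch of exact label matching under stated assumptions: the vocabularies are plain URI-to-label dictionaries, all `http://example.org/...` URIs and the sample labels are hypothetical, and labels are compared after simple case and whitespace normalization.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Real SKOS property URI; everything under example.org below is hypothetical.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def normalize(label: str) -> str:
    """Case-fold and collapse whitespace so trivially different labels compare equal."""
    return " ".join(label.casefold().split())


def align_by_pref_label(
    source: Dict[str, str], target: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (source_uri, target_uri) pairs whose normalized prefLabels are equal."""
    # Index the target vocabulary by normalized label for constant-time lookup.
    index = defaultdict(list)
    for uri, label in target.items():
        index[normalize(label)].append(uri)

    links = []
    for uri, label in source.items():
        for match in index.get(normalize(label), []):
            links.append((uri, match))
    return links


def to_ntriples(links: List[Tuple[str, str]]) -> str:
    """Serialize the links as skos:exactMatch statements in N-Triples syntax."""
    return "\n".join(f"<{s}> <{SKOS_EXACT_MATCH}> <{o}> ." for s, o in links)


# Hypothetical sample data standing in for SWD and EUROVOC prefLabels.
swd = {
    "http://example.org/swd/4000107-6": "Abfallwirtschaft",
    "http://example.org/swd/other": "Beispielbegriff",
}
eurovoc = {
    "http://example.org/eurovoc/1": "abfallwirtschaft",
    "http://example.org/eurovoc/2": "Agrarpolitik",
}

links = align_by_pref_label(swd, eurovoc)
print(to_ntriples(links))
```

The resulting N-Triples lines have the subject, predicate, object shape shown on the enrichment slides and could be loaded into a triple store alongside the source vocabulary. Exact string equality is a deliberately simple comparison; a link discovery tool would typically offer fuzzier measures and language-aware label selection.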