Profiling & Exploration of Linked Datasets Stefan Dietze, Besnik Fetahu, Davide Taibi L3S Research Center, DE, @stefandietze, http://purl.org/dietze Stefan Dietze 12/03/14

Demo: Profiling & Exploration of Linked Open Data

DESCRIPTION

Talk at Open Data Seminar in St Petersburg, March 2014

Page 1: Demo: Profiling & Exploration of Linked Open Data

Motivation: Data on the Web

Some eye-catching opener illustrating the growth and/or diversity of web data

Profiling & Exploration of Linked Datasets

Stefan Dietze, Besnik Fetahu, Davide Taibi L3S Research Center, DE,

@stefandietze, http://purl.org/dietze

Stefan Dietze 12/03/14

Page 2: Demo: Profiling & Exploration of Linked Open Data

Data curation and dataset profiling

LinkedUp Dataset Catalog

Catalog of data (LinkedUp Catalog): classification of datasets according to resource types, disciplines/topics, data quality, accessibility, etc.

Infrastructure for distributed/federated querying

describes

Which datasets are useful & trustworthy for case XY (e.g. “learning about the solar system”)?

Which topics (e.g. “Astronomy”) are covered by dataset X?

Which datasets offer videos (slides, publications, statistics, etc.)?

Page 3: Demo: Profiling & Exploration of Linked Open Data

LinkedUp Data Catalog in a nutshell

http://datahub.io/group/linked-education

http://data.linkededucation.org/linkedup/catalog/

RDF (VoID) dataset catalog: browse & query distributed datasets

Live information about endpoint accessibility

Federated queries using type mappings
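
The idea of federated queries using type mappings can be sketched as follows. This is a minimal illustration and not the LinkedUp implementation: the mapping table, the function names `expand_type` and `build_query`, the `yov` namespace and the use of a SPARQL `VALUES` clause are all assumptions made for this example.

```python
# Hypothetical type-mapping table: one shared type maps to dataset-specific
# types. The entries are illustrative only, not taken from the catalog.
TYPE_MAPPINGS = {
    "bibo:Film": ["po:Programme", "yov:Video"],
}

PREFIXES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "po": "http://purl.org/ontology/po/",
    "yov": "http://example.org/yovisto#",  # placeholder namespace (assumption)
}


def expand_type(shared_type, mappings=TYPE_MAPPINGS):
    """Return the shared type plus every dataset-specific type mapped to it."""
    return [shared_type] + list(mappings.get(shared_type, []))


def build_query(shared_type, limit=10):
    """Build a SPARQL query that matches resources of any mapped type."""
    types = expand_type(shared_type)
    # Declare only the prefixes that the expanded type list actually uses.
    used = sorted({t.split(":", 1)[0] for t in types})
    prefix_lines = [f"PREFIX {p}: <{PREFIXES[p]}>" for p in used]
    values = " ".join(types)
    body = [
        "SELECT ?resource ?type WHERE {",
        f"  VALUES ?type {{ {values} }}",
        "  ?resource a ?type .",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(prefix_lines + body)


if __name__ == "__main__":
    print(build_query("bibo:Film"))
```

The same query text could then be sent to each endpoint that the catalog reports as accessible; how results are merged is left open here.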

Page 4: Demo: Profiling & Exploration of Linked Open Data

Extracting Topic Profiles of Linked Datasets

[Figure: the LinkedUp Dataset Catalog contains Dataset Metadata. Steps shown: Schema mappings (BIBO, AIISO, FOAF), Entity disambiguation, Topic profile extraction. Types shown: yov:Video, po:Programme, bibo:Film. Topics shown: db:Astronomy, db:Astro. Objects.]

BBC Programme

<po:Programme …>
<po:Series>Wonders of the Solar System</.>
<po:Actor>Brian Cox</…>
</po:Programme…>

Yovisto Video

<yo:Video …>
<dc:title>Pluto & the Dwarf Planets</dc:title>
</yo:Video…>
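
The final step of the pipeline, turning disambiguated entities into a dataset-level topic profile, might look roughly like the sketch below. Relative frequency is swapped in here as the simplest possible weighting; it is not claimed to be the ranking used by the authors. The function name `topic_profile` and all identifiers in the toy input are placeholders.

```python
from collections import Counter


def topic_profile(resource_entities, entity_categories):
    """Aggregate the categories of all disambiguated entities in a dataset.

    resource_entities: resource id -> list of entity ids found in its text
    entity_categories: entity id -> list of category ids
    Returns category id -> relative frequency (weights sum to 1.0).
    """
    counts = Counter()
    for entities in resource_entities.values():
        for entity in entities:
            counts.update(entity_categories.get(entity, []))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Toy input: identifiers below are placeholders, not real extraction results.
resources = {
    "video:1": ["entity:Pluto", "entity:Dwarf_planet"],
    "programme:1": ["entity:Solar_System"],
}
categories = {
    "entity:Pluto": ["topic:Astronomical_objects"],
    "entity:Dwarf_planet": ["topic:Astronomical_objects"],
    "entity:Solar_System": ["topic:Astronomy", "topic:Astronomical_objects"],
}

profile = topic_profile(resources, categories)
for topic, weight in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{topic}\t{weight:.2f}")
```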

Page 5: Demo: Profiling & Exploration of Linked Open Data

What’s all the data about: exploring topics of LOD in a nutshell

http://data-observatory.org/lod-profiles/

Visualisation & exploration of dataset-topic-graph (datasets, topics, relationships)

Includes all responsive datasets of LOD Cloud

Fetahu, B., Dietze, S., Nunes, B.P., Casanova, M., Nejdl, W. (2014). A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles. 11th Extended Semantic Web Conference (ESWC 2014), Greece, May 2014.

Page 6: Demo: Profiling & Exploration of Linked Open Data

dbp:Category:Royal_Medal_winners

dbp:Category:1955_births

dbp:Category:People_from_London

dbp:Category:Buzzwords

dbp:Category:Web_Services

dbp:Category:HTTP

dbp:Category:Unitarian_Universalists

dbp:Category:World_Wide_Web

What do these categories have in common?

Page 7: Demo: Profiling & Exploration of Linked Open Data

Diversity of category profile for a single paper

Berners-Lee, Tim; Hendler, James; Lassila, Ora (2001). "The Semantic Web". Scientific American.

person

document

dbp:Tim_Berners-Lee

dbp:Category:1955_births

dbp:Category:People_from_London

dbp:Category:Buzzwords

dbp:Semantic_Web

dbp:Category:Semantic_Web

dbp:Category:Web_Services

dbp:Category:HTTP

dbp:Category:Unitarian_Universalists

first-level categories (dcterms:subject)

dbp:Category:World_Wide_Web

dbp:Category:Royal_Medal_winners
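
To make "first-level categories (dcterms:subject)" concrete, the sketch below builds a SPARQL query that asks for the categories attached directly to one entity. The function name is hypothetical, and expanding `dbp:` to `http://dbpedia.org/resource/` is an assumption about the prefix used on the slide; the query is only constructed here, not sent to any endpoint.

```python
def first_level_categories_query(entity_uri):
    """Build a SPARQL query for the categories directly attached to an entity."""
    # Reject input that could break out of the <...> IRI delimiters.
    if any(ch in entity_uri for ch in '<>" \n'):
        raise ValueError("entity_uri must be a plain IRI")
    return "\n".join([
        "PREFIX dcterms: <http://purl.org/dc/terms/>",
        "SELECT DISTINCT ?category WHERE {",
        f"  <{entity_uri}> dcterms:subject ?category .",
        "}",
    ])


print(first_level_categories_query("http://dbpedia.org/resource/Tim_Berners-Lee"))
```

Running such a query once per entity linked from a single paper (its authors and its subject) is one way to see how quickly person-related and document-related categories end up mixed in the same profile.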

Page 8: Demo: Profiling & Exploration of Linked Open Data

Dataset profiling: some lessons learned

The DBpedia category graph is not an ideal “topic” vocabulary:

Broad and noisy

“Categories” vs. “topics”: for capturing disciplines, thesauri such as the UNESCO Thesaurus seem better suited

Lack of a clear hierarchy: a graph, not a tree

Mixing categories across resource types (document, person, etc.) creates “perceived noise”

But: its breadth is useful as a general vocabulary for categorising all sorts of resource types
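
The "graph, not a tree" point can be checked mechanically: in a tree every node has at most one parent, so any category with several broader categories breaks the hierarchy. The sketch below assumes the hierarchy is available as (child, parent) pairs, for example from skos:broader links; the function name and the category names are placeholders, not real DBpedia data.

```python
def multi_parent_nodes(broader_edges):
    """Return nodes that have more than one broader (parent) category.

    broader_edges: iterable of (child, parent) pairs, e.g. skos:broader links.
    A tree allows at most one parent per node, so any hit means "not a tree".
    """
    parents = {}
    for child, parent in broader_edges:
        parents.setdefault(child, set()).add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# Placeholder category names, not real DBpedia data.
edges = [
    ("cat:C", "cat:A"),
    ("cat:C", "cat:B"),
    ("cat:D", "cat:C"),
]
print(multi_parent_nodes(edges))
```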

Page 9: Demo: Profiling & Exploration of Linked Open Data

More precise profiles of educational datasets

http://data-observatory.org/led-explorer/

Type-specific views on datasets/categories (i.e. “topics”)

“Document” (foaf:Document)

“Person” (foaf:Person)

“Course” (aiiso:Course)

Currently applied to datasets in the LinkedUp Catalog only (as schema mappings are already available there)
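
A type-specific view can be thought of as the same category counts, kept separate per resource type instead of merged into one profile. The sketch below illustrates that grouping under stated assumptions: the input format, the function name `profiles_by_type` and the category identifiers are invented for the example, and only the type names follow the slide.

```python
from collections import Counter, defaultdict


def profiles_by_type(annotations):
    """Group category counts by the resource type they were derived from.

    annotations: iterable of (resource_type, category) pairs.
    Returns resource_type -> Counter of categories.
    """
    views = defaultdict(Counter)
    for resource_type, category in annotations:
        views[resource_type][category] += 1
    return dict(views)


# Placeholder annotations; types follow the slide, categories are invented.
annotations = [
    ("foaf:Document", "cat:Topic_A"),
    ("foaf:Document", "cat:Topic_A"),
    ("foaf:Document", "cat:Topic_B"),
    ("foaf:Person", "cat:Birth_year_X"),
]

views = profiles_by_type(annotations)
print(views["foaf:Document"].most_common())
```

Keeping the counters apart means that person-related categories no longer dilute the document view, which is the "perceived noise" problem the type-specific views address.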

Page 10: Demo: Profiling & Exploration of Linked Open Data

Thank you!

http://data-observatory.org/led-explorer/

http://data-observatory.org/lod-profiles/