Talk at Open Data Seminar in St Petersburg, March 2014
Motivation: Data on the Web
Some eye-catching opener illustrating the growth and/or diversity of web data
Profiling & Exploration of Linked Datasets
Stefan Dietze, Besnik Fetahu, Davide Taibi
L3S Research Center, DE
@stefandietze, http://purl.org/dietze
Stefan Dietze 12/03/14
Data curation and dataset profiling
LinkedUp Dataset Catalog
Catalog of data (LinkedUp Catalog): classification of datasets according to resource types, disciplines/topics, data quality, accessibility, etc.
Infrastructure for distributed/federated querying
The catalog describes datasets to help answer questions such as:
Which datasets are useful & trustworthy for case XY (e.g. "learning about the solar system")?
Which topics (e.g. "Astronomy") are covered by dataset X?
Which datasets offer videos (slides, publications, statistics, etc.)?
LinkedUp Data Catalog in a nutshell
http://datahub.io/group/linked-education
http://data.linkededucation.org/linkedup/catalog/
RDF (VoID) dataset catalog: browse & query distributed datasets
Live information about endpoint accessibility
Federated queries using type mappings
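A federated query based on type mappings could be sketched as follows. This is only an illustration under stated assumptions, not the catalog's actual implementation: the endpoint URLs are placeholders, the `TYPE_MAPPINGS` table and the function name are hypothetical, and the mapping of po:Programme and yov:Video to bibo:Film is assumed from the type labels used in the slides. PREFIX declarations are omitted, so the generated queries are not directly executable.

```python
# Hypothetical mapping table: common type -> dataset-specific types per endpoint.
# Endpoint URLs are placeholders, not real services.
TYPE_MAPPINGS = {
    "bibo:Film": {
        "http://example.org/bbc/sparql": ["po:Programme"],
        "http://example.org/yovisto/sparql": ["yov:Video"],
    },
}


def build_federated_queries(common_type, limit=10):
    """Return one SELECT query per endpoint, rewritten to that dataset's own types."""
    queries = {}
    for endpoint, local_types in TYPE_MAPPINGS.get(common_type, {}).items():
        values = " ".join(local_types)
        queries[endpoint] = (
            "SELECT ?resource WHERE { "
            f"VALUES ?type {{ {values} }} "
            "?resource a ?type . "
            f"}} LIMIT {limit}"
        )
    return queries


if __name__ == "__main__":
    for endpoint, query in build_federated_queries("bibo:Film").items():
        print(endpoint)
        print("  " + query)
```

Each endpoint receives a query phrased in its own vocabulary, so a single request for a common type can be fanned out across differently modelled datasets.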
Extracting Topic Profiles of Linked Datasets
[Figure: LinkedUp Dataset Catalog and its dataset metadata. Labels: schema mappings (BIBO, AIISO, FOAF); entity disambiguation; topic profile extraction (db:Astronomy, db:Astro. Objects); types yov:Video, po:Programme, bibo:Film; relation label "contains".]
BBC Programme
<po:Programme …>
<po:Series>Wonders of the Solar System</.>
<po:Actor>Brian Cox</…>
</po:Programme…>
Yovisto Video
<yo:Video …>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video…>
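The idea of going from resource literals via disambiguated entities to a topic profile can be illustrated with a toy sketch. Everything in it is an assumption for illustration: the two lookup tables stand in for real entity disambiguation and category lookup, their entries are invented, and plain normalised frequency is used as the topic weight in place of whatever ranking the actual approach applies.

```python
from collections import Counter

# Toy lookup tables standing in for entity disambiguation and category lookup.
# All entries are illustrative assumptions, not output of the real pipeline.
ENTITY_LINKS = {
    "Wonders of the Solar System": ["db:Solar_System"],
    "Pluto & the Dwarf Planets": ["db:Pluto", "db:Dwarf_planet"],
}
ENTITY_CATEGORIES = {
    "db:Solar_System": ["db:Astronomy"],
    "db:Pluto": ["db:Astronomy", "db:Astro. Objects"],
    "db:Dwarf_planet": ["db:Astro. Objects"],
}


def topic_profile(literals):
    """Aggregate categories of linked entities into normalised topic weights."""
    counts = Counter()
    for text in literals:
        for entity in ENTITY_LINKS.get(text, []):
            counts.update(ENTITY_CATEGORIES.get(entity, []))
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


if __name__ == "__main__":
    titles = ["Wonders of the Solar System", "Pluto & the Dwarf Planets"]
    for topic, weight in topic_profile(titles).items():
        print(f"{topic}: {weight:.2f}")
```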
What’s all the data about: exploring topics of LOD in a nutshell
http://data-observatory.org/lod-profiles/
Visualisation & exploration of dataset-topic-graph (datasets, topics, relationships)
Includes all responsive datasets of the LOD Cloud
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles, Fetahu, B., Dietze, S., Nunes, B.P., Casanova, M., Nejdl, W. (2014), 11th Extended Semantic Web Conference (ESWC 2014), Greece, May 2014.
What do these categories have in common?
dbp:Category:Royal_Medal_winners
dbp:Category:1955_births
dbp:Category:People_from_London
dbp:Category:Buzzwords
dbp:Category:Web_Services
dbp:Category:HTTP
dbp:Category:Unitarian_Universalists
dbp:Category:World_Wide_Web
Diversity of category profile for a single paper
Berners-Lee, Tim; Hendler, James; Lassila, Ora (2001). "The Semantic Web". Scientific American Magazine.
person
document
dbp:Tim_Berners-Lee
dbp:Category:1955_births
dbp:Category:People_from_London
dbp:Category:Buzzwords
dbp:Semantic_Web
dbp:Category:Semantic_Web
dbp:Category:Web_Services
dbp:Category:HTTP
dbp:Category:Unitarian_Universalists
first-level categories (dcterms:subject)
dbp:Category:World_Wide_Web
dbp:Category:Royal_Medal_winners
Dataset profiling: some lessons learned
DBpedia category graph not an ideal “topic” vocabulary:
Broad and noisy
“Categories” vs “topics” (for capturing disciplines, thesauri like UNESCO Thesaurus seem better suited)
Lack of clear hierarchy: graph, not a tree
Mixing categories across resource types (document, person etc) creates “perceived noise”
But: broadness is useful as general vocabulary for categorisation of all sorts of resource types
More precise profiles of educational datasets
http://data-observatory.org/led-explorer/
Type-specific views on datasets/categories (i.e. “topics”)
“Document” (foaf:Document)
“Person” (foaf:Person)
“Course” (aiiso:Course)
Currently applied to datasets in the LinkedUp Catalog only (as schema mappings are already available there)
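A type-specific view can be sketched as grouping categories by the type of the dataset resource through which they were found, so that person-related and document-related categories no longer mix. The sketch below is illustrative only: the entity and category names come from the slides, but which category belongs to which entity, the `ANNOTATIONS` structure and the function name are assumptions.

```python
from collections import defaultdict

# (linked entity, type of the originating dataset resource, first-level categories).
# Names come from the slides; the entity/category assignment is assumed for illustration.
ANNOTATIONS = [
    ("dbp:Tim_Berners-Lee", "foaf:Person", [
        "dbp:Category:1955_births",
        "dbp:Category:People_from_London",
        "dbp:Category:Royal_Medal_winners",
    ]),
    ("dbp:Semantic_Web", "foaf:Document", [
        "dbp:Category:Semantic_Web",
        "dbp:Category:Buzzwords",
        "dbp:Category:World_Wide_Web",
    ]),
]


def type_specific_profiles(annotations):
    """Group categories by the resource type they were found through."""
    profiles = defaultdict(set)
    for _entity, resource_type, categories in annotations:
        profiles[resource_type].update(categories)
    return {rtype: sorted(cats) for rtype, cats in profiles.items()}


if __name__ == "__main__":
    for rtype, cats in type_specific_profiles(ANNOTATIONS).items():
        print(rtype, cats)
```

Viewed per type, a category such as dbp:Category:1955_births shows up only in the person view and no longer appears as noise in the document profile.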
Thank you!
http://data-observatory.org/led-explorer/
http://data-observatory.org/lod-profiles/