WT VU – Semantic Web Fundamentals January 7th 2019Vedran Sabol
Semantic Web Fundamentals
Web Technologies (706.704)
3SSt VU
WS 2018/19
Vedran Sabol
with acknowledgements to P. Höfler, V. Pammer, W. Kienreich
ISDS, TU Graz
Dec 2nd 2019
Overview
• What is Semantic Web?
• Technology stack
• Linked Data (Cloud)
• Example applications
Semantic Web
• “A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities.”
Tim Berners-Lee, Scientific American, 2001
Also described as “Web 3.0” (where Web 2.0 is the social Web)
• Semantic Web is a web of data
Addition to the classic Web of documents
• Goals
Give Web information an exact meaning (semantics)
Provide common formats for data integration/combination
• enabled by “being about the same thing” (semantics)
Empower computers to understand, process and integrate Web information
Semantic Web Stack
Resource Identification
• Unicode: encoding standard for text, includes many character sets (covering different languages)
• Used for encoding resources in the Semantic Web
• defines UTF-8 (preferred), UTF-16, UTF-32 character encodings
• Current version: 12.1
137,994 characters
from 150 scripts (currently in use and historic)
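To make the difference between the three encodings concrete, here is a minimal sketch in Python. The sample string is an arbitrary choice for illustration; the little-endian variants are used only to avoid a byte-order mark in the output.

```python
# Encode the same text with the three Unicode encoding forms.
text = "Graz \u20ac"  # "Graz", a space, and the euro sign U+20AC

utf8 = text.encode("utf-8")       # 1 byte per ASCII character, 3 bytes for the euro sign
utf16 = text.encode("utf-16-le")  # 2 bytes per character here (all in the Basic Multilingual Plane)
utf32 = text.encode("utf-32-le")  # always 4 bytes per code point

for name, data in [("UTF-8", utf8), ("UTF-16", utf16), ("UTF-32", utf32)]:
    print(f"{name}: {len(data)} bytes")

# Decoding restores the original string regardless of the encoding used.
assert utf8.decode("utf-8") == text
assert utf16.decode("utf-16-le") == text
```

For mostly ASCII text, UTF-8 is the most compact of the three, which is one reason it is usually preferred on the Web.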
Resource Identification
• URL: Uniform Resource Locator
Reference to a resource in the Web
Also defines the means for accessing the resource (e.g. http://)
• URN: Uniform Resource Name
Identifies a resource within a specific namespace (e.g. ISBN, the International Standard Book Number)
• IRI/URI (Internationalized Resource Identifier/Uniform Resource Identifier): unique identification of resources
Resource Identification
• URI: Uniform Resource Identifier
Standard for identifying resources in the Semantic Web
Generalisation of URLs
Can be a “virtual” pointer (i.e. not associated with any content/document)
Consists of
• Scheme name: e.g. “http:”
• Authority: e.g. “//myhost.com:8080”
• Path: e.g. “/dir/subdir/file.db”
• Query: e.g. “?date=20141201&place=graz”
• Fragment: e.g. “#foto”
• IRI: Internationalized Resource Identifier
Extends URIs from ASCII to Universal Character Set
• Best practice: HTTP URIs
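The component breakdown above can be reproduced with Python's standard `urllib.parse` module. This is only a sketch: the URI is assembled from the example fragments on the slide, and `urlsplit` reports the components without their delimiters (`:`, `//`, `?`, `#`).

```python
from urllib.parse import urlsplit, parse_qs

# URI assembled from the example components listed above.
uri = "http://myhost.com:8080/dir/subdir/file.db?date=20141201&place=graz#foto"
parts = urlsplit(uri)

print("scheme:   ", parts.scheme)
print("authority:", parts.netloc)   # host and port
print("path:     ", parts.path)
print("query:    ", parts.query)
print("fragment: ", parts.fragment)

# The query string can be decoded further into key/value pairs.
query_params = parse_qs(parts.query)
print(query_params)
```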
XML
• XML: meta-markup language
Used to define a syntax for creating documents containing structured data
Nested opening and closing tags define a hierarchy of elements
Attributes define element properties
Content belonging to an element stored between tags
• XML Namespaces: provide a possibility to use uniquely named elements and attributes from different sources (vocabularies)
Declared using the reserved attribute xmlns:prefix="namespaceURI"
• xmlns:xhtml="http://www.w3.org/1999/xhtml"
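A small sketch of why namespaces matter, using Python's standard `xml.etree.ElementTree`. The document and the `inv` vocabulary (`http://example.org/inventory`) are hypothetical, made up so that two elements share the local name `table` but come from different vocabularies.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both define "table".
doc = """
<root xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xmlns:inv="http://example.org/inventory">
  <xhtml:table><xhtml:tr><xhtml:td>cell</xhtml:td></xhtml:tr></xhtml:table>
  <inv:table material="oak" />
</root>
"""
root = ET.fromstring(doc.strip())

# Prefix-to-namespace mapping used only for the lookups below.
ns = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "inv": "http://example.org/inventory",
}

html_table = root.find("xhtml:table", ns)
furniture = root.find("inv:table", ns)

# ElementTree expands each prefix to its namespace URI, so the tags differ.
print(html_table.tag)
print(furniture.tag, furniture.attrib)
```

The prefix itself is only a local shorthand; what identifies the element is the namespace URI plus the local name.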
Resource Description Framework (RDF)
• RDF is a framework for creating statements about resources in the form of “triples”
Triples are subject-predicate-object expressions
Graph representation of resource information
• RDF Schema (RDFS) provides basic vocabulary for RDF
Allows defining application-specific classes and properties
Resources defined as instances of classes and subclasses (like in OOP)
Resource Description Framework (RDF)
• Language for representing information about Web resources
• Consists of triples:
<subject> <predicate> <object>
• Different serialisation formats - initially XML
• Designed to be understood by computers
Not intended for consumption by humans
• W3C Recommendation: first published in 1999, revised in 2004
RDF - Examples
In English
• Moby-Dick was written by Herman Melville.
In RDF
• <http://dbpedia.org/resource/Moby-Dick>
<http://dbpedia.org/ontology/author>
<http://dbpedia.org/resource/Herman_Melville>.
In English
• Tim Berners-Lee likes Moby-Dick.
In RDF
• <http://www.w3.org/People/Berners-Lee/card#i>
<http://www.w3.org/2000/10/swap/pim/contact#likes>
<http://dbpedia.org/resource/Moby-Dick>.
RDF - Graph
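[Figure: RDF graph combining the statements from the examples]
• Tim Berners-Lee is a Person
• Herman Melville is a Person
• Moby Dick is a Book
• Tim Berners-Lee likes Moby Dick
• Moby Dick author Herman Melville
• Moby Dick published on "1851-10-18"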
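As a rough sketch of the idea that a set of triples forms a graph, the two example statements can be modelled in Python as 3-tuples of IRIs. The helper `to_ntriples` is a hypothetical name introduced here; it handles IRI-only triples and is not a complete serialiser.

```python
# Each RDF statement is a (subject, predicate, object) triple of IRIs.
DBR = "http://dbpedia.org/resource/"
MOBY_DICK = DBR + "Moby-Dick"
MELVILLE = DBR + "Herman_Melville"
AUTHOR = "http://dbpedia.org/ontology/author"
LIKES = "http://www.w3.org/2000/10/swap/pim/contact#likes"
TIMBL = "http://www.w3.org/People/Berners-Lee/card#i"

graph = {
    (MOBY_DICK, AUTHOR, MELVILLE),
    (TIMBL, LIKES, MOBY_DICK),
}

def to_ntriples(triples):
    """Serialise IRI-only triples, one '<s> <p> <o> .' statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))

print(to_ntriples(graph))

# The shared IRI for Moby-Dick is what connects the two statements:
# it is the subject of one triple and the object of the other.
pointing_at_book = {s for s, _, o in graph if o == MOBY_DICK}
print(pointing_at_book)
```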
RDF Serialization Formats
RDF/XML
<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dbo="http://dbpedia.org/ontology/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Moby-Dick">
    <dbo:author rdf:resource="http://dbpedia.org/resource/Herman_Melville" />
  </rdf:Description>
</rdf:RDF>
RDF Serialization Formats
N3
@prefix dbo: <http://dbpedia.org/ontology/> .
<http://dbpedia.org/resource/Moby-Dick>
dbo:author <http://dbpedia.org/resource/Herman_Melville> .
JSON-LD
{
  "@id": "http://dbpedia.org/resource/Moby-Dick",
  "http://dbpedia.org/ontology/author": {
    "@id": "http://dbpedia.org/resource/Herman_Melville"
  }
}
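To illustrate that the serialisations carry the same statement, the JSON-LD example can be turned back into a triple with Python's standard `json` module. The function `node_to_triples` is a hypothetical helper and deliberately minimal: it assumes a flat node whose property values are `@id` references, and ignores `@context`, literals, lists and nesting, which a real JSON-LD processor would handle.

```python
import json

doc = json.loads("""
{
  "@id": "http://dbpedia.org/resource/Moby-Dick",
  "http://dbpedia.org/ontology/author": {
    "@id": "http://dbpedia.org/resource/Herman_Melville"
  }
}
""")

def node_to_triples(node):
    """Yield (subject, predicate, object) for a flat node with @id-valued properties."""
    subject = node["@id"]
    for key, value in node.items():
        if key.startswith("@"):
            continue  # JSON-LD keywords such as @id are not predicates
        yield (subject, key, value["@id"])

triples = list(node_to_triples(doc))
print(triples)
```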
RDF Schema (RDFS) and Web Ontology Language (OWL)
• Data-modelling vocabulary with pre-defined semantics for RDF data
an RDFS document is a valid RDF document
• Contains elements for defining ontologies
Classes, properties, data types, sub-classing etc.
• Classes are sets of instances (rdfs:Class), e.g.
rdfs:Resource (class of everything; all other classes are subclasses of it)
rdf:Property (class of all properties)
rdfs:Literal (class of all literals, e.g. strings or integers)
rdfs:Datatype (class of all datatypes)
rdfs:Container (super-class of rdf:Alt, rdf:Bag, rdf:Seq)
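The pre-defined semantics of RDFS can be sketched with the subclass rule: if x is an instance of C and C is a subclass of D, then x is also an instance of D. The following Python sketch applies that rule until nothing new is derived. The `ex:` classes and the instance are hypothetical, and prefixed names are written as plain strings for brevity.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:Novel", SUBCLASS_OF, "ex:Book"),          # hypothetical class hierarchy
    ("ex:Book", SUBCLASS_OF, "ex:CreativeWork"),
    ("ex:MobyDick", RDF_TYPE, "ex:Novel"),         # hypothetical instance
}

def infer_types(triples):
    """Apply (x type C) and (C subClassOf D) => (x type D) until a fixpoint is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred

closure = infer_types(triples)
types_of_moby = sorted(o for s, p, o in closure if s == "ex:MobyDick" and p == RDF_TYPE)
print(types_of_moby)
```

Only one rule is implemented here; a real RDFS reasoner covers more rules (for example for properties, domains and ranges).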
RDF Schema (RDFS) and Web Ontology Language (OWL)
• OWL extends RDFS with more advanced constructs
• Extended vocabulary
• Stating additional constraints
cardinality, value restrictions, characteristics of properties (e.g. transitivity).
• Brings reasoning power to the semantic web
• Entity names are IRIs
• RDF defines how to write statements, OWL defines what is valid to write
Ontology
• Definition: naming and definition of the types, properties, and relationships of entities in a particular domain
“a formalization of a conceptualization”
• Encoded using ontology languages (RDF, RDFS, OWL etc.)
• An ontology has an IRI
• And may
have a version number
import other ontologies
be described using metadata (e.g. with RDFS)
Ontology
Consists of
• Instances (basic objects)
• Classes (define different types of instances)
• Attributes (describe classes and instances)
• Relations (put classes and instances into relation)
• Restrictions (formal descriptions of what must hold for an assertion to be accepted as input)
• Rules (if-then statements describing logical inference)
• Axioms (assertions/statements, including rules)
Declarations: non-logical axioms to ensure IRIs are used for proper entity types (e.g. Declaration(NamedIndividual(:Frankie)), ClassAssertion(:Manager :Frankie))
Assertional Axioms: assert facts or annotations about entities
Triple Stores
• Databases for storing RDF statements
• Optimized for storage of triples and the resulting graph structures
Related to graph DBs (which are more general)
May be implemented on top of existing DB engines
• Retrieval using semantic queries
Query execution efficiency is the hard part
• Examples: AllegroGraph, OpenLink Virtuoso, Jena…
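As a toy illustration of what "optimized for storage of triples" can mean, the following Python sketch keeps one index per triple position so that a lookup with any bound position avoids a full scan. `TinyTripleStore` is a hypothetical class written for this example and does not reflect how the products named above are implemented; prefixed names are plain strings.

```python
class TinyTripleStore:
    """In-memory store with one index per position; None acts as a wildcard in match()."""

    def __init__(self):
        self._triples = set()
        self._by_s = {}
        self._by_p = {}
        self._by_o = {}

    def add(self, s, p, o):
        triple = (s, p, o)
        self._triples.add(triple)
        self._by_s.setdefault(s, set()).add(triple)
        self._by_p.setdefault(p, set()).add(triple)
        self._by_o.setdefault(o, set()).add(triple)

    def match(self, s=None, p=None, o=None):
        # Collect the index entries for all bound positions and intersect them.
        candidates = []
        for key, index in ((s, self._by_s), (p, self._by_p), (o, self._by_o)):
            if key is not None:
                candidates.append(index.get(key, set()))
        if not candidates:
            return set(self._triples)
        return set.intersection(*candidates)

store = TinyTripleStore()
store.add("dbr:Moby-Dick", "dbo:author", "dbr:Herman_Melville")
store.add("timbl:i", "con:likes", "dbr:Moby-Dick")
store.add("dbr:Moby-Dick", "rdf:type", "dbo:Book")

print(store.match(s="dbr:Moby-Dick"))
print(store.match(p="con:likes"))
```

Real triple stores add persistence, more index permutations and query planning, which is where the efficiency problem mentioned above comes in.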
SPARQL Protocol and RDF Query Language (SPARQL)
• SPARQL is the RDF query language for a triple store
Interface is called a SPARQL endpoint
• 4 query types
ASK: returns a true or false value
SELECT: matching resources returned in a table format
CONSTRUCT: returns results as valid RDF
DESCRIBE: returns RDF statements describing the matching resources
• it is up to the endpoint to decide which descriptions are included
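A SPARQL endpoint is typically reached over HTTP, with the query text passed in a `query` parameter. The sketch below only builds such a request for an ASK query and does not send it. The endpoint URL and the requested result format are assumptions chosen for illustration, and `build_sparql_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed endpoint URL, for illustration only

ask_query = """
ASK {
  <http://dbpedia.org/resource/Moby-Dick>
      <http://dbpedia.org/ontology/author>
      <http://dbpedia.org/resource/Herman_Melville> .
}
"""

def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying the query text."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for JSON-formatted results; whether this is honoured depends on the endpoint.
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_sparql_request(ENDPOINT, ask_query)
print(request.get_method(), request.full_url[:60] + "...")
# Sending it would be urllib.request.urlopen(request), which needs network access.
```

Swapping the query text for a SELECT, CONSTRUCT or DESCRIBE query changes the shape of the response (table, RDF graph), not the way the request is built.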
SPARQL Protocol and RDF Query Language (SPARQL)
• SPARQL is the RDF query language for a triple store
Interface is called a SPARQL endpoint
• Query example
SELECT ?book ?releaseDate
WHERE {
  <http://www.w3.org/People/Berners-Lee/card#i> con:likes ?book .
  ?book dbpedia:releaseDate ?releaseDate .
}
• Result
(<http://dbpedia.org/resource/Moby-Dick>, "1851-10-18")
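How the two patterns in the WHERE clause combine can be sketched as a join on the shared variable `?book`. The Python code below is a simplified evaluator written for this example, not how a SPARQL engine is actually implemented. The two data triples are assumed from the example and its result, and prefixed names are kept as plain strings.

```python
TIMBL = "<http://www.w3.org/People/Berners-Lee/card#i>"
MOBY = "<http://dbpedia.org/resource/Moby-Dick>"

# Assumed data, taken from the query example and its result.
data = {
    (TIMBL, "con:likes", MOBY),
    (MOBY, "dbpedia:releaseDate", "1851-10-18"),
}

def match_pattern(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple; return None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):            # variable: bind it or check the existing binding
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                  # constant: must match exactly
            return None
    return new

def select(variables, patterns, triples):
    """Evaluate the patterns one after another, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match_pattern(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return [tuple(b[v] for v in variables) for b in bindings]

rows = select(
    ["?book", "?releaseDate"],
    [(TIMBL, "con:likes", "?book"), ("?book", "dbpedia:releaseDate", "?releaseDate")],
    data,
)
print(rows)
```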
Linked Data
“The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.”
— Tim Berners-Lee
Linked Data
• Publishing and interlinking structured data on the Web
• Using Semantic Web standards
• Discoverable using SPARQL
• 4 principles (by Tim Berners-Lee)
Use URIs as names for things
Use HTTP URIs so that people can look up those names
When someone looks up a URI, provide useful information, using standards (RDF, SPARQL)
Include links to other URIs, so that they can discover more things.
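The principles can be read as a simple crawl loop: look up an HTTP URI, read the RDF that comes back, and collect further URIs to look up. The Python sketch below shows both halves without touching the network. `build_lookup` and `links_to_follow` are hypothetical helpers, the requested media type is an assumption (servers differ in what they offer), and the description triples are taken from the earlier example rather than from a live response.

```python
from urllib.request import Request

def build_lookup(uri, media_type="text/turtle"):
    """Principles 2 and 3: build (without sending) a GET request for an RDF description."""
    return Request(uri, headers={"Accept": media_type})

def links_to_follow(triples, visited):
    """Principle 4: HTTP(S) URIs mentioned in a description that were not looked up yet."""
    found = set()
    for triple in triples:
        for term in triple:
            if term.startswith(("http://", "https://")) and term not in visited:
                found.add(term)
    return found

start = "http://dbpedia.org/resource/Moby-Dick"
lookup = build_lookup(start)
print(lookup.get_method(), lookup.full_url, lookup.get_header("Accept"))

# Stand-in for the description a server might return for the start URI.
description = [
    (start, "http://dbpedia.org/ontology/author", "http://dbpedia.org/resource/Herman_Melville"),
]
next_uris = links_to_follow(description, visited={start})
print(sorted(next_uris))
```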
Linked Open Data (LOD)
• W3C SWEO (Semantic Web Education and Outreach) Community Project
“The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources”
• Examples
DBpedia: data from Wikipedia, structured using Semantic Web technologies
FOAF (Friend of a Friend): vocabulary for describing persons, their properties and relations
GeoNames: descriptions of geographical features worldwide
LOD Cloud
Explore the LOD cloud (SVG)
Ontology Alignment
Problem: the LOD cloud is far from perfect
• different domains/viewpoints lead to a diversity of vocabularies and conceptualizations
• no links between them
Ontology alignment: establish links between equivalent concepts from different ontologies to enable interoperability
• Multitude of techniques: statistical, linguistic, structural, rule-based etc.
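One of the simplest linguistic techniques is to compare concept labels as strings. The sketch below uses Python's standard `difflib.SequenceMatcher` for that. The two small ontologies, the `align` helper and the threshold of 0.8 are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical concept IRIs (shortened) with their human-readable labels.
ontology_a = {"a:Author": "Author", "a:Book": "Book", "a:Publisher": "Publisher"}
ontology_b = {"b:Writer": "Writer", "b:Books": "books", "b:PublishingHouse": "Publishing house"}

def similarity(label_1, label_2):
    """Case-insensitive string similarity in the range 0..1."""
    return SequenceMatcher(None, label_1.lower(), label_2.lower()).ratio()

def align(onto_1, onto_2, threshold=0.8):
    """For each concept in onto_1, keep its best match in onto_2 if it is similar enough."""
    matches = []
    for concept_1, label_1 in onto_1.items():
        best = max(onto_2, key=lambda c2: similarity(label_1, onto_2[c2]))
        score = similarity(label_1, onto_2[best])
        if score >= threshold:
            matches.append((concept_1, best, round(score, 2)))
    return matches

matches = align(ontology_a, ontology_b)
print(matches)
```

With these labels only Book/books passes the threshold. Author and Writer mean the same thing but share few characters, which is why purely linguistic matching is usually combined with external knowledge (such as WordNet) and structural or statistical methods.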
[Figure: ontology alignment algorithms match Ontology A and Ontology B to produce aligned ontologies, drawing on statistical methods, linguistic methods, a knowledge base and external knowledge (WordNet…)]
UI Layer - Example
• UI Layer enables humans to use semantic Web applications
• Example application: Query Wizard + Visualization Wizard for LOD
Query Wizard: Search for datasets like you search for documents
Visualization Wizard: automatically generate interactive visualizations
Both rely on semantic information
Query Wizard
• Google-like search in Linked Data
• Keyword search + attribute selection (columns)
• Search result: a tabular result set
Visualization Wizard
• 10+ different visualisations
time, geo-spatial, categorical, statistics…
• Automatic visualisation of tabular data sets
depending on data characteristics
and visualization capabilities
• Filtering and aggregation
• Interactive analysis with Multiple Coordinated Views
• Utilize Semantic Web technologies to achieve the goals
Visualization Wizard - Ontologies
• RDF Data Cube Vocabulary (W3C standard)
Represents statistical data as collections of observations
Dimensions: identify observations
Measures: are related to concrete values
• Visual Analytics Vocabulary
Describes visualizations semantically as an OWL ontology
Chart name
Definition of
• visual channels of the visualization: available axes, colors, different icons, item size…
• and their data presentation capabilities: data type, cardinality, persistence
Visualization Wizard - Mapping
• The two ontologies define the relation between data and visualizations
Compute all valid mappings of a data set onto visualizations
These visualizations (and only these) can be created automatically with a single click
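The mapping step can be sketched as a compatibility check between data columns and visual channels. Everything in the following Python example is hypothetical: the chart descriptions, the type names and the data set are invented for illustration and are not the actual terms of the two vocabularies. Only the data type is checked; cardinality and persistence are left out.

```python
from itertools import permutations

# Hypothetical chart descriptions: each visual channel accepts a set of data types.
charts = {
    "Bar Chart": {"x-axis": {"category", "date"}, "y-axis": {"number"}},
    "Geo View": {"location": {"geo"}, "colour": {"number", "category"}},
    "Scatterplot": {"x-axis": {"number"}, "y-axis": {"number"}},
}

# Hypothetical data set description: column name -> data type.
dataset = {"country": "category", "year": "date", "co2": "number"}

def valid_mappings(channels, columns):
    """All assignments of distinct columns to channels whose data types are compatible."""
    names = list(channels)
    result = []
    for combo in permutations(columns, len(names)):
        if all(columns[col] in channels[ch] for ch, col in zip(names, combo)):
            result.append(dict(zip(names, combo)))
    return result

available = {chart: valid_mappings(channels, dataset) for chart, channels in charts.items()}
enabled = [chart for chart, mappings in available.items() if mappings]
print(enabled)
print(available["Bar Chart"])
```

Charts with at least one valid mapping would be offered to the user; the others would be disabled for that data set.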
Visualization Wizard UI
Meaningful charts can be created automatically
• Bar Chart (selected)
• Pie Chart
• Parallel Coordinates
• Geo View
Other charts are disabled
• Cannot be created for that data set
Visualization Wizard – Selection/Filtering
• Multidimensional data
• Data elements are drawn as lines
– Colours encode a property (violet: EU countries)
• Exploration
– Filter along multiple axes (e.g. years, CO2 Tons/P)
– Read values on other axes
– Spot patterns, dependencies
Visualization Wizard – Coordinated Multiple Views
• Interactive analysis using coordinated brushing
Selection in one visualization (scatterplot) reflected in all others (geo)
• Visualising data sets in multiple visualisations
Each visualization specialized for a different data aspect
numerical values, categories, time and geo information…
Visualization Wizard - Aggregation
• Aggregate CO2 emissions and life expectancy
– Average for countries over the years
• Correlation: CO2 emissions and life expectancy
– Outliers: Russia, Saudi Arabia
Visualization Wizard – Aggregation
• Aggregate value (sum) for countries over the years
• Data transformations - such as aggregation - create new data sets!
Visualization Wizard – Coordinated Multiple Views
• Problem: how to link different data sets to enable interactive analysis in multiple visualizations?
• Linking enabled by semantic information (through “being about the same thing”)
for data sets created by data transformations
and for any data set loaded from the web
• Examples of insights obtained through semantic coordinated brushing
Countries with lowest funding lie in Eastern Europe
Their funding is increasing over the years, but so is that of the other countries
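The linking idea can be sketched as follows: two data sets with different columns can still be brushed together if their rows carry identifiers for the entities they are about. In this Python example the data sets, the `example.org` IRIs, the placeholder values and the `brush` helper are all made up for illustration and do not reflect the Wizard's actual implementation.

```python
# Two hypothetical data sets that share no column layout, only entity IRIs.
emissions = [
    {"entity": "http://example.org/country/A", "year": 2010, "value": 1.0},
    {"entity": "http://example.org/country/B", "year": 2010, "value": 2.0},
]
averages = [  # e.g. a new data set produced by an aggregation step
    {"entity": "http://example.org/country/A", "mean": 1.5},
    {"entity": "http://example.org/country/C", "mean": 3.0},
]

def brush(selected_rows, other_rows, key="entity"):
    """Return the rows of the other view that are about the same entities as the selection."""
    selected_entities = {row[key] for row in selected_rows}
    return [row for row in other_rows if row[key] in selected_entities]

# The user selects (brushes) low values in the first view ...
selection = [row for row in emissions if row["value"] < 1.5]
# ... and the second view highlights the rows about the same entities.
highlighted = brush(selection, averages)
print(highlighted)
```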