Semantic Web questions we couldn't ask 10 years ago

DESCRIPTION

Talk given at the SSSW 2013 Semantic Web Summer School. Part 1: What is the "Semantic Web" (in 4 principles and 1 movie). Part 2: What questions can we ask now that we couldn't ask 10 years ago? Part 3: Treat Computer Science as a *science*, not just as engineering! (This part is a short version of http://slidesha.re/SaUhS4 )

Frank van Harmelen

All the questions we couldn’t ask

10 years ago

Creative Commons License: allowed to share & remix, but must attribute & non-commercial

The bad news: you’re going to get 3 talks

1. Where are we now?
   – The Semantic Web in 4 principles & a movie
   – Did we get anywhere?

2. Now what?
   – Questions we couldn’t ask 10 years ago

3. Methodological hobby horse
   – Science or engineering?

Semantic Web: What is it?

a web page in English

about Frank

And this page is about

LarKC

and another web page

about Frank

And this page is about

Stefano

This page is about the Vrije Universiteit

“The Semantic Web” a.k.a. “The Web of Data”

http://www.youtube.com/watch?v=tBSdYi4EY3s

P1. Give all things a name

P2. Relations form a graph between things

P3. The names are addresses on the Web

x T

[<x> IsOfType <T>]

different owners & locations

<analgesic>

P1+P2+P3 = Giant Global Graph
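
The three principles above can be sketched in a few lines of Python. This is a minimal illustration only: it uses plain tuples instead of a real RDF library, and the example.org URIs and the relation names worksOn and employedBy are hypothetical, chosen just to show the shape of the data.

```python
# Hypothetical URIs: the example.org names are assumptions for illustration.
EX = "http://example.org/"

# P1: every thing gets a name. P3: that name is an address on the Web (a URI).
frank = EX + "Frank"
larkc = EX + "LarKC"
vu = EX + "VrijeUniversiteit"

# P2: relations between things are (subject, predicate, object) triples,
# and a set of triples is a graph.
page_one = {
    (frank, EX + "worksOn", larkc),
    (larkc, EX + "IsOfType", EX + "Project"),
}
page_two = {
    (frank, EX + "employedBy", vu),
}

def merge(*graphs):
    """Graphs with different owners and locations merge by set union,
    because a shared URI names the same node in each of them."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

def neighbours(graph, node):
    """Return every thing directly linked from node, over any relation."""
    return {o for (s, p, o) in graph if s == node}

giant_graph = merge(page_one, page_two)
print(sorted(neighbours(giant_graph, frank)))
```

Because both pages use the same URI for Frank, the union connects them without any extra mapping step, which is the point of P1+P2+P3.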

P4. explicit & formal semantics

• assign types to things
• assign types to relations
• organise types in a hierarchy
• impose constraints on possible interpretations

Examples of “semantics”

Semantics = predictable inference

Frank married-to Lynda

Frank married-to Hazel

• Frank is male
• married-to relates males to females
• married-to relates 1 male to 1 female
• Lynda = Hazel

lower bound / upper bound
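
The married-to example can be made concrete with a small sketch of "predictable inference". It assumes the two facts are "Frank married-to Lynda" and "Frank married-to Hazel", and it encodes the slide's constraints as hypothetical DOMAIN, RANGE and FUNCTIONAL tables; this is a toy rule engine, not an actual RDFS or OWL reasoner.

```python
# Facts assumed from the slide.
facts = {
    ("Frank", "married-to", "Lynda"),
    ("Frank", "married-to", "Hazel"),
}

# Hypothetical schema tables encoding the slide's constraints.
DOMAIN = {"married-to": "Male"}    # married-to relates males ...
RANGE = {"married-to": "Female"}   # ... to females
FUNCTIONAL = {"married-to"}        # 1 male is related to 1 female

def infer_types(facts):
    """Derive (thing, type) pairs from the domain and range of each relation."""
    types = set()
    for s, p, o in facts:
        if p in DOMAIN:
            types.add((s, DOMAIN[p]))
        if p in RANGE:
            types.add((o, RANGE[p]))
    return types

def infer_same(facts):
    """If a functional relation links one subject to two object names,
    those two names must denote the same thing."""
    same = set()
    for s1, p1, o1 in facts:
        for s2, p2, o2 in facts:
            if p1 == p2 and p1 in FUNCTIONAL and s1 == s2 and o1 != o2:
                same.add(frozenset((o1, o2)))
    return same

print(sorted(infer_types(facts)))
print([sorted(pair) for pair in infer_same(facts)])
```

Note that the two married-to facts are not treated as a contradiction: without an assumption that different names denote different things, the functional constraint yields Lynda = Hazel instead.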

Semantic Web: Where are we now?

Did we get anywhere?

• Google = meaningful search
• NXP = data integration
• BBC = content re-use
• Walmart = SEO (RDF-a)
• data.gov = data-publishing

NXP: data integration about 26,000 products

Triple store

Triple store

Departments

Customers

Notice the 3-layer architecture

BBC

Notice the 3-layer architecture

Did we get anywhere?

• Google = meaningful search

• NXP = data integration

• BBC = content re-use

• BestBuy = SEO (RDF-a)

• data.gov = data-publishing

Oracle DB, IBM DB2

Reuters, New York Times, Guardian

Sears, Kmart, OverStock, Volkswagen, Renault

GoodRelations ontology, schema.org

Size Matters: 25-45 billion facts

The questions that we couldn’t ask

10 years ago

• Heterogeneity
• Self-organisation, long tails
• Distribution
• Provenance & trust
• Dynamics
• Errors & Noise
• Scale

Heterogeneity is unavoidable

• Linguistic,
• Structural,
• Logical,
• Statistical,
• ....

Socio-economic

first to market

market-share

Self-organisation

Bio-medical ontologies in Bio-portal > 5 links

Knowledge follows a long tail

incidental or universal?

impact on mapping?

impact on reasoning?

impact on storage?

Distribution

Caching?

Subgraphs?

Payload priority?

query-planning?

Provenance

Representation?

From provenance to trust?

(Re)construction?

knowledge about knowledge?

Dynamics

Streams? Incremental reasoning?

Non-monotonicity?

versioning?

Errors & noise

Maximally consistent subsets?

Fuzzy Semantics?

Uncertainty Semantics?

Rough Semantics?

Modules?

Repair?

Argumentation?

Maximally consistent subsets?

Modules?

Repair?

Argumentation?

Fuzzy Semantics?

Uncertainty Semantics?

Rough Semantics?

Streams?

Incremental reasoning?

Non-monotonicity?

versioning?

Representation?

From provenance to trust?

(Re)construction?

knowledge about knowledge?

Caching?

Subgraphs?

Payload priority?

incidental or universal?

impact on mapping?

impact on reasoning?

impact on storage?

Socio-economic

first to market

market-share

Methodological hobby horse

Laws about the physical universe

Laws about the information universe ?

Knowledge follows a long tail

Law: F = a-br

Law: |T| << |A|

T = terminological knowledge

A = assertional knowledge

Dataset            Closure of T   Closure of T + A   Ratio
LUBM               8sec           1h15min            562
Linked Life Data   332sec         1h05min            11
FactForge          89sec          2h45min            111
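
The Ratio column can be checked with a short calculation. The sketch below assumes the ratio is the closure time of T + A divided by the closure time of T alone, truncated to a whole number; the helper to_seconds and the timings dictionary are illustrative names, not part of the talk.

```python
import re

def to_seconds(text):
    """Convert a duration such as '8sec' or '1h15min' to seconds."""
    total = 0
    for pattern, factor in ((r"(\d+)h", 3600), (r"(\d+)min", 60), (r"(\d+)sec", 1)):
        match = re.search(pattern, text)
        if match:
            total += int(match.group(1)) * factor
    return total

# Timings copied from the table: (closure of T, closure of T + A).
timings = {
    "LUBM": ("8sec", "1h15min"),
    "Linked Life Data": ("332sec", "1h05min"),
    "FactForge": ("89sec", "2h45min"),
}

# Assumed definition: whole-number ratio of the two closure times.
ratios = {
    name: to_seconds(t_plus_a) // to_seconds(t_only)
    for name, (t_only, t_plus_a) in timings.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

Under that assumption the computed values reproduce the table (562, 11 and 111): closing the small terminological part T alone is far cheaper than closing it together with the much larger assertional part A.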

We don’t have any good laws on complexity
