The Web of Linked Data Information Universe Seongmin Lim [email protected] Dept. of Industrial Engineering Seoul National University


Page 1: The Web of Linked Data Information Universe

The Web of Linked Data
Information Universe

Seongmin Lim
[email protected]

Dept. of Industrial Engineering
Seoul National University

Page 2: The Web of Linked Data Information Universe

Contents

Foundations of Dataspaces and Linked Data
- Where do they overlap?

The Web of Linked Data
- What data is out there?

Linked Data Applications
- What is being done with the data?

Remarks on
- Identity
- Self-descriptive Data
- Pay-as-you-go Integration

Page 3: The Web of Linked Data Information Universe

From data integration systems to dataspaces

Motivation: to cope with the growing number of data sources

Properties of dataspaces
- may contain any kind of data (structured, semi-structured, unstructured)
- require no upfront investment in a global schema
- provide for data coexistence
- give best-effort answers to queries
- rely on pay-as-you-go data integration

Page 4: The Web of Linked Data Information Universe

Linked data principles

For publishing structured data on the general Web (Tim Berners-Lee)
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful RDF information.
4. Include RDF statements that link to other URIs so that they can discover related things.
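Principles 2 and 3 amount to an HTTP lookup in which the client asks for an RDF representation. The following is a minimal sketch that only builds such a request and does not send it. The function name `rdf_request` and the preferred media types are assumptions made for illustration; the example URI is the DBpedia identifier quoted later in the deck.

```python
from urllib.request import Request

def rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request asking for RDF about `uri`."""
    # Assumption: the server offers these RDF media types via content negotiation.
    accept = "text/turtle, application/rdf+xml;q=0.9"
    return Request(uri, headers={"Accept": accept})

req = rdf_request("http://dbpedia.org/resource/Calgary")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the sketch runs without network access.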

Page 5: The Web of Linked Data Information Universe

From the classic Web to Web 2.0

Classic Web: single global information space
1. Small set of simple standards
2. Hyperlinks to connect everything

Web 2.0: no single global dataspace
1. APIs have proprietary interfaces
2. Mashups draw on a fixed set of data sources
3. No hyperlinks between different APIs

Page 6: The Web of Linked Data Information Universe

Web APIs slice the Web into Walled Gardens

Page 7: The Web of Linked Data Information Universe

Can’t we just publish data as files?

PDF
- Easy to read and publish

Excel
- Allows further processing and analysis

CSV
- Processing without need for proprietary tools

But…
- Structure of the data is not explained
- No connection between different data sets; they remain silos
- Static and fixed: can’t retrieve just the slices relevant to a problem

Page 8: The Web of Linked Data Information Universe

Linked data

Extend the Web with a single global dataspace
- By using RDF to publish structured data on the Web
- By setting links between data items within different data sources

Page 9: The Web of Linked Data Information Universe

What is RDF?

Resource Description Framework

RDF is the data format for linked data

It’s about writing down relations between things

What is RDF for?
- For everyone to do the same for data
- To make the Web into a database

Page 10: The Web of Linked Data Information Universe

The essence of RDF: the ‘triple’

[Figure: a typical database table, labelled with "things" and "properties"]
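The step from a table to triples can be sketched as follows. The sketch assumes that each row of the table describes one thing and each column is one property, so every cell becomes one (subject, predicate, object) triple. The table contents and the helper name `row_to_triples` are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

def row_to_triples(thing: str, row: Dict[str, str]) -> List[Triple]:
    """Turn one table row into (subject, predicate, object) triples."""
    return [(thing, prop, value) for prop, value in row.items()]

# Hypothetical table: keys are the "things", column names are the "properties".
table = {
    "Calgary": {"country": "Canada", "type": "city"},
}

triples = [t for thing, row in table.items() for t in row_to_triples(thing, row)]
for t in triples:
    print(t)
```

In real RDF the subject and predicate would be URIs rather than bare strings; plain strings are used here only to keep the correspondence with the table visible.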

Page 11: The Web of Linked Data Information Universe


Relations between ‘things’

Page 12: The Web of Linked Data Information Universe

Using the Web’s infrastructure

Entities are identified with HTTP URIs
- Specifically http://

Page 13: The Web of Linked Data Information Universe


Page 14: The Web of Linked Data Information Universe


Page 15: The Web of Linked Data Information Universe

Properties of the Web of linked data

Global, distributed dataspace built on a simple set of standards
- RDF, URIs, HTTP

Entities are connected by links
- enables the discovery of new data sources

Provides for data coexistence
- Everyone can publish data to the Web of Linked Data
- Everyone can express their personal view on things
- Everyone can use the schemata that they like for this

Page 16: The Web of Linked Data Information Universe

W3C Linking Open Data project
- Publish existing open-license datasets as linked data
- Interlink things between different data sources

2007

Page 17: The Web of Linked Data Information Universe


LOD datasets on the Web: July 2009

Page 18: The Web of Linked Data Information Universe

DBpedia

Community effort to extract structured information from Wikipedia

Provides data about 3.4 million things
- 312,000 persons
- 140,000 organizations
- 413,000 places
- 94,000 music albums
- 49,000 films
- 146,000 species
- …

Provides identifiers for many common things
- http://dbpedia.org/resource/Calgary

Overlaps with many other data sources on the Web
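Identifiers such as the Calgary example appear to be formed from the title of the corresponding Wikipedia article. A minimal sketch of that naming pattern follows. It assumes that spaces become underscores and ignores any further escaping rules, so it should be read as an illustration rather than as DBpedia's exact procedure. The function name `dbpedia_uri` is hypothetical.

```python
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def dbpedia_uri(title: str) -> str:
    """Map an article title to a DBpedia-style resource URI (simplified)."""
    # Assumption: spaces become underscores; no other escaping is attempted.
    return DBPEDIA_RESOURCE + title.strip().replace(" ", "_")

print(dbpedia_uri("Calgary"))
```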

Page 19: The Web of Linked Data Information Universe

Uptake in many areas

Uptake in life sciences
- W3C Linking Open Drug Data effort
- Bio2RDF project
- Allen Brain Atlas

Governments, libraries, media industry, …

Page 20: The Web of Linked Data Information Universe

The structural continuum

The Web of linked data is interwoven with the classic Web.
- Unstructured data: HTML
- Semi-structured data: RDFa embedded into HTML
- Structured data: RDF/XML

Services using named entity recognition to annotate texts with Linked Data URIs
- OpenCalais (Thomson Reuters) for news
- Zemanta (startup) for blog posts
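The semi-structured middle of this continuum can be illustrated with a small sketch that pulls statements out of RDFa attributes in an HTML fragment. It handles only the `about` and `property` attributes and ignores element nesting, so it is a deliberately simplified illustration and not a conforming RDFa processor. The class name `RDFaSketch` and the page fragment are hypothetical.

```python
from html.parser import HTMLParser

class RDFaSketch(HTMLParser):
    """Collect (subject, property, literal) triples from `about`/`property`.

    Simplification: the most recent `about` value is used as the subject and
    element nesting is ignored.
    """

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = None   # value of the last `about` attribute seen
        self._pending = None   # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self._subject = attrs["about"]
        if "property" in attrs and self._subject is not None:
            self._pending = attrs["property"]

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self.triples.append((self._subject, self._pending, data.strip()))
            self._pending = None

# Hypothetical page fragment mixing plain HTML text with RDFa annotations.
page = (
    '<div about="http://dbpedia.org/resource/Calgary">'
    '<span property="rdfs:label">Calgary</span> is a city in Canada.'
    "</div>"
)

parser = RDFaSketch()
parser.feed(page)
parser.close()
print(parser.triples)
```

The unannotated text "is a city in Canada." stays ordinary HTML for human readers, while the annotated span yields a machine-readable statement.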

Page 21: The Web of Linked Data Information Universe


Page 22: The Web of Linked Data Information Universe

Linked data browsers

Provide for navigating between data sources in order to explore the dataspace.
- Tabulator Browser (MIT, USA)
- Marbles (FU Berlin, DE)
- OpenLink RDF Browser (OpenLink, UK)
- Zitgist RDF Browser (Zitgist, USA)
- Disco Hyperdata Browser (FU Berlin, DE)
- Fenfire (DERI, Ireland)
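The navigation such browsers offer can be sketched as a breadth-first walk: look up a URI, read the triples that come back, and follow every URI that appears as an object. The sketch below simulates the Web with an in-memory dictionary, so no network access is involved. It is not taken from any of the listed browsers; the `WEB` contents, the `ex:` property names and the function name `explore` are all hypothetical.

```python
from collections import deque

# Hypothetical stand-in for the Web: URI -> triples returned when it is looked up.
WEB = {
    "http://example.org/a": [
        ("http://example.org/a", "ex:knows", "http://example.org/b"),
    ],
    "http://example.org/b": [
        ("http://example.org/b", "ex:livesIn", "http://example.com/city"),
    ],
    "http://example.com/city": [
        ("http://example.com/city", "ex:label", "Some city"),
    ],
}

def explore(start, max_lookups=10):
    """Look up `start`, then follow every URI found in object position."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_lookups:
        uri = queue.popleft()
        visited.append(uri)
        for _s, _p, obj in WEB.get(uri, []):  # simulated dereferencing
            if obj.startswith("http://") and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return visited

print(explore("http://example.org/a"))
```

Starting from a single URI, the walk crosses from example.org to example.com purely by following links, which is how a new data source gets discovered.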

Page 23: The Web of Linked Data Information Universe


Page 24: The Web of Linked Data Information Universe

Mashups (DBpedia Mobile)

Page 25: The Web of Linked Data Information Universe

Web of data search engines

Crawl the dataspace and provide best-effort query answers over crawled data.
- Falcons (IWS, China)
- Sig.ma (DERI, Ireland)
- Swoogle (UMBC, USA)
- VisiNav (DERI, Ireland)
- Watson (Open University, UK)
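Best-effort answering over crawled data can be sketched as simple pattern matching: the engine can only answer from the triples it has collected so far. The following sketch is not how any of the listed engines is implemented; the crawled triples, the `ex:` names and the function name `match` are hypothetical.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Hypothetical crawl result: triples gathered from different sources.
crawled = [
    ("ex:calgary", "ex:country", "ex:canada"),
    ("ex:calgary", "ex:label", "Calgary"),
    ("ex:seoul", "ex:country", "ex:south_korea"),
]

print(match(crawled, p="ex:country"))
print(match(crawled, s="ex:busan"))  # not crawled yet, so the answer is empty
```

The second query returns nothing, not because no such data exists on the Web, but because it has not been crawled; this is the sense in which the answers are best-effort.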

Page 26: The Web of Linked Data Information Universe


Page 27: The Web of Linked Data Information Universe

What are the big players doing?

Yahoo! and Google have started to crawl Linked Data in its RDFa serialization as well as Microformats.

Yahoo!
- provides access to crawled data through the Yahoo BOSS API
- is using the data within Yahoo SearchMonkey to make search results more useful and visually appealing

Google
- uses crawled RDF data for its Social Graph API
- uses crawled data to enhance search result snippets for reviews and people

Page 28: The Web of Linked Data Information Universe

Yahoo! SearchMonkey

Page 29: The Web of Linked Data Information Universe


Page 30: The Web of Linked Data Information Universe

Identity

Real-world objects are identified with multiple URIs
- Coupling of identification and retrieval
- Data coexistence: everybody can say anything about anything
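When several URIs name the same real-world object, a consumer can group them once identity links between them are known. The sketch below assumes such links are available as pairs (for example from `owl:sameAs` statements, which the slide does not mention explicitly) and merges them with a union-find structure. All URIs except the DBpedia one are hypothetical, as is the function name `merge_identities`.

```python
def merge_identities(same_as_links):
    """Group URIs into sets that, by the given links, denote the same object."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in same_as_links:
        parent[find(a)] = find(b)

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())

# Hypothetical identity links published by different parties.
links = [
    ("http://dbpedia.org/resource/Calgary", "http://example.org/places/calgary"),
    ("http://example.org/places/calgary", "http://example.com/id/123"),
    ("http://example.org/x", "http://example.org/y"),
]

for group in merge_identities(links):
    print(sorted(group))
```

Each publisher keeps its own URI and its own statements; the grouping happens on the consumer side, which is what lets differing descriptions coexist.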

Page 31: The Web of Linked Data Information Universe

Enable clients to retrieve the schema

Clients can resolve the URIs that identify vocabulary terms in order to get their RDFS or OWL definitions.
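Resolving a term URI starts with working out which document to fetch. A minimal sketch, assuming the two common styles of term URI: a hash URI, whose fragment is dropped before the request is made, and a slash URI, which is fetched as-is. The function name `schema_document` is hypothetical; the two term URIs are from SKOS and FOAF.

```python
from urllib.parse import urldefrag

def schema_document(term_uri: str) -> str:
    """Return the URL a client would fetch to look up a vocabulary term."""
    # The fragment (the part after '#') is not sent to the server, so a hash
    # URI resolves to its enclosing document; a slash URI is unchanged.
    url, _fragment = urldefrag(term_uri)
    return url

print(schema_document("http://www.w3.org/2004/02/skos/core#Concept"))
print(schema_document("http://xmlns.com/foaf/0.1/name"))
```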

Page 32: The Web of Linked Data Information Universe

Reuse terms from common vocabularies

Common vocabularies
- Friend-of-a-Friend (FOAF) for describing people and their social network
- SIOC for describing forums and blogs
- SKOS for representing topic taxonomies
- Organization Ontology for describing the structure of organizations
- GoodRelations for describing products and business entities
- Music Ontology for describing artists, albums, and performances
- Review Vocabulary for representing reviews

Common sources of identifiers (URIs) for real-world objects
- LinkedGeoData and GeoNames: locations
- GeneID and UniProt: life science identifiers
- DBpedia: wide range of things
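In practice, reusing a vocabulary means writing its terms as full URIs, usually via a prefix. The sketch below expands prefixed names into URIs. The namespace URIs are the ones commonly published for these vocabularies and are not given on the slide, so they should be checked against each vocabulary's own documentation; the prefix labels and the function name `expand` are illustrative choices.

```python
# Assumption: commonly used namespace URIs; verify against each vocabulary.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "org": "http://www.w3.org/ns/org#",
    "gr": "http://purl.org/goodrelations/v1#",
    "mo": "http://purl.org/ontology/mo/",
    "rev": "http://purl.org/stuff/rev#",
}

def expand(curie: str) -> str:
    """Expand a prefixed name such as 'foaf:name' into a full term URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local

print(expand("foaf:name"))
print(expand("skos:Concept"))
```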

Page 33: The Web of Linked Data Information Universe

Somebody Pays-As-You-Go

The overall data integration effort is split between the data publisher, the data consumer and third parties.

Data Publisher
- publishes data as RDF
- publishes data in a self-descriptive fashion
- sets links and publishes mappings

Third Parties
- set links pointing at your data
- publish mappings to the Web

Data Consumer
- has to do the rest
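The division of labour above can be sketched from the consumer's side: mappings that others have already published are applied first, and whatever they do not cover is left for the consumer. The data, the mapping and the function name `apply_mappings` are hypothetical, and a real mapping would be expressed in RDF rather than as a Python dictionary.

```python
def apply_mappings(triples, mappings):
    """Rewrite predicates to the consumer's preferred terms where a mapping exists."""
    return [(s, mappings.get(p, p), o) for s, p, o in triples]

# Hypothetical data from a publisher that uses its own vocabulary (prefix ex:).
published = [
    ("ex:alice", "ex:fullName", "Alice"),
    ("ex:alice", "ex:shoeSize", "38"),
]

# Hypothetical mapping, published by the data publisher or a third party.
mappings = {"ex:fullName": "foaf:name"}

integrated = apply_mappings(published, mappings)
print(integrated)
```

The unmapped `ex:shoeSize` statement is still usable as-is; integrating it further is part of "the rest" that the consumer pays for only when it is needed.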

Page 34: The Web of Linked Data Information Universe

Summary

Linked Data moves the dataspace vision to a global scale and adds the social/community aspect to it.

The Web of Linked Data is growing rapidly
- active deployment communities in different domains
- might have exceeded the critical mass

Great playground for experimentation
- dataspace profiling
- probabilistic and approximate schema mapping
- data fusion, data quality, and trust
- What will the user interfaces look like?
- Will search engines turn into answer engines?

Page 35: The Web of Linked Data Information Universe

End of Document

Seongmin Lim
[email protected]

Dept. of Industrial Engineering
Seoul National University