This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)

Dr. Harald Sack

Hasso Plattner Institute for IT Systems Engineering

University of Potsdam

Spring 2013

Semantic Web Technologies

Lecture 6: Applications in the Web of Data

07: Semantic Search



Semantic Web Technologies, Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam

Lecture 6: Applications in the Web of Data

Open HPI - Course: Semantic Web Technologies

07 - Semantic Search

Open HPI - Course: Semantic Web Technologies - Lecture 6: Applications in the Web of Data

Meaning

Figure: semiotic triangle. A Symbol (here the word "Armstrong") symbolizes a Concept, the Concept refers to an Object, and the Symbol stands for the Object. Sender and receiver each interpret the symbol in the light of Context, Pragmatics, and Experience.

Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)

http://commons.wikimedia.org/wiki/User:McSmit


Armstrong

Entities and Ontologies

Figure: the entity Neil Armstrong (http://dbpedia.org/resource/Neil_Armstrong) linked to the classes of an ontology. Node labels: Neil Armstrong, Astronaut, Person, Science Occupation, Employment, Kosmonaut. Edge labels: is a, is NOT a, has an, same as, subClassOf.

Classical Information Retrieval

(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)

Figure: a set of documents (files of records) is mapped by indexing, and a set of queries (information requests) is mapped by query formulation, into a common indexing language, in which the similarity between documents and queries is determined.

Classical Information Retrieval (simplified version)

Figure: a set of documents (illustrated by an excerpt from a German dictionary entry on the verb "to search", [Bd. 20, Sp. 835]) is reduced to keywords that are stored in a search index. The search term(s) of a search query (here: "search"?) are matched against this index.
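The simplified model above can be sketched as a tiny inverted index. This is only an illustration under stated assumptions: the three documents, the whitespace tokenizer, and the function names `build_index` and `search` are hypothetical and not part of the lecture.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase keyword to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            # Strip surrounding punctuation so "Moon." and "Moon" match.
            index[token.strip('.,;:!?"()')].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every search term (boolean AND)."""
    terms = [t.lower() for t in query.split()]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy documents, not taken from the slides.
documents = {
    "d1": "Neil Armstrong walked on the Moon.",
    "d2": "Louis Armstrong played the trumpet.",
    "d3": "The Moon orbits the Earth.",
}
index = build_index(documents)
print(sorted(search(index, "Armstrong")))
print(sorted(search(index, "Armstrong Moon")))
```

The query "Armstrong" matches both the astronaut and the musician, which is exactly the ambiguity of pure keyword matching that semantic search tries to address.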

Evaluation of Information Retrieval Systems

R: relevant documents

P: retrieved documents

R ∩ P: relevant documents that have been retrieved

$$\text{Recall} = \frac{|R \cap P|}{|R|}$$

$$\text{Precision} = \frac{|R \cap P|}{|P|}$$

$$F_\alpha = \frac{(1+\alpha)\cdot(\text{Recall}\cdot\text{Precision})}{\alpha\cdot(\text{Recall}+\text{Precision})}$$
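A minimal sketch of these measures on sets of document ids. The ids and the function name `evaluate` are hypothetical. For the F-measure the sketch uses the common weighted form (1 + β²)·P·R / (β²·P + R), which is an assumption of this example rather than a transcription of the slide; for equal weighting (β = 1) it gives 2·P·R / (P + R), which is also what the F formula above yields for α = 1.

```python
def evaluate(relevant, retrieved, beta=1.0):
    """Return (recall, precision, f_measure) for two collections of document ids."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)  # |R ∩ P|
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    # Weighted harmonic mean of precision and recall (assumed standard form).
    denominator = beta ** 2 * precision + recall
    f_measure = (1 + beta ** 2) * precision * recall / denominator if denominator else 0.0
    return recall, precision, f_measure


# Hypothetical example: 4 relevant documents, 3 retrieved, 2 of them relevant.
R = {"d1", "d2", "d3", "d4"}
P = {"d3", "d4", "d5"}
recall, precision, f1 = evaluate(R, P)
print(f"recall={recall:.3f} precision={precision:.3f} F1={f1:.3f}")
```

With these sets, recall is 2/4 and precision is 2/3, so the balanced F-measure is 4/7.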

Semantic Search (one of many definitions...)

• Annotation of (text-based) metadata with semantic entities

• Entity-based Information Retrieval

• Make use of semantic relations, e.g. content-based similarities of relationships

• Interoperable metadata via semantic annotations
  • for content-based description
  • for structural / technical description (Multimedia Ontologies)

Overall Goal: Quantitative and qualitative improvement of Information Retrieval

Semantic Search

Semantic metadata enable improvement of traditional keyword-based retrieval by

(1) Query String Extension/Refinement: enables more precise or more complete search results

(2) Cross Referencing: complements search results with additional associated or similar information

(3) Exploratory Search: enables visualization and navigation of the search space

(4) Reasoning: complements search results with implicitly given information

Image: Turmbau zu Babel (The Tower of Babel), Pieter Brueghel, 1563

Semantic Search: Query String Extension

• Keyword-based search does not deliver all search results that are relevant for a query, because relevant documents may describe the queried content with synonyms or metaphors instead of the query terms.

• Extension of the original query string (Query Extension)
  • from dictionaries and thesauri
    • extend query with synonyms, hyponyms, etc.
  • from domain ontologies
    • extend query with meronyms, related concepts, etc.

Original query string: Bank

Possible extensions: Bank ∨ depository financial institution ∨ credit union ∨ acquirer ∨ federal reserve ∨ ...

Effect: increases recall
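Query extension can be sketched as OR-ing the original term with its thesaurus entries. The expansion terms for "bank" are the ones listed above; the thesaurus structure, the four documents, and the function names are hypothetical, and matching is done by naive substring search to keep the example short.

```python
# Hypothetical thesaurus; the entries for "bank" are the extension terms listed above.
THESAURUS = {
    "bank": [
        "depository financial institution",
        "credit union",
        "acquirer",
        "federal reserve",
    ],
}


def expand_query(term, thesaurus):
    """Return the original term followed by its extension terms (to be OR-ed)."""
    return [term] + thesaurus.get(term.lower(), [])


def search_any(documents, terms):
    """Return ids of documents containing at least one term (boolean OR)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(term.lower() in text.lower() for term in terms)
    }


# Hypothetical toy documents.
documents = {
    "d1": "The bank raised its interest rates.",
    "d2": "A credit union is owned by its members.",
    "d3": "The Federal Reserve published new figures.",
    "d4": "Apollo 11 landed on the Moon.",
}
plain = search_any(documents, ["Bank"])
expanded = search_any(documents, expand_query("Bank", THESAURUS))
print(sorted(plain), sorted(expanded))
```

The extended query retrieves two additional documents that never mention the word "bank", which is how extension raises recall.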

Semantic Search: Query String Refinement

• Keyword-based search also delivers search results that are not relevant for a query, because query terms and document terms might be ambiguous.

• Refinement of the original query string (Query Refinement)
  • from dictionaries and thesauri
    • disambiguate polysemous terms with hypernyms
  • from domain ontologies
    • disambiguate polysemous terms with holonyms

Original query string: Bank

Possible refinements:
(1) Bank ∧ financial institution
(2) Bank ∧ incline ∧ slope ∧ side
(3) Bank ∧ container
(4) Bank ∧ deposit ∧ repository

Effect: increases precision
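Query refinement can be sketched as AND-ing the ambiguous term with the context terms of one chosen sense. The context terms follow refinements (1) and (2) above; the sense labels, the documents, and the function names are hypothetical, and matching is again naive substring search.

```python
# Hypothetical sense inventory; the context terms follow refinements (1) and (2) above.
SENSES = {
    "bank": {
        "finance": ["financial institution"],
        "riverside": ["incline", "slope", "side"],
    },
}


def refine_query(term, sense, senses):
    """Return the term plus the context terms of the chosen sense (to be AND-ed)."""
    return [term] + senses[term.lower()][sense]


def search_all(documents, terms):
    """Return ids of documents containing every term (boolean AND)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if all(term.lower() in text.lower() for term in terms)
    }


# Hypothetical toy documents.
documents = {
    "d1": "The bank is a financial institution that grants loans.",
    "d2": "The river bank has a steep slope on one side and a gentle incline on the other.",
    "d3": "Apollo 11 landed on the Moon.",
}
unrefined = search_all(documents, ["Bank"])
finance = search_all(documents, refine_query("Bank", "finance", SENSES))
riverside = search_all(documents, refine_query("Bank", "riverside", SENSES))
print(sorted(unrefined), sorted(finance), sorted(riverside))
```

Each refined query keeps only the documents of one sense, which is how refinement raises precision.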

Semantic Search: Cross Referencing

• Provide search results that do not literally contain the query string but are closely related to the query by content

• Apply domain ontologies for determining related concepts

• Apply statistical analysis of large (text) document corpora

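Figure: the query string "Neil Armstrong" is mapped by named entity recognition (NER) to dbpedia:Neil_Armstrong, which is linked via dbprop:mission to dbpedia:Apollo_11. dbpedia:Buzz_Aldrin and dbpedia:Michael_Collins are linked to dbpedia:Apollo_11 via dbprop:mission as well.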
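The cross-referencing example can be sketched over a handful of triples. The triples are read off the figure (the edge directions are an assumption); the label lookup that stands in for NER and the function names are hypothetical simplifications.

```python
# Triples as read from the figure; edge directions are an assumption.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbprop:mission", "dbpedia:Apollo_11"),
]

# Hypothetical stand-in for NER: a plain label-to-entity dictionary.
LABELS = {"neil armstrong": "dbpedia:Neil_Armstrong"}


def recognize_entity(query):
    """Map a query string to an entity URI, or None if no label matches."""
    return LABELS.get(query.strip().lower())


def cross_references(entity, triples):
    """Return objects linked from the entity plus other subjects sharing that link."""
    related = set()
    for subject, predicate, obj in triples:
        if subject != entity:
            continue
        related.add(obj)
        # Other resources with the same predicate and object, e.g. crew mates.
        related.update(
            s for s, p, o in triples if p == predicate and o == obj and s != entity
        )
    return related


entity = recognize_entity("Neil Armstrong")
print(sorted(cross_references(entity, TRIPLES)))
```

None of the returned resources has to contain the string "Neil Armstrong"; they are found through the shared dbprop:mission link.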

Semantic Search: Exploratory Search

• Provide additional search results that do not necessarily contain the query string but are related by content to the query or to the search results of the direct query

• Apply domain ontologies and heuristics to determine the relevance of facts

Figure: RDF graph around dbpedia:Neil_Armstrong. Nodes: dbpedia:Neil_Armstrong, dbpedia:Apollo_11, dbpedia:Apollo_13, dbpedia:Space_Shuttle_Challenger, category:Apollo_program, yago:Space_accidents_and_incidents. Edge labels: dbpedia-owl:mission, dcterms:subject, rdf:type.
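Exploratory search over such a graph can be sketched as a breadth-first traversal that reports how many hops each resource is away from the starting entity. The node and edge labels come from the figure, but which nodes each edge connects is an assumption of this sketch, as is the use of hop distance as a crude relevance heuristic. The function name `explore` is hypothetical.

```python
from collections import deque

# Node and edge labels are from the figure; the pairing of edges to nodes is assumed.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type", "yago:Space_accidents_and_incidents"),
]


def explore(start, triples, max_hops):
    """Return {resource: hop distance} for resources within max_hops of start."""
    # Treat the graph as undirected so that shared categories connect resources.
    neighbours = {}
    for subject, _predicate, obj in triples:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]
    return distance


for resource, hops in sorted(explore("dbpedia:Neil_Armstrong", TRIPLES, 3).items()):
    print(hops, resource)
```

Raising `max_hops` widens the explored neighbourhood; under the assumed edges, dbpedia:Space_Shuttle_Challenger only appears at five hops, which illustrates why heuristics are needed to decide how far related facts remain relevant.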

Semantic Search: Reasoning

• Provide additional search results (and information) that do not necessarily contain the query string but are related to the query by content, where the relation need not be direct but can be derived via entailment.

• Apply domain ontologies, reasoning algorithms and heuristics to find new facts and determine the relevance of facts

Example: query string = Neil Armstrong

Figure: RDF graph around dbpedia:Neil_Armstrong. Nodes: dbpedia:Neil_Armstrong, dbpedia:Apollo_11, dbpedia:Moon, category:Missions_to_the_Moon, category:Exploration_of_the_Moon, category:Moon, category:Spaceflight, category:Animals_in_Space. Edge labels: dbpedia-owl:mission, dcterms:subject, skos:broader.

(Hard) questions to solve via reasoning:
• Will there be the Moon or documents about the Moon in the search results?
• How is Neil Armstrong related to the Moon? (Is he?)
• Was Neil Armstrong (really) on the Moon?
• ...
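One way to approach the second question is to follow skos:broader links transitively and check whether a mission of Neil Armstrong and dbpedia:Moon end up in a common category. The labels are from the figure, but which nodes each edge connects is an assumption of this sketch, and the function names are hypothetical. Note that SKOS itself declares skos:broader as non-transitive and offers skos:broaderTransitive for this kind of inference, so the closure computed here is a deliberate modelling choice.

```python
# Labels are from the figure; the pairing of edges to nodes is assumed.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Missions_to_the_Moon"),
    ("category:Missions_to_the_Moon", "skos:broader", "category:Exploration_of_the_Moon"),
    ("category:Exploration_of_the_Moon", "skos:broader", "category:Moon"),
    ("category:Exploration_of_the_Moon", "skos:broader", "category:Spaceflight"),
    ("category:Animals_in_Space", "skos:broader", "category:Spaceflight"),
    ("dbpedia:Moon", "dcterms:subject", "category:Moon"),
]


def broader_closure(category, triples):
    """Return all categories reachable from category via skos:broader."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def categories_of(resource, triples):
    """Return the direct dcterms:subject categories plus their inferred ancestors."""
    direct = {o for s, p, o in triples if s == resource and p == "dcterms:subject"}
    inferred = set(direct)
    for category in direct:
        inferred |= broader_closure(category, triples)
    return inferred


def shared_categories(person, other, triples):
    """Return categories shared by the person's missions and another resource."""
    missions = {o for s, p, o in triples if s == person and p == "dbpedia-owl:mission"}
    mission_categories = set()
    for mission in missions:
        mission_categories |= categories_of(mission, triples)
    return mission_categories & categories_of(other, triples)


print(shared_categories("dbpedia:Neil_Armstrong", "dbpedia:Moon", TRIPLES))
```

Under the assumed edges the derived link is category:Moon. This only shows a topical relation and does not answer whether Neil Armstrong was on the Moon; the same closure also places category:Animals_in_Space under category:Spaceflight alongside Apollo 11, which is why relevance heuristics are needed on top of entailment.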

08 - Exploratory Semantic Search

Open HPI - Course: Semantic Web Technologies - Lecture 6: Applications in the Web of Data