AT, 26.02.2003 1
Anja Theobald
University of the Saarland, [email protected]
http://www-dbs.cs.uni-sb.de/
An Ontology for Domain-oriented Semantic Similarity Search
On XML Data
(BTW) February 25 – 28, 2003
Leipzig, Germany
Motivation
Query on Web Data:
Ranking based on content data and structure (XML, …)
Grouping results by their topics (e.g. movie, astronomy, sports)
Using ontologies for similarity search
Outline
0. Why do we need Ranked Retrieval and Ontologies?
1. XXL Search Engine
2. Ontologies - a Linguistic Challenge
3. Graph-based Ontology
4. Quantification: Edge Weights
5. Similarity of Ontology Nodes
6. Ontology-based Query Processing
XXL Search Engine
Architecture (figure):
VisualXXL
WWW
Crawler
Indexers and the indexes they build:
  PathIndexer: EPI
  ContentIndexer: ECI
  NameOntologyIndexer: NOI
  ContentOntologyIndexer: COI
QueryProcessor with handlers:
  EPIHandler
  ECIHandler
  NameOntologyHandler
  ContentOntologyHandler
XXL Query:
  SELECT * FROM INDEX
  WHERE #.~universe AS U
    AND U.#.~appearance AS A
    AND U.#.S ~ "star"
XML Document:
  …
  <galaxy>
    <object>
      <description>sun</>
      <appearance>…light and heat…</>
      <location>…</>
      …
    </object>
    <history> … </>
    …
  </galaxy>
  …
Ontologies – a linguistic challenge
ontology: ...representational vocabulary of words including hierarchical relationships and associative relationships between these words [Gruber93]...
word: star
sense: ...a celestial body of hot gases...
object: (figure)
Edge labels of the word / sense / object triangle (figure): symbolized, refers to, stands for
Word – Sense – Synset
words w ∈ Σ*
+ word senses: U = {(w,s) | w ∈ Σ*, s ∈ S: word w has sense s}
+ synonym relationship: synset(s) = { w | (w,s) ∈ U }
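The definitions above can be sketched in a few lines of Python. This is only an illustration: the relation `U` is a hypothetical toy sample, and the sense identifiers (`star_astronomy`, `star_figure`, `celestial_body`) are made-up names, not taken from the talk.

```python
# Hypothetical toy relation U of (word, sense) pairs; sense ids are made up.
U = {
    ("star", "star_astronomy"),
    ("star", "star_figure"),
    ("celestial body", "celestial_body"),
    ("heavenly body", "celestial_body"),
}


def synset(sense, relation):
    """synset(s) = { w | (w, s) in U }: all words that share the sense s."""
    return {w for (w, s) in relation if s == sense}


def senses(word, relation):
    """All senses of a word; more than one means the word is ambiguous."""
    return {s for (w, s) in relation if w == word}


print(synset("celestial_body", U))
print(senses("star", U))
```

Words that share a sense end up in one synset, while an ambiguous word such as "star" shows up in several synsets, which is what makes disambiguation necessary.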
Disambiguation: Synset – Category
synset(s) = { w | (w,s) ∈ U }   // U = {(w,s) | word w has sense s}
+ hypernym relationship: category(s) = { synset(s') | synset(s') is hypernym of synset(s) }

sense 1: (astronomy) a celestial body of hot gases…
  synset(s): star
  hypernyms (most general first): entity, physical thing > object, physical object > natural object > celestial body, heavenly body

sense 4: a plane figure with 5 or more points…
  synset(s): star
  hypernyms (most general first): abstraction > attribute > shape, form > figure > plane figure, 2-dim. figure
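As a rough sketch of how the category of a sense separates the two meanings of "star": the dictionary below stores one direct hypernym per synset and `category` walks upward. The keys `star (astronomy)` and `star (figure)` are hypothetical labels, and the order of the abstract chain (figure, shape, attribute, abstraction) is an assumption about how the slide figure is read.

```python
# Direct hypernym of each synset; keys for the two "star" senses are made up.
HYPERNYM = {
    "star (astronomy)": "celestial body, heavenly body",
    "celestial body, heavenly body": "natural object",
    "natural object": "object, physical object",
    "object, physical object": "entity, physical thing",
    "star (figure)": "plane figure, 2-dim. figure",
    "plane figure, 2-dim. figure": "figure",
    "figure": "shape, form",
    "shape, form": "attribute",
    "attribute": "abstraction",
}


def category(synset_id):
    """Return all hypernym synsets of a synset, most specific first."""
    chain = []
    current = synset_id
    while current in HYPERNYM:
        current = HYPERNYM[current]
        chain.append(current)
    return chain


print(category("star (astronomy)"))
print(category("star (figure)"))
```

The two chains share no synset, so the category is enough to tell the astronomical star from the geometric one.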
Example Ontology
Nodes (figure), written as synset [category]:
entity, physical thing [entity, physical thing]
food [substance, matter]
milk [foodstuff, ...]
cows' milk [milk]
group, grouping [group, grouping]
galaxy, ... [collection, ...]
milky way [galaxy, ...]
natural object [object, ...]
sun [star]
universe, cosmos [collection, ...]
star [celestial body, ...]
Beta Centauri [star]
abstraction [abstraction]
star [plane figure, 2-dim figure]
hexagram [star]
Edge weights shown in the figure: [0.71], [0.83], [0.94]
Graph-based Ontology
Ontology G = (V, E)
x = (synset(s), category(s)) ∈ V
e = (x, y, type, weight) ∈ E
Construction:
  word: ... extracted from a document
  category, type: ... extracted from an existing thesaurus (interchangeable!)
  weight: ... expresses semantic similarity of connected words
Use:
  sim: ... expresses semantic similarity of ontology nodes
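A minimal sketch of this graph as a data structure, assuming nodes are (synset, category) pairs and edges carry a type and a weight as defined above. The class names, the `edge_type` string "hypernym", and the weight 0.8 are illustrative assumptions, not the XXL implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    synset: frozenset    # words sharing one sense
    category: frozenset  # words of the hypernym synset


@dataclass(frozen=True)
class Edge:
    x: Node
    y: Node
    edge_type: str       # e.g. "hypernym"; taken from a thesaurus
    weight: float        # semantic similarity of the connected nodes


@dataclass
class Ontology:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, x, y, edge_type, weight):
        """Insert both end nodes and the weighted, typed edge."""
        self.nodes.update((x, y))
        self.edges.append(Edge(x, y, edge_type, weight))

    def neighbors(self, node):
        """Return (neighbor, weight) pairs, treating edges as undirected."""
        result = []
        for e in self.edges:
            if e.x == node:
                result.append((e.y, e.weight))
            elif e.y == node:
                result.append((e.x, e.weight))
        return result


star = Node(frozenset({"star"}), frozenset({"celestial body", "heavenly body"}))
sun = Node(frozenset({"sun"}), frozenset({"star"}))
g = Ontology()
g.add_edge(sun, star, "hypernym", 0.8)  # weight chosen only for illustration
print(g.neighbors(star))
```

Keeping the category inside the node is what lets two nodes with the same synset word, such as the two senses of "star", coexist in one graph.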
Quantification: Edge Weight
semantic similarity of connected synsets according to their concepts
vector space measures / probabilistic measures

Example nodes (figure):
galaxy, extragalactic nebula [collection, aggregation, accumulation, assemblage]
star [celestial body, heavenly body]
sun [star]

DICE coefficient (…using web search engines for word frequencies…):
$$\frac{2\,|X \cap Y|}{|X| + |Y|}$$
X := (coll ∨ … ∨ ass) ∧ (galaxy ∨ extr…)
Y := (cel ∨ heav) ∧ (star)
X ∩ Y := X ∧ Y

Edge weights shown in the figure: [0.172], [0.113]
$$\frac{2\,|X \cap Y|}{|X| + |Y|} = \frac{2 \cdot 7410}{70600 + 15600} \approx 0.172$$
$$\frac{2\,|X \cap Y|}{|X| + |Y|} = \frac{2 \cdot 140000}{2470000 + 15600} \approx 0.113$$
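The arithmetic of the DICE coefficient can be checked with a short sketch. The function name `dice` and its parameter names are assumptions; the hit counts are the ones that appear in the slide's two calculations, and how they were obtained from a search engine is not modelled here.

```python
def dice(count_x, count_y, count_xy):
    """DICE coefficient 2|X ∩ Y| / (|X| + |Y|) computed from hit counts."""
    total = count_x + count_y
    if total == 0:
        return 0.0  # no hits for either query: treat as no similarity
    return 2 * count_xy / total


# Hit counts as read from the two calculations on the slide.
print(round(dice(70600, 15600, 7410), 3))       # 0.172
print(round(dice(2470000, 15600, 140000), 3))   # 0.113
```

Because the counts come from separate queries, the joint count is not guaranteed to be smaller than each single count, so the result should be read as an estimate rather than an exact set overlap.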
Similarity of Ontology Nodes
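Path similarity along a path p = (n_0 = x, …, n_length(p) = y):
$$\mathit{pathsim}_{xy}(p) = \sum_{m=0}^{\mathit{length}(p)-1} \frac{\mathit{length}(p) - m}{\mathit{length}(p)} \cdot \mathit{weight}(n_m, n_{m+1})$$

Node similarity, for a path p connecting x and y:
$$\mathit{sim}(x,y) = \frac{\mathit{pathsim}_{xy}(p) + \mathit{pathsim}_{yx}(p)}{2 \cdot \mathit{length}(p)}$$

Example ontology (figure), nodes written as synset [category]:
entity [entity]
cows' milk [milk]
milk [liquid]
protein [macromolecule]
group [group]
galaxy [collection]
milky way [galaxy]
sun [star]
universe [collection]
star [celestial body]
Beta Centauri [star]
natural object [object]
Edge weights shown in the figure: [0.2], [0.6], [0.5], [0.8], [0.1], [0.1], [0.3], [0.6]

sim(milky way, sun)
|p| = 3:
  pathsim_xy(p) = 3/3 · 0.6 + 2/3 · 0.5 + 1/3 · 0.8 = 1.2
  pathsim_yx(p) = 3/3 · 0.8 + 2/3 · 0.5 + 1/3 · 0.6 = 1.3
sim(milky way, sun) = 0.42
sim(milky way, cows' milk) = 0.2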
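The worked example can be reproduced with a small sketch. It assumes a path is given simply as the list of its edge weights in travel order, here 0.6, 0.5, 0.8 for the path from milky way to sun; the function names and this list representation are assumptions made for illustration.

```python
def pathsim(weights):
    """Sum of edge weights, each damped by (length - m) / length."""
    length = len(weights)
    if length == 0:
        raise ValueError("a path needs at least one edge")
    return sum((length - m) / length * w for m, w in enumerate(weights))


def sim(weights):
    """Average of both travel directions, normalised by the path length."""
    forward = pathsim(weights)
    backward = pathsim(list(reversed(weights)))
    return (forward + backward) / (2 * len(weights))


path = [0.6, 0.5, 0.8]  # edge weights from milky way to sun, as in the example
print(round(pathsim(path), 1))                    # 1.2
print(round(pathsim(list(reversed(path))), 1))    # 1.3
print(round(sim(path), 2))                        # 0.42
```

Edges near the start of the path count fully and later edges count less, which is why the two directions give different values and are averaged.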
Ontology-based Query Processing
XXL Query:
  ... WHERE #.~universe AS U
        AND U.#.~appearance AS A
        AND U.#.S ~ "star"
XXL Query Representation:
  ~universe
  ~appearance
  % %
  ~ "star"
XML Documents:
  …
  <galaxy>
    <object>
      <description>sun</>
      <appearance>…light and heat… </appearance>
      <location>…</>
      …
    </object>
    <history> … </>
    …
  </galaxy>
  …
Ontology-based Query Processing
XXL Query:
  ... WHERE #.~universe AS U
        AND U.#.~appearance AS A
        AND U.#.S ~ "star"
XXL Query Representation, with the local score of each match:
  ~universe: sim(universe, galaxy) = 0.94
  %: 1.0
  ~appearance: sim(appearance, appearance) = 1.0
  %: 1.0
  ~ "star": sim(star, sun) * tfidf(sun) = 0.43
XML Data Graph:
  galaxy
    object
      description: sun
      appearance: "…light and heat…"
      location
    history
Relevance score of the result graph: 0.4
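The value 0.4 is consistent with multiplying the local scores of the matched nodes (0.94 · 1.0 · 1.0 · 1.0 · 0.43 ≈ 0.4). The sketch below assumes that combination rule; the dictionary keys and the function name are hypothetical labels for this example, and the slide itself does not spell out the rule.

```python
import math

# Local scores of the matches as shown on the slide; the keys are only labels.
local_scores = {
    "~universe -> galaxy": 0.94,
    "% (first wildcard)": 1.0,
    "~appearance -> appearance": 1.0,
    "% (second wildcard)": 1.0,
    '~ "star" -> sun': 0.43,   # sim(star, sun) * tfidf(sun)
}


def result_graph_score(scores):
    """Assumed combination rule: product of all local scores."""
    return math.prod(scores)


print(round(result_graph_score(local_scores.values()), 1))  # 0.4
```

Under this assumption exact matches and wildcards contribute a factor of 1.0, so only the ontology-based matches lower the score of a result.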
- END -
Thank you very much!
Are there any more questions?