Towards Fine-grained Service Matchmaking
by Using Concept Similarity
Alberto Fernández, Axel Polleres, Sascha Ossowski {alberto.fernandez,sascha.ossowski}@urjc.es
University Rey Juan Carlos (Madrid - Spain); DERI, National University of Ireland, Galway
SMR2’07. ISWC, Busan.
Nov. 11 – 15, 2007.
Outline
  Introduction
  Concept Similarity
  Matching Semantic Web Services
  Towards a combined notion of similarity-based SM
  Conclusions
Introduction
  Location and selection of services in SOA
  Service descriptions
    Provided services (advertisements)
    Service requests
    Both based on shared formal ontologies
  Notions of match between advertisements and requests
    Subsumption checking: Boolean (or several degrees of) match
    Concept similarity: numerical (fine-grained)
  Objective: a unified framework combining notions of match and concept similarity
Concept Similarity
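  Semantic distance approaches
    Rada et al.: shortest path between two concepts in the taxonomy
      dist(c1, c2) = depth(c1) + depth(c2) − 2 × depth(lcs(c1, c2))
    Leacock & Chodorow (H is the height, i.e. maximum depth, of the taxonomy)
      $$\mathrm{relatedness}(c_1, c_2) = -\log \frac{\mathrm{dist}(c_1, c_2)}{2H}$$
    Fernandez et al.
      $$\mathrm{sim}(c_1, c_2) = \begin{cases}
        1 & \text{if } c_1 = c_2 \\
        \frac{1}{2} + \frac{1}{2\, e^{\mathrm{dist}(c_1, c_2)}} & \text{if } c_1 \text{ subsumes } c_2 \\
        \frac{1}{2\, e^{\mathrm{dist}(c_1, c_2)}} & \text{if } c_2 \text{ subsumes } c_1 \\
        0 & \text{otherwise}
      \end{cases}$$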
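A minimal sketch of the two path-based measures on this slide, assuming Rada's dist(c1, c2) = depth(c1) + depth(c2) − 2 × depth(lcs(c1, c2)) and Leacock & Chodorow's −log(dist / 2H). The taxonomy `PARENT` and all function names are hypothetical and only serve the illustration.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from the slides.
PARENT = {
    "Vehicle": None,
    "Car": "Vehicle",
    "Truck": "Vehicle",
    "SportsCar": "Car",
    "Sedan": "Car",
}

def ancestors(c):
    """Path from c up to the root, c included."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def depth(c):
    """Number of edges between c and the root."""
    return len(ancestors(c)) - 1

def lcs(c1, c2):
    """Least common subsumer: the deepest ancestor shared by c1 and c2."""
    seen = set(ancestors(c1))
    return next(a for a in ancestors(c2) if a in seen)

def dist(c1, c2):
    """Rada et al.: length of the shortest path between c1 and c2."""
    return depth(c1) + depth(c2) - 2 * depth(lcs(c1, c2))

def relatedness(c1, c2, height):
    """Leacock & Chodorow: -log(dist / (2 * H)); undefined when dist is 0."""
    return -math.log(dist(c1, c2) / (2 * height))

HEIGHT = max(depth(c) for c in PARENT)  # H = 2 for this toy taxonomy
print(dist("SportsCar", "Truck"))
print(relatedness("SportsCar", "Sedan", HEIGHT))
```

Siblings such as SportsCar and Sedan are two edges apart, while SportsCar and Truck meet only at the root and are three edges apart.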
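Concept Similarity
  Semantic distance: taking depth into account
    Wu & Palmer
      $$\mathrm{sim}(c_1, c_2) = \frac{2 N_3}{N_1 + N_2 + 2 N_3}$$
      $N_1 = \mathrm{length}(c_1, lcs)$, $N_2 = \mathrm{length}(c_2, lcs)$, $N_3 = \mathrm{length}(root, lcs)$
    Li et al.
      $$\mathrm{sim}(c_1, c_2) = \begin{cases}
        e^{-\alpha l} \cdot \frac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}} & \text{if } c_1 \neq c_2 \\
        1 & \text{otherwise}
      \end{cases}$$
      $\alpha \geq 0$, $\beta > 0$, $l = \mathrm{dist}(c_1, c_2)$, $h = \mathrm{length}(root, lcs)$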
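A minimal sketch of the two depth-aware measures, assuming Wu & Palmer's 2·N3 / (N1 + N2 + 2·N3) and Li et al.'s e^(−αl) · tanh(βh) for c1 ≠ c2 (tanh is the hyperbolic-tangent form of the fraction of exponentials). The taxonomy, the function names and the default values of `alpha` and `beta` are illustrative assumptions.

```python
import math

# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {"Vehicle": None, "Car": "Vehicle", "Truck": "Vehicle",
          "SportsCar": "Car", "Sedan": "Car"}

def path_to_root(c):
    """Path from c up to the root, c included."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def depth(c):
    return len(path_to_root(c)) - 1

def lcs(c1, c2):
    """Deepest ancestor shared by c1 and c2."""
    seen = set(path_to_root(c1))
    return next(a for a in path_to_root(c2) if a in seen)

def lengths(c1, c2):
    """Return (N1, N2, N3): edges c1-lcs, c2-lcs and root-lcs."""
    common = lcs(c1, c2)
    n3 = depth(common)
    return depth(c1) - n3, depth(c2) - n3, n3

def wu_palmer(c1, c2):
    if c1 == c2:  # also avoids 0/0 when both concepts are the root
        return 1.0
    n1, n2, n3 = lengths(c1, c2)
    return 2 * n3 / (n1 + n2 + 2 * n3)

def li(c1, c2, alpha=0.2, beta=0.6):
    """alpha and beta are tunable; the defaults here are only examples."""
    if c1 == c2:
        return 1.0
    n1, n2, h = lengths(c1, c2)
    l = n1 + n2  # shortest path length between c1 and c2
    return math.exp(-alpha * l) * math.tanh(beta * h)

print(wu_palmer("SportsCar", "Sedan"), wu_palmer("SportsCar", "Truck"))
print(li("SportsCar", "Sedan"))
```

Both measures give 0 to concepts whose only common subsumer is the root, which is the depth effect the slide refers to.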
Concept Similarity
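  Feature-based approaches (Tversky)
    Contrast model
      contrast(C,D) = f(ftrs(C) ∩ ftrs(D)) − f(ftrs(C)\ftrs(D)) − f(ftrs(D)\ftrs(C))
      f(·) is usually the count of features; ftrs(C) is the set of features in C
      That is, the number of common features minus the number of non-common features
    Ratio model
      $$\mathrm{sim}(C, D) = \frac{f(\mathrm{ftrs}(C) \cap \mathrm{ftrs}(D))}{f(\mathrm{ftrs}(C) \cap \mathrm{ftrs}(D)) + \alpha\, f(\mathrm{ftrs}(C) \setminus \mathrm{ftrs}(D)) + \beta\, f(\mathrm{ftrs}(D) \setminus \mathrm{ftrs}(C))}$$
      Which is commonly taken as
      $$\mathrm{sim}(C, D) = \frac{2\, f(\mathrm{ftrs}(C) \cap \mathrm{ftrs}(D))}{f(\mathrm{ftrs}(C)) + f(\mathrm{ftrs}(D))}$$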
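A minimal sketch of Tversky's contrast and ratio models with f(·) taken as the feature count. The feature sets and function names are hypothetical. With α = β = 1/2 the ratio model reduces to the commonly used form 2·|C ∩ D| / (|C| + |D|).

```python
def contrast(c, d):
    """Common features minus the features found in only one of the sets."""
    return len(c & d) - len(c - d) - len(d - c)

def ratio(c, d, alpha=0.5, beta=0.5):
    """Tversky ratio model with f(.) = set cardinality."""
    common = len(c & d)
    denom = common + alpha * len(c - d) + beta * len(d - c)
    # Assumption: two empty feature sets are treated as identical.
    return common / denom if denom else 1.0

# Hypothetical feature sets, for illustration only.
car = {"wheels", "engine", "doors"}
bike = {"wheels", "pedals"}

print(contrast(car, bike))  # 1 common - 2 car-only - 1 bike-only
print(ratio(car, bike))     # equals 2 * 1 / (3 + 2)
```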
Concept Similarity
  Information Content approaches
    pr(c) = probability of an individual being described by a specific concept c
    IC(c) = −log pr(c)
    Resnik
      sim(c1, c2) = IC(lcs(c1, c2)) = −log pr(lcs(c1, c2))
    Jiang & Conrath (a distance: lower values mean more similar concepts)
      dist(c1, c2) = IC(c1) + IC(c2) − 2 × IC(lcs(c1, c2))
    Lin
      $$\mathrm{sim}(c_1, c_2) = \frac{2 \times IC(\mathrm{lcs}(c_1, c_2))}{IC(c_1) + IC(c_2)}$$
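A minimal sketch of the information-content measures, assuming IC(c) = −log pr(c) with pr(c) estimated as the share of individuals that belong to c or to one of its descendants. The taxonomy, the instance counts and the function names are hypothetical.

```python
import math

# Hypothetical taxonomy (child -> parent) and instance counts per concept.
PARENT = {"Vehicle": None, "Car": "Vehicle", "Truck": "Vehicle",
          "SportsCar": "Car", "Sedan": "Car"}
COUNT = {"Vehicle": 0, "Car": 2, "Truck": 4, "SportsCar": 1, "Sedan": 1}
TOTAL = sum(COUNT.values())

def path_to_root(c):
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def lcs(c1, c2):
    seen = set(path_to_root(c1))
    return next(a for a in path_to_root(c2) if a in seen)

def subsumed_count(c):
    """Individuals of c plus those of all its descendants."""
    return COUNT[c] + sum(subsumed_count(k) for k, p in PARENT.items() if p == c)

def ic(c):
    """Information content: -log pr(c)."""
    return -math.log(subsumed_count(c) / TOTAL)

def resnik(c1, c2):
    return ic(lcs(c1, c2))

def jiang_conrath_dist(c1, c2):
    return ic(c1) + ic(c2) - 2 * ic(lcs(c1, c2))

def lin(c1, c2):
    if c1 == c2:  # also avoids 0/0 when both concepts are the root
        return 1.0
    return 2 * ic(lcs(c1, c2)) / (ic(c1) + ic(c2))

print(resnik("SportsCar", "Sedan"), lin("SportsCar", "Sedan"))
print(jiang_conrath_dist("SportsCar", "Sedan"))
```

The root covers every individual, so its information content is 0 and concepts that meet only at the root get a Resnik and Lin similarity of 0.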
Concept Similarity
  Description Logics approaches
    Borgida et al.
      Apply distance, feature and information content models
      Very simple DL (A): only conjunctions
    Di Noia et al.
      Potential match (some requests in demand D are not specified in S); rankPotential is computed from:
        the number of concept names in D not in S
        the number of number restrictions of D not implied by those of S
        rankPotential added recursively for each universal role quantification in D
    Fanizzi & d'Amato
      Define a similarity measure between concepts in the ALN DL
      Decompose the normal form of the concept descriptions:
        Primitive concepts: ratio of common individuals wrt. either conjunct
        Value restrictions: computed recursively, the average value is taken
        Numeric restrictions: ratio of overlap, the average value is taken
Concept Similarity
  Information Retrieval approaches
    OWLS-MX (Klusch et al.)
      Logic-based reasoning is complemented by IR-based similarity
      Four different token-based string metrics
        the cosine
        the loss of information
        the extended Jaccard
        the Jensen-Shannon information divergence
      Applied to unfolded concepts: (and C (and B (and A))) corresponds to the concept C (C ⊑ B ⊑ A).
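A minimal sketch of two of the token-based metrics, cosine and extended Jaccard, applied to unfolded concept expressions. It uses raw token counts and drops the `and` operator, which are simplifying assumptions rather than the OWLS-MX weighting scheme. The helper names and the second concept D are hypothetical.

```python
import math
from collections import Counter

def tokens(unfolded):
    """Bag of concept names in an unfolded expression such as
    '(and C (and B (and A)))'; the 'and' operator itself is dropped."""
    words = unfolded.replace("(", " ").replace(")", " ").split()
    return Counter(w for w in words if w != "and")

def dot(a, b):
    """Dot product of two token-count vectors (missing tokens count as 0)."""
    return sum(a[t] * b[t] for t in a)

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def extended_jaccard(a, b):
    ab = dot(a, b)
    return ab / (dot(a, a) + dot(b, b) - ab)

c = tokens("(and C (and B (and A)))")
d = tokens("(and D (and B (and A)))")  # hypothetical sibling concept D
print(cosine(c, d), extended_jaccard(c, d))
```

C and D share the ancestors B and A, so the unfolded forms overlap even though neither concept subsumes the other.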
Concept Similarity: compound concepts
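  Rada et al.
    Disjunction
      $$\mathrm{dist}(C_1 \vee \dots \vee C_k, C) = \min_{i = 1, \dots, k} \{\mathrm{dist}(C_i, C)\}$$
    Conjunction
      $$\mathrm{dist}(V_1, V_2) = \begin{cases}
        0 & \text{if } V_1 = V_2 \\
        \frac{1}{|V_1|\,|V_2|} \sum_{u \in V_1} \sum_{v \in V_2} \mathrm{dist}(u, v) & \text{otherwise}
      \end{cases}$$
  Ehrig et al. (cosine)
    $$\mathrm{sim}(E, F) = \frac{\sum_{e \in E} \vec{e} \cdot \sum_{f \in F} \vec{f}}{\left|\sum_{e \in E} \vec{e}\right| \cdot \left|\sum_{f \in F} \vec{f}\right|}$$
    $\vec{e} = (\mathrm{sim}(e, e_1), \mathrm{sim}(e, e_2), \dots, \mathrm{sim}(e, f_1), \mathrm{sim}(e, f_2), \dots)$
  Sierra & Debenham
    $$\mathrm{Sim}(\varphi, \psi) = \max_{c \in O(\varphi)} \min_{d \in O(\psi)} \{\mathrm{sim}(c, d)\}$$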
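A minimal sketch of Rada et al.'s two rules for compound concepts: the distance from a disjunction to a concept is the minimum over its disjuncts, and the distance between two conjunctive sets is the average pairwise distance (0 when the sets are equal). The base distance is passed in as a function; the one used here, built from hypothetical positions on a line, is only a stand-in for a taxonomy distance.

```python
def dist_disjunction(disjuncts, c, dist):
    """dist(C1 or ... or Ck, C) = min over i of dist(Ci, C)."""
    return min(dist(ci, c) for ci in disjuncts)

def dist_conjunction(v1, v2, dist):
    """Average pairwise distance between two sets of concepts."""
    if set(v1) == set(v2):
        return 0.0
    total = sum(dist(u, v) for u in v1 for v in v2)
    return total / (len(v1) * len(v2))

# Hypothetical base distance: concepts placed on a line, for illustration.
POS = {"A": 0, "B": 1, "C": 3}

def dist(u, v):
    return abs(POS[u] - POS[v])

print(dist_disjunction(["A", "B"], "C", dist))    # min(3, 2)
print(dist_conjunction(["A", "B"], ["C"], dist))  # (3 + 2) / (2 * 1)
```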
Matching SWS: notions of match
  Paolucci et al.
    An advertisement (S) matches a request (R) iff
      for each output of R there is a matching output in S
      for each input of S there is a matching input in R
    Degree of match for outputs (inverse for inputs)
      Exact: OUTR and OUTS are equivalent, or OUTR is a subclass of OUTS
      Plug In: OUTS subsumes OUTR
      Subsumes: OUTR subsumes OUTS
      Fail: no subsumption relation
    If there are several outputs with different degrees of match, the minimum degree is used
    The set of service advertisements is sorted by comparing output matches first
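A minimal sketch of the degrees of match listed above, using a hypothetical toy taxonomy in place of a DL reasoner. Two assumptions are made: "subclass" in the Exact case is read as direct subclass (otherwise it would coincide with Plug In), and each requested output is paired with its best advertised counterpart before the minimum is taken.

```python
from enum import IntEnum

# Hypothetical toy taxonomy (child -> parent); not taken from the slides.
PARENT = {"Vehicle": None, "Car": "Vehicle", "Truck": "Vehicle",
          "SportsCar": "Car", "Sedan": "Car"}

class Degree(IntEnum):
    FAIL = 0
    SUBSUMES = 1
    PLUG_IN = 2
    EXACT = 3

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    c = specific
    while c is not None:
        if c == general:
            return True
        c = PARENT[c]
    return False

def degree_output(out_r, out_s):
    """Degree of match between a requested and an advertised output."""
    # Assumption: "subclass" in the Exact case means *direct* subclass.
    if out_r == out_s or PARENT[out_r] == out_s:
        return Degree.EXACT
    if subsumes(out_s, out_r):
        return Degree.PLUG_IN
    if subsumes(out_r, out_s):
        return Degree.SUBSUMES
    return Degree.FAIL

def degree_input(in_s, in_r):
    """Inputs use the same test with the roles of S and R swapped."""
    return degree_output(in_s, in_r)

def match_outputs(req_outs, adv_outs):
    """Best advertised match per requested output, then the minimum."""
    return min((max((degree_output(r, s) for s in adv_outs), default=Degree.FAIL)
                for r in req_outs), default=Degree.EXACT)

def match_inputs(adv_ins, req_ins):
    """Best requested match per advertised input, then the minimum."""
    return min((max((degree_input(s, r) for r in req_ins), default=Degree.FAIL)
                for s in adv_ins), default=Degree.EXACT)

print(match_outputs(["SportsCar", "Car"], ["Vehicle"]).name)
```

Sorting advertisements by the pair (output degree, input degree) in descending order would then compare output matches first, as the slide describes.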
Matching SWS: notions of match
  OWLS-MX
    Hybrid: logic-based + syntactic IR-based similarity
    Matching filters
      Exact: ∀ INS ∃ INR: INS = INR ∧ ∀ OUTR ∃ OUTS: OUTR = OUTS
      Plug In: ∀ INS ∃ INR: INS ⊒ INR ∧ ∀ OUTR ∃ OUTS: OUTS ∈ LSC(OUTR)
      Subsumes: ∀ INS ∃ INR: INS ⊒ INR ∧ ∀ OUTR ∃ OUTS: OUTR ⊒ OUTS
      Subsumed-by: ∀ INS ∃ INR: INS ⊒ INR ∧ ∀ OUTR ∃ OUTS: (OUTS = OUTR ∨ OUTS ∈ LGC(OUTR)) ∧ SIMIR(S,R) ≥ α
      Logic-based fail: the above logic-based filters fail
      Nearest-neighbour: ∀ INS ∃ INR: INS ⊒ INR ∧ ∀ OUTR ∃ OUTS: OUTR ⊒ OUTS ∨ SIMIR(S,R) ≥ α
      Fail
Matching SWS: notions of match
  Li & Horrocks
    One DL concept defines the inputs and one the outputs
    Extend the degree levels proposed by Paolucci
      Exact: if S ≡ R
      Plug In: if R ⊑ S
      Subsume: if S ⊑ R
      Intersection: if ¬(S ⊓ R ⊑ ⊥)
      Disjoint: if S ⊓ R ⊑ ⊥
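A minimal sketch of the five degrees, assuming each concept is approximated by a finite set of individuals so that subsumption becomes set inclusion and satisfiability of the intersection becomes non-emptiness. This extensional reading replaces the DL reasoner only for illustration; the service names and sets are hypothetical.

```python
def degree(s, r):
    """Degree of match between advertisement S and request R,
    with both concepts given as frozensets of individuals."""
    if s == r:
        return "Exact"
    if r <= s:          # R is subsumed by S
        return "Plug In"
    if s <= r:          # S is subsumed by R
        return "Subsume"
    if s & r:           # the intersection is satisfiable (non-empty)
        return "Intersection"
    return "Disjoint"

# Hypothetical extensions, for illustration only.
cars = frozenset({"golf", "civic", "911"})
sports = frozenset({"911"})
red_things = frozenset({"911", "firetruck"})
boats = frozenset({"ferry"})

print(degree(cars, sports))      # the request is a special case of the ad
print(degree(cars, red_things))  # the two overlap without inclusion
```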
Towards a combined notion of similarity-based SM
  Notion of similarity match (NoSM): real number in [0..1]
  Notion of match (NoM)
    Logic-based, coarse-grained
    Several levels of match: NoM ∈ {exact, level1, level2, …, leveln, fail}
  Refining with concept similarity (sim): real number in [0..1]
  Aggregation
    Compound concepts (e.g. set of inputs)
    Components: inputs, outputs, operations
  Maintaining the NoM (logic-based) semantics
  [Figure: NoM levels (exact, level1, level2, …, leveln, fail) and sim ∈ [0, 1] mapped onto NoSM ∈ [0, 1]]
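A minimal sketch of one way to combine a NoM level with a similarity value into a NoSM in [0, 1] while preserving the logic-based ordering. The equal-width sub-intervals, the concrete level names and the function `nosm` are assumptions made for illustration; the slides do not fix a formula.

```python
# Hypothetical NoM levels, ordered from worst to best.
LEVELS = ["fail", "subsumes", "plug-in", "exact"]

def nosm(level, sim):
    """Map (NoM level, sim) to a single value in [0, 1].

    Assumption: every level owns an equal-width sub-interval of [0, 1]
    and sim selects the position inside that sub-interval.
    """
    if not 0.0 <= sim <= 1.0:
        raise ValueError("sim must lie in [0, 1]")
    rank = LEVELS.index(level)
    return (rank + sim) / len(LEVELS)

print(nosm("plug-in", 0.5))
print(nosm("exact", 0.1) > nosm("plug-in", 0.9))
```

Under this mapping a better NoM level never scores below a worse one, so sim only refines the ranking among services with the same logic-based degree; the two values coincide only at an interval boundary (sim = 1 on one level, sim = 0 on the next).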
Example
Conclusions
  Concept similarity
    Distance is commonly used …
      Assumes equally distributed instances over concepts
      Difficult to apply to DL
        Adoption of a canonical representation? Spanning tree of pre-classification, new atomic concept names for ∃R.C, ∀R.C, …
    … but other approaches exist (features, IC, IR, …)
    Concept definitions vs. instances
  Matching SWS
    Most current approaches are based on inputs/outputs
    Logic-based reasoning: subsumption
    Several (non-numerical) degrees of match
Conclusions and further work
  Notion of similarity-based service matching
    Using concept similarity to refine the notion of match
    Fine-grained degree of match: facilitates service ranking
  Open issues
    Which service description framework to focus on? OWL-S, WSMO, etc., or a new one to which these approaches could be easily mapped?
    Which concept similarity measure best fits our framework? Is there a single "best" measure? What are the conditions that it must fulfill?
    How should values corresponding to different elements be combined?
    Do different applications require the same framework, or should it be adapted for each of them?
Thanks!!
Questions?