18
OWLS-MX: A Hybrid Semantic Web Service Matchmaker for OWL-S Services Matthias Klusch a,1,Benedikt Fries b Katia Sycara c,2 a German Research Center for Artificial Intelligence, Saarbruecken, Germany b Morgan Stanley Japan Securities Corporation, Tokyo, Japan c Carnegie Mellon University, Robotics Institute, Pittsburgh, USA Abstract In this paper, we describe the first hybrid semantic Web service matchmaker for OWL-S services, called OWLS- MX. It complements crisp logic-based semantic matching of OWL-S services with token-based syntactic similarity measurements in case the former fails. The results of the experimental evaluation of OWLS-MX provide strong evidence for the claim that logic-based semantic matching of OWL-S services can be significantly improved by incorporating non-logic-based information retrieval techniques. An additional analysis of false positives and false negatives of the hybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2. Key words: Semantic Web, OWL-S, Semantic Service Matching, Information Retrieval 1. Introduction Semantic service discovery is the process of lo- cating existing Web services based on the descrip- tion of their functional and non-functional seman- tics. Discovery scenarios typically occur when one is trying to reuse an existing piece of functionality (represented as a Web service) in building new or enhanced business processes. Both service-oriented computing and the Semantic Web envision intelli- Corresponding author. Phone: +49-681-302-5297 Email addresses: [email protected] (Matthias Klusch), [email protected] (Benedikt Fries), [email protected] (Katia Sycara). 1 Partial support provided by BMBF (German Ministry for Education and Research) grants MODEST 01-IWO-8001, SCALLOPS 01-IW-D02, European Commission grant CAS- COM IST-FP6-511632. 2 Partial support provided by the DARPA DAML program under contract F30601-00-2-0592. gent agents to proactively pursue this task on behalf of their users. Central to the majority of contemporary ap- proaches to Semantic Web service selection is that the functionality of Web services is logically defined in, for example, the standard first-order description logic-based ontology language OWL [6] or a rule language like SWRL, or a logic programming lan- guage like F-Logic. In anycase, intelligent agents can exploit standard means of logic reasoning to au- tomatically understand the Web service semantics, in particular to determine the degree to which the service is semantically relevant to a given service request. However, the representation of real-world seman- tics in logics only is known to be inadequate due to its limited expressivity. In addition, automated rea- soning on Web service semantics expressed in first- order logics turned out not to be sufficiently scalable to the Web in practice [4]. One pragmatic solution to this problem is hy- Preprint submitted to Elsevier 4 September 2008

OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

OWLS-MX: A Hybrid Semantic Web Service Matchmakerfor OWL-S Services

Matthias Klusch a,1,∗ Benedikt Fries b Katia Sycara c,2

aGerman Research Center for Artificial Intelligence, Saarbruecken, GermanybMorgan Stanley Japan Securities Corporation, Tokyo, Japan

cCarnegie Mellon University, Robotics Institute, Pittsburgh, USA

Abstract

In this paper, we describe the first hybrid semantic Web service matchmaker for OWL-S services, called OWLS-MX. It complements crisp logic-based semantic matching of OWL-S services with token-based syntactic similaritymeasurements in case the former fails. The results of the experimental evaluation of OWLS-MX provide strong evidencefor the claim that logic-based semantic matching of OWL-S services can be significantly improved by incorporatingnon-logic-based information retrieval techniques. An additional analysis of false positives and false negatives of thehybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2.

Key words: Semantic Web, OWL-S, Semantic Service Matching, Information Retrieval

1. Introduction

Semantic service discovery is the process of lo-cating existing Web services based on the descrip-tion of their functional and non-functional seman-tics. Discovery scenarios typically occur when oneis trying to reuse an existing piece of functionality(represented as a Web service) in building new orenhanced business processes. Both service-orientedcomputing and the Semantic Web envision intelli-

∗ Corresponding author. Phone: +49-681-302-5297Email addresses: [email protected] (Matthias Klusch),

[email protected] (Benedikt Fries),[email protected] (Katia Sycara).1 Partial support provided by BMBF (German Ministry forEducation and Research) grants MODEST 01-IWO-8001,SCALLOPS 01-IW-D02, European Commission grant CAS-COM IST-FP6-511632.2 Partial support provided by the DARPA DAML programunder contract F30601-00-2-0592.

gent agents to proactively pursue this task on behalfof their users.

Central to the majority of contemporary ap-proaches to Semantic Web service selection is thatthe functionality of Web services is logically definedin, for example, the standard first-order descriptionlogic-based ontology language OWL [6] or a rulelanguage like SWRL, or a logic programming lan-guage like F-Logic. In anycase, intelligent agentscan exploit standard means of logic reasoning to au-tomatically understand the Web service semantics,in particular to determine the degree to which theservice is semantically relevant to a given servicerequest.

However, the representation of real-world seman-tics in logics only is known to be inadequate due toits limited expressivity. In addition, automated rea-soning on Web service semantics expressed in first-order logics turned out not to be sufficiently scalableto the Web in practice [4].

One pragmatic solution to this problem is hy-

Preprint submitted to Elsevier 4 September 2008

Page 2: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

brid semantic service selection, that is the combina-tion of both logic-based and non-logic-based approx-imate reasoning on service semantics. Pioneeringwork in this direction include the first implementedhybrid semantic service matchmakers like LARKS[17], OWLS-MX, WSMO-MX [8], FC-MATCH [1]and OWLS-iMatcher2 [9].

In this paper, we describe our hybrid SemanticWeb service matchmaker for OWL-S services, calledOWLS-MX, in detail. Key to OWLS-MX is thatit tolerates logical subsumption-based signaturematching failures up to a specified extent by com-plementary approximate matching based on textsimilarity measurement. Of course, we acknowledgethat the adaptation to the latter eventually is onthe user’s end.

The remainder of this paper is structured as fol-lows. We provide background information on seman-tic services in OWL-S and semantic service selectionwith focus on logic-based approaches in sections 2and 3, respectively. The following section 4 presentsour approach to hybrid semantic service profile se-lection with OWLS-MX including its hybrid match-ing filters, the generic matching algorithm togetherwith variants and a simple application example. De-tails on the implementation of OWLS-MX are givenin section 5. The experimental results of measuringthe service retrieval performance and scalability ofOWLS-MX over a given test collection are presentedin section 6, followed by an experimental analysisof its false positives and false negatives in section 7.These results led to an improved version of OWLS-MX reported in section 8. We briefly present re-lated work on hybrid semantic service matchmakersin section 9, and conclude in section 10.

2. Semantic services in OWL-S

Our semantic service matchmaker OWLS-MX fo-cusses on semantic services that are described inOWL-S. In the following, we briefly introduce theessentials of OWL-S, and refer to, for example, [15]for more details.

2.1. Overview

OWL-S is an upper ontology used to describe thesemantics of services based on the W3C standardontology OWL and is grounded in WSDL. It has itsroots in the DAML Service Ontology (DAML-S) re-leased in 2001, and became a W3C candidate recom-

mendation in 2005. The OWL-S ontology consistsof three main components: the service profile for ad-vertising and discovering services; the process model,which gives a detailed description of a service’s op-eration; and the grounding, which provides detailson how to interoperate with a service, via messages.

In particular, the semantic service profile in OWL-S specifies the semantics of the service signature,that is the inputs required by the service and theoutputs generated. Furthermore, since a service mayrequire external conditions to be satisfied, and it hasthe effect of changing such conditions, the profilealso describes the preconditions to be satisfied be-fore, and the expected effects that result from theexecution of the service. The majority of existingOWL-S service matchmakers focusses on semanticservice profiles.

2.2. OWL-S service profile

The OWL-S profile ontology is used to describe whatthe service does, and is meant to be mainly used forthe purpose of service discovery. An OWL-S serviceprofile or signature encompasses its functional para-meters, i.e. hasInput, hasOutput, precondition andeffect (IOPEs), as well as non-functional parameterssuch as serviceName, serviceCategory, qualityRat-ing, textDescription, and meta-data about the ser-vice provider such as name and location 3 .

Inputs and outputs relate to data channels, wheredata flows between processes. Preconditions spec-ify facts of the world (state) that must be assertedin order for an agent to execute a service. Effectscharacterize facts that become asserted given asuccessful execution of the service in the physicalworld (state). Whereas, in OWL-S, the semanticsof each service input and output parameter is de-fined in terms of a referenced OWL concept in agiven ontology, typically in a decidable descriptionlogic OWL-DL or OWL-Lite, the preconditions andeffects can be expressed in any appropriate first-order logic (rule) language such as KIF (KnowledgeInterchange Format) or SWRL (Semantic WebRule Language). Besides, the profile class can besubclassed and specialized, thus supporting thecreation of profile taxonomies which subsequentlydescribe different classes of services.

3 Please note that, in contrast to OWL-S 1.0, in OWL-S1.1 the service IOPE parameters are defined in the processmodel with unique references to these definitions from theprofile.

2

Page 3: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

2.3. OWL-S service process model

An OWL-S process model describes the compo-sition (choreography and orchestration) of one ormore subservices of a service, that is the controlledenactment of constituent processes with respectivecommunication patterns, exposed IOPEs and para-meter bindings of linked subservices. The seman-tics of OWL-S service process models have not beendefined in the specification of OWL-S but variousexisting approaches to formalize the semantics ofthe standard Web service orchestration languageBPEL (Business Process Execution Language) canbe exploited for this purpose. Originally, the serviceprocess model was not intended for service discoveryby the so-called OWL-S coalition, that is the groupof researchers who developed OWL-S.

2.4. OWL-S service grounding

The grounding of an OWL-S service descriptionprovides a binding between the logic-based seman-tic service profile, the process model, and the XML-based Web service interface to facilitate service exe-cution. Such a grounding of OWL-S services can be,in principle, arbitrary but has been exemplified for agrounding in WSDL (Web Service Description Lan-guage) to concretely connect OWL-S to an existingWeb service standard.

In particular, the logic-based description of theservice signature is uniquely associated with that ofthe Web service, and an atomic semantic processmodel is mapped to a WSDL operation. WSDL 1.0does not allow to express pre-conditions or effects ofservices, nor has any formal semantics.

Though OWL-S allows only static and determin-istic aspects of the world to describe in the descrip-tion logic variants of OWL used for semantic anno-tation, the majority of semantic services available inthe public Web happens to be in OWL-S [11]. Refac-toring OWL-S to the standard for Semantic Webservice description, SAWSDL (Semantically Anno-tated WSDL), is ongoing work in the Semantic Webservices science community.

3. Semantic service selection

What is semantic service selection? Apart fromfinding available semantic services in the Web orcentral service directories, the quality of semanticservice discovery depends on the process of seman-

tic service selection: The pairwise semantic servicematching of a set of semantic services with a givenquery and respective relevance-based ranking of theresults returned to the user. Semantic service selec-tion tools are also called semantic service match-makers.

In the following, we classify existing SemanticWeb service matchmakers, and focus on what mostof them perform: Logic-based semantic service pro-file matching. Related work on hybrid semanticservice matchmakers are discussed in section 9. Fora more comprehensive survey, we refer to [10].

3.1. Classification of SWS matchmakers

Current semantic service matchmaker can beclassified according to (a) what kinds and parts ofservice semantics are considered for matching, and(b) how matching is actually performed in terms oflogic-based or non-logic-based reasoning within orpartly outside the service description framework, ora combination of both (cf. figure 1).

The majority of them performs logic-based se-mantic service profile matching, and is restricted toOWL-S. Only a few are available for alternativeslike WSML or the standard SAWSDL, and does alsotake process models and non-functional parametersinto account. Though, process model-based match-ing was not intended by the designers of OWL-S orWSML, while neither SAWSDL nor monolithic ser-vice descriptions offer any process model element.

Logic-based semantic service matchmakers per-form deductive reasoning on service semantics. Inorder to define these semantics, logical conceptsand rules are referenced in ontologies as logicalbackground theories. Different ontologies of serviceprovider and requester are matched or aligned ei-ther at design time, or at runtime as part of thelogic-based service matching process.

Non-logic-based semantic service matchmakers donot perform any logic-based reasoning to determinethe degree of a semantic match between a givenpair of service descriptions. Examples of non-logic-based semantic matching techniques are text sim-ilarity measurement, structured XML/RDF graphmatching, and path-length-based similarity of con-cepts. In particular, service matchmakers that donot at least logically verify given semantic relationsbetween ontological concepts used to describe ser-vice semantics classify as non-logic-based.

3

Page 4: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

Fig. 1. Categories of existing SWS matchmakers.

3.2. Logic-based semantic service profile matching

We distinguish between monolithic and struc-tured logic-based semantic profile matching. Inthe first case, the functionality of a Web serviceis exclusively - and agnostic to any parameterizeddescription structure like in OWL-S or WSML -represented by a single logical expression. Semanticcomparison of such monolithic logic-based service(effect) descriptions as a whole simply reduces tostandard first-order (description) logic reasoning,that is service and query (desired service) conceptsubsumption, respectively, satisfiability checkingcompletely within the logic theory.

For example, the logical post-plug-in match of anadvertised service concept S with a service queryconcept R is determined by the entailment of serviceconcept subsumption of S by R over a given knowl-edge base kb extended by the axioms of S and R:kb∪S∪R |= S � R. That is, the matchmaker checksif in each first-order interpretation (possible world)I of kb, the set SI of concrete provider services (ser-vice instances) is contained in the set RI of serviceinstances acceptable to the requester: SI ⊆ RI . Inother words, service S is more specific than the re-quest R, hence considered semantically relevant. A

logical subsumes match assures the requester thather acceptable service instances are also acceptableto the provider: kb ∪ S ∪ R |= R � S [5]. Promi-nent examples of monolithic logic-based matchmak-ers are RACER [13] and MaMaS 4 [14].

Alternatively, structured logic-based profilematching makes additional use of parameterizedservice descriptions provided by most semantic ser-vice description languages such as OWL-S, WSMLand SAWSDL. In this case, logic-based semanticprofile matching is a combination of logical rea-soning within the logic theory of the formal ontol-ogy language used for annotation, and algorithmicprocessing outside the theory. For example, a logic-based plug-in serice (IOPE) profile match requiresto check that certain constraints hold on the typeand quantification of computed (approximated) log-ical implications between preconditions and effect,and logical subsumption between input and out-put concepts (denoted as in : C, resp., out : C) ofservice and query 5 : Service S semantically IOPE-

4 sisinflab.poliba.it/MAMAS-tng/5 Originally, a service S plugs into (plug-in matches with)another service R, if the effect of S is more specific thanthat of R, and vice versa for the preconditions of S and R[21]. Unfortunately, this notion of plug-in match in software

4

Page 5: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

plugs into request R iff precR ⇒ precS ∧ postS ⇒postR (specification plug-in match), and ∀in : C ∈InputS ∃in : C′ ∈ InputR : C � C′ ∧∀out : C ∈OutputR ∃out : C′ ∈ OutputS : C′ � C (signa-ture plug-in match). Alternative for undecidablefirst-order logical entailment checking is the poly-nomial, correct but incomplete polynomial thetasubsumption (a logical consequence relation) andinstance-based query answer set inclusion (querycontainment).

In general, the complexity of computing logic-based semantic relations depends on the ontologylanguage used for semantic annotation. For exam-ple, post-plug-in matching of service concepts inOWL-Full, that is SHOIQ+ (including transitivenon-primitive roles) has been shown to be undecid-able, but decidable in NEXPTIME for OWL-DL,WSML-DL and DL-safe SWRL.

Examples of logic-based OWL-S service match-makers are OWLSM [7] and OWLS-UDDI [16] bothfocussing on service IO-matching. The matchmakerPCEM [2] converts given OWL-S services and re-quest to standard PDDL (Planning Domain Defin-ition Language) actions and then computes logicalimplications between their preconditions and effectsin PROLOG.

4. Hybrid semantic service profile matching

Hybrid semantic service selection performed byour matchmaker OWLS-MX exploits both logic-based reasoning and non-logic-based informationretrieval (IR) techniques for OWL-S service profilesignature matching. That is, OWLS-MX focusseson service I/O-parameter matching and ignores log-ical service specification in terms of preconditionsand effects. Please note that the vast majority of ac-cessible OWL-S services does not possess any suchspecification, nor any composite process model yet.

In the following, we define the hybrid semanticservice filters of OWLS-MX, the generic OWLS-MXselection algorithm and its five variants accordingto five different text similarity metrics used by thematchmaker.

engineering has been adopted quite differently by most logic-based Semantic Web service matchmakers for both mono-lithic and structured service descriptions.

4.1. Matching filters of OWLS-MX

OWLS-MX computes the degree of semanticmatching for a given pair of service advertisementand request by successively applying five differentfilters exact, plug in, subsumes, subsumed-by

and nearest-neighbor. The first three are logicbased only whereas the last two are hybrid due tothe required additional computation of syntacticsimilarity values. Let be– T the terminology (TBox) of the OWLS-MX

matchmaker ontology specified in OWL-DL andCTT the concept subsumption hierarchy of T ;

– LSC(C) the set of least specific concepts (directchildren) C’ of C, i.e. C’ is immediate sub-conceptof C in CTT ;

– LGC(C) the set of least generic concepts (directparents) C’ of C, i.e., C’ is immediate super-concept of C in CTT ;

– in:C ∈ InputS (out:C ∈ OutputS) an input (out-put) concept C of service S defined in T ;

– SynSim(S,R) ∈ [0, 1] real-valued degree of textsimilarity between service S and request R. Thisdegree is computed as the averaged syntactic sim-ilarity of the serialized input, respectively, out-put concepts of S and R according to given textsimilarity metric SynSim(S,R): SynSim(S,R) =(SynSim(S,R)in + SynSim(S,R)out)/2. Serializa-tion of a set of concepts includes the terminolog-ical unfolding of these concepts in T , followed upby the conjunctive concatenation of the result-ing logical expression and its preprocessing to oneweighted keyword vector;

– α ∈ [0, 1] syntactic similarity threshold;– � (≡) terminological concept subsumption

(equivalence) relation.The semantic service matching degrees computed

by OWLS-MX are as follows.

Exact match. Service S exactly matches re-quest R ⇔ ∀ in:C ∈ InputS ∃ in:C’∈ InputR: C≡ C’ ∧ ∀ out:D∈ OutputR ∃ out:D’∈ OutputS: D≡ D outS . The service I/O signature perfectlymatches with the request with respect to logic-based equivalence of their formal semantics.

Plug-in match. Service S plugs into requestR ⇔ ∀ in:C∈ InputS ∃ in:C’∈ InputR: C’sqsubseteq C ∧ ∀ out:D∈ OutputR ∃ out:D’∈OutputS : D’∈ LSC(D). All service input para-meter concepts are matched by a more specific

5

Page 6: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

one in the request R. If the OWL input conceptdefinitions can be mapped to equivalent WSDLinput messages and service signature data types,this constraint guarantees at a minimum thatS is executable with any input provided by therequestor. In addition, S is expected to returnmore specific output data whose logically definedsemantics are exactly the same or very close towhat has been requested.

This kind of match is borrowed from the soft-ware engineering domain, where software compo-nents are considered to plug-in match with eachother as defined above but not restricting theoutput concepts to be direct children of those ofthe query. In particular, the definition of plug-insignature match used for OWLS-MX follows theoriginal notion of software specification plug-inmatch introduced in [21].

Subsumes match. Request R subsumes ser-vice S ⇔ ∀ in:C∈ InputS ∃ in:C’∈ InputR: C’sqsubseteq C ∧ ∀ out:D∈ OutputR ∃ out:D’∈OutputS : D’ sqsubseteq D. This filter is weakerthan the plug-in filter in the sense that the re-turned service output is more specific than re-quested by the user: It relaxes the constraint ofimmediate output concept subsumption to arbi-trary output concept subsumption.

Subsumed-by match. Request R is subsumed

by service S⇔ ∀ in:C∈ InputS ∃ in:C’∈ InputR:C’ sqsubseteq C ∧ ∀ out:D∈ OutputR ∃ out:D’∈OutputS : D’ ≡ D ∨ D’∈ LGC(D). This filter se-lects services whose output data is slightly moregeneral than requested, hence, in this sense, sub-sumes the request. We focus on direct parentoutput concepts to avoid selecting services re-turning data which we think may be too general.Of course, it depends on the individual perspec-tive taken by the user, the application domain,and the granularity of the underlying ontology athand, whether a relaxation of this constraint isappropriate, or not.

Logical Fail. OWLS-MX returns a logic-based se-mantic matching failure degree, iff service S doesnot match with request R according to any of theabove matching filters.

Hybrid subsumed-by match. Request R is sub-

sumed by service S ⇔ ∀ in:C∈ InputS ∃ in:C’∈InputR: C’ sqsubseteq C ∧ ∀ out:D∈ OutputR

∃ out:D’∈ OutputS: (D’ ≡ D ∨ D’∈ LGC(D))∧ SimIR(S, R) ≥ α. This hybrid filter comple-ments logic-based subsumed-by matching withsyntactic matching by means of a selected textsimilarity measurement.

Nearest-neighbor match. Service S is nearest

neighbor of request R ⇔ SimIR(S, R) ≥ α.This matching degree is non-logic-based since itchecks the degree of text similarity between theinput and output concepts of service and request.It is being applied only in case all of the abovelogic-based and hybrid matching filters fail.

Fail. OWLS-MX returns a total semantic match-ing failure degree as a result, iff service S doesnot match with request R according to any of theabove matching degrees.

These service matching degrees are sorted ac-cording to the order of their semantic relevancedegrees as follows: Exact < Plug-In < Subsumes

< Subsumed-By < Logical Fail < Hybrid

subsumed-By < Nearest-neighbor < Fail.A service S that logically matches exactly with

request R is assumed to be more semantically rele-vant to R than a plug-in matching one (Exact <Plug-In). A subsumes match of S with R is con-sidered semantically weaker than a plug-in matchdue to the relaxation of the service output conceptmatching condition (relaxation from direct childconcept to any arbitrary subconcept in the match-maker ontology). In other words, a plug-in matchingservice S is assumed to be semantically closer toR than a subsumes-matching service (Plug-In <Subsumes).

Further, we assume that a semantic service out-put concept which is more general than requestedrelaxes the degree of semantic relevance of thisservice to a query compared to a more specific ser-vice output. In particular, the restriction to directparent concepts in the ontology in the logic-basedsubsumed-by matching condition makes a service Swith a logical subsumes matching degree with focuson more specific, that is direct child concepts inthe ontology, semantically more relevant to a givenrequest R than services with more general outputconcepts like the subsumed-by-matching services(Subsumes < Subsumed-By).

From the perspective of logic-based only seman-tic matching, the complementary text similaritymeasurement by the nearest-neighbour filter is con-

6

Page 7: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

sidered weakest with respect to semantic relevance.This filter will only be performed in case all logic-based filters and the hybrid subsumed-by filterfail. Finally, only if none of the above sequentiallychecked matching degrees hold, the matchmakerreturns a matching failure (Nearest-neighbor <Fail).

4.2. Generic OWLS-MX matching algorithm

The OWLS-MX matchmaker takes any OWL-Sservice as a query, and returns an ordered set of rel-evant services that match the query each of whichis annotated with its individual degree of match-ing, and syntactic similarity value. The user canspecify the desired degree, and syntactic similaritythreshold. OWLS-MX then first classifies the ser-vice request I/O concepts into its local matchmakerontology.

Matchmaker ontology. The matchmaker ontol-ogy emerges from a given initial ontology by classify-ing all service and request I/O concepts into this on-tology each time a service advertisement or requestis being received. We assume that service provider,requester and matchmaker share a basic minimalvocabulary of primitive components together witha set of mapping rules such as synonym relations inthe thesaurus WordNet. Primitive components areterms out of which complex concepts are canonicallydefined in a description logic-based terminology.

Upon receipt of a service, the matchmaker focusesonly on those parts of referenced service ontologiesthat are relevant to understand the semantics ofthe service. For this purpose, it terminologically un-folds each service input and output concept leadingto logical concept expressions that include primi-tive components of a shared basic vocabulary. Eachof these concept expressions is self-contained in thesense that the rest of the referenced service ontologyis not necessary to understand the semantics of theunfolded concept.

Attached to each concept in the matchmakerontology are auxiliary data about which registeredservices are using that concept as an input and/oroutput concept. The respective lists of service iden-tifiers are used by the matchmaker to compute theset of relevant services that are matching with thegiven query.

Hybrid matching. Any failure of logical concept

subsumption produced by the integrated descriptionlogic reasoner of OWLS-MX will be tolerated, if andonly if the degree of syntactic similarity betweenthe respective unfolded service and request conceptexpressions exceeds a given similarity threshold.

The pseudo-code of the generic OWLS-MXmatching process is given below (cf. algorithms 1 -3). Let inputsS = { inS,i|0 ≤ i ≤ s}, inputsR = {inR,j |0 ≤ j ≤ n}, outputsS = { outS,k|0 ≤ k ≤r}, outputsR = { outR,t|0 ≤ t ≤ m}, set of inputand output concepts used in the profile I/O para-meters hasInput and hasOutput of registeredservice S in the set Advertisements, and the servicerequest R, respectively. Attached to each conceptin the matchmaker ontology are auxiliary data thatinforms about which registered service is using thisconcept as an input and/or output concept.

Algorithm 1 Match: Find advertised ser-vices S that best match in a hybrid fash-ion with a given request R; returns set of(S, degreeOfMatch, SIMIR(R, S)) with maximumdegree of match (dom) unequal FAIL (uses algs. 2and 3 to compute dom), and syntactic similarityvalue exceeding a given threshold α.1: function match(Request R, α)2: local result, degreeOfMatch,

hybridF ilters = { subsumed-by,nearest neighbour}

3: for all (S, dom) ∈ candi-

datesinputset(inputsR) ∧ (S, dom′) ∈candidatesoutputset(outputsR) do

4: degreeOfMatch← min(dom, dom′)5: if degreeOfMatch ≥ minDegree ∧ (

degreeOfMatch /∈ hybridF ilters ∨simIR(R,S) ≥ α) then

6: result := result ∪ { (S,degreeOfMatch, simIR(R,S) ) }

7: end if8: end for9: return result

10: end function

In the following section, we present five variants ofthis generic OWLS-MX matchmaking scheme.

4.3. OWLS-MX variants

We implemented different variants of the genericOWLS-MX algorithm, called OWLS-M1 to OWLS-M4, each of which uses the same logic-based se-mantic filters but different IR similarity metric

7

Page 8: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

Algorithm 2 Find services which input matcheswith that of the request; returns set of (S, dom) withminimum degree of match dom unequal FAIL.1: function candidatesinputset(inputsR)2: local H , dom, r3: � If a service input matches with multiple

request inputs the best degree is returned4: H := { (S, inS,i, dom) ∈ ⋃

j=1..n

candidatesinput(inRj ) | dom =argmaxl{ (S, inS,i, doml) |1 ≤ l ≤n, 1 ≤ i ≤ s} }

5: � If all inputs of service S are matched bythose of the request, S can be executed,and the minimum degree of its potentialmatch is returned

6: for all S ∈ Advertisement do7: if { (S, inS1 , dom1), · · ·, (S, inSs , doms)

} ⊆ H then8: r := r∪{ (S, min(dom1, · · ·, doms)) }9: end if

10: end for11: � Services with no input can always be exe-

cuted and are preliminary exact-matchcandidates: servNoIn() = { (S, exact)| S ∈ Advertisements ∧ inputsS = ∅ }

12: � Remaining, unmatched services are atleast nearest neighbour-match can-didates: remServ() = { (S, nearest

neighbour) | S ∈Advertisements∧ 〈S,degreeOfMatch′〉 /∈ r }

13: return r := r ∪ servNoIn() ∪ remServ()14: end function15:

16: function candidatesinput(inR,j) � Classifyrequest input concept into ontology, and usethe auxiliary concept data to collect servicesthat at least plug-in match with respect toits input.

17: local r18: r := r∪ { (S, inS , exact) | S ∈

Advertisements, inS ∈ inputsS, inS.=

inR,j , }19: r := r∪ { (S, inS , plug-in ) | S ∈

Advertisements, inS ∈ inputsS, inS ≥̇inR,j , }

20: return r21: end function

SIMIR(R, S) for content-based service I/O match-ing. The variant OWLS-M0 performs logic basedonly semantic service I/O matching.

Algorithm 3 Find services which output matcheswith that of the request; returns set of (S, dom) withminimum degree of match unequal FAIL.1: function candidatesoutputset(outputsR)2: local r, dom3: if outputsR = ∅ then4: return { (S, exact) | S ∈

Advertisements }5: end if6: for all S ∈ Advertisements do7: if (S, domt) ∈

candidatesoutput(outR,t) ∧ domt

≥ subsumes for t = 1..m then8: r := r ∪ { (S, min{dom1, · · ·,

domm})}9: else if (S, domt) ∈

candidatesoutput(outR,t) ∧ domt

∈ { exact, subsumes } for t = 1..mthen

10: r := r ∪ { (S, subsumed-by }11: end if12: end for13: � Any remaining, unmatched service is a

potential nearest neighbour-match:remServ() = { (S, nearest neigh-

bour) | S ∈ Advertisements ∧ S /∈ r }14: return r := r ∪ remServ()15: end function16:

17: function candidatesoutput(outR,t) � Classifyrequest output concept into ontology, anduse the auxiliary concept data to collect ser-vices with output concepts that match withoutR,t.

18: local r19: r := r ∪ { (S, exact) | outS

.= outR,t }20: r := r ∪ { (S, plug-in) | outS ∈

LSC(outR,t) ∧ S /∈ r }21: r := r ∪ { (S, subsumes) | outS ≤̇ outR,t

∧ S /∈ r }22: r := r ∪ { (S, subsumed-by) | outS ∈

LGC(outR,t) }23: return r24: end function

OWLS-M0. The logic-based semantic filters Ex-

act, Plug-in, Subsumes and Subsumed-By

are applied as defined in section 3.1.

OWLS-M1 to OWLS-M4. The hybrid variantsOWLS-M1, OWLS-M3, and OWLS-M4 also com-pute the syntactic text similarity value SimIR

8

Page 9: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

(outS , outR) by use of the loss-of-informationmeasure, extended Jacquard similarity coeffi-cient, the cosine similarity value, and the Jensen-Shannon information divergence based similarityvalue, respectively.

Based on the experimental results of measuring theperformance of similarity metrics for text informa-tion retrieval provided by Cohen and his colleagues[3], we selected the top performing ones to build theOWLS-MX variants. These symmetric token-basedstring similarity measures are defined as follows.

– The cosine similarity metricSimCos(S, R) = �R·�S

||�R||22·||�S||22with standard TFIDF

term weighting scheme, and the unfolded con-cept expressions of request R and service Sare represented as n-dimensional weighted in-dex term vectors �R and �S respectively. �R · �S =∑n

i=1 wi,R × wi,S , ||X ||2 =√∑n

i w2i,X , and wi,X

denotes the weight of the i-th index term in vec-tor X .

– The extended Jaccard similarity metricSimEJ(S, R) = �R·�S

||�R||22+||�S||22−�R·�S with standardTFIDF term weighting scheme.

– The intensional loss of information based simi-larity metricSimLOI(S, R) = 1− LOIIN (R,S)+LOIOUT (R,S)

2

with LOIx(R, S) = |PCR,x∪PCS,x|−|PCR,x∩PCS,x||PCR,x|+|PCS,x| ,

x ∈ {IN, OUT }, PCR,x and PCS,x set of primi-tive components in unfolded logical input/outputconcept expression of request R and service S.

– The Jensen-Shannon information divergencebased similarity measureSimJS(S, R) = log2− JS(S, R) =

12log2

∑ni=1 h(pi,R) + h(pi,S)− h(pi,R + pi,S)

with probability term frequency weigthingscheme, e.g., pi,R denotes the probability of i-thindex term occurrence in request R, and h(x) =−xlogx.

The extended Jaccard metric is a standard formeasuring the degree of overlap as the ratio of thenumber of shared terms (primitive components)of unfolded concepts of both service and request,and the number of terms possessed by either ofthem. In contrast to the TFIDF/cosine similaritymetric, it does not favor documents with most com-

mon terms. The Jensen-Shannon measure is basedon the information-theoretic, non-symmetricalKullback-Leibler divergence measure. It measuresthe pairwise dissimilarity of conditional probabilityterm distributions between service and request textrather than looking at the whole collection as it isthe case for the TFIDF/cosine, or the extended Jac-card metric. The loss of (intensional) information incase some concept A is terminologically substitutedby concept B, can be measured as the inverse ratioof the number of matching primitive componentswith those which remain unmatched in termino-logically disjoint unfolded concept constraints. Thesymmetric LOI-based similarity value for a givenpair of service and request is then computed analo-gously for all I/O-concept definitions involved.

4.4. Example

Let us illustrate the hybrid service matching withOWLS-MX by means of a simple example. Figure2 shows the concept subsumption hierarchy or tax-onomy of the OWLS-MX matchmaker ontology, theservice request R for physicians of some hospital hthat provide treatment to patient p, and relevantservice advertisements S1 and S2.

Service S1 is considered semantically relevant torequest R, since it returns for any given person p andhospital h, the individual surgeon of h that operatedon p. Likewise, service S2 is relevant to R, since itreturns those emergency physicians who providedemergency treatment to p before her transport tohospital h. Hence, both services S1 and S2 shouldbe returned as matching results to the user.

However, the logic-based only variant OWLS-M0determines S1 as plug-in matching with R but failsto return S2, since the logic-based semantics ofthe output concept siblings ”emergency physician”and ”hospital physician” in the ontology are termi-nologically disjoint. Please note that this conceptdisjointness is not defined in any of the conceptsbut has been computed by the matchmaker indue course of its classifying these concepts into itsmatchmaker ontology. In this example, the set ofterminological constraints of unfolded concepts ccorresponds to the set of primitive components (cp)of which the individual concepts are canonicallydefined in the matchmaker ontology T . Hence, theunfolded concept expressions are as follows.

– unfolded(Patient, T ) = (and Patientp Personp)

9

Page 10: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

Fig. 2. Example of hybrid service matching with OWLS-MX

– unfolded(Hospital, T ) = (and Hospitalp (andMedicalOrganisationp Organisationp))

– unfolded(HospitalPhysician, T ) = (andHospitalPhysicianp (and Physicianp Personp))

– unfolded(Surgeon, T ) = (and Surgeonp (andHospitalPhysicianp (and Physicianp Personp)))

– unfolded(EmergencyPhysician, T ) = (andEmergencyPhysicianp (and Physicianp Personp))

As a result, for example, OWLS-M1 would return S1

as semantically plug-in matching service with syn-tactic similarity value of SimLOI (R, S1) = 0.87.In contrast to OWLS-M0, it also returns S2, sincethis service is nearest-neighbor matching with therequest R: Their implicit semantics exploited by thetext similarity metric LOI (cf. (3), (4)) with SimLOI

(R, S2) =(1− 5−4

5+4 )+(1− 4−23+3 )

2 = 0.78 ≥ α = 0.7 is suf-ficiently similar, that is, it exceeds a threshold givenby the requester.

5. Implementation

We implemented the OWLS-MX matchmakervariants (current version 1.1c) in Java using theOWL-S API 1.1 beta with the tableaux OWL-DL reasoner Pellet developed at university ofMaryland (cf. http://www.mindswap.org). Asthe OWL-S API is tightly coupled with theJena Semantic Web Framework, developed by

the HP Labs Semantic Web research group (cf.http://jena.sourceforge.net/), the latter isalso used to modify the OWLS-MX matchmakerontology. Figure 3 shows a screenshot of the OWLS-MX version 1.1 graphical user interface.

After parsing service advertisements and re-quests, the respective input and output conceptsare analysed and, if necessary, added to the localmatchmaker ontology together with auxiliary dataon their unfolding. As a consequence, the match-maker ontology is dynamically built and growingwith the number of services and underlying ontolo-gies loaded. In addition, the matchmaker ontologyis extended with auxiliary information for eachconcept, for example whether it is used as an in-put or output concept of a service registered at thematchmaker. Service requests are treated similarly,except that they are not stored in the extendedmatchmaker ontology.

For each service request concept, the service iden-tifiers attached to its immediate parent and childconcepts of the enhanced matchmaker ontology areretrieved. The semantic degree of matching for eachservice is then determined by applying the semanticfilters on this set of matching candidates. After thisstep, the syntactic similarity is computed by apply-ing the selected IR similarity metric to the strings ofunfolded concepts of the query and each registeredservice. Both the semantic degree of match and thesyntactic similarity value determine the hybrid de-

10

Page 11: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

Fig. 3. OWLS-MX v1.1c screenshot: OWLS-MX configuration

gree of matching of one service with the request. Ifthis hybrid degree is better than or equal to the min-imum degree specified by the user, then this servicewill be returned as potentially relevant.

OWLS-MX spends the largest amount of timewith classifying the service I/O-concept relatedparts of OWL-DL ontologies used by newly regis-tered services into the matchmaker ontology. Thatis, it classifies new service I/O concepts not yetknown to the matchmaker into its current ontol-ogy (see section 4.2, matchmaker ontology). Forexample, the processing of 582 services of the testcollection OWLS-TC 2.1 takes about ten minutes(on an IBM ThinkPad T41p with 1.7GHz and 2GBRAM) presuming that there are no additional time-outs due to unavailability of service related OWLontologies at remote sites. Once this preprocessinghas been completed, the average query responsetime of OWLS-MX appears reasonable but prob-ably not acceptable in practice (about 10 secs perquery to check over 582 services). There is definitelyspace for improving on that by means of applyingappropriate caching and indexing techniques.

OWLS-MX v1.1c comes with an integrated eval-uation tool for measuring its peformance in termsof precision and recall over a given OWL-S ser-vice retrieval test collection. Alternatively, we de-veloped the SME2 (Semantic MatchMaker Eval-

uation Environment) tool which provides morefunctionality for testing arbitrary Semantic Webservice matchmakers for OWL-S, WSML andSAWSDL; the evaluation tool SME2 is available atprojects.semwebcentral.org/projects/sme2/.

6. Evaluation of Performance

In this section, we provide our experimental re-sults of the retrieval performance of logic-based andhybrid OWLS-MX variants in terms of recall andprecision.

6.1. Service retrieval test collection

For measuring the service I/O retrieval perfor-mance of each OWLS-MX variant, we used theOWL-S service retrieval test collection Owls-

TC v2.2. This collection consists of more than1000 services specified in OWL-S 1.1 in seven ap-plication domains, that are education, medicalcare, food, travel, communication, economy, andweaponry. The majority of these services wereretrieved from public IBM UDDI registries, andsemi-automatically transformed from WSDL toOWL-S. Owls-TC v2.2 provides a set of 28 testqueries each of which is associated with a set of 10

11

Page 12: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

to 20 services that a dozen people subjectively de-fined as relevant according to the standard TRECdefinition of binary relevance [19]. The collectionOwls-TC v2.2 is available as open source atprojects.semwebcentral.org/projects/owls-tc/. Weare working on Owls-TC v2.2g with graded rele-vance sets.

6.2. Overall R/P performance

We adopted the evaluation strategy of macro-averaging the individual precision values over allrequests q ∈ Q for λ recall levels [20] of each OWLS-MX variant over the test collection OWLS-TC 2.1.The matchmaker returns a rank list of all servicesfor evaluation, that is the answer set for evalua-tion purposes is the set S of registered services.For all queries qi ∈ Q, i ∈ {1..n} and recall levelsλj = j/λ ∈ [0, 1], j = 1..λ, we select the precisionPreci(r) value that is maximum (ceiling interpo-lation) for recall Reci(r) ≥ λj at some rank r =1, ..., |S|. Finally, we average these observed preci-sion values to obtain the macro-averaged precisionPrec(λj) (over all queries) at each recall level λj .

In summary, the evaluation results showed thathybrid semantic matching can improve logic-basedonly service selection in terms of both precision andrecall (cf. figure 4). While OWLS-M0 reached preci-sion of 0.67 and recall of 0.50 in average for its top-20ranked services, the hybrid OWLS-M3 achieved thatwith a higher precision (0.74) and recall (0.557).

The reason of higher precision of the hybrid vari-ants OWLS-M1 to OWLS-M4 is that they avoidedmost of logic-based false positives in case of logicalsubsumed-by matches in the given test collectionby complementary syntactic similarity measure-ment. In addition, the hybrid semantic matchmak-ers avoided logic-based false negatives caused bywrongly returned matching degree of logical failthrough complementary syntactic similarity mea-surements (nearest-neighbour match) which led toa better recall. All hybrid variants showed almostequal performance in average. In the following, weshow the main cases of logic-based and hybrid falsepositives and false negatives lowering precision,respectively, recall of OWLS-MX.

The quantitative impact of the above mentionedfalse positive and false negative cases on the match-maker performance, of course, depends on the usedtest collection. In this respect, our experimentalevaluation results are preliminary as long as there

Fig. 4. R/P performance of logic-based OWLS-M0 vs. syn-tactic matching (Cosine/TFIDF, threshold 0.6) vs. hybridmatching with OWLS-M3.

is no (quasi-)standard collection for semantic ser-vice retrieval available, similar to TREC in the IRdomain.

7. Anaylsis of false positives and negatives

In this section, we analyze the retrieval perfor-mance of OWLS-MX in terms of false positive andfalse negatives to reveal the benefits and pitfalls ofits logic-based and hybrid semantic matching filters.

7.1. Logic-based false positives

There are two main reasons for logic-based falsepositives of OWLS-MX: First, in the context ofservice matching, the known logical mismatch prob-lem of knowledge representation is manifested byinappropriate logical definitions of input or outputconcepts used to define the semantics of I/O con-cepts of services in the logic-based matchmaker on-tology. Second, the all-quantified logical matchingconstraints wrongly tolerates the missing of inputor output concepts. These types of logic-based falsepositives of OWLS-M0 are illustrated by example

12

Page 13: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

in the following.

Granularity of matchmaker ontology. Anylogic-based semantic service matchmaker risks toreturn false positives if the given logical conceptdefinitions in its ontology are not capturing thereal-world semantics of the concepts used to definedthe service semantics. This kind of logical mis-match is a general problem of symbolic knowledgerepresentation. In the context of semantic servicematching, the decision whether some service is afalse positive for a given query is subjective for eachindividual user. The same holds for the definitionof relevance sets for each query in the test collec-tion OWLS-TC2.2 we used to evaluate the retrievalperformance of OWLS-MX.

Fig. 5. Example: False positive due to tolerated unlimitedlogical parent-child relation between input concepts.

For example, in figure 5, the service at best log-ically plug-in matches with the query, since the(equally named) output concepts ”price” are de-termined logically equivalent, and the query inputconcept ”HybridRotaryEnginePoweredCar” is farmore specific than the service input concept ”Au-tomobile”. According to the developers of the testcollection, the real-world semantic distance betweenboth input concepts in the matchmaker ontologycan be considered too large for being of any inter-est which renders the service irrelevant. The reasonwhy all logic-based matching filters of OWLS-M0fail to reckognize this, hence return the service asrelevant, is that they accept an unlimited inputconcept distance in the matchmaker ontology.

Similarly, in the second example (cf. figure 6) thelogical comparison of service and query output defi-nitions result in a direct subsumption relation in thematchmaker ontology which could be (subjectively)considered wrong, hence produce a false positive.

Fig. 6. Example: False positive due to logical mismatch ofoutput concepts with direct parent-child relation.

In this case even a restrictive least generic conceptmatch of the logical subsumed-by filter of OWLS-M0 does not help to avoid this. However, in bothcases the additional syntactic matching of the hy-brid subsumed-by matching filter of OWLS-M1 toOWLS-M4 can potentially avoid logical subsumed-by matches that are classified as false positives.This holds under the IR assumption that the degreeof syntactic similarity sufficiently corresponds withthe degree of real-world semantic similarity.

All-quantified logical matching constraints.Many false positives of OWLS-M0 are caused bythe restrictive all-quantified logical matching con-straints. Since the hybrid variants inherit the deci-sion of OWLS-M0 in case of a logical match, thesebecome false positives of OWLS-M1 to OWLS-M4too.

Query input without corresponding service input.The surjective mapping of service input conceptsto query input concepts (∀ inS ∃ inR) can lead tofalse positives: It tolerates the missing of service in-put concepts that correspond to those query inputconcepts that are important part of or even key fordefining the intended query semantics. For exam-ple, in figure 7, the input ”SFNovel” of the query”SFNovelPrice” does not match with any input ofthe service ”EntranceFee” but ”Author” with ”Per-son”. As a result, OWLS-M0 determines a plug-inmatch, hence wrongly returns the service as rele-vant.

Even worse, the surjective concept mapping byOWLS-MX can lead to false logical exact matchesin case of no input or output concepts provided. Forexample, in figure 8, the query ”BuyBook” and ser-vice ”DatingService” are returned as semantically

13

Page 14: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

Fig. 7. Example: False positive due to all-quantified match-ing. Incomplete coverage of query input by service input istolerated.

equivalent by OWLS-M0. The reason is that in thisparticular case there does not exist any query out-put concept which matches with the existing ser-vice output concept. Similarly, the same holds forthe query ”RoutingService” and the service ”Map”without any input concept to match which makesthe service-centred input matching constraints of alllogical filters of OWLS-MX true by default.

Fig. 8. Example: False positives due to tolerated lack ofservice or query I/O.

Same I/O concepts used to express differentquery and service semantics. Besides, this surjec-tive matching of I/O concepts by all logical filtersof OWLS-MX ignores the possible use of the sameconcept to describe different real-world semanticsof a query or a service. For example, the real-worldsemantics of service ”BookCopyCheck” and query”BookReview” in figure 9 are assumed to be notrelated at all, that is the service is not relevant tothe query.

However, OWLS-M0 classifies the service as se-mantically equivalent with the query, hence pro-duces a false positive. The same concept ”Book” isused twice in the service input but with obviouslydifferent real-world semantics than in the query. Inthese cases, even syntactic similarity measurementwould return a high relevance degree but at leastnot identity between service and query, since theterm unfolded concept ”Book” can be detected asa surplus term of the input string by means of fine-grained syntactic overlap measurement like the ex-tended Jaccard coefficient.

Fig. 9. Example: False positive caused by using the sameconcept ”book” for describing different service and querysemantics

7.2. Avoiding logic-based false positives

The hybrid variants OWLS-M1 to OWLS-M4 canincrease their precision compared to OWLS-M0 byavoiding its false logic-based subsumed-by matchesthrough additional syntactic similarity measure-ment. In fact, the hybrid subsumed-by filter allowsto detect the irrelevance of a service S that logicallysubsumes the query R with insufficient syntacticsimilarity value (SynSim(S,R)≤ α).

Hybrid false positives. However, due to se-quential execution of ordered logic-based and hy-brid matching filters, the hybrid filters inherit theremaining logic-based false positives from OWLS-M0. That can be avoided by complementary syn-tactic matching for all logic-based filters (exceptthe exact match) which led to the development ofOWLS-MX2 (cf. section 8).

Syntactic false positives only. On the otherhand, the complementary syntactic matching canalso cause hybrid false positives in case of sufficientsyntactic similarity but non-matching real-worldsemantics between service and request which wouldbe correctly determined by OWLS-M0 (logicalmatching failure). For example, logical connectiveslike ”and”, ”or” in the logically unfolded conceptexpressions are ignored by syntactic matching, sincethey are eliminated as classical stop-words in thepreprocessing step of unfolded service and query I/Oconcept expressions to weighted keyword vectorsfor text similarity measurements (SynSim(S,R)out;SynSim(S,R)in).

For example, in figure 10 we are asking for a ser-vice that is capable of either colouring or framinga given picture, and consider a service that is re-stricted to jointly perform both actions irrelevant.In this case, the logic-based OWLS-M0 correctly re-turns a logical matching failure while OWLS-M1 toOWLS-M4 ignore the subtle difference between the

14

Page 15: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

use of the terms ”and” and ”or” in the concept ex-pressions, hence detect a syntactic exact match ofboth pairs of I/O strings and return the service asrelevant with a hybrid nearest-neighbour matchingdegree (text similarity value of 1.0).

Fig. 10. Example: Hybrid false positives due to ignorance oflogical connectives by complementary syntactic matching.

7.3. Logic-based false negatives

Like for logic-based false positives, the reasonsof logic-based false negatives are mainly due tothe logical mismatch problem of the matchmakerontology and the implication of the all-quantifiedmatching constraints of OWLS-MX.

Ontology granularity: Similar concept siblings withlogical disjoint definitions. The problem of logicalmismatches due to insufficient ontology modelingcan also cause false negatives, that are serviceswrongly classified as irrelevant by OWLS-M0. Oneexample of such logic-based only false negatives isthe case of logically disjoint concept siblings withsimilar real-world semantics in a fine-grained on-tology. Please note that these concepts are notexplicitly defined disjoint in the ontology but de-termined to be disjoint by the matchmaker whilematching the service with the query. For example,in figure 11, the query output ”Hopital-Physician”and service output ”Emergency-Physician” are as-sumed to be semantically close such that the serviceis considered relevant to the query. However, thematchmaker classifies both conjunctive concept de-finitions differing in only one pair of their (equallyweighted) logical constraints as logically disjoint,hence produces a false negative.

All-quantified logical matching constraints: Moregeneric service input only. The all-quantified match-ing filters of OWLS-M0 require that the service

Fig. 11. Example: False negative caused by semantically sim-ilar but (not defined as) logically disjoint concept siblings inthe matchmaker ontology.

input must be logically more generic than or equalto the query input. In case of a linear mapping ofservice and query I/O concepts to correspondingXMLS signature data types on the service ground-ing level, this guarantees that the WSDL servicecan be invoked with the information specified in thequery by the user, in principle. However, this canlead to false negatives as shown in figure 12, sinceno logical filter of OWLS-M0 evaluates to true incases where the service input is more specific thanrequested.

Fig. 12. Example: False negative due to required genericityof service input.

All-quantified logical matching constraints: Logicalconcept relations of same type. Finally, the match-ing filters of OWLS-M0 require each pair of serviceand query I/O concepts having the same type oflogical subsumption relation. For example, in figure13, the logical subsumption relations between out-put concepts of query ”CarPlusBike” and service

15

Page 16: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

”4WheeledCarPackage” are different. As a result,OWLS-M0 fails to detect the service as relevant.

Fig. 13. Example: False negative due to different conceptsubsumption relations not accepted by the all-quantified fil-ter constraints of OWLS-MX

7.4. Avoiding logic-based false negatives

Due to complementary syntactic matching in caseof logical matching failure returned by OWLS-M0,the hybrid variants OWLS-M1 to OWLS-M4 canavoid the above cases of logic-based false negatives,thereby increasing their recall compared to OWLS-M0. This is achieved by detecting (hybrid) nearest-neighbour matches, if the degree of syntactic sim-ilarity between the considered pairs of concepts orservice and query I/O-signature as a whole is suffi-cient.

8. OWLS-MX2

The version OWLS-MX2 integrates syntac-tic similarity-based matching with logic-basedsubsumes and plug-in matching like the hybridsubsumed-by filter in OWLS-MX. That avoidsfalse-positives the hybrid OWLS-M1 to OWLS-M4inherit from OWLS-M0. Our experiments over theOWLS-TC 2.2 that contains cases for all of theabove mentioned false positives and false negativesshowed that OWLS-MX2 did outperform OWLS-MX for this reason, and performed slightly betterthan text IR by avoiding syntactic similarity-basedonly false positives.

Since the number of cases for false positives oftext IR in the collection OWLS-TC 2.2 is still signif-icantly less than those for logic-based false positives,the hybrid OWLS-M3 did not outperform syntac-tic matching only. In fact, we observed that in most

Fig. 14. R/P performance of hybrid OWLS-MX2, OWL-S-M3, and IR metric cosine/TFIDF.

domain ontologies fine-grained logical concept def-initions other than pure subclass relations are rarein practice which still handicaps logic-based only se-manric service matching in practice. However, thequantitative relation between logic-based and textIR only false positives and false negatives in the Se-mantic Web is unknown.

9. Related Work

There are only a few other hybrid semantic servicematchmakers available for OWL-S service profiles.We discuss each of them in very brief only (see alsofigure 1 in section 3), and refer to [10] for a coverageof SWS matchmaking in general.

Our OWLS-MX matchmaker is strongly inspiredby the hybrid matchmaker LARKS [17]. However,LARKS differs from OWLS-MX in several aspects:LARKS performs IOPE matching of service pro-files written in a proprietary capability descriptionlanguage with a description logic different fromOWL-DL. Besides, LARKS does not offer logicalsubsumes nor subsumed-by nor hybrid nearest-neighbour matching, and has never been experi-mentally evaluated.

The logic-based variant OWLS-M0 of OWLS-MXis similar to the prominent OWLS-UDDI match-maker [18] but is different with respect to the follow-ing issues: OWLS-UDDI makes use of a different no-tion of plug-in matching and does not perform addi-tional subsumed-by matching. Further, OWLS-M0

16

Page 17: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

allows to use arbitrary rather than known servicequery concepts into its local matchmaker ontology,and is not integrated with the UDDI registry stan-dard for Web service discovery.

The hybrid semantic OWL-S service profilematchmaker iMatcher [9] uses multiple edit- ortoken-based text similarity metrics (Bi-Gram, Lev-enshtein, Monge-Elkan and Jaro similarity mea-sures) to determine the degree of semantic matchingbetween a given pair of OWL-S service profiles. LikeOWLS-MX, the iMatcher transforms each struc-tured service profile description into a weightedkeyword vector that includes not only the namesbut terms derived by means of logic-based unfold-ing of its service input and output concepts. Inthis sense, iMatcher classifies as a hybrid match-maker. However, it does not perform logic-basedmatching which resulted in lower precision and re-call compared to OWLS-MX. In its adaptive modeiMatcher2 learns (over a test collection like OWLS-TC2.2) which of its ten text similarity measures toselect best for a given query. It has been experimen-tally shown that the combined logical deductionand regression-based learning of text similarities ofiMatcher2 is superior to logic-based only matching;iMatcher2 did outperform OWLS-MX in terms ofprecision.

The hybrid semantic service matchmaker FC-MATCH [1] performs a combined logic-basedand text similarity-based matching of mono-lithic service and query concepts written inOWL-DL. In this approach, a service conceptS is defined as logical conjunction of existen-tial qualified role expressions where each rolecorresponds to a selected profile parameter: S= ∃hasCategory(C1) � ∃hasOperation(C2) �∃hasInput(C3) � ∃hasOutput(C4)). Unlike mono-lithic logic-based service matching, FC-MATCHdetermines hybrid matching degrees by means oflogic-based subsumption of their profile parameterconcepts (Ci) together with computing the so-calledDice (name affinity) similarity coefficient betweenterms occuring in these concepts according to giventerminological relationships of the thesaurus Word-Net. However, to the best of our knowledge, FC-MATCH has not been experimentally evaluated yet.

[12] presents an approach to hybrid matching ofmonolithic logic-based semantic service descriptionsin OWL-DL extended with pricing policies (modeledin DL-safe SWRL rules) according to given prefer-ences by means of SPARQL queries to a given ser-vice repository.

10. Conclusions

The presented approach to hybrid semantic Webservice matching, called OWLS-MX, utilizes bothlogic based reasoning and non-logic based IR tech-niques for semantic Web services in OWL-S. Exper-imental evaluation results provide strong evidencein favor of the proposition that the performanceof logic-based matchmaking can be considerablyimproved by incorporating non-logic based infor-mation retrieval techniques into the matching algo-rithms.

The hybrid matchmaker OWLS-MX has beensuccessfully used in two fielded mobile e-healthsysems for emergency medical assistance and repa-triation planning, namely the Health-SCALLOPSsystem (www.dfki.de/scallops) and the CASCOMsystem (www.ist-cascom.org).

References

[1] D. Bianchini, V. D. Antonellis, M. Melchiori, D. Salvi,Semantic-enriched service discovery, in: Proceedings ofIEEE ICDE 2nd International Workshop on Challengesin Web Information Retrieval and Integration (WIRI06),Atlanta, Georgia, USA, 2006.

[2] L. Botelho, A. Fernandez, M. Klusch, L. Pereira,T. Santos, P. Pais, M. Vasirani, Service discovery., in:M. Schumacher, H. Helin (Eds.): CASCOM - IntelligentService Coordination in the Semantic Web. Chapter 10.Birkh”auser Verlag, Springer, 2008.

[3] W. Cohen, P. Ravikumar, S. Fienberg, A comparisonof string distance metrics for name-matching tasks, in:Proc. IJCAI-03 Workshop on Information Integration onthe Web (IIWeb-03), DBLP at http://dblp.uni-trier.de,2003.

[4] D. Fensel, F. van Harmelen, Unifying reasoning and

search to web scale., in: IEEE Internet Computing,March/April 2007.

[5] S. Grimm, Discovery - identifying relevant services., in:Semantic Web Services. Concepts, Technologies, andApplications. Springer, 2007.

[6] I. Horrocks, P. Patel-Schneider, F. van Harmelen, Fromshiq and rdf to owl: The making of a web ontologylanguage, Web Semantics, 1(1), Elsevier.

[7] M. Jaeger, G. Rojec-Goldmann, C. Liebetruth,G. M”uhl, K. Geihs, Ranked matching for servicedescriptions using owl-s., in: Proceedings of 14. GI/VDEFachtagung Kommunikation in Verteilten SystemenKiVS, Kaiserslautern, 2005.

[8] F. Kaufer, M. Klusch, Wsmo-mx: A logic programmingbased hybrid service matchmaker., in: Proceedings ofthe 4th IEEE European Conference on Web Services(ECOWS 2006), IEEE CS Press, Zurich, Switzerland,2006.

17

Page 18: OWLS-MX:AHybridSemanticWebServiceMatchmaker forOWL …klusch/i2s/owlsmx-08-final.pdfhybrid matching filters of OWLS-MX led to an even further improved matchmaker version called OWLS-MX2

[9] C. Kiefer, A. Bernstein, The creation and evaluation ofisparql strategies for matchmaking., in: Proceedings ofEuropean Semantic Web Conference, Springer, 2008.

[10] M. Klusch, Semantic web service coordination., in: InM. Schumacher, H. Helin (Eds.): CASCOM - IntelligentService Coordination in the Semantic Web. Chapter 4.

Birkh”auser Verlag, Springer, 2008.[11] M. Klusch, Z. Xing, Deployed semantic services for

the common user of the web: A reality check., in:Proceedings of the 2nd IEEE International Conferenceon Semantic Computing (ICSC), Santa Clara, USA,IEEE Press, 2008.

[12] S. Lamparter, A. Ankolekar, Automated selection ofconfigurable web services., in: 8. Internationale TagungWirtschaftsinformatik. Universitaetsverlag Karlsruhe,Karlsruhe, Germany, March 2007.

[13] L. Li, I. Horrocks, A software framework formatchmaking based on semantic web technology, in:Proceedings of the Twelfth International Conference onWorld Wide Web, ACM Press, 2003.

[14] T. D. Noia, E. Sciascio, F. Donini, M. Mogiello, Asystem for principled matchmaking in an electronicmarketplace., in: Electronic Commerce, 2004.

[15] OWL-S, Semantic markupfor web services; w3c member submission 22 november2004, http://www.w3.org/Submission/2004/SUBM-OWL-S-20041122/.

[16] M. Paolucci, T. Kawamura, T. Payne, K. Sycara,Semantic matching of web services capabilities.,in: Proceedings of 1st International Semantic WebConference (ISWC), 2002.

[17] K. Sycara, M. Klusch, S. Widoff, J. Lu, Larks: Dynamicmatchmaking among heterogeneous software agentsin cyberspace, Autonomous Agents and Multi-AgentSystems, 5(2), Kluwer.

[18] K. Sycara, M. Paolucci, A. Anolekar, N. Srinivasan,Automated discovery, interaction and composition ofsemantic web services, Web Semantics, 1(1), Elsevier.

[19] TREC, Text retrieval conference,http://trec.nist.gov/data/.

[20] C. van Rijsbergen, Information Retrieval, 1979.[21] A. M. Zaremski, J. M. Wing, Specification matching

of software components, in: 3rd ACM SIGSOFTSymposium on the Foundations of Software Engineering,1995.

18