This article was downloaded by: [University of California Santa Cruz] On: 10 October 2014, At: 19:24 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK International Journal of Geographical Information Science Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tgis20 Towards logic-based geospatial feature discovery and integration using web feature service and geospatial semantic web Chuanrong Zhang a , Tian Zhao b , Weidong Li c a & Jeffrey P. Osleeb a a Department of Geography and Center for Environmental Sciences and Engineering , University of Connecticut , Storrs, CT, USA b Department of Computer Science , University of Wisconsin–Milwaukee , Milwaukee, WI, USA c College of Resources and Environment, Huazhong Agricultural University , Wuhan, Hubei, China Published online: 15 Mar 2010. To cite this article: Chuanrong Zhang , Tian Zhao , Weidong Li & Jeffrey P. Osleeb (2010) Towards logic-based geospatial feature discovery and integration using web feature service and geospatial semantic web, International Journal of Geographical Information Science, 24:6, 903-923, DOI: 10.1080/13658810903240687 To link to this article: http://dx.doi.org/10.1080/13658810903240687 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. 
The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.


This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Towards logic-based geospatial feature discovery and integration using web feature service and geospatial semantic web

Chuanrong Zhang a*, Tian Zhao b, Weidong Li c,a and Jeffrey P. Osleeb a

a Department of Geography and Center for Environmental Sciences and Engineering, University of Connecticut, Storrs, CT, USA; b Department of Computer Science, University of Wisconsin–Milwaukee, Milwaukee, WI, USA; c College of Resources and Environment, Huazhong Agricultural University, Wuhan, Hubei, China

(Received 21 July 2008; final version received 30 July 2009)

Open geospatial consortium (OGC) web feature services (WFSs) facilitate feature-level spatial data sharing over the web. However, OGC WFSs only emphasize technical data interoperability via standard interfaces and cannot resolve semantic heterogeneity problems in spatial data sharing. The lack of explicit semantics in the OGC WFS description proves to be a major limitation to automatic geospatial feature discovery and WFS composition. To overcome these limitations, this study proposed a solution for searching, discovering, and composing semantically heterogeneous transportation spatial data at feature level from different sources over the web through providing semantic specifications of WFSs. Geospatial semantic web technologies such as description logic, description logic-based reasoner, inference rules, and web ontology language ontologies were used to support geospatial feature data interoperability at the semantic level. Algorithms for automatic geospatial feature discovery and WFS composition were developed in this article.

Keywords: WFS; OWL; DL; geospatial semantic web; transportation

1. Introduction

Although open geospatial consortium (OGC) web feature service (WFS) technologies have undoubtedly improved the sharing and synchronization of feature-level geospatial information across diverse resources, recent literature shows that there are limitations to the current implementation of OGC WFSs.

First, the implemented OGC WFSs only emphasize technical data interoperability via standard interfaces and cannot resolve semantic heterogeneity problems in spatial data sharing (Lutz and Klien 2006, OGC 2006). Differences in the semantics used in different data sources are one of the major problems in spatial data sharing and data interoperability (Bishr 1998). However, the OGC WFS description only allows for the specification of the syntax of basic service contents such as operation metadata, FeatureType list, and filter capabilities, and it provides no semantic descriptions of the meaning of these contents (OGC 2006). Two identical XML descriptions may mean very different things depending on the context of their uses. In addition, the WFS specification of the outputs of each call to the service similarly lacks semantic definitions. All defined search operations return results using the same data structure, regardless of what information is requested. For example, a Road feature may contain a field highway, which is used to describe highways, and a field small-road, which is used to describe small roads. Even if the type of road specified in a Road feature file was clearly identified in a Type field by the interface designer, the OGC WFS description provides no uniform way of enabling such interpretations. It is up to the WFS client to recognize the values in these fields, which indicate whether it is a highway or a small road.

International Journal of Geographical Information Science, Vol. 24, No. 6, June 2010, 903–923. ISSN 1365-8816 print/ISSN 1362-3087 online. © 2010 Taylor & Francis. DOI: 10.1080/13658810903240687. http://www.informaworld.com

*Corresponding author. Email: [email protected]

Second, without a formal semantic description of WFSs, it is difficult for users and applications to perform an intelligent content-based search, and users cannot automatically compose and synthesize the discovered WFSs without additional human assistance or programming (OGC 2006). The lack of a formal and explicit representation of the semantics in the XML-based standard OGC WFS description proves to be a major limitation to achieving semantic interoperability (Kuhn 2005). It is unrealistic to expect advertisements and requests of geospatial features to be equivalent, or even that there exists a WFS that can fulfill exactly the needs of the requester. For example, a WFS may advertise itself as a freeway-feature data provider, whereas a user may request a highway feature from the data provider, even though the freeway feature and the highway feature here refer to the same things. Thus, to make geospatial features more practically searchable and ubiquitously available, we need a semantic-based approach such that applications can use a WFS capability to a level of detail that permits automatic discovery and composition of geospatial features.

To overcome the aforementioned problems and facilitate automatic geospatial feature discovery and integration, this study aims to extend the existing OGC WFSs with geospatial semantic web technologies. It examines the use of technologies such as a description logic (DL) reasoner, inference rules, and web ontology language (OWL) ontologies for OGC WFS descriptions to enable disparate geographic information systems (GISs) to share and integrate geospatial features at the semantic level. Systems built on these technologies should thus be able to search and access geospatial features automatically through knowledge-base reasoning. We propose a framework for feature-level geospatial data sharing. Algorithms for automatic geospatial feature discovery and WFS composition were developed for the proposed framework.

1. A framework for feature-level geospatial data sharing for transportation network data

Figure 1 illustrates the framework of feature-level geospatial data sharing for transportation network data. We use OGC WFSs to publish feature-level data through the web from semantically heterogeneous databases. The OGC WFSs are connected to OWL ontologies. We map the OGC WFS descriptions to OWL ontologies to provide a semantically based view of the services, which spans from abstract descriptions of the capabilities of the services to the actual feature data contents that are exchanged with other services. Because OWL is based on DL, we use a DL-based reasoner and inference rules to collect a knowledge base for the automatic geospatial feature-matching engine. We develop an extended DL formalism for spatial relation reasoning. The OWL ontology in the proposed framework has a Feature class at the top, and the Feature class includes both a spatial feature class and a non-spatial feature class. The spatial feature class contains data with point and line geometry such as Stop and Link, and the non-spatial feature class contains data such as Route, LinkSequence, and Patterns, which are closely related to the spatial features. For example, each instance of the Route class contains several instances of Link that make up the route through the property route link. In the case that the relation between two ontology classes cannot be directly inferred in the ontology, we first convert the OWL ontologies into Resource Description Framework (RDF) triples and load them into the knowledge base. We then apply inference rules to the facts in the knowledge base to determine the relations.

The matching engine in the proposed framework includes geospatial feature discovery algorithms and a WFS composition algorithm. The geospatial feature discovery algorithms match a request, described with concepts in ontologies, against the atomic WFSs provided. Note that the provided WFSs are also described with concepts in ontologies. However, a single atomic WFS may not be enough to precisely fulfill the user's query, and two or more WFSs may be needed to synthesize the required service. The WFS composition algorithm is developed for this purpose. It allows one to create a workflow of WFSs by splitting and joining the available WFS choices. To increase the efficiency of complex geospatial feature discovery, the algorithms in the framework adopt an index strategy that allows rapidly finding the provided feature data that match the request.

The major advantage of this proposed framework is that the OGC WFSs are enhanced semantically using the ontology language OWL. It not only allows technical data interoperability via standard interfaces but also resolves semantic heterogeneity problems in feature-level spatial data sharing. With the reasoning capability and computer-interpretable semantic mark-up in the proposed framework, geospatial feature matching is not restricted to simple string comparison; more complex semantic matching based on subsumption relationships can be performed. This framework makes automatic geospatial feature discovery, composition, and synthesis possible.

In the following sections, we introduce the primary technologies applied in the framework in detail. These include enhanced semantic WFSs using OWL description, deduction knowledge base using DL and inference rules, geospatial feature discovery, and WFS composition algorithms.

Figure 1. A framework of real-time feature-level geospatial data sharing for transportation network data. [Diagram labels: requesters; providers; search engine; matching engine (discovery algorithm, composition algorithm); knowledge base; DL-based reasoner and inference rules; ontology (Feature, spatial feature, non-spatial feature, TransitStop, TransitRoute, Pattern, Link); Route WFS (Route in shapefile); BusStop WFS (BusStop in Oracle); Pattern WFS (Pattern in PostGIS); web connections between components.]

2. Enhanced semantic web feature services using logic-based OWL description

In the proposed framework, we map OGC WFS descriptions to OWL ontologies to provide richer semantic specifications. We focus on specifying semantic descriptions for three operations: GetCapabilities, DescribeFeatureType, and GetFeature. We specify semantics for each FeatureType and FeatureProperty in the defined GetCapabilities, DescribeFeatureType, and GetFeature operations. The semantics of FeatureTypes and FeatureProperties in WFSs are mapped to disjunctions or conjunctions of (possibly negated) concepts in OWL ontologies. Because the definitions of the semantic concepts in the enhanced semantic WFSs are available at a referenced uniform resource identifier (URI) ontology database on the web, the WFS provider and the client have a means of sharing terms. As a result, by taking an OWL description of the WFSs, a WFS client can distinguish and properly interpret all FeatureTypes and FeatureProperties in GetCapabilities, DescribeFeatureType, and GetFeature operations.

The other contribution of the enhanced semantic WFSs is the ability to compose WFSs automatically. For example, the user may want to compose geospatial feature data such as Ferry Road, Foot Track Road, Transit Route, and Air Route, located in separate WFS servers, for Waukesha County. Because of the semantic heterogeneity of FeatureTypes and FeatureProperties, the user currently must select these WFSs first, then interpret the FeatureTypes and FeatureProperties, and create the interoperable FeatureTypes and FeatureProperties. Then, the user must manually compose these feature data. With the enhanced semantic WFSs, the information necessary to select and compose geospatial features will be encoded at the WFS sites. WFS client software can be written to manipulate these representations to achieve the request automatically by specifying data flow interactions.

Because OWL is based on DL, we can use a DL reasoner to compare (semantically) descriptions written in OWL and automatically reap the wealth of semantic information in OWL ontologies that describe relations between ontological concepts. In the following section, we give a general discussion on DLs in more detail and look at how a DL reasoner can help us find the knowledge needed to match the requested geospatial features.

3. Knowledge reasoning based on description logic and inference rules

To facilitate querying the enhanced semantic WFSs, the proposed framework requires the building of a knowledge base for an automatic geospatial feature-matching engine that uses a DL-based reasoner and inference rules.

One of the characteristics of DL is that it enables systems built on it to infer implicitly represented knowledge from the knowledge that is explicitly contained in the knowledge base. To infer implicitly represented knowledge, the following two axioms are often applied:

Terminological Axioms If $C$, $D$ are concepts and $R$, $S$ are roles, then $C \sqsubseteq D$ ($R \sqsubseteq S$) is called an inclusion axiom, which means that concept $C$ (role $R$) is more specific than concept $D$ (role $S$). Also, $C \equiv D$ ($R \equiv S$) is called an equivalence axiom, which means $C$ and $D$ ($R$ and $S$) are equivalent and is an abbreviation of the pair of axioms $C \sqsubseteq D$ and $C \sqsupseteq D$ ($R \sqsubseteq S$ and $R \sqsupseteq S$).

The notation of DL (Baader et al. 2003) used to describe the knowledge base can be directly translated to OWL-DL syntax. Assuming that knowledge about a transit system is to be defined, we give an example to show how to use DL to express the knowledge base for the automatic geospatial feature-matching engine.

Example The knowledge ‘A TimePoint is a TransitStop which is associated with at least one scheduled time’ is defined by the following concept inclusion axiom:

$$\mathit{TimePoint} \sqsubseteq \mathit{TransitStop} \sqcap \exists\, \mathit{associatedWith}.\mathit{Time}$$

where TransitStop and Time are either primitive or defined concepts and associatedWith is a role. The axiom implicitly defines TimePoint as a subconcept of TransitStop.

Using the OWL/RDF syntax, the above knowledge would be written as

<owl:Class rdf:about="#TimePoint">
  <rdfs:subClassOf>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="#TransitStop"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#associatedWith"/>
          <owl:onClass rdf:resource="#Time"/>
          <owl:minQualifiedCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minQualifiedCardinality>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </rdfs:subClassOf>
</owl:Class>

Defined concepts (‘if and only if’) can be added to the knowledge base and exploited to automatically enrich the given basic annotations. Thus, we can define our own required concept, such as ‘Public_route_crossing_a_river’ as a ‘route which is public and crosses a river’, with a TBox axiom:

$$\mathit{Public\_route\_crossing\_a\_river} \equiv \mathit{route} \sqcap \mathit{public} \sqcap \exists\, \mathit{cross}.\mathit{river}$$

In addition, we may want to retrieve the instances of these concepts. This means that such instances must be recognized automatically, and this is what ontology-based query answering is all about. Inference is required to obtain these instances, as there are no told instances of Public_route_crossing_a_river. Definitions such as Public_route_crossing_a_river refer to both spatial and non-spatial aspect descriptions. The qualitative spatial description cross is of great importance for the concept definition of Public_route_crossing_a_river. To provide means for dealing with spatial descriptions used in OGC WFS operator descriptions, such as equals, disjoint, intersects, touches, contains, and crosses, we develop the extended DL formalism to reason about spatial relations. The reasoning tasks depend on the extended DL formalism for representing spatial knowledge. Spatial reasoning can be done by deriving spatial relationships from given knowledge such as existing transit network maps. Given, for instance, the spatial descriptions ‘Bus stop A touches Route B’ and ‘Route B is located in Waukesha’, software programs can automatically derive that bus stop A is also located in Waukesha. The popular and well-known set of RCC8 relations for regions (polygons) is adopted in the extended DL formalism (Randell et al. 1992). Because most transit network data involve spatial relations between points and lines, we extended the polygon RCC8 relations to points and lines as shown in Figure 2. There are three relations between line and line: equal (EQ), disconnected (DC), and cross (CR); two relations between point and line: un-touch (UT) and touch (TO); three relations between point and polygon: inside (IN), tangential part (TP), and outside (OUT); and four relations between line and polygon: inside (IN), tangential part (TP), cross (CR), and disconnected (DC). Using these base spatial relations, indefinite spatial knowledge can be expressed as a union of different possible base spatial relations.
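The relation sets listed above can be written down as a small lookup table. The following is only a minimal sketch, not the authors' formalism: the names `Rel`, `BASE_RELATIONS`, `indefinite`, and `derive_point_in_polygon` are hypothetical, and the single derivation rule encodes just the bus stop example from the text (a point that touches a line lying inside a polygon is itself inside the polygon).

```python
from enum import Enum


class Rel(Enum):
    """Base spatial relations named in the text (abbreviations as in Figure 2)."""
    EQ = "equal"
    DC = "disconnected"
    CR = "cross"
    UT = "un-touch"
    TO = "touch"
    IN = "inside"
    TP = "tangential part"
    OUT = "outside"


# Base relations available for each pair of geometry kinds, as listed in the text.
BASE_RELATIONS = {
    ("line", "line"): {Rel.EQ, Rel.DC, Rel.CR},
    ("point", "line"): {Rel.UT, Rel.TO},
    ("point", "polygon"): {Rel.IN, Rel.TP, Rel.OUT},
    ("line", "polygon"): {Rel.IN, Rel.TP, Rel.CR, Rel.DC},
}


def indefinite(kind_a, kind_b, *candidates):
    """Represent indefinite knowledge as a union (set) of possible base relations."""
    allowed = BASE_RELATIONS[(kind_a, kind_b)]
    chosen = set(candidates)
    if not chosen <= allowed:
        raise ValueError(f"relations not defined for {kind_a}/{kind_b}: {chosen - allowed}")
    return chosen


def derive_point_in_polygon(point_line, line_polygon):
    """Illustrative rule only: point TO line and line IN polygon gives point IN polygon.

    In every other case this sketch derives nothing and returns all possibilities.
    """
    if point_line is Rel.TO and line_polygon is Rel.IN:
        return {Rel.IN}
    return set(BASE_RELATIONS[("point", "polygon")])


# 'Bus stop A touches Route B' and 'Route B is located in Waukesha'
print(derive_point_in_polygon(Rel.TO, Rel.IN))
```

A real reasoner would use a full composition table over all relation pairs; this sketch keeps one rule so that the union-of-relations idea stays visible.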

To query the individual spatial entities, a spatial relation network is computed from the geometry of the existing transit network map, such as bus stops, bus routes, and streets, and is represented by means of role assertions in the ABox, e.g., (i,j):TPPI, (i,k):CR. Consider the instance retrieval query Public_route_crossing_a_river(?x) on the ABox:

$$\mathcal{A} = \{\, i : \mathit{route} \sqcap \mathit{public},\; k : \mathit{river},\; (i,k) : \mathit{CR} \,\}$$

To retrieve the instances of Public_route_crossing_a_river, we consider and check each individual instance separately. For example, let us consider i. Verifying whether i is an instance of Public_route_crossing_a_river is reduced to checking the unsatisfiability of

$$\mathcal{A} \cup \{\, i : \neg\, \mathit{Public\_route\_crossing\_a\_river} \,\}$$

or

$$\mathcal{A} \cup \{\, i : (\neg\, \mathit{route} \sqcup \neg\, \mathit{public} \sqcup (\forall\, \mathit{CR}.\neg\, \mathit{river})) \,\}$$

This ABox is unsatisfiable; thus, i is a Public_route_crossing_a_river.

In addition, we define inference rules to enable further query from the OWL knowledge base. See Zhao et al. (2008) for details of the inference rules.
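The instance retrieval example above can be traced in a few lines of code. This is a deliberately simplified sketch: it evaluates the concept definition directly against the told facts of the ABox instead of running the refutation (unsatisfiability) test that a DL reasoner performs, and the data structures and function names are hypothetical.

```python
# ABox from the text: i : route AND public, k : river, (i, k) : CR
concept_assertions = {
    "i": {"route", "public"},
    "k": {"river"},
}
role_assertions = {("i", "k"): "CR"}


def is_public_route_crossing_a_river(individual):
    """Check route AND public AND (exists CR.river) against the told facts."""
    told = concept_assertions.get(individual, set())
    is_public_route = {"route", "public"} <= told
    crosses_river = any(
        subject == individual
        and relation == "CR"
        and "river" in concept_assertions.get(obj, set())
        for (subject, obj), relation in role_assertions.items()
    )
    return is_public_route and crosses_river


def retrieve_instances():
    """Check every individual separately, as described in the text."""
    return [x for x in sorted(concept_assertions) if is_public_route_crossing_a_river(x)]


print(retrieve_instances())  # ['i']
```

The direct check agrees with the reasoner on this example because every fact it needs is asserted explicitly; with implicit knowledge (for instance, a subclass of river) a reasoner would be required.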

Figure 2. Feature relations. Line and line relations: EQ, equal; DC, disconnected; CR, cross. Point and line relations: UT, un-touch; TO, touch. Point and polygon relations: IN, inside; TP, tangential part; OUT, outside. Line and polygon relations: IN, inside; TP, tangential part; CR, cross; DC, disconnected.


By providing a formal conceptualization of the transportation domain (one that precisely defines the transportation concepts and their relationships) and using the aforementioned DL and inference rules, the proposed framework facilitates knowledge sharing and reuse via automatic machine processing. In the following section, we introduce the geospatial feature discovery and WFS composition algorithms used in the proposed framework.

4. Geospatial feature discovery algorithms

To find feature-level geospatial data, we consider a WFS feature as a basic unit of service rather than the entire WFS server. The reason is that a user should not be concerned with whether two features are located in different servers. Instead, a query processing module should handle the task of joint query transparently. In this way, requesters can search separate sources that provide the requested information. The discovery algorithms consider a WFS description as a consistent collection of restrictions over the named properties of a WFS, such as URI of the WFS server, feature type name, feature property name, geometry type of feature, and bounding box of geometry.

Definition 1 All WFS features are represented as tuples of the form $(T, P, G, B)$, where $T$ is the feature type name, $P$ is the set of feature property names, $G$ is the geometry type of the feature, and $B$ is the bounding box of the geometry. Let $Q = (T', P', G', B')$ be a service query. The discovery problem can be defined as automatically finding a set $S$ of WFSs such that $S = \{\, (T, P, G, B) \mid T <: T',\ P <: P',\ G <: G',\ B <: B' \,\}$, where $<:$ defines a partial order on features, property sets, geometry types, and bounding boxes.

Define $T_1 <: T_2$ if $T_1$ corresponds to an ontology class that is the same as or a subclass of that of $T_2$.

Define $p_1 <: p_2$ if feature property $p_1$ corresponds to an ontology property that is the same as or a subproperty of that of $p_2$.

Define $P_1 <: P_2$ if for each feature property $p_2$ in $P_2$, there exists a $p_1$ in $P_1$ such that $p_1 <: p_2$. We may relax this definition to replace ‘for each’ with ‘for some’ to allow more matches.

Define $G_1 <: G_2$ if $G_1$ is a geometry type that is equal to or a subtype of $G_2$.

Define $B_1 <: B_2$ if $B_1$ is a bounding box contained in $B_2$. We may relax this definition so that the partial order holds even if the two boxes only intersect, to retrieve more feature results.
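The partial orders defined above translate almost directly into predicates. The sketch below is an illustration under stated assumptions: the toy hierarchies `SUBCLASS_OF` and `SUBPROPERTY_OF`, the property names, and the `(min_x, min_y, max_x, max_y)` box layout are all hypothetical, and only the strict (unrelaxed) definitions are implemented.

```python
# Hypothetical toy hierarchies: child -> direct parent.
SUBCLASS_OF = {
    "MilwaukeeBusRoute": "TransitRoute",
    "TransitRoute": "Line",
    "Line": "Feature",
    "LineString": "Geometry",
}
SUBPROPERTY_OF = {"busRouteName": "routeName"}


def leq(a, b, parent_of):
    """a <: b when a is the same as b or a (transitive) descendant of b."""
    while a is not None:
        if a == b:
            return True
        a = parent_of.get(a)
    return False


def props_leq(provided, requested):
    """P1 <: P2 when every requested property is covered by some provided one."""
    return all(
        any(leq(p1, p2, SUBPROPERTY_OF) for p1 in provided) for p2 in requested
    )


def bbox_leq(inner, outer):
    """B1 <: B2 when box B1 is contained in box B2 (min_x, min_y, max_x, max_y)."""
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and inner[2] <= outer[2]
        and inner[3] <= outer[3]
    )


def matches(feature, query):
    """Membership test for the set S of Definition 1."""
    t, p, g, b = feature
    tq, pq, gq, bq = query
    return (
        leq(t, tq, SUBCLASS_OF)
        and props_leq(p, pq)
        and leq(g, gq, SUBCLASS_OF)
        and bbox_leq(b, bq)
    )


feature = ("MilwaukeeBusRoute", {"busRouteName"}, "LineString", (1, 1, 2, 2))
query = ("TransitRoute", {"routeName"}, "Geometry", (0, 0, 10, 10))
print(matches(feature, query))  # True
```

The relaxed variants mentioned in the definitions would replace `all` with `any` in `props_leq` and the containment test with an intersection test in `bbox_leq`.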

Before the discovery process, we need to index the available WFS features in an index file for a more efficient search when there is a large number of features. The indexing process is to check all available services and collect the following fields for each feature: (1) URI of the WFS server; (2) feature names; (3) feature property names; (4) geometry type of features; and (5) bounding box of geometry. A service broker, responsible for collecting WFSs, maintains the index file with an entry for each WFS feature.

The discovery process has several steps, as illustrated in Figure 3. First, we map each WFS feature to its domain and application ontologies using the algorithm illustrated in Figure 4a. We use a breadth-first search to traverse the ontology class hierarchy from the bottom up in search of matching ontology classes. Leaf classes are searched first, then their immediate superclasses, and so on. For instance, if one WFS bus route feature has the term Route, we would map it to the TransitRoute ontology class instead of its superclass, Line. Because the search proceeds level by level from the leaves, it is guaranteed that the more precisely matched subclass in the hierarchy has already been checked when the more general superclass is being compared.
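The leaves-first traversal described above might be sketched as follows. Everything here is an assumption made for illustration: the toy hierarchy `SUBCLASSES`, the label table `LABELS` that decides when a term matches a class, and the function names. The source does not say how a term is compared with a class, so a simple label lookup stands in for that step.

```python
# Hypothetical class hierarchy: class -> direct subclasses.
SUBCLASSES = {
    "Feature": ["Line", "Point"],
    "Line": ["TransitRoute"],
    "Point": ["TransitStop"],
    "TransitRoute": [],
    "TransitStop": [],
}

# Hypothetical labels under which each class may be advertised.
LABELS = {
    "Feature": {"feature"},
    "Line": {"line", "route"},
    "Point": {"point", "stop"},
    "TransitRoute": {"route", "bus route"},
    "TransitStop": {"stop", "bus stop"},
}


def levels_bottom_up(subclasses):
    """Group classes by height above the leaves: leaves first, then their parents."""
    height = {}

    def height_of(cls):
        if cls not in height:
            height[cls] = 1 + max(
                (height_of(sub) for sub in subclasses[cls]), default=-1
            )
        return height[cls]

    for cls in subclasses:
        height_of(cls)
    levels = {}
    for cls, h in height.items():
        levels.setdefault(h, []).append(cls)
    return [sorted(levels[h]) for h in sorted(levels)]


def map_term(term):
    """Return the most specific class whose labels contain the term, else None."""
    wanted = term.lower()
    for level in levels_bottom_up(SUBCLASSES):
        for cls in level:
            if wanted in LABELS[cls]:
                return cls
    return None


print(map_term("Route"))  # TransitRoute
```

Although both Line and TransitRoute carry the label "route" in this toy table, the leaf level is visited first, so the more specific class wins, which is the behaviour the paragraph above describes.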


Second, we map a user's query to domain and application ontologies and generate formal query descriptions using the algorithm shown in Figure 4b. The discovery engine abstracts a user's query into four parts of service descriptions that are required to answer the query: feature type names, feature property names, geometry type of features, and bounding box of geometry. These four parameters of capability descriptions must (1) reflect the semantic content of the query and (2) reflect the requirements of the generated requests.

Third, we find the matched WFS features using the matching algorithm in Figure 4c. The discovery engine finds the appropriate WFS features by matching the descriptions required to solve the query with the descriptions of providers through the parameters. The algorithm first uses the bounding box of geometry parameter of the query to narrow down the list of services in the repository. It retains all services whose bounding box of geometry matches the query (all WFSs that are located within the geographic limits of the query). From those services, it further narrows down the list by geometry type of features, then by feature type names, and finally by property names. All the description parameters provided by WFSs must either be equivalent to or subsume the required description parameters in the query. Whenever an exactly equivalent match is found, it is recorded with the highest score. Otherwise, according to the degree of match detected, it is recorded with a lesser score. We calculate the score using the following formula:

$$s(i) = S_T(T, T') \times W_T + S_P(P, P') \times W_P + S_G(G, G') \times W_G + S_B(B, B') \times W_B \qquad (1)$$

where $s(i)$ is the calculated score, $S_T$ is the similarity score between two ontology feature classes $T$ and $T'$, $S_P$ is the similarity score between two feature property sets, $S_G$ is the similarity score between the geometry types of two features, $S_B$ is the similarity score between two bounding boxes of geometry, and $W_P$, $W_G$, $W_B$ are the weights of the property set, geometry type, and bounding box, respectively. The similarity score $S_T$ of two ontology feature classes, $T$ and $T'$, can be based on the number of inheritance levels between them. For example, $S_T(T, T') = 0$ if $T$ and $T'$ are the same class, and $S_T(T, T') = 1$ if $T$ is a direct subclass of $T'$. $W_T$ is the weight of the similarity score of ontology feature classes. The similarity score $S_P$ of two property sets is computed the same way, except that we add up the similarity scores between the matched properties in the two sets to produce the similarity score of the two sets. $S_G$ is computed the same way as $S_T$. $S_B$ is calculated as in Equation (2).

Figure 3. WFS discovery process.

(A) Algorithm for mapping WFS features to domain and application ontology
    Input: available WFS features F; domain and application ontology O
    Initialize I = empty set
    for each feature f in F
        find a class T in O that matches the feature name
        find a property set P in O such that each member of P matches a feature property
        find a class G in O that matches the geometry type of f
        create an instance B of the bounding-box class in O that corresponds to the bounding box of f
        add (T, P, G, B) to I
    end for
    Output: an index file I

(B) Algorithm for generating formal query descriptions
    Input: target query q; domain and application ontology O
    if q specifies a feature name, then find a class T′ in O that matches the name, else let T′ be the root Feature class in O
    for each property specified in q, find a matching property in O and collect the properties into a set P′; if q specifies no property, P′ is empty
    if q specifies a geometry name, then find a matching class G′ in O, else let G′ be the root Geometry class in O
    if q specifies a bounding box, then create an instance B′ of the bounding-box class in O, else let B′ be the largest bounding box
    Output: the query in the form (T′, P′, G′, B′)

(C) Algorithm for geospatial feature match
    Input: index file I of WFS; feature query (T′, P′, G′, B′)
    Initialize I′ = empty set
    for each (T, P, G, B) ∈ I
        if T <: T′ and P <: P′ and G <: G′ and B <: B′, then add (T, P, G, B) to I′
    end for
    for each i = (T, P, G, B) ∈ I′
        compute s(i) = S_T(T, T′) * W_T + S_P(P, P′) * W_P + S_G(G, G′) * W_G + S_B(B, B′) * W_B
    end for
    Rank each element i in I′ using s(i) so that i is before j if s(i) > s(j)
    Output: a set of WFS features I′

Figure 4. WFS discovery algorithms.
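The filter-then-rank structure of algorithm (C) can be illustrated with a minimal Python sketch. It is not the prototype's code: the toy class hierarchy, the `Entry` type, and the placeholder scoring function are hypothetical, `P <: P′` is simplified to "every queried property is offered by the feature", and `B <: B′` is read as "B lies within B′".

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


class Entry(NamedTuple):
    """An index entry or a query of the form (T, P, G, B)."""
    T: str
    P: FrozenSet[str]
    G: str
    B: BBox


# Hypothetical toy hierarchy: class -> direct superclass (None for a root).
PARENT: Dict[str, Optional[str]] = {
    "Feature": None,
    "TransitRoute": "Feature",
    "MilwaukeeBusRoute": "TransitRoute",
    "Highway": "Feature",
    "Geometry": None,
    "Line": "Geometry",
}


def is_subclass(cls: str, ancestor: str) -> bool:
    """True if cls equals ancestor or inherits from it (cls <: ancestor)."""
    cur: Optional[str] = cls
    while cur is not None:
        if cur == ancestor:
            return True
        cur = PARENT.get(cur)
    return False


def bbox_within(inner: BBox, outer: BBox) -> bool:
    """Assumed reading of B <: B': inner lies entirely inside outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def match(index: List[Entry], query: Entry,
          score: Callable[[Entry, Entry], float]) -> List[Entry]:
    """Keep the entries that satisfy all four conditions, best score first."""
    hits = [e for e in index
            if is_subclass(e.T, query.T)
            and query.P <= e.P            # simplified property test
            and is_subclass(e.G, query.G)
            and bbox_within(e.B, query.B)]
    return sorted(hits, key=lambda e: score(e, query), reverse=True)


index = [
    Entry("MilwaukeeBusRoute", frozenset({"name", "geometry"}), "Line", (0, 0, 5, 5)),
    Entry("Highway", frozenset({"name", "geometry"}), "Line", (0, 0, 5, 5)),
    Entry("TransitRoute", frozenset({"geometry"}), "Line", (1, 1, 4, 4)),
]
query = Entry("TransitRoute", frozenset({"geometry"}), "Line", (0, 0, 10, 10))

# Placeholder score: an exact class match outranks a subclass match.
ranked = match(index, query, score=lambda e, q: 1.0 if e.T == q.T else 0.5)
print([e.T for e in ranked])  # ['TransitRoute', 'MilwaukeeBusRoute']
```

In this toy run the Highway entry is rejected outright, while the subclass match is accepted with a lower score than the exact match.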

$$S_B(B, B') = \mathrm{sizeOf}\big(\mathrm{overlapAreaOf}(B, B')\big) \tag{2}$$
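As a rough illustration of Equations (1) and (2), the sketch below computes the overlap area of two axis-aligned bounding boxes and combines four component scores into a weighted sum. The function names, the tuple layout (minx, miny, maxx, maxy), and the example weights are assumptions made for illustration; the article does not state the weight values used in the prototype.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def overlap_area(b: BBox, q: BBox) -> float:
    """Equation (2): area of the intersection of two boxes, 0.0 if disjoint."""
    width = min(b[2], q[2]) - max(b[0], q[0])
    height = min(b[3], q[3]) - max(b[1], q[1])
    return max(width, 0.0) * max(height, 0.0)


def weighted_score(s_t: float, s_p: float, s_g: float, s_b: float,
                   w_t: float = 1.0, w_p: float = 1.0,
                   w_g: float = 1.0, w_b: float = 1.0) -> float:
    """Equation (1): weighted sum of the four similarity scores."""
    return s_t * w_t + s_p * w_p + s_g * w_g + s_b * w_b


s_b = overlap_area((0.0, 0.0, 4.0, 4.0), (2.0, 2.0, 6.0, 6.0))
print(s_b)                                          # 4.0
print(weighted_score(1.0, 2.0, 1.0, s_b, w_b=0.5))  # 6.0
```

Because the raw overlap area depends on the coordinate units, a weight such as `w_b` (hypothetical here) is what keeps the bounding-box term comparable with the other three.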

The main advantage of the matching algorithm is that it supports a flexible semantic match between the provided WFSs and the requests. The match between the descriptions of the requester and the descriptions of the WFS provider depends on the relation between the concepts associated with those parameters in the descriptions. For instance, consider how a request with the feature class MilwaukeeBusRoute matches a WFS when the feature class is TransitRoute. Given a transit ontology, the discovery engine would match MilwaukeeBusRoute with TransitRoute instead of Highway, because MilwaukeeBusRoute is a subclass of TransitRoute, whereas it has no direct relation with Highway.
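The MilwaukeeBusRoute example can be made concrete by counting inheritance levels in a class hierarchy. The sketch below is only illustrative: the `PARENT` table is a hypothetical fragment of a transit ontology (only the MilwaukeeBusRoute/TransitRoute relation comes from the text), and returning `None` for unrelated classes is an assumption.

```python
from typing import Dict, Optional

# Hypothetical ontology fragment: class -> direct superclass (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "Feature": None,
    "TransitRoute": "Feature",
    "MilwaukeeBusRoute": "TransitRoute",
    "Highway": "Feature",
}


def inheritance_levels(cls: str, ancestor: str) -> Optional[int]:
    """Number of subclass steps from cls up to ancestor, or None if unrelated."""
    levels = 0
    cur: Optional[str] = cls
    while cur is not None:
        if cur == ancestor:
            return levels
        cur = PARENT.get(cur)
        levels += 1
    return None


print(inheritance_levels("MilwaukeeBusRoute", "TransitRoute"))  # 1
print(inheritance_levels("MilwaukeeBusRoute", "Highway"))       # None
```

A direct subclass is one level away, whereas Highway is never reached when walking up from MilwaukeeBusRoute, so it is not a candidate match.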

Furthermore, the result of the match is not a hard true or false; it depends on the degree of similarity between the concepts in the match. The discovery engine can draw an inference between descriptions of the provided WFSs and requests on the basis of available ontologies. Despite the flexibility, the discovery engine still rejects WFS features that do not match the requests and accepts, but with a low score, matches that may not be satisfactory for the requester.

To evaluate the accuracy of the algorithm, we set up a benchmark set of web features and a suite of test cases. Each test case is a query that can be answered by one or more web features in the benchmark set. We first identify the web feature types that contain answers to our test cases, then use them as ground truth for our test cases. The evaluation of a query uses two metrics, precision and recall, defined as follows:

$$\text{Precision} = \frac{\text{number of correct feature types returned by the algorithm}}{\text{total number of feature types returned by the algorithm}}$$

$$\text{Recall} = \frac{\text{number of correct feature types returned by the algorithm}}{\text{total number of correct feature types in the benchmark set}}$$

Precision is related to the false-positive rate, whereas recall is related to the false-negative rate. Higher precision can be achieved through a stricter matching standard in the algorithm, but it can decrease recall by potentially missing more correct answers. Part of the evaluation process is to find suitable parameters for the algorithm to achieve a reasonable balance of precision and recall.
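The two metrics can be computed directly from the set of returned feature types and the ground-truth set. The following sketch assumes feature types are identified by plain strings; the names `x:roads` and `x:parks` are made up for the example and are not part of the benchmark.

```python
from typing import Iterable, Tuple


def precision_recall(returned: Iterable[str],
                     relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision and recall of a result set against the ground-truth set."""
    returned_set, relevant_set = set(returned), set(relevant)
    correct = len(returned_set & relevant_set)
    precision = correct / len(returned_set) if returned_set else 0.0
    recall = correct / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query result: two correct feature types plus two spurious ones.
returned = ["wks:routes", "wksha:BusStops", "x:roads", "x:parks"]
relevant = ["wks:routes", "wksha:BusStops"]
print(precision_recall(returned, relevant))  # (0.5, 1.0)
```

The example shows the trade-off described above: a permissive matcher finds every correct feature type (recall 1.0) at the cost of returning spurious ones (precision 0.5).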

5. Geospatial web feature service composition algorithm

Should the matching engine not find a single WFS that matches the user’s query, it will search for two or more WFSs that can be composed to synthesize the required service using composition algorithms. This task is called composition.

Definition 2. Let $\mathcal{P}$ be the set of all provided WFS features in a given web service repository. A WFS is represented as a tuple of the form (T, P, G, B), where T is the feature type name, P is a set of feature properties, G is the geometry type of the feature, and B is the bounding box of the geometry. Let Q = (T′, P′, G′, B′) be a service query. The composition problem can be defined as automatically finding a set S of services such that S = (S1, S2, . . ., Sn), where for all i, Si = (Ti, Pi, Gi, Bi) and

$$T_i <: T', \quad (P_1 \cup P_2 \cup \dots \cup P_n) <: P', \quad G_i <: G', \quad (B_1 \cup B_2 \cup \dots \cup B_n) <: B'$$

When no completely matched WFS is discovered to satisfy a user’s objective, the existing partially matched WFSs may need to be combined to fulfill the query request. WFS composition is the process of selecting and combining WFSs to achieve a user’s goal that cannot be realized by any single existing WFS. To perform automated composition, the steps in the discovery process are performed first to allow the reasoning system in the discovery engine to order and combine WFSs. If the discovery engine cannot find an existing appropriate atomic WFS to satisfy a user’s objective, it will try to find partially matched WFSs or composable services and combine them to achieve the user’s objective by using the composition process and algorithm illustrated in Figures 5 and 6. As shown in Figure 5, to produce the composite service, the composable WFSs that are useful for the composition are selected in multiple stages. First, the discovery engine tries to retrieve the WFSs with feature type name T and feature geometry type G such that T is equal to or is a subclass of the query ontology class T′ and G is equal to or is a subclass of the query ontology class G′. This results in a set of services S1. Second, the discovery engine narrows S1 down to a subset S2 under the condition that the feature property P is equal to or is a subproperty of the query feature property P′, where P is the union of the property parameters of the WFSs in S2. Third, the discovery engine further narrows S2 down to a subset S3 under the condition that the bounding box B is contained in the query bounding box B′, where B is the union of the bounding box parameters of the WFSs in S3. We repeat steps two and three until S3 is found or all possible S2’s have been tested. Finally, we return the services in S3 as a composition of the split services. The formal algorithm for service composition is shown in Figure 6.

[Flowchart: given query parameter Q = (T′, P′, G′, B′); retrieve services with T and G parameters such that T <: T′ and G <: G′, resulting in a set of services S1; find a subset S2 of the services in S1 such that if P is the union of the property parameters of the services in S2, then P <: P′; find a subset S3 of the services in S2 such that if B is the union of the bounding boxes of the services in S3, then B <: B′; repeat until S3 is found or all possible S2 have been tested; return the services in S3 as a composition of split services.]

Figure 5. WFS composition process.

The evaluation of the query composition algorithm uses the same benchmark set and test suite used for evaluating the feature discovery algorithm. However, the ground truth for the test suite consists of sets of web feature types that need to be combined to give complete answers to the original query. We also use precision and recall to evaluate the composition algorithm. Again, there could be a tradeoff between improving precision and improving recall. We can increase recall by imposing more relaxed rules in selecting web feature types, but this could potentially decrease precision.
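The composition procedure of Figure 6 can be approximated by a single greedy pass, sketched below in Python. This is a simplified illustration under several assumptions: the outer retry over alternative choices of S2 is omitted, a queried property counts as covered only by an identically named property (no subproperty reasoning), and the `Service` type, the `ANCESTORS` table, and the example services are hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Sequence, Set, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


class Service(NamedTuple):
    name: str  # hypothetical label, used only for readable output
    T: str
    P: FrozenSet[str]
    G: str
    B: BBox


# Hypothetical ancestor table: class -> the class itself plus its superclasses.
ANCESTORS: Dict[str, Set[str]] = {
    "TransitRoute": {"TransitRoute", "Feature"},
    "Highway": {"Highway", "Feature"},
    "Line": {"Line", "Geometry"},
}


def is_subclass(cls: str, ancestor: str) -> bool:
    return ancestor in ANCESTORS.get(cls, {cls})


def overlaps(a: BBox, b: BBox) -> bool:
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def covers(boxes: Sequence[BBox], target: BBox) -> bool:
    """True if the union of boxes covers target (grid test on box edges)."""
    minx, miny, maxx, maxy = target
    xs = sorted({minx, maxx, *(x for b in boxes for x in (b[0], b[2]) if minx < x < maxx)})
    ys = sorted({miny, maxy, *(y for b in boxes for y in (b[1], b[3]) if miny < y < maxy)})
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            cx, cy = (x0 + x1) / 2, (y0 + y1) / 2  # centre of one grid cell
            if not any(b[0] <= cx <= b[2] and b[1] <= cy <= b[3] for b in boxes):
                return False
    return True


def compose(services: Sequence[Service], t_q: str, p_q: FrozenSet[str],
            g_q: str, b_q: BBox) -> Optional[List[Service]]:
    # Stage 1: keep services whose feature type and geometry type match.
    s1 = [s for s in services if is_subclass(s.T, t_q) and is_subclass(s.G, g_q)]
    # Stage 2: greedily add services that cover still-uncovered properties.
    s2: List[Service] = []
    uncovered = set(p_q)
    for s in s1:
        if not uncovered:
            break
        if uncovered & s.P:
            s2.append(s)
            uncovered -= s.P
    if uncovered:
        return None
    # Stage 3: add overlapping services until the query box is covered.
    s3: List[Service] = []
    for s in s2:
        if overlaps(s.B, b_q):
            s3.append(s)
            if covers([x.B for x in s3], b_q):
                return s3
    return None


services = [
    Service("routes_geom", "TransitRoute", frozenset({"geometry"}), "Line", (0, 0, 6, 10)),
    Service("routes_desc", "TransitRoute", frozenset({"description"}), "Line", (5, 0, 10, 10)),
    Service("highways", "Highway", frozenset({"geometry", "description"}), "Line", (0, 0, 10, 10)),
]
result = compose(services, "TransitRoute", frozenset({"geometry", "description"}),
                 "Line", (0, 0, 10, 10))
print([s.name for s in result] if result else None)  # ['routes_geom', 'routes_desc']
```

A fuller version would retry with other candidate sets S2 when the first greedy choice fails to cover the query bounding box, as the outer loop of Figure 6 prescribes.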

6. A prototype implementation

A prototype was implemented based on the proposed framework shown in Figure 1. The main goal of the prototype is to search and access semantically heterogeneous geospatial features for transportation data using the proposed discovery and composition algorithms. Figure 7 illustrates the architecture of the implemented prototype. The prototype’s architecture mainly consists of the following:

• ESRI ArcGIS, which provides semantically heterogeneous spatial data;

• GeoServer (http://geoserver.sourceforge.net/html/index.php), open-source software that fully implements the OGC WFS and WMS specifications and serves Shapefile data using WFSs and WMSs;

• Tomcat, a Java Servlet container, which provides web developers with a simple, consistent mechanism for extending the functionality of a web server and for accessing web applications, including GeoServer and the spatial ontology server, through the HTTP protocol;

• Spatial ontology server, which is based on Joseki (http://www.joseki.org/) and provides HTTP services to answer ontology queries in SPARQL form. The server uses a domain ontology for spatial features and an application ontology for transportation network data. The ontology server stores ontology instances in files or databases;

• Spatial data query and transformation component, which was developed based on Jena (http://jena.sourceforge.net/). This component extracts information from WFS servers and creates ontology definitions for the extracted web features. Also, the component transforms feature instances into ontology format to store in the ontology server; and

• Web-based spatial query client, which is used to render the ontology query results as graphic maps. This client uses OpenLayers, a JavaScript library for spatial data.

Input: Q = (T′, P′, G′, B′); a set of WFSs S
Initialize: let sets S1, S2, and S3 be empty
For each s = (T, P, G, B) in S
    If T <: T′ and G <: G′, then add s to S1
End for
A property p′ is covered by a service s = (T, P, G, B) if there is a p in P such that p <: p′.
A bounding box B′ is covered by a set of services if the union of the bounding boxes of the services covers B′.
Repeat until either B′ is covered by the services in S3 or all possible S2 have been tested
    Set S2 to empty and let S1′ = S1
    Repeat until either S1′ is empty or all properties in P′ are covered by services in S2
        Take a service s = (T, P, G, B) from S1′ such that P covers at least one property in P′ that is not already covered by services in S2
        Add s to S2
    Set S3 to empty
    Repeat until either S2 is empty or B′ is covered by services in S3
        Take a service s = (T, P, G, B) from S2 such that B and B′ overlap
        Add s to S3
Output: the services in S3 as a composition of split services (that can be executed concurrently)

Figure 6. WFS composition algorithm.

The data used in the implemented prototype come from the Waukesha Transit Trip Planning Project. Two WFS servers were created. The bus route WFS server publishes the bus route data using the feature name ‘wks:routes’, whereas the bus stop WFS server publishes the bus stop data using the feature name ‘wksha:BusStops’. Note that the namespace prefixes are different as the features are in different WFS servers.

To implement the prototype, the following steps are performed:

[Diagram components: Shapefiles, a PostGIS spatial DB, and a spatial DB; Java-based WFS/WMS servers A and B (GeoServer), each in a Tomcat Servlet container; a Java-based spatial data query and transformation component; a Java-based spatial ontology server (Joseki server) in a Tomcat Servlet container, with a triple DB; and map/spatial feature web clients (OpenLayers).]

Figure 7. Architecture of the implemented prototype.

(1) Create a domain ontology for feature data in general and an application ontology for transit bus route and bus stop spatial data. We created three domain-specific ontologies used in transportation networks:

(1) transportation base (http://jiangxi.cs.uwm.edu/planner/tranbase.owl),
(2) road networks (http://jiangxi.cs.uwm.edu/planner/tranroad.owl), and
(3) transit networks (http://jiangxi.cs.uwm.edu/planner/trantransit.owl).

These ontologies are in OWL and were translated from existing UML data models for transportation applications (http://www.fgdc.gov/standards/standards_publications/; Part 7, Transportation Base). We developed algorithms to automatically transform these UML models into OWL ontologies (Zhang et al. 2008). However, because of the differences between UML and OWL, we could not create all necessary ontologies by the automatic transformation method. So we used the ontology editor Protégé to create those ontologies that could not be transformed from existing UML data models. Then, we integrated the three domain-specific ontologies. As the three domain-specific ontologies are internally consistent, we avoided the integration difficulties that would arise from importing other transportation ontologies into ours.

The application ontology in our prototype is divided into two parts: one part corresponds to some WFS features and the other part corresponds to some database tables that supplement the WFS features. To represent geospatial features as ontology instances, we used some predefined OWL classes such as Feature, Geometry (with subclasses Point, Line, and Polygon), BoundingBox, and Area. Relations between classes were often asserted with the rdfs:subClassOf property. Also, we introduced some object properties such as has_geometry and data-type properties such as minx, maxx, miny, and maxy. The domains and ranges of properties were asserted using rdfs:domain and rdfs:range. Restrictions on properties were not used as they might prevent future extension. We also automatically generated some OWL classes from WFS features such as Route, Link, and Stop. In addition, some OWL properties were auto-generated from feature properties. Because OWL names are globally scoped and name conflicts are not tolerated, whereas WFS property names are locally scoped, we let a newly defined OWL property overwrite any previous definition of the same name to resolve name clashes. To prevent any further conflicts, we did not place domain or range restrictions on these properties. We related auto-generated OWL definitions to predefined ones through assertions such as rdfs:subClassOf and rdfs:subPropertyOf. WFS feature instances were automatically translated into OWL individuals using the predefined and auto-generated classes and properties. To enable WFS feature search and discovery, we auto-generated an ontology individual for each feature type to include properties such as feature name, URL, bounding box, and geometry type name.

The application ontology corresponding to database tables is a virtual graph of RDF nodes created from database tables by a tool, D2R Server. The RDF ontology was generated based on a mapping configuration file used by D2R Server. The ontology is a straightforward mapping from database tables: one table maps to one ontology class and one column maps to one ontology property. Additional object properties were defined via inference rules based on existing data-type properties and classes. Moreover, we defined inference rules to describe some object properties that connect the two parts of the application ontology to support the user query. For example, to find the geometry of the bus stops in a bus route, we need both the application ontology that corresponds to WFS features, which contains the geometries and the IDs of the stops, and the application ontology that corresponds to the database tables, which contains the correspondence between a route ID and the IDs of the stops in the route. Using the inference rules and the reasoning ability of OWL ontology, our prototype can support queries that cannot be answered by WFS or database alone.

(2) Index the available WFS features by extracting feature names, feature property lists, geometry types, and bounding boxes of the features. The indices are instances of a special ontology class Feature. Each instance contains the detailed information about a WFS feature. When the user searches for features, the returned feature instances are used to locate the corresponding ontology classes, which are used to locate all the feature instances.

(3) Map WFS features, properties, and geometries to ontology classes and properties. This process has to be done manually so that ontology classes generated from WFS features will become subclasses of the domain and application ontology classes. Also, if a generated ontology property is equivalent to an existing property, then they are merged. For example, we generated an ontology class TransitRoute corresponding to the feature wks:routes. We have to manually identify that this class is indeed a subclass of TransitLink, which describes a segment of a transit route. This information cannot be determined based on the names of the features alone as names can be misleading. Also, we generated an ontology property the_geom from a common feature property with the same name. A similar property geom has already been defined in our domain ontology to refer to the geometry of features. Therefore, we merged the two properties into one. After completing this process, we can automatically query for spatial data. For example, to find the geometry of a route named ‘Summit’, we can simply query for instances of TransitRoute with the name ‘Summit’. The geometry of the route is the union of the geometries of the links on the route.

(4) Take service queries and return a list of WFS feature services. For example, to locate the requested bus route and bus stop features from the two separate WFS servers, two service queries are needed:

(TransitRoute, {geometry, description}, Line, B), and
(TransitStop, {geometry, intersection}, Point, B),

where B is a bounding box described in N3 notation, such as

[ a :BoundingBox ;
  :maxx "-73.90782"^^xsd:float ;
  :maxy "40.882076"^^xsd:float ;
  :minx "-74.04719"^^xsd:float ;
  :miny "40.67965"^^xsd:float
]

These queries ask for features of the type TransitRoute or TransitStop, and the feature’s geometry type must be line or point and their bounding box must cover B. Once a user supplies the two queries, the system matches the ontology class TransitRoute to wks:routes and matches TransitStop to wksha:BusStops. Similarly, the property lists are matched against the properties of the two features. Finally, the system makes sure that geometry types are matched, and the two feature services cover the bounding box B.
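Once a feature type such as wks:routes has been matched, its instances would typically be fetched with a standard OGC WFS GetFeature request. The sketch below builds such a request URL in key-value-pair form; the endpoint URL is a placeholder, the choice of WFS version 1.0.0 and of a (minx, miny, maxx, maxy) axis order are assumptions, and the article does not show the exact requests issued by the prototype.

```python
from typing import Tuple
from urllib.parse import urlencode

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def get_feature_url(endpoint: str, type_name: str, bbox: BBox,
                    version: str = "1.0.0") -> str:
    """Build a WFS GetFeature URL restricted to one feature type and a bbox."""
    minx, miny, maxx, maxy = bbox
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": f"{minx},{miny},{maxx},{maxy}",
    }
    # Keep ':' and ',' unescaped so the prefix and bbox stay readable.
    return f"{endpoint}?{urlencode(params, safe=':,')}"


# Placeholder endpoint; the bounding box values are those of the N3 example.
url = get_feature_url("http://example.org/geoserver/wfs", "wks:routes",
                      (-74.04719, 40.67965, -73.90782, 40.882076))
print(url)
```

The same helper could be called with wksha:BusStops against the second server, giving the two requests whose results are then combined.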

If the system finds only partially matched WFSs using the proposed discovery algorithm in Figure 4c, it composes the partially matched services using the composition algorithm in Figure 6. The resulting data can be parsed to present the needed results.

(5) To improve performance, retrieved feature instances were transformed into ontology individuals and were stored in the ontology server. This way, the client does not have to repeatedly send requests to WFS servers for the same feature instances.
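The effect of step (5) can be illustrated with a small cache that sits in front of the WFS servers. This is only a sketch of the idea: the prototype stores ontology individuals in the ontology server, whereas the hypothetical `FeatureCache` below keeps raw results in an in-memory dictionary, and `fake_fetch` stands in for a real WFS request.

```python
from typing import Callable, Dict, List, Tuple

BBox = Tuple[float, float, float, float]
Key = Tuple[str, str, BBox]


class FeatureCache:
    """Remember fetched feature instances per (server, feature type, bbox)."""

    def __init__(self, fetch: Callable[[str, str, BBox], List[dict]]) -> None:
        self._fetch = fetch
        self._store: Dict[Key, List[dict]] = {}
        self.remote_calls = 0  # counts requests that reached a WFS server

    def get(self, server: str, type_name: str, bbox: BBox) -> List[dict]:
        key = (server, type_name, bbox)
        if key not in self._store:
            self.remote_calls += 1
            self._store[key] = self._fetch(server, type_name, bbox)
        return self._store[key]


def fake_fetch(server: str, type_name: str, bbox: BBox) -> List[dict]:
    # Stand-in for a real WFS request; returns one made-up feature.
    return [{"type": type_name, "name": "Summit"}]


cache = FeatureCache(fake_fetch)
bbox = (-74.04719, 40.67965, -73.90782, 40.882076)
first = cache.get("serverA", "wks:routes", bbox)
second = cache.get("serverA", "wks:routes", bbox)
print(cache.remote_calls)  # 1
```

Only the first call reaches the (simulated) server; the repeated request for the same feature instances is answered locally.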

6.1. Some experimental results

With formal semantic descriptions of WFSs in the implemented prototype, it is possible to allow users and applications to discover, compose, and synthesize semantically heterogeneous geospatial features automatically. Figure 8 shows the results of the automatically discovered and composite bus roads and bus stops located within 1 km of the bus stop in the Fox Run Shopping Center of downtown Waukesha. The URL for this demo is http://129.89.38.96:8080/ijgis/wks.html. Users can get all the routes in downtown Waukesha by clicking the button ‘Get Route Names in Waukesha’. Users can then query an individual route by clicking the corresponding ‘display’ link in the column of ‘Get Route’ and all bus stops related to a route by clicking the corresponding ‘display’ link in the column of ‘Get Stops’. In particular, users can query any composite bus roads and bus stops within a certain distance (default is 1 km) of the selected bus stop by clicking the hyperlink of the location (the user should first click to select a bus stop geometry).

Figure 8. Bus roads and bus stops located within 1 km of the bus stop in Fox Run Shopping Center of downtown Waukesha.

While semantically enhanced WFSs help to automate the discovery and integration of geospatial features, they can introduce significant runtime overhead with ontology reasoning when there are a large number of ontology instances. This may impede the real-time access and exchange of the geospatial feature data. The performance of TBox reasoning (reasoning about concepts) is generally acceptable. However, ABox reasoning (reasoning about individuals) can be quite slow for some realistic sets of spatial data. The implemented prototype manipulates thousands of feature instances. We can afford to convert all these feature instances into ontology instances and then perform geospatial reasoning. The performance of the prototype (http://129.89.38.96:8080/ijgis/wks.html) for user queries and composition of bus routes and stops is comparable with that of direct access to WFS servers. However, the transportation network in a larger city may contain tens of thousands to millions of feature instances, which could make it impractical to convert WFS features into ontology instances and then conduct reasoning. In fact, we decided not to convert the street network of Waukesha to ontology instances in our implementation as it would incur a noticeable delay in querying. We chose to store the street network in a WMS server and relay WMS queries from our ontology server to retrieve street maps instead. Thus, the challenge is how to improve the efficiency of geospatial reasoning so that it is still practical to allow real-time combination of geospatial features from various sources. In this case, we need to avoid converting all geospatial features into ontology instances. Instead, we need a query pre-processor to translate the ABox queries into subqueries, which are sent to spatial or non-spatial data sources such as a WFS server or a database to get the needed data. This method can greatly enhance query efficiency, though we may lose some reasoning power as we cannot apply reasoners to geospatial features directly.

To evaluate the accuracy of the web feature search algorithm and web feature composition algorithm, we conducted a small experiment using a micro-benchmark consisting of 33 web feature types. To evaluate the performance of the search algorithm, we tested three queries on point, line, and polygon geometries. The average precision is about 81.94%, whereas the recall is 100%. We also tested three queries on the composition algorithm where each query involves more than one web feature type. The average precision is 100%, whereas the recall is about 83.85%. It is understandable that the search algorithm yields higher recall, whereas the composition algorithm has higher precision: the search algorithm returns any feature type that might have relevant answers to the query, but the composition algorithm requires a precise match and thus could potentially miss correct answers. Note that this experiment is mainly to test the usability of the algorithms, and the precision/recall might be different if the algorithms are tested for different sets of web features and for different queries. Thus, it is necessary to have larger data-sets with more diverse web features to fully evaluate the algorithms.

7. Related work

Although many efforts in IT are underway to propose different algorithms or approaches to facilitate heterogeneous web service discovery and composition at the semantic level (Paolucci et al. 2002, Sirin et al. 2003, Sycara et al. 2003, Liang et al. 2004, Brogi et al. 2005, Yu and Lin 2005, Kona et al. 2006), these algorithms or approaches deal with web services in general, which are based on the Web Services Description Language (WSDL) specification. However, OGC web services are Representational State Transfer (REST)-based and do not have WSDL descriptions; thus these approaches cannot be directly applied to OGC WFSs for integrating or sharing semantically heterogeneous geospatial data at feature level. There is recent interest by researchers in exploiting the geospatial semantic web for automatic integration of semantically heterogeneous geospatial data (Wiegand et al. 2004, Wiegand and Zhou 2005, Wiegand and Garcia 2007, Alam et al. 2008, Li et al. 2008, Yang et al. 2008). But none of them focuses on extending OGC WFSs for feature-level geospatial data sharing.

In addition, although geospatial web service discovery and composition have been explored in the GIS literature, most studies focused on discovering semantically annotated metadata (Klien et al. 2004, 2006, Lutz and Klien 2006, Tanasescu 2006, Lutz 2007) or using OWL-S, Business Process Execution Language (BPEL), or Web Service Modelling Ontology (WSMO) for different types of web service discovery or composition (Granell et al. 2004, Lemmens 2004, Jagery et al. 2005, Alam et al. 2007, Gone and Schade 2007, Lutz 2007, Yue et al. 2007). These approaches have some limitations. For example, the WSMO discovery is not yet fully implemented, and orchestration is under-specified (Tanasescu 2006). The Web Service Modelling eXecution environment (WSMX) engine used by WSMO is currently not capable of creating a composition of service instances on the fly. The other existing approach, SAWSDL (semantic annotations for WSDL), is a simple extension of WSDL using the extensibility elements and cannot be directly applied to REST-based OGC web services. Furthermore, not all aspects of the SAWSDL specification have been implemented yet (http://www.alphaworks.ibm.com/tech/wssem). Although BPEL has good control over the workflow at design time, it defines composition processes using variables defined in WSDL and XML schema. Thus REST-based OGC web services also cannot be used directly within a BPEL process, because they do not have WSDL descriptions. In addition, BPEL offers only limited support in the composition process.

The main objective of this research is to find a solution to share semantically heterogeneous geospatial data at feature level by using enhanced semantic WFSs. This proposed solution is different from the popular OWL-S approaches in the literature (Alam et al. 2007). Although OWL-S provides a formal mechanism for modeling and describing web services (in terms of their inputs, outputs, preconditions and effects, and of their process model) and provides mechanisms for mapping an OWL-S specification to a WSDL specification, it is designed for different types of web service discovery and composition. OWL-S does not work well with OGC geospatial web services such as WFSs, which are not ‘web services’ that conform to the WSDL standard. Thus, it is difficult to ground OWL-S definitions to WFSs. Furthermore, because OGC WFSs use an identical protocol, there are no semantic differences in terms of inputs, outputs, preconditions, and effects. The semantic differences in OGC WFSs result from the contents of geospatial features. There is, however, no apparent way to encode these differences in OWL-S, which only deals with the way by which web services send/reply messages.

Finally, few publications exist that specifically study feature-level data sharing at the semantic level using ontology-based OGC WFSs (Lutz and Klien 2006, Zhao et al. 2008). Zhao et al. (2008) proposed geospatial data queries and integration, which used a method of rewriting SPARQL ontology queries to WFS getFeature requests and SQL queries to databases. Although Lutz and Klien (2006) studied ontology-based retrieval of geographic information by discovering appropriate spatial data sources from the metadata of catalogues through formulating a WFS query using terms from a shared ontology, their study has some limitations: (1) they assumed requesters only search one data-set source that provides all the requested information. In reality, this might not always be the case as many real-world applications cannot completely perform their tasks by only querying a single data-set, and usually they need to request data from separate sources to fulfill their needs; (2) they limited their search to discovering WFSs that provide semantically appropriate feature types (semantic annotation of feature types) and did not consider spatial reasoning and complex queries (queries that consist of two or more OGC WFS queries); and (3) their system works on smaller data-sets where the designer of the interface must have intimate knowledge of the data semantics so that she/he can build a translation layer from the user query to the concept query.

In this study, we go a step further and explore how to automatically discover and compose appropriate feature-level geospatial data from different sources using ontology-based WFSs. We study how to search and compose the appropriate WFSs by extending the normal logic-based reasoning to spatial reasoning. We develop searching algorithms for fast retrieval of feature-level geospatial data from separate WFS sources and a composing algorithm for combining two or more OGC WFS queries so as to allow for complex queries and integration of feature-level data. Finally, we describe a system that can work on large data-sets where the system developer needs much less knowledge about data semantics.

8. Discussion and conclusion

This study examines the use of geospatial semantic web technologies that are capable ofsolving the issue of automatically discovering and composing semantically heterogeneousgeospatial features. A solution is proposed by providing semantic specifications of WFSs.Algorithms for automatic geospatial feature discovery and WFS composition are developedin this study. A prototype system was implemented to test the proposed solution. Resultsshow that ontologies for WFSs are useful for conveying geospatial semantics and allowautomatic geospatial feature discovery and WFS composition.

Despite these advantages, the approach has limitations. The main concerns are as follows.

First, although DL is an excellent tool to support ontology reasoning and knowledge acquisition, ontology modeling in DL is far from easy and is not an intuitive task. It is difficult to represent each single real-world object with axioms about concepts and roles. Furthermore, the reasoning strength of DL depends on the quality of the ontologies. Second, although the proposed framework supports spatial reasoning by using extended DL, its spatial reasoning abilities are limited, and further extensions would be necessary to make spatial terminological reasoning practical.

Third, in the proposed framework, discovery remains semiautomatic. Without any constraints, completely automatic discovery is not yet possible. In our implemented prototype, we manually match feature types/properties to the predefined domain/application ontology definitions. At this stage, this is unavoidable, as there is no general way to infer the true intent of a feature type from its description alone. Service providers and consumers of the same domain have to agree on some compatible application ontology to support effective service search and discovery tools. Another possibility is to combine ontology with information retrieval methods to support keyword-based search and retrieval with some semantic reasoning. For example, one can use search keywords to locate relevant concepts in the domain/application ontology and then use the retrieved concepts to find appropriate WFSs. This hybrid approach could be more useful when there are a large number of ontological concepts and users are not familiar with the domain or application ontology.
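The hybrid keyword-plus-ontology lookup described above can be sketched as follows. This is only a toy illustration under stated assumptions, not part of the implemented prototype: the concept hierarchy `PARENT`, the labels in `LABELS`, the feature type annotations in `ANNOTATIONS`, and all function names are hypothetical, and plain substring matching stands in for a real information retrieval method.

```python
# Hypothetical toy ontology: concept -> parent concept (None for a root).
PARENT = {
    "Waterbody": None,
    "River": "Waterbody",
    "Lake": "Waterbody",
    "Road": None,
}

# Hypothetical human-readable labels attached to each concept.
LABELS = {
    "Waterbody": ["water body", "hydrography"],
    "River": ["river", "stream"],
    "Lake": ["lake", "pond"],
    "Road": ["road", "street"],
}

# Hypothetical manual annotations: WFS feature type -> ontology concept.
ANNOTATIONS = {
    "wfsA:rivers": "River",
    "wfsB:lakes": "Lake",
    "wfsC:streets": "Road",
}


def concepts_for_keyword(keyword):
    """Step 1: locate concepts whose labels contain the search keyword."""
    kw = keyword.lower()
    return {c for c, labels in LABELS.items() if any(kw in lab for lab in labels)}


def is_subconcept(concept, ancestor):
    """Walk up the hierarchy to test whether concept is subsumed by ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT[concept]
    return False


def find_feature_types(keyword):
    """Step 2: return feature types annotated with a matched concept or a subconcept."""
    hits = concepts_for_keyword(keyword)
    return sorted(
        ft for ft, concept in ANNOTATIONS.items()
        if any(is_subconcept(concept, h) for h in hits)
    )
```

With these toy data, a search for "hydrography" matches the concept `Waterbody` and therefore retrieves both the river and the lake feature types through subsumption, even though neither annotation mentions the keyword itself.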

In addition to the above concerns, several other issues, such as fault/error handling, reliability, and security, still need further research.

Acknowledgment

We thank the anonymous reviewers for their constructive comments on the manuscript. This research is partially supported by USA NSF grant No-0616957. The authors bear sole responsibility for all of the viewpoints presented in this article.

References

Alam, A., Khan, L., and Thuraisingham, B., 2008. Geospatial Resource Description Framework (GRDF) and security constructs. In: Data Engineering Workshop, 2008. ICDEW [online], 475–481. Available from: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4498363 [Accessed 10 November 2008].

Alam, A., et al., 2007. DAGIS: a geospatial semantic web services discovery and selection framework. In: F. Fonseca, M.A. Rodríguez, and S. Levashkin, eds. Lecture notes in computer science for GeoS 2007, LNCS 4853. Heidelberg: Springer-Verlag, 268–277.

Baader, F., et al., 2003. The description logic handbook: theory, implementation, and applications. Cambridge: Cambridge University Press.

Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12, 299–314.

Brogi, A., Corfini, S., and Popescu, R., 2005. COMPOsition-Oriented Service Discovery. In: T. Gschwind, U. Aßmann, and O. Nierstrasz, eds. Lecture notes in computer science for SC 2005, LNCS 3628. Heidelberg: Springer-Verlag, 15–30.

Gone, M. and Schade, S., 2007. Towards semantic composition of geospatial web services – using WSMO in comparison to BPEL. In: F. Probst and C. Keßler, eds. GI-Days 2007 – young researchers forum: proceedings of the 5th geographic information days, Münster: IfGI, 43–63.

Granell, C., Poveda, J., and Gould, M., 2004. Incremental composition of geographic web services: an emergency management context. In: Proceedings of the 7th conference on geographic information science (AGILE 2004). Heraklion, Greece: University of Crete Press, 343–348.

Jagery, E., Altintas, I., and Zhang, J., 2005. A scientific workflow approach to distributed geospatial data processing using web services. In: 17th international conference on scientific and statistical database management (SSDBM’05). Santa Barbara, CA: University of California, 87–90.

Klien, E., Lutz, M., and Kuhn, W., 2006. Ontology-based discovery of geographic information services: an application in disaster management. Computers, Environment and Urban Systems, 30, 102–123.

Klien, E., et al., 2004. An architecture for ontology-based discovery and retrieval of geographic information. In: Proceedings of the 7th conference on geographic information science (AGILE 2004). Heraklion, Greece: University of Crete Press, 574–578.

Kona, S., et al., 2006. Semantics-based efficient web service discovery and composition [online]. Available from: http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-196/alpsws2006-paper6.pdf [Accessed 10 November 2008].

Kuhn, W., 2005. Geospatial semantics: why, of what, and how? Journal of Data Semantics, 3, 1–24.

Lemmens, R., 2004. Applicability of the semantic web to geo service chaining. In: Proceedings of OpenGIS information communities and semantics working group meeting, Southampton: International Institute for Geo-Information Science and Earth Observation (IIC).

Li, W., Yang, C., and Raskin, R., 2008. A semantic enhanced model for searching in spatial web portals. In: AAAI spring symposium semantic scientific knowledge integration technical report SS-08-05, Palo Alto, CA: the Association for the Advancement of Artificial Intelligence (AAAI), 47–50.

Liang, Q., et al., 2004. A semi-automatic approach to composite web service discovery, description and invocation. International Journal of Web Services Research, 1, 64–89.

Lutz, M., 2007. Ontology-based descriptions for semantic discovery and composition of geoprocessing services. Geoinformatica, 11, 1–36.

Lutz, M. and Klien, E., 2006. Ontology-based retrieval of geographic information. International Journal of Geographical Information Science, 20, 233–260.

OGC, 2006. Geospatial semantic web interoperability experiment report. Document 06-002r1. Available online at http://www.openspatial.org/projects/initiatives/gswie.

Paolucci, M., et al., 2002. Semantic matching of web services capabilities. In: I. Horrocks and J. Hendler, eds. Lecture notes in computer science for international semantic web conference (ISWC) 2002, LNCS 2342. Heidelberg: Springer-Verlag, 333–347.

Randell, D.A., Cui, Z., and Cohn, A.G., 1992. A spatial logic based on regions and connections. In: Proceedings of the international conference on principles of knowledge representation and reasoning (KR’92), Cambridge, MA: Morgan Kaufmann Publishers.

Sirin, E., Hendler, J., and Parsia, B., 2003. Semi-automatic composition of web services using semantic descriptions. In: Proceedings of web services: modeling, architecture and infrastructure. Workshop in Conjunction with ICEIS2003, Angers, France: International conference on Enterprise Information Systems (ICEIS) Press.

Sycara, K., et al., 2003. Automated discovery, interaction and composition of semantic web services. Web Semantics: Science, Services and Agents on the World Wide Web, 1, 27–46.

Tanasescu, V., et al., 2006. A semantic web services GIS based emergency management application. In: Proceedings of Workshop on Semantic Web for eGovernment of ESWC 2006. Budva, Montenegro: ESWC.

Wiegand, N. and Garcia, C., 2007. A task-based ontology approach to automate geospatial data retrieval. Transactions in GIS, 11, 355–376.

Wiegand, N. and Zhou, N., 2005. Ontology-based geospatial web query system. In: P. Agouris and A. Croitoru, eds. Next generation geospatial information: from digital image analysis to spatio-temporal databases. London: Taylor & Francis, 157–168.

Wiegand, N., et al., 2004. Ontology-based geospatial XML query system. In: Proceedings of the National Digital Government Conference, Seattle: Digital Government Research Center, 289–290.

Yang, C., et al., 2008. Distributed geospatial information processing: sharing distributed geospatial resources to support Digital Earth. International Journal of Digital Earth, 1, 259–278.

Yu, T. and Lin, K. J., 2005. Service selection algorithms for composing complex services with multiple QoS constraints. In: B. Benatallah, F. Casati, and P. Traverso, eds. Lecture notes in computer science for service-oriented computing ICSOC 2005, LNCS 3826. Heidelberg: Springer-Verlag, 130–143.

Yue, P., et al., 2007. Semantics-based automatic composition of geospatial web service chains. Computers and Geosciences, 33, 649–665.

Zhang, C., et al., 2008. Transforming transportation data models from UML to OWL ontological representation. Journal of Transport Research Board: Transport Research Record, 2064, 81–89.

Zhao, T., et al., 2008. Ontology-based geospatial data query and integration. In: T.J. Cova, et al., eds. Lecture notes in computer science for the fifth international conference on geographic information science 2008, LNCS 5266. Heidelberg: Springer-Verlag, 370–392.
