

Computers, Environment and Urban Systems 33 (2009) 349–362


A functional perspective on map generalisation

Omair Z. Chaudhry a,*, William A. Mackaness a,1, Nicolas Regnauld b

a School of Geosciences, The University of Edinburgh, Drummond St., Edinburgh EH8 9XP, UK
b Ordnance Survey Research Labs, Ordnance Survey, Romsey Road, Southampton SO16 4GU, UK

Article info

Keywords: Map generalisation; Multiple representations; Data modelling; Database enrichment

Abstract

In the context of map generalisation, the ambition is to store once and then maintain a very detailed geographic database. Using a mix of modelling and cartographic generalisation techniques, the intention is to derive map products at varying levels of detail – from the fine scale to the highly synoptic. We argue that in modelling this process, it is highly advantageous to take a ‘functional perspective’ on map generalisation – rather than a geometric one. In other words, to model the function as it manifests itself in the shapes and patterns of distribution of the phenomena being mapped – whether it be hospitals, airports, or cities. By modelling the functional composition of such features we can create relationships (partonomic, taxonomic and topological) that lend themselves directly to modelling, to analysis and most importantly to the process of generalisation. Borrowing from ideas in robotic vision this paper presents an approach for the automatic identification of functional sites (a collection of topographic features that perform a collective function) and demonstrates their utility in multi-scale representation and generalisation.

© 2009 Omair Z. Chaudhry. Published by Elsevier Ltd. All rights reserved.

0198-9715/$ - see front matter © 2009 Omair Z. Chaudhry. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.compenvurbsys.2009.07.002

* Corresponding author. Tel.: +44 (0) 23 8030 5044; fax: +44 (0) 23 8030 5072.
E-mail addresses: Omair.Chaudhry@ordnancesurvey.co.uk (O.Z. Chaudhry), [email protected] (W.A. Mackaness), Nicolas.Regnauld@ordnancesurvey.co.uk (N. Regnauld).
1 Fax: +44 131 650 2524.

1. Introduction

From a database perspective a map can be viewed as a set of geometries rendered via a look up table of symbols, together with associated text. From the human view however, the map reflects a collection of concepts, often grouped or connected together in clearly defined ways – as a result of both physical and human processes. Thus the viewer does not see a twisty blue line, but sees a meandering river as it snakes through the delta on its way to the sea. The viewer does not see a dense collection of small angular polygons, but sees a collection of buildings, performing many different but related tasks that all contribute to the idea of urban space and the city. The cartographer takes advantage of the viewer’s interpretative view when they come to generalise at smaller scales (Mackaness, 2007). An icon of an aeroplane substitutes for the multi-storey car parks, hangars, terminals, aprons, and runways that make up an airport. The letter ‘H’ replaces the clinics, car parks, outpatient facilities, heating plant, and wards that typically constitute our understanding of what is meant by ‘Hospital’. And a simple dot with the word ‘Jakarta’ next to it is used to locate and convey this vast megalopolis. So the key argument around ‘functional site modelling’ is that the generalisation process would be hugely facilitated by data that were ‘functionally tagged’ and structured in a similar manner, so that, at changing levels of detail, fine scale phenomena could be grouped together according to their collective function. In this manner, all the features that constitute an airport, a hospital, or a city could be replaced by a more generalised form (an aeroplane symbol, the letter ‘H’, or a dot with the word ‘Jakarta’ next to it). Thus we can define a functional site as a collection of objects (natural or anthropogenic), usually in proximity to one another, which collectively perform a specific function. The term ‘site’ is considered to cover a range of geographies. Just a few examples might be: schools, retail parks, business districts, airports, docks, or cities – each of which we can associate with a particular function or set of functions.

We argue that from a multi representational database perspective (Mustière & van Smaalen, 2007) we need to make explicit the nested connections that exist between these functional sites, as well as the components that constitute them: for example the car park, platforms, station building and shunting yards that constitute a ‘railway station’, or the playing fields, classrooms, and sports hall that constitute a ‘school’. In this manner we can deliver appropriate visualisations of these functional sites or their ‘components’, at different levels of detail. Such an approach can also facilitate automated text placement (Barrault, 1995; Zhang & Harrie, 2006), for example where text is used to convey the extent, importance or function of something. Furthermore, by linking a functional site with its components we can automate the update process. For example as more suburban houses are built at the edge of the city, we can automatically update the extent of the city boundary, because we have made the explicit link between the function of the boundary and the objects that constitute it.

A further benefit of this functional perspective is that we can explicitly model the relationships between functional sites. For example, if we understand the function of a road, and that of the hospital, then we can define the notion of access points between the two. We can also model relationships between functional sites, such as the service provided by the hospitals to the city. This richer view provides a more intuitive framework by which we might combine third party data. Increasingly third party institutions are using the ‘framework data’ of National Mapping Agencies to answer themed questions that go far beyond conventional series mapping. These ‘themed’ questions might relate to evacuation strategies or access pathways to hospitals, to maintenance contracts of large civil installations, or to transportation planning. In these instances we need to know precisely what components constitute a particular site (an industrial estate, a shopping centre, a forestry reserve) – and in these contexts we need a database that supports this functional description of space, and the relationships between sites. In the next section we justify the need for an automated approach to the identification of functional sites. Section 3 characterises functional sites as a precursor to the development of a generic model. Section 4 details the methodology that falls from the framework described in Section 3. Section 5 presents results from the implementation, together with an evaluation that points to future work.

2. Database requirements

Rich attribution of the entities that comprise the functional sites is one way of explicitly modelling functional site membership. For example we can use taxonomic and partonomic labelling that explicitly states that (say) (a) ‘this is a playing field’ and (b) ‘it is part of Newbury Grammar School’. The problem with creating such multi representational databases is that in the past field surveyors have collected data in anticipation of their cartographic representation at a specific scale – where there has been no requirement to precisely and consistently categorise features, and where membership is self apparent (for example, that a terminal is part of Heathrow Airport), thus obviating the need for this type of attribution. This cartographic view has resulted in databases that (1) are poorly attributed; (2) are inconsistent; (3) use mixed approaches to geo-referencing and address labelling; and (4) do not make explicit the obvious association between the functional components and their parent site. In other words they do not support multiple representations. In the case of Ordnance Survey (the National Mapping Agency of Great Britain) data, we have cases where the address of the terminal does not contain the name of the airport, and the attribution associated with a car park is so vague that it is very hard to ascertain which functional site it serves, or even that it is a car park.

2.1. Solution

If each object was richly and consistently attributed, it would be a simple matter to determine the partonomic structures and functional descriptions of higher order concepts through the interrogation of those attributes. Using humans to attribute the database would be a huge undertaking; every object would have to include descriptions of multiple partonomic memberships. Given the enormity of this task, it is highly desirable to seek an automated solution to this problem – in other words to automate, as much as possible, the process by which components are associated with a particular functional site. Because existing attribution may be inaccurate, and because the composition of functional sites can be complex (for example components may be dispersed in the way that city universities often are), it is unlikely that a completely automated solution is achievable, and therefore any solution should anticipate the involvement of a human in validating the partonomy and extent of a functional site.

3. A classification of functional sites

Functional sites are typically made up of components of different classes. For example a refinery might be made up of port facilities, storage vessels, bounded ponds, and tanker depots. Functional sites may be nested. For example fish processing, boat maintenance and sea rescue functions can all exist within the functional site ‘port’. Some sites have very crisp boundaries (the airport delimited by a security fence) whilst others are much more open to interpretation (for example the extent of suburbia or a mountainous area) (Smith & Varzi, 2000). There is considerable variation in the areal extent of functional sites (from a small railway station to a massive city), and this variation can exist even where they serve the same function (for example the educational function of a small primary school as compared with a large university campus). All these factors complicate the process of identification and validation. We discuss such factors later in this paper. Further, the task of classification is made difficult by the fact that our notion of a functional site varies with context and geography. One might ask how the epithet ‘Lake District’ was justified, or at what point a city is deemed to have a ‘financial district’ (indeed, what is meant by ‘District’ in each of these cases). Collectively we see that the relationship between functional sites is multi-scaled and complex. Table 1 is a tiny subset of all functional sites that seeks to illustrate the variability of sites (by class and size). It illustrates that there are broad categories for which representation is appropriate at a range of scales (or levels of detail).

3.1. An object ontology framework

It is interesting to note that the human eye is able to examine the map, and often with no textual attribution is able to infer and identify a wide range of functional sites because of the shape of their components and the way they interact with surrounding features. For example dockyards can be said to interact with (or connect together) sea and land based networks. Peruse a map, and one can identify various ways by which you can discern: (1) the type of functional site, (2) its components, and (3) its geographical extent. From such observations, we can begin to identify qualities that can potentially be used to identify and discern the extent of functional sites in an automated context (Table 2).

Various research has explored ways of automatically enriching the description of entities. Thomson and Béra (2007, 2008) used descriptive logic reasoning to aid in the classification and enrichment of OS MasterMap® features into higher level concepts (terraced, semi-terraced and detached houses). Another project at Ordnance Survey used ontologies to aid in the identification of farming land data in OS MasterMap (Kovacs & Zhou, 2007). Lüscher, Weibel, and Burghardt (2008) and Lüscher, Weibel, and Mackaness (2008) used an algorithmic approach for the automatic identification of instances of similar higher order concepts. The advantage of an algorithmic approach is that it can handle fuzzy membership and uncertainty, as well as cope with large volumes of data.

When it came to modelling the relationships between function and form, we noted interesting parallels with work in robotic vision systems and understanding. For example, Wang, Kim, and Kim (2005) describe an ‘object ontology framework’ in which the vision system of the robot seeks to understand the function of an object from its compositional form. To capture the compositional nature of functional sites, we adopted Wang et al.’s (2005) feature-based perspective, in which a ‘component’ is defined as a functionally significant subset of a functional site. Each component is characterised according to its function and use (Fig. 1). By applying form-function reasoning, we deduce its functional elements, together with their metric and topological relations and constraints – those that exist within and between functional sites.

Table 1
A simplistic matrix of functional types against the level of detail they might be shown at.

Functional type | High level of detail | Medium level of detail | Low level of detail
Commerce | Building | Depot, harbour | Port complex
Attractions | Gardens, castles, monument | Historical, cultural sites | National parks, maritime reserves
Education | Schools, college | University |
Public infrastructure/settlement | Hospital, civil, military establishments, prisons | Town, housing/industrial estate, parks, financial district | City, military range
Manufacturing | Factory | Industrial estate | Processing plants
Retail | Shops | Retail park, shopping mall |
Transport | Stations (bus/rail), local airfield | International airports | Networks (air, land, water)
Natural landscape | Hillock | Island group, moorland | Archipelago, mountain chain, rural areas
Water | Streams, pond | Lake groups, estuaries | Seas

Table 2
Measures we can use to discern site type and extent.

Measure | Example
Areal extent | International airports large, primary school small
Shape and angularity of polygons | Angular form to anthropogenic structures, smooth/irregular boundaries to natural features
Prototypical composition | Schools have playfields, airports have runways
Patterns/repetitions, orientations and arrangements | Repeat pattern of rail sidings, or blocks of flats; shopping mall defined by arrangement of shops and shared/central parking areas
Textual/categorical attribution | ‘Southampton General Hospital’
Topological adjacency and network connectivity | Adjacency of playfields and school buildings; network connections between facilities within hospital grounds
Access points (external) | Factories and connection to the (pedestrian, road, rail) network
Distance from site ‘centre’/topological ‘depth’ | Objects further away or topologically disconnected are less likely to be part of a functional site
Physical impediment | A road or ‘obstructing feature/boundary’ (wall or fence) may lie between the school and its playing field
Ownership boundary (land cadastre) | Garden and house lie within its property boundary

Thus a functional site ‘school’ decomposes into components: playing field, parking space, classroom/admin buildings, and security office. Each of these components has a specific function – recreation and sports, car parking, and rooms in which to teach, manage school activities, and control access, respectively. Such a model allows us to model usage over time (vacation/term time), and activity (shared use of playing fields). A component’s function determines its ‘functional elements’. For example the functional elements of a car park are that it is connected to the road network, has a hard surface, and has road markings so as to facilitate the parking of cars. It has to be of a minimum size, is regular in form, and (in the context of a school) proximal to where teaching takes place. In other words the component function (car storage) governs the functional elements as well as the topological and metric constraints and relations. A functional site may have any number of components, and each component may itself be composed of many sub-components. Thus this ‘functional site/component’ model should share the same set of properties. Given the interpretive capacity of the human eye, we should not be surprised by this link with robotic systems and understanding.

Fig. 1. Overview of the object ontology framework. (Diagram labels: Functional site, decompose, Component, characterise, Function, form-function reasoning, Functional elements, Metric/topological relations & constraints.)

4. Methodology

Given this object ontology framework, we can begin to see how we might derive functional sites from a topographic data set (such as OS MasterMap). In essence we start from the right hand side of Fig. 1 and identify functional sites by searching for their components (features in the source database). There are many different types of functional site, so we started with the development of a generic functional site and then looked to refine its properties in order to be able to model qualities and properties specific to some of the functional types listed in Table 2. In this paper we illustrate how the methodology was able to identify three different types of functional sites (hospitals, schools and airports) and how the generic implementation was extended in order to model specific properties of each functional site (FS).

Fig. 2. Entity relationship diagram for functional site.

Fig. 3. Methodology for constructing functional sites.

4.1. Typical properties of functional sites

In order to automatically identify FS we need to take account of their properties (Table 2) and from a pragmatic point of view we need to consider the existing data model and fields that currently describe the data. In Ordnance Survey data, features (buildings) to which post is delivered each have an ‘address point’. We can use this as a ‘start point’ from which to ‘grow’ a FS because the address has information that allows us to identify the type of FS. Features with an address point are referred to as the primary component. The features surrounding it are deemed to be secondary components. It is assumed that any given FS has access to the road network. The following list summarises these and other assumptions. Fig. 2 describes the data model linking the functional site and its components.

Definition of terms:

• Each functional site is of a type listed in Table 1.
• The components of a functional site are classified as being either primary or secondary.
• Each functional site has at least one primary component.
• Each functional site can have any number of secondary components.
• A functional site can have a name.
• Secondary components are assumed to be adjacent to the primary component.
• Each functional site should be accessible via the road network.
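The assumptions listed above can be read as a small data model. The following is a minimal sketch in Python, not the authors' implementation: the class names, the `feature_id` field and the `is_valid` check are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Component:
    feature_id: str                 # hypothetical identifier of a topo polygon
    role: str                       # "primary" or "secondary"
    function: Optional[str] = None  # e.g. "Hospital", taken from an address feature


@dataclass
class FunctionalSite:
    site_type: str                  # a type such as those listed in Table 1
    name: Optional[str] = None      # a functional site can have a name
    components: List[Component] = field(default_factory=list)
    access_points: List[Tuple[float, float]] = field(default_factory=list)

    def primaries(self) -> List[Component]:
        return [c for c in self.components if c.role == "primary"]

    def secondaries(self) -> List[Component]:
        return [c for c in self.components if c.role == "secondary"]

    def is_valid(self) -> bool:
        # Every component is primary or secondary, and there is at least
        # one primary component; any number of secondaries is allowed.
        roles_ok = all(c.role in ("primary", "secondary") for c in self.components)
        return roles_ok and len(self.primaries()) >= 1


site = FunctionalSite(site_type="Hospital", name="Southampton General Hospital")
site.components.append(Component("poly-1", "primary", "Hospital"))
site.components.append(Component("poly-2", "secondary"))
```

The adjacency and road-access assumptions are not checked here; they depend on geometry and are better enforced while the site is being built.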

The process of creating a FS begins with the selection of road network data and features contained by selected road partitions. Though a FS can sometimes lie on either side of a major road, it is initially assumed that the FS lies within a road partition (described later in Section 4.3), and this is used as a basis for constraining the search for likely components. What is likely to be part of any given FS is reflected in a set of rules (discussed further in Section 4.4). Once the extent of the FS has been determined, the points of intersection between the road network and the boundary of the FS are used to identify access points to the FS. The overall process is summarised in Fig. 3. The subsequent sections discuss each process in detail beginning with an overview of the data sets used.
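As an illustration of the access-point step, the sketch below intersects road segments with the edges of a site boundary. It is a simplified assumption of how this could be done: the function names are hypothetical, the boundary is a plain list of vertices, and roads that are parallel to an edge or that merely come close to the boundary are ignored.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def segment_intersection(p1: Point, p2: Point, q1: Point, q2: Point) -> Optional[Point]:
    """Intersection of segments p1-p2 and q1-q2, or None (parallel cases ignored)."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    sx, sy = q2[0] - q1[0], q2[1] - q1[1]
    denom = rx * sy - ry * sx
    if denom == 0:
        return None  # parallel or collinear segments are skipped in this sketch
    qpx, qpy = q1[0] - p1[0], q1[1] - p1[1]
    t = (qpx * sy - qpy * sx) / denom  # position along p1-p2
    u = (qpx * ry - qpy * rx) / denom  # position along q1-q2
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p1[0] + t * rx, p1[1] + t * ry)
    return None


def access_points(boundary: List[Point], roads: List[Segment]) -> List[Point]:
    """Points where any road segment crosses any edge of the closed boundary."""
    found = []
    for i in range(len(boundary)):
        a, b = boundary[i], boundary[(i + 1) % len(boundary)]
        for q1, q2 in roads:
            hit = segment_intersection(a, b, q1, q2)
            if hit is not None and hit not in found:
                found.append(hit)
    return found


site_boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]
road = [((5, -5), (5, 5))]  # a road entering the site through its southern edge
points = access_points(site_boundary, road)
```

A production version would more likely rely on a geometry library and a distance tolerance, since the text notes that access points also arise where the boundary comes close to the road network.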

4.2. Source data sets

The topographic (categorical) data set used in this research is OS MasterMap®. OS MasterMap data forms a complete coverage of Great Britain. It is object oriented and stores data in a seamless format (Ordnance Survey, 2007). OS MasterMap is divided into four sub data sets (referred to as layers by Ordnance Survey). These are the Address layer, Imagery layer, Topography layer and Integrated Transport Network (ITN) layer. Except for the Imagery layer all others were used in this research. In order to understand the proposed approach it is important to understand the fundamental properties of these layers.

4.2.1. Topography layer

The features in the Topography layer (‘topo’) are captured at a high level of detail (1:1250 scale in urban areas, 1:2500 scale in rural areas and 1:10,000 scale in mountain and moorland areas) and are stored in vector format. Each feature in the topo layer has one of three geometrical structures – a point, a line or a polygon. The polygon features form a complete ‘geometric partition’ of space (no holes and no overlaps) (Molenaar, 1998). These polygon features represent features that appear in the landscape, such as buildings, vegetation patches, land, water bodies and roads.

In addition to polygon features we also use linear features. Each polygon feature boundary coincides with these line features (Fig. 4). When the data were originally captured, each line feature was classified according to its accessibility (either non-obstructing or obstructing) (Fig. 4). This accessibility classification is used during the building of a functional site (discussed in Section 4.4). A polygon can be bounded by many boundary lines. In topological terms these boundary lines are edges of the face (polygon). Each topo polygon and line feature has a unique identifier.

4.2.2. Address layer

The OS MasterMap address layer contains approximately 29 million addressed features within Great Britain. Each feature has a point geometry. Features have a postal address (for instance commercial or residential building) or a non-postal address (for instance car parks, churches, halls). Each feature has a function attribute which describes what the feature is (i.e. a house, commercial building, hospital, school, airport building). This attribute can be used to define the type of functional site (Fig. 2). It also has a cross reference identifier (unique identifier associated with the polygon) which links it to a corresponding polygon feature in the topo layer. Some of the address layer features have a name attribute (for example ‘Southampton General Hospital’ or ‘Watson Primary School’ or ‘Hospital Car Park’). For historical reasons, there is a fair degree of inconsistency in the attribution. For example the various components of Southampton General Hospital might be named ‘RSH’, ‘Royal Southampton Hospital’ or ‘Hospital’. These inconsistencies mean it is not possible to group components based only on their classification or their attribution.

Fig. 4. Boundary lines for topo polygon features and their accessibility classification (Ordnance Survey © Crown Copyright. All rights reserved).

4.2.3. ITN layer

Transportation access was deemed to be an important defining quality of a FS. Access points were determined by inspecting where the boundary of the FS intersected or came close to the road network. This involved using ITN data which currently contains the road network and road routing information for Great Britain. The features in the ITN data set are topologically structured and the geometry is stored in the form of graph theoretic elements (nodes and segments). By using the ‘descriptive term’ that defines the class of each ITN feature (‘private road’, ‘A road’, ‘B road’, and ‘local street’) it is possible to both identify secondary components, and to define the extent of the partition in which the search takes place (Section 4.3).

4.3. Limiting the search space

It was observed that components typically associated with a functional site are usually contained within the same road partition – defined as any smallest closed loop in the road network. The partitions were created using ITN road segments, though road segments that were classified as ‘private road’ or ‘local street’ were not used for partitioning. This is because it was observed that some functional sites such as hospitals and schools can have certain features such as ‘car parks’ or ‘playing ground’ either side of ‘private roads’ or ‘local streets’. Once these partitions were created the algorithm selects those partitions that contained a particular type of address layer feature. For example if we want to build functional sites of type ‘Hospital’, the algorithm will first select all address layer features that have a functional attribute ‘Hospital’. It will then select those partitions that contain these selected address layer features.

Table 3
Classification of the boundary lines shown in Fig. 5.

Classification | Boundary lines in Fig. 5
Obstructing-building | 3
Non-obstructing | 1, 5, 4, 7, 8, 10, 11
Obstructing | 2, 6, 9, 12, 13, 14, 15, 16, 17, 18, 19

Once the partitions have been selected the algorithm then selects all the features from the topo layer (polygons and lines), address layer and ITN (segments and nodes) that intersect with these selected partitions. It is important to note that some higher level functional sites such as a ‘University’ or ‘Industrial Park’ may in reality have features across these road partitions but these can be dealt with via aggregation once their sub-functional sites have been built. Using road partitions in this manner is a useful way of managing the search space. Our source database (topo layer) contains 450 million unique features. It would be very inefficient to examine every feature’s partonomic relationship for each and every primary component. Tests confirmed that it was reasonable to assume that the secondary components would lie within the partitions as defined. A further post process can consider the likelihood of whether adjacent partitions contain functional sites that are indeed one and the same.
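The partition-selection step can be sketched as follows. This is a minimal illustration under stated assumptions: partitions are taken as already built and given as simple polygon rings, address features are plain dictionaries with hypothetical keys (`id`, `function`, `xy`), and containment is tested with a standard ray-casting point-in-polygon check rather than with any Ordnance Survey tooling.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Road classes that are not used when forming partitions (from the text above).
NON_PARTITIONING = {"private road", "local street"}


def partitioning_segments(segments: List[dict]) -> List[dict]:
    """Keep only the road segments that may bound a partition."""
    return [s for s in segments if s["descriptive_term"] not in NON_PARTITIONING]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    x, y = pt
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_partitions(partitions: Dict[str, List[Point]],
                      address_features: List[dict],
                      site_type: str) -> Dict[str, List[str]]:
    """Map each partition to the ids of the address features of site_type it contains."""
    selected = {}
    for pid, ring in partitions.items():
        hits = [a["id"] for a in address_features
                if a["function"] == site_type and point_in_polygon(a["xy"], ring)]
        if hits:
            selected[pid] = hits
    return selected


partitions = {
    "P1": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "P2": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
addresses = [
    {"id": "A1", "function": "Hospital", "xy": (5, 5)},
    {"id": "A2", "function": "School", "xy": (15, 5)},
    {"id": "A3", "function": "Hospital", "xy": (2, 8)},
]
hospital_partitions = select_partitions(partitions, addresses, "Hospital")
```

Only the selected partitions then need to be searched for secondary components, which is what keeps the search tractable on a database of this size.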

4.4. Building functional sites

The algorithm starts building the functional site by first identifying the primary component according to the address layer feature of a given type. In other words a FS of type ‘Hospital’ will have a primary component, a topo polygon feature, cross-referenced by an address layer feature having function ‘Hospital’. The primary component is assigned the same name as the cross-referenced address layer feature. The next step is the identification of its secondary components. The algorithm starts from the primary component and adds secondary components (topo polygons) based on their associated address, classification and the topo line feature’s semantic and geometric properties. The process starts by building an adjacency graph between the topo polygons and the line features for each FS.

4.4.1. Adjacency graph

The adjacency graph represents each map object (polygon or line feature) with a node. Edges connect nodes representing adjacent features (Molenaar, 1998), as in Fig. 5. Modelling adjacency between polygons via their boundary line features (as shown in Fig. 5b) allows adjacency between polygon features to be conditional on the properties of the line features that separate them.
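One way to picture this structure is the following minimal sketch, in which polygons and boundary lines are both graph nodes and every edge joins a polygon to one of its boundary lines. The tuple layout, the identifiers (`PC`, `a`, `l1`, and so on) and the function name `build_adjacency_graph` are assumptions made for illustration; they do not reproduce the data of Fig. 5.

```python
from collections import defaultdict

# Hypothetical input: (line id, classification, polygon on one side, polygon on the other).
boundary_lines = [
    ("l1", "non-obstructing", "PC", "a"),
    ("l2", "obstructing", "PC", "b"),
    ("l3", "obstructing-building", "a", "c"),
]


def build_adjacency_graph(lines):
    """Build an undirected graph whose nodes are polygons and boundary lines.

    Polygons are never linked directly: two polygons are adjacent only
    through the line node that separates them, so the line's classification
    can later decide whether the adjacency may be followed.
    """
    graph = defaultdict(set)
    line_class = {}
    for line_id, classification, poly_a, poly_b in lines:
        line_class[line_id] = classification
        for poly in (poly_a, poly_b):
            graph[poly].add(line_id)
            graph[line_id].add(poly)
    return graph, line_class


graph, line_class = build_adjacency_graph(boundary_lines)
```

Keeping the line features as nodes, rather than as labels on polygon-to-polygon edges, mirrors the point made above: adjacency becomes conditional on the properties of the separating line.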

4.4.2. Selection of secondary components

Once the graph has been generated we start with the primary component and traverse the graph following a breadth first search (Sedgewick, 1983). The boundary lines are retrieved from the graph.

Table 3 provides a classification of the boundary lines for the features shown in Fig. 5. Boundary lines that have a classification

Fig. 5. (a) Example of input data set (topo polygons and boundary lines). (b) Corresponding adjacency graph. The polygons are shown as dark grey nodes and boundary lines are shown in light grey nodes in the graph. PC denotes primary component.

‘obstructing’ are removed from the selection because such a classification means the adjacent feature (topo polygon) is not accessible via this boundary line. In the real world there might be a wall, a hedge or a fence between the features. But boundary lines with classification ‘non-obstructing’ are retained since adjacent features are accessible via this type of boundary line. A boundary line with classification ‘obstructing-building’ is retained since it can represent a wall with a door and therefore is considered to be accessible.

The adjacent polygon for each selected boundary line feature is then checked. A set of rules is applied to this polygon in order to ascertain whether it is part of the FS. These rules govern the association of a feature and the FS. They also reflect the degree of connection between the components based on their perimeter (discussed later in the paper). The rules are summarised in Table 4. Each polygon which satisfies these rules is selected as a secondary component. This process of selection of boundary lines and then adjacent polygons is repeated for each secondary component until no features are left to check. Fig. 6 illustrates how the FS ‘grows’ via this iterative selection process.
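The iterative growth described above can be sketched as a breadth first traversal that only crosses passable boundary lines. This is a simplified illustration, not the authors' implementation: the rule check is reduced to a caller-supplied predicate `accept`, a rejected polygon is not revisited (so the cumulative length to perimeter rule is left out), and all identifiers and sample data are hypothetical.

```python
from collections import deque

# Classifications through which an adjacent polygon is considered accessible.
PASSABLE = {"non-obstructing", "obstructing-building"}


def grow_functional_site(primary, lines, accept):
    """Breadth first growth of a functional site from its primary component.

    lines  : iterable of (line_id, classification, poly_a, poly_b)
    accept : predicate applied to each polygon reached via a passable line
    """
    neighbours = {}
    for _, classification, a, b in lines:
        if classification in PASSABLE:  # obstructing lines are dropped
            neighbours.setdefault(a, []).append(b)
            neighbours.setdefault(b, []).append(a)

    selected = [primary]
    seen = {primary}
    queue = deque([primary])
    while queue:
        current = queue.popleft()
        for poly in neighbours.get(current, []):
            if poly in seen:
                continue
            seen.add(poly)
            if accept(poly):
                selected.append(poly)  # becomes a secondary component
                queue.append(poly)     # and is expanded in turn
    return selected


# Hypothetical example data.
lines = [
    ("l1", "non-obstructing", "PC", "car_park"),
    ("l2", "obstructing", "PC", "garden"),
    ("l3", "obstructing-building", "car_park", "ward"),
    ("l4", "non-obstructing", "ward", "river"),
]
classes = {"car_park": "Car park", "ward": "Building",
           "garden": "Residential garden", "river": "River"}
excluded = {"Residential garden", "River", "Forest"}
site = grow_functional_site("PC", lines, lambda p: classes[p] not in excluded)
```

In this example the garden is never reached because its only boundary with the primary component is obstructing, while the river is reached but rejected by the predicate.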

The length to perimeter ratio rule in Table 4 takes into account the degree to which two polygons are deemed to be connected. The selected polygon can sometimes be a long thin polygon (polygon ‘g’ in Fig. 5, highlighted black in Fig. 7). It is not appropriate to select such polygons as this can result in selection of far off features which are unlikely to be part of a FS. This rule computes the ratio of the length of the selected boundary line to the perimeter of the adjacent polygon. For a long thin polygon this ratio is quite low. Such polygons are not selected but are placed in a queue. If the same polygon is again selected by a different non-obstructing boundary line the ratio is again computed and added to the previous ratio. If this new total ratio is above the threshold the polygon is removed from the queue and is added as a selected secondary component. In Fig. 6 polygon ‘g’ is not selected because the ratio between the boundary line ‘10’ and the polygon ‘g’ in Fig. 6c is below the threshold (empirically determined to be 0.25). Fig. 8 shows a hospital functional site created from features shown in Fig. 7.
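The cumulative behaviour of this rule can be shown with a short sketch. It assumes the queue is held as a dictionary of running totals per polygon and that the threshold test is inclusive (at or above 0.25); the function name `accumulate_ratio` and the lengths used are hypothetical.

```python
THRESHOLD = 0.25  # empirically determined value reported in the text


def accumulate_ratio(pending, polygon_id, shared_length, perimeter,
                     threshold=THRESHOLD):
    """Add one boundary line's length/perimeter ratio to a polygon's total.

    pending maps polygon ids waiting in the queue to their accumulated ratio.
    Returns True when the total reaches the threshold, in which case the
    polygon leaves the queue and can be selected as a secondary component.
    """
    total = pending.get(polygon_id, 0.0) + shared_length / perimeter
    if total >= threshold:
        pending.pop(polygon_id, None)
        return True
    pending[polygon_id] = total
    return False


# Hypothetical long thin polygon 'g' with a perimeter of 200 m.
pending = {}
first = accumulate_ratio(pending, "g", 20.0, 200.0)   # ratio 0.1, queued
second = accumulate_ratio(pending, "g", 40.0, 200.0)  # total 0.3, selected
```

A single short shared boundary therefore leaves the polygon queued, and only further shared boundaries with already selected components can bring it into the FS.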

4.5. Road access

Information concerning road access to a functional site is considered to be important in planning (for evacuation strategies or traffic modelling). The algorithm for the identification of access



Fig. 6. Traversal stages for the input data set shown in Fig. 5. In each stage dark nodes represent features that have been selected whereas grey nodes represent features being checked at the current stage; white nodes represent features not selected.

Table 4. The set of rules applied to each selected polygon feature based on its class.

Input class | Address rule | Classification rule | Length to perimeter ratio
Building | Non-postal; if postal then the associated address point has the type of the current FS in any of its attributes. For instance if a building polygon is postal of type ‘office’ but one of its attributes states office of hospital, then it is selected as a secondary component of FS type ‘Hospital’ | – | –
Road | – | Junction or private road or street |
Car park | Not explicitly restricted to a particular FS. For instance if the address attribute of a car park states that it is restricted to a particular functional site such as a hospital then it won’t be selected for any functional site other than hospital | |
All other class | | Not residential garden or river or forest | ≥0.25
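The rules in Table 4 can be read as a per-class predicate. The sketch below is one possible reading, not the authors' code: the feature dictionary keys (`class`, `postal`, `address`, `road_type`, `restricted_to`) are hypothetical, and it is assumed that the length to perimeter threshold applies only to the ‘all other class’ row.

```python
EXCLUDED_CLASSES = {"residential garden", "river", "forest"}
ROAD_CLASSES = {"junction", "private road", "street"}


def passes_rules(feature, fs_type, ratio, threshold=0.25):
    """Return True if a polygon may join a FS of type fs_type (assumed reading)."""
    cls = feature["class"].lower()
    if cls == "building":
        # Non-postal buildings pass; postal ones must mention the FS type
        # in at least one address attribute.
        if not feature.get("postal", False):
            return True
        values = feature.get("address", {}).values()
        return any(fs_type.lower() in str(v).lower() for v in values)
    if cls == "road":
        return feature.get("road_type", "").lower() in ROAD_CLASSES
    if cls == "car park":
        # Rejected only when explicitly restricted to a different FS.
        restricted = feature.get("restricted_to")
        return restricted is None or restricted == fs_type
    # All other classes: classification rule plus length to perimeter ratio.
    return cls not in EXCLUDED_CLASSES and ratio >= threshold


office = {"class": "Building", "postal": True,
          "address": {"type": "office", "organisation": "Office of hospital"}}
school_car_park = {"class": "Car park", "restricted_to": "School"}
surface = {"class": "General surface"}
```

With these sample features, the postal office building is accepted for a ‘Hospital’ FS because one of its address attributes names the hospital, whereas the car park restricted to a school is not.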


points uses ITN road network data (segments and their start and end nodes) selected during the acquisition stage (Section 4.3). Any FS is deemed to have one or several access points. Once the algorithm has built the FS (comprising primary and secondary components), it selects those ITN segments that intersect the aggregate geometry (Fig. 9). The node of these selected segments that lies just outside the aggregated geometry is selected as the ITN access point (Fig. 9). Once these points have been identified we create what are termed ‘hook points’. These are the points at which the selected ITN segments intersect with the aggregate geometry (Fig. 9). These hook points are intended to facilitate the addition of third party data such as ‘reserved car park’ or ‘delivery access point’ or ‘visitor access point’. Hook points are exclusive to a particular FS whereas ITN access points can be shared by many FS.

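Fig. 7. An example of a long thin polygon (highlighted in black) that is unlikely to belong to a FS (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 8. Hospital functional site automatically identified from the region shown in Fig. 7 (Ordnance Survey © Crown Copyright. All rights reserved).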

In some cases the algorithm failed to identify any access points. This was due to non-intersection of ITN segments with the aggregate geometry of a given FS (for example in Fig. 10). In such cases the algorithm performs a proximity analysis. It finds all ITN segments that are within a 10 m distance of the aggregate geometry (a distance derived from empirical analysis). The node (start or end) of these segments which is closest to the aggregate geometry is then selected as an ‘access point’. If no access points are found using this approach then the algorithm selects the node that is closest overall. This ensures that each FS has at least one access point (Fig. 10). It is important to note that currently only access points from the road network are determined. This is because only road data is currently available as a network. But in future a similar methodology can be adapted for other types of network such as railways, river or pedestrian networks.
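The access point logic, including the proximity fallback, can be sketched under strong simplifying assumptions: the aggregate geometry is replaced by an axis-aligned bounding box, only segments with exactly one node inside the box are treated as intersecting, and the 10 m test is applied to node distances rather than segment distances. All function names are hypothetical; the implementation described in this paper used JTS geometries instead.

```python
from math import hypot


def inside(pt, box):
    xmin, ymin, xmax, ymax = box
    return xmin <= pt[0] <= xmax and ymin <= pt[1] <= ymax


def hook_point(p_in, p_out, box):
    """Point where the segment from p_in (inside) to p_out (outside) leaves the box."""
    xmin, ymin, xmax, ymax = box
    dx, dy = p_out[0] - p_in[0], p_out[1] - p_in[1]
    t = 1.0
    if dx > 0:
        t = min(t, (xmax - p_in[0]) / dx)
    if dx < 0:
        t = min(t, (xmin - p_in[0]) / dx)
    if dy > 0:
        t = min(t, (ymax - p_in[1]) / dy)
    if dy < 0:
        t = min(t, (ymin - p_in[1]) / dy)
    return (p_in[0] + t * dx, p_in[1] + t * dy)


def dist_to_box(pt, box):
    xmin, ymin, xmax, ymax = box
    dx = max(xmin - pt[0], 0.0, pt[0] - xmax)
    dy = max(ymin - pt[1], 0.0, pt[1] - ymax)
    return hypot(dx, dy)


def access_points(segments, box, tolerance=10.0):
    """Return (access points, hook points) for a FS approximated by box."""
    access, hooks = [], []
    for a, b in segments:
        if inside(a, box) != inside(b, box):  # segment crosses the boundary
            p_in, p_out = (a, b) if inside(a, box) else (b, a)
            access.append(p_out)              # node just outside the FS
            hooks.append(hook_point(p_in, p_out, box))
    if not access:
        # Fallback: nearest node within the tolerance, else nearest overall.
        nodes = [n for seg in segments for n in seg]
        near = [n for n in nodes if dist_to_box(n, box) <= tolerance]
        access.append(min(near or nodes, key=lambda n: dist_to_box(n, box)))
    return access, hooks


site_box = (0.0, 0.0, 100.0, 50.0)
crossing = access_points([((50.0, 25.0), (50.0, 80.0))], site_box)
fallback = access_points([((120.0, 10.0), (160.0, 10.0))], site_box)
```

In the fallback case no hook point is produced, since no segment meets the boundary of the FS; the sketch simply guarantees at least one access point, as the text requires.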



Fig. 9. Selection of ITN ‘access’ points and creation of ‘hook’ point using ITN data (nodes and segments) (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 10. ITN segments do not intersect FS aggregate geometry. Algorithm selects access point using proximity analysis (Ordnance Survey © Crown Copyright. All rights reserved).


5. Implementation, results and evaluation

This approach to automatic identification of FS was implemented in Java using JTS (JTS, 2008) and Oracle 10g (Oracle, 2005). JTS was used to model the geometry in Java and also to perform various topological and metric operations (distance, intersection, union). The Oracle database holds the entire source data sets used in this research. JDBC was used to establish the connection between Oracle and Java (Reese, 2000). The implementation converted Oracle geometries into JTS geometries. Results were stored in GML format. This GML file was then visualised in open source JUMP (Jump, 2008) or it can be easily converted into other proprietary formats. If required the output files can be exported back into the Oracle database.

Fig. 11 shows the results of the algorithm for three functional sites of type hospital, together with their access points. Fig. 12 shows their (manual) representation at 1:25,000 scale. The feature highlighted in Fig. 11 is an example of a feature that would probably not be included if using a manual approach, but was included here because the length to perimeter threshold condition was satisfied.

Figs. 13–15 show examples of functional sites of type school identified by the algorithm. Comparison of output with the 1:25,000 scale mapping in Fig. 15b highlights another limitation where the algorithm has failed to include features attributed as ‘General surface’ (to the right of the FS in Fig. 15a). This feature is most likely to be a playground but is not attributed as such. On inspection it was determined that the algorithm did not include this feature because of the rule that prevents the checking of adjacent polygons that are bounded by ‘obstructing’ topo lines. Thus for FS that are schools, it was necessary to refine the rules in order to incorporate this feature (Section 5.1).

Figs. 16 and 17 present an airport functional site created using this approach and its representation at 1:25,000 scale mapping respectively. Fig. 16 highlights (in black) ‘holes’ in the functional site. We also observe these holes in Figs. 11 and 15a. These holes appear because the features at these places do not fulfil the criteria necessary to become part of the functional site. As a future refinement such features could be identified at the end of the process and could be included using a membership likelihood value (Section 5.1).


Fig. 11. Three hospital functional sites and their access points in Southampton UK. A complex polygon is highlighted in black (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 12. 1:25,000 Scale mapping showing the extent of the three hospital sites shown in Fig. 11 (Ordnance Survey © Crown Copyright. All rights reserved).


5.1. Evaluation and refinement

The issues raised in Section 5 reflect the complex nature of functional sites (in their topology, shape, pattern and mixed composition). In defining an automated solution, the challenge is in avoiding both omission errors (missed components) and commission errors (falsely included components). Too many constraints, or a failure to take into account the type of FS, will result in omission errors, whereas overly relaxing constraints will result in commission errors. The skill is in: (1) minimizing both types of error and (2) being able to bring to the attention of the user suspected cases of either type of error. Here we propose two refinements to the


Fig. 13. Three school functional sites and their access points in Southampton UK (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 14. The 25k map representation showing ‘extents’ of the three school sites shown in Fig. 13 (Ordnance Survey © Crown Copyright. All rights reserved).


above approach in order to reduce the chance of these errors occurring.

The first solution is to take into account the type of FS. Each FS has certain properties that are different from others. For instance in order to select the ‘General surface’ feature in Fig. 15a, we either need to have proper attribution in the source database that shows that this feature is a playground or we need to relax the obstructing line feature rule. But relaxation of this rule in the basic implementation results in selection of unwanted features for other types of FS (commission error). It was realised that the above basic implementation can be modified according to the classification of FS. Thus for schools we can remove the rule of obstructing features

but keep this rule for other types of FS. Similarly in some cases for airport and hospital FS, we observed that we needed to check attributes other than ‘function’ in order to identify the primary component. Such specific properties can be implemented in child classes whilst keeping the basic properties the same in the parent class. This idea of ‘inheritance’ (Sommerville, 1985) is illustrated in Fig. 18. Similarly as more specific properties for each type of FS are identified, we can add these to the child classes keeping the basic implementation intact.
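The inheritance idea can be sketched with a parent class that holds the basic rules and child classes that override only what differs. The class and method names below (`FunctionalSite`, `boundary_passable`, `is_primary`) are hypothetical and are not taken from Fig. 18, and checking every attribute value for the hospital case is an assumption about which attributes other than ‘function’ might be consulted.

```python
class FunctionalSite:
    """Parent class: basic rules shared by all FS types."""

    fs_type = None

    def boundary_passable(self, classification):
        # Basic implementation: obstructing lines block the traversal.
        return classification in {"non-obstructing", "obstructing-building"}

    def is_primary(self, attributes):
        # Basic implementation: only the 'function' attribute is checked.
        return attributes.get("function") == self.fs_type


class SchoolSite(FunctionalSite):
    """Child class: the obstructing-line rule is removed for schools."""

    fs_type = "School"

    def boundary_passable(self, classification):
        return True


class HospitalSite(FunctionalSite):
    """Child class: attributes other than 'function' are also checked."""

    fs_type = "Hospital"

    def is_primary(self, attributes):
        return any(self.fs_type.lower() in str(v).lower()
                   for v in attributes.values())


school, hospital = SchoolSite(), HospitalSite()
```

Because each child overrides a single method, the relaxed rule for schools cannot leak into other FS types, which is how the commission errors mentioned above are avoided.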

There are plenty of situations where we are not sure that a component is part of the functional site, even where we have aerial photographs at hand. This is because features are difficult to


Fig. 15. A functional site of a school in which the algorithm has failed to capture the playing fields to the east (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 16. Airport FS in Southampton UK (Ordnance Survey © Crown Copyright. All rights reserved).


identify and classify. In this regard it is useful to assign a certainty value. A similar approach was proposed by Greenwood and Mackaness (2002) in the creation of complex features from groups of features using topological and semantic information. In this implementation features that fulfil the basic criteria are assigned a membership value of 1 (highest). Features that are included due to relaxation of certain rules are assigned a value of 0.5. This is to highlight the risk of creating commission errors as a result of relaxation of the rules. Features such as holes are included with a membership value of 0.1. Such membership values can help the operator to evaluate the components included in FS. Fig. 19 illustrates the refined output of a school FS shown in Fig. 15a using the child class with membership values.
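The membership scheme can be illustrated with a small sketch that records why each component was included and maps that reason to the values given above. The reason labels (`basic`, `relaxed`, `hole`), the function names and the feature identifiers are hypothetical.

```python
# Membership values as stated in the text.
MEMBERSHIP = {"basic": 1.0, "relaxed": 0.5, "hole": 0.1}


def assign_membership(components):
    """Map (feature id, inclusion reason) pairs to membership values."""
    return {fid: MEMBERSHIP[reason] for fid, reason in components}


def needs_review(memberships, cutoff=1.0):
    """List the components an operator may wish to inspect."""
    return sorted(fid for fid, value in memberships.items() if value < cutoff)


# Hypothetical school FS with one relaxed-rule feature and one hole.
components = [
    ("main_building", "basic"),
    ("playground", "relaxed"),
    ("enclosed_plot", "hole"),
]
memberships = assign_membership(components)
to_check = needs_review(memberships)
```

Sorting out the components below full membership gives the operator a short list of suspected commission errors rather than the whole FS to inspect.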

The process of evaluation is complicated by: (1) the scale dependency in defining functional sites, (2) people’s perception of what is, or is not, part of a functional site, and (3) the fact that formal definitions of FS can vary with the task. For example the extent of a city may be defined in terms of census units, political or environmental boundaries or a mathematical definition based on the density of buildings (Chaudhry & Mackaness, 2008). Here we have explicitly stated the criteria governing inclusion using an explicit set of rules. Our vision is one of flexibility: an algorithm in which these rules can be tuned or added according to the user’s own definitions. This flexibility does not answer the question ‘how correct are these solutions?’. There are no manual or automated solutions that compute FS with which we can compare. Our evaluation to date has been a visual one, comparing our results with mapping at smaller scales (Ordnance Survey 1:25,000 scale colour raster maps; Ordnance Survey, 2008). It is not possible to validate our answers against a database since that database does not yet exist (hence the need for this research). Neither does aerial photography contain the attribution that would enable us to validate our solutions. On the other hand work in automatic extraction of features from aerial photography (Baltsavias, Gruen, & Gool, 2001) can help the proposed approach by providing more detailed attribution of features in the source database. Similarly other datasets such as asset or land registry might also support the process. For the time being ground truthing with each site manager or owner of each FS


Fig. 17. Southampton airport site at 1:25,000 scale mapping (Ordnance Survey © Crown Copyright. All rights reserved).

Fig. 18. High level class diagram for FS illustrating inheritance.

Fig. 19. Degrees of membership for FS type school shown in Fig. 15a (Ordnance Survey © Crown Copyright. All rights reserved).


appears to be the only certain approach and there are plans to carry out such a study in future research at Ordnance Survey. Additionally there is reason to believe that users of this type of information will tolerate some imprecision in spatial extent, attribution and partonomic information.

The algorithm has been evaluated in its capacity to automatically determine the extent of hospital, school and airport functional sites in Southampton, a large city in the UK. Work continues to extend this generic approach to include other types



of functional sites. The rules and properties governing the extent of FS have, in part, been derived from inspection of their representation at various scales (notably 1:25,000 scale mapping). This process has highlighted a degree of inconsistency in the manual approach (reflecting perhaps an element of cartographic licence). The challenge has been in formulating rules that accommodate both the functional as well as the cartographic ambitions of national databases. By using a generic design, it is anticipated that for each class of functional site, it should be possible to add rules that refine the selection process based on the FS type, its context and the intended application.

6. Conclusions

It is argued that the ability to automatically derive higher order features from lower order features is critical to the derivation of multi-scale products from a single detailed database (Mackaness, Ruas, & Sarjakoski, 2007). It offers the potential for automated reasoning which has various advantages (consistency checking, automated update, automated labelling, intelligent integration of third party data). Wang et al. (2005) showed how higher order objects can be defined by their functional qualities which in turn dictate their shape and extent. In this research we have defined a set of rudimentary qualities common to functional sites and used these as a basis for identifying the extent of functional sites. The approach uses graph theoretic techniques along with semantic, topological and metric properties of source features as a basis for determining the extent of a FS. But the research has also highlighted the need for better attribution and more detailed concept definition and specification if databases are to support multi-scaled representations.

It was also observed that due to the complex nature of FS, it is not possible to achieve a fully automated solution. Therefore any solution needs to anticipate the role of a human in validating the components of a FS. Even this ‘blended’ solution does not address the complex issue of how some functional sites are perceived (Mark & Smith, 2004) or how the outputs should be evaluated. There will always be a degree of uncertainty that will require some sort of manual intervention.

It is worth noting that there is a scale dependency and hierarchy to functional sites. For example a ‘retail park’ is itself a collection of functional sites of type ‘shopping’. Similarly a ‘city’ can be viewed as a functional site which is itself a collection of functional sites such as ‘hospitals’, ‘schools’, ‘districts’, ‘retail parks’, ‘university’, ‘residential’, and ‘industrial’ which themselves can be further divided into smaller functional sites. Future work will look into the creation of these ‘higher level’ functional sites by aggregation of lower level related functional sites. In such a context we can more richly and explicitly model the semantics of what constitutes functional sites.

Acknowledgements

The research was funded by the Technology Strategy Board and Ordnance Survey (GB) under the knowledge transfer partnership (No. 6837). Our special thanks to Rob Gower at Ordnance Survey for his continuous support and suggestions for improving the

methodology. We extend our gratitude to anonymous reviewers for their constructive comments.

References

Baltsavias, E., Gruen, A., & Gool, L. (2001). Automatic extraction of man-made objects from aerial and space images (III). Tokyo: A.A. Balkema Publishers.

Barrault, M. (1995). An automated system for linear feature name placement which complies with cartographic quality criteria. In AutoCarto 12 (Vol. 4, pp. 321–330). Charlotte, North Carolina: ACSM/ASPRS.

Chaudhry, O. Z., & Mackaness, W. A. (2008). Automatic identification of urban settlement boundaries for multiple representation databases. Computers, Environment and Urban Systems, 32(2), 95–109.

Greenwood, J., & Mackaness, W. A. (2002). Revealing associative relationships for complex object creation. In Geoscience 2002 abstracts: The second international conference on GIS (pp. 48–55). Boulder Colorado.

JTS (2008). JTS topology suite. <http://www.vividsolutions.com/jts/jtshome.htm>. Retrieved 10.01.08.

Jump (2008). OpenJUMP. <http://openjump.org/wiki/show/HomePage>. Retrieved 10.01.08.

Kovacs, K., & Zhou, S. (2007). Key challenges in expressing and utilising geospatial semantics at Ordnance Survey. In Presentation held at the European geoinformatics workshop, 7–9 March, Edinburgh, UK.

Lüscher, P., Weibel, R., & Burghardt, D. (2008). Alternative options of using processing knowledge to populate ontologies for the recognition of urban concepts. In 11th ICA workshop on generalisation and multiple representation, 20–21 June. Montpellier, France.

Lüscher, P., Weibel, R., & Mackaness, W. A. (2008). Where is the terraced house? On the use of ontologies for recognition of urban concepts in cartographic databases. In Headway in spatial data handling, Lecture notes in geoinformation and cartography (pp. 449–466). Berlin, Heidelberg: Springer.

Mackaness, W. A., Ruas, A., & Sarjakoski, L. (2007). Observations and research challenges in map generalisation and multiple representation. In Generalisation of geographic information: Cartographic modelling and applications (pp. 315–323). Oxford: Elsevier.

Mackaness, W. A. (2007). Understanding geographic space. In Generalisation of geographic information: Cartographic modelling and applications (pp. 1–10). Oxford: Elsevier.

Mark, D., & Smith, B. (2004). A science of topography: Bridging the qualitative–quantitative divide. In Geographic information science and mountain geomorphology (pp. 75–100). Chichester: Springer–Praxis.

Molenaar, M. (1998). An introduction to the theory of spatial object modelling for GIS. London: Taylor and Francis.

Mustière, S., & van Smaalen, J. (2007). Database requirements for generalisation and multiple representations. In Generalisation of geographic information: Cartographic modelling and applications (pp. 113–136). Oxford: Elsevier.

Oracle (2005). Spatial user’s guide and reference 10g release 1 (10.1). <http://www.oracle-10g-buch.de/oracle_10g_documentation/appdev.101/b10826/toc.htm>. Retrieved 23.10.07.

Ordnance Survey (2007). OS MasterMap user guide. <http://www.ordnancesurvey.co.uk/products/osmastermap/userguides/docs/userguidepart1.pdf>. Retrieved 23.10.07.

Ordnance Survey (2008). 1:25 000 Scale colour raster: Mid-scale digital mapping for Explorer outdoor activities maps. <http://www.ordnancesurvey.co.uk/oswebsite/products/25kraster/>. Retrieved 10.01.08.

Reese, G. (2000). Database programming with JDBC and java (2nd ed.). O’Reilly. p. 328.

Sedgewick, R. (1983). Algorithms. USA: Addison Wesley.

Smith, B., & Varzi, A. (2000). Fiat and bona fide boundaries. Philosophy and Phenomenological Research, 60, 401–420.

Sommerville, I. (1985). Software engineering (2nd ed.). Wokingham: Addison Wesley.

Thomson, M. K., & Béra, R. (2007). Relating land use to the landscape character: Toward an ontological inference tool. In GIS research UK 15th annual conference (pp. 83–87). Maynooth, Ireland.

Thomson, M. K., & Béra, R. (2008). A Methodology for inferring higher level semantic information from spatial databases. In GIS research UK 16th annual conference (pp. 268–274). Manchester, UK.

Wang, E., Kim, Y., & Kim, S. (2005). An object ontology using form-function reasoning to support robot context understanding. Computer Aided Design and Applications, 2(6), 815–824.

Zhang, Q., & Harrie, L. (2006). A real-time method of placing text and icon labels simultaneously. Cartography and Geographic Information Science, 33(1), 53–64.