A Geo-spatial Perspective or What’s Special about the Spatial? Peter Burnhill Director EDINA, UK National Data Centre University of Edinburgh CoSMiC Terminologies Day, National Library for Scotland, Edinburgh


A Geo-spatial Perspective, or

What’s Special about the Spatial?

Peter Burnhill

Director EDINA, UK National Data Centre

University of Edinburgh

CoSMiC Terminologies Day, National Library for Scotland, Edinburgh

Preamble

This is based on two earlier presentations:

1. Workshop on ‘Digital Gazetteers’, made (by AC & JJ) at ACM/IEEE-CS Joint Conference on Digital Libraries 2002, Portland, Oregon, USA, 14-18 July 2002

2. ‘New Directions in Metadata’, made (by PB & DMS) at OCLC/SCURL Pre-IFLA Conference, Edinburgh August 2002

So, acknowledgements to: David Medyckyj-Scott, Andy Corbett & James Reid

(Research & Geo-data Services, EDINA)

Purpose & Overview

• Set context
  – Internet & Digital Libraries & GIS
  – EDINA & the JISC Information Environment

• What’s special about the spatial?
  – Referencing & spatial co-ordinates
  – gazetteer models (nominal & active)

• Progress towards digital gazetteer services
  – geoXwalk & other projects

• Summary & Conclusions

Internet, Digital Libraries & GIS

1. Some Big Issues
  – Metadata & Interoperability
  – Naming, Identifiers & Authority Files
  – Ontologies
  – Shared services

2. Information Science
  – Digital Library, as mix of document & computation traditions
    Michael Buckland, ‘The Landscape Of Information Science’, JASIS Special Issue "JASIS at 50", Wiley 1999. (‘Presidential Address’)

3. Subject-matter methodology
  – Geographic Information Systems
    • deconstructing the Map: both as database & as display device
    • referencing: the cartographic trick ‘surface of sphere to flat paper/screen’

EDINA

• a JISC National Data Centre, 1995 -
  – part of Edinburgh University Data Library, 1984 -

• mission: to enhance productivity of research, learning and teaching in UK higher & further education

• major provider within the JISC Information Environment
  – range of bibliographic resources
  – launching sound and picture studio
  – key geo-spatial data and geo-referenced information
    • UKBORDERS (1994 - ) boundary outlines & geo-reference database
    • Digimap (2000 - ) online source of Ordnance Survey mapping

• strategic move toward interoperability & shared services role
  – adoption of appropriate standards

The JISC Information Environment is…

• variously stated as …
  – a national digital library... for UK higher and further education
  – a managed collection of quality assured resources
  – a distributed resource supporting learning and research in the UK

• definitely heterogeneous
  – ‘words, numbers, pictures, sound’: including geo-spatial data

• for use by researchers, students, teachers & support staff

• based on an underlying functional model
  – simplified to: search -> obtain -> use -> publish [digital soup]
  – {discover/locate} {request/access} {view/copy/amend/combine} {publish}

• now to have location-based searching
  – requiring geo-referencing of information objects

Q: What’s Special about the Spatial?

• subject content most often referenced by topic …
  … but much (80%?) can be referenced to specific geographic places

• broad disciplinary base for more powerful geographic searching
  – across the social, life & physical sciences as well as the humanities
  – also from libraries, archives and museums
  – now from digital libraries, service providers & data providers

• geo-referencing is thus a way of viewing information content:
  – subject, people, place and time

A: Geo-referencing, that’s what

Geo-spatial data: “data that have some form of spatial or geographic reference that enables them to be located in two- or three-dimensional space”

Statistical Account of Scotland

NUMBER XIII.

PARISH OF CULLEN.

(COUNTY OF BANFF, SYNOD OF ABERDEEN, PRESBYTERY OF FORDYCE.)

By the Rev. Mr. ROBERT GRANT.

Royalty, Extent, Climate, etc.

CULLEN, as appears from old charters, was originally called Inverculan, because it stands upon the bank of the Burn of Cullen, which, at the N. end of the town, falls into the sea: but now it is known by the name of Cullen only. Cullen is a royal burgh, formerly a constabulary, of which the Earl of Findlater was hereditary constable. The set, as it is called, of the council, consists of 19, in which number are included the Earl of Findlater, hereditary preses, 3 bailies, a treasurer, a dean-of-guild, and 13 counsellors. The parish extends from the sea southward, about 2 English miles in length.

So, what is geo-referencing? What are geo-data?

Barrow Street
Barrow upon Soar
Barrow upon Trent
Barrowby
Barrowden
Barrowford
Barry
Barsby
Barthlow
Barton (8)
Barton Bendish

Models of Gazetteer (1): Place Name Vocabularies

• simple list of place names

has many problems

e.g. non-uniqueness

• common form is {name, location}

"index" in atlas or "geographical dictionary"

• ‘location’ field often has name of larger area that ‘contains’ the place

but even then the name may still not be unique

Barton, Cambs
Barton, Ches.
Barton, Devon
Barton, Glos
Barton, Lancs (2)
Barton, N. Yorks
Barton, Warks
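The non-uniqueness problem of the {name, location} form can be shown in a few lines. This is a minimal sketch, not part of the original presentation: the entries are those listed above, with "Lancs (2)" expanded into two rows, and the function names are illustrative.

```python
from collections import defaultdict

# Entries copied from the slide as (place name, containing area).
# "Lancs (2)" on the slide is expanded here into two separate entries.
ENTRIES = [
    ("Barton", "Cambs"),
    ("Barton", "Ches."),
    ("Barton", "Devon"),
    ("Barton", "Glos"),
    ("Barton", "Lancs"),
    ("Barton", "Lancs"),
    ("Barton", "N. Yorks"),
    ("Barton", "Warks"),
]


def build_index(entries):
    """Map each place name to the list of areas containing a place of that name."""
    index = defaultdict(list)
    for name, area in entries:
        index[name].append(area)
    return index


def ambiguous_pairs(entries):
    """Return the {name, location} pairs that occur more than once."""
    counts = defaultdict(int)
    for pair in entries:
        counts[pair] += 1
    return sorted(pair for pair, n in counts.items() if n > 1)


index = build_index(ENTRIES)
print(len(index["Barton"]))      # 8 candidate places for a single name
print(ambiguous_pairs(ENTRIES))  # [('Barton', 'Lancs')]
```

Qualifying the name with its containing area reduces the ambiguity from eight candidates to two, but does not remove it.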

‘The Nominal Gazetteer’

Example: Hierarchical Thesaurus (part of the ‘Document Tradition’)

Comment: one type of simple relationship between entries is exploited
• entries ordered from very general to very specific (BT, NT)
• can efficiently determine what a given area contains
• normally structured to handle alternative names (SY)

X rigid structure, one view only, typically geo-political
  entities can belong in many hierarchies and new relationships evolve
X names may still not be unique
X cannot deal with spatial proximity / contiguity

Fatal Flaw: no one single, simple hierarchy in Scotland
no way to relate to other, multiple (e.g. postcode) and ‘old’ geographies …

United Kingdom ………………………… (nation)
  England ………………………….. (country)
    Devon ………………………….. (county)
      Barton ………………………………..

Boundaries in Fife, Scotland

Pause, to ponder the puzzle of place …

1. places can be defined in space (as an ‘area’, not a single ‘point’)

– a named feature, e.g. Lake Geneva
– a space taken for human settlement, e.g. Edinburgh

and those areas change over time, can be fuzzy, or even poetic

2. names of places are not unique, nor persistent, and have considerable cultural ‘baggage’

– a given place can have more than one proper name
  • different languages
  • alternative contemporary and historic names, even within a given language

Auchterderran, Fife, Scotland has 21 alternative names or name spellings

e.g. Auchterderay, Ochtirderay, Urchan, Hurkyndorath

Paradox: geography is global, but naming is local

Nevertheless, geo-referencing means more, requires more, than a controlled vocabulary of (place) names

Getting geographic (1): Being coordinated

• how should we geo-reference?
  – with a co-ordinate system that can be related to a specific position or location on the earth's surface

• geographic co-ordinates allow places to be represented by the appropriate footprint
  – settlements, lakes as areas; roads, rivers as lines; stations as points

• and offer persistence, regardless of name, political boundary or other changes, and a consistent framework for spatial queries

• geographic co-ordinates allow proximate places, those close to one another, to be identified
  – appropriate geo-referencing thus ‘enriches’ textual description

• as ever, not everyone uses the same standard spatial coding scheme
  – systems that relate to geographic (Cartesian) coordinates are the preferred metadata, providing opportunity for ‘cross-walk’


Getting geographic (2): Models of Gazetteer (2)

• Simple use of a geo-spatial reference in the location field: the National Grid

Barton (540620, 255780)
Barton (344880, 354210)
Barton (410080, 225320)
Barton (351580, 437670)
Barton (335223, 409318)
Barton (423170, 508880)
Barton (290950, 67220)
Barton (410849, 251111)

Towards ‘The Active Gazetteer’

Task: Find resource about 'Liverpool docks’

Search using a nominal gazetteer might yield:

Place          County/UA
Liverpool      Liverpool

Using spatial proximity in an active gazetteer, the search can be widened:

Place          County/UA
Bebington      Wirral
Birkenhead     Wirral
Bootle         Sefton
New Brighton   Wirral
Seacombe       Wirral
Seaforth       Sefton
Waterloo       Sefton

… that means more & better hits …. !!!

co-ordinates allow (near) co-located places to be co-identified.
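A proximity search of this kind can be sketched as a radius query over grid co-ordinates. The slides give no co-ordinates for the Liverpool example, so this sketch reuses the eight National Grid references for places called Barton shown earlier; the identifiers (`barton-1` etc.), the function names and the 30 km radius are assumptions made for illustration, and a real service would use footprints and a spatial index rather than a linear scan.

```python
import math

# National Grid references (easting, northing, in metres) for the eight
# places called Barton, copied from the slide. The ids are hypothetical.
BARTONS = {
    "barton-1": (540620, 255780),
    "barton-2": (344880, 354210),
    "barton-3": (410080, 225320),
    "barton-4": (351580, 437670),
    "barton-5": (335223, 409318),
    "barton-6": (423170, 508880),
    "barton-7": (290950, 67220),
    "barton-8": (410849, 251111),
}


def distance_m(a, b):
    """Straight-line distance in metres between two grid points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def places_within(gazetteer, centre, radius_m):
    """Return ids of entries within radius_m of centre, nearest first."""
    hits = []
    for place_id, xy in gazetteer.items():
        d = distance_m(centre, xy)
        if d <= radius_m:
            hits.append((d, place_id))
    return [place_id for _, place_id in sorted(hits)]


near = places_within(BARTONS, BARTONS["barton-3"], 30_000)
print(near)  # ['barton-3', 'barton-8']
```

A name-only gazetteer cannot answer this query at all: the two neighbouring Bartons are indistinguishable as words, and only their co-ordinates reveal that they are about 26 km apart.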

Gazetteer - A list of geographic features together with their associated spatial location

Digital Gazetteer - An electronic list of geographic features together with their associated spatial location

An authority database of places (and features?)

An ‘Active Gazetteer’

Digital Gazetteer Service - A network-addressable middle-ware server supporting geographic referencing and searching.

A shared ‘terminology’ service.

‘Active’ Digital Gazetteer Services

International Digital Gazetteer Initiatives

extant digital gazetteer services

• USGS Geographic Names Information System
• Canadian Geographical Names Data Base
• GEOnet Names Server (NIMA)
• Getty Thesaurus of Geographic Names
• Columbia Gazetteer of the World Online

projects from which digital gazetteer services might spring

• Open GIS Consortium Geospatial Fusion Services Testbed Geocoder, Gazetteer and Geoparser services

• Alexandria Digital Library Gazetteer Development *

• Electronic Cultural Atlas Initiative - ECAI *

• geoXwalk, JISC-funded collaborative project *

* Presentation to Workshop on ‘Digital Gazetteers’ at ACM/IEEE-CS Joint Conference on Digital Libraries 2002, Portland, Oregon, USA, 14-18 July 2002

The geoXwalk project

• funded under JISC DNER Development Programme
  – builds on scoping study
  – aims to develop a demonstrator gazetteer service suitable for extension to full service

• time-frame: 1 June 2002 - 31 May 2003
• project partners: EDINA and History Data Service
• similar to the ADL approach (Linda Hill et al)
  – reviews the ADL Gazetteer Content Standard
  – builds on, and adapts, ADL geographic feature ‘ontology’

• ‘near-contemporary’ geography focus, linking back into history

• geo-X-walk demonstrator due in 2003

X-walk as digital gazetteer service: use cases

[Diagram: the geoXwalk server supporting three use cases (reference use, searching, and geo-parsing & indexing), with two information servers as clients]

JISC Information Environment

[Diagram: JISC Information Environment architecture. Labels: End-user; Portal; Broker/Aggregator; Content providers; Presentation layer, Fusion layer and Provision layer; Shared services (Authentication, Authorisation, Collection Description, Service Description, Resolver, Institution Profile, geoXwalk)]

Uses of ‘geo-X-walk’ Digital Gazetteer Service

1. As ‘shared service’, enabling other information services to support full range of spatial searching (query constraints)
  • no need to hold all data (at service) to resolve spatial query
  • uses co-ordinates and (implicit) spatial relationships to ‘cross-walk’ between geographies
  • machine-to-machine (m2m) interaction to ‘shared service’

2. As reference facility for researchers, libraries & museums • including means to resolve variant names etc.

3. As online facility to assist metadata creators

Helping to make simple searching more effective

Find me documents on the ‘Liverpool docks’

Search terms: subject = “docks”, place = “liverpool”

Using spatial proximity, place search terms become:

Liverpool

Bebington

Birkenhead

Bootle

New Brighton

Seacombe

Seaforth

Waterloo

Supporting cross searching different services

Portal service query: ‘Find resources for this postcode’, e.g. postcode L34 0HS (NB postcode often used to geo-reference survey data files)

[Diagram: the portal service passes the postcode to the geoXwalk server, which cross-walks it to the forms of geo-reference held by Content Providers A, B and C (place names, parish names, co-ordinate footprints). Values shown: ‘Knowsley’, ‘BX003’ and the footprint 340900,392300 - 347217,397660]
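The cross-walk between geographies can be sketched as a containment test on co-ordinate footprints. This is an illustration under stated assumptions, not the geoXwalk implementation: the Knowsley bounding box is the footprint shown above, but the point assigned to postcode L34 0HS and the second area are hypothetical values invented for the example, and real footprints would be polygons rather than rectangles.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Rectangular footprint in National Grid metres."""
    min_e: int
    min_n: int
    max_e: int
    max_n: int

    def contains(self, easting, northing):
        """True if the point lies inside or on the edge of the box."""
        return (self.min_e <= easting <= self.max_e
                and self.min_n <= northing <= self.max_n)

    def intersects(self, other):
        """True if the two boxes overlap or touch."""
        return not (other.min_e > self.max_e or other.max_e < self.min_e
                    or other.min_n > self.max_n or other.max_n < self.min_n)


# Footprint for Knowsley as shown on the slide.
KNOWSLEY = BBox(340900, 392300, 347217, 397660)

# HYPOTHETICAL: an illustrative point, not the surveyed location of the postcode.
POSTCODE_POINTS = {"L34 0HS": (345000, 393000)}

AREAS = {
    "Knowsley": KNOWSLEY,
    "Elsewhere": BBox(400000, 200000, 410000, 210000),  # hypothetical area
}


def crosswalk_postcode(postcode):
    """Translate a postcode into the names of areas whose footprint contains it."""
    easting, northing = POSTCODE_POINTS[postcode]
    return sorted(name for name, box in AREAS.items()
                  if box.contains(easting, northing))


print(crosswalk_postcode("L34 0HS"))  # ['Knowsley']
```

The design point is that the portal never needs a postcode-to-parish lookup table: each geography is tied to co-ordinates once, and the spatial relationship between any two geographies is computed on demand.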

Supporting reference: the “where is?” type of question

What is at grid ref. NY 305 573 ?

Where is Aberdour?

What is the largest town in Aberdeenshire?

List me all places ending with ‘kirk’

What parishes fall within the Loch Lomond National Park?

On what river is Dundee situated?

Which Roman roads pass through Scotland?

By what alternative names has Edinburgh been known?

+ research use to resolve variant names etc.
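Answering the first of these questions starts with turning a lettered grid reference into numeric co-ordinates. The sketch below follows the standard Ordnance Survey National Grid lettering (a 5 x 5 arrangement of letters with I omitted, applied at 500 km and 100 km scales); the function name is illustrative, and the code does not check that the square actually falls on Great Britain.

```python
import re

# 25-letter alphabet used by the National Grid (the letter I is omitted).
LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"


def parse_grid_ref(ref):
    """Convert e.g. 'NY 305 573' to (easting, northing) in metres.

    The result is the south-west corner of the referenced square.
    """
    compact = re.sub(r"\s+", "", ref.upper())
    match = re.fullmatch(r"([A-HJ-Z]{2})(\d*)", compact)
    if not match or len(match.group(2)) % 2 or len(match.group(2)) > 10:
        raise ValueError(f"not a valid grid reference: {ref!r}")

    l1 = LETTERS.index(match.group(1)[0])
    l2 = LETTERS.index(match.group(1)[1])
    # Index of the 100 km square, counted from the false origin (square SV).
    e100 = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100 = 19 - (l1 // 5) * 5 - (l2 // 5)

    digits = match.group(2)
    half = len(digits) // 2
    scale = 10 ** (5 - half)  # 3 digits per axis means 100 m resolution
    easting = e100 * 100_000 + (int(digits[:half]) * scale if half else 0)
    northing = n100 * 100_000 + (int(digits[half:]) * scale if half else 0)
    return easting, northing


print(parse_grid_ref("NY 305 573"))  # (330500, 557300)
```

With numeric co-ordinates in hand, "what is at this grid reference?" becomes a footprint containment or nearest-neighbour query against the gazetteer.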

As online facility to assist metadata creators (1)

• Traditional use of ‘controlled vocabulary’ for ‘found’ place names

but, to be ‘found’, metadata records on objects must have appropriate geo-referencing

• This is achieved using an ‘analytical’ (geo-coded) gazetteer

e.g. BLGO
a facility devised by EDINA for the British Library for use in its NOF-funded ‘A Sense of Place’ activity
  – uses 1:50 000 Gazetteer licensed from Ordnance Survey
  – presently ‘in test’ at the British Library by over 20 archival staff

As online facility to assist metadata creators (2)

The task of indexing place names in documents

• Place Names within the digitised pages of the Statistical Accounts of Scotland (1790 & 1840s) can be recognised semi-automatically.

[ http://edina.ac.uk/statacc/ ]

• We call this geo-parsing ...
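A very simple form of geo-parsing is dictionary lookup: scan the text for names held in a gazetteer and propose each hit as a candidate for a human indexer to confirm. The sketch below is a naive illustration of that idea, not the parser used on the Statistical Accounts; the text is from the Cullen parish account quoted earlier, while the gazetteer contents and identifiers are hypothetical.

```python
import re

# Tiny illustrative gazetteer: lower-cased name -> feature id (ids hypothetical).
GAZETTEER = {
    "cullen": "place:cullen",
    "inverculan": "place:cullen",  # historic name for the same place
    "burn of cullen": "feature:burn-of-cullen",
}

TEXT = ("CULLEN, as appears from old charters, was originally called "
        "Inverculan, because it stands upon the bank of the Burn of Cullen")


def geoparse(text, gazetteer):
    """Return (offset, matched text, feature id) for each gazetteer name found.

    Longer names are tried first so that 'Burn of Cullen' is not
    reported as a mention of 'Cullen'.
    """
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.group(0), gazetteer[m.group(0).lower()])
            for m in pattern.finditer(text)]


for offset, mention, feature_id in geoparse(TEXT, GAZETTEER):
    print(offset, mention, feature_id)
```

Note that the historic name and the current name resolve to the same identifier, which is where variant-name handling in the gazetteer pays off during indexing.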


Some Success, but also ‘Current Challenges’

1. Merging geo-names from different scales & from different sources
  – when place names differ, should all names be regarded as proper?
  – do we trust positional accuracy & how do we express confidence?
  – how to minimise effort in de-duplication of place(s)?
    • places have multiple names, types, and footprints
    • need to be able to identify duplicate entries for the same place

2. Presenting geo-names on different occasions
  – many variant ‘proper’ names, what is preferred?
    • what is the ‘name authority body’? None in Scotland or the UK
    • preferred name varies with location and use and culture
  – there are language and character code set issues
  – standard codes for postal addresses and other geographies

3. There is IPR in metadata; and hence terms & conditions of use

4. There are always service performance issues
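One way to reduce de-duplication effort is to flag candidate duplicates automatically and leave the decision to an editor. The sketch below is a deliberately crude rule (same normalised name and positions within a tolerance), offered only to make the problem concrete: the first two records use Barton grid references from the earlier slide, the third record, the source prefixes and the 500 m tolerance are hypothetical, and the rule does nothing for variant spellings or differing footprint types.

```python
import math
from itertools import combinations


def normalise(name):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


RECORDS = [
    {"id": "src1-1", "name": "Barton", "xy": (410080, 225320)},   # from the slide
    {"id": "src1-2", "name": "Barton", "xy": (410849, 251111)},   # from the slide
    {"id": "src2-1", "name": "BARTON ", "xy": (410100, 225300)},  # HYPOTHETICAL second source
]


def candidate_duplicates(records, tolerance_m=500):
    """Return id pairs whose names match and whose positions are close."""
    pairs = []
    for a, b in combinations(records, 2):
        same_name = normalise(a["name"]) == normalise(b["name"])
        close = math.dist(a["xy"], b["xy"]) <= tolerance_m
        if same_name and close:
            pairs.append((a["id"], b["id"]))
    return pairs


print(candidate_duplicates(RECORDS))  # [('src1-1', 'src2-1')]
```

The two Bartons from the same source share a name but lie about 26 km apart, so they are correctly kept as distinct places; the tolerance is in effect a statement of how far positional accuracy is trusted.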

Summary Conclusions

1. geographic referencing is needed in the digital library
  • for indexing information objects & for finding out what is where

2. names as words are not enough
  • places need their co-ordinate numbers

3. ‘active’ digital gazetteer services can add value
  • a few initiatives internationally, now beginning to collaborate
  • common data model, protocols, sharing data & interoperability
  • licensing and copyright are serious issues
    • particularly if want wider, global access
  • a variety of interesting technical challenges to tackle

4. EDINA has developed digital gazetteer services for Scotland

global answer is geo-referencing by co-ordinates but to ‘act global’ we must also ‘think local’ …

local to object, to location, to geographic vocabulary of user

Active Gazetteers offer more than Nominal Gazetteers

Contact details

• Authors contactable at: [email protected] and [email protected]

• For EDINA services contact: http://edina.ac.uk
  EDINA, Data Library, University of Edinburgh
  [email protected] or telephone +44 (0)131 650 3302

• For information on geoXwalk project:
  Dr David Medyckyj-Scott, Project Director
  Cressida Chappel, Head of History Data Service ([email protected])


Review of ADL Gazetteer Content Standard

Geographic Feature ID
Geographic Name
Variant Geographic Name (R)
Type of Geographic Feature (R)
Other Classification Terms (R)
Geographic Feature Code (R)
Spatial Location (R)
Street Address
Related Feature (R)
Description
Geographic Feature Data (R)
Link to Related Source of Information (R)
Supplemental Note
Metadata Information

http://www.alexandria.ucsb.edu/gazetteer

1. each feature is self-contained, making model very flexible
2. comprehensive description but with small set of core elements
3. temporal aspects of names, footprints, relationships, …
4. documents source, spatial accuracy/scale of footprint
5. permits explicit relationship types!
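The element list above maps naturally onto a self-contained record type. The sketch below is one possible reading, not the ADL schema itself: it assumes that "(R)" marks repeatable elements (modelled as lists), the Python field names and types are invented for illustration, and the example entry uses only the Auchterderran name variants quoted earlier, with a hypothetical identifier.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GazetteerEntry:
    """One self-contained feature; list-valued fields are the '(R)' elements."""
    feature_id: str
    name: str
    variant_names: List[str] = field(default_factory=list)
    feature_types: List[str] = field(default_factory=list)
    other_classification_terms: List[str] = field(default_factory=list)
    feature_codes: List[str] = field(default_factory=list)
    # Each spatial location is kept as a list of (easting, northing) vertices.
    spatial_locations: List[List[Tuple[int, int]]] = field(default_factory=list)
    street_address: Optional[str] = None
    related_features: List[str] = field(default_factory=list)
    description: Optional[str] = None
    feature_data: List[str] = field(default_factory=list)
    source_links: List[str] = field(default_factory=list)
    supplemental_note: Optional[str] = None
    metadata_information: Optional[str] = None

    def all_names(self):
        """Preferred name followed by every recorded variant."""
        return [self.name] + self.variant_names


entry = GazetteerEntry(
    feature_id="example-0001",  # hypothetical identifier
    name="Auchterderran",
    variant_names=["Auchterderay", "Ochtirderay", "Urchan", "Hurkyndorath"],
)
print(entry.all_names())
```

Only the identifier and name are required here, which mirrors the point that a comprehensive description can still rest on a small set of core elements.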

Beyond ‘Settlement’ Place Names to Ontologies:

Geographic Feature Types

• incorporate dictionary of terms defining each feature type

• thus, support queries such as
  – “What schools exist in Leeds and where are they?”
  – “Show lakes in Cornwall”

• hierarchy of feature types preferred

• propose to adopt the ADL Feature Type Thesaurus

• some problems… but ADL acknowledge these

• adapting thesaurus for UK
  – US & UK use words differently

hydrographic features
. aquifers
. bays
. . fjords
. channels
. deltas
. drainage basins
. estuaries
. floodplains
. streams
. . rivers
. . . bends (river)
. . . rapids
. . . waterfalls
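A hierarchy of feature types lets a query for a broad type pick up all the narrower ones. The sketch below parses the dot-indented excerpt shown above into parent links and expands a term to its narrower terms; it is an illustration only, with invented function names, and it assumes each leading dot denotes one level of depth.

```python
THESAURUS = """\
hydrographic features
. aquifers
. bays
. . fjords
. channels
. deltas
. drainage basins
. estuaries
. floodplains
. streams
. . rivers
. . . bends (river)
. . . rapids
. . . waterfalls
"""


def parse_thesaurus(text):
    """Return {term: parent term or None} from dot-indented lines."""
    parent_of = {}
    stack = []  # stack[d] is the most recent term seen at depth d
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        depth = 0
        while depth < len(tokens) and tokens[depth] == ".":
            depth += 1
        term = " ".join(tokens[depth:])
        del stack[depth:]  # drop terms at this depth or deeper
        parent_of[term] = stack[-1] if stack else None
        stack.append(term)
    return parent_of


def broader_terms(parent_of, term):
    """Chain of broader terms, nearest first."""
    chain = []
    while parent_of[term] is not None:
        term = parent_of[term]
        chain.append(term)
    return chain


def narrower_terms(parent_of, term):
    """Every term that sits below the given term, in listing order."""
    return [t for t in parent_of if term in broader_terms(parent_of, t)]


tree = parse_thesaurus(THESAURUS)
print(narrower_terms(tree, "streams"))
# ['rivers', 'bends (river)', 'rapids', 'waterfalls']
```

A search for "streams" in a given area can then be run against the expanded set of types, so that features catalogued as rivers or waterfalls are not missed.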