Guy McGarva, EDINA National Data Centre Rajendra Bose, DCC and School of Informatics University of Edinburgh Tuesday 15 May 2007 CLADDIER Project Workshop, Chilworth, Southampton, UK Safeguarding the Citation Lifecycle for Geospatial Repositories (based on presentation at EGU 2007)


Page 1

Guy McGarva, EDINA National Data Centre

Rajendra Bose, DCC and School of Informatics

University of Edinburgh

Tuesday 15 May 2007

CLADDIER Project Workshop, Chilworth, Southampton, UK

Safeguarding the Citation Lifecycle for Geospatial Repositories

(based on presentation at EGU 2007)

Page 2

Contents

• Issues
  – Datasets vs. Features

• Citation of Geospatial Features
  – OS MasterMap® example

• Dataset Citation
  – Example using Go-Geo! and GRADE

• Work in progress…

Page 3

Geospatial Dataset vs Geospatial Features

Citation may need to be at either Dataset or Feature level because:

– You need to cite the dataset if:
  • The dataset is small (number of features, extent, etc.)
  • The whole dataset is used in the analysis
  • The data is not feature based (e.g. raster/surface)

– But you need to cite specific features if:
  • Only a small extent is taken from a very large geodatabase
  • Only specific features are used in the analysis
  • The database is continuously changing (but only a small proportion of the whole database), so you need to know which version was used

Page 4

Geospatial Feature Citation

• Some assumptions for features within a geospatial repository:
  – Every feature has a unique ID
    • e.g. OS TOIDs (Topographic Object IDs)
  – Features have version numbers/time-stamps

• Note: This concerns vector data that could be used to produce cartographic products, not the maps themselves

Page 5

Citation lifecycle for journal articles

[Diagram. Stages: citation creation; citation included in published work; (future) discovery of citation; retrieve/resolve citation. Citation target: journal article. Resolving the citation means accessing a copy of the article held in a library or institutional repository.]

Page 6

Citation lifecycle (broken) for geospatial data

[Diagram. Stages: citation creation; citation included in published work; (future) discovery of citation; retrieve/resolve citation. Citation target: geospatial data held in a geospatial repository. The citation appears in a journal article (held in a library or institutional repository) in the form "Author, 2006, Citation target,… <web_link>". Resolving it would require reassembling the geospatial database contents.]

Example citation:

Delaware, Ohio [map]. 1885. Scale not given. “Sanborn Fire Insurance Maps, 1867-1970 – Ohio”. OhioLINK Digital Media Center. <http://dmc.ohiolink.edu/mrsid/bin/viewmap.pl?client=Sanborn&image=Bdg/SanMaps/reel28/6674/00001.sid&oid=Reel28-6674-00001&sessionID=2108467497&title=Delaware%2C+Ohio&date=February%2C+1885&format=list&results=20&sort=thedate&searchstatus=1&hits=136&count=1> (2 May 2005).

Page 7

Citation lifecycle for geospatial data

[Diagram. Stages: citation creation; citation included in published work; (future) discovery of citation; retrieve/resolve citation. Citation target: geospatial data held in a geospatial repository. The citation appears in a journal article (held in a library or institutional repository) in the form "Author, 2006, Citation target,… <web_link>". Resolving it means reassembling the geospatial database contents. The diagram is annotated with numbered steps 1, 2 and 3.]

Page 8

Sample Feature Data

e.g. OS MasterMap:

toid="osgb1000042724588" version="2"

toid="osgb1000000334380650" version="1"

e.g. User data:

toid="rkb_20070328_2"

toid="rkb_20070329_1"

Page 9

Citation creation:

– generation of manifest
– incorporation into “bridge service” web page (with embedded manifest)
– citation in publication links to bridge service (via doi?)

[Diagram labels: Researcher_1; GIS_1; base data; other data source; Jan 2007; <M>…; <html>; References: Map1 … link]

where <M> =

e.g. OS MasterMap:

toid="osgb1000042724588" version="2"

toid="osgb1000000334380650" version="1"

e.g. User data:

toid="rkb_20070329_1"

toid="rkb_20070328_2"

Page 10

Sample Citation

So, the citation for this data becomes:

Bose, Rajendra. 2007. “Building geohazard data set”. University of Edinburgh Database Group. <link to manifest in bridge service that provides standard geospatial metadata>

where the 'manifest' is an XML file containing a list of feature identifiers and version numbers,

and the 'bridge service' is a web site that provides access to geospatial metadata.
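The slides do not show the manifest file itself, so the following is only a minimal sketch of what such a file might look like, using the sample identifiers from the earlier slides. The element names `manifest` and `feature`, and the helper names `build_manifest` and `read_manifest`, are assumptions made for illustration; they are not taken from the presentation or from any OS schema.

```python
import xml.etree.ElementTree as ET

# Identifiers and versions as listed on the sample feature data slide.
# The user-data features are shown there without a version, hence None.
FEATURES = [
    ("osgb1000042724588", "2"),
    ("osgb1000000334380650", "1"),
    ("rkb_20070329_1", None),
    ("rkb_20070328_2", None),
]


def build_manifest(features):
    """Serialise (toid, version) pairs as XML. Element names are hypothetical."""
    root = ET.Element("manifest")
    for toid, version in features:
        attrs = {"toid": toid}
        if version is not None:
            attrs["version"] = version
        ET.SubElement(root, "feature", attrs)
    return ET.tostring(root, encoding="unicode")


def read_manifest(xml_text):
    """Recover the (toid, version) pairs from a manifest string."""
    root = ET.fromstring(xml_text)
    return [(f.get("toid"), f.get("version")) for f in root.findall("feature")]


manifest_xml = build_manifest(FEATURES)
print(manifest_xml)
```

Because the manifest carries only identifiers and version numbers, not geometry, it stays compact and contains none of the licensed data itself.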

Page 11

Example of how bridge service would work:

embed manifest in ISO 19115, FGDC CSDGM, etc. metadata for set of features

Page 12

DNF Schema to support data association

Sample XML data based on DNF schema

- can be used to transfer object identifiers and version numbers

- can be used to associate users' data with reference data

[Callouts on the sample: Object identifier; Version number]

<gml:featureMember>
  <!--This object contains a simple association with a version and a multiple association. The referenceLayer attribute is used to show that this object belongs to "Yon Layer".-->
  <exaro:ExampleApplicationReferenceObject gml:id="yonr22222222" dnf:version="2" dnf:referenceLayer="Yon Layer">
    <exaro:exampleProperty1>Some application layer specific information</exaro:exampleProperty1>
    <exaro:meaningfulAssociationName>
      <!--In addition to the idReference this association specifies an individual version of the object. It also indicates the referenceLayer attribute showing that the referred to object is in the layer "That Layer".-->
      <dnf:SimpleAssociation xlink:href="#that11111111" dnf:version="2" dnf:referenceLayer="That Layer"/>
    </exaro:meaningfulAssociationName>
    <exaro:meaningfulListName>
      <!--The order of the sequence is the order in which the association appear in this file i.e. this is a sequence of objects that22222222, that11111111 and that33333333 in that order.-->
      <dnf:SimpleAssociationList>
        <dnf:simpleAssociations>
          <dnf:SimpleAssociation xlink:href="#that22222222" dnf:referenceLayer="That Layer" dnf:version="2"/>
          <dnf:SimpleAssociation xlink:href="#that11111111" dnf:referenceLayer="That Layer" dnf:version="3"/>
          <dnf:SimpleAssociation xlink:href="#that33333333" dnf:referenceLayer="That Layer" dnf:version="4"/>
        </dnf:simpleAssociations>
      </dnf:SimpleAssociationList>
    </exaro:meaningfulListName>
  </exaro:ExampleApplicationReferenceObject>
</gml:featureMember>
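To illustrate how object identifiers and version numbers could be pulled out of DNF-style XML like the sample above, here is a rough Python sketch. The sample fragment declares no namespaces, so the `dnf` and `exaro` URIs below are placeholders invented for this example (the GML and XLink URIs are the standard ones), and the function name `associations` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
XLINK = "http://www.w3.org/1999/xlink"
DNF = "urn:example:dnf"      # placeholder URI, not the real DNF namespace
EXARO = "urn:example:exaro"  # placeholder URI for the example application schema

# Trimmed copy of the sample from the slide, with namespace declarations added
# so that a namespace-aware parser accepts it.
SAMPLE = f"""
<gml:featureMember xmlns:gml="{GML}" xmlns:xlink="{XLINK}"
                   xmlns:dnf="{DNF}" xmlns:exaro="{EXARO}">
  <exaro:ExampleApplicationReferenceObject gml:id="yonr22222222"
      dnf:version="2" dnf:referenceLayer="Yon Layer">
    <exaro:meaningfulAssociationName>
      <dnf:SimpleAssociation xlink:href="#that11111111" dnf:version="2"
          dnf:referenceLayer="That Layer"/>
    </exaro:meaningfulAssociationName>
    <exaro:meaningfulListName>
      <dnf:SimpleAssociationList>
        <dnf:simpleAssociations>
          <dnf:SimpleAssociation xlink:href="#that22222222"
              dnf:referenceLayer="That Layer" dnf:version="2"/>
          <dnf:SimpleAssociation xlink:href="#that11111111"
              dnf:referenceLayer="That Layer" dnf:version="3"/>
          <dnf:SimpleAssociation xlink:href="#that33333333"
              dnf:referenceLayer="That Layer" dnf:version="4"/>
        </dnf:simpleAssociations>
      </dnf:SimpleAssociationList>
    </exaro:meaningfulListName>
  </exaro:ExampleApplicationReferenceObject>
</gml:featureMember>
""".strip()


def associations(xml_text):
    """Return (object identifier, version) for each SimpleAssociation, in file order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for assoc in root.iter(f"{{{DNF}}}SimpleAssociation"):
        href = assoc.get(f"{{{XLINK}}}href")      # e.g. "#that11111111"
        version = assoc.get(f"{{{DNF}}}version")  # e.g. "2"
        pairs.append((href.lstrip("#"), version))
    return pairs


for object_id, version in associations(SAMPLE):
    print(object_id, version)
```

The resulting list of identifier and version pairs is the same kind of information a manifest would need to carry.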

Page 13

Citation retrieval: Completes the life-cycle

– citation in publication links to bridge service (e.g. via doi as in STD-DOI project?)
– extract manifest and access archive from “bridge service”
– repositories use manifest to provide historical MasterMap and other features if user has permissions
– features can now be reassembled with a GIS

[Diagram labels: Researcher_2; GIS_2; step 4; References: Map1 … link; <html>…; <M>…; Repository/archive; Jan 2007; Jan 2010]
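The retrieval step can be sketched as a lookup of each manifest entry against an archive of historical feature versions, gated by a permission check. Everything here is hypothetical: the in-memory `ARCHIVE` dictionary stands in for a real repository, the stored values are dummy strings rather than geometry, and `resolve` is an illustrative name, not an existing service call.

```python
# Hypothetical archive: (toid, version) -> stored feature (dummy strings here).
ARCHIVE = {
    ("osgb1000042724588", "1"): "feature osgb1000042724588, version 1",
    ("osgb1000042724588", "2"): "feature osgb1000042724588, version 2",
    ("osgb1000000334380650", "1"): "feature osgb1000000334380650, version 1",
}


def resolve(manifest, archive, has_permission):
    """Return (found, missing) for the (toid, version) pairs in a manifest.

    Raises PermissionError if the user may not retrieve features at all,
    mirroring the "if user has permissions" condition on the slide.
    """
    if not has_permission:
        raise PermissionError("user is not permitted to retrieve these features")
    found, missing = {}, []
    for key in manifest:
        if key in archive:
            found[key] = archive[key]
        else:
            missing.append(key)
    return found, missing


manifest = [
    ("osgb1000042724588", "2"),
    ("osgb1000000334380650", "1"),
    ("rkb_20070329_1", None),  # user data, not held in this archive
]
found, missing = resolve(manifest, ARCHIVE, has_permission=True)
print(len(found), "features found;", "missing:", missing)
```

Reporting unresolved entries separately matters because a manifest may mix reference data with user data held in a different repository.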

Page 14

Geospatial Dataset Citation Example

• An example using:
  – Go-Geo! metadata portal and
  – GRADE geospatial repository

• Shows how the life cycle could be completed by maintaining the citation in the metadata and resolving it through a domain repository

• GRADE solves problems of access and authorisation by only being accessible to users of Digimap (for uploading and downloading of data)

• Could be extended to support feature manifests using DNF schema

Page 15

Go-Geo! geospatial metadata portal

Page 16

Linking a record in Go-Geo with the data in GRADE

Location of data

Page 17

Retrieving the data from GRADE

URL from Go-Geo!

Download File

Page 18

Conclusions

• Current guidelines for citing a selection of features within a geospatial database are inadequate

• An XML manifest could serve as a definitive, compact, portable (& copyright free) list of geospatial features that could facilitate citation

• A ‘bridge service’ or metadata portal could provide a means of retrieving citations

• Further work…
  – How do manifests interact with archives/repositories
  – Web services

Page 19

Safeguarding the Citation Lifecycle for Geospatial Repositories

Rajendra Bose

Guy McGarva

Tuesday 15 May 2007

CLADDIER Project Workshop
