Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology


Sarah Carrier, SILS, Master’s Student
Jackson Dube, Visiting Scholar, SILS/MRC
Jane Greenberg, Associate Professor, Director SILS/Metadata Research Center (MRC), UNC-CH
Ruth Monnig, Doctoral Research Assistant, SILS/MRC

Overview

1. Metadata defined
2. Role of metadata in a repository
3. Range of metadata standards
4. Issues
5. Discussion

The Knowledge Network for Biocomplexity (KNB)

http://knb.ecoinformatics.org//data.html

Metadata Example for a specimen

Family: Pinaceae
Species: Pinus serotina
Date identified: 1958-05-10
County: Pasquotank County
Location collected: Woodland Border, 2.3 miles north east of Nisonton
Collected by: Harry E. Ahles

<Species>Pinus serotina</Species>
<Date.ID scheme="SPEC.W3CDTF">1958-05-10</Date.ID>
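
The mapping from the labelled specimen record to the tagged form can be sketched in Python. This is only an illustration under stated assumptions: the element names `Species` and `Date.ID` and the `scheme` value come from the slide, while the `<Specimen>` wrapper element, the function name `specimen_to_xml`, and the dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date


def specimen_to_xml(record):
    """Serialize a flat specimen record into an XML fragment.

    The <Specimen> wrapper and the dictionary keys are assumptions made
    for this sketch; they are not taken from a published schema.
    """
    # W3CDTF also allows shorter forms (YYYY, YYYY-MM); this sketch only
    # checks the full YYYY-MM-DD form and raises ValueError otherwise.
    date.fromisoformat(record["date_identified"])

    root = ET.Element("Specimen")  # hypothetical wrapper element
    ET.SubElement(root, "Species").text = record["species"]
    date_el = ET.SubElement(root, "Date.ID", {"scheme": "SPEC.W3CDTF"})
    date_el.text = record["date_identified"]
    return ET.tostring(root, encoding="unicode")


record = {"species": "Pinus serotina", "date_identified": "1958-05-10"}
xml_text = specimen_to_xml(record)
print(xml_text)
```

Naming the date scheme in an attribute lets a consumer know how to parse the value without guessing its format.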

Metadata for a Water Quality Study

1. <fileName ID='File1'>Jordan Lake Study</fileName>

2. <varQnty>15</varQnty>

3. <caseQnty>2000</caseQnty>

4. <varGrp><labl>Study Procedure Information</labl></varGrp>

5. <varGrp type="subject"><txt> The following 15 variables were used to measure water quality over a two-year period.</txt></varGrp>

6. <varGrp><defntn>Salinity is described as XXXXX</defntn></varGrp>
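
To show how structured metadata like this supports machine use, here is a hedged Python sketch that reads a few of the elements above. The element names are copied from the slide; the `<study>` wrapper element and the `summary` dictionary are assumptions added so the fragment parses as one document.

```python
import xml.etree.ElementTree as ET

# Elements copied from the slide; the <study> root is a hypothetical
# wrapper, needed only because an XML document must have a single root.
fragment = """
<study>
  <fileName ID='File1'>Jordan Lake Study</fileName>
  <varQnty>15</varQnty>
  <caseQnty>2000</caseQnty>
  <varGrp><labl>Study Procedure Information</labl></varGrp>
</study>
""".strip()

root = ET.fromstring(fragment)

summary = {
    "file_id": root.find("fileName").get("ID"),
    "file_name": root.findtext("fileName"),
    "variables": int(root.findtext("varQnty")),
    "cases": int(root.findtext("caseQnty")),
    # Collect labels only from variable groups that actually carry one.
    "group_labels": [
        group.findtext("labl")
        for group in root.findall("varGrp")
        if group.find("labl") is not None
    ],
}
print(summary)
```

Because counts such as `varQnty` are tagged rather than buried in prose, a repository can index or validate them without human interpretation.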

Metadata

Data about the content, quality, condition, and other characteristics of data (FGDC Glossary, 1992)

Additional information necessary for data to be useful (Musik, 1997)

Resource = data = object = entity = document = data object

Why metadata?

Facilitate discovery
Permit use – intellectual and technical
Manage and preserve
Secure

Help advance the field of evolutionary biology

Range of published data objects

Table, graph
Dataset
Research methods / procedures
– Bayesian inference of phylogeny
– Meta-analysis
– Computational biology

Metadata continuum

FGDC/CSDGM
EML
Dublin Core
DDI

The Knowledge Network for Biocomplexity (KNB)

*http://knb.ecoinformatics.org//data.html

ontologies

Data structures

Issues

Cost
– More metadata, more cost to produce
– Less metadata, cost to users

Metadata creation
– Who, when, how?
– Incentivizing

Preservation, sustainability
– Data object and associated metadata

Open access (“a loaded word”)
– What levels of access/rights should be supported?

Discussion Topics

Range of data objects
Granularity (metadata)
Users: Needs, greater use
Additional issues…

Metadata types and properties

Metadata “type”                                     Property, etc.
*Resource/data discovery                            Title, subject
Provenance                                          Creator, source
Terms and conditions metadata (intellectual use)    Access rights, manipulation rights
Structural metadata (technical use)                 Software and hardware needs

*Resource = data = object = entity = document = data object
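
The type-to-property grouping on this slide can be expressed as a small lookup structure. The Python sketch below is illustrative only: the four types and their properties follow the slide, but the key names, the `group_by_type` helper, and the sample `access_rights` value are hypothetical.

```python
# Metadata types and example properties, following the slide.
# The snake_case key names are assumptions made for this sketch.
METADATA_TYPES = {
    "discovery": ["title", "subject"],
    "provenance": ["creator", "source"],
    "terms_and_conditions": ["access_rights", "manipulation_rights"],
    "structural": ["software_needs", "hardware_needs"],
}


def group_by_type(record):
    """Sort a flat metadata record into the four types above.

    Properties missing from the record are skipped, so an empty group
    shows which kind of metadata has not been supplied yet.
    """
    grouped = {metadata_type: {} for metadata_type in METADATA_TYPES}
    for metadata_type, properties in METADATA_TYPES.items():
        for prop in properties:
            if prop in record:
                grouped[metadata_type][prop] = record[prop]
    return grouped


# "open" is a placeholder value, not taken from the slides.
record = {"title": "Jordan Lake Study", "access_rights": "open"}
grouped = group_by_type(record)
print(grouped)
```

Grouping a record this way makes gaps visible: here the provenance and structural groups come back empty.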

Range of metadata standards

Schemes (just a few…)
LSID
TEI Header; MARC bibliographic format, Dublin Core
EAD
FGDC/CSDGM; NBII
EML
DDI
ODRL (Creative Commons Profile)
A Core
PREMIS

Characteristics
Objectives and principles
Domains
– Environment
– Object type/format
Architectural Layout
– Extent
– Level of Complexity: flat, hierarchical
– Granularity

Range of metadata standards

Data structure standards
Data communication standards
Data value standards
– Content representation, ontologies, authority files
Data syntax standards
Data models, architectures/packaging
