Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology

Sarah Carrier, SILS, Master’s Student
Jackson Dube, Visiting Scholar, SILS/MRC
Jane Greenberg, Associate Professor, Director SILS/Metadata Research Center (MRC), UNC-CH
Ruth Monnig, Doctoral Research Assistant, SILS/MRC


Page 1: Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology

Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology

Sarah Carrier, SILS, Master’s Student
Jackson Dube, Visiting Scholar, SILS/MRC
Jane Greenberg, Associate Professor, Director SILS/Metadata Research Center (MRC), UNC-CH
Ruth Monnig, Doctoral Research Assistant, SILS/MRC

Page 2

Overview

1. Metadata defined
2. Role of metadata in a repository
3. Range of metadata standards
4. Issues
5. Discussion

Page 3

The Knowledge Network for Biocomplexity (KNB)

http://knb.ecoinformatics.org//data.html

Page 4

Metadata Example for a Specimen

Family: Pinaceae
Species: Pinus serotina
Date identified: 1958-05-10
County: Pasquotank County
Location collected: Woodland Border, 2.3 miles northeast of Nisonton
Collected by: Harry E. Ahles

<Species>Pinus serotina</Species>
<Date.ID scheme="SPEC.W3CDTF">1958-05-10</Date.ID>
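As a minimal sketch of how the specimen record above could be turned into tagged metadata, the Python below serializes the fields with the standard library's xml.etree.ElementTree. The Species and Date.ID element names and the SPEC.W3CDTF scheme label come from the slide; the Specimen root element, the remaining element names, and the function name are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Field values copied from the specimen example; "Collector" is a hypothetical element name.
record = {
    "Family": "Pinaceae",
    "Species": "Pinus serotina",
    "Date.ID": "1958-05-10",
    "County": "Pasquotank County",
    "Collector": "Harry E. Ahles",
}


def specimen_to_xml(rec):
    """Serialize a flat specimen record under a hypothetical <Specimen> root."""
    root = ET.Element("Specimen")
    for name, value in rec.items():
        elem = ET.SubElement(root, name)
        elem.text = value
        if name == "Date.ID":
            # Raises ValueError if the value is not a valid ISO calendar date.
            date.fromisoformat(value)
            # Record which date scheme the value follows, as on the slide.
            elem.set("scheme", "SPEC.W3CDTF")
    return ET.tostring(root, encoding="unicode")


xml_text = specimen_to_xml(record)
print(xml_text)
```

Declaring the scheme on the element is what lets a repository interpret 1958-05-10 as a date rather than as free text.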

Page 5

Metadata for a Water Quality Study

<fileName ID="File1">Jordan Lake Study</fileName>
<varQnty>15</varQnty>
<caseQnty>2000</caseQnty>
<varGrp><labl>Study Procedure Information</labl></varGrp>
<varGrp type="subject"><txt>The following 15 variables were used to measure water quality over a two-year period.</txt></varGrp>
<varGrp><defntn>Salinity is described as XXXXX</defntn></varGrp>
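To illustrate how a repository might read such study-level metadata, here is a small Python sketch that parses the elements above and summarizes them. The element names are those shown on the slide; the enclosing study root element, the summary dictionary, and the max_cells figure are assumptions added so that the fragment parses as a single document and yields something to inspect.

```python
import xml.etree.ElementTree as ET

# The slide's elements, wrapped in a hypothetical <study> root so they form one document.
fragment = """
<study>
  <fileName ID="File1">Jordan Lake Study</fileName>
  <varQnty>15</varQnty>
  <caseQnty>2000</caseQnty>
  <varGrp><labl>Study Procedure Information</labl></varGrp>
  <varGrp type="subject"><txt>The following 15 variables were used to measure water quality over a two-year period.</txt></varGrp>
  <varGrp><defntn>Salinity is described as XXXXX</defntn></varGrp>
</study>
"""

root = ET.fromstring(fragment)

summary = {
    "file": root.findtext("fileName"),
    "variables": int(root.findtext("varQnty")),
    "cases": int(root.findtext("caseQnty")),
    # One (tag, text) pair for each child of every variable group.
    "groups": [
        (child.tag, child.text.strip())
        for group in root.findall("varGrp")
        for child in group
    ],
}

# Upper bound on data cells, assuming every case records every variable.
summary["max_cells"] = summary["variables"] * summary["cases"]
print(summary)
```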

Page 6

Metadata

Data about the content, quality, condition, and other characteristics of data (FGDC Glossary, 1992)
Additional information necessary for data to be useful (Musik, 1997)

Resource = data = object = entity = document = data object

Page 7

Why metadata?

Facilitate discovery
Permit use – intellectual and technical
Manage and preserve
Secure

Help advance the field of evolutionary biology

Page 8

Range of published data objects

Table, graph
Dataset
Research methods / procedures
– Bayesian inference of phylogeny
– Meta-analysis
– Computational biology

Page 9

Metadata continuum

[Figure: metadata standards placed along a continuum: FGDC/CSDGM, EML, Dublin Core, DDI]

Page 10

The Knowledge Network for Biocomplexity (KNB)

*http://knb.ecoinformatics.org//data.html

Page 11

The Knowledge Network for Biocomplexity (KNB)

*http://knb.ecoinformatics.org//data.html

ontologies

Data structures

Page 12

Issues

Cost
– More metadata, more cost to produce
– Less metadata, cost to users
Metadata creation
– Who, when, how?
– Incentivizing
Preservation, sustainability
– Data object and associated metadata
Open access (“a loaded word”)
– What levels of access/rights should be supported

Page 13

Discussion Topics

Range of data objects
Granularity (metadata)
Users: Needs, greater use
Additional issues…

Page 14
Page 15

Metadata types and properties

Metadata “type” | Property, etc.
*Resource/data discovery | Title, subject
Provenance | Creator, source
Terms and conditions metadata (intellectual use) | Access rights, manipulation rights
Structural metadata (technical use) | Software and hardware needs

*Resource = data = object = entity = document = data object
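The type-to-property pairing above can be sketched as a small lookup that reports which metadata types a given record actually covers. This is only an illustration: the type labels and property names are taken from the table, while the coverage function, the lower-cased keys, and the sample record are hypothetical.

```python
# Metadata types and their example properties, taken from the table.
METADATA_TYPES = {
    "Resource/data discovery": ["title", "subject"],
    "Provenance": ["creator", "source"],
    "Terms and conditions (intellectual use)": ["access rights", "manipulation rights"],
    "Structural (technical use)": ["software and hardware needs"],
}


def coverage(record):
    """Map each metadata type to the properties the record actually fills in."""
    # Only non-empty values count as present; keys are compared case-insensitively.
    present = {key.lower() for key, value in record.items() if value}
    return {
        mtype: [prop for prop in props if prop in present]
        for mtype, props in METADATA_TYPES.items()
    }


# Hypothetical record: two properties filled in, one left empty.
record = {
    "Title": "Pinus serotina specimen",
    "Creator": "Harry E. Ahles",
    "Access rights": "",
}

report = coverage(record)
missing = [mtype for mtype, found in report.items() if not found]
print(missing)
```

A check of this kind is one way to make the cost trade-off concrete: each type left uncovered is work shifted from the metadata creator to the eventual user.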

Page 16

Range of metadata standards

Schemes (just a few…)
LSID
TEI Header; MARC bibliographic format, Dublin Core
EAD
FGDC/CSDGM; NBII
EML
DDI
ODRL (Creative Commons Profile)
A Core
PREMIS

Characteristics
Objectives and principles
Domains
– Environment
– Object type/format
Architectural layout
– Extent
– Level of complexity: flat, hierarchical
– Granularity

Page 17

Range of metadata standards

Data structure standards
Data communication standards
Data value standards
– Content representation, ontologies, authority files
Data syntax standards
Data models, architectures/packaging