Short presentation on the lifecycle of scientific data and how it relates to the Glastir Monitoring and Evaluation Programme. The GMEP is effectively a "real-time" healthcheck system for the new Welsh agri-environment scheme Glastir.
Processing of scientific data
From field capture to web delivery
Hector Quintero Casanova
Postgraduate in e-Science
Why e-Science? Data-intensive
● GMEP ticks all the boxes:
✔ Highly multidisciplinary: social, landscape, water, birds, plants...
✔ Large volumes of data: covers the whole of Wales.
✔ Cross-organisational collaboration: 13 institutions.
Why e-Science? Metadata
● NERC's data policy says it all
– “It is essential that metadata are submitted”
● Metadata = context information about data
– Provenance = who, when, where, how
● Exposes data relationships → traceability
– Workflow = how. Essential if using models
● Enables reproducing outcome → repeatability
● Exactly what information depends on the stage.
● Data collection: raw data from the field
– Metadata: method, calibration, place, units...
● Data analysis: information products, e.g. data from models
– Metadata: name, conditions, where it applies
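As an illustration of metadata travelling with a raw field measurement, here is a minimal sketch in Python. The class names, field names and all sample values are assumptions made up for this example; they are not taken from GMEP or from NERC's data policy.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Context about the capture: who, when, where, how."""
    who: str        # person or team that captured the data
    when: datetime  # time of capture
    where: str      # site identifier or coordinates
    how: str        # capture method


@dataclass(frozen=True)
class FieldMeasurement:
    """A raw value that is never stored without its metadata."""
    value: float
    units: str
    calibration: str
    provenance: Provenance


# All values below are invented for illustration only.
sample = FieldMeasurement(
    value=7.2,
    units="pH",
    calibration="two-point buffer, pH 4 and pH 7",
    provenance=Provenance(
        who="field-team-a",
        when=datetime(2013, 6, 1, 10, 30, tzinfo=timezone.utc),
        where="hypothetical-site-001",
        how="handheld probe",
    ),
)

# Nested dataclasses flatten into plain dictionaries, ready for storage.
record = asdict(sample)
```

Keeping the value and its provenance in one immutable record means a measurement cannot be passed on with its context stripped off.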
Data analysis
● Workflow metadata avoids costly reruns
– Identify model output needed → reuse
● But not enough for cross-organisation collaboration.
– 13 institutions in Glastir.
– Differences in storage structure, metadata definitions...
● Need extra layer(s) for seamless access
– Web already offers tools needed.
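The first point above, that workflow metadata lets an existing model output be identified and reused, can be sketched as follows. This is a toy catalogue under stated assumptions: the names `workflow_key`, `OutputCatalogue` and `toy_model` are hypothetical, and the "model" is a trivial stand-in for an expensive run.

```python
import hashlib
import json


def workflow_key(model, conditions):
    """Derive a stable identifier from the workflow metadata."""
    payload = json.dumps(
        {"model": model, "conditions": conditions}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class OutputCatalogue:
    """Stores model outputs indexed by their workflow metadata."""

    def __init__(self):
        self._outputs = {}
        self.runs = 0  # counts how many times a model actually ran

    def get_or_run(self, model, conditions, run):
        key = workflow_key(model, conditions)
        if key not in self._outputs:
            # No recorded output for this workflow: run the model once.
            self.runs += 1
            self._outputs[key] = run(conditions)
        return self._outputs[key]


def toy_model(conditions):
    # Stand-in for a costly model run (hypothetical).
    return conditions["rainfall_mm"] * conditions["area_km2"]


catalogue = OutputCatalogue()
first = catalogue.get_or_run(
    "toy-runoff", {"rainfall_mm": 10, "area_km2": 2}, toy_model
)
# Same conditions in a different key order: the recorded output is reused.
second = catalogue.get_or_run(
    "toy-runoff", {"area_km2": 2, "rainfall_mm": 10}, toy_model
)
```

Sorting the keys before hashing makes the identifier independent of the order in which conditions were recorded, which matters when several people describe the same run.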
Publication: linked data
● HTTP for generic retrieval of resources
● URIs for unique identification of those resources
– E.g. http://www.ceh.ac.uk
● Both can be used to build web services
– Amount to remote functions.
– E.g. seamless recording of workflows across institutions.
● Semantics for automated reasoning
– Acts as standardised metadata aimed at machines.
… We've come full circle!
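To make the linked-data idea concrete, the sketch below stores metadata as (subject, predicate, object) statements whose parts are URIs, then follows them automatically to trace a product back to its sources. The resource URIs under `http://example.org/gmep/` are hypothetical; the predicate is the `wasDerivedFrom` term from the W3C PROV vocabulary. A real deployment would more likely use an RDF library and a triple store, which this example deliberately leaves out.

```python
# Predicate from the W3C PROV vocabulary.
PROV_DERIVED = "http://www.w3.org/ns/prov#wasDerivedFrom"

# Hypothetical base URI, used only for this illustration.
BASE = "http://example.org/gmep/"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = {
    (BASE + "map/habitat-2013", PROV_DERIVED, BASE + "model-output/run-42"),
    (BASE + "model-output/run-42", PROV_DERIVED, BASE + "survey/raw-plants"),
    (BASE + "model-output/run-42", PROV_DERIVED, BASE + "survey/raw-water"),
}


def trace_sources(resource, statements):
    """Return every resource the given one was derived from, directly or not."""
    found = set()
    pending = [resource]
    while pending:
        current = pending.pop()
        for subject, predicate, obj in statements:
            if subject == current and predicate == PROV_DERIVED and obj not in found:
                found.add(obj)
                pending.append(obj)
    return found


sources = trace_sources(BASE + "map/habitat-2013", triples)
```

Because every resource has a unique URI, statements recorded by different institutions can be merged into one set and traced with the same function.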
Questions?
Hector Quintero Casanova Postgraduate in e-Science
Thank you
www.hqcasanova.com