A Virtual Laboratory for Global Biodiversity Analysis

Page 1: A Virtual Laboratory for Global Biodiversity Analysis

A Virtual Laboratory for Global Biodiversity Analysis

Page 2: A Virtual Laboratory for Global Biodiversity Analysis

Talk Outline

Background Project Information

Biodiversity World System

Current Progress and Future Work

Q & (hopefully) A

Page 3: A Virtual Laboratory for Global Biodiversity Analysis

Project Participants

Southampton: Oliver Bromley

Cardiff: Alec Gray, Andrew Jones, Richard White, Nick Fiddian, Xuebiao Xu, Nick Pittas

Bristol: Paul Valdes

Reading: Frank Bisby, Alistair Culham, Neil Caithness, Tim Sutton, Peter Brewer, Chris Yesson

NHM: Malcolm Scoble, Paul Williams, Shonil Bhagwat

Page 4: A Virtual Laboratory for Global Biodiversity Analysis

Project Aims

To create a prototype problem-solving environment (PSE) for Global Biodiversity research on the GRID.

To demonstrate application of the prototype in a range of data/computation intensive biodiversity investigations.

Page 5: A Virtual Laboratory for Global Biodiversity Analysis

Objectives

1. To establish a biodiversity GRID with nodes at Reading, Cardiff, Southampton and the NHM.

2. To design the architecture of a GRID-based PSE.

3. To build and test the basic system.

4. To demonstrate the system in use for three exemplar analyses.

Page 6: A Virtual Laboratory for Global Biodiversity Analysis

Computer Science Challenges

To achieve the seamless integration of the resources required in order to construct the PSE:

Deal with heterogeneity of resources

Accommodate the complex analyses required

Consider metadata formats for selecting and interpreting data from appropriate resources

Apply the above in a GRID environment

Page 7: A Virtual Laboratory for Global Biodiversity Analysis

Exemplars

BioClimatic Modelling & Climate Change

Biodiversity Richness & Conservation Evaluation

Phylogenetic Analysis & Biogeography

Page 8: A Virtual Laboratory for Global Biodiversity Analysis

BioClimatic Modelling

Predicting species distributions under past, present and future climate scenarios.

Models: GARP (Genetic Algorithms for Rule-set Production), CSM (Climate Space Models), Bioclim
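
To make the idea of a climate-envelope model concrete, here is a minimal sketch in the spirit of Bioclim: it records the range of each climate variable at known occurrence sites and then tests whether a candidate site falls inside that range. It is not the project's code. The function names, the percentile cut-offs and the variable names are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# variable name -> (lower bound, upper bound); a hypothetical representation
Envelope = Dict[str, Tuple[float, float]]


def percentile(values: List[float], q: float) -> float:
    """Linearly interpolated percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fit_envelope(occurrences: List[Dict[str, float]],
                 lower: float = 5.0, upper: float = 95.0) -> Envelope:
    """Build an envelope from the climate values observed at occurrence sites.

    The 5th/95th percentile defaults are an assumption, not a project setting.
    """
    variables = list(occurrences[0].keys())
    return {
        v: (percentile([o[v] for o in occurrences], lower),
            percentile([o[v] for o in occurrences], upper))
        for v in variables
    }


def is_suitable(envelope: Envelope, site: Dict[str, float]) -> bool:
    """A site is suitable only if every variable lies inside the envelope."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())
```

Running `is_suitable` over a grid of climate values for a past, present or future scenario would give a predicted distribution map for that scenario. GARP and CSM use different fitting procedures that are not sketched here.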

Page 9: A Virtual Laboratory for Global Biodiversity Analysis

Page 10: A Virtual Laboratory for Global Biodiversity Analysis

BioClimatic Modelling (cont)

Has the plant already reached all suitable environments world-wide, or are further expansions possible?

In which parts of Australia might it be worth introducing the plant?

Where will this plant die out, and where might it now appear, if the world is subject to some of the global warming scenarios?

Page 11: A Virtual Laboratory for Global Biodiversity Analysis

Biodiversity Richness & Conservation Evaluation

Concerned with analysis of biodiversity richness patterns for particular taxa around the globe.

Different approaches to measuring biodiversity (by species richness or by taxic diversity) depending on the purposes for which the measures are required.

WORLDMAP to be used as the analysis software (NHM)

Page 12: A Virtual Laboratory for Global Biodiversity Analysis

Page 13: A Virtual Laboratory for Global Biodiversity Analysis

Biodiversity Richness & Conservation Evaluation (cont)

Enhance conservation network design by answering questions about patterns of complementarity (species difference among areas).

Provide biodiversity richness assessment for the Geometer Moths group.
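
Species richness and complementarity can be illustrated with a short sketch. The slides do not describe how WORLDMAP computes them, so the code below is a generic greedy heuristic rather than WORLDMAP's method. The function names and the area/species data structure are assumptions.

```python
from typing import Dict, List, Set


def species_richness(areas: Dict[str, Set[str]]) -> Dict[str, int]:
    """Richness of an area is simply the number of species recorded in it."""
    return {area: len(species) for area, species in areas.items()}


def greedy_complementarity(areas: Dict[str, Set[str]]) -> List[str]:
    """Choose areas one at a time, always taking the area that adds the most
    species not yet represented, until every species is covered.

    Ties are broken alphabetically by area name so the result is repeatable.
    """
    remaining: Set[str] = set().union(*areas.values())
    chosen: List[str] = []
    while remaining:
        best = max(sorted(areas), key=lambda a: len(areas[a] & remaining))
        chosen.append(best)
        remaining -= areas[best]
    return chosen
```

The point of the contrast is that the richest areas are not necessarily the best network: an area with few species can still be selected if those species occur nowhere else.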

Page 14: A Virtual Laboratory for Global Biodiversity Analysis

Phylogenetic Analysis & Biogeography

Aims to use phylogeny to interpret biodiversity data such as:

Species distribution,

Species morphology, and

Life history evolution.

Page 15: A Virtual Laboratory for Global Biodiversity Analysis

Phylogenetic Analysis & Biogeography (cont)

Is geography a good predictor of relationship among lineages?

Do all lineages show the same dispersal capacity?

Have lineages stayed put, adapting in situ while climates have changed?

Page 16: A Virtual Laboratory for Global Biodiversity Analysis

Architecture (simplified)

[Diagram: User Interface, Metadata Repository, Workflow Manager, Ontology, BDWorld GRID Interface, and four Wrappers in front of the Resource Modules]

Page 17: A Virtual Laboratory for Global Biodiversity Analysis

Data Flow

[Diagram labels: Resource, Wrapper, BGI, Core, Request, Response, Metadata, Workflow Designer, Workflow Manager, Taxonomic Verification]

Page 18: A Virtual Laboratory for Global Biodiversity Analysis

Resource Modules

Two types of resources: analytic resources (services) and data resources.

Wrapped through a standard communication interface.

Thus the BDWorld GRID is split into two sub-Grids: the Computational Grid and the Data Grid.
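
The wrapping idea can be sketched as a single abstract interface with one implementation per resource type. The slides do not give the real interface, so every class and method name below is hypothetical, and the request and response are plain dictionaries purely for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class ResourceWrapper(ABC):
    """Hypothetical common interface: callers only ever see `invoke`."""

    kind = "unknown"

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def invoke(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Handle one request and return a response."""


class DataResourceWrapper(ResourceWrapper):
    """Wraps a data resource, here an in-memory list of occurrence records."""

    kind = "data"

    def __init__(self, name: str, records: List[Dict[str, Any]]) -> None:
        super().__init__(name)
        self._records = records

    def invoke(self, request: Dict[str, Any]) -> Dict[str, Any]:
        wanted = request["species"]
        return {"records": [r for r in self._records if r["species"] == wanted]}


class AnalyticResourceWrapper(ResourceWrapper):
    """Wraps an analytic resource, here any function over a list of records."""

    kind = "analytic"

    def __init__(self, name: str, function: Callable[[List[Any]], Any]) -> None:
        super().__init__(name)
        self._function = function

    def invoke(self, request: Dict[str, Any]) -> Dict[str, Any]:
        return {"result": self._function(request["records"])}
```

Because both kinds of wrapper expose the same `invoke` call, the output of a data resource can be handed directly to an analytic resource without the caller knowing how either is implemented.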

Page 19: A Virtual Laboratory for Global Biodiversity Analysis

Workflow Manager

Main point of entry for the user into the system.

Allows the user to define the sequence of tasks (with associated data) in order to complete an analysis.

Two versions investigated in parallel:

Current one based on the XPDL representation and the Open Business Engine (OBE) WF engine.

It has been decided to revert to the TRIANA workflow engine (if not both the TRIANA WFM engine and UI) in the near future.
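
A workflow as "a sequence of tasks with associated data" can be sketched in a few lines. This does not use XPDL, OBE or TRIANA and does not reflect their APIs. The `Task` and `Workflow` classes are hypothetical and only show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass
class Task:
    """One step: reads the shared context and returns new entries for it."""
    name: str
    run: Callable[[Context], Context]


@dataclass
class Workflow:
    """An ordered list of tasks executed one after another."""
    tasks: List[Task] = field(default_factory=list)

    def add(self, task: Task) -> "Workflow":
        self.tasks.append(task)
        return self

    def execute(self, data: Context) -> Context:
        context = dict(data)  # copy so the caller's input is left untouched
        for task in self.tasks:
            context.update(task.run(context))
        return context
```

A real workflow engine would also handle branching, remote invocation and failure recovery, none of which is attempted here.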

Page 20: A Virtual Laboratory for Global Biodiversity Analysis

Metadata Repository

Allows resources to publish their metadata: computational capabilities and supported data types.

Allows workflow (sub)sequences to verify their validity.

Holds advanced system information: data provenance and alternative computational resources for a given task.

Currently implemented as a relational DB.

Plans to move to a semantically more flexible implementation (OAVs) in future releases.
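
One way published metadata could be used to verify a workflow (sub)sequence is to check that each resource produces the data type the next one consumes. The sketch below assumes a simple in-memory registry rather than the relational database mentioned above. The `ResourceMetadata` fields and all resource and type names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class ResourceMetadata(NamedTuple):
    """Hypothetical metadata record: one input type and one output type."""
    consumes: str
    produces: str


def validate_sequence(registry: Dict[str, ResourceMetadata],
                      sequence: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the sequence is valid."""
    problems = [f"unknown resource: {name}"
                for name in sequence if name not in registry]
    if problems:
        return problems
    # Compare each resource's output type with its successor's input type.
    for current, following in zip(sequence, sequence[1:]):
        produced = registry[current].produces
        expected = registry[following].consumes
        if produced != expected:
            problems.append(
                f"{current} produces {produced} but {following} expects {expected}")
    return problems
```

Returning a list of problems rather than a single boolean lets a user interface report every mismatch in a draft workflow at once.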

Page 21: A Virtual Laboratory for Global Biodiversity Analysis

Ontology

Provides a high-level description of system entities.

Helps the user in workflow formulation:

Concept hierarchies that denote equivalent concepts/resources

Concept association showing intra-concept relationships

Currently work in progress.

Page 22: A Virtual Laboratory for Global Biodiversity Analysis

Communications Layer API

Allows for communication among system components: remote component invocation and data interchange.

Provides clients that invoke resources transparently.

Provides server-side behaviour to resources by inheriting a single class.

Implements a standard data exchange format (object XML serialisation/revival).

Allows for monitoring the progress of active processes.

Currently exists in two "flavours": an RMI-based one (pre-GLOBUS era) and an OGSA-based one.
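
The idea of object XML serialisation and revival can be illustrated with a small round trip. The project's actual exchange format is not shown in the slides, so the element names, the `type` attribute and the set of supported types below are all assumptions. The sketch uses Python's standard `xml.etree.ElementTree`, not the RMI or OGSA layers themselves.

```python
import xml.etree.ElementTree as ET
from typing import Any

# Scalar types this sketch can round-trip; anything else is rejected.
_SCALARS = {"str": str, "int": int, "float": float}


def serialise(value: Any, tag: str = "object") -> ET.Element:
    """Turn nested dicts, lists and scalars into an XML element tree."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        elem.set("type", "dict")
        for key, item in value.items():
            child = serialise(item, "entry")
            child.set("key", str(key))
            elem.append(child)
    elif isinstance(value, list):
        elem.set("type", "list")
        for item in value:
            elem.append(serialise(item, "item"))
    else:
        kind = type(value).__name__
        if kind not in _SCALARS:
            raise TypeError(f"unsupported type: {kind}")
        elem.set("type", kind)
        elem.text = str(value)
    return elem


def revive(elem: ET.Element) -> Any:
    """Rebuild the original value from an element produced by `serialise`."""
    kind = elem.get("type")
    if kind == "dict":
        return {child.get("key"): revive(child) for child in elem}
    if kind == "list":
        return [revive(child) for child in elem]
    return _SCALARS[kind](elem.text or "")
```

A sender would call `ET.tostring(serialise(payload), encoding="unicode")` to get the text to transmit, and the receiver would pass `ET.fromstring(text)` to `revive`. Dictionary keys come back as strings, which is a limitation of this sketch.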

Page 23: A Virtual Laboratory for Global Biodiversity Analysis

Communications Layer API (cont)

Page 24: A Virtual Laboratory for Global Biodiversity Analysis

Using Globus

No major problems encountered (post-beta versions).

Tricky to work with complex data structures.

Deployment of finished resources is time-consuming/error-prone.

Page 25: A Virtual Laboratory for Global Biodiversity Analysis

Current Progress

RMI communications layer currently used.

OGSA-based architecture about to be rolled out (was?)

Enough resources to carry out the Bioclimatic analysis exemplar. (Biogeography almost ready)

Rudimentary Metadata repository exists.

XPDL/OBE Workflow manager has been successfully implemented. TRIANA currently considered.

Page 26: A Virtual Laboratory for Global Biodiversity Analysis

Future plans

Introduce system's Ontology

Revert to TRIANA

Convert to an all-OGSA architecture(??)

Introduce resources/workflows for the remaining 2 exemplars.

Enhance the representational power of the metadata repository.

Page 27: A Virtual Laboratory for Global Biodiversity Analysis
