myGrid: Personalised e-Biology
on the Grid
Professor Carole Goble
http://www.mygrid.org.uk
Contact [email protected]
myGrid: Personalised e-Science
on the Grid
Personalised extensible environments for data-intensive in silico experiments in biology
e-Science & Biology
• Biology is a multi-faceted & increasingly multi-disciplinary science.
• Bioinformatics is an “e-Science”.
– Discovery is done in silico on results obtained from experiments using a number of analysis & data resources.
• Molecular biology & genomics are our particular focus.
Circadian Rhythms
• Has anyone studied the effect of neurotransmitters on the circadian rhythms in Drosophila?
• How do the functions of the clusters of proteins from my experiment interrelate? What are the proteins with a particular function?
• Is a structure known for this protein and what other proteins have a similar structure?
• Can I build a homology 3D model?
• What is known about the homologous protein?
Information Weaving
• Large amounts of data & many applications.
• Highly heterogeneous.
– Different types, algorithms, forms, implementations, communities, service providers.
• Highly complex and inter-related.
• Highly volatile.
• Obstacles everywhere.
Descriptive knowledge
Circadian Rhythms
1. Has anyone else studied the effect of neurotransmitters on the circadian rhythms in Drosophila?
2. How do the functions of the clusters of proteins from my experiment interrelate? And what are the proteins with a particular function?
3. Is a structure known for this protein and what other proteins have a similar structure?
4. Can I build a homology 3D model?
5. What is known about the homologous protein?
E-Science Q & A
• Who else has asked this question & can I use/adapt their approach?
– Workflow.
• What were the results at each stage?
– Dynamic Data Repositories.
• When was P12345 last updated? Which BLAST did I use?
– Provenance.
• Has PDB changed since I last ran this?
– Notification.
Personalisation.
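The provenance and notification questions above can be illustrated with a small sketch. This is not myGrid code: the `ProvenanceRecord` fields, the tool version labels, the resource version strings and the run dates are all invented for illustration. It only shows how a per-run record makes "Which BLAST did I use?" and "Has PDB changed since I last ran this?" answerable.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ProvenanceRecord:
    """One workflow run: which tool was used and which resource versions it saw."""
    workflow: str
    tool: str
    tool_version: str            # hypothetical version label
    run_at: datetime             # invented date, for ordering runs only
    resource_versions: dict = field(default_factory=dict)


def latest_run(log, workflow):
    """Return the most recent record for the named workflow."""
    runs = [r for r in log if r.workflow == workflow]
    return max(runs, key=lambda r: r.run_at)


def which_tool(log, workflow):
    """Provenance query: tool and version used by the latest run."""
    run = latest_run(log, workflow)
    return run.tool, run.tool_version


def changed_since_last_run(log, workflow, current_versions):
    """Notification check: resources whose version differs from the latest run."""
    run = latest_run(log, workflow)
    return sorted(
        name
        for name, seen in run.resource_versions.items()
        if current_versions.get(name) != seen
    )


log = [
    ProvenanceRecord("protein-similarity", "BLAST", "blastp-A",
                     datetime(2002, 5, 1), {"PDB": "v1", "P12345": "rev3"}),
    ProvenanceRecord("protein-similarity", "BLAST", "blastp-B",
                     datetime(2002, 6, 1), {"PDB": "v1", "P12345": "rev4"}),
]

print(which_tool(log, "protein-similarity"))
print(changed_since_last_run(log, "protein-similarity",
                             {"PDB": "v2", "P12345": "rev4"}))
```

In a real deployment the change check would presumably be pushed by a notification service rather than polled by the scientist; the comparison against recorded versions is the part this sketch is meant to show.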
myGrid Objectives
• Straightforward discovery, interoperation, fusion, sharing of data, knowledge and workflows.
• Explicit management of workflows.
– information & processes & best practice.
• Improving quality of experiments & data.
– provenance & propagating change.
• Scientific discovery is personal & global.
– personalisation & collaborative working.
• Security, ownership -> valuable assets.
Who is myGrid for?
– Users, developers, maintainers.
– Biologists.
– Bioinformaticians, resource providers.
– Tool builders, system administrators.
myGrid users
Figure: myGrid users, ranging from biologists to IS specialists. Labels: infrequent, problem specific, bioinformaticians, tool builders, service provider, systems administrators, bioinformatics tool builders.
myGrid Outcomes
1. e-Scientists
– Environment built on toolkits for service access, personalisation & community.
– Gene function expression analysis (fly & yeast).
– Annotation workbench for the PRINTS pattern database.
2. Developers
– Protocols and service descriptions.
– myGrid-in-a-Box developers kit of core services.
– Reference implementation services & applications.
– Bio services – already delivered.
myGrid Stack
Figure: the myGrid stack. Labels: Applications; Client Framework (Portal, User Agent, Collaboration); Networked Services; Metadata Services; Coordination Services; Semantic Services (Info. Extraction, Workflow, Ontology); Data; Workflow; Directory; Provenance; Personalisation; Governance; Admin.
myGrid Pre-Prototype
Figure: pre-prototype components. Labels: Portal, Bioinformatic Services, Personal Repository, Metadata: Ontology, Workflow Enactment, Metadata: Service Directory, Workflow Repository.
Locating a workflow
How do the functions of the clusters of proteins from my experiment interrelate?
Figure: components involved in locating a workflow. Labels: Portal, Repository Client, Ontology Client, Workflow Client, Personal Repository, Meta Data: Ontology, Workflow Repository, Meta Data: Service Type Directory.
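As a rough illustration of locating a workflow through ontology metadata, the sketch below annotates each stored workflow with a concept and matches a query concept against it, including more specific concepts. The concept names, workflow names and the `locate_workflows` function are hypothetical and are not taken from myGrid; a real system would use a description logic reasoner over DAML+OIL rather than a parent-pointer dictionary.

```python
# Hypothetical ontology fragment: child concept -> parent concept.
ONTOLOGY = {
    "protein_function_clustering": "protein_analysis",
    "protein_structure_search": "protein_analysis",
    "protein_analysis": "bioinformatics_task",
}

# Hypothetical workflow repository: workflow name -> annotated concept.
WORKFLOWS = {
    "cluster-by-function": "protein_function_clustering",
    "find-similar-structures": "protein_structure_search",
    "literature-search": "bioinformatics_task",
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in ONTOLOGY."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)  # step up to the parent, None at the root
    return False


def locate_workflows(query_concept):
    """Names of workflows annotated with the query concept or a specialisation."""
    return sorted(
        name
        for name, concept in WORKFLOWS.items()
        if is_a(concept, query_concept)
    )


print(locate_workflows("protein_analysis"))
print(locate_workflows("protein_function_clustering"))
```

The point of the design is that the scientist can ask at whatever level of generality suits the question: a broad query returns every workflow annotated with a narrower concept, while a narrow query returns only exact matches.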
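Running a workflow
Figure: components involved in running a workflow, with interactions numbered 1, 2, 2?, 3 and 4. Labels: Repository Client, Workflow Client, Service Selection Client, Bioinformatic Services, Personal Repository, Workflow Enactment, Service Directory, Provenance Data.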
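A minimal sketch of how the components named on this slide might cooperate when a workflow runs: each step's service is chosen from a service directory, the enactor calls it, a provenance entry is recorded, and the result is stored in a personal repository. It does not reproduce the numbered interactions in the figure. All names here (`SERVICE_DIRECTORY`, `select_service`, `enact`, the stub services) are assumptions made for illustration, and the services are local stand-ins, not real bioinformatics tools.

```python
def blast_stub(sequence_id):
    """Stand-in for a similarity search service (not a real BLAST call)."""
    return f"hits-for-{sequence_id}"


def annotate_stub(hits):
    """Stand-in for an annotation service."""
    return f"annotated({hits})"


# Hypothetical service directory: service type -> {service name: callable}.
SERVICE_DIRECTORY = {
    "similarity_search": {"blast_stub": blast_stub},
    "annotation": {"annotate_stub": annotate_stub},
}


def select_service(service_type, preferred=None):
    """Service selection: use the preferred service if listed, else the first by name."""
    candidates = SERVICE_DIRECTORY[service_type]
    name = preferred if preferred in candidates else sorted(candidates)[0]
    return name, candidates[name]


def enact(workflow, data, preferences=None):
    """Run each step in order, feeding outputs forward and logging provenance."""
    preferences = preferences or {}
    provenance = []
    for service_type in workflow:
        name, service = select_service(service_type, preferences.get(service_type))
        result = service(data)
        provenance.append(
            {"step": service_type, "service": name, "input": data, "output": result}
        )
        data = result
    return data, provenance


personal_repository = {}
result, provenance = enact(["similarity_search", "annotation"], "P12345")
personal_repository["run-1"] = {"result": result, "provenance": provenance}

print(result)
for entry in provenance:
    print(entry["step"], "->", entry["service"])
```

Keeping the provenance log next to the result in the repository is what later allows questions such as "What were the results at each stage?" to be answered without re-running the workflow.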
myGrid generic technologies
1. Ontologies, Protocols & APIs.
2. Database access from the Grid.
– Reference implementation for UK DBTF.
3. Process enactment on the Grid.
4. Provenance services.
5. Metadata services.
– From Semantic Web: DAML+OIL, RDF(S).
6. Personalisation services.
7. Reference implementation of OGSA.
Converging Technologies
• Grid Computing: Globus, Sun Grid Engine, Condor, DS (Jini, Corba)
• Web Technologies: SOAP, WSDL, UDDI, WSFL; DAML+OIL, OWL, RDF(S)
• Agents: ACL, methodology
An early adopter for OGSA
The myGrid Team
• Carole Goble
• Norman Paton
• Brian Warboys
• Stephen Pettifer
• Luc Moreau
• Dave De Roure
• Chris Greenhalgh
• Tom Rodden
• John Brooke
• Paul Watson
• Alan Robinson
• Rob Gaizauskas
• Robert Stevens
• Ian Horrocks
• Neil Wipat
• Matthew Addis
• Nick Sharman
• Rich Cawley
• Simon Harper
• Karon Mee
• Simon Miles
• Vijay Dailani
• Xiaojian Liu
• Tom Oinn
• Martin Senger
• Milena Radenkovic
• Kevin Glover
• Angus Roberts
• Chris Wroe
• Mark Greenwood
• Phil Lord
• Neil Davis
• Darren Marvin
• Justin Ferris
• Peter Li
• Nedim Alpdemir
• Luca Toldo
• Robin McEntire
• Anne Westcott
• Tony Storey
• Bernard Horan
• Paul Smart
• Robert Haynes
myGrid Partners
myGrid Summary
• myGrid aims to develop infrastructure middleware for an e-Biologist’s workbench.
• The setting is bioinformatics but the results are intended to be generally applicable to e-Science.
• A mix of standard, vanguard and bleeding edge technologies, advanced development and (some) research.
• Academic & commercial partnership.
• myGrid project is timely & reflects a community desire to “collaborate, or die”.