myExperiment – A Web 2.0 Virtual Research Environment
David De Roure
Carole Goble
NeSC VRE Workshop 26/2/2007 | myExperiment | Slide 2
Overview
e-Science is about scientists doing science
– A Tale of Two Projects
myExperiment
Design Patterns for a VRE
[Figure: CombeChem e-Lab diagram. Labels: X-Ray e-Lab, Analysis, Properties, Properties e-Lab, Simulation, Video, Diffractometer, Grid Middleware, Structures Database]
CombeChem pilot project
www.combechem.org
[Figure: Entire e-Science Cycle, encompassing experimentation, analysis, publication, research, learning. Labels: E-Scientists; Graduate Students; Undergraduate Students; Institutional Archive; Local Web; Publisher Holdings; Digital Library; Virtual Learning Environment; E-Experimentation; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; Data, Metadata & Ontologies]
http://www.ukoln.ac.uk/projects/ebank-uk/
Reducing time-to-experiment
The key observation!
“Publication at Source” describes the need to capture data and its context from the outset and maintain a complete end-to-end connection between the laboratory bench and the intellectual chemical knowledge that is published as a result of the investigation
Provenance
The details of the origins of data are just as important to understanding as their actual values
My Chemistry Experiment
Box of Chemists
e-Research workflows
Aggregator services
Institutional data repositories
Data curation & preservation: databases & databanks
Validation
Harvest
Data creation & capture in “Smart lab”
Deposit
Publishers: peer-review journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Search, harvest
Presentation services: portals
Data discovery, linking, citation
Linking, citation
Laboratory repository
Deposit
(Chemistry Central)
e-Crystals Federation model
This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0
Bioinformatics is not Chemistry
There are many pieces, from many boxes, but no box, and no lid with a complete picture of what the puzzle is supposed to be.
Planning? No. Metadata an afterthought
myGrid
Open Source middleware for Life Scientists that enables them to undertake in silico experiments and share those experiments and their results.
Machinery for linking together datasets and tools
Individual scientists, in under-resourced labs, who use other people’s datasets and applications.
Ad hoc & exploratory workflows (data flows)
To support sharing and collaboration between scientists to disseminate best practice and improve the quality of science
33,000 downloads; 200+ user sites; 400+ workflows;
3500 third party external services accessible.
Moved from prototype to production quality.
Open Middleware Infrastructure Institute UK
http://www.mygrid.org.uk
Taverna Workflow Workbench
Users in US, Asia, UK, Europe, Australia
Systems biology
Proteomics
Gene/protein annotation
Microarray data analysis
Medical image analysis
Heart simulation orchestration
High throughput screening of chemical compounds
Phenotypical studies
Public Health studies
Clinical trial analysis
Plants, Mouse, Human
Astronomy
Cultural Heritage
Widespread Adoption
Identified a pathway for which its correlating gene (Daxx) is believed to play a role in trypanosomiasis resistance.
Manual analysis on the microarray and QTL data failed to identify this gene as a candidate.
Repetitive, unbiased analysis.
Paul Fisher et al., "A Systematic Strategy for Large-Scale Unbiased Analysis of Genotype-Phenotype Correlations", Bioinformatics (in review)
Trypanosomiasis cattle workflow reused without change to identify the biological pathways involved in sex dependence in the mouse model, previously believed to be involved in the ability of mice to expel the parasite.
Previously a manual two year study of candidate genes had failed to do this.
Recycling, Reuse, Repurposing
Service and workflow annotation
Ontology 710 classes
Full time curator
Tagging by the masses
3500 services, 350 curated
Provenance
Ontology 35 classes
Enriched with domain ontologies and service ontologies. Possibly.
Export with data. Desirably.
New Scientific Digital Artefacts
Design
Workflow design history
Experiment purpose
Scientist
LogBook
Workflow run log
Data lineage
Results interpretation log
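As a rough illustration of how a workflow run log could yield data lineage, the sketch below walks back from a result to everything it was derived from. The log format, the item names and the `lineage` function are all hypothetical, invented for this example; they are not the Taverna or myGrid provenance model.

```python
from collections import deque

# Hypothetical run log: each derived item records the step that produced it
# and the items that step consumed. Items with no entry are raw inputs.
run_log = {
    "pathway_list": {"step": "pathway_lookup", "inputs": ["gene_ids"]},
    "gene_ids": {"step": "qtl_region_filter", "inputs": ["qtl_data", "microarray_data"]},
}


def lineage(item, log):
    """Return every upstream item of `item`, nearest first, without repeats."""
    seen = []
    queue = deque(log.get(item, {}).get("inputs", []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        # Raw inputs have no log entry, so they contribute nothing further.
        queue.extend(log.get(current, {}).get("inputs", []))
    return seen


ancestors = lineage("pathway_list", run_log)
print(ancestors)
```

Keeping the producing step alongside each item is what would allow the lineage to be exported together with the data, rather than reconstructed afterwards.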
Kepler
Triana
New digital artefacts
myExperiment.org Portal Party
28th & 29th Sept 2006
Hand picked Taverna users + Taverna development team
Facilitated by NCeSS.
AJAX based development
CombeChem xfer
1. A social networking environment for sharing any workflow
2. A Taverna workflow run environment
3. A multi-workflow launch environment
openwetware.org
What are we trying to do?
Enabling scientists to be (more) creative.
Enabling scientists to be scientists. And not programmers.
Enabling mediocre scientists to become better and thus have better science.
Enabling smart scientists to be smarter and propagate their smartness.
Accelerate dissemination, pooling, insight.
Encouraging sanctioned plagiarism.
Principles
Focus on making it easy to publish information
– Discovering and sharing experimental artefacts
– Publishing results to standard community repositories
– Publishing scholarly output
Familiar social networking / web paradigms
– Keeping it free and fluid and creative. Me-Science.
Crossing system boundaries
– Trans-workflow
Crossing discipline boundaries
– Multi-disciplinary, Inter-disciplinary, Trans-disciplinary
– Clustering expertise
– Intellectual fusion outside discipline. We-Science.
– Life Science, Social Science, Astronomy, Chemistry
Scoping exercise
Workflow warehouse / federation of repositories: Open Archives Initiative. Federated myExperiments. Sharepoint.
Social space + organised rich site: Social discourse + organised service / workflow space using curated semantics.
Granularity and identifiers: Rolling-up provenance. Id resolution.
Open vs protected content: Quality, Reliability, Validation, Safety, Intellectual Property, Ownership, Secrecy, A duty of guardianship. Curation? Policing? Local data mixed with shared resources.
Desktop integration: Google gadgets for workflows. Interacting with workflows through Office products.
Workflow execution: WHIP (Workflows Hosted in Portals) project.
Evolving the myExperiment software: Community development.
Enabling Scientists: added value through applications and collaborative tagging.
Hack Fest
Q1. Workflow Warehouse or Federation of Repositories?
Everything on the myExperiment.org web site
vs
Distributed stores
Multiple myExperiments
Q2. Social Space or Shoe Shop?
Shopping for Workflows and Services and Data should be as easy as shopping for shoes.
Organic growth is good and bad.
Social tagging might help discover workflows but we need good metadata for automated use.
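The contrast between social tags and curated metadata can be sketched in a few lines. Everything here is an assumption made for illustration: the `WorkflowEntry` shape, the type names and the two lookup functions are hypothetical and do not describe the myExperiment data model.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class WorkflowEntry:
    """A shared workflow as it might appear in a repository (hypothetical shape)."""
    title: str
    tags: Set[str] = field(default_factory=set)  # free-form, added by anyone
    input_type: Optional[str] = None             # curated metadata, may be missing
    output_type: Optional[str] = None


def find_by_tag(entries, tag):
    """Human-style discovery: case-insensitive match on social tags."""
    wanted = tag.lower()
    return [e for e in entries if wanted in {t.lower() for t in e.tags}]


def find_chainable(entries, produced_type):
    """Automated use: only entries whose curated input type matches exactly."""
    return [e for e in entries if e.input_type == produced_type]


entries = [
    WorkflowEntry("Pathway lookup", {"pathways", "QTL"}, "gene_id_list", "pathway_list"),
    WorkflowEntry("Microarray normalise", {"microarray", "qtl"}),
]

by_tag = find_by_tag(entries, "qtl")
chainable = find_chainable(entries, "gene_id_list")
print([e.title for e in by_tag])
print([e.title for e in chainable])
```

Both entries are found by a person browsing the tag, but only the entry with curated input and output types can be selected by a program looking for the next step in a chain.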
Q3. How open is the content?
OpenWetware is open
Our users don’t want this
Provenance helps
Q4. Integration
Bring user to Web Site
vs
Bringing myExperimentness to existing interfaces
Web 2.0 Design Patterns
http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
1. The Long Tail
2. Data is the Next Intel Inside
3. Users Add Value
4. Network Effects by Default
5. Some Rights Reserved
6. The Perpetual Beta
7. Cooperate, Don't Control
8. Software Above the Level of a Single Device
1. The Long Tail
Our target users are not just the specialist e-Scientists using computing resources to tackle major scientific breakthroughs, but also the large number of scientists conducting the routine processes of science on a daily basis.
Through sharing we have the potential to enable smart scientists to be smarter and propagate their smartness, in turn enabling other scientists to become better and conduct better science.
2. Data is the Next “Intel Inside”
myExperiment understands that scientists are focused on data, not software or one particular workflow engine.
Workflows are components of customised applications, many of which are data-oriented rather than process-oriented.
Users manipulate, through their own applications, the product (data, model) yielded by the workflow.
Furthermore, workflows themselves are the data of myExperiment and provide its unique value.
3. Users Add Value
myExperiment makes it easy to find workflows and is designed to make it useful and straightforward to share workflows and add workflows to the pool.
To succeed we draw on the insights into the incentive models of scientists gained through experience with Taverna.
4. Network Effects by Default
myExperiment aggregates user data as a side-effect of using the VRE.
The ability to execute workflows from myExperiment, and the integration of tools such as Taverna with myExperiment, further enable us to achieve increased value through usage.
5. Some Rights Reserved
myExperiment users require protection as well as sharing, but the environment is designed for maximum ease of sharing to achieve collective benefits – workflows are "hackable" and "remixable".
Initiatives such as Science Commons provide a useful context for this.
6. The Perpetual Beta
myExperiment is an online service (a collection of online services) and is continually evolving in response to its users.
To support this, the project commenced with developers being embedded in the user community.
Through day-to-day contact between designers and researchers, design is both inspired and validated.
7. Cooperate, Don't Control
myExperiment is a network of cooperating data services with simple interfaces which make it easy to work with content.
It both provides services and reuses the services of others.
It aims to support lightweight programming models so that it can easily be part of loosely coupled systems.
8. Software Above the Level of a Single Device
The current model of Taverna running on the scientist’s desktop PC or laptop is evolving into myExperiment being available through a variety of interfaces and supporting workflow execution.
Closing
e-Science is difficult – workflows and Web 2.0 make it easier.
Our design workshops and the review against Web 2.0 design patterns have revealed the relationship between myExperiment and Web 2.0.
The collective benefits of participation arise not only from the users but also from the developers – ease of use and ease of development.
It might be useful to review other VREs against the design patterns.
Take homes
myExperiment is a Web 2.0 Environment for Scientists to share experiments
Join us!
David De Roure – [email protected]
Carole Goble – [email protected]
Credits
myGrid and CombeChem
Matt Lee
David Withers
Don Cruickshank
Rob Procter
Alex Voss
June Finch
Ed Zaluska
All the users inc. embedders