Collaboration and Sharing

Presented by Carole Goble at the JISC Future of Research Conference, 19th October 2010

Collaboration and sharing computational research methods

Carole Goble, University of Manchester, UK, carole.goble@manchester.ac.uk

Scientific research is generally held to be of good provenance when it is documented in detail sufficient to allow reproducibility.

http://en.wikipedia.org/wiki/Provenance#Science

Materials + Methods + Results = Publication

In silico experimental Standard Operating Procedures, protocols, plans

http://usefulchem.wikispaces.com/page/code/EXPLAN001

E. Science laboris: in silico experiments

Automated, reusable, scripted analysis pipelines - workflows

• Data processing, data chaining
• Data and tool integration
• Annotation pipelines
• Analytics
• Simulation steering and parameter sweeps
• Publication mining
• Model and hypothesis building
• Result validation and comparison
• Data cleaning, curation and preservation

Shield users from cloud and cluster details. Record provenance: steps, methods, results.

http://www.mygrid.org.uk/tools/taverna/
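To make "record provenance: steps, methods, results" concrete, here is a minimal Python sketch of a scripted pipeline that logs each step as it runs. It is an illustration only and not how Taverna works internally; `run_workflow`, `fingerprint`, `clean` and `deduplicate` are hypothetical names invented for this example, and the input records are made up.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps, data):
    """Run each (name, function) step in order and log its provenance."""
    provenance = []
    for name, func in steps:
        result = func(data)
        provenance.append({
            "step": name,                     # which step ran
            "method": func.__name__,          # how it was computed
            "input": fingerprint(data),       # what went in
            "output": fingerprint(result),    # what came out
            "finished": datetime.now(timezone.utc).isoformat(),
        })
        data = result  # chain the output into the next step
    return data, provenance


# Hypothetical steps, for illustration only.
def clean(records):
    return [r.strip().upper() for r in records if r.strip()]


def deduplicate(records):
    return sorted(set(records))


if __name__ == "__main__":
    result, log = run_workflow(
        [("clean", clean), ("deduplicate", deduplicate)],
        [" snp1", "SNP2 ", "", "snp1"],
    )
    print(result)
    for entry in log:
        print(entry["step"], entry["input"], "->", entry["output"])
```

Because each step's output fingerprint matches the next step's input fingerprint, the log can be read back as a chain from the raw data to the final result, which is the property a reader needs to check or repeat the analysis.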

Genetic variation in cattle species. Food security, biodiversity.

Resistance to African trypanosomiasis infection (sleeping sickness)

Liverpool (Kemp), Manchester (Brass), Nairobi

Comparing new data with reference genomes, prior results and the literature to identify interesting differences

22 million SNPs

Little Science + Big Science

http://www.genomics.liv.ac.uk/tryps/trypsindex.html

Bottom up Effectiveness in Research

• Automated, repeatable, tracked plumbing
– Using institutional and community computing infrastructures, tools and datasets
• Easier access to best of breed and “surfing” results
– Non-developers access to sophisticated codes and applications, shielded from nasty computing details.
• Leverage applications, services, datasets and codes shielded from computing details.
– Honors original codes and applications. Heterogeneous coding styles and tool sets. The best applications.
• Extensibility, adaptability & innovation.
– My stuff. Variant design.

Reuse, Recycle, Repurpose, Mash, Trade, Publish

Dr Paul Fisher

Dr Jo Pennock

Identify biological pathways implicated in resistance to Trypanosomiasis in cattle using mouse as a model organism.

Identify the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the whipworm parasite.

Levison S.E., et al. Inflammatory Bowel Diseases (2010)

Fisher P., et al. Nucleic Acids Research, 2007, 35(16) 5625-5633

Global Long Tail

How do I find and share methods across the institution, communities, the web?

How do I connect with other authors and users?

How do I know if it’s any good or right for me?

Who else is using it?

Where do I comment on my experiences?

http://www.myexperiment.org

• Socially share, discover, review and reuse workflows and other scientific methods.

• Cooperative market place.

• A scientific gateway.

• Commons-based Production + Social networking

• Primary contribution, reviewing and curating.

Find experts and peers, advice, workflows, packs

Contribute, review and curate workflows

Train and educate

Launch workflows

Cloud Methods Commons

http://www.myexperiment.org

Facts and Figures: Boutique but Beautiful

• Public Service: 1325 workflows, 349 files, 138 packs, 4129 registered members, 235 groups, 56 different countries, ~ 3000 unique hits per month. Workflows viewed/downloaded many 1000s of times.

• Adopted by 19 workflow systems and integrated into workflow workbenches: Galaxy, Taverna
– Biology, chemistry, image analysis, social science, astronomy, engineering, music…
– Specialised clones in Music & NeuroScience.
– Focus of research on workflow patterns and analytics.

• JISC funding since 2007

• (Other funding: Microsoft, EU, EPSRC, BBSRC)

Effectiveness and Open Collaboration

Open platform, off the shelf components, open development, open linked data, Web 3.0 funky

Google gadgets

Application plugins

Linked Data Cloud

Friends, colleagues, resources

Literature

Images

LogBook

Software

Presentations

Data (files, spreadsheets)

Compute resources

Backup and Archive

Workbench

Gadgets

Publishing

[Duncan Hull]

Social collaboration environments (“e-Laboratories”): collaboration, acceleration and transparency through human computation.

Workflow management system: collaboration, acceleration and transparency through automated computation.

http://www.mygrid.org.uk

Institutional challenges

Adoption of Reproducible Methods

New Publishing and Learning Objects

Pre- and Post-Publication Metadata Differentials

Citation, Credit and Reputation

Curation Costs

Methods Matter

Reproducible (or at least defendable) Research

many eyes

Science 2010

[Figure. Labels: scientists; experimentation; data, metadata, provenance, workflows, ontologies; local web repositories; digital libraries; graduate students; undergraduate students; virtual learning environment; technical reports; reprints; preprints & metadata; peer-reviewed journal & conference papers; certified experimental results & analyses.]

[De Roure]

Actionable scholarly publishing & learning

Actionable Compound Research Objects

Data and Method burial

Supplementary information

Text mining

The rise of the Wiki

Rewards: Competitive advantage. Adoption. Credit. Help. Fame. Reputation.

Risks: Being scooped. Scrutiny. Misinterpretation. Cost. Blame. Reputation.

Nature 461, 145 (10 September 2009)

Trust

“It’s not ready yet”

“I need to get (another) publication first”

“We don’t have the resources or skills to prepare it for others, esp. now we’ve finished that project”

“It’s faster/easier to do it myself, and I’ll get the credit/control too”

“It’s not described enough to be usable”

“I don’t trust the quality. It’s not reliable enough. It’s too noisy.”

“Others won’t use it properly.” “It’s not worth my while”

Sharing Governance ….

Credit & Reputation

Quality & Reassurance

Provenance

Crowd Contribution

Credit

Reward

Career

Use Profiles

Citation

Method Building too.

Nature 2008

Credit and Reputation

T Shirts are not enough

Software

Community building

Automation

Social & Cultural

Coordination

Governance

QUALITY for REUSE

Sustainability

Public Service

Community generated and curated Content

Open Software

Teaching aid

Methodology

Digital Library/Repository

Social network

Computer science research

Software engineering

Computational researchers

Researchers

Social Science

Social experiment

Collaboration platform

Free like puppies

Take Home: Methods Matter.

• Workflows are a transformative mechanism for connecting tools and encoding know-how
– Scientists stand on the shoulders of resource experts

• myExperiment is an example of a collaborative environment for connecting workflow authors and users
– Authors stand on the shoulders of each other

• The Power of Collectivism.

• Rewards and risks of researchers in competitive research.

• Cultural shift in reward, adoption and support for building, sharing and curating computational methods.

Acknowledgements

myExperiment Director: David De Roure

Developers
• Jiten Bhagat • Don Cruickshank • Danius Michaelides • David Newman • Sergejs Aleksejevs • Mark Borkum • Matt Lee • Tom Foster

Allied, Contributing Projects
• Thomas Laurent • Eric Nzuobontane • Ian Dunlop • Stuart Owen • Shoaib Sufi • Sean Bechhofer • Rodrigo Lopez • Steve Pettifer • Mannie Tags • Finn Bacall • Sarah Thew • Matt Gamble • Tim Clark

Users
• Katy Wolstencroft • Paul Fisher • Duncan Hull • Franck Tanoh • Andrea Wiggins • Marco Roos • Jerzy Orlowski • Olga Krebs • Wolfgang Mueller • Tony Linde

Social Scientists
• Yuiwei Lin • Rob Proctor • Meik Poschen • Jonathan Foster

Sponsors
• Savas Parastatidis • Roger Barga • Derick Campbell • Tony Hey

Contact

David De Roure, dder@ecs.soton.ac.uk

Carole Goble, carole.goble@manchester.ac.uk

Visit myexperiment.org
