DESCRIPTION
Abstract
Technology has significantly changed how research is done in biology. Along with this shift, it is increasingly easy and advantageous to operate in an open science framework. In this presentation I will begin by providing an overview of our research efforts, with particular attention to challenges in data analysis. Research in our lab focuses on characterizing physiological responses of shellfish to environmental change, examining impacts and adaptive potential from the nucleotide to the organism level. A core component of this is investigating the functional relationship of genetics, epigenetics, and transcription. In our research we leverage several computing infrastructure solutions that I will describe. In addition, our lab practices Open Notebook Science. I will describe the practical aspects of how we accomplish this, including some of the concerns and the realized advantages. Beyond online lab notebooks, we are continually experimenting with different ways to use online resources to engage with a larger audience and improve science communication. I have found this is a complex balance of time and effort versus impact, and I will discuss how our lab group attempts to reach this balance.
Bio
Steven Roberts is an Associate Professor in the School of Aquatic and Fishery Sciences, where his research centers on characterizing the response of aquatic organisms to environmental change. Prior to coming to the University of Washington, in 2007 he was at the Marine Biological Laboratory in Woods Hole, Massachusetts and received his PhD from the University of Notre Dame. In graduate school he spent most of his time transferring agarose gels, and now he spends most of his time transferring files.
Genomics on the Half Shell: Making Science more Open
Steven B. Roberts, Associate Professor
School of Aquatic and Fishery Sciences, University of Washington
robertslab.info
Open Science
• You are free to share!
• Our lab practices open notebook science
• Slides and more available at oystergen.es/data
Biology
Environment
Molecular
Data Analysis
eScience
iPlant Galaxy
Notebooks
Rationale
Platforms
Open Science
Data
everything else...
disease resistance
Transcriptome, Proteome, DNA Methylation
Elevated pCO2 causes developmental delay in early larval Pacific oysters, Crassostrea gigas. Timmins-Schiffman et al. 2012
Ocean Acidification
Shotgun Proteomics
10.1093/conphys/cot009
eagle.fish.washington.edu/emma
Function?
mosaic
associated with gene bodies
Photo credit: Flickr, Creative Commons, dkeats
HiSeq - lane - 70G mapping - 60G
table
Stochastic Variation
10.1093/bfgp/elt054
10.6084/m9.figshare.880763
raw - 70G, mapping - 60G, tables - 40G, ...
Genome
Primary Data Table Groupings
Genomic Data Types (dynamic): Gene Expression, Genetic Variation, Epigenetic Features
• Expressed Sequence Tags
• transcripts
• RNA-Sequencing
• Expression Microarrays
• miRNA Expression
• Single Nucleotide Polymorphisms
• Simple Sequence Repeats
• Amplified Fragment Length Polymorphisms
• DNA Methylation
• Histone Modification
Data Tables (static)
• Gene Annotations
• Sequence Motifs
• Gene Ontologies
• Pathways
• Orthologs
• Transposable Elements
• Interactions
• Other species genomes
• CpG statistics
• Structural Elements
• Transcription Factor Binding Sites
• Publications
Size, Growth, Location, Environment, Stage, Treatment, Tissue, Trait, Strain
Phenotype
Increased Growth Rate, Tissue Quality, Disease Resistance, Appearance, Fecundity, Yield
Environment
Temperature, Diet
Genetics, Epigenetics
• Amplified Fragment Length Polymorphisms
• Simple Sequence Repeats
• Single Nucleotide Polymorphisms
• miRNA Expression
• Histone Modifications
• DNA Methylation Patterns
Gene Expression
Use Cases
• Joining on Annotations
• File Conversion
• Querying Gene Tables
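The "Joining on Annotations" use case can be sketched as a SQL join between a table of sequence hits and an annotation table. The snippet below is only a minimal illustration: it uses Python's built-in sqlite3 module as a local stand-in for a hosted SQL service, and the table names, column names, and rows are hypothetical placeholders, not data from the talk.

```python
import sqlite3


def join_annotations(hits, annotations):
    """Left-join (gene_id, accession) hits to (accession, description) rows.

    Both table layouts are assumptions made for this sketch.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hits (gene_id TEXT, accession TEXT)")
    con.execute("CREATE TABLE annotations (accession TEXT, description TEXT)")
    con.executemany("INSERT INTO hits VALUES (?, ?)", hits)
    con.executemany("INSERT INTO annotations VALUES (?, ?)", annotations)
    # LEFT JOIN keeps hits that have no annotation; their description is NULL.
    rows = con.execute(
        """
        SELECT h.gene_id, h.accession, a.description
        FROM hits AS h
        LEFT JOIN annotations AS a ON a.accession = h.accession
        ORDER BY h.gene_id
        """
    ).fetchall()
    con.close()
    return rows


# Hypothetical placeholder rows, not real results.
example_hits = [
    ("gene_001", "ACC_A"),
    ("gene_002", "ACC_B"),
    ("gene_003", "ACC_Z"),
]
example_annotations = [
    ("ACC_A", "heat shock protein"),
    ("ACC_B", "DNA methyltransferase"),
]

if __name__ == "__main__":
    for row in join_annotations(example_hits, example_annotations):
        print(row)
```

A left join is used here so that unannotated genes stay visible in the output; an inner join would silently drop them.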
github.com/sr320/qdod/wiki
eagle.fish.washington.edu
The Evolution of My Lab Notebook
Open Notebook Science
... there is a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world.
—Jean-Claude Bradley
carlboettiger.info/lab-notebook
genefish.wikispaces.com
evernote.com/pub/che625/che625snotebook
Set some variables
blast
convert file format
upload to SQLShare (python client)
join in SQLShare - download
read in pandas
matplotlib generates graph of GO Slim
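The last two steps of the workflow above (reading the joined table, then graphing GO Slim terms) might look roughly like the sketch below. To keep it dependency-free, it uses the standard library's csv module and a text bar chart in place of pandas and matplotlib. The column name `goslim_term` and the example rows are hypothetical, since the slides do not show the table layout.

```python
import csv
import io
from collections import Counter


def count_goslim(tsv_text, column="goslim_term"):
    """Count how often each GO Slim term appears in a tab-separated table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Rows with an empty term (unannotated genes) are skipped.
    return Counter(row[column] for row in reader if row[column])


def text_bar_chart(counts, width=30):
    """Render the counts as text bars, most frequent term first."""
    if not counts:
        return []
    largest = max(counts.values())
    lines = []
    for term, n in counts.most_common():
        bar = "#" * max(1, round(width * n / largest))
        lines.append(f"{term:<20} {bar} {n}")
    return lines


# Hypothetical stand-in for the table downloaded after the join step.
example_table = (
    "gene_id\tgoslim_term\n"
    "g1\tstress response\n"
    "g2\ttransport\n"
    "g3\tstress response\n"
    "g4\t\n"
)

if __name__ == "__main__":
    for line in text_bar_chart(count_goslim(example_table)):
        print(line)
```

With pandas and matplotlib installed, the same counts could come from `read_csv` followed by `value_counts()` on the term column, plotted with `.plot(kind="barh")`.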
a very new experiment
sr320.info
Open Science
web-native scholarship
Photo credit: Flickr, Creative Commons, speechless
Sharing
Example
http://ivory.idyll.org/blog/
robertslab.info
Open Science Philosophy
• Transparency with limited effort
• will try just about anything
computationalproteomic.blogspot.com
Yasset Perez-Riverol on Wednesday, February 19, 2014
Start them early
Acknowledgements
Emma Timmins-Schiffman
Mackenzie Gavery
Claire Olson
Sam White
Brent Vadopalas
Jake Heare
Bill Howe
Dan Halperin
EPA STAR
Aquaculture Program
Saltonstall-Kennedy
acidification
DNA methylation
oystergen.es/data