How to Display and Download ENCODE Data
Mike Pazin
NHGRI, NIH
Goals of ENCODE
• Catalog all functional elements in the genome
• Freely available resource for all biologists
• Human as well as other species
• Project components:
– Data generation
– Data analysis
– Data repository
ENCODE Data From HaploReg
http://www.broadinstitute.org/mammals/haploreg/haploreg.php ; Ward and Kellis, Nucleic Acids Research 40-D930, 2011
Handout
ENCODE Data From RegulomeDB
http://regulome.stanford.edu/ ; Boyle…Snyder, Genome Research 22-1790,2012
Handout
RegulomeDB Disease Database
http://regulome.stanford.edu/GWAS; Schaub…Snyder, Genome Research 22-1748,2012
How Can The Display Be Configured?
Handout
Over 170 Publications Using ENCODE Data
Frazer lab, Nature 470-264, 2011
Anderson lab, Nature Genetics 44-1137, 2012
ENCODE Citation
ENCODE Consortium
Brad Bernstein (Eric Lander, Manolis Kellis/Luke Ward, Tony Kouzarides)
Ewan Birney (Jim Kent, Mark Gerstein, Bill Noble, Peter Bickel, Ross Hardison, Zhiping Weng)
Greg Crawford (Ewan Birney, Jason Lieb, Terry Furey, Vishy Iyer)
Jim Kent (David Haussler, Kate Rosenbloom)
John Stamatoyannopoulos (Evan Eichler, George Stamatoyannopoulos, Job Dekker, Maynard Olson, Michael Dorschner, Patrick Navas, Phil Green)
Mike Snyder (Kevin Struhl, Mark Gerstein, Peggy Farnham, Sherman Weissman)
Rick Myers (Barbara Wold)
Scott Tenenbaum (Luiz Penalva)
Tim Hubbard (Alexandre Reymond, Alfonso Valencia, David Haussler, Ewan Birney, Jim Kent, Manolis Kellis, Mark Gerstein, Michael Brent, Roderic Guigo)
Tom Gingeras (Alexandre Reymond, David Spector, Greg Hannon, Michael Brent, Roderic Guigo, Stylianos Antonarakis, Yijun Ruan, Yoshihide Hayashizaki)
Zhiping Weng (Nathan Trinklein, Rick Myers)
NHGRI: Elise Feingold, Peter Good, Laura Dillon, Rebecca Lowdon, Leslie Adams, Caroline Kelly, Shaila Chhibba, Sherry Zhou, Katya Vaydylevich
Additional ENCODE Participants: Elliott Margulies, Eric Green, Job Dekker, Laura Elnitski, Len Pennacchio, Jochen Wittbrodt
... and many senior scientists, postdocs, students, technicians, computer scientists, statisticians and administrators in these groups
Questions? NHGRI booth (#825), noon to 1 PM; [email protected]
What Do The Data Mean?
• Some standard interpretations:
– RNA
– Histone modifications
– DNase
– Transcription Factor ChIP
How Can ENCODE Data Be Used?
• A standard problem: many genetic findings for human disease map to non-protein-coding regions of the human genome
– What is the functional variant?
– What is the target gene?
– What is the target cell type?
– What is the function of the variant?
• Standard Browser view for loci of interest
• HaploReg and RegulomeDB searches for loci of interest
• Search a cell type across all data types for loci of interest
• Search a data type across all cell types for loci of interest
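The locus-of-interest searches listed above come down to asking which annotated intervals contain a given position. A minimal sketch of that lookup follows, assuming annotations are held as BED-style records with 0-based, half-open coordinates. The `Element` type, the `elements_at` function, and all coordinates and labels are hypothetical and are not ENCODE data.

```python
from typing import List, NamedTuple


class Element(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive
    label: str  # hypothetical "data type / cell type" tag


def elements_at(chrom: str, pos: int, elements: List[Element]) -> List[Element]:
    """Return the elements whose interval contains the 0-based position."""
    return [e for e in elements if e.chrom == chrom and e.start <= pos < e.end]


# Made-up annotations, for illustration only
annotations = [
    Element("chr1", 1000, 1500, "DNase / cell type A"),
    Element("chr1", 1400, 1800, "H3K4me1 / cell type B"),
    Element("chr2", 1000, 1500, "TF ChIP / cell type A"),
]

hits = elements_at("chr1", 1450, annotations)
for h in hits:
    print(h.label)
```

A linear scan is enough for a handful of loci; for genome-wide queries an interval index or a dedicated tool would be the more usual choice.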
ENCODE Browser, Locus of Interest
http://encodeproject.org
ENCODE Browser, Cells of Interest
http://encodeproject.org
NIH Roadmap Epigenomic Mapping Consortium
Range of cells/tissues covered:
Currently 125 cell/tissue types represented, including:
iPS and ES cells, some differentiated forms
Fetal tissues (heart, brain, kidney, lung, others)
Adult primary cells and tissues (hematopoietic, brain regions,
breast cell types, liver, kidney, colon, muscle, adipocytes, others)
Some samples will have:
Expanded panel of histone modifications (currently 20
additional)
A public community resource of epigenomic data in primary human cells and tissues.
Most samples will have:
DNA methylation data (RRBS, MRE-seq, MeDIP-seq, whole
genome bisulfite seq)
ChIP-seq data (currently H3K27me3, H3K36me3, H3K4me1,
H3K4me3, H3K9me3)
DNase I hypersensitivity data
Gene expression data (arrays or RNA-seq)
Can download:
.wig, .bed, some .bam, some SRA, working on peak calls
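As an illustration of working with a downloaded .bed file, the sketch below reads the first three BED columns (chromosome, start, end) and skips `track`, `browser`, and comment lines. The function name `read_bed` and the sample records are assumptions made for this example; they are not taken from the Roadmap files.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_bed(handle: TextIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) from BED text, ignoring header and comment lines."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # BED fields are tab- or whitespace-separated
        yield fields[0], int(fields[1]), int(fields[2])


# Made-up BED content standing in for a downloaded file
example = io.StringIO(
    "track name=example\n"
    "chr1\t100\t200\tpeak1\n"
    "chr1\t150\t300\tpeak2\n"
)
intervals = list(read_bed(example))
print(intervals)
```

For a real file, `open(path)` would replace the `io.StringIO` stand-in.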
http://roadmapepigenomics.org – Find and view data,
protocols, links to other sites associated with the program.
View Roadmap data at http://genome.ucsc.edu via TRACK HUB
– click ‘track hubs’ to load Roadmap data and summary data
tracks on UCSC.
Where is the data?
Find data, protocols, and analysis/viewing tools from the
Roadmap Epigenomic Mapping Consortium at these sites:
http://roadmapepigenomics.org
http://ncbi.nlm.nih.gov/epigenomics
http://ncbi.nlm.nih.gov/geo/roadmap/epigenomics
http://epigenomeatlas.org
http://vizhub.wustl.edu
http://genome.ucsc.edu (via track hub)
Recent ENCODE Publications
• More than 30 papers in Nature, Genome Research, Genome Biology, Science, Cell
• Publishing innovations
• ENCODE increased our understanding of non-coding DNA, and human disease
From www.nature.com/encode
ENCODE Structure
[Figure: ENCODE structure diagram. Labels: Gene Models; RNA; TF Binding; Histone Mods; DNase; DNAme; Element ID; Chromatin States; Data Coordination Center; Data Analysis Center]
Modified from PLoS Biol 9-e1001046,2011
What Data Are Available?
Modified from PLoS Biol 9-e1001046,2011
ENCODE Dimensions
From Ewan Birney
Raw Genomic Coverage of Elements
Biochemical Mark                    Genomic Coverage
Any ENCODE mark                     80%
Any ENCODE RNA                      62%
Any Histone modification            56%
DHS or TF ChIP                      19.4%
Open chromatin                      15.2%
TF ChIP                             8.1%
DHS footprint                       5.7%
Purifying selection                 ~3-8%
Exons (GENCODE)                     2.94%
Protein-coding regions (GENCODE)    1.22%
Nature 489-57,2012
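A coverage figure of this kind can be read as the fraction of genome bases that fall inside at least one element, which means overlapping elements have to be merged before their lengths are summed. The sketch below shows that arithmetic under this assumption. The intervals and genome length are made up and the function name `merged_length` is hypothetical, so the result is not an ENCODE value.

```python
def merged_length(intervals):
    """Total bases covered by half-open (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close out the previous merged block
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Made-up elements on a made-up 1,000-base genome
intervals = [(100, 200), (150, 300), (500, 600)]
genome_length = 1000

covered = merged_length(intervals)  # (100, 300) and (500, 600) after merging
percent = 100.0 * covered / genome_length
print(f"{percent:.1f}% covered")
```

Summing raw interval lengths without merging would count the shared bases between the first two intervals twice and overstate coverage.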