Module 5 Genome browsers and interpreting gene lists. Part of training session "Basic Bioinformatics concepts, databases and tools" - http://www.bits.vib.be/training
Basic bioinformatics concepts, databases and tools
Module 5
Genome browsers and
interpretation of gene lists
Dr. Joachim Jacob
http://www.bits.vib.be
Updated 21 July 2011
http://dl.dropbox.com/u/18352887/BITS_training_material/Link%20to%20mod5-intro_H1_2011_genomebrowsers.pdf
Integrating biological information
Genome databases and browsers
– Integration of all biological information on a species basis: Ensembl Genome Browser
http://www.ensembl.org/
Table Browsers
– Retrieving biological (not only sequence) data by applying various criteria: Biomart
http://www.biomart.org/
Interpreting gene lists
– 'What is the biology behind my gene list': DAVID
http://david.abcc.ncifcrf.gov/
Reference genome sequences provide a standard genome sequence per species
Genomes
A genome is assembled from various sequence sources
By NCBI: currently assembly 37 in human (or 'build') (2010)
By Celera: commercial
Each build differs!
1. Data freeze: all data for assembling (ignoring new data from that point)
2. Assembly process and annotation
3. Release of the Build: Reference Sequence
Genome http://www.ncbi.nlm.nih.gov/Genomes/
Finding your way in genomes
Annotation and terms
See also the NCBI handbook
Locus = place on the genome, ~ a gene (different alleles)
Location:
Rough location by staining of chromosomes, e.g. 18q12.1 → chromosome 18, long arm (= q; the short arm is p)
Exact bases on genomes (assembly must be mentioned!)
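The band notation above can be taken apart mechanically. The following is a minimal sketch, assuming a simple label of the form chromosome, arm, position (such as 18q12.1); the names `Band` and `parse_band` are illustrative and do not come from any genome browser API, and ranges or more complex cytogenetic notation are not handled.

```python
import re
from typing import NamedTuple


class Band(NamedTuple):
    chromosome: str  # e.g. "18", "X"
    arm: str         # "short" (p) or "long" (q)
    position: str    # region/band/sub-band, e.g. "12.1"


# Assumed pattern: chromosome number or X/Y, arm letter, band position.
BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d+(?:\.\d+)?)$")


def parse_band(label: str) -> Band:
    """Split a simple cytogenetic band label into its parts."""
    match = BAND_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a simple cytogenetic band: {label!r}")
    chromosome, arm, position = match.groups()
    arm_name = "short" if arm == "p" else "long"
    return Band(chromosome, arm_name, position)


print(parse_band("18q12.1"))
```

A band is only a rough location; an exact base position still needs the assembly to be stated alongside it.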
Genome Browsers: main players
Three main players
MapViewer (NCBI)
UCSC Genome Browser
Ensembl Genome browser
BITS UCSC Genome Browser training
BITS Ensembl Genome Browser training
Ensembl Genome browser
We will use this browser in this session
Information is a combination of automatic annotation and manually curated sources (ENS >< Havana (Vega) genes)
All entries can be accessed through the browser, each with its own clear identifiers
28 November 2009 [email protected]
Information about the genomes
http://www.ensembl.org
http://www.ensemblgenomes.org
… or click on the figure feature!
[Screenshot of an Ensembl page, with labels: TAB, SUMMARY, DETAILED INFORMATION, INFORMATION SELECTOR, DATA MANAGER tab, DAS]
Ensembl Genome browser
Usefulness: One place for all information on a
particular gene / structure / location / variation
But also: Comparison to other species
The Ensembl Team has a lot of training movies and examples available. Check them out!
http://www.ensembl.org/info/index.html
http://www.ensembl.org/Help/Movie?id=188
Tracks are a way to display information on a genome sequence
The annotation on a genome-wide scale is displayed in tracks.
– Relevant database content can be formatted in tracks and displayed on a reference genome
[Screenshot of the Ensembl genome browser, with labels: Genome reference, tracks]
The annotation on a genome-wide scale is displayed in tracks. The most used formats:
- each base receives a value (dense, continuous data): WIG format (e.g. %GC)
- annotation has a start and a stop coordinate: BED format (e.g. gene annotations)
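To make the start/stop idea concrete, here is a minimal sketch of reading one BED line. The feature line (`chr18 1000 5000 my_gene`) is invented for illustration, and `BedFeature` and `parse_bed_line` are hypothetical helper names. BED starts are 0-based and ends are exclusive, so the feature length is simply end minus start.

```python
from typing import NamedTuple


class BedFeature(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def parse_bed_line(line: str) -> BedFeature:
    """Read the first columns of one BED line (chrom, start, end, optional name)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    name = fields[3] if len(fields) > 3 else "."
    return BedFeature(fields[0], int(fields[1]), int(fields[2]), name)


# Invented example feature, not taken from any real annotation.
feature = parse_bed_line("chr18\t1000\t5000\tmy_gene")
length = feature.end - feature.start
print(feature, length)
```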
Example
Variations in genomes are reported in VCF format
http://www.ensembl.org/info/website/upload/bed.html
http://www.bits.vib.be/wiki/index.php/.vcf
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ
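As an illustration of how such a record is structured, the sketch below splits the first record shown above into its fixed columns and unpacks the INFO field. The function names are hypothetical, and this is not a full VCF parser: it ignores header lines and the per-sample columns, and real VCF files are tab-separated (splitting on any whitespace handles both).

```python
def parse_info(info: str) -> dict:
    """Turn 'NS=3;DP=14;DB' into {'NS': '3', 'DP': '14', 'DB': True}."""
    parsed = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Keys without '=' (e.g. DB) are flags; store them as True.
        parsed[key] = value if sep else True
    return parsed


def parse_vcf_record(line: str) -> dict:
    """Map the first eight VCF columns of one data line to named fields."""
    columns = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
    record = dict(zip(columns, line.split()))
    record["POS"] = int(record["POS"])
    record["INFO"] = parse_info(record["INFO"])
    return record


record = parse_vcf_record(
    "20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ"
)
print(record["ID"], record["POS"], record["INFO"])
```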
Biomart, your one stop portal to fetch information
Biomart http://www.biomart.org/
– These questions are easy: Hey, can you tell me how many mouse genes regulate transcription and are located on chromosome 19?
Ensembl Genes
Genome sequence (Ensembl)
Gene Ontology GO:0009299
The translated question is reflected in the choice of database and Filters
The resulting genes are counted, and the output is set via Attributes
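The database / Filters / Attributes split can also be written down as a BioMart XML query. The sketch below only builds the query text and does not contact any server. The dataset name (`mmusculus_gene_ensembl`) and the filter and attribute names (`chromosome_name`, `go_parent_term`, `ensembl_gene_id`) are assumptions that should be checked against the names offered in the Biomart interface, and `build_biomart_query` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, count_only=False):
    """Build BioMart query XML: one dataset, its Filters and its Attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "count": "1" if count_only else "",  # "1" asks for a count only
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Mouse genes on chromosome 19 annotated with the GO term from the slide.
# All dataset, filter and attribute names here are assumed, not verified.
xml_query = build_biomart_query(
    dataset="mmusculus_gene_ensembl",
    filters={"chromosome_name": "19", "go_parent_term": "GO:0009299"},
    attributes=["ensembl_gene_id"],
    count_only=True,
)
print(xml_query)
```

The filters narrow down which genes are selected, while the attributes decide which columns come back; with `count_only=True` the query asks for the number of matching genes rather than the list itself.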
Biomart is available for an increasing number of databases
Biomart
http://www.biomart.org/
Gene lists resulting from different analyses can reveal their biology
DAVID - http://david.abcc.ncifcrf.gov/
DEMO
Alternatives
g:Profiler http://biit.cs.ut.ee/gprofiler/
Babelomics http://www.babelomics.org/
Galaxy allows you to store your data and to (re)analyse it conveniently
Galaxy - http://usegalaxy.org
DEMO
[Screenshot of Galaxy, with labels: TOOLS, RESULTS, DATA SETS]