Basic bioinformatics concepts, databases and tools Module 5 Genome browsers and interpretation of gene lists Dr. Joachim Jacob http://www.bits.vib.be Updated 21 July 2011 http://dl.dropbox.com/u/18352887/BITS_training_material/Link%20to%20mod5-intro_H1_2011_genomebrowsers.pdf

BITs: Genome browsers and interpretation of gene lists

DESCRIPTION

Module 5 Genome browsers and interpreting gene lists. Part of training session "Basic Bioinformatics concepts, databases and tools" - http://www.bits.vib.be/training

Page 1: BITs: Genome browsers and interpretation of gene lists

Basic bioinformatics concepts, databases and tools

Module 5

Genome browsers and interpretation of gene lists

Dr. Joachim Jacob

http://www.bits.vib.be

Updated 21 July 2011
http://dl.dropbox.com/u/18352887/BITS_training_material/Link%20to%20mod5-intro_H1_2011_genomebrowsers.pdf

Page 2: BITs: Genome browsers and interpretation of gene lists

Integrating biological information

Genome databases and browsers

– Integrating all biological information on a species basis: Ensembl Genome Browser
  http://www.ensembl.org/

Table browsers

– Retrieving biological (not only sequence) data applying various criteria: Biomart
  http://www.biomart.org/

Interpreting gene lists

– 'What is the biology behind my gene list': DAVID
  http://david.abcc.ncifcrf.gov/

Page 3: BITs: Genome browsers and interpretation of gene lists

Reference genome sequences provide a standard genome sequence per species

Genomes

– From various sequence sources, a genome is assembled
  – By NCBI: currently assembly (or 'build') 37 in human (2010)
  – By Celera: commercial
– Each build differs!
  1. Data freeze: all data for assembling (ignoring new data from that point)
  2. Assembly process and annotation
  3. Release of the build: Reference Sequence Genome

http://www.ncbi.nlm.nih.gov/Genomes/

Page 4: BITs: Genome browsers and interpretation of gene lists
Page 5: BITs: Genome browsers and interpretation of gene lists

Finding your way in genomes

Annotation and terms

– See also NCBI handbook
– Locus = place on the genome, ~ a gene (different alleles)
– Location:
  – Rough location by staining of chromosomes, e.g. 18q12.1 → chromosome 18, long arm (= q; the short arm is p)
  – Exact bases on genomes (assembly must be mentioned!)
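The two ways of giving a location can be sketched in a few lines of Python. This is only an illustration: the function names `parse_cytoband` and `exact_location` are made up for this example, the pattern covers human chromosomes only, and the coordinates in the usage lines are arbitrary placeholders, not a real gene position.

```python
import re

# Human chromosome (1-22, X or Y), arm (p or q), then the band number.
CYTOBAND = re.compile(
    r"^(?P<chrom>[1-9]|1[0-9]|2[0-2]|X|Y)(?P<arm>[pq])(?P<band>\d+(?:\.\d+)?)$"
)


def parse_cytoband(location):
    """Split a rough (staining-based) location such as '18q12.1' into its parts."""
    match = CYTOBAND.match(location)
    if match is None:
        raise ValueError(f"not a cytogenetic band: {location!r}")
    return {
        "chromosome": match["chrom"],
        "arm": "short" if match["arm"] == "p" else "long",
        "band": match["band"],
    }


def exact_location(assembly, chromosome, start, end):
    """Format an exact location; refuse to do so without an assembly name."""
    if not assembly:
        raise ValueError("an exact location needs an assembly name")
    return f"{assembly}:{chromosome}:{start}-{end}"


print(parse_cytoband("18q12.1"))
# Placeholder coordinates, for illustration only.
print(exact_location("GRCh37", "18", 1000, 2000))
```

Requiring the assembly name in `exact_location` mirrors the point made on the slide: base coordinates are only meaningful relative to a specific build.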

Page 7: BITs: Genome browsers and interpretation of gene lists

Ensembl Genome browser

– We will use this browser in this session
– Information is a combination of automatic annotation and manually curated sources (ENS >< Havana (Vega) genes)
– All entries can be accessed through the browser, each with its own clear identifiers

Page 8: BITs: Genome browsers and interpretation of gene lists

28 November 2009 [email protected]

Information about the genomes

http://www.ensembl.org

Page 9: BITs: Genome browsers and interpretation of gene lists

http://www.ensemblgenomes.org

Page 10: BITs: Genome browsers and interpretation of gene lists

… or click on the figure feature!

Page 11: BITs: Genome browsers and interpretation of gene lists

Page 12: BITs: Genome browsers and interpretation of gene lists

Page 13: BITs: Genome browsers and interpretation of gene lists

Screenshot of Ensembl genome browser page layout, with labels: tab summary, detailed information, information selector, data manager tab, DAS

Page 14: BITs: Genome browsers and interpretation of gene lists

Ensembl Genome browser

Usefulness:

– One place for all information on a particular gene / structure / location / variation
– But also: comparison to other species

The Ensembl Team has a lot of training movies and examples available. Check them out!

http://www.ensembl.org/info/index.html
http://www.ensembl.org/Help/Movie?id=188

Page 16: BITs: Genome browsers and interpretation of gene lists

Tracks are a way to display information on a genome sequence

The annotation on a genome-wide scale is displayed in tracks.

– Relevant database content can be formatted in tracks and displayed on a reference genome

Screenshot of Ensembl genome browser, with labels: genome reference, tracks

Page 17: BITs: Genome browsers and interpretation of gene lists

Tracks are a way to display information on a genome sequence

The annotation on a genome-wide scale is displayed in tracks. The most used formats:

- each base receives a value (dense, continuous data): WIG format (e.g. %GC)

- annotation has a start and a stop coordinate: BED format (e.g. gene annotations)
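To make the start/stop idea concrete, here is a minimal Python sketch that reads one BED line. It relies on the BED convention that the start coordinate is 0-based and the end coordinate is exclusive. The names `BedInterval`, `parse_bed_line` and `to_one_based` and the example line are invented for illustration.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def parse_bed_line(line):
    """Parse the first columns of a BED line (tab- or space-separated)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    name = fields[3] if len(fields) > 3 else ""
    return BedInterval(fields[0], int(fields[1]), int(fields[2]), name)


def to_one_based(interval):
    """Convert to the 1-based, inclusive notation used in browser location boxes."""
    return f"{interval.chrom}:{interval.start + 1}-{interval.end}"


# Hypothetical annotation line, not taken from a real track.
feature = parse_bed_line("chr18\t1000\t5000\texample_gene")
print(feature.end - feature.start)  # length of the feature in bases
print(to_one_based(feature))
```

Because the end is exclusive, the length of a feature is simply `end - start`, which is one reason the format uses this convention.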

Variations in genomes are reported in VCF format

http://www.ensembl.org/info/website/upload/bed.html
http://www.bits.vib.be/wiki/index.php/.vcf

Example

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ
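As a rough illustration of how such records are structured, the Python sketch below splits one VCF data line into its named columns and unpacks the INFO field. The helper names are hypothetical. Real VCF files are tab-separated and carry per-sample columns after FORMAT, so for real data a dedicated VCF library is the safer choice.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def parse_info(info):
    """Turn 'NS=3;DP=14;DB' into a dict; keys without a value become True flags."""
    result = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def parse_vcf_line(line):
    """Map the first nine whitespace-separated fields to the VCF column names."""
    fields = line.split()
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("expected at least nine columns")
    record = dict(zip(VCF_COLUMNS, fields))
    record["POS"] = int(record["POS"])
    record["INFO"] = parse_info(record["INFO"])
    return record


# First data line of the example on the slide.
record = parse_vcf_line(
    "20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ"
)
print(record["CHROM"], record["POS"], record["REF"], ">", record["ALT"])
print(record["INFO"])
```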

Page 18: BITs: Genome browsers and interpretation of gene lists

Biomart, your one stop portal to fetch information

Biomart http://www.biomart.org/

– These questions are easy: "Hey, can you tell me how many genes exist in mouse which regulate transcription and are located on chromosome 19?"

Page 19: BITs: Genome browsers and interpretation of gene lists

Biomart, your one stop portal to fetch information

Biomart http://www.biomart.org/

– These questions are easy: "Hey, can you tell me how many genes exist in mouse which regulate transcription and are located on chromosome 19?"

Ensembl Genes

Genome sequence (Ensembl)

Gene Ontology GO:0009299

Page 20: BITs: Genome browsers and interpretation of gene lists

Biomart, your one stop portal to fetch information

Biomart http://www.biomart.org/

The translated question is reflected in the choice of database and in the Filters

The resulting genes are counted, and the output is set via Attributes
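The same translation (database, Filters, Attributes) can also be written down as a BioMart XML query. The Python sketch below only builds the query text and does not contact any server. The dataset name `mmusculus_gene_ensembl` and the filter and attribute names (`chromosome_name`, `go_parent_term`, `ensembl_gene_id`) are assumptions about the Ensembl mart configuration and should be checked in the BioMart interface. The use of `count="1"` to request a count instead of rows is likewise an assumption, and `build_biomart_query` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, count=False):
    """Build a BioMart XML query string from a dataset, filters and attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "count": "1" if count else "",  # assumed: "1" asks for a count only
        "datasetConfigVersion": "0.6",
    })
    # Database choice: which dataset to query.
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters: the criteria that restrict the result set.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes: the columns to report for each hit.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# The mouse / transcription / chromosome 19 question from the slides.
xml_query = build_biomart_query(
    dataset="mmusculus_gene_ensembl",       # assumed dataset name
    filters={
        "chromosome_name": "19",            # assumed filter name
        "go_parent_term": "GO:0009299",     # assumed filter name; GO term from the slide
    },
    attributes=["ensembl_gene_id"],         # assumed attribute name
    count=True,
)
print(xml_query)
```

Keeping the query as data (a dataset name, a dict of filters, a list of attributes) makes it easy to rerun the same question for another species or chromosome.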

Page 21: BITs: Genome browsers and interpretation of gene lists

Biomart is available for an increasing number of databases

Biomart

http://www.biomart.org/

Page 22: BITs: Genome browsers and interpretation of gene lists

Gene lists resulting from different analyses can reveal their biology

DAVID - http://david.abcc.ncifcrf.gov/

Page 23: BITs: Genome browsers and interpretation of gene lists

Gene lists resulting from different analyses can reveal their biology

DAVID - http://david.abcc.ncifcrf.gov/

DEMO

Alternatives

g:Profiler http://biit.cs.ut.ee/gprofiler/
Babelomics http://www.babelomics.org/

Page 24: BITs: Genome browsers and interpretation of gene lists

Galaxy allows you to store your data and to (re)analyse it conveniently

Galaxy - http://usegalaxy.org

Page 25: BITs: Genome browsers and interpretation of gene lists

Galaxy allows you to store your data and to (re)analyse it conveniently

Galaxy - http://usegalaxy.org

DEMO

Screenshot of Galaxy, with labels: tools, results, data sets