User-friendly web tools for the Arabidopsis thaliana 1001 genomes
Beth Rowan
Max Planck Institute for Developmental Biology
Plant and Animal Genomes XXIV
January 11, 2016
Why sequence 1001 Arabidopsis thaliana genomes?
Goals
-understand genome variation in the species
-reconstruct demographic history
-identify geographic and genetic subsets
-generate a powerful resource for genome-wide association studies
Brief history of variant discovery
[Figure: timeline of variant discovery. y-axis: SNPs (10^3, 10^5, 10^7); x-axis: year (1995 to 2015). Milestones, in order:]
First reference genome
Haplotype map with 20 strains
2 wild strains resequenced
>80 wild strains resequenced
1135 wild strains resequenced
1001 Arabidopsis genomes: an overview
1001 Genomes Consortium, in review
Final Set: 1135 Accessions
Nearly-identical pairs
North America & British Isles
Highly divergent pairs
26 “relict” accessions
-Iberian Peninsula
-Cape Verde & Canary Islands
ADMIXTURE analysis identifies 9 genetic groups
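ADMIXTURE reports, for each individual, the estimated fraction of ancestry from each of K groups. As a rough illustration of how such output can be turned into group labels, the sketch below assigns each accession to its largest-fraction group and labels it "admixed" when no group reaches a cutoff. The function name, the 0.6 cutoff, the accession names, and the proportions are all assumptions made for illustration, not values from the talk.

```python
def assign_groups(q_rows, min_fraction=0.6):
    """Map each accession to a group index, or to 'admixed'.

    q_rows: dict of accession name -> list of ancestry proportions (one per group).
    min_fraction: hypothetical cutoff; an accession whose largest proportion
    falls below it is labeled 'admixed'.
    """
    assignments = {}
    for accession, proportions in q_rows.items():
        # Index of the group with the largest ancestry proportion
        best = max(range(len(proportions)), key=proportions.__getitem__)
        if proportions[best] >= min_fraction:
            assignments[accession] = best
        else:
            assignments[accession] = "admixed"
    return assignments


# Made-up proportions for three hypothetical accessions (K = 3 for brevity)
example_q = {
    "acc_A": [0.90, 0.05, 0.05],
    "acc_B": [0.40, 0.35, 0.25],
    "acc_C": [0.10, 0.20, 0.70],
}
groups = assign_groups(example_q)
print(groups)
```

With K = 9 the same function applies unchanged; only the length of each proportion list differs.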
1001 Arabidopsis genomes web tools
Current tools
http://1001genomes.org/tools/
-easyGWAS
-GWAPP
-1001 Proteomes
-GBrowse
-POLYMORPH
-BLAST
-Alivie
-TAIR converter
-Col-0 DB
New tools
http://tools.1001genomes.org/
-Admixture map
-Pseudogenomes
-Strain ID
Admixture Map
http://1001genomes.github.io/admixture-map/
Pseudogenomes
http://tools.1001genomes.org/pseudogenomes
Select all
Filter on the fly
Format check
Autocomplete
Multi-FASTA
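Once a multi-FASTA file has been downloaded, it can be read with a few lines of code. The sketch below parses multi-FASTA text and lists the positions at which two sequences differ. It assumes that each header begins with an accession identifier and that the sequences share the same coordinates; the header format and the toy sequences are illustrative and are not taken from the Pseudogenomes tool.

```python
def read_multi_fasta(text):
    """Parse multi-FASTA text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def differing_positions(seq_a, seq_b):
    """Return 1-based positions where two sequences differ (up to the shorter length)."""
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Toy input: two short made-up sequences labeled with accession IDs
example = """>6909
ACGTACGT
ACGT
>7373
ACGTTCGT
ACGA
"""
seqs = read_multi_fasta(example)
diffs = differing_positions(seqs["6909"], seqs["7373"])
print(diffs)
```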
Strain ID
http://tools.1001genomes.org/strain_id
[Diagram: Col-0 (6909) × Tsu-0 (7373) → F1 → F2]
Integrating tools with Araport
JBrowse
Extend to full dataset
Variant tracks for each strain
(Geographic location for each strain)
Hover over variant to get info
Left click on variant to see annotation and accession information
Future plans
1. Get all SNPs in region
2. Get all indels in region
3. Get SnpEff info for given SNP
4. Get VCF subset for given region
5. Get pseudogenomes
6. Helper function: Translate gene id to coordinates
7. Get allele frequencies for variants
8. Identify allele/haplotype groups
9. Find ADMIXTURE cluster membership
10. Experimental design tool for subsetting 1001 collection
examples:
-subset with greatest genetic diversity
-accessions with similar climates but from different geographical areas
-accessions with different population histories
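To make items 1 and 7 concrete, here is a minimal sketch of what "get all SNPs in region" and "get allele frequencies" could look like when run locally on VCF text. It is not the planned Araport interface; the function names and the toy VCF records are assumptions for illustration only.

```python
def parse_vcf_line(line):
    """Split one VCF data line into (chrom, pos, ref, alt, genotypes)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    # Sample columns start at index 9; GT is the first colon-separated subfield
    genotypes = [f.split(":")[0] for f in fields[9:]]
    return chrom, int(pos), ref, alt, genotypes


def snps_in_region(vcf_lines, chrom, start, end):
    """Return SNP records on `chrom` with start <= POS <= end (1-based, inclusive)."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        c, pos, ref, alt, gts = parse_vcf_line(line)
        # A SNP here means single-base REF and single-base ALT allele(s)
        is_snp = len(ref) == 1 and all(len(a) == 1 for a in alt.split(","))
        if c == chrom and start <= pos <= end and is_snp:
            hits.append((c, pos, ref, alt, gts))
    return hits


def alt_allele_frequency(genotypes):
    """Fraction of called alleles that are non-reference; None if nothing is called."""
    alleles = []
    for gt in genotypes:
        alleles.extend(a for a in gt.replace("|", "/").split("/") if a != ".")
    if not alleles:
        return None
    return sum(a != "0" for a in alleles) / len(alleles)


# Toy VCF with made-up records: one SNP in range, one indel, one SNP out of range
vcf_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3\ts4",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|0\t1|1\t1|1\t./.",
    "1\t150\t.\tAT\tA\t.\tPASS\t.\tGT\t0|0\t0|0\t1|1\t0|0",
    "1\t900\t.\tC\tT\t.\tPASS\t.\tGT\t0|0\t0|0\t0|0\t1|1",
]
region_snps = snps_in_region(vcf_lines, "1", 50, 200)
freq = alt_allele_frequency(region_snps[0][4])
print(region_snps[0][:4], freq)
```

Dropping the `is_snp` condition, or inverting it, would give the corresponding "VCF subset" or "indels in region" query. For a full-size dataset an indexed VCF reader would be a better fit than line-by-line parsing.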
https://www.surveymonkey.com/r/8DTCVQF
Acknowledgements
Web Tools
Joffrey Fitz
Ümit Seren
1001 Genomes Consortium
Project coordinators
Detlef Weigel, MPI for Developmental Biology
Magnus Nordborg, Gregor Mendel Institute
Joy Bergelson, University of Chicago
Joe R. Ecker, Salk Institute
Mitchell Sudkamp, Monsanto
Database creation
Congmao Wang, Zhejiang Acad. of Agri. Sciences
Alexander Platzer, Gregor Mendel Institute
+All Consortium Contributors