Application of genome assembly to Bovidae species
USDA is an equal opportunity provider and employer
AGRICULTURAL RESEARCH SERVICE
Tim Smith U.S. Meat Animal Research Center
Clay Center, Nebraska
September 20, 2018 PacBio User Group Meeting
St. Louis, MO
Science complex
U.S. Meat Animal Research Center Clay Center, Nebraska
7,800 breeding cows/heifers; 800 swine litters/year; 2,000 breeding ewes
8 mi (11 km)
The Bovinae Family Tree
Branch lengths are not proportional to time
(From Hernandez-Fernandez and Vrba, 2005)
Nilgai (India)
Four-horned antelope, Chousingha (India)
Lesser kudu (Ethiopia)
Bushbuck (Senegal)
Nyala (South Africa)
Sitatunga (Tanzania)
Bongo (West Africa)
Greater kudu (South Africa)
Mountain Nyala (Ethiopia)
Giant eland (Gambia)
Common eland (South Africa)
Saola (Laos)
African Buffalo
Tamaraw (Mindoro island, Philippines)
Lowland anoa (Indonesia)
Mountain anoa (Philippines)
Gaur (Bangladesh)
Banteng (Indonesia, Java)
Kouprey (Cambodia)
Yak (Boreal Asia; also Bos grunniens)
American Bison (North America)
Wisent (Poland)
Domestic Water Buffalo, Bubalus bubalis
Progenitor of Bos taurus
Selective breeding has substantially changed bovid species
“the ideal animal” South Devon circa 1835
Modern South Devon bull
The “ideal” Durham ca. 1819
The idea of creating specialized breeds through selective matings began in the 1700s
The first principle of genetics was known as early as the mid-17th century:
hindquarters inherited from cow, forequarters from bull
Modern Durham (Shorthorn) bull
Even within breed, substantial variation affecting traits exists
ca. 1900 : average milk yield/cow = 1,800 kg/yr
ca. 2000 : average milk yield/cow = 8,500 kg/yr
Are these large phenotypic differences the result of SNP variation accumulated through selection?
Comparisons of breed genomes are needed to find out; mapping short reads to a single
reference may miss significant differences
“Breeds of cattle” chromolithograph ca. 1879
USDA, NASS, 2010
Genetic selection accounts for about 1/3 of increased beef production
• U.S. cattle herd size
• amount of beef produced
Long-read Hereford reference assembly
L1 Dominette 01449
0.21% heterozygous
Total sequence length: 2.71 Gb
Total assembly gap length: 28,162
Number of scaffolds: 2,211
Scaffold N50: 103.31 Mb
Scaffold L50: 12
Number of contigs: 2,597
Contig N50: 25.90 Mb
Contig L50: 32
Total number of chromosomes and plasmids: 31
RSII, P6/C4 75x coverage
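For readers less familiar with the contiguity figures quoted on these slides, here is a minimal sketch of how N50 and L50 are conventionally computed from a list of contig (or scaffold) lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not taken from the presentation or from any particular assembler.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the total
    assembly length. L50 is how many sequences that takes.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), not real cattle data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_contigs))  # (70, 2)
```

Read this way, a contig N50 of 25.90 Mb with a contig L50 of 32 means that the 32 longest contigs together hold at least half of the assembled sequence, and the shortest of those 32 is 25.90 Mb long.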
Long-read Hereford versus Nelore short-read reference assembly
Hereford: L1 Dominette 01449, 0.21% heterozygous; RSII, P6/C4, 75x coverage
Nelore: Futuro
Metric                                     Hereford     Nelore
Total sequence length                      2.71 Gb      2.67 Gb
Total assembly gap length                  28,162       198.13 Mb
Number of scaffolds                        2,211        32
Scaffold N50                               103.31 Mb    106.31 Mb
Scaffold L50                               12           11
Number of contigs                          2,597        253,770
Contig N50                                 25.9 Mb      28 kb
Contig L50                                 32           25,227
Total number of chromosomes and plasmids   31           32
Long-read Hereford reference assembly
Hereford: L1 Dominette 01449, 0.21% heterozygous; RSII, P6/C4, 75x coverage
Jersey: 0.57% heterozygous; Sequel, v2.0, 63x coverage
Metric                                     Hereford     Jersey
Total sequence length                      2.71 Gb      2.77 Gb
Total assembly gap length                  28,162       NA
Number of scaffolds                        2,211        NA
Scaffold N50                               103.31 Mb    NA
Scaffold L50                               12           NA
Number of contigs                          2,597        1,831
Contig N50                                 25.9 Mb      46.75 Mb
Contig L50                                 32           32
Total number of chromosomes and plasmids   31           31
Cattle subspecies – Bos taurus taurus and Bos taurus indicus
Brahman, Bos taurus indicus Angus, Bos taurus taurus
Domesticated ≈ 11,000 ya Domesticated ≈ 9,000 ya
Aurochs
TrioCanu for F1 Angus x Brahman
0.9% heterozygous
Total sequence length 2.68 Gb Number of contigs 1,585 Haplotig N50 23.26 Mb
Sequel, v2.0 ≈135x coverage
Total sequence length 2.57 Gb Number of haplotigs 1,747 Haplotig N50 26.65 Mb
Angus
Brahman
66.9x
67.3x
Scaffolding ongoing:
• Haplotype-resolved HiC (Phase Genomics and Arima Genomics)
• Haplotype-resolved optical maps (Bionano)
X-chromosome PAR – single contig
Y-chromosome PAR – single scaffold (4 contigs)
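The idea behind trio binning, as used by TrioCanu, is to sort the F1's long reads by parent of origin before assembly, using k-mers found in only one parent. The toy sketch below illustrates that idea only: the function names, the tiny sequences, and the simple majority vote are all assumptions made for illustration. The real tool derives parental k-mers from sequencing reads of each parent and applies its own filtering and scoring.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bin_read(read, dam_only, sire_only, k):
    """Assign a read to 'dam', 'sire', or 'unassigned' by counting
    how many parent-specific k-mers it contains (simple majority)."""
    read_kmers = kmers(read, k)
    dam_hits = len(read_kmers & dam_only)
    sire_hits = len(read_kmers & sire_only)
    if dam_hits > sire_hits:
        return "dam"
    if sire_hits > dam_hits:
        return "sire"
    return "unassigned"


# Hypothetical parental sequences differing at a single base.
k = 4
dam_seq = "ACGTACGGTCA"
sire_seq = "ACGTACCGTCA"
dam_only = kmers(dam_seq, k) - kmers(sire_seq, k)
sire_only = kmers(sire_seq, k) - kmers(dam_seq, k)

for read in ["GTACGGTC", "TACCGTCA", "ACGTAC"]:
    print(read, bin_read(read, dam_only, sire_only, k))
```

In this toy, a read spanning the variant site lands in the matching parental bin, while a read from an identical region stays unassigned. Higher heterozygosity gives more parent-specific k-mers per read, which is why a subspecies or interspecies cross separates so cleanly.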
Other domesticated and wild Bovinae
Water buffalo, Bubalus bubalis
Cape Buffalo, Syncerus caffer
Banteng, Bos javanicus
Gaur, Bos gaurus
Plains Bison, Bison bison bison
Yak, Bos grunniens (Bos mutus)
Riverine water buffalo
Olimpia
Total sequence length: 2.66 Gb
Total assembly gap length: 373,500
Number of scaffolds: 509
Scaffold N50: 117.22 Mb
Scaffold L50: 9
Number of contigs: 919
Contig N50: 22.44 Mb
Contig L50: 36
Total number of chromosomes and plasmids: 26
Sequel, v2.0 69x coverage
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
Gaur genome
Omaha zoo, blood collection 2001
Total sequence length: 2,700,417,543
Number of contigs: 2,868
Contig N50: 13,257,066
Contig L50: 64
Total number of chromosomes and plasmids: 28
Sequel, v2.0 53x coverage
Interspecies crosses maximize contrast between parental genome contributions
“Duke” Scottish Highland
“Molly” Imperial Yak
“Esperanza” Yaklander cattle/yak F1
heterozygosity = 1.3%
Yaklander genome assembly
“Esperanza” Yaklander cattle/yak F1
heterozygosity = 1.3%
Total sequence length: 2.67 Gb
Number of contigs: 625
Contig NG50: 69.88 Mb
Contig LG50: 15
Total number of chromosomes and plasmids: 30
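The Yaklander figures are reported as NG50/LG50 rather than N50/L50. The conventional difference is that the halfway point is taken from an estimated genome size instead of from the assembly's own total length, which makes assemblies of different total size comparable. A minimal sketch follows; the function name `ng50_lg50`, the toy lengths, and the genome size are illustrative assumptions, and the slides do not state which genome size estimate was used.

```python
def ng50_lg50(lengths, genome_size):
    """Return (NG50, LG50), or None if the assembly covers less
    than half of the estimated genome size.

    Same walk as N50/L50, but the threshold is half of genome_size
    rather than half of the summed assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_genome = genome_size / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_genome:
            return length, count
    return None


# Hypothetical toy values (arbitrary units), not real Yaklander data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]   # sums to 300
print(ng50_lg50(toy_contigs, genome_size=400))  # (50, 3)
```

With these toy numbers the assembly is smaller than the assumed genome, so the NG50 (50) comes out lower than the N50 of the same contigs (70); an incomplete assembly cannot inflate its score by leaving sequence out.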
Scaffolding: Scaffolding? We don’t need no stinkin’ scaffolding!
The NG90 is 95.3% (dam) and 95.5% (sire) in 58 and 55 haplotigs, respectively
Salsa_default with HiC read pairs
Total sequence length: 2.66 Gb
Number of scaffolds: 527
Scaffold NG50: 86.25 Mb
Scaffold LG50: 12
Yaklander interspecies F1 has best assembly EVAH!
Scottish Highland (paternal) genome Yak (maternal) genome
Courtesy : Sergey Koren
Important note: these are the initial haplotigs, with no scaffolding, gap-filling, etc.; they are simply aligned to the cattle reference
Record haplotig/contig N50 >70 Mb
Except X chromosome, as good as current human assembly
Record longest haplotig/contig (155 Mb; the previous record, for human, was 143 Mb)
Human genome (GRCh38)
Yak (maternal) genome
Ben Rosen, Juan Medrano, Derek Bickhart, Bob Schnabel, Sergey Koren, Richard Hall
Ben Rosen, Christine Couldrey
Sergey Koren, Arang Rhie, Adam Phillippy, Wai-Yee Low, John Williams, Stefan Hiendleder, Derek Bickhart, Ben Rosen, Rick Tearle, Sarah Kingan
Mike Heaton, Peter Hackett, Tim Hardy, Jessica Petersen, Ed Rice, Sergey Koren
Mention of trade names or commercial products in this presentation is solely for the
purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture