Application of genome assembly to Bovidae species

Tim Smith

U.S. Meat Animal Research Center, Clay Center, Nebraska

AGRICULTURAL RESEARCH SERVICE

PacBio User Group Meeting, St. Louis, MO, September 20, 2018

USDA is an equal opportunity provider and employer

Science complex

U.S. Meat Animal Research Center Clay Center, Nebraska

• 7800 breeding cows/heifers

• 800 swine litters/year

• 2000 breeding ewes

8 mi (11 km)

The Bovinae Family Tree

Branch lengths are not proportional to time

(From Hernandez-Fernandez and Vrba, 2005)

Nilgai (India)

Four-horned antelope, Chousingha (India)

Lesser kudu (Ethiopia)

Bushbuck (Senegal)

Nyala (South Africa)

Sitatunga (Tanzania)

Bongo (West Africa)

Greater kudu (South Africa)

Mountain Nyala (Ethiopia)

Giant eland (Gambia)

Common eland (South Africa)

Saola (Laos)

African Buffalo

Tamaraw (Mindoro island, Philippines)

Lowland anoa (Indonesia)

Mountain anoa (Philippines)

Gaur (Bangladesh)

Banteng (Indonesia, Java)

Kouprey (Cambodia)

Yak (Boreal Asia; also Bos grunniens)

American Bison (North America)

Wisent (Poland)

Domestic Water Buffalo, Bubalus bubalis

Progenitor of Bos taurus

Cave painting of Aurochs ca. 14,000 BC

Aurochs drawing ca. 1885

Selective breeding has substantially changed bovid species

“the ideal animal” South Devon circa 1835

Modern South Devon bull

The “ideal” Durham ca. 1819

The idea of creating specialized breeds through selective matings began in the 1700s

First principle of genetics was known as early as the mid-17th century: hindquarters inherited from cow, forequarters from bull

Modern Durham (Shorthorn) bull

Bos indicus (Zebu, Bos taurus indicus)

Bos taurus (Taurine, Continental or British)

Even within breed, substantial variation affecting traits exists

ca. 1900 : average milk yield/cow = 1,800 kg/yr

ca. 2000 : average milk yield/cow = 8,500 kg/yr

Are these large phenotypic differences a result of accumulated SNP variation by selection?

Comparisons of breed genomes are needed to find out; mapping short reads to a single reference may miss significant differences

“Breeds of cattle” chromolithograph ca. 1879

USDA, NASS, 2010

Genetic selection accounts for about 1/3 of increased beef production

• U.S. cattle herd size

• amount of beef produced

Reference-quality cattle genomes

• Brahman

• Angus

• Hereford

• Jersey

• Holstein

• Nelore

Long-read Hereford reference assembly

L1 Dominette 01449

0.21% heterozygous

Total sequence length: 2.71 Gb
Total assembly gap length: 28,162 bp
Number of scaffolds: 2,211
Scaffold N50: 103.31 Mb
Scaffold L50: 12
Number of contigs: 2,597
Contig N50: 25.90 Mb
Contig L50: 32
Total number of chromosomes and plasmids: 31
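The N50 and L50 figures above follow the usual definitions: sort the sequences from longest to shortest and accumulate lengths until half of the total assembly length is reached. N50 is the length of the sequence that crosses that point and L50 is how many sequences it took. A minimal sketch in Python, using hypothetical toy lengths rather than the actual Hereford contigs:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2  # half of the total assembled length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50, 'count' is the L50
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths in Mb, for illustration only
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_contigs))
```

A larger N50 together with a smaller L50 means the same amount of sequence is held in fewer, longer pieces.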

RSII, P6/C4 75x coverage

Long-read Hereford versus Nelore short-read reference assembly

Hereford (long-read): L1 Dominette 01449

0.21% heterozygous

RSII, P6/C4 75x coverage

Total sequence length: 2.71 Gb
Total assembly gap length: 28,162 bp
Number of scaffolds: 2,211
Scaffold N50: 103.31 Mb
Scaffold L50: 12
Number of contigs: 2,597
Contig N50: 25.9 Mb
Contig L50: 32
Total number of chromosomes and plasmids: 31

Nelore (short-read): Futuro

Total sequence length: 2.67 Gb
Total assembly gap length: 198.13 Mb
Number of scaffolds: 32
Scaffold N50: 106.31 Mb
Scaffold L50: 11
Number of contigs: 253,770
Contig N50: 28 kb
Contig L50: 25,227
Total number of chromosomes and plasmids: 32
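To put the Hereford and Nelore statistics above side by side, the ratios can be worked out directly. This is a small sketch; the numbers are copied from the slide, and reading the Hereford gap length of 28,162 as base pairs is an assumption.

```python
# Values copied from the two sets of assembly statistics, converted to bp.
# Assumption: the Hereford gap length (28,162) is given in base pairs.
hereford = {"contigs": 2_597, "contig_n50": 25.9e6, "gap_bp": 28_162}
nelore = {"contigs": 253_770, "contig_n50": 28e3, "gap_bp": 198.13e6}

# How many times longer the long-read contig N50 is
n50_gain = hereford["contig_n50"] / nelore["contig_n50"]
# How many times fewer contigs the long-read assembly has
contig_reduction = nelore["contigs"] / hereford["contigs"]
# How many times less gap sequence the long-read assembly has
gap_reduction = nelore["gap_bp"] / hereford["gap_bp"]

print(f"Contig N50: about {n50_gain:.0f}x longer")
print(f"Contig count: about {contig_reduction:.0f}x fewer")
print(f"Gap length: about {gap_reduction:.0f}x smaller")
```

The scaffold N50 values are similar in the two assemblies, so the difference is in contiguity below the scaffold level: contig size, contig count and gap content.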

Long-read Hereford reference assembly

L1 Dominette 01449

0.21% heterozygous

RSII, P6/C4 75x coverage

Total sequence length: 2.77 Gb
Total assembly gap length: NA
Number of scaffolds: NA
Scaffold N50: NA
Scaffold L50: NA
Number of contigs: 1,831
Contig N50: 46.75 Mb
Contig L50: 32
Total number of chromosomes and plasmids: 31

Jersey Sequel, v2.0 63x coverage

0.57% heterozygous

Total sequence length: 2.71 Gb
Total assembly gap length: 28,162 bp
Number of scaffolds: 2,211
Scaffold N50: 103.31 Mb
Scaffold L50: 12
Number of contigs: 2,597
Contig N50: 25.9 Mb
Contig L50: 32
Total number of chromosomes and plasmids: 31

Cattle subspecies – Bos taurus taurus and Bos taurus indicus

Brahman, Bos taurus indicus

Angus, Bos taurus taurus

Domesticated ≈ 11,000 ya Domesticated ≈ 9,000 ya

Aurochs

F1 Angus x Brahman

0.9% heterozygous

TrioCanu for F1 Angus x Brahman

0.9% heterozygous

Total sequence length: 2.68 Gb
Number of contigs: 1,585
Haplotig N50: 23.26 Mb

Sequel, v2.0 ≈135x coverage

Total sequence length: 2.57 Gb
Number of haplotigs: 1,747
Haplotig N50: 26.65 Mb

Angus

Brahman

66.9x

67.3x
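The idea behind trio binning, as used by TrioCanu, is to sort the F1 long reads by parent of origin before assembly, using k-mers that occur in only one parent. The sketch below is a toy illustration of that idea and not TrioCanu's implementation: the sequences, the value of k and the function names are hypothetical, and it ignores reverse complements, sequencing errors and the normalization a real tool needs.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign an F1 read to the parent whose private k-mers it shares most."""
    read_kmers = kmers(read, k)
    mat_hits = len(read_kmers & mat_only)
    pat_hits = len(read_kmers & pat_only)
    if mat_hits > pat_hits:
        return "maternal"
    if pat_hits > mat_hits:
        return "paternal"
    return "unassigned"  # no parent-specific signal, or a tie


# Hypothetical parental sequences differing at a single base
K = 5
mat_only, pat_only = parent_specific_kmers(["ACGTACGTTAGC"], ["ACGTACGATAGC"], K)

for f1_read in ["GTACGTTAG", "GTACGATAG", "ACGTACG"]:
    print(f1_read, bin_read(f1_read, mat_only, pat_only, K))
```

Reads that span no parent-specific k-mer stay unassigned, which is why higher heterozygosity between the parental genomes makes the binning more complete.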

Scaffolding ongoing:

• Haplotype-resolved Hi-C (Phase Genomics and Arima Genomics)

• Haplotype-resolved optical maps (Bionano)

X-chromosome PAR: single contig

Y-chromosome PAR: single scaffold (4 contigs)

Other domesticated and wild Bovinae

Water buffalo, Bubalus bubalis

Cape Buffalo, Syncerus caffer

Banteng, Bos javanicus

Gaur, Bos gaurus

Plains Bison, Bison bison bison

Yak, Bos grunniens (Bos mutus)

Riverine water buffalo

Olimpia

Total sequence length: 2.66 Gb
Total assembly gap length: 373,500 bp
Number of scaffolds: 509
Scaffold N50: 117.22 Mb
Scaffold L50: 9
Number of contigs: 919
Contig N50: 22.44 Mb
Contig L50: 36
Total number of chromosomes and plasmids: 26

Sequel, v2.0 69x coverage

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

Gaur genome

Omaha zoo, blood collection 2001

Total sequence length: 2,700,417,543 bp
Number of contigs: 2,868
Contig N50: 13,257,066 bp
Contig L50: 64
Total number of chromosomes and plasmids: 28

Sequel, v2.0 53x coverage

Interspecies crosses maximize contrast between parental genome contributions

“Duke” Scottish Highland

“Molly” Imperial Yak

“Esperanza” Yaklander cattle/yak F1

heterozygosity = 1.3%
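One way to read the heterozygosity figures quoted on these slides is as the expected number of haplotype-distinguishing sites per long read. The sketch below assumes that heterozygosity is the fraction of heterozygous bases and that those sites are spread evenly along the genome. Both are simplifications, and the 10 kb read length is a hypothetical value chosen for illustration.

```python
# Heterozygosity values (percent) as quoted on the slides
heterozygosity_pct = {
    "Hereford (L1 Dominette 01449)": 0.21,
    "Jersey": 0.57,
    "F1 Angus x Brahman": 0.9,
    "Yaklander F1": 1.3,
}

READ_LENGTH = 10_000  # hypothetical long-read length in bp


def expected_het_sites(het_pct, read_length):
    """Expected heterozygous sites in a read, assuming even spacing."""
    return het_pct / 100 * read_length


def mean_spacing_bp(het_pct):
    """Average distance in bp between heterozygous sites."""
    return 100 / het_pct


for sample, pct in heterozygosity_pct.items():
    sites = expected_het_sites(pct, READ_LENGTH)
    spacing = mean_spacing_bp(pct)
    print(f"{sample}: ~{sites:.0f} sites per read, one every ~{spacing:.0f} bp")
```

Under these assumptions a read from the yak x cattle F1 carries roughly six times as many informative sites as a read from the purebred Hereford, which is the contrast the interspecies cross is designed to provide.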

Yaklander genome assembly

“Esperanza” Yaklander cattle/yak F1

heterozygosity = 1.3%

Total sequence length: 2.67 Gb
Number of contigs: 625
Contig NG50: 69.88 Mb
Contig LG50: 15
Total number of chromosomes and plasmids: 30
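NG50 and LG50 differ from N50 and L50 only in the denominator: the halfway point is taken against an estimated genome size rather than the total length of the assembly, which makes assemblies of different total size comparable. A minimal sketch with hypothetical toy values (the genome size used for the Yaklander figures is not stated on the slide):

```python
def ng50_lg50(lengths, genome_size):
    """Return (NG50, LG50) relative to an estimated genome size.

    Returns None if the assembly covers less than half of the estimate.
    """
    ordered = sorted(lengths, reverse=True)
    half = genome_size / 2  # half of the genome estimate, not of the assembly
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    return None


# Hypothetical contig lengths (Mb) and genome size estimate (Mb)
toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # sums to 300
print(ng50_lg50(toy_contigs, genome_size=400))
```

With these toy values the assembly is smaller than the genome estimate, so the NG50 comes out lower than the N50 computed from the assembly length alone.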

Scaffolding: Scaffolding? We don’t need no stinkin’ scaffolding!

The NG90 is 95.3% (dam) and 95.5% (sire) in 58 and 55 haplotigs, respectively

Salsa_default with HiC read pairs

Total sequence length: 2.66 Gb
Number of scaffolds: 527
Scaffold NG50: 86.25 Mb
Scaffold LG50: 12

Yaklander interspecies F1 has best assembly EVAH!

Scottish Highland (paternal) genome Yak (maternal) genome

Courtesy : Sergey Koren

Important note: these are the initial haplotigs, with no scaffolding, gap-filling, etc.; just alignment to the cattle reference

Record haplotig/contig N50 >70 Mb

Except X chromosome, as good as current human assembly

Record longest haplotig/contig: 155 Mb (previous record, for human: 143 Mb)

Human genome (GRCh38)

Yak (maternal) genome

YAK

(Haplotig per chromosome)

THERE CAN BE ONLY ONE

F1 bison x Simmental

American Simmental Association Wade Shafer Fred Schuetze Brad Stroud

Ben Rosen, Juan Medrano, Derek Bickhart, Bob Schnabel, Sergey Koren, Richard Hall

Ben Rosen, Christine Couldrey

Sergey Koren, Arang Rhie, Adam Phillippy, Wai-Yee Low, John Williams, Stefan Hiendleder, Derek Bickhart, Ben Rosen, Rick Tearle, Sarah Kingan

Mike Heaton, Peter Hackett, Tim Hardy, Jessica Petersen, Ed Rice, Sergey Koren

Mention of trade names or commercial products in this presentation is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture