The Whole Genome Sequencing Revolution Martin Wiedmann Gellert Family Professor of Food Safety Department of Food Science Cornell University, Ithaca, NY E-mail: [email protected] Phone: 607-254-2838


The Whole Genome Sequencing Revolution

Martin Wiedmann
Gellert Family Professor of Food Safety
Department of Food Science
Cornell University, Ithaca, NY
E-mail: [email protected]

Phone: 607-254-2838


Outline
• Subtyping for disease surveillance: from PFGE to WGS
• WGS challenges: when are two isolates the same or different? Can we find identical isolates in different locations?
• Looking into the future


PulseNet allows international outbreak detection and traceback: a hypothetical example

[Figure: map of PulseNet networks (PulseNet USA, PulseNet Canada, PulseNet Europe, PulseNet Asia Pacific, PulseNet Latin America & Caribbean, PulseNet Middle East), marking a food isolate deposited into PulseNet and two human cases]


Whole Genome Sequencing
• It all started with the human genome project
• Sequencing of a bacterial genome is now feasible at costs of <$100/isolate
  – Costs will continue to drop
• Commonly used platforms include
  – Roche 454
  – Illumina HiSeq/MiSeq
  – Applied Biosystems SOLiD Systems
  – Life Technologies/Thermo Fisher Ion Torrent
  – PacBio RS
  – Nanopore-based systems (e.g., Oxford Nanopore MinION)


The genome sequence revolution


DNA sequencing-based subtyping

Isolate 1 AACATGCAGACTGACGATTCGACGTAGGCTAGACGTTGACTG
Isolate 2 AACATGCAGACTGACGATTCGTCGTAGGCTAGACGTTGACTG
Isolate 3 AACATGCAGACTGACGATTCGACGTAGGCTAGACGTTGACTG
Isolate 4 AACATGCATACTGACGATTCGTCGAAGGCTAGACGTTGACTG

SNP: single nucleotide polymorphism

[Figure: tree with tips labeled 1, 3, 2, 4]
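The comparison on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: it assumes the four sequences are already aligned and of equal length, and the helper name `snp_positions` is made up for this example.

```python
from itertools import combinations

# Sequences copied from the slide; they are assumed to be pre-aligned.
isolates = {
    "Isolate 1": "AACATGCAGACTGACGATTCGACGTAGGCTAGACGTTGACTG",
    "Isolate 2": "AACATGCAGACTGACGATTCGTCGTAGGCTAGACGTTGACTG",
    "Isolate 3": "AACATGCAGACTGACGATTCGACGTAGGCTAGACGTTGACTG",
    "Isolate 4": "AACATGCATACTGACGATTCGTCGAAGGCTAGACGTTGACTG",
}


def snp_positions(seq_a, seq_b):
    """Return 1-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Pairwise SNP distance for every pair of isolates.
snp_distances = {
    (name_a, name_b): len(snp_positions(isolates[name_a], isolates[name_b]))
    for name_a, name_b in combinations(isolates, 2)
}

for (name_a, name_b), distance in snp_distances.items():
    print(f"{name_a} vs {name_b}: {distance} SNP difference(s)")
```

In this toy alignment, isolates 1 and 3 are identical, isolate 2 is one SNP away from them, and isolate 4 is the most distant, which matches the order of the tips in the tree.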


Challenges with use of PFGE as a subtyping method in outbreak investigations
• Two isolates may show the same PFGE type even though they are genetically distinct
  – PFGE only interrogates a small part of the genome
• Two isolates may show “slightly” (?? - the “3-band rule”) different PFGE patterns despite sharing a very recent common ancestor
  – Could be due to lateral gene transfer, loss of a plasmid, rearrangements, point mutations, etc.


[Figure: XbaI and SpeI restriction patterns]

Den Bakker et al. 2011. AEM.

Includes isolates from a Salmonella outbreak linked to sausages (Rhode Island) and isolates from pistachios


Tip-dated maximum clade credibility tree based on SNP data for 47 Montevideo isolates


• Salmonella Enteritidis is the most common cause of human salmonellosis and is poorly resolved by current subtyping technologies.

[Figure: MLVA type frequency, 98 MLVA types]

[Figure: PFGE type frequency, 52 PFGE types]

[Figure: combined MLVA-PFGE type frequency, 163 combined MLVA-PFGE types]


Full genome sequencing identified the following differences between these isolates:
(i) 28 single nucleotide polymorphisms (SNPs) and
(ii) three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns.

Both isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes.

Gilmour et al. BMC Genomics 2010, 11:120


In addition, whole genome sequencing showed that 5 Listeria isolates collected in 2010 from the same facility were also closely related genetically to isolates from ill people.


Listeria Outbreaks and Incidence, 1983-2014

[Figure: total outbreaks (No. outbreaks) and incidence (per million pop) by year, 1983-2014]

Era                  Outbreaks per year   Median cases per outbreak
Pre-PulseNet         0.3                  69
Early PulseNet       2.3                  11
Listeria Initiative  2.9                  5.5
WGS                  8                    4.5

Data are preliminary and subject to change


March 2015: Listeriosis cases linked to Blue Bell ice cream


Outline
• Subtyping for disease surveillance: from PFGE to WGS
• WGS challenges: when are two isolates the same or different? Can we find identical isolates in different locations?
• Looking into the future


The challenge

• Identical bacteria (100% match over the whole genome) can be found in different places that can be potential sources of foodborne disease outbreaks


The theoretical background
• Bacteria divide asexually: bacterial populations can be seen as large populations of “identical twins”
• Mutation rate during replication is low: extremes of the suggested mutation rates range from 2.25 × 10⁻¹¹ to 4.50 × 10⁻¹⁰ per bp per generation
  – With a genome size of around 5 million bp per bacterial genome (5 × 10⁶), between approx. 450 and 9,000 generations are needed for a single SNP difference
  – Eyre et al. estimated an evolutionary rate of 0.74 SNVs per successfully sequenced genome per year for C. difficile (N. Engl. J. Med. 2013)
    • “Whole-genome sequencing … identified 13% of cases that were genetically related (≤2 SNVs) but without any evidence of plausible previous contact through a hospital, residential area, or family doctor.”
  – Unknown bacterial generation time in different environments complicates interpretation
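The "450 to 9,000 generations" range on this slide follows directly from the quoted mutation rates and genome size. The short Python sketch below reproduces that arithmetic; the function name is illustrative, and the calculation treats the per-genome mutation rate as simply rate times genome size, which is a simplifying assumption.

```python
GENOME_SIZE_BP = 5e6  # approx. genome size quoted on the slide

# Extremes of the suggested mutation rates (per bp per generation), from the slide.
LOW_RATE = 2.25e-11
HIGH_RATE = 4.50e-10


def generations_per_snp(rate_per_bp, genome_size_bp=GENOME_SIZE_BP):
    """Expected number of generations until one mutation arises in the genome."""
    mutations_per_generation = rate_per_bp * genome_size_bp
    return 1.0 / mutations_per_generation


slowest = generations_per_snp(LOW_RATE)   # about 8,900 generations
fastest = generations_per_snp(HIGH_RATE)  # about 440 generations

print(f"Low mutation rate:  ~{slowest:,.0f} generations per SNP")
print(f"High mutation rate: ~{fastest:,.0f} generations per SNP")
```

Rounded, these values give the approx. 450 and 9,000 generations quoted above.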


2000 US outbreak - Environmental persistence of L. monocytogenes

• 1988: one human listeriosis case linked to hot dogs produced by plant X
• 2000: 29 human listeriosis cases linked to sliced turkey meats from plant X


Real world observations


Real world observations

In one case, isolates with <3 SNP differences were found in retail delis in three different states


Conclusions
• Even with WGS, epidemiological data are still essential
• Number of SNP differences/allele differences that is meaningful differs by organism, strain, outbreak/cluster, and growth environment
  – Number of bacterial generations per calendar year can differ hugely (think dry environment versus active infection in an animal population)
• Best way to determine “meaningful” SNP differences is through a combination of phylogenetic and epidemiological data
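To see why generations per calendar year matter so much, the sketch below multiplies the mutation-rate extremes from the theoretical background slide by two generation counts. Both generation counts are hypothetical values chosen only for illustration; they are not from the talk or from any measurement.

```python
GENOME_SIZE_BP = 5e6  # approx. genome size from the theoretical background slide

# Mutation-rate extremes (per bp per generation) from the same slide.
MUTATION_RATES = {"low": 2.25e-11, "high": 4.50e-10}

# HYPOTHETICAL generation counts per calendar year, for illustration only.
GENERATIONS_PER_YEAR = {
    "dry environment (assumed)": 10,
    "active infection (assumed)": 5000,
}


def expected_snps_per_year(rate_per_bp, generations_per_year,
                           genome_size_bp=GENOME_SIZE_BP):
    """Expected SNPs accumulated per genome per year under a constant rate."""
    return rate_per_bp * genome_size_bp * generations_per_year


expected = {
    (environment, rate_label): expected_snps_per_year(rate, generations)
    for environment, generations in GENERATIONS_PER_YEAR.items()
    for rate_label, rate in MUTATION_RATES.items()
}

for (environment, rate_label), snps in expected.items():
    print(f"{environment}, {rate_label} rate: {snps:.4f} SNPs/year")
```

Under these assumed numbers the expected SNP accumulation spans several orders of magnitude, which is why a single SNP cutoff cannot be applied across organisms and environments.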


Looking into the future
• WGS will get cheaper and will be used more
  – STEC next, probably Salmonella Enteritidis after that
  – Detection of more clusters and outbreaks
• WGS database will grow rapidly with inclusion of environmental isolates
  – More outbreaks will be linked to a source by using WGS matches between food or environmental isolates and human isolates as a starting point
• Broader application of WGS by private labs, maybe customers and consumers?


Conclusions
• WGS is a game changer and will significantly improve detection of outbreaks, adulteration, etc.
  – False alarms will occur though
• Pathogen detection in environments, by regulatory agencies, will lead to inclusion of WGS data in CDC/FDA/USDA databases (GenomeTrakr)
  – Environmental pathogen monitoring by industry will become even more important


Analysis of genome wide SNPs (wgSNPs)
• Identifies all high confidence SNPs over the whole genome (approx. 3 to 5 million nucleotides)


Whole genome multilocus sequence typing (wgMLST)
• Allows for simpler analysis and clear naming of subtypes
• Performs comparison on a gene by gene level

             Isolate A  Isolate B  Isolate C
Gene 1       1          1          1
Gene 2       8          8          12
Gene 3       5          5          2
Etc.
Gene 1,005   4          4          4
wgMLST type  A          A          B
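The gene-by-gene comparison in the table can be sketched as follows. This is a toy illustration under stated assumptions: only the four genes shown on the slide are included (the "Etc." rows are omitted), the function names are made up, and the letter-naming scheme simply mirrors the slide rather than any real wgMLST nomenclature.

```python
# Allele numbers copied from the slide's example table.
profiles = {
    "Isolate A": {"Gene 1": 1, "Gene 2": 8, "Gene 3": 5, "Gene 1,005": 4},
    "Isolate B": {"Gene 1": 1, "Gene 2": 8, "Gene 3": 5, "Gene 1,005": 4},
    "Isolate C": {"Gene 1": 1, "Gene 2": 12, "Gene 3": 2, "Gene 1,005": 4},
}


def allele_differences(profile_a, profile_b):
    """List genes present in both profiles whose allele numbers differ."""
    shared_genes = profile_a.keys() & profile_b.keys()
    return sorted(g for g in shared_genes if profile_a[g] != profile_b[g])


def assign_types(all_profiles):
    """Give identical allele profiles the same letter, in order of appearance.

    Assumption for this sketch: at most 26 distinct profiles (letters A to Z).
    """
    letter_for_profile = {}
    assigned = {}
    for name, profile in all_profiles.items():
        key = tuple(sorted(profile.items()))
        if key not in letter_for_profile:
            letter_for_profile[key] = chr(ord("A") + len(letter_for_profile))
        assigned[name] = letter_for_profile[key]
    return assigned


types = assign_types(profiles)
print(types)
print(allele_differences(profiles["Isolate A"], profiles["Isolate C"]))
```

Isolates A and B share every allele and so receive the same type, while isolate C differs at Gene 2 and Gene 3 and receives a new one, as in the last row of the table.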