National Center for Emerging and Zoonotic Infectious Diseases Updates from CDC: Cluster Detection and Reporting Guidelines Molly Leeper Salmonella Database Manager PulseNet Western Regional Meeting February 2019

  • National Center for Emerging and Zoonotic Infectious Diseases

    Updates from CDC: Cluster Detection and Reporting Guidelines

    Molly Leeper, Salmonella Database Manager

    PulseNet Western Regional Meeting, February 2019

  • Update to PulseNet’s Transition to WGS for Foodborne Surveillance

    WGS is the standard subtyping method for Listeria.

    Campylobacter, Salmonella and STEC/Shigella surveillance will begin to use WGS as the standard subtyping method this year (expected timeline: March 2019). At this time, laboratories will be requested to transition to WGS.

    For other PulseNet organisms (Vibrio, Yersinia), laboratories may continue to pulse isolates or perform WGS as funding allows.

    Laboratories are in the process of converting existing PFGE databases to BioNumerics 7.6. The expectation is that all labs will be converted by March.

  • BioNumerics v7.6 Conversion

    Labs are in the process of converting their PFGE databases to BioNumerics v7.6.

    Once labs are converted, they can request WGS analysis certification sets by emailing [email protected]

    The PulseNet team at CDC has posted training documents covering PFGE and WGS analysis and management of data in BioNumerics v7.6 to the PulseNet SharePoint site (Library of PulseNet Documents → WGS PHL Upgrade to BioNumerics v7.6).

    WGS analysis in BioNumerics v7.6 is expected to be available to certified individuals in March.

  • Database Validation

    Outbreaks detected by PFGE with good epi data were compared using hqSNP, cgMLST and wgMLST analyses to determine which method worked best to separate outbreak vs. sporadic cases.

    [Figure: hqSNP tree and two MLST dendrograms for the same 15 isolates (PNUSAS002596 to PNUSAS002611). Branch annotations on the hqSNP tree: 0 – 74 SNPs, 0 – 5 SNPs, 0 SNPs, 1 SNP. Dendrogram node labels give allele differences as median [minimum, maximum].]

    hqSNP: cluster is 0 – 5 SNPs
    wgMLST: cluster is 0 – 4 alleles
    cgMLST: cluster is 0 – 4 alleles

  • Thresholds for Detecting WGS Clusters

    cgMLST will be used to detect clusters (targets the core genome)

    wgMLST may be used to further discriminate if necessary (targets the entire genome)

    Look for local clusters of sequences within 10 alleles by cgMLST, with at least two of those sequences being within 5 alleles

    May want to report historical sequences that are closely related to newly detected clusters

    Allele differences within a cluster may be larger or smaller depending on the organism and epi data

    – There can be similar strains by WGS that may not be epidemiologically linked
    – More clonal species/serotypes may have smaller allele differences
    – Zoonotic outbreaks may have larger allele differences
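The 10-allele / 5-allele rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how BioNumerics implements it: profiles are plain lists of allele numbers, isolates are grouped by single linkage at the outer threshold, and the names `allele_distance` and `find_candidate_clusters` are hypothetical.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci where both profiles have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def find_candidate_clusters(profiles, outer=10, inner=5):
    """Group isolates linked at <= outer alleles and keep only groups
    that contain at least one pair within <= inner alleles."""
    ids = sorted(profiles)
    dist = {
        frozenset((a, b)): allele_distance(profiles[a], profiles[b])
        for a, b in combinations(ids, 2)
    }

    # Union-find: merge any two isolates within the outer threshold.
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(ids, 2):
        if dist[frozenset((a, b))] <= outer:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)

    clusters = []
    for members in groups.values():
        pairs = combinations(members, 2)
        if any(dist[frozenset(p)] <= inner for p in pairs):
            clusters.append(sorted(members))
    return sorted(clusters)


# Hypothetical 20-locus profiles (real cgMLST schemes have far more loci).
base = [1] * 20
profiles = {
    "A": base,
    "B": [9, 9, 9] + base[3:],            # 3 alleles from A
    "C": base[:3] + [8] * 8 + base[11:],  # 8 alleles from A, 11 from B
    "D": [2] * 20,                        # unrelated to everything else
    "E": [3] * 20,
    "F": [4] * 7 + [3] * 13,              # 7 alleles from E: linked, but no pair within 5
}
clusters = find_candidate_clusters(profiles)
print(clusters)
```

Single linkage means B and C land in the same group through A even though they are 11 alleles apart. Whether that chaining is acceptable is a judgment call that depends on the organism and the epi data.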

  • Cluster Detection Methods

    Tool                                                    Listeria                      Salmonella            Escherichia           Campylobacter
    Core genome MLST (cgMLST)                               Yes                           Yes                   Yes                   Yes
    Whole genome MLST (wgMLST): if further analysis needed  Yes                           Yes                   Yes                   Yes
    SNP analysis: if further analysis needed                Yes                           Yes                   Yes                   Yes
    Fast Character Matching                                 Yes                           Yes                   Yes                   Yes
    Allele code nomenclature                                Yes                           Available March 2019  Available March 2019  Available March 2019
    Find Clusters Tool                                      Available late February 2019  Available March 2019  Available March 2019  Available March 2019

  • Cluster Detection Method: 60 or 120 Day Search

    Select entries uploaded in the past 60 or 120 days

    Can choose different allele schemes – cgMLST, wgMLST, etc.
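Outside BioNumerics, the same 60- or 120-day selection can be sketched as a date filter. This is a minimal illustration: the WGS IDs and upload dates come from the example line list later in this deck, while the reference date and the `recent_entries` helper are assumptions.

```python
from datetime import date, timedelta


def recent_entries(entries, today, window_days=60):
    """Return entries whose upload date falls within the last window_days."""
    cutoff = today - timedelta(days=window_days)
    return [e for e in entries if e["upload_date"] >= cutoff]


# IDs and upload dates from the deck's example line list;
# the reference date below is a hypothetical choice.
entries = [
    {"wgs_id": "PNUSAL004534", "upload_date": date(2018, 11, 8)},
    {"wgs_id": "PNUSAL004567", "upload_date": date(2018, 11, 28)},
    {"wgs_id": "PNUSAL004633", "upload_date": date(2018, 12, 27)},
]
today = date(2019, 1, 31)
last_60 = [e["wgs_id"] for e in recent_entries(entries, today, 60)]
last_120 = [e["wgs_id"] for e in recent_entries(entries, today, 120)]
print(last_60, last_120)
```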

  • Cluster Detection Method: 60 or 120 day dendrogram

    Allele differences can be viewed by right-clicking on nodes

    The number of allele differences is shown as the median and range [minimum – maximum]

    Similarity matrix and differences between clades can be exported into Excel
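The "median [minimum, maximum]" labels can be reproduced from pairwise allele differences. The sketch below uses four isolates and their pairwise values taken from the similarity matrix export later in this deck, so the between-group summary differs from the full clade table. The function names are hypothetical.

```python
from itertools import combinations, product
from statistics import median


def summarize(values):
    """Format values as 'median [minimum, maximum]'."""
    return f"{float(median(values)):.1f} [{min(values)}, {max(values)}]"


def within_group(dist, members):
    """Summarize all pairwise differences inside one group."""
    return summarize([dist[frozenset(p)] for p in combinations(members, 2)])


def between_groups(dist, group_a, group_b):
    """Summarize all pairwise differences across two groups."""
    return summarize([dist[frozenset(p)] for p in product(group_a, group_b)])


# Pairwise allele differences for a four-isolate subset of the deck's matrix.
dist = {
    frozenset(("PNUSAS055284", "PNUSAS055288")): 0,
    frozenset(("PNUSAS058626", "PNUSAS059781")): 0,
    frozenset(("PNUSAS055284", "PNUSAS058626")): 15,
    frozenset(("PNUSAS055284", "PNUSAS059781")): 16,
    frozenset(("PNUSAS055288", "PNUSAS058626")): 13,
    frozenset(("PNUSAS055288", "PNUSAS059781")): 15,
}
cluster_1 = ["PNUSAS055284", "PNUSAS055288"]
cluster_2 = ["PNUSAS058626", "PNUSAS059781"]

print(within_group(dist, cluster_1))
print(between_groups(dist, cluster_1, cluster_2))
```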

  • Cluster Detection Method: wgSNP Analysis in BioNumerics

    Use for further analysis of clusters

    • First create a new experiment/sequence type (must do for each cluster/analysis)

    • Next, map to the reference strain (if using an existing de novo assembly, click on the de novo experiment “green dot” for the reference you want to use, or import a closed genome as a reference)

    • Next, select entries for comparison and submit to CE. When analysis is finished, retrieve jobs.

    • Next, run SNP analysis (Choose Analysis→Sequence types→Start SNP Analysis). Apply customized SNP filters.

    • Last, export entries to a comparison and create SNP tree.

  • Cluster Detection Method: wgSNP Analysis in BioNumerics

    Selecting the node in the tree gives you the SNP differences between cases

    These strains are 24 SNPs different based on the reference chosen

  • Cluster Detection Method: Fast Character Matching (FCM)

    cgMLST is the default character view

    The results can be restricted to only include entries in a specific date range or database field

    Can search for a specific number of allele differences

    Can choose how results are shown
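The idea behind an FCM-style search (a query entry, a maximum number of allele differences, and optional date or field restrictions) can be sketched as follows. This is not the BioNumerics implementation. The entry layout, the `fast_match` name, and all data are hypothetical.

```python
from datetime import date


def allele_differences(a, b):
    """Count loci where both profiles have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def fast_match(query_id, entries, max_diff, start=None, end=None, **fields):
    """Return (entry_id, differences) for entries within max_diff alleles
    of the query, optionally limited by upload date and field values."""
    query = entries[query_id]["profile"]
    hits = []
    for entry_id, entry in entries.items():
        if entry_id == query_id:
            continue
        uploaded = entry["upload_date"]
        if start is not None and uploaded < start:
            continue
        if end is not None and uploaded > end:
            continue
        if any(entry.get(k) != v for k, v in fields.items()):
            continue
        diff = allele_differences(query, entry["profile"])
        if diff <= max_diff:
            hits.append((entry_id, diff))
    return sorted(hits, key=lambda h: (h[1], h[0]))


# Hypothetical six-locus profiles and metadata.
entries = {
    "Q": {"profile": [1, 2, 3, 4, 5, 6], "upload_date": date(2019, 1, 15), "serotype": "Enteritidis"},
    "X": {"profile": [1, 2, 3, 4, 5, 7], "upload_date": date(2019, 1, 10), "serotype": "Enteritidis"},
    "Y": {"profile": [1, 2, 3, 9, 9, 9], "upload_date": date(2018, 6, 1), "serotype": "Enteritidis"},
    "Z": {"profile": [7, 7, 7, 7, 7, 7], "upload_date": date(2019, 1, 12), "serotype": "Enteritidis"},
}
all_hits = fast_match("Q", entries, max_diff=3)
recent_hits = fast_match("Q", entries, max_diff=3, start=date(2019, 1, 1))
print(all_hits, recent_hits)
```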

  • Cluster Detection Method: Allele Codes

    Allele codes are built on percent similarity thresholds between core genomes (cgMLST) to form a stable “Allele Code,” similar to a pattern name

    We can use these names for cluster detection by knowing how related isolates are based on their name

    Names can be complete or partial depending on how they relate on the tree from which the nomenclature was built

    QC is built into the nomenclature so that strains will not be named if the core genome falls below 95% or genome size is incorrect

    Naming and thresholds of relatedness will vary by organism

  • Cluster Detection Method: Allele Codes

    All uploads that pass quality will receive an allele code, which can then be downloaded into local databases

    Poor quality sequences will receive a “failed” allele code and should be re-sequenced

    – FAILED QC: CORE
    – FAILED QC: LENGTH
    – FAILED QC: CORE, LENGTH

    Compare entries within the past 60 or 120 days that share allele codes up to the cluster detection threshold (may vary depending on organism)

    Allele codes will be available in SEDRIC
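To make "share allele codes up to the cluster detection threshold" concrete, the sketch below counts how many leading digits two codes have in common. It assumes codes look like the examples in this deck's line list (a scheme label, " - ", then dot-separated digits) and that failed codes start with "FAILED QC". The helper names are hypothetical.

```python
def parse_allele_code(code):
    """Split 'LMO1.0 - 46.1.1.5.11.1' into ('LMO1.0', (46, 1, 1, 5, 11, 1)).
    Returns None for missing, failed, or unrecognized codes."""
    if code is None or code.strip().upper().startswith("FAILED QC"):
        return None
    scheme, sep, digits = code.partition(" - ")
    if not sep or not digits.strip():
        return None
    return scheme.strip(), tuple(int(d) for d in digits.strip().split("."))


def shared_digits(code_a, code_b):
    """Number of leading allele-code digits the two codes have in common."""
    a, b = parse_allele_code(code_a), parse_allele_code(code_b)
    if a is None or b is None or a[0] != b[0]:
        return 0
    count = 0
    for x, y in zip(a[1], b[1]):
        if x != y:
            break
        count += 1
    return count


complete = "LMO1.0 - 46.1.1.5.11.1"   # complete code from the deck's line list
partial = "LMO1.0 - 46.1.1.5.11"      # partial code from the deck's line list
print(shared_digits(complete, partial))
print(shared_digits(complete, "FAILED QC: CORE"))
```

A partial code simply has fewer digits, so the comparison stops at its last digit.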

  • Cluster Detection Method: Find Clusters Tool

    Note: the below recommendations are for Listeria

    1. Look at the allele code up to the 5th digit (~7 allele differences)

    2. Include human entries only when looking for clusters of cases

    3. Cluster size: dependent on lab

       a) National level, we look at 3
       b) Local level may need to change to 2, for example

    4. Use the last 120 days for Listeria

    5. Found clusters are displayed below with allele code and number of cases

    Note: Defaults can be changed to match on any digit in the allele code (1st, 2nd, 3rd, 4th, 5th, or 6th), historical cases, non-human, cluster size, and/or number of days to check
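The logic these settings describe (match on the first five allele-code digits, human entries only, a 120-day window, a minimum cluster size) can be sketched as a grouping step. This is an illustration of the idea only, not the tool's code. The three human entries come from the example line list in this deck; the food entry, the reference date, and the `find_clusters` name are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_clusters(entries, today, digits=5, days=120, min_size=3, human_only=True):
    """Group recent entries by the first `digits` allele-code digits and
    return the groups that reach min_size."""
    cutoff = today - timedelta(days=days)
    groups = defaultdict(list)
    for e in entries:
        code = e["allele_code"]
        if code.upper().startswith("FAILED QC"):
            continue
        if human_only and e["source_type"] != "Human":
            continue
        if e["upload_date"] < cutoff:
            continue
        scheme, _, rest = code.partition(" - ")
        parts = rest.split(".")
        if len(parts) < digits:
            continue  # partial code does not resolve to the requested digit
        prefix = scheme + " - " + ".".join(parts[:digits])
        groups[prefix].append(e["wgs_id"])
    return {p: sorted(ids) for p, ids in groups.items() if len(ids) >= min_size}


entries = [
    {"wgs_id": "PNUSAL004534", "allele_code": "LMO1.0 - 46.1.1.5.11.1",
     "source_type": "Human", "upload_date": date(2018, 11, 8)},
    {"wgs_id": "PNUSAL004567", "allele_code": "LMO1.0 - 46.1.1.5.11.1",
     "source_type": "Human", "upload_date": date(2018, 11, 28)},
    {"wgs_id": "PNUSAL004633", "allele_code": "LMO1.0 - 46.1.1.5.11",
     "source_type": "Human", "upload_date": date(2018, 12, 27)},
    # Hypothetical non-human entry, included only to show the filter.
    {"wgs_id": "EXAMPLE_FOOD_1", "allele_code": "LMO1.0 - 46.1.1.5.11.1",
     "source_type": "Food", "upload_date": date(2018, 12, 1)},
]
today = date(2019, 1, 31)  # hypothetical reference date
found = find_clusters(entries, today)
print(found)
```

Lowering `min_size` to 2 or setting `human_only=False` mirrors the adjustable defaults described in the note above.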

  • Cluster Detection Method: Find Clusters Tool

    6. Select clusters and choose OK to open a comparison of those entries in BioNumerics

  • Cluster Detection Method: Find Clusters Tool

    Another option: search for the allele code (up to the 4th or 5th digit) of the identified cluster using the “find entries in list” option. Only include the numbers of the allele code.

    Select your identified cluster and create a dendrogram

  • Now that I know how to find clusters, which method should I use?

    This will probably vary by lab, but a combination of methods may be helpful

    FCM
    • FCM all new entries
    • Use the FCM parameters to search within the past 60 or 120 days or within a certain database field
    • Note: may be helpful to narrow down the search by species/serotype

    cgMLST Dendrogram
    • Keep a saved comparison of the past 60 or 120 days
    • Add all new entries to the comparison and create a cgMLST dendrogram
    • Note: may be helpful to save multiple comparisons based on species/serotype

    Allele Codes and Find Clusters Tool
    • Wait for allele code to be assigned to uploaded entry
    • Download allele code
    • Use find clusters tool to perform cluster search

    Allele Codes and FCM
    • Wait for allele code to be assigned to uploaded entry
    • Download allele code
    • Search for closely related entries uploaded in the past 60 or 120 days using the allele code or FCM

  • Query the National Database for Closely Related Matches

    Query a field in the national database to temporarily download allele calls and metadata uploaded by labs other than your own

  • Query the National Database for Closely Related Matches

    Can use “Fast Match Selection Against Complete Server” to find closely related matches to your entry

    Note: if searching by wgMLST, allele differences may be higher

  • Post to SharePoint

    Once a cluster has been detected, post the cluster to SharePoint

    Include key numbers, allele code(s), collection dates, epi information if available

    Bundle files do not need to be posted since all good quality uploads will receive allele codes within 24 hours

    CDC database managers will review postings and respond with a cluster code, line list and sequencing data

    Example cluster code: 1806GAGX6-1

    – 18: Year
    – 06: Month
    – GA: LabID*
    – GX6: Organism Code
    – 1: WGS Cluster #

    *ML is used for multi-state clusters
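A cluster code such as 1806GAGX6-1 can be split into its parts with a regular expression. This sketch assumes a two-digit year, a two-digit month, a two-letter LabID, a three-character organism code and a numeric cluster number. The field widths are inferred from the examples in this deck, not from a published specification, and `parse_cluster_code` is a hypothetical helper.

```python
import re

# Field widths are assumptions based on the example codes in this deck.
CLUSTER_CODE = re.compile(
    r"^(?P<yy>\d{2})(?P<mm>\d{2})(?P<lab>[A-Z]{2})"
    r"(?P<organism>[A-Z0-9]{3})-(?P<number>\d+)$"
)


def parse_cluster_code(code):
    """Split a cluster code into year, month, LabID, organism code and number."""
    m = CLUSTER_CODE.match(code.strip())
    if m is None:
        raise ValueError(f"unrecognized cluster code: {code!r}")
    return {
        "year": 2000 + int(m["yy"]),   # assumes years in the 2000s
        "month": int(m["mm"]),
        "lab_id": m["lab"],
        "organism_code": m["organism"],
        "cluster_number": int(m["number"]),
        "multi_state": m["lab"] == "ML",
    }


parsed = parse_cluster_code("1806GAGX6-1")
print(parsed)
print(parse_cluster_code("1810MLJKX-1")["multi_state"])
```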

  • What should I send to my epidemiologists?

    Reports describing the clusters

    – Number of isolates included
    – Outbreak code (if available) and allele code(s) involved in the cluster
      • Both can be downloaded from the national database
    – Allele differences for the cluster
    – Information regarding any relevant historical matches (past outbreaks, non-human, etc.)
    – Closely related sequences in other states

  • What should I send to my epidemiologists?

    Line lists containing allele codes and relevant demographic information

    – Allele codes will also be available in SEDRIC

    Key           WGS_id        NCBI_ACCESSION  SRR_ID      Allele_code             Outbreak     SourceType  SourceSite
    CO___4201755  PNUSAL004534  SAMN10395990    SRR8175766  LMO1.0 - 46.1.1.5.11.1  1812TXGX6-1  Human       Blood
    CO___4207324  PNUSAL004567  SAMN10486976    SRR8249567  LMO1.0 - 46.1.1.5.11.1  1812TXGX6-1  Human       Blood
    CO___4214485  PNUSAL004633  SAMN10621540    SRR8361094  LMO1.0 - 46.1.1.5.11    1812TXGX6-1  Human       Abdominal Fluid

    PatientAgeYears  PatientSex  SourceCounty  SourceCity  IsolatDate  ReceivedDate  PulseNet_UploadDate
    60               FEMALE                                10/19/2018  10/26/2018    11/8/2018
    66               FEMALE      Harris        Houston     11/6/2018   11/15/2018    11/28/2018
    74               FEMALE                    Houston     11/21/2018  12/7/2018     12/27/2018

  • What should I send to my epidemiologists?

    Dendrograms or similarity matrices exported into PowerPoint or Excel

    – Mark allele differences using BioNumerics
    – Use groups to highlight entries of interest
    – Clade differences if two clusters are closely related or are being investigated together

    [Figure: wgMLST_v3 (Core) dendrogram of 18 isolates in three groups (Cluster #1, Cluster #2, Cluster #3); node labels include 0.0 [0, 1] and 1.0 [0, 3].]

    Similarity matrix export (allele differences; column headings are the last six digits of the PNUSAS IDs):

                 058626 055255 051077 051078 051079 051080 054168 058559 059779 059781 055284 055285 055287 055288 055289 058515 058517 059325
    PNUSAS058626      0     27     27     25     27     27     28     28      0      0     15     15     14     13     14      0      0      0
    PNUSAS055255     27      0      1      1      1      1      3      2     29     28     24     24     19     22     20     25     26     27
    PNUSAS051077     27      1      0      0      0      0      0      1     27     27     23     23     20     21     21     24     26     25
    PNUSAS051078     25      1      0      0      0      0      0      1     25     25     22     22     19     19     20     24     24     24
    PNUSAS051079     27      1      0      0      0      0      2      1     27     27     23     23     20     21     21     24     26     25
    PNUSAS051080     27      1      0      0      0      0      1      1     28     27     23     23     19     22     20     25     26     26
    PNUSAS054168     28      3      0      0      2      1      0      3     30     27     26     25     20     22     21     24     25     27
    PNUSAS058559     28      2      1      1      1      1      3      0     30     28     26     26     21     23     22     26     27     28
    PNUSAS059779      0     29     27     25     27     28     30     30      0      0     16     16     14     14     14      0      0      0
    PNUSAS059781      0     28     27     25     27     27     27     28      0      0     16     17     14     15     14      0      0      1
    PNUSAS055284     15     24     23     22     23     23     26     26     16     16      0      0      0      0      0     14     15     15
    PNUSAS055285     15     24     23     22     23     23     25     26     16     17      0      0      0      0      0     14     15     15
    PNUSAS055287     14     19     20     19     20     19     20     21     14     14      0      0      0      0      0     14     14     14
    PNUSAS055288     13     22     21     19     21     22     22     23     14     15      0      0      0      0      0     13     14     13
    PNUSAS055289     14     20     21     20     21     20     21     22     14     14      0      0      0      0      0     14     14     14
    PNUSAS058515      0     25     24     24     24     25     24     26      0      0     14     14     14     13     14      0      0      0
    PNUSAS058517      0     26     26     24     26     26     25     27      0      0     15     15     14     14     14      0      0      0
    PNUSAS059325      0     27     25     24     25     26     27     28      0      1     15     15     14     13     14      0      0      0

    Clade differences export (median [minimum, maximum]):

                Cluster #1     Cluster #2     Cluster #3
    Cluster #1  0.0 [0, 0]     14.0 [13, 17]  22.0 [19, 26]
    Cluster #2  14.0 [13, 17]  0.0 [0, 1]     27.0 [24, 30]
    Cluster #3  22.0 [19, 26]  27.0 [24, 30]  1.0 [0, 3]

  • What should I send to my epidemiologists?

    Notify epis when new isolates are included

    Exporting weekly dendrograms may not be necessary

    Allele Differences  WGS_id        Key                     Outbreak     PFGE-XbaI-pattern  SourceSite  PatientSex  IsolatDate
    0-3 alleles         PNUSAS060116  TX___TXAML1803331       1810MLJKX-1  JKXX01.0004        Stool       UNKNOWN     10/10/2018
    0-3 alleles         PNUSAS060115  TX___TXAML1803285       1810MLJKX-1  JKXX01.0004        Stool       MALE        10/2/2018
    0-3 alleles         PNUSAS060113  NM___2018028417         1810MLJKX-1  JKXX01.0004        Stool       MALE        10/1/2018
    0-3 alleles         PNUSAS060114  NM___2018029123         1810MLJKX-1  JKXX01.0004        Stool       FEMALE      10/11/2018
    0-3 alleles         PNUSAS060112  NM___2018025313         1810MLJKX-1  JKXX01.0004        Stool       FEMALE      9/6/2018
    0-3 alleles         PNUSAS060106  CA___M18X03091          1810MLJKX-1  JKXX01.0004        Stool       FEMALE      9/14/2018
    0-3 alleles         PNUSAS060107  CA___M18X03166          1810MLJKX-1  JKXX01.0004        Stool       MALE        9/13/2018
    0-3 alleles         PNUSAS060108  CA___M18X03202          1810MLJKX-1  JKXX01.0004        Stool       MALE        9/10/2018
    0-3 alleles         PNUSAS056878  CAOC_BE182500219        1810MLJKX-1  JKXX01.0004        Stool       FEMALE      9/7/2018
    0-3 alleles         PNUSAS058362  CAOC_BE182530240        1810MLJKX-1  JKXX01.0004        Stool       FEMALE      9/10/2018
    0-3 alleles         PNUSAS060279  CAOC_BE182900285        1810MLJKX-1  JKXX01.0004        Stool       FEMALE      10/17/2018
    0-3 alleles         PNUSAS060109  LAC__T3729_Salmonella   1810MLJKX-1  JKXX01.0004        Stool       FEMALE      9/27/2018
    0-3 alleles         PNUSAS060110  LAC__T4351_Salmonella   1810MLJKX-1  JKXX01.0004        Stool       FEMALE      10/5/2018
    0-3 alleles         PNUSAS058365  CAOC_BE182540245        1810MLJKX-1  JKXX01.0004        Stool       MALE        9/11/2018
    0-3 alleles         PNUSAS060111  LAC__W16906_Salmonella  1810MLJKX-1  JKXX01.0004        Stool       MALE        9/14/2018

  • For more information, contact CDC
    1-800-CDC-INFO (232-4636)
    TTY: 1-888-232-6348
    www.cdc.gov

    The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

    Thank you

    Telephone: 404-639-4558
    E-mail: [email protected]
    Web: www.cdc.gov/pulsenet

    #PulseNet
