33
NCBI Pathogen Detection Pipeline Jessie Marus Logan Fink

NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

NCBI Pathogen Detection

PipelineJessie Marus

Logan Fink

Page 2: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

What is the NCBI pathogen detection pipeline?

NCBI has developed a pipeline that analyzes short read sequence data to

determine the nearest reference genome, builds a combined assembly from a

suite of third-party de novo assemblers as well as a reference-assisted assembly,

and generates a phylogenetic distance tree based on kmers (available on FTP).

SNP comparisons are done to find clusters of closely related isolates with isolates

placed in clusters by single-linkage clusters with a maximum 50 SNP distance.

Individual phylogenetic trees for each SNP clusters are available on FTP as well as

the NCBI Pathogen Detection Isolates Browser. Details of the analysis system will

be published at a future date.

Page 3: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 4: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 5: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Restaurant

Page 6: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Later at home...

Page 7: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 8: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

We start seeing cases come through the lab….

PNUSAS032085

PNUSAS032086

Page 9: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 10: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 11: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 12: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 13: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 14: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 15: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 16: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 17: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 18: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 19: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 20: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 21: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 22: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Downloads

Page 23: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 24: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 25: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Search a window of time

Page 26: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 27: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest
Page 28: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Tips, tricks, and best

practices for using NCBI

Page 29: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Queries and searches

taxgroup_name: “name of pathogen” - select the organism

“Salmonella enterica”

“E.coli and Shigella”

“Campylobacter jejuni”

“Listeria monocytogenes”

new:1 - specify only new added isolates

mindiff:[0 to 5] - specify the range of SNP differences between any clinical or

food/environmental isolate (brackets for ranges, or just the number)

minsame:0 - specify SNP difference between isolates of the same type

Geo_loc_name: - specify geographic location (usually country)

AMR_genotype: - specify AMR genes

*You can save queries to be notified of new isolates!

Page 30: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Searching in NCBI

If you’re not sure the exact name of a property you want to search on, you can

hover over the name of that column in the isolates browser and it will prompt you

on how to search that variable

You can also do open searches, or refine them slightly. For example:

taxgroup_name:"Listeria monocytogenes" AND cheese

Click on the column header to sort or reverse sort

When trying to look up a specific isolate, the easiest way is to just search the

WGS ID (PNUSA#)

Page 31: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Things to keep in mind

Dates in the tree - date of upload into NCBI, not necessarily indicative of isolation

date

*Cross reference in SEDRIC or other database

To highlight a clade on the tree and see SNP differences for that clade - use

subtree view or select all leaves

Saved queries - information overload!

Page 32: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Questions?

Page 33: NCBI Pathogen Detection Pipeline - APHL...What is the NCBI pathogen detection pipeline? NCBI has developed a pipeline that analyzes short read sequence data to determine the nearest

Try it yourself!

Pull up NCBI Pathogen Detection Pipeline - Isolates Browser

Search for an isolate of interest, see if there is a SNP cluster

Try a query - play around with a few different search functions