João André Carriço [email protected] Twitter: @jacarrico

Eccmid meet the expert 2015

Page 1: Eccmid meet the expert 2015

João André Carriço [email protected] Twitter: @jacarrico

Page 2: Eccmid meet the expert 2015

Microbial typing: discriminating strains below species/subspecies level

Genomics: antibiotic resistance / virulence factor gene presence/absence, mobile genetic element detection

Page 3: Eccmid meet the expert 2015

http://en.wikipedia.org/wiki/File:ChronicleOfADeathForetold.JPG

WGS in molecular typing:

Gene-by-gene: wgMLST, cgMLST, rMLST, MLST, eMLST, MLST+

SNP comparison approaches: comparison with reference strains

Ability to recover most current sequence-based typing information in a single experimental procedure

Page 4: Eccmid meet the expert 2015

The Ideal Scenario

Microbiological sample → Magic Box of NGS Wonders for Microbiology → Completely characterized strain:

• Antibiotic resistance profile
• Multilocus Sequence Typing (MLST)
• Virulence factors present
• Other SBTM information, e.g.:
  • spa (S. aureus)
  • emm (Group A Streptococcus)

Desired end result:

• Risk assessment of the strain
• Useful application of the data to clinical practice
• Comparison between groups of strains

Page 5: Eccmid meet the expert 2015

https://pmcvariety.files.wordpress.com/2014/06/eli-wallach-dead-good-bad-ugly.jpg?w=670&h=377&crop=1

Page 6: Eccmid meet the expert 2015

My goals / areas that I want to apply WGS to:

• Microbial population structure
• Microbial evolution
• Microbial genomics: gene structure, genome synteny, mobile genetic element detection

My toolbox is chosen based on my questions and what I want to do!

Trying to avoid: “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” - Abraham H. Maslow (1966), The Psychology of Science

Page 7: Eccmid meet the expert 2015

Sequence QA/QC:
FastQC http://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Adaptor and quality trimming:
Trimmomatic http://www.usadellab.org/cms/?page=trimmomatic

Assembly:
SPAdes http://bioinf.spbau.ru/spades
Velvet http://www.ebi.ac.uk/~zerbino/velvet/

Mapping:
Bowtie2 http://bowtie-bio.sourceforge.net/bowtie2/index.shtml

Annotation:
Prokka http://www.vicbioinformatics.com/software.prokka.shtml

Whole genome comparison:
BRIG (BLAST Ring Image Generator) http://brig.sourceforge.net/
MAUVE http://darlinglab.org/mauve/mauve.html

Page 8: Eccmid meet the expert 2015

http://rugbyea.com/wp-content/uploads/2013/05/blast.jpg

http://www.ecohealthypets.com/writable/pet_report_photos/photo/480x/ball_python_2.jpg

Page 9: Eccmid meet the expert 2015
Page 10: Eccmid meet the expert 2015

- Perform the same analysis over tens, hundreds or thousands of strains: your own and publicly available ones

- Integrate multiple analyses in a single pipeline

- Pipelines = reproducibility (if not, something is very wrong)

http://www.ebi.ac.uk/ena

http://www.ncbi.nlm.nih.gov/sra
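
As an illustration of running the same steps over many strains, here is a minimal Python sketch that builds identical command lines for every sample. The function names, the sample dictionary and the output layout are assumptions made for this example, not part of the talk. Trimming and mapping are left out to keep it short, and the commands are only constructed, not executed.

```python
from pathlib import Path


def build_commands(sample, reads_1, reads_2, out_root):
    """Return the ordered command lines for one paired-end sample."""
    out = Path(out_root) / sample
    qc_dir = out / "fastqc"      # must exist before FastQC is run
    asm_dir = out / "spades"
    ann_dir = out / "prokka"
    return [
        # Read quality report
        ["fastqc", "-o", str(qc_dir), str(reads_1), str(reads_2)],
        # De novo assembly of the paired reads
        ["spades.py", "-1", str(reads_1), "-2", str(reads_2), "-o", str(asm_dir)],
        # Annotation of the contigs that SPAdes writes to its output directory
        ["prokka", "--outdir", str(ann_dir), "--prefix", sample,
         str(asm_dir / "contigs.fasta")],
    ]


def plan_batch(samples, out_root="results"):
    """Apply exactly the same steps to every sample, in a fixed order."""
    return {name: build_commands(name, r1, r2, out_root)
            for name, (r1, r2) in sorted(samples.items())}


# Hypothetical sample sheet: name -> (forward reads, reverse reads)
samples = {
    "strainB": ("strainB_1.fastq.gz", "strainB_2.fastq.gz"),
    "strainA": ("strainA_1.fastq.gz", "strainA_2.fastq.gz"),
}
for name, commands in plan_batch(samples).items():
    for command in commands:
        print(" ".join(command))
```

To actually run the plan, each command list could be passed to `subprocess.run(command, check=True)`. Keeping the plan as data makes it easy to log exactly what was run for each strain.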

Page 11: Eccmid meet the expert 2015

Gene-by-gene / extended MLST approaches are my favorite

Why?

- Allele-based classification “buffers” the effect of recombination in the analysis

- Stable nomenclature for alleles facilitates data exchange through schema creation

- Easy to expand and visualize up to thousands of genomes with MST-like approaches

- Lower computing requirements
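
The “buffering” effect can be shown with a toy example. The sketch below uses invented sequences and assumes equal-length, pre-aligned loci. It compares a per-nucleotide distance with a per-locus (allele) distance: a recombination event that changes several positions in one locus counts as several SNPs but only as one allele difference.

```python
def snp_distance(genome_a, genome_b):
    """Count differing nucleotide positions across all loci (aligned, equal length)."""
    total = 0
    for locus in genome_a:
        total += sum(1 for x, y in zip(genome_a[locus], genome_b[locus]) if x != y)
    return total


def allele_distance(genome_a, genome_b):
    """Count loci whose sequences differ at all, however many positions changed."""
    return sum(1 for locus in genome_a if genome_a[locus] != genome_b[locus])


# Invented toy sequences, three loci per strain
strain_1 = {"locusA": "ATGGCTAAAGGT", "locusB": "ATGCCCGGGTTT", "locusC": "ATGAAACCCGGG"}
# A single point mutation in locusA
strain_2 = dict(strain_1, locusA="ATGGCTAAAGGC")
# A recombination event replacing part of locusB (five positions change at once)
strain_3 = dict(strain_1, locusB="ATGCTTAGATCT")

print(snp_distance(strain_1, strain_2), allele_distance(strain_1, strain_2))  # 1 1
print(snp_distance(strain_1, strain_3), allele_distance(strain_1, strain_3))  # 5 1
```

Both events are one step apart in allele terms, whereas the SNP count would place the recombinant strain five times further away.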

Page 12: Eccmid meet the expert 2015

Bacterial Isolate Genome Sequence Database (BIGSdb): Jolley & Maiden 2010, BMC Bioinformatics 11:595

http://pubmlst.org/software/database/bigsdb/

PROs: Freely available, open source, handles thousands of genomes, has MLST schemas implemented for several bacterial species, plus some extended MLST and core genome MLST schemas (mainly Neisseria sp., but soon to be expanded)

CONs: Requires Perl knowledge to install and maintain

Ridom SeqSphere+ http://www.ridom.com/seqsphere/
Commercial software with client-server solutions from assembly to allele calling and visualization for core genome MLST (MLST+/cgMLST)

Applied Maths - BioNumerics 7.5 http://www.applied-maths.com/news/bionumerics-version-75-released
Commercial software with client-server solutions from assembly to allele calling and visualization for whole genome MLST (wgMLST)

Page 13: Eccmid meet the expert 2015

Schema = set of loci to be used

What is a locus? A gene or part of a gene

How to choose the loci:

1. Start from reference genomes
2. Decide if you want core genes only or core + accessory genes
3. Use a method to compare the CDS/ORFs of the reference genomes:
   1. OrthoMCL - www.orthomcl.org
   2. CD-HIT - cd-hit.org
4. Parse the output to:
   1. Remove paralogous genes
   2. Decide which are core genes and which are accessory genes
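
The parsing step can be sketched in Python as follows. The input format used here, one ortholog group per line written as `group: genome|gene genome|gene ...`, is an assumption for illustration, so check the actual output of the clustering tool you use. The `core_fraction` parameter and the function names are also hypothetical.

```python
from collections import Counter


def parse_groups(lines):
    """Parse 'group: genome|gene ...' lines into {group: [(genome, gene), ...]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, members = line.split(":", 1)
        groups[name.strip()] = [tuple(m.split("|", 1)) for m in members.split()]
    return groups


def classify_loci(groups, genomes, core_fraction=1.0):
    """Split groups into core and accessory loci, setting aside groups with paralogs."""
    core, accessory, paralogous = [], [], []
    for name, members in groups.items():
        copies = Counter(genome for genome, _gene in members)
        if any(count > 1 for count in copies.values()):
            paralogous.append(name)      # more than one copy in at least one genome
        elif len(copies) >= core_fraction * len(genomes):
            core.append(name)            # present in (nearly) all reference genomes
        else:
            accessory.append(name)
    return core, accessory, paralogous


# Invented example with three reference genomes A, B and C
lines = [
    "g1: A|a1 B|b1 C|c1",
    "g2: A|a2 B|b2",
    "g3: A|a3 A|a4 B|b3 C|c3",
]
core, accessory, paralogous = classify_loci(parse_groups(lines), genomes=["A", "B", "C"])
print(core, accessory, paralogous)  # ['g1'] ['g2'] ['g3']
```

Lowering `core_fraction` (for example to 0.95) is one way to tolerate genes missing from a few draft assemblies. Whether that is appropriate depends on the question being asked.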

Page 14: Eccmid meet the expert 2015

At this point different algorithms/software use:

- BLAST (n/p/x)
- Different criteria and parameters to call an allele as a coding sequence or part of a coding sequence

Page 15: Eccmid meet the expert 2015

Allele calling workflow (flowchart):

Gene file branch:

1. Translate gene file to protein
2. Self BLAST: calculate BSR
3. Build the gene BLAST database

Genome branch:

1. Run Prodigal on genome
2. Translate CDS to protein
3. BLAST against the gene BLAST database
4. Calculate BSR

Decision for each locus:

- No BLAST match or BSR <= 0.6: LNF
- BSR = 1 and same DNA sequence: Exact Match
- LOT?: LOT
- BSR > 0.6: Inferred allele. Add new allele to gene file, calculate BSR of the new allele, re-do the gene BLAST database

Output: allelic profile

Prodigal: Prokaryotic Dynamic Programming Gene-finding Algorithm

BSR: BLAST Score Ratio

LOT: Locus On the Tip (of a contig)

LNF: Locus Not Found
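
The BSR-based decision can be sketched in Python as below. BSR is taken here as the BLAST score of the query against the reference allele divided by the score of the reference allele against itself. The `Hit` class, its field names and the order in which the LOT and exact-match checks are applied are assumptions for this example; only the 0.6 threshold and the outcome labels come from the flowchart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    query_score: float    # BLAST score of the genome CDS against the reference allele
    self_score: float     # BLAST score of the reference allele against itself
    same_dna: bool        # True if the DNA sequence is identical to a known allele
    on_contig_tip: bool   # True if the locus sits at the end of a contig


def classify(hit: Optional[Hit], threshold: float = 0.6) -> str:
    """Return LNF, LOT, EXACT or INFERRED for one locus."""
    if hit is None:
        return "LNF"                      # no BLAST match at all
    bsr = hit.query_score / hit.self_score
    if bsr <= threshold:
        return "LNF"                      # match too weak to call the locus
    if hit.on_contig_tip:
        return "LOT"                      # possibly truncated by the contig end
    if bsr == 1.0 and hit.same_dna:
        return "EXACT"                    # known allele
    return "INFERRED"                     # new allele, to be added to the gene file


print(classify(None))                                                    # LNF
print(classify(Hit(250, 500, same_dna=False, on_contig_tip=False)))      # LNF
print(classify(Hit(500, 500, same_dna=True, on_contig_tip=False)))       # EXACT
print(classify(Hit(450, 500, same_dna=False, on_contig_tip=False)))      # INFERRED
print(classify(Hit(450, 500, same_dna=False, on_contig_tip=True)))       # LOT
```

A locus with BSR = 1 but a different DNA sequence (a synonymous change) falls through to INFERRED, which matches the flowchart's "BSR > 0.6" branch.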

Page 16: Eccmid meet the expert 2015

Core Genome addressing synteny:

Page 17: Eccmid meet the expert 2015

Core Genome Addressing synteny and paralogy:

Page 18: Eccmid meet the expert 2015

http://www.phyloviz.net

Open source and Freely available!

Page 19: Eccmid meet the expert 2015

Can be easily applied to:

- MLST
- MLVA
- SNP data*
- Gene presence/absence

*Conversion of VCF to PHYLOViZ: https://github.com/nickloman/misc-genomics-tools/blob/master/scripts/vcf2phyloviz.py (Thanks Nick!)
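
All of these data types can be reduced to a table of profiles, one row per strain and one column per locus or character. The Python sketch below turns gene presence/absence data into a tab-separated table with a header row and an identifier column. This layout is an assumption about the general shape of profile files, so check the documentation of the visualization tool for its exact input requirements. The gene names and function names are invented for the example.

```python
import csv
import io


def presence_absence_profiles(gene_content):
    """Turn {strain: set of genes} into a header and 0/1 rows, one column per gene."""
    genes = sorted(set().union(*gene_content.values()))
    header = ["ID"] + genes
    rows = [[strain] + [1 if gene in gene_content[strain] else 0 for gene in genes]
            for strain in sorted(gene_content)]
    return header, rows


def to_tsv(header, rows):
    """Serialize the profile table as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented example: two strains and two resistance genes
gene_content = {"s1": {"mecA", "blaZ"}, "s2": {"blaZ"}}
header, rows = presence_absence_profiles(gene_content)
print(to_tsv(header, rows), end="")
```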

Page 20: Eccmid meet the expert 2015

PROs:

- Handles thousands of profiles
- Fast calculation
- Easy to annotate and explore metadata
- Allows for basic statistics on profiles and metadata
- Allows for advanced statistics on MSTs (PLoS One. 2015 Mar 23;10(3):e0119315)
- Exports high quality graphical formats
- Allows plugin development

CONs:

- goeBURST and goeBURST MST only (Neighbour Joining and UPGMA soon)
- Java knowledge needed to code new plugins
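
For readers unfamiliar with MST-based views of allelic profiles, here is a minimal Python sketch of a plain minimum spanning tree built with Prim's algorithm over Hamming distances. It is not goeBURST, which applies its own tie-breaking rules when choosing links. The profiles are invented, and this simple version is only practical for small data sets.

```python
def hamming(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (node_in_tree, new_node, distance) edges."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge leaving the tree; ties are broken by node name
        dist, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree for v in names if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Invented four-locus profiles
profiles = {
    "ST1": (1, 1, 1, 1),
    "ST2": (1, 1, 1, 2),
    "ST3": (1, 1, 2, 2),
    "ST4": (3, 3, 3, 3),
}
for edge in minimum_spanning_tree(profiles):
    print(edge)
```

ST4 differs from every other profile at all four loci, so which node it attaches to is decided purely by the tie-break. This is exactly the kind of ambiguity that dedicated tie-breaking rules are meant to resolve.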

Page 21: Eccmid meet the expert 2015

MEGA (http://www.megasoftware.net/)

SplitsTree (http://www.splitstree.org/)

Geneious (http://www.geneious.com/): multipurpose commercial software, very useful for sequence alignment visualization, tree building and annotation visualization

Page 22: Eccmid meet the expert 2015

No need to take sides when choosing an approach. Gene-by-gene, SNP and k-mer methods should be used depending on the problem at hand and the questions being asked.

Tools and sequencing methodologies are still evolving, which makes easy-to-use “big red button” approaches difficult to implement.

Beware of differences in software/algorithm versions, which can lead to different results.

Always be critical of your results, and try to understand whether you have a nail or a screw before picking up the hammer at hand.

Page 23: Eccmid meet the expert 2015

UMMI members: Mickael Silva, Sergio Santos, Bruno Gonçalves, Adriana Policarpo, Mário Ramirez, José Melo-Cristino

FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/): Dag Harmsen (Univ. Muenster), Stefan Niemann (Research Center Borstel), Keith Jolley, James Bray and Martin Maiden (Univ. Oxford), Joerg Rothganger (RIDOM), Hannes Pouseele (Applied Maths)

Genome Canada IRIDA (Integrated Rapid Infectious Disease Analysis) project (www.irida.ca): Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC), Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC), Fiona Brinkman (SFU), William Hsiao (BCCDC)

INESC-ID members: Alexandre Francisco, Cátia Vaz, Pedro Tiago Monteiro

Twitter microbial bioinf community: Nick Loman, Torsten Seemann, Will van Schaik, Mick Watson, Jennifer Gardy and many, many others

Page 24: Eccmid meet the expert 2015

Draft Scientific Programme:

Plenaries:

1) Small Scale Microbial Epidemiology

2) Large Scale Microbial Epidemiology

3) Bioinformatics for Genome-based Microbial Epidemiology

4) Population Genetics: Pathogen Emergence

5) Population Dynamics: Transmission networks and surveillance

6) Molecular Epidemiology for Global Health and One Health

Parallel Sessions

1) Food and Environmental pathogens

2) Microbial Forensics

3) Virus

4) Fungi and Yeasts

5) Novel Diagnostics methodologies

6) Novel Typing approaches

7) Phylogenetic Inference

8) Interactive Illustration Platforms