Workshop 11: Metagenomics Analysis
Shi, Baochen, Department of Pharmacology, UCLA
Flowchart
1. SFF (raw 454 data, optional)
2. fasta/qual files
3. demultiplexing/quality filtering
4. OTU picking
5. representative sequences
6. taxonomic assignments/tree building
7. OTU table and downstream processing

(b) Sequence data preparation
(c) Operational Taxonomic Units (OTU) picking, taxonomic assignment & inferring phylogeny
(d) microbiome diversity analyses
d) microbiome diversity analyses
This workflow consists of the following steps:

alpha diversity (microbial community evenness and richness)
d1) Generate rarefied OTU tables (multiple_rarefactions.py)
d2) Compute measures of alpha diversity for each rarefied OTU table (alpha_diversity.py)
d3) Collate alpha diversity results (collate_alpha.py)
d4) Generate alpha rarefaction plots (make_rarefaction_plots.py)

beta diversity (similarity between individual microbial communities)
d5) Rarefy OTU table to remove sampling depth heterogeneity (single_rarefaction.py)
d6) Compute beta diversity (beta_diversity.py)
d7) Run Principal Coordinates Analysis (principal_coordinates.py)
d8) Generate PCoA plots (make_3d_plots.py or make_2d_plots.py)
d9) Statistical analyses

d) microbiome diversity analyses: alpha diversity (microbial community evenness and richness, or the within-sample diversity)
Alpha diversity measures in QIIME (http://scikit-bio.org/docs/latest/generated/skbio.diversity.alpha.html)
A number of alpha diversity metrics are currently supported in QIIME.
non-phylogenetic: Shannon-Wiener diversity index
alpha_diversity.py -s
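
To make the non-phylogenetic metrics concrete, here is a minimal Python sketch of observed species (richness) and the Shannon-Wiener index. It is not QIIME's implementation: the function names and the sample counts are hypothetical, and the choice of log base 2 is an assumption made for this illustration.

```python
import math

def observed_species(counts):
    """Richness: number of OTUs with at least one sequence in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts, base=2):
    """Shannon-Wiener index H = -sum(p_i * log(p_i)) over the OTUs present."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this OTU
            h -= p * math.log(p, base)
    return h

# Hypothetical OTU counts for two samples with the same richness.
even_sample = [25, 25, 25, 25]     # perfectly even community
skewed_sample = [97, 1, 1, 1]      # one dominant OTU

print(observed_species(even_sample), shannon(even_sample))
print(observed_species(skewed_sample), shannon(skewed_sample))
```

Both samples contain four OTUs, but the even sample scores higher on the Shannon index, which is how the index captures evenness as well as richness.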

phylogenetic: Phylogenetic Diversity (PD)
• Sum of the branch lengths leading to the sequences in a sample
• The sample whose lineages span the most branch length in the tree contains the most phylogenetically diverse community
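
The branch-length sum described above can be sketched in a few lines of Python. This is an illustration only: the tree, its node names, and the dictionary layout are hypothetical, and the sketch assumes that the paths are summed all the way to the root.

```python
def faith_pd(parent, present_tips):
    """Sum branch lengths on the paths from each observed tip to the root.

    parent maps node -> (parent_node, branch_length); the root maps to
    (None, 0.0). Each branch is counted once, however many tips share it.
    """
    visited = set()
    total = 0.0
    for tip in present_tips:
        node = tip
        while node is not None and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total

# Hypothetical tree: ((A:1,B:1)AB:2,(C:1,D:1)CD:2)root
tree = {
    "root": (None, 0.0),
    "AB": ("root", 2.0), "CD": ("root", 2.0),
    "A": ("AB", 1.0), "B": ("AB", 1.0),
    "C": ("CD", 1.0), "D": ("CD", 1.0),
}

print(faith_pd(tree, ["A", "B"]))  # two closely related tips
print(faith_pd(tree, ["A", "C"]))  # two distantly related tips
```

Both samples contain two taxa, but the sample drawing from both clades covers more of the tree and therefore has the larger PD.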

rarefaction curve: phylogenetic & non-phylogenetic
• Comparison of alpha diversity between samples: 16S rRNA gene surveys reveal reduced diversity on men's palm surfaces (Fierer et al., PNAS, 2008)
• The shape of the curve allows estimation of how far we are from observing all of the alpha diversity in the samples

d1) Generate rarefied OTU tables
Perform multiple subsamplings on an OTU table.
-m, --min  Minimum number of seqs/sample for rarefaction.
-x, --max  Maximum number of seqs/sample (inclusive) for rarefaction.
-s, --step  Size of each step between the min/max of seqs/sample (e.g. min, min+step ... for level <= max).
-n, --num_reps  The number of iterations at each step. [default: 10]
Any sample containing fewer sequences in the input file than the requested number of sequences per sample is removed from the output rarefied OTU table.
--max should not be more than the number of sequences in the sample with the most coverage/depth.
Output files are named rarefaction_##_#.txt: the first set of numbers represents the number of sequences sampled, and the last number represents the iteration number. In each sample the sum of the counts equals the number of sequences sampled.
multiple_rarefactions.py -i otu_table.biom -m 100 -x 140 -s 5 -n 2 -o rarefied_otu_tables/
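
The subsampling logic can be illustrated with a short Python sketch. It is a simplified stand-in for the script, not its actual code: the function names, the in-memory table layout, and the toy counts are all hypothetical, and sampling without replacement is an assumption of this sketch.

```python
import random

def rarefy(counts, depth, rng):
    """Draw `depth` sequences without replacement from one sample.

    counts maps OTU id -> sequence count. Returns None when the sample
    holds fewer than `depth` sequences, so the caller can drop it.
    """
    if sum(counts.values()) < depth:
        return None
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rarefied = {otu: 0 for otu in counts}
    for otu in rng.sample(pool, depth):
        rarefied[otu] += 1
    return rarefied

def multiple_rarefactions(table, min_depth, max_depth, step, reps, seed=0):
    """Return {(depth, iteration): rarefied table} for every depth and rep."""
    rng = random.Random(seed)
    out = {}
    for depth in range(min_depth, max_depth + 1, step):
        for rep in range(reps):
            rarefied_table = {}
            for sample, counts in table.items():
                r = rarefy(counts, depth, rng)
                if r is not None:          # too-shallow samples are removed
                    rarefied_table[sample] = r
            out[(depth, rep)] = rarefied_table
    return out

# Hypothetical OTU table: S1 has 150 sequences, S2 has 110.
table = {"S1": {"otu1": 90, "otu2": 60}, "S2": {"otu1": 20, "otu2": 90}}
tables = multiple_rarefactions(table, 100, 140, 5, 2)
print(len(tables), sorted(tables[(115, 0)]))
```

With the same min, max, step, and repetition values as the command above, the sketch yields one table per depth and iteration, and S2 disappears from every table deeper than its 110 sequences.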

d2) Compute measures of alpha diversity for each rarefied OTU table
This script processes a single OTU table:
alpha_diversity.py -i otu_table.biom -m observed_species,shannon,PD_whole_tree -o alpha_div.txt -t rep_phylo.tre
This script processes multiple OTU tables in the given folder:
alpha_diversity.py -i rarefied_otu_tables/ -m observed_species,shannon,PD_whole_tree -o rarefied_alpha_diversity/ -t rep_phylo.tre

d3) Collate alpha diversity results
Produces one file for every alpha diversity metric used.
collate_alpha.py -i rarefied_alpha_diversity/ -o rarefied_alpha_diversity_summary/

d4) Generate alpha rarefaction plots
make_rarefaction_plots.py -i rarefied_alpha_diversity_summary/ --generate_average_tables --generate_per_sample_plots -m Fasting_Map.txt -o rarefied_alpha_plot/

d) microbiome diversity analyses: beta diversity (similarity between individual microbial communities)
Beta diversity metrics assess the differences between microbial communities. The fundamental output of these comparisons is a square matrix in which a "distance" or dissimilarity is calculated between every pair of community samples. The data in this distance matrix can be visualized with analyses such as Principal Coordinates Analysis (PCoA) and hierarchical clustering. As with alpha diversity, many beta diversity metrics can be calculated with QIIME.
Beta diversity measures: phylogenetic & non-phylogenetic
phylogenetic measures: weighted & unweighted UniFrac, which are used extensively in recent projects.
beta_diversity.py -s

Unique Fraction (UniFrac) metric
• A branch length-based, qualitative phylogenetic β diversity measure
• Distance = fraction of the total branch length that is unique to any one sample
Appl Environ Microbiol. (2005) 71(12):8228-35
• Qualitative: unweighted UniFrac is sensitive to factors that affect presence/absence
• Quantitative: weighted UniFrac is sensitive to factors that affect relative abundance
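
The qualitative (unweighted) UniFrac definition, the fraction of branch length unique to one sample, can be sketched as follows. This is a toy illustration rather than QIIME's implementation; the tree, node names, and helper functions are hypothetical, and branches are traced from each observed tip up to the root.

```python
def branches_to_root(parent, tips):
    """Set of nodes whose incoming branch lies on a path from a tip to the root."""
    seen = set()
    for tip in tips:
        node = tip
        while node is not None and node not in seen:
            seen.add(node)
            node = parent[node][0]
    return seen

def unweighted_unifrac(parent, tips_a, tips_b):
    """Unique branch length divided by total branch length covered by both samples."""
    a = branches_to_root(parent, tips_a)
    b = branches_to_root(parent, tips_b)
    total = sum(parent[n][1] for n in a | b)    # branches seen in either sample
    unique = sum(parent[n][1] for n in a ^ b)   # branches seen in exactly one
    return unique / total if total else 0.0

# Hypothetical tree: ((A:1,B:1)AB:2,(C:1,D:1)CD:2)root
tree = {
    "root": (None, 0.0),
    "AB": ("root", 2.0), "CD": ("root", 2.0),
    "A": ("AB", 1.0), "B": ("AB", 1.0),
    "C": ("CD", 1.0), "D": ("CD", 1.0),
}

print(unweighted_unifrac(tree, ["A", "B"], ["A", "B"]))  # identical membership
print(unweighted_unifrac(tree, ["A", "B"], ["B", "C"]))  # partial overlap
print(unweighted_unifrac(tree, ["A", "B"], ["C", "D"]))  # no shared lineages
```

Only presence/absence enters the calculation, which is why the unweighted form responds to membership changes and ignores abundance shifts. The weighted form would additionally scale each branch by the difference in relative abundance below it.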

d5) Rarefy OTU table to remove sampling depth heterogeneity (optional)
To compare samples at equal sequencing depth, single_rarefaction.py creates a subsampled OTU table by random sampling of the input OTU table. Samples that have fewer sequences than the requested rarefaction depth are omitted.
-d, --depth  Number of sequences to subsample per sample.
This is a one-time subsampling of the OTU table, which is different from making a rarefaction curve for alpha diversity.
One-time subsampling (this step):
single_rarefaction.py -i otu_table.biom -o otu_table_100.biom -d 100
Repeated subsampling for alpha rarefaction curves (step d1, shown for comparison):
multiple_rarefactions.py -i otu_table.biom -m 100 -x 140 -s 5 -n 2 -o rarefied_otu_tables/

d6) Compute beta diversity
Single file beta diversity (non-phylogenetic):
beta_diversity.py -i otu_table_100.biom -m bray_curtis -o beta_div
Single file beta diversity (phylogenetic):
beta_diversity.py -i otu_table_100.biom -m weighted_unifrac,unweighted_unifrac -o beta_div -t rep_phylo.tre
Multiple file (batch) beta diversity (phylogenetic):
beta_diversity.py -i otu_tables/ -m weighted_unifrac,unweighted_unifrac -o beta_div/ -t rep_phylo.tre
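
As an illustration of the non-phylogenetic option, the Bray-Curtis dissimilarity and the resulting square distance matrix can be sketched in plain Python. The sample names and counts are hypothetical, and the helper functions are written for this example rather than taken from QIIME.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def distance_matrix(samples, metric):
    """Square matrix of pairwise distances, rows/columns in sorted sample order."""
    ids = sorted(samples)
    matrix = [[metric(samples[i], samples[j]) for j in ids] for i in ids]
    return ids, matrix

# Hypothetical rarefied counts (same OTU order in every sample).
samples = {
    "S1": [6, 4, 0],
    "S2": [4, 6, 0],
    "S3": [0, 0, 10],
}
ids, dm = distance_matrix(samples, bray_curtis)
for sample_id, row in zip(ids, dm):
    print(sample_id, row)
```

The matrix is symmetric with a zero diagonal. S3 shares no OTUs with the other samples and therefore sits at the maximum dissimilarity of 1 from both.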

d7) Run Principal Coordinates Analysis
PCoA is a technique that helps to extract and visualize a few highly informative components of variation from complex, multi-dimensional data. It is a transformation that maps the samples present in the distance matrix to a new set of orthogonal axes such that a maximum amount of variation is explained by the first principal coordinate, the next largest amount by the second, and so on. The principal coordinates can be plotted in two or three dimensions to provide an intuitive visualization of differences between samples.
principal_coordinates.py -i beta_div/ -o pcoa/
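
The transformation behind PCoA can be sketched without any external library. The sketch below double-centres the squared distances and uses power iteration to recover only the first principal coordinate. This is a teaching simplification and an assumption of this example: a real implementation would use a full eigendecomposition, and power iteration as written assumes the dominant eigenvalue is positive. All names are hypothetical.

```python
import math

def gower_center(dist):
    """Double-centre the matrix of -1/2 * squared distances."""
    n = len(dist)
    sq = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]          # row means (equal to column means)
    grand = sum(row) / n
    return [[sq[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def first_principal_coordinate(dist, iterations=500):
    """Return (coordinates on axis 1, eigenvalue) via power iteration."""
    b = gower_center(dist)
    n = len(b)
    v = [float(i + 1) for i in range(n)]    # arbitrary non-constant start
    eigval = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigval = norm                       # |B v| for unit v approaches the eigenvalue
    return [x * math.sqrt(eigval) for x in v], eigval

# Hypothetical distances between three samples lying on a line at 0, 1 and 3.
dist = [[0.0, 1.0, 3.0],
        [1.0, 0.0, 2.0],
        [3.0, 2.0, 0.0]]
coords, eigval = first_principal_coordinate(dist)
print(coords, eigval)
```

Because the three toy samples lie on a single line, the first axis alone reproduces every pairwise distance. With real community data several axes are needed, and each eigenvalue indicates how much variation its axis explains.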

d8) Generate PCoA plots
Make 2D PCoA plots:
make_2d_plots.py -i pcoa/pcoa_weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -o 2d_plots/
Color by a specific category:
make_2d_plots.py -i pcoa/pcoa_weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -o 2d_plots/ -b "Treatment"
Color by any combination of categories:
make_2d_plots.py -i pcoa/pcoa_weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -b "Treatment&&DOB" -o 2d_plots/
Make 3D PCoA plots:
make_emperor.py -i pcoa/pcoa_weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -o 3d_plots/

d9.1) Statistical analyses: creating distance comparison plots
Plotting within and between distances:
make_distance_boxplots.py -d beta_div/weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -o ./ -f 'Treatment' --save_raw_data
Comparisons are based on a two-sided Student's two-sample t-test.
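
The within/between comparison can be sketched as two steps: split the pairwise distances by whether both samples share a group, then compute a two-sample Student's t statistic. This is an illustration under stated assumptions, not the script's code: the sample ids, group labels, and distances are hypothetical, a pooled variance is assumed, and no p-value is computed.

```python
import math
from itertools import combinations
from statistics import mean, variance

def split_distances(ids, dist, groups):
    """Separate pairwise distances into within-group and between-group lists."""
    within, between = [], []
    for i, j in combinations(range(len(ids)), 2):
        same = groups[ids[i]] == groups[ids[j]]
        (within if same else between).append(dist[i][j])
    return within, between

def student_t(x, y):
    """Two-sample Student's t statistic using the pooled variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))

# Hypothetical distance matrix and mapping-file category.
ids = ["a1", "a2", "b1", "b2"]
dist = [[0.0, 0.1, 0.8, 0.9],
        [0.1, 0.0, 0.7, 0.8],
        [0.8, 0.7, 0.0, 0.2],
        [0.9, 0.8, 0.2, 0.0]]
groups = {"a1": "Control", "a2": "Control", "b1": "Fast", "b2": "Fast"}

within, between = split_distances(ids, dist, groups)
print(within, between, student_t(within, between))
```

A strongly negative statistic means within-group distances are smaller than between-group distances, which is the pattern the boxplots are meant to reveal.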

d9.2) Statistical analyses: comparing categories with statistical methods
The script analyzes the statistical significance of sample groupings using distance matrices. The majority of the comparisons are based on the ANOVA family and determine whether the grouping of samples by a given category is statistically significant.
ANOSIM is nonparametric: statistical significance is determined through permutations. It only works with a categorical variable.
The p-value of 0.001 indicates that at an alpha of 0.05, the grouping of samples by individual is statistically significant. The R value of 0.794 is fairly close to +1, indicating dissimilarity between the groups.
compare_categories.py --method anosim -i beta_div/weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -c 'Treatment' -n 1000 -o anosim_out
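
For readers who want to see where the R value and the permutation p-value come from, here is a compact Python sketch of ANOSIM. It follows the usual rank-based definition, R = (mean between-group rank - mean within-group rank) / (n(n-1)/4), and is not the code that compare_categories.py runs. The data and function names are hypothetical.

```python
import random
from itertools import combinations

def average_ranks(values):
    """1-based ranks, with ties sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks

def anosim_r(dist, labels):
    """ANOSIM R statistic from a square distance matrix and group labels."""
    n = len(labels)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_w = sum(within) / len(within)
    mean_b = sum(between) / len(between)
    return (mean_b - mean_w) / (n * (n - 1) / 4)

def anosim(dist, labels, permutations=999, seed=0):
    """Observed R plus a p-value from shuffling the labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical distances: two tight groups that are far from each other.
dist = [[0.0, 0.1, 0.8, 0.9],
        [0.1, 0.0, 0.7, 0.8],
        [0.8, 0.7, 0.0, 0.2],
        [0.9, 0.8, 0.2, 0.0]]
r_stat, p_value = anosim(dist, ["Control", "Control", "Fast", "Fast"])
print(r_stat, p_value)
```

With only four samples the p-value cannot become small, because few distinct label arrangements exist. This is one reason the number of permutations only helps when the study has enough samples.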

d9.3) Statistical analyses: comparing categories with statistical methods
This script allows for the analysis of the strength and statistical significance of sample groupings using a distance matrix as the primary input. Several statistical methods are available: adonis, ANOSIM, BEST, Moran's I, MRPP, PERMANOVA, PERMDISP, and db-RDA.
Adonis creates a set by first identifying the relevant centroids of the data and then calculating the squared deviations from these points. It can accept either categorical or continuous variables in the metadata mapping file. Significance tests are performed using F-tests based on sequential sums of squares from permutations of the raw data.
compare_categories.py --method adonis -i beta_div/weighted_unifrac_otu_table_100.txt -m Fasting_Map.txt -c 'Treatment' -n 1000 -o adonis_out
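
The F-test on sums of squares can be illustrated for the simplest case, a single categorical variable, where the sums of squared deviations from centroids can be computed directly from the pairwise distances. The sketch below uses that distance-based formulation (the PERMANOVA pseudo-F) as a stand-in; it is an assumption of this example that it matches what adonis reports in this one-factor case, and it does not cover continuous variables. Names and data are hypothetical.

```python
import random
from itertools import combinations

def pseudo_f(dist, labels):
    """Pseudo-F ratio from a distance matrix and one categorical grouping."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in groups:
        members = [k for k in range(n) if labels[k] == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permutation_p(dist, labels, permutations=999, seed=0):
    """Share of label shufflings whose pseudo-F reaches the observed value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)

# Hypothetical samples at positions 0, 1, 10, 11 on a line (Euclidean distances).
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
labels = ["Control", "Control", "Fast", "Fast"]
print(pseudo_f(dist, labels), permutation_p(dist, labels))
```

For these one-dimensional Euclidean distances the pseudo-F equals the ordinary one-way ANOVA F ratio, which is a convenient sanity check on the formulation.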

d9.4) Statistical analyses: comparing distance matrices
Based on the Mantel test, a nonparametric statistical method that computes the correlation between two distance matrices. Nonparametric means that permutations are used to determine the p-value, or statistical significance.
One common application of distance matrix comparison is to determine whether a correlation exists between a community distance matrix (e.g. a UniFrac distance matrix) and a second matrix derived from an environmental parameter (e.g. difference in pH). The question is whether communities at dissimilar pH levels are more different from one another than communities at very similar pH levels. If so, this would indicate a positive correlation between the two distance matrices.
compare_distance_matrices.py --method mantel -i weighted_unifrac_otu_table.txt,weighted_unifrac_ph_table.txt -n 1000 -o mantel_out
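
A minimal Python sketch of the Mantel test follows, using the pH example from the text. It correlates the upper triangles of the two matrices with Pearson's r and permutes the sample order of one matrix to obtain a two-sided p-value. The function names, the pH values, and the community distances are hypothetical, and the sketch is not the script's own implementation.

```python
import math
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mantel(d1, d2, permutations=999, seed=0):
    """Mantel r and a two-sided permutation p-value."""
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    observed = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)     # relabel the samples of the second matrix
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if abs(pearson(x, y)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical community distances and pH values for four samples.
community = [[0.0, 0.1, 0.8, 0.9],
             [0.1, 0.0, 0.7, 0.8],
             [0.8, 0.7, 0.0, 0.2],
             [0.9, 0.8, 0.2, 0.0]]
ph = [4.0, 5.0, 7.0, 8.0]
ph_dist = [[abs(a - b) for b in ph] for a in ph]

r, p = mantel(community, ph_dist)
print(r, p)
```

Rows and columns are permuted together, so each shuffled matrix remains a valid distance matrix. Permuting individual cells instead would break that structure.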

Outline
We demonstrated steps for
a) QIIME installation or running on Hoffman2
b) Sequence data preparation
c) Operational Taxonomic Units (OTU) picking, taxonomic assignment & inferring phylogeny
d) microbiome diversity analyses