Basic Microbiome Analysis with QIIME | Bryan White | 2015

Basic Microbiome Analysis with QIIME

Bryan White

PowerPoint by Casey Hanson

Exercise

In this exercise we will do the following:

1. Calculate within-sample diversity (α-diversity), and test whether different sample types have different numbers of OTUs (species).

2. Calculate differences in microbial community structure (β-diversity); in particular, we will compare OTU composition and abundance between samples and sample types.

3. Compute statistical support for observed differences between sample types.

4. Plot taxonomy composition across samples.

5. Test for potential microbial markers.

Step 0A: Accessing the IGB Biocluster

Open Putty.exe

In the hostname textbox type:

biocluster.igb.illinois.edu

Click Open

If a popup appears, click Yes

Enter the login credentials assigned to you; for example, user class00.

Now you are all set!

Step 0B: Lab Setup

The lab is located in the following directory:

/home/classroom/mayo/2015/06_Metagenomics/

This directory contains the data and the finished version of the lab (i.e. the version of the lab after the tutorial). Consult it if you are unsure about your runs. You don’t have write permissions to the lab directory, so create a working directory for this lab in your home directory to store your output. Note that ~ is a symbol in Unix paths referring to your home directory. Copy the data files into this working directory.

Make sure you log in to a machine on the cluster using the qsub command. The exact syntax for this command is given in Step 0D. This particular command will log you into a computer with 2 CPUs in an interactive session. You only need to do this once.

Step 0C: Local Files

For viewing and manipulating the files needed for this laboratory exercise, insert your flash drive.

Denote the path to the flash drive as the following:

[course_directory]

We will use the files found in:

[course_directory]/06_Metagenomics/results/

Step 0D: Lab Setup

$ qsub -I -q classroom -l ncpus=2 # Log in to a computer on the cluster.

$ mkdir -p ~/06_Metagenomics/results # Make the results directory in our working directory.

# -p tells mkdir to also create ~/06_Metagenomics if it doesn’t exist.

$ cp /home/classroom/mayo/2015/06_Metagenomics/data/* ~/06_Metagenomics/ # Copy the data to your working directory.

$ cd ~/06_Metagenomics # Change directory to our working directory.

$ module load qiime # We will need QIIME for this lab.
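
Before moving on, it can help to confirm that the copy succeeded. The following is a minimal sketch, assuming the four input files described in Steps 1A to 1D are the ones the lab needs; the function name check_lab_files is made up for this illustration and is not part of QIIME.

```shell
# check_lab_files DIR: report any expected lab input that is missing from DIR.
# The file list comes from Steps 1A-1D; the function name is illustrative only.
check_lab_files() {
    dir=$1
    status=0
    for f in ICF.biom ICF.mapping.txt ICF.tree params.txt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return $status
}

# Example: check the working directory used in this lab.
check_lab_files "$HOME/06_Metagenomics" || echo "Copy the data again before continuing."
```

The function returns a non-zero status when anything is missing, so it can also guard later commands in a script.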

Interstitial Cystitis

Interstitial cystitis (IC) is a chronic inflammation of the bladder. In this exercise, we will examine differences between the gut microbiota of women with and without IC to understand the effect IC has on the community.

Our data consists of 16S sequencing of stool samples from 8 women with IC and 7 without it. We will analyze it using QIIME 1.8.0.

Using this data, we will test the hypothesis that IC is associated with a significant change in gut microbiota. Additionally, we will examine how the community changes and which bacteria are implicated in that change.

Step 1A: Dataset Characteristics ICF.biom

The ICF.biom file is an OTU observation file.

It is a matrix of observed OTUs, or species, for each sample, annotated with their taxonomy.

The ICF.biom file was created using our own TORNADO pipeline for 16S reads, which performs quality checking, chimera checking, alignment, taxonomy assignment, and clustering at 97% similarity to find OTUs.

The TORNADO pipeline can take from HOURS to DAYS depending on the complexity of the project.

Step 1B: Dataset Characteristics ICF.mapping.txt

The mapping file contains metadata associated with samples.

Let us examine the file using the Unix cat command.

$ cat ICF.mapping.txt # print file contents to screen

Output:

#SampleID Barcode Dx SubjectID Description
ICF-1 GGATCGCAGATC Control 1 IC_fecal1
ICF-2 GCTGATGAGCTG Control 2 IC_fecal2
ICF-3 AGCTGTTGTTTG Control 3 IC_fecal3
ICF-4 GGATGGTGTTGC IC 4 IC_fecal4

The most important column to us is Dx, which records whether a sample is a Control or IC sample.
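
Since every later comparison groups samples by the Dx column, a quick tally of samples per group is a useful sanity check. This is a sketch only: count_by_column is a hypothetical helper, and it splits on any whitespace, which assumes that no column up to the one of interest contains spaces.

```shell
# count_by_column COLUMN: read a mapping file on stdin (header line starts
# with '#') and count the rows for each value found in the named column.
count_by_column() {
    awk -v col="$1" '
        NR == 1 {
            sub(/^#/, "", $1)                       # "#SampleID" -> "SampleID"
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            next
        }
        c && NF { n[$c]++ }                         # tally non-empty data rows
        END { for (v in n) print v, n[v] }
    ' | sort
}

# The four rows shown on the slide:
count_by_column Dx <<'EOF'
#SampleID Barcode Dx SubjectID Description
ICF-1 GGATCGCAGATC Control 1 IC_fecal1
ICF-2 GCTGATGAGCTG Control 2 IC_fecal2
ICF-3 AGCTGTTGTTTG Control 3 IC_fecal3
ICF-4 GGATGGTGTTGC IC 4 IC_fecal4
EOF
# prints:
# Control 3
# IC 1
```

On the full file, `count_by_column Dx < ICF.mapping.txt` should report 7 Control and 8 IC samples if the file matches the study design described earlier.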

Step 1C: Dataset Characteristics ICF.tree

The ICF.tree file is a Newick-formatted phylogenetic tree file.

It contains phylogenetic relationships between the OTUs found in our samples.

It is another output of the 16S pipeline required for various comparison metrics.

Step 1D: Dataset Characteristics params.txt

The params.txt file contains alternative parameters to run QIIME.

Let us examine the file using the Unix cat command.

$ cat params.txt # print file contents to screen

Output:

beta_diversity:metrics bray_curtis,unweighted_unifrac,weighted_unifrac
alpha_diversity:metrics chao1,goods_coverage,observed_species,shannon,simpson,PD_whole_tree

Step 2: Get Basic Statistics

The first step is to get some basic statistics on our ICF.biom file. We will use the biom summarize-table command in QIIME to do this.

$ biom summarize-table -i ICF.biom -o results/summary.txt

$ cat results/summary.txt # Show stats.

Output:

Num samples: 15
Num observations: 260
Total count: 399985
Table density (fraction of non-zero values): 0.608
Table md5 (unzipped): be4b6e26ff80ca9ff173d6bbfeda162b

Counts/sample summary:
Min: 10267.0
Max: 48123.0
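
The minimum count per sample is the number reused as the rarefaction depth (-e) in the following steps. Rather than copying it by hand, it could be pulled out of the summary. This is a sketch under the assumption that the report contains a line of the form "Min: 10267.0" as shown above; min_depth is a hypothetical helper name.

```shell
# min_depth: read a `biom summarize-table` report on stdin and print the
# smallest per-sample count as an integer.
min_depth() {
    awk '$1 == "Min:" { printf "%d\n", $2; exit }'
}

# Excerpt of the report shown on the slide:
min_depth <<'EOF'
Counts/sample summary:
Min: 10267.0
Max: 48123.0
EOF
# prints: 10267
```

In the lab directory this could be used as `depth=$(min_depth < results/summary.txt)` and then passed to the later scripts as `-e "$depth"`.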

Step 3: Calculating α Diversity

For this next step, let us measure the α-diversity of the samples.

We will rarefy every sample to the minimum per-sample sequence count reported in Step 2 (10267) so that, for comparison purposes, all samples have the same number of sequences.

We will use the alpha_rarefaction.py script in QIIME to do this.

$ alpha_rarefaction.py -i ICF.biom -t ICF.tree -m ICF.mapping.txt \
    -o results/alpha_diversity -p params.txt -e 10267

Results are located in:

~/06_Metagenomics/results/alpha_diversity

This calculation takes 5 to 7 minutes to complete.
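
To make two of the metrics listed in params.txt concrete, the sketch below computes observed_species (the number of OTUs with a non-zero count) and the Shannon index for a single sample. It is an illustration and not QIIME's implementation: alpha_metrics is a hypothetical name, the counts are made up, and the Shannon index is computed here in bits (log base 2), which may or may not match the base QIIME reports.

```shell
# alpha_metrics: read one sample's OTU counts (one count per line) on stdin
# and print "observed_species shannon", with Shannon in bits (log base 2).
alpha_metrics() {
    awk '
        $1 > 0 { n[++k] = $1; total += $1 }     # keep only OTUs that were seen
        END {
            for (i = 1; i <= k; i++) {
                p = n[i] / total                # relative abundance of OTU i
                h -= p * log(p) / log(2)        # H = -sum p_i * log2(p_i)
            }
            printf "%d %.3f\n", k, h
        }
    '
}

# Hypothetical sample: four equally abundant OTUs and one absent OTU.
printf '25\n25\n25\n25\n0\n' | alpha_metrics
# prints: 4 2.000
```

Four equally abundant OTUs give the maximum Shannon value for four species (2 bits); uneven abundances would lower it while leaving observed_species unchanged.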

Step 4: Calculating β Diversity

For this next step, let us compare samples to each other using their composition (β-diversity).

We will use the beta_diversity_through_plots.py script in QIIME to do this.

$ beta_diversity_through_plots.py -i ICF.biom -t ICF.tree -m ICF.mapping.txt \
    -o results/beta_diversity -p params.txt -e 10267

Results are located in:

~/06_Metagenomics/results/beta_diversity

We will use these results later in the tutorial.

This calculation takes 1 to 5 minutes to complete.
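
Of the three β-diversity metrics requested in params.txt, Bray-Curtis is the simplest to compute by hand: the sum of absolute count differences divided by the sum of all counts in the two samples. The sketch below illustrates the formula on made-up counts; bray_curtis is a hypothetical helper and not the QIIME code path.

```shell
# bray_curtis: read "countA countB" pairs (one OTU per line) on stdin and
# print the Bray-Curtis dissimilarity between samples A and B.
bray_curtis() {
    awk '
        {
            d = $1 - $2
            diff += (d < 0 ? -d : d)    # running sum of |a_i - b_i|
            total += $1 + $2            # running sum of (a_i + b_i)
        }
        END { if (total > 0) printf "%.3f\n", diff / total }
    '
}

# Hypothetical counts for three OTUs in two samples of equal depth.
bray_curtis <<'EOF'
10 20
0 10
30 10
EOF
# prints: 0.500
```

Identical samples score 0 and samples that share no OTUs score 1. The UniFrac metrics additionally use the phylogenetic tree in ICF.tree, which is why they are not sketched here.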

Step 5: Taxonomy Computations

For this next step, we will create a graphical summary of the taxonomic composition of the samples.

$ summarize_taxa_through_plots.py -i ICF.biom -m ICF.mapping.txt \
    -o results/taxonomy

Then let us do the same thing again, only this time merging the control and IC samples using the Dx column.

$ summarize_taxa_through_plots.py -i ICF.biom -m ICF.mapping.txt \
    -o results/taxonomy_Dx -c Dx

Results are located in ~/06_Metagenomics/results/taxonomy (1st command) and ~/06_Metagenomics/results/taxonomy_Dx (2nd command).
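
The taxonomy plots show each taxon as a share of its sample. The underlying arithmetic is a simple normalization, sketched below on made-up phylum counts; relative_abundance is a hypothetical helper, and the numbers are not taken from ICF.biom.

```shell
# relative_abundance: read "taxon count" lines for one sample on stdin and
# print each taxon with its share of the sample total, in percent.
relative_abundance() {
    awk '
        { name[NR] = $1; count[NR] = $2; total += $2 }
        END {
            if (total == 0) exit
            for (i = 1; i <= NR; i++)
                printf "%s %.1f\n", name[i], 100 * count[i] / total
        }
    '
}

# Hypothetical phylum-level counts for one sample.
relative_abundance <<'EOF'
Firmicutes 600
Bacteroidetes 300
Other 100
EOF
# prints:
# Firmicutes 60.0
# Bacteroidetes 30.0
# Other 10.0
```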

Step 6: ANOVA Tests

ANOVA stands for Analysis of Variance. It is a standard suite of statistical tests aimed at explaining differences between groups of data.

We will use ANOVA in this step to see if there are any OTUs that explain the differences between sample categories.

We will use the group_significance.py script in QIIME to do this.

The resulting file, ~/06_Metagenomics/results/ANOVA.txt, sorts the OTUs in the data according to how likely they are to be driving the differences between sample categories.

The file includes p-values (uncorrected, and corrected for multiple testing by FDR and Bonferroni), as well as the mean abundance in each category and the lineage of each OTU.

$ group_significance.py -i ICF.biom -m ICF.mapping.txt \
    -o results/ANOVA.txt -s ANOVA -c Dx
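
For intuition about the test statistic reported for each OTU, the sketch below computes a one-way ANOVA F statistic (between-group mean square divided by within-group mean square) for one OTU's counts split by category. It illustrates the textbook formula and is not QIIME's implementation; anova_f is a hypothetical name and the counts are made up.

```shell
# anova_f: read "group value" lines on stdin and print the one-way ANOVA
# F statistic for the values split by group.
anova_f() {
    awk '
        {
            n[$1]++; s[$1] += $2            # per-group size and sum
            N++; S += $2; SS += $2 * $2     # overall size, sum, sum of squares
        }
        END {
            for (g in n) { k++; between += s[g] * s[g] / n[g] }
            ssb = between - S * S / N       # between-group sum of squares
            ssw = SS - between              # within-group sum of squares
            if (k > 1 && N > k && ssw > 0)
                printf "%.2f\n", (ssb / (k - 1)) / (ssw / (N - k))
        }
    '
}

# Hypothetical counts of one OTU in three Control and three IC samples.
anova_f <<'EOF'
Control 1
Control 2
Control 3
IC 4
IC 5
IC 6
EOF
# prints: 13.50
```

A large F means the group means differ by much more than the spread within each group; the p-value is then read from the F distribution with (k - 1, N - k) degrees of freedom.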

Statistical Tests

In this exercise, we will test our hypotheses. In particular, if the control and IC samples cluster together, the following tests will measure the significance of such clustering based on the metrics that we just calculated.

Step 7A: Statistical Tests - α Diversity

In this step, we will see whether or not the IC and control samples differ significantly using the α-diversity results computed in Step 3.

We will use the compare_alpha_diversity.py script in QIIME to do this.

$ compare_alpha_diversity.py \
    -i results/alpha_diversity/alpha_div_collated/observed_species.txt \
    -c Dx -o results/signif -d 10260 -m ICF.mapping.txt

The result file is located in:

~/06_Metagenomics/results/signif

Step 7B: Statistical Tests - α Diversity

Let us take a look at the results file, ~/06_Metagenomics/results/signif/Dx_stats.txt, using the cat command:

$ cat results/signif/Dx_stats.txt

Output:

Group1 Group2 … t stat p-value
Control IC … 3.57527959 0.003

With a p-value of 0.003, the two categories appear to differ significantly. Note: your output may be slightly different.

We will confirm this later when looking at the diversity plots.

Step 8A: Statistical Tests - β Diversity

In this step, we will see whether or not the IC and control samples differ significantly using the β-diversity results computed in Step 4. We will use the unweighted UniFrac distance matrix and the ANOSIM test.

We will use the compare_categories.py script in QIIME to do this.

$ compare_categories.py --method anosim \
    -i results/beta_diversity/unweighted_unifrac_dm.txt \
    -m ICF.mapping.txt -c Dx -o results/anosim -n 9999

The result file is located in:

~/06_Metagenomics/results/anosim/anosim_results.txt
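
ANOSIM ranks all pairwise distances and compares the mean rank of between-group pairs with that of within-group pairs: R = (mean between rank - mean within rank) / (M / 2), where M is the number of sample pairs. The sketch below illustrates only the R statistic, under the assumption that there are no tied distances; anosim_r is a hypothetical helper, the distances are made up, and the permutation step that yields the p-value is not shown.

```shell
# anosim_r: read "groupA groupB distance" lines on stdin, one line per
# unordered pair of samples, and print the ANOSIM R statistic.
# Assumes no tied distances, so a rank is simply the position after sorting.
anosim_r() {
    LC_ALL=C sort -k3,3n | awk '
        {
            if ($1 == $2) { within += NR; nw++ }    # pair inside one group
            else          { between += NR; nb++ }   # pair across groups
        }
        END {
            if (nw && nb)
                printf "%.4f\n", (between / nb - within / nw) / (NR / 2)
        }
    '
}

# Hypothetical distances for 2 Control and 2 IC samples (6 pairs in total).
anosim_r <<'EOF'
Control Control 0.1
IC IC 0.2
Control IC 0.5
Control IC 0.6
Control IC 0.7
Control IC 0.8
EOF
# prints: 1.0000
```

Here every between-group distance is larger than every within-group distance, so R reaches its maximum of 1. Values near 0 indicate no separation between groups.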

Step 8B: Statistical Tests - β Diversity

Let us take a look at the results file, ~/06_Metagenomics/results/anosim/anosim_results.txt, using the cat command:

$ cat results/anosim/anosim_results.txt

Output:

Method name   R statistic   p-value   Number of permutations
ANOSIM        0.4069        0.0009    9999

Although the p-value is significant, the R statistic says that the clustering is only moderately strong. Note: your output may be slightly different.

Analysis

We will now analyze the files we generated during the α and β diversity runs and tests.

Note: the output you generated in lab may be slightly different.

Step 9A: α Diversity Results

Access the downloaded results directory:

[course_directory]/06_Metagenomics/results

Inside the results directory, open the following file:

alpha_diversity/alpha_rarefaction_plots/rarefaction_plots.html

Select observed_species as metric, and Dx as category.

A graph will be displayed.

Step 9B: α Diversity Results

Control is significantly different from IC!

Step 10A: β Diversity Results

Access the downloaded results directory:

[course_directory]/06_Metagenomics/results

Inside the results directory, open the following HTML file:

beta_diversity/unweighted_unifrac_emperor_pcoa_plot/index.html

This will open a 3D PCoA plot, based on unweighted UniFrac distances, colored by sample type (the Dx column: Control or IC).

Step 10B: β Diversity Results

Rotate the plot to see whether the points separate when viewed from other directions. Identify individual samples using the ‘Key’ tab.

Step 10C: β Diversity Results

Control and IC samples segregate, but only moderately. This is in agreement with the ANOSIM results (R = 0.4069, p = 0.0009, from Step 8B).

Step 11A: Taxonomy Results

Access the downloaded results directory:

[course_directory]/06_Metagenomics/results

Inside the results directory, open the following HTML file:

taxonomy/taxa_summary_plots/area_charts.html

Step 11B: Taxonomy Results

This is the taxonomy at the phylum level, for all samples. Hover over each color to see which phylum it represents (colors may differ from this plot).

Step 11C: Taxonomy Results

These look like otherwise normal stool samples, with Firmicutes and Bacteroidetes dominating. Note the Fusobacteria in sample 2, a control!

Step 11D: Taxonomy

Things get more complex as we go down the taxonomy hierarchy.

This is the plot at the genus level, typical of stool samples.

Hover over each color to see its taxonomy information.

Step 11D: Taxonomy (continued)

There seems to be no obvious pattern (which is the usual case unless there’s something very wrong or a known pathogen).

Step 11E: Taxonomy

Let’s see if there is something hidden in the taxonomy.

In the results directory, open the ANOVA.txt file.

Below is the readout for one genus with a significant uncorrected p-value, Odoribacter. Note that the corrected p-values (FDR_P and Bonferroni_P) are not significant.

OTU: 111
Test-Statistic: 11.82051724
P: 0.004407693
FDR_P: 0.313682109
Bonferroni_P: 1
Control_mean: 92.71428571
IC_mean: 9.125
taxonomy: k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Odoribacter;s__unclassified
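
Rather than scanning ANOVA.txt by eye, candidate markers could be listed with a small filter. This is a sketch, assuming the file is tab-separated with the column names seen in the readout (OTU, P, Control_mean, IC_mean); top_hits is a hypothetical helper, and the taxonomy string in the example row is abbreviated.

```shell
# top_hits ALPHA: read a tab-separated group_significance.py table on stdin
# and print OTU, uncorrected P, and the Control/IC mean ratio for rows with
# P below ALPHA. Columns are located by header name, not by position.
top_hits() {
    awk -F '\t' -v alpha="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $col["P"] + 0 < alpha + 0 {
            ic = $col["IC_mean"] + 0
            ratio = (ic > 0) ? sprintf("%.2f", $col["Control_mean"] / ic) : "inf"
            print $col["OTU"], $col["P"], ratio
        }
    '
}

# Header plus the Odoribacter row from the readout (taxonomy abbreviated).
{
    printf 'OTU\tTest-Statistic\tP\tFDR_P\tBonferroni_P\tControl_mean\tIC_mean\ttaxonomy\n'
    printf '111\t11.82051724\t0.004407693\t0.313682109\t1\t92.71428571\t9.125\tg__Odoribacter\n'
} | top_hits 0.05
# prints: 111 0.004407693 10.16
```

For OTU 111 the mean count in controls is roughly ten times that in IC samples. On the real file the call would be `top_hits 0.05 < results/ANOVA.txt`; filtering on the FDR_P column instead would be the stricter choice.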

Step 11F: Taxonomy

Odoribacter has 0.3% mean abundance in controls and 0.02% mean abundance in IC (see the plot at the bottom of area_charts.html).

Indeed, it seems to be a good marker despite its low relative abundance. (Look at abundances in red vs blue columns.)

Its absence seems correlated with IC (samples 4, 7, 8, 9, 10, 12, 14, 15).

Analysis Conclusions

Microbial composition and structure significantly different in stool between IC patients and controls:

IC stool microbiota significantly less diverse

Overall IC microbiota different (it clusters away from controls)

Potential marker found:

Lack of Odoribacter associated with IC

Exercise Conclusions

Basic Microbiome analysis:

1. Calculate various diversity metrics for samples

2. Calculate statistical support for differences found between sample types

3. Plot taxonomy composition of samples

4. Basic tests for potential microbial markers