Information for producing phylogenetic/taxonomic libraries of airborne bacteria and fungi. Includes fundamental background information, approaches for sequencing and data analysis, two case studies, and a review of sampling methods
Molecular Biology-Based Bioaerosol Analysis
Jordan Peccia
Yale University Chemical and
Environmental Engineering [email protected]
General Outline:
Overview of genetics
The new world of DNA sequencing
Molecular methods for identification
Molecular methods for quantification
Phylogenetics overview
Aerosol sampling for molecular analysis
Review of Genetics
Genetics Definitions:
Genome: The complete set of genetic material (DNA) of an organism or a virus.
Gene: A segment of DNA specifying a particular protein, or other functional molecule (tRNA or rRNA).
Transcriptome: The complement of mRNAs produced in an organism under a specific set of conditions.
Metagenome: The total genetic complement of all the cells present in a particular environment.
Proteome: The total set of proteins encoded by a genome.
Central Dogma of Biology:
DNA → RNA → Protein
Genomic DNA is the blueprint, the set of instructions
Messenger RNAs (mRNAs) are the specific, short-lived gene transcripts
Proteins perform structural and catalytic functions
Transcription (DNA to RNA), a.k.a. “gene expression”
Translation (RNA to protein) occurs in ribosomes: (1) mRNA attaches to the ribosome, (2) polypeptides are produced, (3) polypeptides are folded into proteins
Genetic Code:
Correspondence between nucleic acids and amino acids (monomers of protein)
DNA bases: Adenine (A), Thymine (T), Cytosine (C), Guanine (G)
RNA bases: Adenine (A), Uracil (U), Cytosine (C), Guanine (G)
DNA: GTTGCGGGATATTTATCTTAG
Amino acid: Val-Ala-Gly-Tyr-Leu-Ser-STOP
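The codon-by-codon reading of the example sequence can be sketched in Python. This is a minimal illustration, assuming the standard genetic code and reading the coding-strand DNA directly (T standing in for the U of the mRNA); the function name `translate` is made up for this sketch.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# One-letter to three-letter amino acid names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "STOP",
}


def translate(dna):
    """Read a coding-strand DNA string three bases at a time, stopping at the first stop codon."""
    peptide = []
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3].upper()
        residue = THREE_LETTER[CODON_TABLE[codon]]
        peptide.append(residue)
        if residue == "STOP":
            break
    return peptide


print("-".join(translate("GTTGCGGGATATTTATCTTAG")))  # Val-Ala-Gly-Tyr-Leu-Ser-STOP
```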
Genome Size (base pairs):
[Figure: genome size ranges for viruses, bacteria, fungi/molds, mammals, and plants on a logarithmic axis from 10^3 to 10^11 base pairs]
DNA Sequencing
Cost of DNA Sequencing:
[Figure: cost of DNA sequencing by year, 2001 to 2011, on a logarithmic cost axis from 0.1 to 10,000, compared with Moore’s law]
Traditional method is Sanger sequencing:
- advantage: longer reads (sequences up to 800 bp long)
- disadvantage: slow and costly
Next generation sequencing:
- advantage: low cost and rapid
- disadvantage: sequences are short (75 to 400 bp)
Next Generation Sequencing Example (454 Pyrosequencing):
(A) DNA is fragmented into pieces ~500 bp long and made single stranded;
(B) Adaptors are added to the single strands and one strand is attached to one microbead;
(C) PCR is performed and multiple copies of the strand are produced;
(D) Beads are placed into wells (1.5 x 10^6 wells per plate);
(E) The second strand is synthesized and the added bases are recorded.
Some DNA Sequencing Options (as of 2012):
Illumina HiSeq technology
- one lane produces ~50 million reads
- reads are ~100 nucleotides long
- cost is ~$2,000 per lane
454 Pyrosequencing
- one gasket produces 150,000 reads
- reads are ~500 nucleotides long
- cost is ~$2,000 per gasket
Lab “personal” sequencers
- Ion Torrent: 60-80 million reads, 200 nt long
- MiSeq: 15 million reads, up to 250 nt long
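To compare the two platforms on a common basis, the figures above can be turned into a rough cost per megabase. This is a back-of-the-envelope sketch using only the approximate 2012 numbers listed; the function name is illustrative.

```python
def cost_per_megabase(reads, read_length_nt, run_cost_usd):
    """Approximate cost in USD per million sequenced bases."""
    total_bases = reads * read_length_nt
    return run_cost_usd / (total_bases / 1e6)


# Approximate 2012 figures from the list above.
illumina_hiseq = cost_per_megabase(reads=50e6, read_length_nt=100, run_cost_usd=2000)
pyro_454 = cost_per_megabase(reads=150_000, read_length_nt=500, run_cost_usd=2000)

print(f"Illumina HiSeq: ~${illumina_hiseq:.2f} per Mb")
print(f"454 pyrosequencing: ~${pyro_454:.2f} per Mb")
```

The trade-off is read length: the cheaper platform yields shorter reads, which limits how much of a marker gene each read covers.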
Phylogenetics
Phylogeny: The evolutionary history of organisms
Phylogenetics: A framework for identification and quantification of microbial communities.
The great plate count anomaly (see Amann et al. (1995), Microbiol. Rev. v59, p143):
Habitat             Culturability (%)
Seawater            0.001-0.1
Freshwater          0.25
Mesotrophic lake    0.1-1
Estuarine waters    0.1-3
Activated sludge    1-15
Sediments           0.25
Soil                0.3
Air                 ~1
16S rRNA is the Evolutionary Chronometer
~1500 nucleotides long
a structural portion of the ribosome
present in all organisms
evolved slowly and includes conserved, variable, and hypervariable regions
Structure for Ribosomal RNA:
              Eukaryotes    Bacteria
Total size    80S           70S
LSU           60S           50S
SSU           40S           30S
LSU rRNA      5.8S, 28S     5S, 23S
SSU rRNA      18S           16S
[Figure: eukaryotic rRNA gene layout: 18S, ITS1, 5.8S, ITS2, 28S]
ITS1 and ITS2: internal transcribed spacer regions (important for fungi)
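Some Important Regions of the 16S rRNA:
[Figure: 16S rRNA structure with conserved, variable, and hypervariable regions marked]
Variable Regions of the 16S rRNA:
[Figure: variable regions of the 16S rRNA, with potential PCR primer sites marked]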
For Identification:
1) Sequences derived from one or many microorganisms in an aerosol sample can be produced
ACGTATAGGACGATACCATG……………
2) Using a search algorithm, the sequence is matched against a database of rDNA gene sequences from known organisms.
3) Identification at the highest taxonomic level that can be confidently assigned is provided. E.g., assignment of E. coli to genus level would yield:
domain: Bacteria
phylum: Proteobacteria
class: Gammaproteobacteria
order: Enterobacteriales
family: Enterobacteriaceae
genus: Escherichia
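Step 3 can be sketched as truncating a lineage at the first rank that cannot be confidently assigned. The confidence values and the 0.8 cutoff below are hypothetical, chosen only for illustration; real classifiers compute their own per-rank confidence.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def confident_lineage(assignments, threshold=0.8):
    """Keep ranks from domain downward until confidence first drops below the threshold."""
    lineage = []
    for rank, (name, confidence) in zip(RANKS, assignments):
        if confidence < threshold:
            break
        lineage.append((rank, name))
    return lineage


# Hypothetical per-rank confidences for a single E. coli read.
example = [
    ("Bacteria", 1.00),
    ("Proteobacteria", 1.00),
    ("Gammaproteobacteria", 0.99),
    ("Enterobacteriales", 0.98),
    ("Enterobacteriaceae", 0.97),
    ("Escherichia", 0.85),
    ("Escherichia coli", 0.40),  # below the cutoff, so species is not reported
]

lineage = confident_lineage(example)
for rank, name in lineage:
    print(f"{rank}: {name}")
```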
SSU rRNA Alignment Forms the Tree of Life and a Basis for Identification
rRNA-based Taxonomy:
Domain
Phylum
Class
Order
Family
Genus
Species
Pace, 1997, Science v276, p734
Molecular Methods for Quantification
Why Not Quantify by Culturability?
The great plate count anomaly:
Habitat             Culturability (%)
Seawater            0.001-0.1
Freshwater          0.25
Mesotrophic lake    0.1-1
Estuarine waters    0.1-3
Activated sludge    1-15
Sediments           0.25
Soil                0.3
Air                 ~1
Culturing Cannot Capture Fungal Diversity:
[Figure labels: viable spore; dead spore; spore that cannot grow on media; unidentifiable; other fungal fragments]
Methods for Quantification:
Quantitative polymerase chain reaction
Direct microscopy and staining
Immuno-based methods and proteomics
First: Polymerase Chain Reaction (PCR)
1) Reagents: forward and reverse primers, dNTP mix (A, T, C, G), water and Mg2+, template, DNA polymerase
2) Thermal cycler: runs a temperature program for denaturation (~95 °C), primer annealing (40-60 °C), and extension (72 °C). Typically 20 to 30 cycles are adequate; do not go above 45 cycles.
PCR performs two functions: (1) it selects a gene or segment of DNA from a background of total extracted DNA, and (2) it makes many copies of the selected DNA (amplicons)
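How "many copies" grows with cycle number can be sketched with the usual idealized model, in which each cycle multiplies the target by (1 + efficiency). This is a simplification: it ignores the plateau that real reactions reach as reagents run out, and the efficiency parameter is an assumption of the sketch.

```python
def amplicon_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized PCR yield; efficiency=1.0 means perfect doubling every cycle."""
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 20 and 30 cycles of perfect doubling.
for cycles in (20, 30):
    print(cycles, "cycles:", f"{amplicon_copies(1, cycles):.3g}", "copies")
```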
PCR is Confirmed by Gel Electrophoresis:
[Figure: gel image with lanes for the ladder (100, 500, and 1000 bp markers), negative control, sample, and positive control]
PCR for Aerosol Samples is Challenging!
Quantitative PCR (qPCR), a.k.a. Real-Time PCR
(a) PCR reagents include a fluorescent dye whose emission increases as amplicon number increases each cycle
(b) Thermal cycler blocks are equipped with fluorometers to detect changes in emission, and thus track amplicon number as cycles progress
[Figure: relative fluorescence curves for increasing sample concentration]
How is Amplicon Number Converted to Fluorescent Signal?
Method 1: TaqMan®
Method 2: SYBR green
SYBR green is a DNA intercalating agent that fluoresces only when bound to double stranded DNA. As more amplicons are produced, more SYBR green binds and fluoresces.
qPCR Quantification Methods: Calibration
The CT (cycle threshold) value is set in the linear region
Replicate samples, known concentration of cells or amplicon targets
[Figure: amplification curves for standards at 10^1, 10^2, 10^3, 10^4, and 10^5]
qPCR Quantification Methods Cont… Calibration
[Figure: calibration curve of CT value (axis 0 to 40) versus standard quantity (axis ticks from -2 to 8); linear fit y = -3.4896x + 33.482, R² = 0.9982]
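The calibration idea can be sketched as a least-squares line through CT versus log10 of the known standard concentration, which is then inverted to quantify an unknown. The standards and CT values below are hypothetical and lie exactly on a line for clarity; the function names are illustrative.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Efficiency implied by the slope; about 1.0 (100%) when the slope is near -3.32."""
    return 10 ** (-1.0 / slope) - 1.0


def quantify(ct, slope, intercept):
    """Invert the calibration line to estimate the starting quantity of an unknown."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards (cells or gene copies per reaction) and their CT values.
standards = [1e1, 1e2, 1e3, 1e4, 1e5]
standard_cts = [31.5, 28.0, 24.5, 21.0, 17.5]

slope, intercept = fit_standard_curve(standards, standard_cts)
unknown = quantify(26.25, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"efficiency={amplification_efficiency(slope):.0%}, unknown≈{unknown:.0f}")
```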
Reproducibility and Repeatability Near Detection Limit
Coefficient of variation (n=7) at ~10^3 and ~10^4 cells; true difference at 95% confidence (n=7)
Organism        Filter   Reproducibility (~10^3, ~10^4)   Repeatability (~10^3, ~10^4)   True difference
E. coli         Quartz   78%, 60%                         36%, 44%                       3.2 times
E. coli         PCTE     79%, 70%                         11%, 26%
B. atrophaeus   Quartz   64%, 47%                         57%, 41%                       2.4 times
B. atrophaeus   PCTE     60%, 57%                         58%, 51%
A. fumigatus    Quartz   61%, 67%                         17%, 61%                       2.5 times
A. fumigatus    PCTE     28%, 49%                         15%, 21%
Copyright © American Society for Microbiology, Appl. Environ. Microbiol. November 2010 vol. 76 no. 21 7004-701, doi: 10.1128/AEM.01240-10
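The coefficient of variation reported in the table is the standard deviation of replicate measurements divided by their mean. A minimal sketch, using made-up replicate values (n=7) rather than data from the cited study:

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical qPCR results (cells recovered) from seven replicate filters.
replicates = [900, 1400, 650, 1200, 2100, 800, 1500]
print(f"CV = {coefficient_of_variation(replicates):.0f}%")
```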
Molecular Methods for Identification
Methods for Identification
Phylogenetic libraries: a library of all SSU rDNA sequences that exist in an environmental sample.
Microbial diversity methods and tools
Phylogenetic Libraries for Bacteria, Fungi, and Viruses:
- For bacterial libraries: PCR primers typically target the variable regions of the 16S rRNA-encoding gene;
- For fungal libraries: PCR primers typically target genes encoding the ITS region of ribosomal RNA.
Scheme for Creating Phylogenetic Libraries:
- GS-FLX 454 sequencing platform;
- Primers targeting 16S rDNA regions, creating ~500 base pair long amplicons;
- Data analysis pipeline called QIIME (Quantitative Insights Into Microbial Ecology).
[Workflow figure labels: isolate DNA; produce amplicons; DNA clean-up; Ampure clean-up; pool DNA; send to sequencer]
Pyrosequencing Detail for Phylogenetic Libraries: Primer Construction
[Figure: primer construct showing the 5' and 3' ends, 454 adaptors, barcodes, PCR primers, and the rRNA gene; ~400 bp amplicon with current sequencing technology]
Sequence Data Analysis Includes:
- Sorting sequences into sample bins and trimming primers and adaptors;
- Producing a phylogenetic placement or identification for each sequence;
- Determining relative abundances of taxa for each sample (alpha diversity);
- Using phylogenetics to compare one sample population with other populations (beta diversity).
Sorting/Trimming/Denoising:
1) Raw sequencer files are input into software that recognizes the barcodes and sorts sequences into their original sample bins.
2) Primers are recognized, and primers and adaptors are removed.
3) 454 sequencing is susceptible to mistakes due to homopolymers (AAAAAA). Denoising “fixes” these errors.
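Steps 1 and 2 can be sketched as follows. This is a toy illustration, not how any particular pipeline works: it requires an exact barcode and primer match at the start of each read, whereas real software tolerates some mismatches. The barcodes, primer, and sample names are hypothetical.

```python
def demultiplex(reads, barcodes, primer):
    """Sort reads into sample bins by leading barcode, then strip barcode and primer."""
    bins = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode + primer):
                bins[sample].append(read[len(barcode) + len(primer):])
                break
        else:
            # No barcode matched, so the read cannot be tied to a sample.
            unassigned.append(read)
    return bins, unassigned


# Hypothetical barcodes and primer, for illustration only.
barcodes = {"ACGT": "indoor_air", "TGCA": "outdoor_air"}
primer = "GGATTAG"
reads = [
    "ACGTGGATTAGAAACCC",
    "TGCAGGATTAGTTTGGG",
    "CCCCGGATTAGAAAA",
]

bins, unassigned = demultiplex(reads, barcodes, primer)
print(bins)
print("unassigned:", unassigned)
```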
Phylogenetic Placement or Identification:
1) Sequences derived from one or many microorganisms in an aerosol sample are first produced
ACGTATAGGACGATACCATG……………
2) Using search algorithms, the sequence is matched against a database of rDNA gene sequences from known organisms.
3) Identification at the highest taxonomic level that can be confidently assigned is provided. E.g., assignment of an E. coli sequence to genus level would yield the result:
domain: Bacteria
phylum: Proteobacteria
class: Gammaproteobacteria
order: Enterobacteriales
family: Enterobacteriaceae
genus: Escherichia
Phylogenetic Placement or Identification:
For Bacteria: Sequences are placed into a MASTER phylogenetic tree (Greengenes tree). They are then identified based on their placement.
97% similarity in sequence is generally accepted as the same species (also called a phylotype or operational taxonomic unit (OTU))
Pace, 1997, Science v276, p734
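The 97% rule can be illustrated with a toy greedy clustering. This sketch assumes the sequences are already aligned and of equal length, and it compares each sequence only to the first member of each existing cluster; it is a simplified stand-in, not the tree-placement approach described above.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first OTU whose seed it matches at or above the threshold."""
    seeds = []
    otus = []
    for seq in sequences:
        for index, seed in enumerate(seeds):
            if identity(seq, seed) >= threshold:
                otus[index].append(seq)
                break
        else:
            seeds.append(seq)
            otus.append([seq])
    return otus


# Toy 100-nt sequences: `near` differs from `ref` at 2 positions, `far` at 8.
ref = "ACGT" * 25
near = "TT" + ref[2:]
far = "T" * 10 + ref[10:]

otus = greedy_otus([ref, near, far])
print([len(otu) for otu in otus])  # [2, 1]
```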
Phylogenetic Placement or Identification:
For Fungi: Sequences are compared against a database of known fungal ITS sequences using BLAST (Basic Local Alignment Search Tool), and “best matches” are determined
TGCGGAAGGATCATTACCGAGTGAGGGCCCTCTGGGTCCAACCTCCCACCCGTGTCTATCGTACCTTGTTGCTTCGGCGGGCCCGCCGTTTCGACGGCCGCCGGGGAGGCCTTGCGCCCCCGGGCCCGCGCCCGCCGAAGACCCCAACATGAACGCTGTTCTGAAAGTATGCAGTCTGAGTTGATTATCGTAATCAGTTAAAACTTTCAACAACGGATCTCTTGGTTCCGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAGTCTTTGAACGCACATTGCGCCCCCTGGTATTCCGGGGGGCATGCCTGTCCGAGCGTCATTGCTGCCCTCAAGCACGGCTTGTGTGTTGGGCCCCCGTCCCCCTCTCCCGGGGGACGGGCCCGAAAGGCAGCGGCGGCACCGCGTCCGGTCCTCGAGCGTATGGGGCTTTGTCACCTGCTCTGTAGGCCCGGCCGGCGCCAGCCGACACCCAACTTTATTTTTCTAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTAAGCATATCAATAAGGCGGA
BLAST nucleotide search
Case Study #1:
- What are the origins of this material that is associated with human occupancy?
[Figure labels: shedding; resuspension; occupied vs. vacant]
Case Study #1: Rarefaction Curves, the First Step in Alpha Diversity Analysis:
Hospodsky D, Qian J, Nazaroff WW, Yamamoto N, et al. (2012) Human Occupancy as a Source of Indoor Airborne Bacteria. PLoS ONE 7(4): e34867. doi:10.1371/journal.pone.0034867 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0034867
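A rarefaction curve plots the number of distinct taxa observed against the number of sequences sampled. A minimal sketch, assuming each sequence has already been assigned an OTU label; the community below is invented for illustration and is not data from the cited study.

```python
import random


def rarefaction_curve(otu_labels, depths, trials=20, seed=0):
    """Mean number of distinct OTUs seen when subsampling to each depth without replacement."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        observed = [len(set(rng.sample(otu_labels, depth))) for _ in range(trials)]
        curve.append((depth, sum(observed) / trials))
    return curve


# Hypothetical sample of 100 sequences spread unevenly over five OTUs.
labels = ["otu_a"] * 50 + ["otu_b"] * 30 + ["otu_c"] * 15 + ["otu_d"] * 4 + ["otu_e"]

curve = rarefaction_curve(labels, depths=[1, 10, 50, 100])
for depth, richness in curve:
    print(depth, richness)
```

A curve that flattens as depth increases suggests that most taxa in the sample have been observed; a curve still climbing at full depth suggests undersampling.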
Case Study #1: Relative Abundances of Bacterial Taxa:
Case Study #1: Beta Diversity, Comparing Aerosol Populations with Potential Source Populations:
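Comparing an aerosol population with candidate sources comes down to a pairwise distance between samples. The sketch below uses Bray-Curtis dissimilarity on relative abundances, a simple non-phylogenetic measure swapped in for illustration; phylogenetic measures such as UniFrac additionally weight taxa by their relatedness. All counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw taxon counts to fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity on relative abundances: 0 = identical, 1 = no shared taxa."""
    pa, pb = relative_abundance(sample_a), relative_abundance(sample_b)
    shared = sum(min(pa.get(t, 0.0), pb.get(t, 0.0)) for t in set(pa) | set(pb))
    return 1.0 - shared


# Hypothetical taxon counts for an indoor air sample and two potential sources.
indoor_air = {"Propionibacterium": 40, "Staphylococcus": 30, "Sphingomonas": 30}
floor_dust = {"Propionibacterium": 35, "Staphylococcus": 35, "Sphingomonas": 30}
outdoor_air = {"Sphingomonas": 70, "Methylobacterium": 30}

print("air vs floor dust:", round(bray_curtis(indoor_air, floor_dust), 3))
print("air vs outdoor air:", round(bray_curtis(indoor_air, outdoor_air), 3))
```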
Case Study #2: Microbial Ecology of Public Restroom Surfaces
Flores GE, Bates ST, Knights D, Lauber CL, et al. (2011) Microbial Biogeography of Public Restroom Surfaces. PLoS ONE 6(11): e28132. doi:10.1371/journal.pone.0028132 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0028132
Case Study #2: Taxonomic Composition of Public Restroom Surfaces:
Case Study #2: Beta Diversity, Comparison Among Different Surface Samples
Case Study #2: Beta Diversity, SourceTracker Program in QIIME
Aerosol Sampling for Molecular Biology
Aerosol Sampling Concept: Impaction
Impaction: The inertia of a particle causes it to drift across bending fluid streamlines.
Aerosol Sampling Concept: Impingement
Impingement: Entrapment of particles in liquid.
Aerosol Sampling Concept: Filtration
Filtration: Straining, interception, impaction, diffusion.
Sampler Characteristics: Impactors
Sampler: Cascade impactors. Mechanism: The sampled air stream makes a sharp bend and particles are stripped out based on their aerodynamic diameter. Typical models: Andersen Cascade Impactor; MOUDI cascade impactor; BGI 900 L/min high volume cascade impactor.
Sampling rate: Typically 10 to 28 L/min. Some samplers allow for > 500 L/min.
Size resolved sampling: Provides the best size distribution information. Different models offer between 1 and 12 stages for collecting aerosols with aerodynamic diameters from 10 nm to >18 µm.
Viability: Only at 28 L/min collection rates, and requires direct sampling onto agar plates.
Sample suitable for molecular methods: Stages can be covered with filters, membranes, or plates, and samples can then be extracted from these materials. The panel did not recommend use of foam as a sampling medium due to the low efficiencies associated with cell and DNA extraction.
Advantages: Best ability to define particle size distributions; models available to perform culturing.
Disadvantages: High cost per sampler, especially for high volume samplers; sampling inefficiencies due to particle bounce; not sensitive, as the total sampled mass is divided among multiple stages.
Common Impactors:
Andersen multistage impactor
Micro-Orifice Uniform-Deposit Impactor (MOUDI)
BGI High Vol Impactor
Available Sampler Characteristics: Impingement
Sampler: Liquid impingement. Mechanism: Sampled air is passed through a small opening and captured into a liquid medium. Typical models: SKC swirl impingers; Omni 3000 high volume impinger.
Sampling rate: 14 L/min for glass impingers; new high volume models are capable of >100 L/min.
Size resolved sampling: Very limited information on the size ranges that are collected. Efficiency drops in low volume glass impingers below aerodynamic diameters of 1 µm. High volume samplers have not been characterized for sampling efficiency as a function of particle size.
Viability: Impingers are flexible, since organisms are impinged into liquid media or buffer and can be used for culturing or molecular analysis.
Sample suitable for molecular methods: Samples are impinged into 10 to 20 ml of liquid, which may require concentration by filtration.
Advantages: Sample is collected into liquid and does not require extraction from a solid collection medium; low cost of low flow glass impingers.
Disadvantages: Limited information on efficiencies and the particle sizes that are sampled; high volume impingers are high cost; glass impingers suffer from low sampling rates and limited sampling times due to evaporation; high volume impingers have complex systems for collecting the sample and rewetting surfaces, and there is large concern about effectively decontaminating the equipment.
Common Liquid Impinger Samplers:
SKC BioSampler
Omni 3000 Hi Vol. Impinger
Aerosol Sampler Characteristics: Filtration
Sampler: Filtration. Mechanism: Aerosols are captured on filters by impaction or diffusional forces. Typical models: Andersen high volume PM samplers; SKC IMPACT samplers.
Sampling rate: Ranges from 4 L/min up to 1,000 L/min.
Size resolved sampling: Filtration samplers typically have size selective inlets that allow for sampling the 10 µm and below (PM10) and 2.5 µm and below (PM2.5) size fractions. Because of high diffusional forces, filters are efficient at sampling sizes down to the 20 nm range of viruses and microbial fragments.
Viability: Not recommended for viability due to high stresses from impaction and desiccation.
Sample suitable for molecular methods: Requires extraction from the filter material, often Teflon or polycarbonate membranes, quartz fiber filters, or gelatin filters.
Advantages: High sampling rates available; most common and robust form of high volume sampling; very small particles can be sampled, making it the most efficient way to sample viruses; can be used as personal samplers; low cost compared to impingers and impactors; preferred method for sampling PM for regulatory compliance.
Disadvantages: No possibility for viable determination; high volume samplers are not suitable for sampling in most occupied environments; limited ability to produce particle size distributions.
Common Filter Samplers:
SKC Personal Environmental Monitor
Andersen Hi Vol PM10 sampler
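The sampling rates quoted for the different samplers matter because the measured quantity (for example, a qPCR cell count) is normalized by the volume of air drawn. A small sketch of that arithmetic, using the low and high filter flow rates mentioned above; the 8-hour duration and the recovered cell count are hypothetical.

```python
def air_volume_m3(flow_l_per_min, minutes):
    """Volume of air sampled, converting liters to cubic meters."""
    return flow_l_per_min * minutes / 1000.0


def airborne_concentration(quantity, flow_l_per_min, minutes):
    """Measured quantity (e.g. cells from qPCR) per cubic meter of sampled air."""
    return quantity / air_volume_m3(flow_l_per_min, minutes)


# 8-hour samples at the low and high ends of the filter sampling rates.
personal = air_volume_m3(flow_l_per_min=4, minutes=480)
high_volume = air_volume_m3(flow_l_per_min=1000, minutes=480)
print(f"personal sampler: {personal:.2f} m^3, high volume sampler: {high_volume:.0f} m^3")

# Hypothetical example: 19,200 cells recovered from the personal sampler filter.
print(f"{airborne_concentration(19_200, 4, 480):.0f} cells per m^3")
```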
Important Resources
Tools for Sequence Analysis:
Some useful basic tools for getting started with bacterial and fungal phylogenetic analysis:
RDP Pyrosequencing pipeline: Easy-to-use pipeline for viewing histograms of raw sequences and sorting data based on barcodes. http://pyro.cme.msu.edu/
UniFrac: Beta diversity measurements, including PCoA plots of microbial populations. http://bmf2.colorado.edu/fastunifrac/
FHiTINGS: Automatically selects the best BLAST hit for fungal identification, assigns taxonomy, and parses data into tables. http://sourceforge.net/projects/fhitings/
All-in-one toolboxes that contain a variety of programs for complete sequence analysis:
QIIME: Quantitative Insights Into Microbial Ecology: http://qiime.sourceforge.net/
VAMPS: Visualization and Analysis for Microbial Population Structure: http://vamps.mbl.edu/index.php
MOTHUR: http://www.mothur.org/
To learn more:
Procedures for phylogenetic sequencing using Illumina-based DNA sequencing: Caporaso et al. (2012) “Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms.” ISME J 6: 1621-1624.
Reviews on aerosol science and molecular biology: Peccia et al. (2011) "New Directions: A revolution in DNA sequencing …", Atm. Environ., 45: 1896-1897, and Peccia, J., Hernandez, M. (2006) "Incorporating polymerase chain reaction-based identification …", Atm. Environ., 40: 3941-3961.
Good fungal aerosol next-gen sequencing paper: Adams et al. (2013) “Dispersal in microbes: fungi in indoor air are dominated by outdoor air and show dispersal limitation at short distances.” ISME J. doi.org/10.1038/ismej.2013.28
Good viral aerosol/qPCR paper: Yang et al. (2011) “Concentrations and size distributions of airborne influenza A viruses measured indoors at a health centre…” Journal of the Royal Society Interface, 8: 1176-1184.
Brock Biology of Microorganisms (11th edition or higher): easy-to-understand textbook that covers microbial genetics and phylogenetics.