Brief introduction to Next Generation Sequencing. By Amir Bagheri. Supervisor: Dr. S. J. Mowla

Introduction to NGS


Page 1: Introduction to NGS

Brief introduction to Next Generation Sequencing

By Amir Bagheri

Supervisor: Dr. S. J. Mowla

Page 2: Introduction to NGS

$1,000 genome challenge

Page 3: Introduction to NGS

• Advances in sequencing technologies paved the way for the $1,000 genome challenge, launched in 2005, when the goal seemed almost impossible. Sequencing the first human genome cost about $3 billion and took several international institutes, hundreds of researchers and 13 years to complete. In the past few years, however, the cost of sequencing has declined exponentially: James Watson’s genome was completed for less than $1 million, and by 2009 the cost of a whole-genome sequence had dropped to $100,000. Today, a mere 10 years after the completion of the first draft of the human genome, the $1,000 genome seems surprisingly close, and it is conceivable that it will be only a step towards even cheaper genomes.
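
As a rough illustration of the decline described on this slide, the fold reduction between the quoted cost milestones can be computed directly. The figures are the slide's approximate values; the Watson genome is given only as "less than $1 million", so $1 million is used here as an upper bound, and all names in the sketch are illustrative.

```python
# Approximate cost milestones quoted on the slide, in US dollars.
# The Watson figure is an upper bound ("less than $1 million").
milestones = [
    ("First human genome", 3_000_000_000),
    ("James Watson's genome (upper bound)", 1_000_000),
    ("Whole genome, 2009", 100_000),
    ("$1,000 genome goal", 1_000),
]


def fold_reductions(points):
    """Return (label, cost, fold reduction relative to the first entry)."""
    baseline = points[0][1]
    return [(label, cost, baseline / cost) for label, cost in points]


for label, cost, fold in fold_reductions(milestones):
    print(f"{label}: ${cost:,} ({fold:,.0f}-fold cheaper than the first genome)")
```

Under these figures, reaching the $1,000 goal would correspond to a roughly three-million-fold reduction relative to the first genome.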

Page 4: Introduction to NGS
Page 5: Introduction to NGS
Page 6: Introduction to NGS

Applications of NGS technologies

Page 7: Introduction to NGS

Experiment | Source DNA (input) | Description
WGS | gDNA | Identifies an individual's complete genome sequence including copy number variation (e.g., repeats, indels) and structural rearrangements (e.g., translocations)
Targeted “exome” sequencing | Protein-encoding gDNA (i.e., exons) | Identifies the sequence for all coding regions including copy number variation and structural rearrangements
RNA-seq | cDNA made from various sources of RNA | Can identify all transcribed sequences or just coding RNA sequences; can also provide information on sequence content (e.g., splicing variants) and copy number/abundance
Bisulfite-seq | Bisulfite-treated DNA | Identifies sites of DNA methylation (e.g., genetic imprinting)
ChIP-seq | Immunoprecipitated DNA | Identifies sites of protein–DNA interactions such as transcription factor–binding sites
RIP-seq | cDNA made from immunoprecipitated RNA | Identifies sites of protein–RNA interactions; a ChIP-seq for RNA-binding proteins
DNase-seq | DNase-digested chromatin DNA | Identifies genomic regions susceptible to enzymatic cleavage by DNase
FAIRE-seq | Open/accessible chromatin DNA | Identifies open/accessible chromatin regions
MNase-seq | Nucleosome-associated DNA | Identifies nucleosome positions on genomic DNA; also provides information on histone/nucleosome density at each location
Hi-C/5C-seq | Captured chromosome conformations | Identifies intra- and interchromosomal interactions; determines the spatial organization of chromosomes at high resolution
Metagenomics | Microbial DNA populations | Genomic analysis of microbial communities; identifies bacterial/viral populations present in specific environments (e.g., human gut and tumor samples)

Page 8: Introduction to NGS

Current NGS technologies

Page 9: Introduction to NGS

Roche 454 GS FLX sequencing

• Template DNA is fragmented, end-repaired, ligated to adapters, and clonally amplified by emulsion PCR. After amplification, the beads are deposited into picotiter-plate wells with sequencing enzymes. The picotiter plate functions as a flow cell where iterative pyrosequencing is performed. A nucleotide-incorporation event results in pyrophosphate (PPi) release and well-localized luminescence.
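
The pyrosequencing readout can be sketched in a few lines: nucleotides are flowed over the plate one at a time in a fixed order, and the light signal from a well is roughly proportional to the number of bases incorporated during that flow. The flow order, the normalized signal values and the simple rounding rule below are assumptions made for illustration; this is not the instrument's actual base-calling algorithm.

```python
def call_flowgram(signals, flow_order="TACG"):
    """Convert per-flow light signals into a base sequence.

    Each signal is assumed to be normalized so that 1.0 corresponds to a
    single incorporation; a homopolymer run of n bases gives roughly n.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(signal))  # nearest whole number of incorporations
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical signals for two rounds of T, A, C, G flows.
signals = [1.02, 0.05, 2.10, 0.96, 0.03, 1.08, 0.00, 2.95]
print(call_flowgram(signals))  # TCCGAGGG
```

Because a run of identical bases is read from a single signal, the length of longer homopolymers is harder to resolve than a single incorporation, which the rounding step makes visible.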

Page 10: Introduction to NGS

Illumina Genome Analyzer sequencing

• Adapter-modified, single-stranded DNA is added to the flow cell and immobilized by hybridization. Bridge amplification generates clonally amplified clusters. Clusters are denatured and cleaved; sequencing is initiated with addition of primer, polymerase and 4 reversible dye terminators. Post-incorporation fluorescence is recorded. The fluor and block are removed before the next synthesis cycle.
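
Because each cycle adds exactly one blocked, dye-labeled nucleotide per strand, a cluster yields one base per cycle. A minimal sketch of that idea follows, picking the brightest of four dye channels in each cycle. The channel order and intensity values are made up for illustration, and real base callers also correct for effects such as dye crosstalk and phasing, which this sketch ignores.

```python
CHANNELS = "ACGT"  # assumed channel order for this illustration


def call_cluster(cycles):
    """Call one base per cycle by picking the brightest of four channels."""
    read = []
    for intensities in cycles:
        if len(intensities) != len(CHANNELS):
            raise ValueError("expected one intensity per dye channel")
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Hypothetical intensities (A, C, G, T) for one cluster over five cycles.
cycles = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.1, 0.8, 0.2),
    (0.0, 0.1, 0.1, 0.7),
    (0.2, 0.9, 0.1, 0.0),
    (0.8, 0.2, 0.1, 0.1),
]
print(call_cluster(cycles))  # AGTCA
```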

Page 11: Introduction to NGS

Applied Biosystems SOLiD sequencing by ligation

• Top: SOLiD color-space coding. Each interrogation probe is an octamer, which consists of (3′-to-5′ direction) 2 probe-specific bases followed by 6 degenerate bases (nnnzzz) with one of 4 fluorescent labels linked to the 5′ end. The 2 probe-specific bases consist of one of 16 possible 2-base combinations. Bottom: (A), The P1 adapter and template with annealed primer (n) is interrogated by probes representing the 16 possible 2-base combinations. In this example, the 2 specific bases complementary to the template are AT. (B), After annealing and ligation of the probe, fluorescence is recorded before cleavage of the last 3 degenerate probe bases. The 5′ end of the cleaved probe is phosphorylated (not shown) before the second sequencing step. (C), Annealing and ligation of the next probe. (D), Complete extension of primer (n) through the first round consisting of 7 cycles of ligation. (E), The product extended from primer (n) is denatured from the adapter/template, and the second round of sequencing is performed with primer (n − 1).
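
With 16 dinucleotides and 4 dyes, each color stands for four different 2-base combinations, so a color read can only be turned into bases when a starting base is known. The sketch below uses the commonly described 2-base encoding, in which a color is the XOR of 2-bit base codes. The slide does not give the color assignment, so the numbering 0 to 3 and the function names are assumptions.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3 (2-bit codes, assumed numbering)


def encode_colors(sequence):
    """Encode a base sequence as one color (0-3) per overlapping dinucleotide."""
    return [BASES.index(a) ^ BASES.index(b) for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base, colors):
    """Recover the base sequence from a known first base and a color read."""
    sequence = [first_base]
    for color in colors:
        previous = BASES.index(sequence[-1])
        sequence.append(BASES[previous ^ color])
    return "".join(sequence)


colors = encode_colors("ATGGCA")
print(colors)                      # [3, 1, 0, 3, 1]
print(decode_colors("A", colors))  # ATGGCA
```

In this scheme the known first base would come from the adapter, and every later base depends on all preceding colors, so a single miscalled color changes the rest of the decoded read.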

Page 12: Introduction to NGS

Common Sequencing steps

Page 13: Introduction to NGS

Comparison of sequencing platforms

 | Roche 454 GS FLX | Illumina Genome Analyzer | Applied Biosystems SOLiD | Sanger
Sequencing method | Pyrosequencing | Reversible dye terminators | Sequencing by ligation | Dye terminators
Read lengths | 400 bases | 36 bases | 35 bases | 800 bp
Sequencing run time | 10 h | 2.5 days | 6 days | 3 h
Total bases per run | 500 Mb | 1.5 Gb | 4 Gb | 800 bp
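
The figures in this comparison can be turned into two derived quantities that make the trade-off easier to see: the approximate number of reads per run (total bases divided by read length) and the bases produced per hour of run time. The sketch below uses only the values listed on the slide; the dictionary layout and function name are illustrative.

```python
# Values as listed in the comparison (run time converted to hours).
platforms = {
    "Roche 454 GS FLX": {"read_length": 400, "run_hours": 10, "bases_per_run": 500_000_000},
    "Illumina Genome Analyzer": {"read_length": 36, "run_hours": 2.5 * 24, "bases_per_run": 1_500_000_000},
    "Applied Biosystems SOLiD": {"read_length": 35, "run_hours": 6 * 24, "bases_per_run": 4_000_000_000},
    "Sanger": {"read_length": 800, "run_hours": 3, "bases_per_run": 800},
}


def summarize(specs):
    """Return (approximate reads per run, bases per hour) for one platform."""
    reads = specs["bases_per_run"] / specs["read_length"]
    bases_per_hour = specs["bases_per_run"] / specs["run_hours"]
    return reads, bases_per_hour


for name, specs in platforms.items():
    reads, rate = summarize(specs)
    print(f"{name}: ~{reads:,.0f} reads per run, ~{rate:,.0f} bases per hour")
```

With these numbers the short-read platforms produce tens of millions of reads per run, while the Sanger column, which lists 800 bp for both read length and total bases, corresponds to a single read.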

Page 14: Introduction to NGS

[Figure labels: "Billions of sequences" and "100s of sequences"]

Page 15: Introduction to NGS

Contribution of different factors to the overall cost of a sequencing project across time. Left, the four-step process: (i) experimental design and sample collection, (ii) sequencing, (iii) data reduction and management, and (iv) downstream analysis. Right, the changes over time of relative impact of these four components of a sequencing experiment.

Page 16: Introduction to NGS

Let's discover the new horizon of deep se(a,q)