www.pacb.com/isoseq
HUMAN BIOMEDICAL RESEARCH
PLANT AND ANIMAL SCIENCES
[Figure: Non-size-selected Iso-Seq libraries. Left panel: full-length transcript size, PacBio RS II and Sequel System. Right panel: non-size-selected SMRTbell library.]
The histogram on the left shows the number of full-length sequences by transcript length for a MagBead-loaded, non-size-selected Iso-Seq library sequenced on both the PacBio RS II and the Sequel System. The full-length cDNA sequences from the Sequel System closely resemble the size distribution of the input SMRTbell library (shown on the right).
FROM RNA TO ACCURATE GENE MODELS
LONG-READ RNA SEQUENCING BEST PRACTICES
With Single Molecule, Real-Time (SMRT®) Sequencing and the Sequel® System, you can easily and affordably sequence transcript isoforms of up to 10 kb in their entirety. The Iso-Seq® method allows users to generate full-length cDNA sequences - with no assembly required - in order to confidently characterize the full complement of transcript isoforms within targeted genes, or across an entire transcriptome.
SAMPLE PREPARATION RECOMMENDATIONS
- Prepare full-length transcripts using the Clontech® SMARTer® PCR cDNA Synthesis Kit with as little as 1 ng of poly-A+ RNA or 2 ng of total RNA [1]
- The Sequel Sequencing Kit and protocols eliminate the need for size selection for transcripts <4 kb [2]
- Optional size-selection protocols to enrich for transcripts >4 kb
- Compatible with standard target enrichment methods, such as NimbleGen SeqCap EZ [3] or IDT xGen Lockdown Probes [4]
- Multiplex with sample barcoding [5]
- Scalable throughput
  - Up to 20 Gb, or 250k-350k full-length non-chimeric reads, per SMRT Cell 1M*
  - Profile transcripts from multiplexed samples in a single SMRT Cell
  - Survey transcriptomes in 1–2 SMRT Cells on the Sequel System
  - Increase sequencing depth for more comprehensive transcriptome characterization
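The throughput figures above lend themselves to a quick planning calculation for multiplexed runs. The following is a minimal sketch only: it assumes reads split evenly across barcoded samples (real runs rarely do so exactly), and the function names and the 50,000-read target are hypothetical, chosen for illustration. The 250k-350k range is the per-SMRT Cell 1M figure quoted above, which itself varies with sample quality and insert size.

```python
def reads_per_sample(n_samples, flnc_low=250_000, flnc_high=350_000):
    """Estimate the range of full-length non-chimeric (FLNC) reads per sample.

    Assumption: reads are divided evenly among the barcoded samples.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return flnc_low // n_samples, flnc_high // n_samples


def max_samples_per_cell(min_reads_per_sample, flnc_low=250_000):
    """Conservative plex level: use the low end of the expected yield."""
    if min_reads_per_sample < 1:
        raise ValueError("min_reads_per_sample must be at least 1")
    return flnc_low // min_reads_per_sample


if __name__ == "__main__":
    print(reads_per_sample(8))            # (31250, 43750)
    print(max_samples_per_cell(50_000))   # 5
```

Planning against the low end of the range is a deliberately cautious choice, since the yield per SMRT Cell is not guaranteed.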
MORE CONSISTENT LOADING ON SEQUEL SYSTEM REDUCES NEED FOR SIZE SELECTION
[Figure: Iso-Seq workflow. Total RNA, optional poly-A selection, poly-A+ RNA, reverse transcription, full-length 1st-strand cDNA, large-scale amplification, optional size selection (amplified cDNA >4 kb), combined SMRTbell library, SMRT Sequencing on Sequel System, analysis with the SMRT Analysis software suite.]
* Read lengths, number of reads, data per SMRT Cell, and other sequencing performance results vary based on sample quality/type and insert size, among other factors.
KEY REFERENCES
1. Procedure & Checklist – Iso-Seq™ Template Preparation for Sequel® Systems
2. Clark, T. et al. (2017) Full-Length cDNA Sequencing on the PacBio Sequel Platform. Poster presented at Plant and Animal Genome Conference. San Diego, CA.
3. Full-length cDNA Target Sequence Capture Using SeqCap® EZ Libraries
4. Full-length cDNA Target Sequence Capture Using IDT xGen® Lockdown® Probes
5. Barcoding Samples for Isoform Sequencing (Iso-Seq Analysis)
6. Best practice for analyzing multiplexed Iso-Seq data
7. PacBio Support: Software Downloads
8. Running SMRT Analysis on Amazon
9. Tutorial: Iso-Seq Analysis Application
10. Abdel-Ghany, S.E. et al. (2016) A survey of the sorghum transcriptome using single-molecule long reads. Nature Communications. 7, e11706.
For Research Use Only. Not for use in diagnostic procedures. © Copyright 2018, Pacific Biosciences of California, Inc. All rights reserved. Information in this document is subject to change without notice. Pacific Biosciences assumes no responsibility for any errors or omissions in this document. Certain notices, terms, conditions and/or use restrictions may pertain to your use of Pacific Biosciences products and/or third party products. Please refer to the applicable Pacific Biosciences Terms and Conditions of Sale and to the applicable license terms at http://www.pacb.com/legal-and-trademarks/terms-and-conditions-of-sale/. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners. PN: BP103-020818
The Iso-Seq method allows you to make evidence-based genome annotations, discover novel genes and isoforms, identify promoters and splice sites to understand gene regulation, improve accuracy of RNA-seq quantification for gene expression studies, and distinguish important stress response, developmental, or tissue-specific isoforms.
Splice isoform analysis in the sorghum transcriptome using the Iso-Seq method greatly improved genome annotation, with >11,000 novel splice isoforms and >2,100 novel genes identified. In this example, a gene was discovered to produce 13 novel alternatively spliced isoforms, where the previous gene model contained only a single isoform [10].
The Iso-Seq protocol, available in SMRT Analysis, generates consensus sequences and identifies full-length transcripts by detecting the 5’ primer, the poly-A sequence, and the 3’ primer in each read.
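The full-length test described above can be illustrated with a toy classifier. This is a simplified sketch, not the SMRT Analysis implementation: the primer sequences are made-up placeholders, the 12-base poly-A minimum and the 30-base search window are arbitrary assumptions, matching is exact, and only sense-strand reads are handled. A production tool would also tolerate sequencing errors and check the reverse complement.

```python
import re

# Placeholder primer sequences, invented for illustration only.
FIVE_PRIME = "ACGTTGCAGT"
THREE_PRIME = "CTGACGTTAG"  # written as it appears at the 3' end of a sense read
MIN_POLYA = 12              # assumed minimum poly-A tail length


def classify_read(read, window=30):
    """Return the transcript sequence if the read looks full-length, else None.

    A read counts as full-length here only if all three signals are present:
    the 5' primer near the start, the 3' primer near the end, and a poly-A
    run immediately before the 3' primer.
    """
    start = read[:window].find(FIVE_PRIME)
    end = read.find(THREE_PRIME, max(0, len(read) - window))
    if start == -1 or end == -1:
        return None
    insert = read[start + len(FIVE_PRIME):end]
    tail = re.search(r"A{%d,}$" % MIN_POLYA, insert)
    if tail is None:
        return None
    # Trim the poly-A tail; what remains is the transcript body.
    return insert[:tail.start()]


if __name__ == "__main__":
    transcript = "ATGGCCATTGTAATGGGCCGC"
    full_length = FIVE_PRIME + transcript + "A" * 20 + THREE_PRIME
    print(classify_read(full_length) == transcript)           # True
    print(classify_read(transcript + "A" * 20 + THREE_PRIME))  # None
```

Requiring all three signals is what separates a complete cDNA from a truncated or internally primed fragment in this sketch.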
INFORMATICS PIPELINE FOR ISO-SEQ ANALYSIS
DATA ANALYSIS SOLUTIONS WITH PACBIO SMRT ANALYSIS
- Use the Iso-Seq protocol in SMRT Analysis to output high-quality, full-length transcript sequences, with no assembly required, to characterize transcripts and splice variants and map transcripts back to a reference genome
- ‘IsoSeq’ option recommended for analysis of targeted Iso-Seq experiments; ‘IsoSeq2 (beta)’ recommended for whole transcriptome analysis [6]
- Run Iso-Seq analysis in either de novo (no genome reference required) or reference-based mode
- Install SMRT Analysis locally [7] or access it via Amazon Cloud [8]
- View tutorial [9] for running the Iso-Seq protocol in SMRT Analysis
DETERMINATION OF TRANSCRIPT ISOFORMS
SIGNIFICANTLY IMPROVE EXISTING GENOME ANNOTATIONS
[Figure: A gene and its mRNA isoforms. Short-read technologies: reads spanning splice junctions give insufficient connectivity and splice isoform uncertainty. Iso-Seq solution: full-length cDNA sequence reads give splice isoform certainty, with no assembly required.]
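The connectivity problem contrasted here can be shown with a small toy example. It is a sketch under stated assumptions: exons are represented by integer labels, each short read is assumed to cover at most one splice junction (that is, exon 3 is longer than the read length), and the gene model with two cassette exons is invented for illustration.

```python
def junctions(isoforms):
    """Collect the set of exon-exon junctions observed across a set of isoforms."""
    found = set()
    for exons in isoforms:
        # Each pair of consecutive exons in an isoform is one splice junction.
        found.update(zip(exons, exons[1:]))
    return found


# Hypothetical gene with five exons; exons 2 and 4 are cassette exons.
set_a = [(1, 2, 3, 4, 5), (1, 3, 5)]   # both included, or both skipped
set_b = [(1, 2, 3, 5), (1, 3, 4, 5)]   # exactly one included

if __name__ == "__main__":
    # Junction-level evidence is identical for the two isoform sets ...
    print(junctions(set_a) == junctions(set_b))  # True
    # ... while full-length reads, which record the whole exon chain, differ.
    print(set(set_a) == set(set_b))              # False
```

Both isoform sets produce the same six junctions, so junction-spanning reads alone cannot tell them apart, whereas a read covering the whole transcript identifies each isoform directly.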
[Figure 1. (a) Experimental pipeline: polyA mRNA, cDNA synthesis with adapters, size partitioning and PCR amplification, SMRTbell ligation, RS sequencing. (b) Informatics pipeline: PacBio raw sequence reads, remove adapters and artifacts (clean sequence reads), reads clustering (isoform clusters), consensus calling (nonredundant transcript isoforms), quality filtering (final isoforms), map to reference genome (evidence-based gene models).]
[Figure: Informatics pipeline for Iso-Seq analysis. Stages shown: PacBio raw sequence reads, classify sequence reads, isoform clusters, high-quality transcript isoforms, final isoforms, evidence-based gene models.]