
Read full-length transcripts – no assembly required (2015)


www.pacb.com/isoseq

HUMAN BIOMEDICAL RESEARCH

PLANT AND ANIMAL SCIENCES

[Figure: Non-size-selected Iso-Seq libraries. Left panel: histogram of full-length transcript size for the PacBio RS II and the Sequel System. Right panel: non-size-selected SMRTbell library.]

Depicted on the left is a histogram plot of the number of full-length sequences by transcript length for a MagBead-loaded, non-size-selected Iso-Seq library sequenced on both the PacBio RS II and the Sequel System. The full-length cDNA sequences run on the Sequel System closely resemble the size distribution of the input SMRTbell library (shown on the right).

FROM RNA TO ACCURATE GENE MODELS

LONG-READ RNA SEQUENCING BEST PRACTICES

With Single Molecule, Real-Time (SMRT®) Sequencing and the Sequel® System, you can easily and affordably sequence transcript isoforms of up to 10 kb in their entirety. The Iso-Seq® method allows users to generate full-length cDNA sequences - with no assembly required - in order to confidently characterize the full complement of transcript isoforms within targeted genes, or across an entire transcriptome.

SAMPLE PREPARATION RECOMMENDATIONS

- Prepare full-length transcripts using the Clontech® SMARTer® PCR cDNA Synthesis Kit with as little as 1 ng of poly-A+ RNA or 2 ng of total RNA [1]
- The Sequel Sequencing Kit and protocols eliminate the need for size selection for transcripts <4 kb [2]
- Optional size-selection protocols to enrich for transcripts >4 kb
- Compatible with standard target enrichment methods, such as NimbleGen SeqCap EZ [3] or IDT xGen Lockdown Probes [4]
- Multiplex with sample barcoding [5]
- Scalable throughput
  - Up to 20 Gb, or 250k-350k full-length non-chimeric reads, per SMRT Cell 1M*
  - Profile transcripts from multiplexed samples in a single SMRT Cell
  - Survey transcriptomes in 1–2 SMRT Cells on the Sequel System
  - Increase sequencing depth for more comprehensive transcriptome characterization
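To make the multiplexing arithmetic concrete, here is a rough sketch that divides the per-cell yield of full-length non-chimeric (FLNC) reads evenly among barcoded samples. The 250k-350k range is the figure quoted above. The function name `flnc_reads_per_sample` and the assumption of perfectly even pooling are illustrative only, and actual yields vary with sample quality and insert size.

```python
def flnc_reads_per_sample(num_samples, cells=1, flnc_per_cell=(250_000, 350_000)):
    """Return (low, high) FLNC reads per sample, assuming perfectly even pooling.

    flnc_per_cell is the quoted per-SMRT-Cell range; it is an upper-end
    estimate, not a guarantee.
    """
    if num_samples < 1 or cells < 1:
        raise ValueError("num_samples and cells must be positive")
    low, high = flnc_per_cell
    # Integer division: a fraction of a read is not meaningful.
    return (low * cells // num_samples, high * cells // num_samples)


if __name__ == "__main__":
    low, high = flnc_reads_per_sample(num_samples=8)
    print(f"8 samples on 1 SMRT Cell: roughly {low:,} to {high:,} FLNC reads each")
```

An estimate like this can help decide whether a multiplexed design leaves enough depth per sample or whether additional SMRT Cells are warranted.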

MORE CONSISTENT LOADING ON SEQUEL SYSTEM REDUCES NEED FOR SIZE SELECTION

[Workflow diagram: Total RNA → optional poly-A selection → poly-A+ RNA → reverse transcription → full-length 1st-strand cDNA → large-scale amplification → amplified cDNA → optional size selection (>4 kb) → combined SMRTbell library → SMRT Sequencing on Sequel System → analyze with SMRT Analysis Software Suite]

* Read lengths, number of reads, data per SMRT Cell, and other sequencing performance results vary based on sample quality/type and insert size, among other factors.


KEY REFERENCES

1. Procedure & Checklist – Iso-Seq™ Template Preparation for Sequel® Systems
2. Clark, T. et al. (2017) Full-Length cDNA Sequencing on the PacBio Sequel Platform. Poster presented at Plant and Animal Genome Conference. San Diego, CA.
3. Full-length cDNA Target Sequence Capture Using SeqCap® EZ Libraries
4. Full-length cDNA Target Sequence Capture Using IDT xGen® Lockdown® Probes
5. Barcoding Samples for Isoform Sequencing (Iso-Seq Analysis)
6. Best practice for analyzing multiplexed Iso-Seq data
7. PacBio Support: Software Downloads
8. Running SMRT Analysis on Amazon
9. Tutorial: Iso-Seq Analysis Application
10. Abdel-Ghany, S.E. et al. (2016) A survey of the sorghum transcriptome using single-molecule long reads. Nature Communications. 7, e11706.

For Research Use Only. Not for use in diagnostic procedures. © Copyright 2018, Pacific Biosciences of California, Inc. All rights reserved. Information in this document is subject to change without notice. Pacific Biosciences assumes no responsibility for any errors or omissions in this document. Certain notices, terms, conditions and/or use restrictions may pertain to your use of Pacific Biosciences products and/or third party products. Please refer to the applicable Pacific Biosciences Terms and Conditions of Sale and to the applicable license terms at http://www.pacb.com/legal-and-trademarks/terms-and-conditions-of-sale/. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners. PN: BP103-020818

The Iso-Seq method allows you to make evidence-based genome annotations, discover novel genes and isoforms, identify promoters and splice sites to understand gene regulation, improve accuracy of RNA-seq quantification for gene expression studies, and distinguish important stress response, developmental, or tissue-specific isoforms.

Splice isoform analysis in the sorghum transcriptome using the Iso-Seq method greatly improved genome annotation, with >11,000 novel splice isoforms and >2,100 novel genes identified. In this example, a gene was discovered to produce 13 novel alternatively spliced isoforms, where the previous gene model contained only a single isoform [10].

The Iso-Seq protocol, available in SMRT Analysis, generates consensus sequences and identifies full-length transcripts by detecting the 5' primer, the poly-A sequence, and the 3' primer in each read.
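As a minimal sketch of that full-length test (not the SMRT Analysis implementation), the function below calls a read full-length when it starts with a 5' primer, ends with a 3' primer, and carries a poly-A run immediately upstream of the 3' primer. The primer sequences, the `min_polya` and `window` thresholds, and the function name are hypothetical placeholders. The sketch uses exact matching in one orientation only, whereas a production tool would also handle sequencing errors and reverse-complemented reads.

```python
import re

# Hypothetical placeholder primers for illustration only; substitute the
# primer sequences of the cDNA synthesis kit actually in use.
FIVE_PRIME = "GGCATTCGAGTC"
THREE_PRIME = "CTGACGTTAGCC"


def is_full_length(read, five=FIVE_PRIME, three=THREE_PRIME, min_polya=15, window=5):
    """Return True if the read has a 5' primer, a poly-A tail, and a 3' primer.

    window is the number of stray bases tolerated before the 5' primer
    and after the 3' primer.
    """
    # The 5' primer must begin within the first `window` bases.
    start = read.find(five, 0, window + len(five))
    if start == -1:
        return False
    # The 3' primer must end within the last `window` bases.
    end = read.rfind(three, max(0, len(read) - window - len(three)))
    if end == -1 or end < start + len(five):
        return False
    # The poly-A tail must sit immediately upstream of the 3' primer.
    insert = read[start + len(five):end]
    tail = re.search(r"A+$", insert)
    return tail is not None and len(tail.group()) >= min_polya


if __name__ == "__main__":
    example = FIVE_PRIME + "CGTTGCATGC" + "A" * 20 + THREE_PRIME
    print(is_full_length(example))
```

Reads that fail any of the three checks would be treated as non-full-length in this simplified scheme.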

INFORMATICS PIPELINE FOR ISO-SEQ ANALYSIS

DATA ANALYSIS SOLUTIONS WITH PACBIO SMRT ANALYSIS

- Use the Iso-Seq protocol in SMRT Analysis to output high-quality, full-length transcript sequences, with no assembly required, to characterize transcripts and splice variants and map transcripts back to a reference genome
- ‘IsoSeq’ option recommended for analysis of targeted Iso-Seq experiments; ‘IsoSeq2 (beta)’ recommended for whole transcriptome analysis [6]
- Run Iso-Seq analysis in either de novo (no genome reference required) or reference-based mode
- Install SMRT Analysis locally [7] or access it via Amazon Cloud [8]
- View tutorial [9] for running the Iso-Seq protocol in SMRT Analysis

DETERMINATION OF TRANSCRIPT ISOFORMS

SIGNIFICANTLY IMPROVE EXISTING GENOME ANNOTATIONS

[Figure: A gene and its mRNA isoforms under two approaches. Short-read technologies: reads spanning splice junctions give insufficient connectivity and splice isoform uncertainty. Iso-Seq solution: full-length cDNA sequence reads give splice isoform certainty, with no assembly required.]
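Because a full-length read spans every splice junction of its transcript, reads can be assigned to isoforms by comparing their ordered junction chains, with no assembly step. The sketch below illustrates that idea under simplifying assumptions: each aligned read is given as a list of exon blocks with hypothetical coordinates, junctions must match exactly, and the function names are illustrative rather than part of any PacBio tool.

```python
from collections import Counter


def junction_chain(exons):
    """Return the ordered splice junctions implied by a read's exon blocks.

    exons is a list of (start, end) tuples; each junction is reported as
    (end of upstream exon, start of downstream exon).
    """
    exons = sorted(exons)
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def count_isoforms(aligned_reads):
    """Count full-length reads supporting each distinct junction chain."""
    return Counter(junction_chain(exons) for exons in aligned_reads)


if __name__ == "__main__":
    # Hypothetical alignments: two reads with three exons, one skipping exon 2.
    reads = [
        [(100, 200), (300, 400), (500, 600)],
        [(100, 200), (300, 400), (500, 600)],
        [(100, 200), (500, 600)],
    ]
    for chain, support in count_isoforms(reads).items():
        print(chain, support)
```

With short reads, each read would typically cover only one junction, so the pairing of junctions within a single transcript could not be read off directly in this way.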

[Figure 1: (a) Experimental pipeline: polyA mRNA → cDNA synthesis with adapters → size partitioning & PCR amplification → SMRTbell ligation → RS sequencing → PacBio raw sequence reads. (b) Informatics pipeline: PacBio raw sequence reads → remove adapters, remove artifacts → clean sequence reads → reads clustering → isoform clusters → consensus calling → nonredundant transcript isoforms → quality filtering → final isoforms → map to reference genome → evidence-based gene models.]

[Figure: Numbered stages 1 to 5 of the Iso-Seq analysis, in order: PacBio raw sequence reads, classify sequence reads, isoform clusters, high-quality transcript isoforms, final isoforms.]