For Research Use Only. Not for use in diagnostic procedures. © Copyright 2019 by Pacific Biosciences of California, Inc. All rights reserved. PN 101-784-500 Version 01 (April 2019)
SMRTbell Library Preparation for High Fidelity (HiFi/CCS) Long Read Sequencing
Sequel System II v7.0.0 / SMRT Link v7.0.0 / Sequel II Chemistry v1.0
April 23, 2019
1. HiFi Long Read Sequencing Features & Applications Overview
2. HiFi Library Sample Preparation & Sequencing Workflow Details
3. HiFi Long Read Sequencing Performance Example Data (Sequel II System)
4. Variant Detection Application Best Practices
5. Technical Documentation & Software Download Resources
HiFi Long Read Sequencing Features &
Applications Overview
HiFi Sequencing Workflow Provides High
Accuracy and Long Read Lengths
HIGHLY ACCURATE LONG READS: A NEW PARADIGM IN DNA
SEQUENCING
Advantages of High-Fidelity
(HiFi / CCS) Long Reads
- Generate higher quality genome
assemblies using a single technology
- Call all variant types – from single
nucleotides to structural variants
- Phase allele-specific haplotypes
- Identify full-length isoform transcripts
– no assembly required
HIFI READS: FROM DNA TO HIGH-QUALITY DATA
1. Start with high-quality double-stranded DNA
2. Ligate SMRTbell adapters and size select
3. Anneal primers and bind DNA polymerase; the DNA is now circularized and ready for SMRT Sequencing
4. Circularized DNA is sequenced in repeated passes
5. The polymerase reads are trimmed of adapters to yield subreads
6. Consensus is called from subreads to generate highly accurate long reads (HiFi reads)
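The benefit of repeated passes over the same circular template can be illustrated with a toy example. The Python sketch below is not the CCS algorithm used by SMRT Link; it assumes the subreads are already aligned and of equal length (a strong simplification) and takes a per-position majority vote. The function name and the example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(subreads):
    """Per-position majority vote over equal-length, pre-aligned subreads (toy model)."""
    if not subreads:
        raise ValueError("need at least one subread")
    length = len(subreads[0])
    if any(len(s) != length for s in subreads):
        raise ValueError("this toy model assumes equal-length subreads")
    consensus = []
    for column in zip(*subreads):
        # Pick the most frequent base observed at this position across passes.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical passes over one insert; three of the five carry a single error each.
passes = ["ACGTTGCA", "ACGATGCA", "ACGTTGGA", "ACCTTGCA", "ACGTTGCA"]
print(majority_consensus(passes))
```

Because each error appears in only one pass, the vote at every position recovers the same base, which is the intuition behind accuracy rising with the number of passes.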
HIFI READS: FROM DNA TO HIGH-QUALITY DATA (CONT.)
99% accuracy with 4 passes
HIFI READ YIELD AND ACCURACY PERFORMANCE ON SEQUEL SYSTEMS
[Figure: Yield per unit read length (kb) versus read length (kb, 0-300) for the Sequel System (1M) and the Sequel II System (8M).]
[Figure: Accuracy (Phred, 0-50) versus number of passes (0-20) for Sequel (1M) and Sequel II (8M).]
- Subread Yield: 318 Gb
- CCS Yield: 16 Gb
- CCS Accuracy: 99.8%
Data shown above from a 12 kb size-selected human library using the SMRTbell Template Prep Kit 1.0 on a Sequel II
System (v1.0 Chemistry, Sequel II System Software v7.0, 30-hour movie). Read lengths, reads/data per SMRT Cell 8M
and other sequencing performance results vary based on sample quality/type and insert size.
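The accuracy values in this deck are quoted both as percentages and on the Phred scale. The two are related by the standard definition Q = -10 log10(1 - accuracy). The short Python sketch below converts between them; the function names are illustrative only.

```python
import math


def accuracy_to_phred(accuracy):
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred quality score."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def phred_to_accuracy(q):
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


for acc in (0.99, 0.998):
    print(f"accuracy {acc:.3f} -> Q{accuracy_to_phred(acc):.1f}")
```

Under this definition 99% accuracy corresponds to Q20, and the 99.8% CCS accuracy reported here corresponds to roughly Q27.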
- For researchers who want to detect variants comprehensively
in a whole human genome, the Sequel II System provides high
precision and recall for single-nucleotide variants, indels,
structural variants, and copy-number variants, including in
repetitive regions of the genome.
- HiFi reads from the Sequel II System combine the benefits of
short and long reads: high accuracy and long read length.
- With the Sequel II System, comprehensive detection of variants
in a whole human genome requires 3-4 SMRT Cells 8M.
- Variant calling with HiFi reads uses standard software like
Google DeepVariant and GATK.
VARIANT DETECTION APPLICATION
POWERED BY HIFI READS
Gain complete views of human genetic diversity
with PacBio high-fidelity long-read sequencing
HiFi Library Sample Preparation &
Sequencing Workflow Details
PROCEDURE & CHECKLIST – PREPARING SMRTBELL LIBRARIES
FOR HIFI LONG READ SEQUENCING ON SEQUEL AND SEQUEL II
SYSTEMS
- Procedure for constructing SMRTbell libraries
suitable for generating high accuracy long reads
on the Sequel and Sequel II Systems
- Protocol document contains:
1. Recommendations for gDNA QC and quantification
2. Recommendations for shearing gDNA to the
desired target mode size using the Megaruptor
System from Diagenode
3. Enzymatic steps for preparation of a HiFi
SMRTbell library
4. Instructions for size-selection of the HiFi SMRTbell
library using the SageELF System from Sage
Science
PROCEDURE & CHECKLIST – PREPARING HIFI LIBRARIES USING
THE STANDARD SMRTBELL TEMPLATE PREPARATION KIT 1.0
List of Required Materials and Equipment for Library Construction and Size
Selection
PACIFIC BIOSCIENCES® CONFIDENTIAL
HIFI SMRTBELL LIBRARY PREPARATION WORKFLOW OVERVIEW
1. gDNA QC & Shearing
- CHEF Mapper, FEMTO Pulse or Pippin Pulse sizing QC
- Qubit DNA concentration measurement QC
- Megaruptor shearing
2. SMRTbell Library Construction
- Standard SMRTbell Template Preparation Kit 1.0
- 2-Day LC workflow includes overnight ligation step
3. Size-select & Purify SMRTbell Library
- 10 – 15 kb size selection using SageELF
- AMPure PB purification of final SMRTbell library
4. Sample A/B/C & Sequence
- Anneal v2 Primer, Bind Polymerase, AMPure PB Complex
Cleanup
- Follow QRC for diffusion loading recommendations
5. Analyze
- Manage data through SMRT Link GUI
- For variant calling analysis, use GATK or Google DeepVariant
HIFI LIBRARY PREPARATION WORKFLOW IS MODIFIED FROM
STANDARD LIBRARY CONSTRUCTION PROCEDURES USING
SMRTBELL TEMPLATE PREP KIT 1.0
Library Construction Step: High Fidelity Procedure | Standard Procedure
- Shear gDNA and concentration: Megaruptor, target = 15 kb | Megaruptor or g-TUBE, target up to 75 kb
- Remove ssDNA ends: Exo VII | Exo VII
- DNA damage repair: DNA Damage Repair Mix | DNA Damage Repair Mix
- Blunt-end repair: ER Mix | ER Mix
- AMPure Purification: Yes | Yes
- Ligation: Blunt Adapter, overnight | Blunt Adapter, overnight
- AMPure Purification: Yes | Yes
- Size-selection: SageELF, target = 9-15 kb (collect 3 fractions) | BluePippin size selection (≥10-40 kb lower cutoff)
- Annealing, Binding and Cleanup: v2 primer; 10:1 = Pol:Template; 4 hr binding; AMPure clean up | v4 primer; 30:1 = Pol:Template; 1 hr binding; AMPure clean up
HIFI SMRTBELL LIBRARY PREPARATION WORKFLOW TIME
[Figure: Three-day timeline (Day 1 to Day 3) for the SMRTbell Template Prep Kit 1.0 workflow, giving hands-on (H-O) and walk-away (W-A) time per step. Steps shown: ExoVII Pre-Tx, DDR, ER, 0.45X AMPure, Adapter Ligation, ExoIII / ExoVII, 0.45X AMPure, Elf Fractionation, 0.45X AMPure. Time labels as extracted: 15 m, 45 m, O/N, 60 m, 5 h, 45 m, 15 m, 60 m, 45 m, 3 m, 10 m, 5 m, 3 m, 0.5 h, 10 m, 3 m, 3 m, 10 m.]
H-O: Hands-on Time
W-A: Walk-away Time
HIFI LIBRARY INPUT GENOMIC DNA SAMPLE REQUIREMENTS
Library Construction Step Requirements
Recommended input gDNA size for shearing >40 kb
Recommended input gDNA amount for shearing 15 µg
Required sheared gDNA amount for Exo VII pre-treatment step 10 µg*
* Maximum input amount for Exo VII pre-treatment step is 10 µg. For >10 µg, scale the reaction volumes.
Recommended methods for determining gDNA size distribution:
1. CHEF Mapper XA System (Bio-Rad)
➢ Up to 10 Mb
➢ >16 Hour Run Time
https://www.pacb.com/wp-content/uploads/Procedure-Checklist-Using-the-BIO-RAD-CHEF-Mapper-XA-Pulsed-Field-Electrophoresis-System.pdf
2. FEMTO Pulse System (Agilent)
➢ Up to 165 kb
➢ 1 Hour Run Time
https://www.aati-us.com/instruments/femto-pulse/
3. Pippin Pulse System (Sage Science)
➢ Up to 80 kb
➢ 16 Hour Run Time
https://www.pacb.com/wp-content/uploads/Procedure-Checklist-Using-the-Sage-Science-Pippin-Pulse-Electrophoresis-Power-Supply-System.pdf
Panel A (Bio-Rad CHEF Mapper):
Lane 1: 8-48 kb Ladder (Bio-Rad)
Lane 2: 5 kb Ladder (Bio-Rad)
Lane 3: HMW gDNA
Lane 4: Degraded gDNA
Panel B (FEMTO Pulse):
Lane 1: High MW gDNA
Lane 2: Degraded gDNA
Lane 3: 165 kb Ladder
Evaluation of gDNA quality using A) Bio-Rad CHEF Mapper and
B) Advanced Analytical FEMTO Pulse. Lanes A3 and B1 show an
example of a high quality, high molecular weight gDNA sample,
where most of the DNA migrates as a prominent band at the top of
the gel image. Lanes A4 and B3 show an example of a partially
degraded gDNA sample where most of the DNA migrates below ~40
kb and is thus not suitable for constructing a HiFi SMRTbell library.
RECOMMENDATIONS FOR EVALUATING GENOMIC DNA QUALITY
Recommended Megaruptor Shearing Conditions
Hydropore type: Long
Megaruptor software setting: 15 kb
DNA sample concentration: 50 ng/µl
DNA sample Volume: Up to 300 µl
MEGARUPTOR TOOL IS RECOMMENDED FOR SHEARING DNA
FOR HIFI SMRTBELL LIBRARY CONSTRUCTION
Megaruptor generates a tight DNA shearing distribution profile that results
in good recoveries during SMRTbell library size selection
- For HiFi library preparation, PacBio recommends shearing gDNA to a mode size of
approximately 12 kb
- For high quality gDNA, typical yields of sheared and concentrated DNA are >60%
- Because 10 μg of sheared gDNA is needed for the subsequent enzymatic steps, PacBio
recommends starting the shearing procedure with 15 μg of input gDNA
- Because the response of individual gDNA samples to recommended shearing parameters
may differ, small scale test shears are highly recommended (e.g., 50 μL at 50 ng/μL)
- Under- or over-shearing gDNA will result in low yields of final, size fractionated library
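The shearing numbers above fit together arithmetically: 50 ng/µl in 300 µl is 15 µg of input, and obtaining 10 µg of sheared gDNA from that input requires a recovery of about 67%. The Python sketch below reproduces these two calculations; the function names are hypothetical and the code is only an illustration of the arithmetic, not part of the PacBio procedure.

```python
def shear_input_mass_ug(concentration_ng_per_ul, volume_ul):
    """Mass of gDNA (in micrograms) loaded for shearing."""
    return concentration_ng_per_ul * volume_ul / 1000.0


def min_recovery_fraction(required_ug, input_ug):
    """Fraction of the input that must be recovered after shearing and concentration."""
    if input_ug <= 0:
        raise ValueError("input mass must be positive")
    return required_ug / input_ug


full_shear_ug = shear_input_mass_ug(50, 300)   # recommended conditions
test_shear_ug = shear_input_mass_ug(50, 50)    # small-scale test shear
needed = min_recovery_fraction(10, full_shear_ug)

print(f"full shear input: {full_shear_ug} µg")
print(f"test shear input: {test_shear_ug} µg")
print(f"recovery needed for 10 µg: {needed:.0%}")
```

A sample whose recovery falls near the low end of the typical range may therefore need more than 15 µg of input to reach 10 µg of sheared gDNA.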
MEGARUPTOR DNA SHEARING EXAMPLE
- FEMTO Pulse analysis of human gDNA sheared to a 12-kb mode size using the ‘15 kb’
shear size setting in the Megaruptor software.
PERFORMING SHEARING OPTIMIZATIONS IS IMPORTANT
- Sample 1: 12,296 bp
- Sample 2: 10,288 bp
- Sample 3: 13,492 bp
- Above Figure shows examples of 3 different gDNA samples sheared using the same protocol.
- Under- or over-shearing impacts yield of fractions recovered from the library size selection step, hence
impacting the number of SMRT Cells that can be achieved per library prep
Performing test shears is recommended since different genomic DNA
samples may shear differently
SageELF TOOL IS THE GOLD STANDARD FOR PERFORMING HIFI
LIBRARY SIZE SELECTION
Fraction 4: 12977 bp
Fraction 5: 10993 bp
Fraction 6: 9540 bp
- 12 size fractions are collected per SageELF gel cassette (Run Time is ~4.5 hours)
- SMRTbell libraries with insert sizes of ~9 kb, ~11 kb and ~13 kb are suitable for HiFi (CCS) sequencing
- SageELF Cassette Definition File and Run Protocol Setup
- “0.75% 1-18kb v2”; ‘Size-based’ separation mode; Target value = 3400 (move the bar slider to
select Well #12)
- Recommended SMRTbell library input amount per lane: 3-5 µg
- Typical recovery yields (for the total sum of 3 collected fractions of interest) are ~15-30% per lane – but
will be highly dependent on the size distribution of the starting sheared DNA
The right-hand figure shows a FEMTO Pulse analysis of recovered human
HG002 SMRTbell library fractions after size selection using a
SageELF system.
ALWAYS PERFORM SIZING QC ON RECOVERED SMRTBELL
LIBRARY FRACTIONS AFTER SageELF SIZE SELECTION
- To determine which fractions are suitable for generating HiFi reads, QC all recovered SMRTbell
library fractions after SageELF size selection using FEMTO Pulse, Fragment Analyzer, CHEF
Mapper or Pippin Pulse
- FEMTO Pulse sizing QC method is highly recommended since it requires very small amounts of DNA
sample (≤500 pg)
- SMRTbell libraries with insert sizes of ~9 kb, ~11 kb and ~13 kb are suitable for HiFi (CCS) sequencing
[Figure: FEMTO Pulse analysis of HG002 library fractions 4, 5 and 6 after SageELF size selection.]
[Figure: CHEF Mapper analysis of HG002 library fractions 4, 5 and 6 after SageELF size selection. Labeled size markers: 10 kb, 20 kb, 48 kb.]
EXAMPLE SMRTBELL LIBRARY YIELDS AFTER SageELF
FRACTIONATION OF FOUR DIFFERENT HUMAN SAMPLES
Sample | Input Sheared gDNA (ng) | SMRTbell Library (ng) | Input into SageELF (ng) | SageELF Fraction: 9 kb (ng) | SageELF Fraction: 11 kb (ng) | SageELF Fraction: 13 kb (ng) | Final SMRTbell Yield (%)*
1 | 8,000 | 3,456 | 3,456 | 270 | 365 | 424 | 13.2 %
2 | 8,000 | 4,032 | 4,032 | 216 | 296 | 318 | 10.3 %
3 | 8,000 | 5,888 | 5,000 | 546 | 584 | 622 | 21.0 %
4 | 8,000 | 4,700 | 4,700 | 506 | 424 | 278 | 15.1 %
* Final SMRTbell library yield is calculated from the total sum of 3 collected SageELF fractions per input sheared gDNA amount.
For HiFi library sequencing, the collected SageELF fractions should have a size distribution mode between ~9 – 15 kb
- SMRTbell library yield after SageELF size selection is highly dependent on the final SMRTbell
library size distribution
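The footnote defines the final yield as the sum of the three collected fractions divided by the input sheared gDNA amount. The Python sketch below applies that definition to the four rows of the table; the dictionary layout and function name are illustrative. For samples 1 and 4 the result matches the printed values, while for samples 2 and 3 it gives about 10.4% and 21.9%, slightly different from the printed 10.3% and 21.0%; the slide does not explain the difference.

```python
# Values copied from the table above (all masses in ng).
samples = {
    1: {"input_ng": 8000, "fractions_ng": (270, 365, 424)},
    2: {"input_ng": 8000, "fractions_ng": (216, 296, 318)},
    3: {"input_ng": 8000, "fractions_ng": (546, 584, 622)},
    4: {"input_ng": 8000, "fractions_ng": (506, 424, 278)},
}


def final_yield_percent(input_ng, fractions_ng):
    """Sum of the collected SageELF fractions as a percentage of input sheared gDNA."""
    return 100.0 * sum(fractions_ng) / input_ng


for sample_id, row in samples.items():
    pct = final_yield_percent(row["input_ng"], row["fractions_ng"])
    print(f"sample {sample_id}: {sum(row['fractions_ng'])} ng recovered, {pct:.1f}% final yield")
```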
SAMPLE SETUP RECOMMENDATIONS FOR SEQUENCING HIFI
SMRTBELL LIBRARIES USING THE SEQUEL II SYSTEM
(CHEMISTRY V1.0)
- Sequence each size-selected fraction of interest individually
- For Sequel II Systems: PacBio recommends sequencing fractions with insert size mode 11 kb
- Final purified SMRTbell library amount from each fraction is sufficient for approx. 3-4 Sequel II
SMRT Cell 8M
Parameter Recommendation
HiFi Library Insert Size 9 kb, 11 kb, 13 kb
Sequencing Primer Primer v2, 20:1 (Primer:Template)
Sequel II Polymerase 1.0 10:1 (Polymerase:Template)
Pre-extension Time 2 hours
Movie Collection Time 30 hours
On-plate loading concentration (OPLC) 25-50 pM
Target % P1 50-70%
In SMRT Link v7.0.0 Sample Setup, choose ‘Sequencing Primer v3’ from the dropdown menu but substitute
in Sequencing Primer v2 when setting up HiFi library annealing and binding reactions using the Sequel II
System (Chemistry v1.0). (Note: SMRT Link v7.0.0 does not include an option to specify Primer v2.)
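SMRT Link Sample Setup performs the actual annealing, binding and loading calculations. As a rough cross-check only, the Python sketch below converts between mass concentration and molar concentration using the common approximation of 660 g/mol per base pair of double-stranded DNA. It ignores the mass of the hairpin adapters and is an assumption-laden sketch, not the Sample Setup method; all names are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate average mass of one dsDNA base pair (assumption)


def ng_per_ul_to_nm(conc_ng_per_ul, insert_bp):
    """Approximate molar concentration (nM) of a dsDNA library of a given insert size."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * insert_bp)


def pm_to_ng_per_ul(conc_pm, insert_bp):
    """Approximate mass concentration (ng/µl) for a molar concentration given in pM."""
    conc_nm = conc_pm / 1000.0
    return conc_nm * AVG_BP_MASS_G_PER_MOL * insert_bp / 1e6


# On-plate loading range from the table above, for an 11 kb insert.
for oplc_pm in (25, 50):
    print(f"{oplc_pm} pM at 11 kb is about {pm_to_ng_per_ul(oplc_pm, 11000):.3f} ng/µl")
```

The conversion makes clear that the same picomolar loading target corresponds to a larger mass of DNA as insert size increases.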
EXAMPLE SMRT LINK V7.0 RUN QC ANALYSIS: 11-KB SageELF SIZE-
SELECTED HUMAN HIFI SMRTBELL LIBRARY (SEQUEL II SYSTEM)
50% of total base yield is found in reads >130 kb
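HG005_11kb* (SMRT Cell 1)
- Movie Time: 30 h
- PE Time: 2 h
- Total Bases: 261.62 GB
- Unique Mol. Yield: 38.57 GB
- Polymerase Read Length: mean 57445, N50 135591
- Longest Subread Length: mean 9188, N50 10677
- Productivity: P0 42.1% (3373130), P1 56.8% (4554204), P2 1.1% (87337)
- Local Base Rate: 2.06
HG005_11kb* (SMRT Cell 2)
- Movie Time: 30 h
- PE Time: 2 h
- Total Bases: 280.78 GB
- Unique Mol. Yield: 42.14 GB
- Polymerase Read Length: mean 56991, N50 132967
- Longest Subread Length: mean 9325, N50 10715
- Productivity: P0 37.1% (2974150), P1 61.5% (4926672), P2 1.4% (113849)
- Local Base Rate: 2.05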
Insert Read Length Density
Plot indicates many SMRTbell
library inserts are ~10-11 kb
A large proportion of CCS
reads were generated
* The same 11-kb human HiFi SMRTbell library was sequenced on two Sequel II SMRT Cells 8M using 50 pM OPLC
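The productivity percentages in the run QC table can be recomputed from the ZMW counts shown in parentheses, and compared against the 50-70% P1 target given in the sample setup recommendations. The Python sketch below does this for both SMRT Cells; the function names are illustrative.

```python
def productivity_percent(p0, p1, p2):
    """Return P0/P1/P2 as percentages of all ZMWs, given the raw ZMW counts."""
    total = p0 + p1 + p2
    return {
        "P0": 100.0 * p0 / total,
        "P1": 100.0 * p1 / total,
        "P2": 100.0 * p2 / total,
    }


def p1_in_target(p1_percent, low=50.0, high=70.0):
    """Check P1 against the recommended 50-70% target range."""
    return low <= p1_percent <= high


# ZMW counts copied from the run QC table above.
cells = {
    "cell 1": (3373130, 4554204, 87337),
    "cell 2": (2974150, 4926672, 113849),
}

for name, counts in cells.items():
    pct = productivity_percent(*counts)
    summary = ", ".join(f"{k} {v:.1f}%" for k, v in pct.items())
    print(f"{name}: {sum(counts)} ZMWs, {summary}, P1 in target: {p1_in_target(pct['P1'])}")
```

Both cells sum to the same number of ZMWs (about 8 million), and both P1 values fall inside the target range.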
SAMPLE SETUP RECOMMENDATIONS FOR SEQUENCING HIFI
SMRTBELL LIBRARIES USING THE SEQUEL SYSTEM
(CHEMISTRY V3.0)
Parameter Recommendation
HiFi Library Insert Size 9 kb, 11 kb, 13 kb
Sequencing Primer Primer v2, 20:1 (Primer:Template)
Sequel Polymerase 3.0 10:1 (Polymerase:Template)
Pre-extension Time 8 hours
Movie Collection Time 20 hours
On-plate loading concentration (OPLC) 5-7 pM
Target % P1 50-70%
- Sequence each size-selected fraction of interest individually
- For Sequel Systems, PacBio recommends sequencing fractions with insert size mode 13 kb
- Final purified SMRTbell library amount from each fraction is sufficient for approx. 20 Sequel SMRT
Cell 1M
Use the Excel Sample Setup Calculator to set up annealing and binding reactions for sequencing HiFi
SMRTbell libraries using the Sequel System (Chemistry v3.0)
HiFi Long Read Sequencing Performance
Example Data (Sequel II)
HIFI LIBRARY SEQUENCING READ LENGTH
AND YIELD PERFORMANCE
Data shown above from a 11 kb size-selected human library using the SMRTbell Template Prep Kit on a Sequel II System (v1.0 Chemistry,
Sequel II System Software v7.0, 30-hour movie). Read lengths, reads/data per SMRT Cell 8M and other sequencing performance results vary
based on sample quality/type and insert size.
Metric
Number of Raw Bases (Gb) 320
Total Reads 4,053,000
Half of Bases in Reads >166,571
Longest read lengths >300,000
HIFI LIBRARY SEQUENCING ACCURACY PERFORMANCE
Q20 (99%) single-molecule
accuracy
Metric
Insert Size 12 kb
Number of >Q20 Bases 21 Gb
Number of >Q20 Reads 1,855,642
Accuracy (Mean) 99.8%
HIFI READS ENABLE HIGH PRECISION & RECALL FOR DETECTING ALL VARIANT TYPES
[Figure: Bar chart of precision and recall (percentage, axis 90-100%) for SNVs, small indels and SVs. Bar labels as extracted: 99.3, 95.1, 95.4, 98.6, 93.0, 96.8.]
Variant calls from 15-fold HiFi read coverage of a human genome (HG002) were measured against the Genome in a
Bottle small variant benchmark (v3.3.2) for SNVs and indels and the v0.6 benchmark for SVs.
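Precision and recall against a benchmark are computed from counts of true positives (calls present in the benchmark), false positives (calls absent from the benchmark) and false negatives (benchmark variants that were not called). The Python sketch below shows the standard definitions; the counts used are hypothetical and are not taken from the HG002 evaluation.

```python
def precision(true_pos, false_pos):
    """Fraction of reported variant calls that are correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Fraction of benchmark variants that were recovered."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts, for illustration only.
tp, fp, fn = 990, 10, 30
print(f"precision: {100 * precision(tp, fp):.1f}%")
print(f"recall:    {100 * recall(tp, fn):.1f}%")
```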
VARIANT DETECTION COVERAGE TITRATION
[Figure: Three panels plotting precision and recall (percentage, 40-100%) against fold coverage: "SNVs with DeepVariant" (0-30 fold), "Indels with DeepVariant" (0-30 fold) and "Structural Variants" (0-35 fold).]
15-fold HiFi Coverage
(2-3 SMRT Cells 8M)
provides a good trade-off
between cost and results
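The number of SMRT Cells 8M needed for a target coverage follows from the genome size and the HiFi yield per cell. The Python sketch below assumes a human genome of roughly 3.0 Gb (an assumption, not a figure from this deck) and uses the 15-18 Gb per-cell HiFi yield range quoted in the best practices section. The function name is hypothetical.

```python
HUMAN_GENOME_GB = 3.0  # rough genome size in Gb (assumption, not from the source)


def cells_for_coverage(fold_coverage, yield_gb_per_cell, genome_gb=HUMAN_GENOME_GB):
    """Fractional number of SMRT Cells needed to reach a target fold coverage."""
    if yield_gb_per_cell <= 0:
        raise ValueError("yield per cell must be positive")
    return fold_coverage * genome_gb / yield_gb_per_cell


for yield_gb in (15, 18):
    cells = cells_for_coverage(15, yield_gb)
    print(f"15-fold coverage at {yield_gb} Gb per cell: {cells:.1f} cells")
```

Under these assumptions 15-fold coverage needs between 2.5 and 3 cells, in line with the 2-3 SMRT Cells 8M stated above; lower per-cell yields push the requirement higher.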
Variant Detection Application Best Practices
VARIANT DETECTION BEST PRACTICES (SEQUEL II SYSTEM)
Template Preparation with SMRTbell Template Prep Kit 1.0
- Use recommended high-quality genomic DNA input (≥15 µg)
- Prepare 10-15 kb SMRTbell library using Procedure & Checklist - Preparing SMRTbell
Libraries for HiFi Long Read Sequencing
- Perform SMRTbell library size selection using the SageELF System
- Sequence with 11 kb size-selected library fraction (supports ~4 SMRT Cell 8M)
Sequence on the Sequel II System
- Maximize output and turn-around-time with adjustable sequencing parameters
- Use CCS Sequencing Mode, 2-hour pre-extension time, and 30-hour collection times
- Generate 15-18 Gb of high-quality long reads per Sequel II SMRT Cell 8M*
- Sequence to desired CCS coverage based on study needs:
- Recommend ~2-3 SMRT Cell 8M to achieve 15-fold CCS coverage for human studies
Data Analysis Solutions with the PacBio Analytical Portfolio
- Call single-nucleotide variants and small indels with Google DeepVariant or
GATK. DeepVariant provides higher precision and recall, particularly for indels
- Phase variants with WhatsHap
- Call structural variants and copy-number variants with the ‘Structural Variant Calling’
application in SMRT Link (powered by pbsv)
* Read lengths, number of reads, data per SMRT Cell, and other sequencing performance results vary based on sample quality/type and insert size,
among other factors.
Application Brief: Variant detection using whole genome sequencing with HiFi reads – Best Practices
Technical Documentation and Resources
TECHNICAL DOCUMENTATION AND OTHER RESOURCES
- Procedure & Checklist - Preparing SMRTbell Libraries for HiFi Long Read Sequencing on
Sequel and Sequel II Systems (PN 101-714-400)
- Procedure & Checklist – AMPure PB Bead Purification of Polymerase Bound SMRTbell
Complexes (PN 101-348-800)
- SMRT Link v7.0.0 Sample Setup should be used to set up HiFi library annealing and binding
reactions using the Sequel II System (Chemistry v1.0)
- Note: In SMRT Link v7.0.0 Sample Setup, choose ‘Sequencing Primer v3’ from the dropdown menu
but substitute in Sequencing Primer v2
- Excel Sample Setup Calculator should be used to set up HiFi library annealing and binding
reactions using the Sequel System (Chemistry v3.0). [Please contact PacBio or your
Local FAS to request a copy of the Excel Sample Setup Calculator.]
- Application Brief: Variant detection using whole genome sequencing with HiFi reads – Best
Practices (PN: BP106-041919)
- Webinars
- AGBT Presentation (March 1, 2019): HiFi long reads for comprehensive genomic analysis [Mike
Hunkapiller, Pacific Biosciences]
- Publications
- Wenger, A.M., Peluso, P. et al. Highly-accurate long-read sequencing improves variant detection and
assembly of a human genome. BioRxiv Preprint (https://doi.org/10.1101/519025)
For Research Use Only. Not for use in diagnostic procedures. © Copyright 2019 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo,
PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx.
FEMTO Pulse and Fragment Analyzer are trademarks of Agilent Technologies Inc.
All other trademarks are the sole property of their respective owners.
www.pacb.com