
Four-color DNA sequencing with 3′-O-modified nucleotide reversible terminators and chemically cleavable fluorescent dideoxynucleotides

Jia Guo*†‡, Ning Xu*, Zengmin Li*†, Shenglong Zhang*†‡, Jian Wu*†‡, Dae Hyun Kim*, Mong Sano Marma*†, Qinglin Meng*†‡, Huanyan Cao*, Xiaoxu Li*†, Shundi Shi*, Lin Yu*, Sergey Kalachikov*, James J. Russo*, Nicholas J. Turro†‡§, and Jingyue Ju*†§

*Columbia Genome Center, Columbia University College of Physicians and Surgeons, New York, NY 10032; and Departments of †Chemical Engineering and ‡Chemistry, Columbia University, New York, NY 10027

Contributed by Nicholas J. Turro, April 28, 2008 (sent for review March 21, 2008)

DNA sequencing by synthesis (SBS) on a solid surface during polymerase reaction can decipher many sequences in parallel. We report here a DNA sequencing method that is a hybrid between the Sanger dideoxynucleotide terminating reaction and SBS. In this approach, four nucleotides, modified as reversible terminators by capping the 3′-OH with a small reversible moiety so that they are still recognized by DNA polymerase as substrates, are combined with four cleavable fluorescent dideoxynucleotides to perform SBS. The ratio of the two sets of nucleotides is adjusted as the extension cycles proceed. Sequences are determined by the unique fluorescence emission of each fluorophore on the DNA products terminated by ddNTPs. On removing the 3′-OH capping group from the DNA products generated by incorporating the 3′-O-modified dNTPs and the fluorophore from the DNA products terminated with the ddNTPs, the polymerase reaction reinitiates to continue the sequence determination. By using an azidomethyl group as a chemically reversible capping moiety in the 3′-O-modified dNTPs, and an azido-based cleavable linker to attach the fluorophores to the ddNTPs, we synthesized four 3′-O-azidomethyl-dNTPs and four ddNTP-azidolinker-fluorophores for the hybrid SBS. After sequence determination by fluorescence imaging, the 3′-O-azidomethyl group and the fluorophore attached to the DNA extension product via the azidolinker are efficiently removed by using Tris(2-carboxyethyl)phosphine in aqueous solution that is compatible with DNA. Various DNA templates, including those with homopolymer regions, were accurately sequenced with a read length of >30 bases by using this hybrid SBS method on a chip and a four-color fluorescence scanner.

sequencing by synthesis | DNA chip

The completion of the Human Genome Project (1) was a monumental achievement in biological science. The engine behind this project was the Sanger sequencing method (2), which is still the gold standard in genome research. The prolonged success of the Sanger sequencing method is because of its efficiency and fidelity in producing dideoxy-terminated DNA products that can be separated electrophoretically and detected by fluorescence (3–5). However, a challenge in the use of electrophoresis for DNA separation is the difficulty in achieving high throughput and the complexity involved in the automation, although some level of increased parallelization may be achieved by using miniaturization (6).

To overcome the limitations of the Sanger sequencing technology, a variety of new methods have been investigated. Such approaches include sequencing by hybridization (7), mass spectrometry sequencing (8, 9), sequencing by nanopores (10), and sequencing by ligation (11). More recently, DNA sequencing by synthesis (SBS) approaches such as pyrosequencing (12), sequencing of single DNA molecules (13, 14), and polymerase colonies (15) have been widely explored. Previously, we reported the development of a general strategy to rationally design cleavable fluorescent nucleotide reversible terminators (NRTs) for four-color DNA sequencing by synthesis (16–20). In this approach, four nucleotides (A, C, G and T) are modified as NRTs by attaching a cleavable fluorophore to a specific location on the base and capping the 3′-OH with a small chemically reversible moiety so that they are still recognized as substrates by DNA polymerase. DNA templates consisting of homopolymer regions were accurately sequenced by this approach. A recently developed SBS system based on a similar design of the cleavable fluorescent NRTs has already found wide application in genome biology (21–23). In addition, we have used 3′-O-modified NRTs to solve the homopolymer sequencing problem in conventional pyrosequencing (24).

We report here an alternative sequencing method that is a hybrid between the Sanger dideoxy chain-terminating reaction and SBS, and discuss the advantages that come with this hybrid approach. The fundamental difference between the two methods is that the Sanger method produces every possible complementary DNA extension fragment for a given DNA template and obtains the sequence after the separation and detection of these fragments, whereas SBS relies on identification of each base as the DNA strand is extended by cleavable fluorescent NRTs that temporarily pause the DNA synthesis for sequence determination. The limiting factor for increasing throughput in the Sanger method is the requirement to use electrophoresis.

Challenges in using SBS with cleavable fluorescent NRTs involve the further improvement of the DNA polymerase that efficiently recognizes the modified nucleotides. In addition, the first generation of the cleavable fluorescent NRTs synthesize the DNA strand with a propargyl amino group modification on the base during SBS (20), which might interfere with the activity of the polymerase for chain elongation. The advantage of the Sanger method is the dideoxy chain fragment-producing reaction. Once the DNA strand is terminated by incorporation of a fluorescent dideoxynucleotide, it is no longer involved in further DNA extension reactions. Therefore, the continuous DNA polymerase extension reaction occurs with only natural nucleotides at high efficiency leading to a read length of >700 bp. The most attractive feature in the SBS approach is the massive parallel readout capability by using a high-density DNA chip without the need to separate the DNA products. We have explored the integration of the advantageous features of the two methods to develop a hybrid DNA sequencing approach. In this method, four nucleotides, modified as reversible terminators by capping the 3′-OH with a small reversible moiety so that they are still recognized as substrates by DNA polymerase, are combined with four cleavable fluorescent dideoxynucleotides to perform SBS. DNA sequences are determined by the unique fluorescence emission of each fluorophore on the DNA products terminated by ddNTPs. On removing the 3′-OH capping group from the DNA products generated by incorporating the 3′-O-modified dNTPs and the fluorophore from the DNA products terminated with the ddNTPs, the polymerase reaction reinitiates to continue the sequence determination [supporting information (SI) Fig. S1].

Author contributions: J.G., D.H.K., N.J.T., and J.J. designed research; J.G., N.X., Z.L., S.Z., J.W., D.H.K., Q.M., X.L., S.S., L.Y., and J.J. performed research; Z.L., S.Z., M.S.M., Q.M., H.C., X.L., L.Y., and S.K. contributed new reagents/analytic tools; J.G., N.X., Z.L., S.Z., J.W., D.H.K., Q.M., S.S., J.J.R., and J.J. analyzed data; and J.G., D.H.K., J.J.R., N.J.T., and J.J. wrote the paper.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

§To whom correspondence should be addressed. E-mail: [email protected] or ju@c2b2.columbia.edu.

This article contains supporting information online at www.pnas.org/cgi/content/full/0804023105/DCSupplemental.

© 2008 by The National Academy of Sciences of the USA

www.pnas.org/cgi/doi/10.1073/pnas.0804023105 | PNAS | July 8, 2008 | vol. 105 | no. 27 | 9145–9150

APPLIED BIOLOGICAL SCIENCES | CHEMISTRY

By using an azidomethyl group as a chemically reversible capping moiety in the 3′-O-modified dNTPs, and an azido-based cleavable linker to attach the fluorophores to ddNTPs, we synthesized four 3′-O-azidomethyl-dNTPs (3′-O-N3-dNTPs) and four ddNTP-azidolinker-fluorophores (ddNTP-N3-fluorophores) for the hybrid SBS. The azidomethyl capping moiety on the 3′-OH group and the cleavable fluorophore on the DNA extension products are efficiently removed after fluorescence detection for sequence determination by using a chemical method that is compatible with DNA. Various DNA templates, including those with homopolymer regions, were accurately sequenced with a read length of >30 bases by using this hybrid SBS method.

Results and Discussion

Design and Synthesis of 3′-O-Modified NRTs and Cleavable Fluorescent Dideoxynucleotide Terminators for the Hybrid SBS. A critical requirement for using SBS methods to unambiguously sequence DNA is a suitable chemical moiety to cap the 3′-OH of the nucleotide such that it temporarily terminates the polymerase reaction to allow the identification of the incorporated nucleotide. A stepwise addition of separate nucleotides with a free 3′-OH group has inherent difficulties in detecting sequences in homopolymeric regions (12, 13). Capping the 3′-OH group of the nucleotides with a reversible moiety allows for the addition of all four nucleotides simultaneously in performing SBS, thereby increasing accuracy and reducing the number of cycles needed. However, it is essential that the capping group be efficiently removed from the DNA extension products to regenerate the 3′-OH group for continuous polymerase reactions. Our previous research efforts have firmly established the molecular-level strategy to rationally modify the nucleotides by capping the 3′-OH with a small chemically reversible moiety for SBS (16–20). Building on our successful 3′-O-modification strategy for the synthesis of the NRTs, we have explored alternative chemically reversible groups for capping the 3′-OH of the nucleotides. In 1991, Zavgorodny et al. (25) reported the capping of the 3′-OH group of the nucleoside with an azidomethyl moiety, which can be chemically cleaved under mild conditions with triphenylphosphine. Various 3′-O-azidomethyl nucleoside analogues have subsequently been synthesized (26). Cleavable fluorescent NRTs with a variety of 3′-O-modification groups, including the azidomethyl moiety, have been recently proposed for SBS (27), following the general principle that we reported (16, 18, 20) to design the NRTs that can be recognized as substrates for DNA polymerase by attaching a cleavable fluorophore to a specific location on the base and capping the 3′-OH with a small chemically reversible moiety.

We synthesized and evaluated four 3′-O-azidomethyl-modified NRTs (3′-O-N3-dNTPs) (Fig. 1) for the hybrid SBS. The 3′-O-modified NRTs containing an azidomethyl group to cap the 3′-OH on the ribose ring were synthesized based on a method similar to that reported by Zavgorodny et al. (25, 26) as described in SI Text. The 3′-O-azidomethyl group on the DNA extension product generated by incorporating each of the NRTs is efficiently removed by the Staudinger reaction by using aqueous Tris(2-carboxyethyl)phosphine (TCEP) solution (28, 29) followed by hydrolysis to yield a free 3′-OH group for elongating the DNA chain in subsequent cycles of the hybrid SBS (Fig. S2A).

To demonstrate the feasibility of carrying out the hybrid SBS on a DNA chip, we designed and synthesized four cleavable fluorescent dideoxynucleotides, ddNTP-N3-fluorophores (ddCTP-N3-Bodipy-FL-510, ddUTP-N3-R6G, ddATP-N3-ROX, and ddGTP-N3-Cy5) (Fig. 2). The ddNTP-N3-fluorophores will be combined with the four NRTs (Fig. 1) to perform the hybrid SBS. Modified DNA polymerases have been shown to be highly tolerant to nucleotide modifications with bulky groups at the 5 position of pyrimidines (C and U) and the 7 position of purines (A and G) (30). Thus, we attached each unique fluorophore to the 5 position of C/U and the 7 position of A/G through a cleavable linker, which is also based on an azido-modified moiety (29) as a trigger for cleavage, a mechanism that is similar to the removal of the 3′-O-azidomethyl group (Fig. S2B). The synthesis and characterization of the cleavable fluorescent dideoxynucleotides in Fig. 2 are described in SI Text. The ddNTP-N3-fluorophores are found to efficiently incorporate into the growing DNA strand to terminate DNA synthesis for sequence determination. The fluorophore on a DNA extension product, which is generated by incorporation of the cleavable fluorescent ddNTPs, is removed rapidly and quantitatively by TCEP from the DNA extension product in aqueous solution.

Fig. 1. Structures of the nucleotide reversible terminators, 3′-O-N3-dATP, 3′-O-N3-dCTP, 3′-O-N3-dGTP, and 3′-O-N3-dTTP.

Continuous Polymerase Extension by Using 3′-O-Modified NRTs and Characterization by MALDI-TOF Mass Spectrometry. To verify that the 3′-O-N3-dNTPs incorporate accurately in a base-specific manner in the polymerase reaction, four continuous DNA extension and cleavage reactions were carried out in solution by using 3′-O-N3-dNTPs as substrates. This allowed the isolation of the DNA product at each step for detailed molecular structure characterization as shown in Fig. 3. The first extension product 5′-primer-C-N3-3′ (1) was desalted and analyzed by using MALDI-TOF MS (Fig. 3A). This product was then incubated in aqueous TCEP solution to remove the azidomethyl moiety to yield the cleavage product (2) with a free 3′-OH group, which was also analyzed by using MALDI-TOF MS (Fig. 3B). As can be seen from Fig. 3A, the MALDI-TOF MS spectrum consists of a distinct peak corresponding to the DNA extension product 5′-primer-C-N3-3′ (1) (m/z 8,310), which confirms that the NRT is incorporated base-specifically by DNA polymerase into a growing DNA strand. Fig. 3B shows the cleavage result on the DNA extension product. The extended DNA mass peak at m/z 8,310 completely disappeared, whereas the peak corresponding to the cleavage product 5′-primer-C-3′ (2) appears as the sole dominant peak at m/z 8,255, which establishes that TCEP incubation completely cleaves the 3′-O-azidomethyl group with high efficiency. The next extension reaction was carried out by using this cleaved product, which now has a free 3′-OH group, as a primer to yield a second extension product, 5′-primer-CG-N3-3′ (3) (m/z 8,639; Fig. 3C). As described above, the extension product (3) was cleaved to generate product (4) for further MS analysis yielding a single peak at m/z 8,584 (Fig. 3D). The third extension reaction to yield 5′-primer-CGA-N3-3′ (5) (m/z 8,952; Fig. 3E), the fourth extension to yield 5′-primer-CGAT-N3-3′ (7) (m/z 9,256; Fig. 3G) and their cleavage to yield products (6) (m/z 8,897; Fig. 3F) and (8) (m/z 9,201; Fig. 3H) were similarly carried out and analyzed by MALDI-TOF MS. These results demonstrate that all four 3′-O-N3-dNTPs are successfully synthesized and efficiently incorporated base-specifically into the growing DNA strand in a continuous polymerase reaction as reversible terminators and the 3′-OH capping group on the DNA extension products is quantitatively cleaved by TCEP.
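The mass arithmetic behind this argument can be checked directly from the reported peaks. The following is a minimal sketch, not part of the original analysis: the m/z values are those quoted above, while the residue masses and the azidomethyl shift are standard average-mass values supplied here as assumptions, and the function names are illustrative only.

```python
# m/z values quoted in the text for products 1-8 (Fig. 3A-H).
EXTENDED = [8310, 8639, 8952, 9256]  # 3'-O-azidomethyl-capped products 1, 3, 5, 7
CLEAVED = [8255, 8584, 8897, 9201]   # free 3'-OH products 2, 4, 6, 8
ADDED_BASES = ["C", "G", "A", "T"]   # base added in each extension step

# Assumption: standard average masses (Da) of nucleotide residues in a DNA chain.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumption: replacing 3'-O-CH2N3 by 3'-OH removes CH2N3 (56.05 Da) and adds H (1.01 Da).
AZIDOMETHYL_SHIFT = 56.05 - 1.01


def cleavage_shifts():
    """Mass lost when TCEP removes the azidomethyl cap, one value per step."""
    return [capped - free for capped, free in zip(EXTENDED, CLEAVED)]


def extension_steps():
    """Mass gained between consecutive cleaved products, paired with the base added."""
    return [
        (ADDED_BASES[i], CLEAVED[i] - CLEAVED[i - 1])
        for i in range(1, len(CLEAVED))
    ]


if __name__ == "__main__":
    print("cleavage shifts:", cleavage_shifts())
    for base, gain in extension_steps():
        print(f"+{base}: observed {gain} Da, expected about {RESIDUE_MASS[base]} Da")
```

Under these assumptions, every cleavage step loses the same mass, consistent with removal of one azidomethyl group, and each gain between cleaved products matches the residue mass of the nucleotide named in the text to within 1 Da.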

Polymerase Extension by Using Cleavable Fluorescent Dideoxynucleotide Terminators and Characterization by MALDI-TOF Mass Spectrometry. To verify that the four cleavable fluorescent ddNTPs (ddCTP-N3-Bodipy-FL-510, ddUTP-N3-R6G, ddATP-N3-ROX, and ddGTP-N3-Cy5) (Fig. 2) are incorporated accurately in a base-specific manner in a polymerase reaction, single-base extension reactions with four different self-priming DNA templates whose next complementary base was either A, C, G, or T were carried out in solution. After the reaction, the four different primer extension products were analyzed by MALDI-TOF MS as shown in Fig. 4. Single clear mass peaks at 9,180, 8,915, 9,317, and 9,082 (m/z) corresponding to each primer extension product with no leftover starting materials were produced by using ddNTP-N3-fluorophores (Fig. 4 A, C, E, and G). Brief incubation of the DNA extension products in an aqueous TCEP solution led to the cleavage of the linker tethering the fluorophore to the dideoxynucleotide. Fig. 4 B, D, F, and H shows the cleavage results for the DNA products extended with ddNTP-N3-fluorophores. The mass peaks at 9,180, 8,915, 9,317, and 9,082 (m/z) have completely disappeared, whereas single peaks corresponding to the cleavage products appear at 8,417, 8,394, 8,433, and 8,395 (m/z), respectively. These results demonstrate that cleavable fluorescent ddNTPs are successfully synthesized and efficiently terminated the DNA synthesis in a polymerase reaction and that the fluorophores are quantitatively cleaved by TCEP. Thus, these ddNTP analogues meet the key requirements necessary for performing the hybrid SBS in combination with the NRTs.
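As a small illustration of how these peak pairs can be read, the sketch below tabulates the mass lost on TCEP cleavage for each terminator. The peak values and their assignment to each ddNTP follow the text and the Fig. 4 legend; the dictionary and function names are hypothetical, and the differences are simple subtraction rather than values reported by the authors.

```python
# (m/z before cleavage, m/z after cleavage) for each cleavable fluorescent ddNTP,
# as assigned in the Fig. 4 legend.
FIG4_PEAKS = {
    "ddATP-N3-ROX": (9180, 8417),
    "ddCTP-N3-Bodipy-FL-510": (8915, 8394),
    "ddGTP-N3-Cy5": (9317, 8433),
    "ddUTP-N3-R6G": (9082, 8395),
}


def mass_lost_on_cleavage(peaks=FIG4_PEAKS):
    """Return the mass removed by TCEP cleavage for each terminator (Da)."""
    return {name: before - after for name, (before, after) in peaks.items()}


if __name__ == "__main__":
    for name, loss in mass_lost_on_cleavage().items():
        print(f"{name}: {loss} Da removed with the fluorophore")
```

The four losses differ because each linker carries a different dye, so each shift is specific to its fluorophore.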

Fig. 3. The polymerase extension scheme (Left) and MALDI-TOF MS spectra of the four consecutive extension products and their cleavage products (Right) using four 3′-O-N3-dNTPs. Primer extended with 3′-O-N3-dCTP to yield product 1 (A), and its cleavage product 2 (B); product 2 extended with 3′-O-N3-dGTP to yield product 3 (C), and its cleavage product 4 (D); product 4 extended with 3′-O-N3-dATP to yield product 5 (E), and its cleavage product 6 (F); product 6 extended with 3′-O-N3-dTTP to yield product 7 (G), and its cleavage product 8 (H). The azidomethyl moiety capping the 3′-OH of the DNA extension products is completely removed by aqueous TCEP solution to continue the polymerase reaction.

Fig. 2. Structures of the cleavable fluorescent ddNTPs: ddCTP-N3-Bodipy-FL-510 (λabs (max) = 502 nm; λem (max) = 510 nm), ddUTP-N3-R6G (λabs (max) = 525 nm; λem (max) = 550 nm), ddATP-N3-ROX (λabs (max) = 585 nm; λem (max) = 602 nm), and ddGTP-N3-Cy5 (λabs (max) = 649 nm; λem (max) = 670 nm).

Four-Color DNA Sequencing on a Chip by Using Cleavable Fluorescent Dideoxynucleotides and 3′-O-Modified NRTs by the Hybrid SBS Approach. In our four-color hybrid SBS approach, the identity of the incorporated nucleotide is determined by the unique fluorescence emission from the four fluorescent dideoxynucleotides, whereas the role of the 3′-O-modified NRTs is to further extend the DNA strand. Therefore, the ratio of the ddNTP-N3-fluorophores and 3′-O-N3-dNTPs during the polymerase reaction determines how much of the ddNTP-N3-fluorophores incorporate and, thus, the corresponding fluorescence emission strength. With a finite amount of immobilized DNA template on a solid surface, initially the majority of the priming strands should be extended with 3′-O-N3-dNTPs, whereas a relatively smaller amount should be extended with ddNTP-N3-fluorophores to produce sufficient fluorescent signals that are above the fluorescence detection system's sensitivity threshold for sequence determination. As the sequencing cycle continues, the amount of the ddNTP-N3-fluorophores needs to be gradually increased to maintain the fluorescence emission strength for detection. Following these guidelines, we performed the hybrid SBS on a chip-immobilized DNA template by using the 3′-O-N3-dNTP/ddNTP-N3-fluorophore combination and the results are shown in Fig. 5. The general four-color sequencing reaction scheme on a DNA chip is shown in Fig. 5A.
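The reasoning about the ratio can be made concrete with a toy depletion model. The sketch below is an idealized illustration, not the authors' procedure: it assumes every extendable strand incorporates exactly one nucleotide per cycle, that a chosen fraction of those strands takes up a fluorescent ddNTP, and that cleavage is complete. It works with the fraction of strands terminated per cycle, not with nucleotide concentrations, because the relative incorporation efficiency of the polymerase is not given in the text. All function and parameter names are hypothetical.

```python
def fixed_fraction_signals(dd_fraction, cycles):
    """Per-cycle signal when the same fraction of strands is terminated each cycle.

    Signal is expressed as a fraction of the starting strands.
    """
    extendable = 1.0
    signals = []
    for _ in range(cycles):
        terminated = extendable * dd_fraction  # strands that take a fluorescent ddNTP
        signals.append(terminated)
        extendable -= terminated               # the rest carry a 3'-O-N3 cap and continue
    return signals


def constant_signal_schedule(signal, cycles):
    """Fraction of remaining strands that must be terminated each cycle
    so that every cycle yields the same signal."""
    if not 0 < signal * cycles <= 1:
        raise ValueError("signal * cycles must lie in (0, 1]")
    fractions = []
    extendable = 1.0
    for _ in range(cycles):
        fractions.append(signal / extendable)
        extendable -= signal
    return fractions


if __name__ == "__main__":
    decay = fixed_fraction_signals(0.05, 32)
    print(f"fixed 5%: first signal {decay[0]:.4f}, last signal {decay[-1]:.4f}")
    schedule = constant_signal_schedule(0.02, 32)
    print(f"constant 2% signal: first fraction {schedule[0]:.4f}, last {schedule[-1]:.4f}")
```

In this model a fixed terminating fraction gives a steadily decaying signal, whereas holding the signal constant requires the terminating fraction to rise every cycle, which is the qualitative behavior the text describes.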

The de novo sequencing reaction on the chip was initiated by extending the self-priming DNA by using a solution consisting of four 3′-O-N3-dNTPs, four ddNTP-N3-fluorophores, and 9°N DNA polymerase. The hybrid SBS allows for the addition of all eight nucleotide substrates simultaneously to unambiguously determine DNA sequences. This reduces the number of steps needed to complete the sequencing cycle, while increasing the sequencing accuracy because of competition among the substrates in the polymerase reaction. The DNA products extended by ddNTP-N3-fluorophores, after fluorescence detection for sequence determination and cleavage, are no longer involved in the subsequent polymerase reaction cycles because they are permanently terminated. Therefore, further polymerase reaction only occurs on a DNA strand that incorporates the 3′-O-N3-dNTPs, which subsequently revert to natural nucleotides on cleavage of the 3′-OH capping group, and should have no deleterious effect on the polymerase binding to incorporate subsequent nucleotides for growing the DNA chains. In contrast, successive addition of the previously designed cleavable fluorescent NRTs (20, 27, 29) into a growing DNA strand during SBS leads to a newly synthesized DNA chain with a leftover propargyl amino group at each nucleobase. This may interfere with the ability of the enzyme to efficiently incorporate the next incoming nucleotide, which will lead to loss of synchrony and thereby reduction in the read length. This challenge might potentially be overcome by reengineering DNA polymerases that efficiently recognize and accept the modified DNA strand, or by alternative design of the fluorescent NRTs (31).

To negate any lagging fluorescence signal that is caused by a previously unextended priming strand, a synchronization step was added to reduce the amount of unextended priming strands after the initial extension reaction shown in the scheme of Fig. 5A. A synchronization reaction mixture consisting of just the four 3′-O-N3-dNTPs in relatively high concentration was used along with the 9°N DNA polymerase to extend any remaining priming strands that retain a free 3′-OH group to synchronize the incorporation.

The four-color images from a fluorescence scanner for eachstep of the hybrid SBS on a chip is shown in Fig. 5B. The firstextension of the primer by the complementary fluorescentddNTP, ddCTP-N3-Bodipy-FL-510, was confirmed by observinga blue signal (the emission from Bodipy-FL-510) [Fig. 5B (1)].After fluorescent signal detection, the surface was immersed ina TCEP solution to cleave both the fluorophore from the DNAproduct extended with ddNTP-N3-fluorophores and the 3�-O-azidomethyl group from the DNA product extended with 3�-O-N3-dNTPs. The surface of the chip was then washed, and anegligible residual f luorescent signal was detected, confirmingcleavage of the fluorophore [Fig. 5B (2)]. This was followed byanother extension reaction with the 3�-O-N3-dNTP/ddNTP-N3-fluorophore solution to incorporate the next nucleotide com-plementary to the subsequent base on the template. The entireprocess of incorporation, synchronization, detection, and cleav-age was performed multiple times to identify 32 successive basesin the DNA template. The plot of the fluorescence intensity vs.the progress of sequencing extension (raw four-color sequencingdata) is shown in Fig. 5C. The DNA sequences including thehomopolymer regions are unambiguously identified with no

Fig. 4. A polymerase reaction scheme (Top) to yield DNA extension products by incorporating each of the four ddNTP-N3-fluorophores and the subsequentcleavage reaction to remove the fluorophores from the DNA extension products. MALDI-TOF MS spectra (Bottom) showing efficient base-specific incorporationof the ddNTP-N3-fluorophores and the subsequent cleavage of the fluorophores from the DNA extension products: (A) Primer extended with ddATP-N3-ROX (1)(peak at 9,180 m/z), (B) its cleavage product 2 (8,417 m/z); (C) primer extended with ddCTP-N3-Bodipy-FL-510 (3) (peak at 8,915 m/z), (D) its cleavage product 4(8,394 m/z); (E) primer extended with ddGTP-N3-Cy5 (5) (peak at 9,317 m/z), (F) its cleavage product 6 (8,433 m/z); (G) primer extended with ddUTP-N3-R6G (7)(peak at 9,082 m/z), and (H) its cleavage product 8 (8,395 m/z).

9148 � www.pnas.org�cgi�doi�10.1073�pnas.0804023105 Guo et al.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 4,

202

0

Page 5: Four-color DNA sequencing with 3 O-modified nucleotide ... · parallel readout capability by using a high-density DNA chip without the need to separate the DNA products. We have explored

errors from the four-color raw fluorescence data without any processing. Similar four-color sequencing data were obtained for a variety of DNA templates (Fig. S3).
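Reading a sequence from unprocessed four-color data amounts to choosing, for each cycle, the dye channel with the strongest emission. The Python sketch below illustrates that idea. The dye-to-base assignment follows the four ddNTP-N3-fluorophores named in the text, but the function `call_bases` and the intensity values are hypothetical and are not data from this study.

```python
# Dye-to-base assignment taken from the terminators named in the text.
DYE_TO_BASE = {
    "Bodipy-FL-510": "C",
    "R6G": "T",  # carried by ddUTP-N3-R6G, reported as T in the extended strand
    "ROX": "A",
    "Cy5": "G",
}


def call_bases(cycles):
    """Call one base per cycle from the brightest of the four dye channels."""
    calls = []
    for intensities in cycles:
        brightest = max(intensities, key=intensities.get)
        calls.append(DYE_TO_BASE[brightest])
    return "".join(calls)


# Hypothetical intensities (arbitrary units), one dict per sequencing cycle.
example_cycles = [
    {"Bodipy-FL-510": 950, "R6G": 40, "ROX": 35, "Cy5": 20},
    {"Bodipy-FL-510": 30, "R6G": 45, "ROX": 900, "Cy5": 25},
    {"Bodipy-FL-510": 25, "R6G": 880, "ROX": 50, "Cy5": 30},
    {"Bodipy-FL-510": 20, "R6G": 35, "ROX": 40, "Cy5": 910},
]

if __name__ == "__main__":
    print(call_bases(example_cycles))
```

The called bases are the nucleotides incorporated into the growing strand, so they are complementary to the template. Because every strand is terminated after each incorporation, a homopolymer run appears as repeated calls of the same base over consecutive cycles.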

Conclusion
We have synthesized four 3′-O-N3-dNTPs along with four cleavable fluorescent ddNTPs and used them to produce four-color de novo DNA sequencing data on a chip by the hybrid SBS approach, which has the following advantages. First, with the 3′-O-N3-dNTPs, after cleavage of the 3′-OH capping group of the DNA extension product, there are no traces of modification left on the growing DNA strand. Therefore, there will be no adverse effect on the DNA polymerase for the incorporation of the next complementary nucleotide. Second, the cleavable fluorescent ddNTPs and 3′-O-N3-dNTPs are permanent and reversible terminators, respectively, which allow the interrogation of each base in a serial manner, a key strategy enabling accurate determination of homopolymeric regions of DNA. In addition, because all of the steps (nucleotide incorporation, fluorescence detection for sequence determination, and cleavage of the fluorophore and the 3′-O-azidomethyl group) are performed on a DNA chip, there is no longer a need for electrophoretic DNA fragment separation as in the classical Sanger sequencing method.

We have experimentally determined the ratio of the 3′-O-N3-dNTPs and ddNTP-N3-fluorophores to yield a sequencing read length of 32 bases. The signal strength at base 32 is as strong as that of the first base (Fig. 5C), indicating it should be possible to increase the read length of the hybrid SBS further by optimizing the extension conditions to reduce the background fluorescence in the later sequencing cycles. The ultimate read length of this hybrid SBS system depends on three factors: the number of starting DNA molecules on each spot of a DNA chip, the reaction efficiency, and the detection sensitivity of the system. The read length with the Sanger sequencing method commonly reaches ~700 bp. The hybrid SBS approach described here may have the potential to reach this read length, especially with improvements in the sensitivity of the fluorescent detection system, where single molecules can be reliably detected.
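One way to see why the proportion of the two nucleotide sets has to change as cycles proceed is a deliberately idealized bookkeeping model. It is an assumption of this sketch, not the authors' procedure: every strand is extended in every cycle, a strand that takes up a ddNTP-N3-fluorophore gives signal once and is then permanently terminated, and all other strands continue. Under those assumptions, keeping the signal constant requires a growing fraction of the remaining strands to be terminated in each cycle. The function name and the 2% per-cycle share are hypothetical.

```python
from fractions import Fraction


def terminated_fraction(cycle, signal_share):
    """Fraction of still-extendable strands that must take up a ddNTP in `cycle`
    (1-based) so that every cycle labels the same share of the starting strands.

    Idealized: complete extension and cleavage, no other loss of strands.
    """
    # Share of the starting strands not yet permanently terminated.
    remaining = 1 - (cycle - 1) * signal_share
    if remaining <= 0 or signal_share > remaining:
        raise ValueError("not enough extendable strands left for this cycle")
    return signal_share / remaining


if __name__ == "__main__":
    share = Fraction(1, 50)  # hypothetical: 2% of starting strands labeled per cycle
    for cycle in (1, 16, 32):
        print(cycle, float(terminated_fraction(cycle, share)))
```

In this toy model the required fraction rises from 1/50 in cycle 1 to 1/19 in cycle 32. It also shows how the number of starting molecules per spot and the detection sensitivity bound the read length: a more sensitive detector permits a smaller share per cycle, which leaves more strands available for later cycles.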

With sequencing read length from 14 to 30 bases in the next-generation DNA sequencing systems, massive parallel digital

Fig. 5. Four-color DNA sequencing by the hybrid SBS. (A) A hybrid SBS scheme for four-color sequencing on a chip by using four 3′-O-N3-dNTPs and four ddNTP-N3-fluorophores. (B) The four-color fluorescence images for each step of the SBS: (1) incorporation of 3′-O-N3-dCTP and ddCTP-N3-Bodipy-FL-510; (2) cleavage of N3-Bodipy-FL-510 and 3′-CH2N3 group; (3) incorporation of 3′-O-N3-dATP and ddATP-N3-ROX; (4) cleavage of N3-ROX and 3′-CH2N3 group; (5) incorporation of 3′-O-N3-dTTP and ddUTP-N3-R6G; (6) cleavage of N3-R6G and 3′-CH2N3 group; (7) incorporation of 3′-O-N3-dGTP and ddGTP-N3-Cy5; (8) cleavage of N3-Cy5 and 3′-CH2N3 group; images 9–63 are similarly produced. (C) A plot (four-color sequencing data) of raw fluorescence emission intensity obtained by using 3′-O-N3-dNTPs and ddNTP-N3-fluorophores. The small groups of peaks between the identified bases are fluorescent background from the DNA chip.

Guo et al. PNAS | July 8, 2008 | vol. 105 | no. 27 | 9149


gene expression analogous to a high-throughput SAGE (32) approach has been reported reaching single copy transcript sensitivity (33), and ChIP-Seq (21–23) based on sequencing tags of ~25 bases has led to many new discoveries in genome function and regulation. It is well established that millions of different PCR templates can be generated on a solid surface through emulsion PCR or clonal amplification (23, 34). Thus, future implementation of the hybrid SBS approach on a high-density bead array platform will provide a high-throughput and accurate DNA sequencing system with wide applications in genome biology and biomedical research.

Materials and Methods
Synthesis of 3′-O-N3-dNTPs and ddNTP-N3-Fluorophores. The synthesis and characterization of the 3′-O-modified NRTs (3′-O-N3-dCTP, 3′-O-N3-dTTP, 3′-O-N3-dATP, and 3′-O-N3-dGTP) and the cleavable fluorescent ddNTPs (ddCTP-N3-Bodipy-FL-510, ddUTP-N3-R6G, ddATP-N3-ROX, and ddGTP-N3-Cy5) are described in SI Text.

Continuous Polymerase Extension by Using 3′-O-Modified NRTs in Solution and Characterization by MALDI-TOF MS. We characterized the above four 3′-O-modified NRTs by performing four continuous DNA-extension reactions sequentially by using a self-priming DNA template. The experimental procedures are described in SI Text.

Polymerase Extension by Using Cleavable Fluorescent Dideoxynucleotide Terminators in Solution and Characterization by MALDI-TOF MS. We characterized the four cleavable fluorescent ddNTPs (ddCTP-N3-Bodipy-FL-510, ddUTP-N3-R6G, ddATP-N3-ROX, and ddGTP-N3-Cy5) by performing four separate DNA-extension reactions, each with a different self-priming DNA template allowing the four ddNTP analogues to be incorporated. The experimental procedures are described in SI Text.

Four-Color DNA Sequencing on a Chip by Using Cleavable Fluorescent Dideoxynucleotides and 3′-O-Modified NRTs by the Hybrid SBS Approach. The DNA template sequence and the immobilization procedure to construct the DNA chip are described in SI Text. Ten microliters of a solution consisting of ddCTP-N3-Bodipy-FL-510 (10 fmol), ddUTP-N3-R6G (20 fmol), ddATP-N3-ROX (40 fmol), ddGTP-N3-Cy5 (20 fmol), 3′-O-N3-dCTP (22 pmol), 3′-O-N3-dTTP (22 pmol), 3′-O-N3-dATP (22 pmol), 3′-O-N3-dGTP (4 pmol), 1 unit of 9°N DNA polymerase (exo-) A485L/Y409V, 20 nmol of MnCl2, and 1× Thermopol II reaction buffer was spotted on the DNA chip. The nucleotide complementary to the DNA template was allowed to incorporate into the primer at 65°C for 15 min. To synchronize any unextended templates, an extension solution consisting of 38 pmol each of 3′-O-N3-dTTP, 3′-O-N3-dATP, 3′-O-N3-dGTP, and 75 pmol of 3′-O-N3-dCTP, 1 unit of 9°N DNA polymerase (exo-) A485L/Y409V, 20 nmol of MnCl2, and 1× Thermopol II reaction buffer was added to the same spot and incubated at 65°C for 15 min. After washing with SPSC buffer containing 0.1% Tween-20 for 1 min, the chip was rinsed with dH2O, and then scanned with a four-color fluorescence scanner as described in ref. 20. To perform the cleavage, the DNA chip was placed inside a chamber filled with 100 mM TCEP (pH 9.0) and incubated at 65°C for 10 min. After washing the surface with dH2O, the chip was scanned again to measure the background fluorescence signal. This process was followed by the next polymerase extension reaction using the 3′-O-N3-dNTP/ddNTP-N3-fluorophore solution with the subsequent synchronization, washing, fluorescence detection, and cleavage processes performed as described above. The 3′-O-N3-dNTP/ddNTP-N3-fluorophore ratio was adjusted to obtain relatively even fluorescence signals (SI Text and Table S1). The above reaction cycles were repeated multiple times to obtain de novo DNA sequencing data on a DNA template immobilized on a chip.
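The amounts listed for the first 10-µL extension mix fix the starting molar ratio of reversible terminator to fluorescent dideoxynucleotide for each base. The Python sketch below works through that arithmetic using only the quantities stated above. The names `EXTENSION_MIX_FMOL`, `molar_ratio`, and `concentration_nM` are illustrative, and the ratios apply to this first mix only, since the text notes that the ratio was adjusted in later cycles.

```python
# Amounts in the first 10-µL extension mix, in femtomoles (1 pmol = 1000 fmol).
EXTENSION_MIX_FMOL = {
    "C": {"ddNTP_dye": 10, "reversible_dNTP": 22_000},  # ddCTP-N3-Bodipy-FL-510
    "T": {"ddNTP_dye": 20, "reversible_dNTP": 22_000},  # ddUTP-N3-R6G with 3'-O-N3-dTTP
    "A": {"ddNTP_dye": 40, "reversible_dNTP": 22_000},  # ddATP-N3-ROX
    "G": {"ddNTP_dye": 20, "reversible_dNTP": 4_000},   # ddGTP-N3-Cy5
}
VOLUME_UL = 10


def molar_ratio(mix):
    """Moles of 3'-O-N3-dNTP per mole of ddNTP-N3-fluorophore, per base."""
    return {
        base: amounts["reversible_dNTP"] / amounts["ddNTP_dye"]
        for base, amounts in mix.items()
    }


def concentration_nM(fmol, volume_ul):
    """Convert an amount to concentration: 1 fmol per µL equals 1 nM."""
    return fmol / volume_ul


if __name__ == "__main__":
    for base, ratio in molar_ratio(EXTENSION_MIX_FMOL).items():
        dye_nM = concentration_nM(EXTENSION_MIX_FMOL[base]["ddNTP_dye"], VOLUME_UL)
        print(f"{base}: {ratio:.0f}:1 dNTP:ddNTP, ddNTP at {dye_nM:g} nM")
```

For this mix the ratios are 2,200:1 (C), 1,100:1 (T), 550:1 (A), and 200:1 (G), so only a small fraction of strands is expected to carry a fluorophore in the first cycle.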

ACKNOWLEDGMENTS. This work was supported by National Institutes of Health Grants P50 HG002806, R01 HG003582, and R21 HG004404 and by the Packard Fellowship for Science and Engineering.

1. Lander ES, et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921.

2. Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463–5467.

3. Smith LM, et al. (1986) Fluorescence detection in automated DNA sequence analysis. Nature 321:674–679.

4. Prober JM, et al. (1987) A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238:336–341.

5. Ju J, Ruan C, Fuller CW, Glazer AN, Mathies RA (1995) Fluorescence energy transfer dye-labeled primers for DNA sequencing and analysis. Proc Natl Acad Sci USA 92:4347–4351.

6. Blazej RG, Kumaresan P, Mathies RA (2006) Microfabricated bioprocessor for integrated nanoliter-scale Sanger DNA sequencing. Proc Natl Acad Sci USA 103:7240–7245.

7. Drmanac S, et al. (1998) Accurate sequencing by hybridization for DNA diagnostics and individual genomics. Nat Biotechnol 16:54–58.

8. Fu DJ, et al. (1998) Sequencing exons 5 to 8 of the p53 gene by MALDI-TOF mass spectrometry. Nat Biotechnol 16:381–384.

9. Edwards JR, Itagaki Y, Ju J (2001) DNA sequencing using biotinylated dideoxynucleotides and mass spectrometry. Nucleic Acids Res 29:E104.

10. Kasianowicz JJ, Brandin E, Branton D, Deamer DW (1996) Characterization of individual polynucleotide molecules using a membrane channel. Proc Natl Acad Sci USA 93:13770–13773.

11. Shendure J, et al. (2005) Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309:1728–1732.

12. Ronaghi M, Uhlen M, Nyren P (1998) A sequencing method based on real-time pyrophosphate. Science 281:363–365.

13. Harris TD, et al. (2008) Single-molecule DNA sequencing of a viral genome. Science 320:106–109.

14. Greenleaf WJ, Block SM (2006) Single-molecule, motion-based DNA sequencing using RNA polymerase. Science 313:801.

15. Mitra RD, Shendure J, Olejnik J, Olejnik EK, Church GM (2003) Fluorescent in situ sequencing on polymerase colonies. Anal Biochem 320:55–65.

16. Ju J, Li Z, Edwards J, Itagaki Y (2003) Massive parallel method for decoding DNA and RNA. US Patent 6,664,079.

17. Li Z, et al. (2003) A photocleavable fluorescent nucleotide for DNA sequencing and analysis. Proc Natl Acad Sci USA 100:414–419.

18. Ruparel H, et al. (2005) Design and synthesis of a 3′-O-allyl photocleavable fluorescent nucleotide as a reversible terminator for DNA sequencing by synthesis. Proc Natl Acad Sci USA 102:5932–5937.

19. Seo TS, et al. (2005) Four-color DNA sequencing by synthesis on a chip using photocleavable fluorescent nucleotides. Proc Natl Acad Sci USA 102:5926–5931.

20. Ju J, et al. (2006) Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators. Proc Natl Acad Sci USA 103:19635–19640.

21. Mikkelsen TS, et al. (2007) Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448:553–560.

22. Johnson DS, Mortazavi A, Myers RM, Wold B (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316:1497–1502.

23. Barski A, et al. (2007) High-resolution profiling of histone methylations in the human genome. Cell 129:823–837.

24. Wu J, et al. (2007) 3′-O-modified nucleotides as reversible terminators for pyrosequencing. Proc Natl Acad Sci USA 104:16462–16467.

25. Zavgorodny S, et al. (1991) 1-Alkylthioalkylation of nucleoside hydroxyl functions and its synthetic applications: A new versatile method in nucleoside chemistry. Tetrahedron Lett 32:7593–7596.

26. Zavgorodny S, Pechenov AE, Shvets VI, Miroshnikov AI (2000) S,X-acetals in nucleoside chemistry. III. Synthesis of 2′- and 3′-O-azidomethyl derivatives of ribonucleosides. Nucleosides, Nucleotides Nucleic Acids 19:1977–1991.

27. Barnes C, Balasubramanian S, Liu X, Swerdlow H, Milton J (2006) Labelled nucleotides. US Patent 7,057,026.

28. Saxon E, Bertozzi CR (2000) Cell surface engineering by a modified Staudinger reaction. Science 287:2007–2010.

29. Milton J, Ruediger S, Liu X (2006) Nucleosides/nucleotides conjugated to labels via cleavable linkages and their use in nucleic acid sequencing. US Patent Appl US20060160081A1.

30. Rosenblum BB, et al. (1997) New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids Res 25:4500–4504.

31. Wu W, et al. (2007) Termination of DNA synthesis by N6-alkylated, not 3′-O-alkylated, photocleavable 2′-deoxyadenosine triphosphates. Nucleic Acids Res 35:6339–6349.

32. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW (1995) Serial analysis of gene expression. Science 270:484–487.

33. Kim JB, et al. (2007) Polony multiplex analysis of gene expression (PMAGE) in mouse hypertrophic cardiomyopathy. Science 316:1481–1484.

34. Margulies M, et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376–380.
