

An Evolutionary/Biochemical Connection between Promoter- and Primer-Dependent Polymerases Revealed by Systematic Evolution of Ligands by Exponential Enrichment

Katherine J. Fenstermacher,a,b Vasudevan Achuthan,a,b* Thomas D. Schneider,c Jeffrey J. DeStefanoa,b

a Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
b Maryland Pathogen Research Institute, College Park, Maryland, USA
c National Institutes of Health, National Cancer Institute, Center for Cancer Research, RNA Biology Laboratory, Frederick, Maryland, USA

ABSTRACT DNA polymerases (DNAPs) recognize 3′ recessed termini on duplex DNA and carry out nucleotide catalysis. Unlike promoter-specific RNA polymerases (RNAPs), DNAPs require no sequence specificity for binding or initiation of catalysis. Despite this, previous results indicate that viral reverse transcriptases bind much more tightly to DNA primers that mimic the polypurine tract. In the current report, primer sequences that bind with high affinity to Taq and Klenow polymerases were identified using a modified systematic evolution of ligands by exponential enrichment (SELEX) approach. Two Taq-specific primers that bound ~10 (Taq1) and over 100 (Taq2) times more stably than controls to Taq were identified. Taq1 contained 8 nucleotides (5′-CACTAAAG-3′) that matched the phage T3 RNAP “core” promoter. Both primers dramatically outcompeted primers with similar binding thermodynamics in PCRs. Similarly, exonuclease-minus (exonuclease−) Klenow polymerase selected a high-affinity primer that contained a related core promoter sequence from phage T7 RNAP (5′-ACTATAG-3′). For both Taq and Klenow, even small modifications to the sequence resulted in large losses in binding affinity, suggesting that binding was highly sequence specific. The results are discussed in the context of possible effects on multiprimer (multiplex) PCR assays, molecular information theory, and the evolution of RNAPs and DNAPs.

IMPORTANCE This work further demonstrates that primer-dependent DNA polymerases can have strong sequence biases leading to dramatically tighter binding to specific sequences. These biases may be related to biological function or be a consequence of the structural architecture of the enzyme. New sequence specificities for Taq and Klenow polymerases were uncovered, and among them were sequences that contained the core promoter elements from T3 and T7 phage RNA polymerase promoters. This suggests the intriguing possibility that phage RNA polymerases exploited intrinsic binding affinities of ancestral DNA polymerases to develop their promoters. Conversely, DNA polymerases could have evolved from related RNA polymerases and retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology.

KEYWORDS aptamer, DNA polymerase, molecular evolution, multiplex PCR, PCR primer bias, RNA polymerases, SELEX, T3 RNA polymerase, T7 RNA polymerase

RESEARCH ARTICLE

Journal of Bacteriology, April 2018, Volume 200, Issue 7, e00579-17, jb.asm.org

Received 27 September 2017 Accepted 11 January 2018

Accepted manuscript posted online 16 January 2018

Citation Fenstermacher KJ, Achuthan V, Schneider TD, Destefano JJ. 2018. An evolutionary/biochemical connection between promoter- and primer-dependent polymerases revealed by systematic evolution of ligands by exponential enrichment. J Bacteriol 200:e00579-17. https://doi.org/10.1128/JB.00579-17.

Editor Victor J. DiRita, Michigan State University

Copyright © 2018 American Society for Microbiology. All Rights Reserved.

Address correspondence to Jeffrey J. DeStefano, [email protected].

* Present address: Vasudevan Achuthan, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts, USA.

Most DNA polymerases (DNAPs) recognize 3′ recessed termini on double-stranded nucleic acid and use this feature as the priming point for nucleotide catalysis. In contrast to promoter-dependent RNA polymerases (RNAPs), sequence-specific information from the duplex region is thought to play only a small role, if any, in DNAP binding and catalysis. Despite this, retroviral reverse transcriptases (RTs) bind much more tightly to purine-rich DNA-DNA duplexes with primers resembling their polypurine tract (PPT) RNA sequences (5′-AAAAGAAAAGGGGGG-3′ for HIV-1) (1, 2). The PPT is resistant to RNase H degradation, which allows its use as a primer for second-strand DNA synthesis (3–6). It is also a more efficient primer than random RNA sequences, suggesting a unique interaction with RT (7–9). The finding that PPT-like sequences also induce high-affinity binding to RT helps explain the PPT’s preferential usage for second-strand priming and demonstrates that DNAPs can have strong sequence binding preferences.

The sequence preferences for RTs were uncovered using a modified systematic evolution of ligands by exponential enrichment (SELEX) protocol that allows the selection of tight-binding primer-template sequences (referred to as primer-template SELEX [PT-SELEX] in this paper) (1, 2). PT-SELEX starts with a pool of duplex nucleic acids that have a 4-base 5′ overhang creating a 3′ recessed end within a region of random sequence nucleotides. Polymerases bind preferentially to the sequences that support the tightest binding, and these sequences are enriched in subsequent rounds of PT-SELEX.
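The geometry of a PT-SELEX pool member can be sketched in a few lines of Python. This is only an illustration: it assumes the 20-nucleotide fixed region and the 21-nucleotide random primer region listed for the random starting material in Table 1, and the names `random_construct` and `complement` are hypothetical helpers, not part of the published protocol.

```python
import random

# Fixed region of the primer strand, as listed for the starting material in Table 1.
FIXED_PRIMER = "gcctgcaggtcgactctaga"
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq):
    """Base-by-base complement in the same orientation (no reversal)."""
    return seq.translate(COMPLEMENT)

def random_construct(rng, n_random=21, overhang=4):
    """Return (primer written 5'->3', template written 3'->5') for one pool member."""
    rand_region = "".join(rng.choice("ACGT") for _ in range(n_random))
    primer = FIXED_PRIMER + rand_region
    # The template pairs with the whole primer and then continues for
    # `overhang` unpaired bases, which leaves a 3' recessed primer end.
    tail = "".join(rng.choice("ACGT") for _ in range(overhang))
    template = complement(primer) + tail
    return primer, template

rng = random.Random(0)  # seeded only so the illustration is repeatable
primer, template = random_construct(rng)
print("5'-" + primer + "-3'")
print("3'-" + template + "-5'")
```

Writing the template 3′ to 5′ directly beneath the primer mirrors the two-line layout used for the constructs in the tables, so paired bases line up column by column.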

Given that viral RTs have a biologically relevant sequence binding preference, it is possible that other primer-template-utilizing polymerases also have preferences for specific sequences. Though diverse, DNAPs can be grouped by sequence and structural homology into seven families (A, B, C, D, X, Y, and RT), all of which share several analogous regions that are necessary for their function, including the catalytic palm domain, the fingers domain, the thumb domain (which helps position the nucleic acid template), and a two-metal-ion binding site in the catalytic cleft (10, 11).

The 832-amino-acid DNA polymerase from the bacterium Thermus aquaticus, known as Taq polymerase (Taq pol), is perhaps the most commercially important polymerase. Taq polymerase, which is classified in family A, is thermostable at temperatures that would denature many other proteins: its optimal temperature range is 75 to 80°C, and it can withstand temperatures near boiling (at 97.5°C, the enzyme has a half-life [t1/2] of 9 min) (12). This has made it ideal for PCR-based applications, which repeatedly cycle through >90°C temperatures (13).

Given its commercial importance, Taq polymerase is an interesting target for investigation of sequence preferences. In this report, we show that Taq has strong sequence binding preferences and shares a preference for sequences that resemble phage RNAP promoters. Primers for PCR mixtures containing Taq high-affinity binding sequences had a dramatic competitive advantage over other primers, resulting in a strong bias for the production of PCR products specified by these primers. Further, we found that another commercially important family A polymerase, the Klenow fragment of Escherichia coli DNAP I, also has a binding preference for specific sequences, and these sequences also resemble phage RNA polymerase promoter sequences. These results suggest possible evolutionary and structural relationships between promoter- and primer-dependent polymerases, which are discussed in this paper. Further, the results demonstrate the strong competitive advantage of selected sequences in PCRs and indicate that primer bias not only can occur due to thermodynamic nucleic acid binding advantages of some primers (14–16) but also may result from specific sequences binding Taq with higher affinity. These findings have possible ramifications for quantitative PCRs and PCR-based protocols.

RESULTS

Taq polymerase selected a high-affinity sequence containing a conserved region of the T3 RNA polymerase promoter and a second unrelated sequence. PT-SELEX experiments with Taq were carried out for 7 rounds using the approach shown in Fig. 1 (see also references 1 and 2). Material from rounds 6 and 7 bound Taq nearly equivalently in gel shift analysis, indicating that selection was essentially complete. Twelve sequences were isolated from the round 7 pool, and two distinct sequence groups were identified among them (Fig. 2). In group 1, five sequences were identical except for a single nucleotide substitution in one isolate (Fig. 2B). Interestingly, 4 sequences (Fig. 2B, sequences 1, 2, 4, and 5) contained a region corresponding to bases −7 to +1 of the phage T3 RNAP promoter region (5′-CACTAAAG-3′, underlined in the consensus sequence at the bottom of Fig. 2B), while the fifth had a one-nucleotide change in that region (Fig. 2B, sequence 3, 5′-CCCTAAAG-3′). This region of the T3 promoter, which constitutes a core sequence that is highly conserved between T3, T7, and SP6 RNAPs (T3Pcore) (17), was contained fully within the duplex DNA (4 bp upstream from the 3′ primer terminus, which is denoted by the darker-gray shade in Fig. 2). Based on these 5 sequences, a representative consensus sequence (identical to 4 of the 5 sequences in the set) was produced for further testing and is referred to as Taq1 (Table 1). The second group of three sequences identified from the 12 recovered sequences shared an identical 8-base sequence (5′-AACGTGCC-3′) and were closely related in other regions (Fig. 2C). Two of the sequences were identical except for a single nucleotide change. Unlike the first set of sequences, the 8-base sequence in this second set had no similarity to any known biologically relevant motifs. A sequence that was identical to the recovered sequence 3 in Fig. 2C was produced for further testing and is referred to as Taq2 (Table 1). An alignment of Taq1 and other selected sequences with the T3 and T7 RNAP promoters is shown in Fig. 3. Note the strong homology of the T3 RNAP promoter with Taq1 in Fig. 3A. In addition to the identical 8-base stretch noted above, homology to a downstream “AGA” sequence of the T3 and T7 promoters was also apparent. Taq2 showed no significant homology to either RNAP promoter.
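The motif observations in this section can be reproduced with a simple sliding-window search. The sketch below is illustrative only: the sequences are the random-region primer bases of Taq1 and Taq2 from Table 1, the 8-base T7 core is taken from the “T3Pcore to T7Pcore” construct in Table 2, and `hamming` and `best_match` are hypothetical helper names. It is not the alignment method used for Fig. 2 or Fig. 3.

```python
T3_CORE = "CACTAAAG"   # T3 promoter bases -7 to +1, as given in the text
T7_CORE = "CACTATAG"   # 8-base T7 counterpart, from the Table 2 construct
TAQ2_CORE = "AACGTGCC" # 8-base sequence shared by the group 2 isolates

TAQ1 = "CCCAGTCCACACTAAAGCATA"  # random-region primer bases of Taq1 (Table 1)
TAQ2 = "CCCCAATTTGCGAACGTGCCT"  # random-region primer bases of Taq2 (Table 1)

def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def best_match(seq, motif):
    """Return (mismatches, start index, bases between motif and 3' end)
    for the first window of `seq` with the fewest mismatches to `motif`."""
    best = None
    for start in range(len(seq) - len(motif) + 1):
        d = hamming(seq[start:start + len(motif)], motif)
        if best is None or d < best[0]:
            best = (d, start, len(seq) - (start + len(motif)))
    return best

for name, seq in [("Taq1", TAQ1), ("Taq2", TAQ2)]:
    for label, motif in [("T3Pcore", T3_CORE), ("Taq2 core", TAQ2_CORE)]:
        print(name, label, best_match(seq, motif))
```

For Taq1 the exact T3Pcore match ends 4 bases from the 3′ primer terminus, in agreement with the position stated in the text. The T3 and T7 cores differ at a single position, as does the sequence 3 variant 5′-CCCTAAAG-3′.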

Determination of important sequences for Taq binding in the selected material. Both Taq1 and Taq2 bound much more tightly than the starting material (control) in gel shift assays (Fig. 4). Taq2 appeared to bind modestly better than Taq1 in this environment. As these experiments used relatively large amounts of nucleic acid starting material (2 nM) and the conditions were not optimal for Taq binding, the gels were used only to compare binding to different constructs (filter binding assays for some constructs were also performed [see below]). Apparent equilibrium dissociation constants (referred to as Kd,app,gelshift; see Materials and Methods) were determined for comparisons and are shown in Table 1. Taq1 and Taq2 had Kd,app,gelshift values of 36 ± 6 and 6.3 ± 0.1 nM, respectively, while the control (starting material in the PT-SELEX) bound too weakly to estimate a value.
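To give a feel for what these apparent constants imply across the titration, the sketch below evaluates a textbook 1:1 binding model that accounts for depletion of free enzyme by the 2 nM primer-template. This is an assumption made for illustration: the authors' fitting procedure is described in Materials and Methods and may differ, and the function name `fraction_bound` is hypothetical. The Taq concentrations are those listed for the gels in Fig. 4.

```python
import math

def fraction_bound(enzyme_nM, kd_nM, dna_nM=2.0):
    """Fraction of primer-template in complex for simple 1:1 binding.

    Solves the quadratic for the complex concentration so that enzyme
    bound to DNA is not counted as free (relevant when DNA is ~Kd).
    """
    if enzyme_nM <= 0:
        return 0.0
    s = enzyme_nM + dna_nM + kd_nM
    complex_nM = (s - math.sqrt(s * s - 4.0 * enzyme_nM * dna_nM)) / 2.0
    return complex_nM / dna_nM

TAQ_NM = [0, 6.25, 12.5, 25, 50, 100, 200]        # titration listed in Fig. 4
KD_APP = {"Taq1": 36.0, "Taq2": 6.3}              # Kd,app,gelshift from Table 1

for name, kd in KD_APP.items():
    row = ", ".join(f"{fraction_bound(e, kd):.2f}" for e in TAQ_NM)
    print(f"{name}: {row}")
```

Under this model the Taq2 curve lies above the Taq1 curve at every nonzero enzyme concentration, which is the qualitative pattern the gel shift comparison relies on.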

FIG 1 Primer-template SELEX (PT-SELEX) protocol for selecting primer-template sequences that bind Taq and Exo−Kl with high affinity. The preparation of the top construct is described in Materials and Methods. Refer to Materials and Methods for a detailed description of the selection process for Taq and Exo−Kl polymerases.

FIG 2 Alignment of recovered round 7 Taq-selected sequences. Nucleotides that match the consensus, located beneath each alignment, are represented by a dot. Sequences are from the random region of the primer strand, and the four bases corresponding to the single-stranded template overhang are shaded darker gray (Fig. 1). Note that those sequences are the complements of the bases that were present in the template overhang (Table 1). Of the 12 sequences isolated (A), two general motifs were observed. Group 1 (B) contained a region that matched the initiation domain (underlined) of the T3 RNAP promoter. Group 2 (C) did not contain a recognizable biologically relevant sequence. Sequences that were tested further are represented by sequence 1 in panel B (Taq1) and sequence 3 in panel C (Taq2). Sequences were aligned using MacVector. Codes: M is A or C; R is A or G; W is A or T; Y is C or T; S is C or G; K is G or T.

TABLE 1 Selected SELEX sequences from Taq and Exo−Kl

Name(a)                   Primer-template sequence(b)                           Kd,app,gelshift (nM)(c)
Random starting material  5′-gcctgcaggtcgactctagaNNNNNNNNNNNNNNNNNNNNN-3′       >200 (Taq)
                          3′-cggacgtccagctgagatctNNNNNNNNNNNNNNNNNNNNNNNNN-5′   >500 (Exo−Kl)
Taq1                      5′-gcctgcaggtcgactctagaCCCAGTCCACACTAAAGCATA-3′       36 ± 6
                          3′-cggacgtccagctgagatctGGGTCAGGTGTGATTTCGTATCTGT-5′
Taq2                      5′-gcctgcaggtcgactctagaCCCCAATTTGCGAACGTGCCT-3′       6.3 ± 0.1
                          3′-cggacgtccagctgagatctGGGTCAGGTGTGATTTCGTATCAGC-5′
Exo−Kl-1                  5′-gcctgcaggtcgactctagaCAACCATCGAAGACTA-3′            125 ± 25
                          3′-cggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTCCGT-5′

(a) Random starting material was the starting material for the SELEX process. Selected sequences with Exo−Kl contained a BbsI restriction site (underlined) derived from the random region of the starting material that produced a 9-base 5′ overhang leading to the Exo−Kl-1 sequence shown (see Results).
(b) Lowercase letters are bases derived from the fixed region of the starting material, while bases derived from the random region are capitalized. “N” signifies that the base was random in the starting material (either A, T, C, or G), while “N” is the complement of that base in the template strand.
(c) Affinities measured in gel shift assays were measured with 2 nM primer-template starting material and under the conditions described in Materials and Methods. The derived Kd values were useful for comparisons between these sequences and various modifications presented in the text but were not determined under conditions that would necessarily yield accurate Kd values. Therefore, they are referred to as apparent Kd values that are specific for the gel shift experiment (Kd,app,gelshift). Results were averages from at least 3 experiments ± SD.

FIG 3 Clustal alignment of the selected sequences from the Taq SELEX (Taq1 and Taq2) and the exonuclease− Klenow SELEX (Exo−Kl-1) with T3 (A) and T7 (B) RNAP promoter sequences. T7 and T3 RNAP sequences are extended 23-nucleotide versions of the promoter regions illustrated in Fig. 7. The binding and initiation regions from that figure are highlighted above the illustrations in yellow and gray, respectively. The primer strands of the 25-nucleotide regions of Taq1, Taq2, and Exo−Kl-1, derived from the random nucleotides in the starting material, were used in the alignments. Dashes are gaps in the alignments, while dots indicate the same nucleotide as that in the reference sequence at that position.

FIG 4 Gel shift assay of Taq1 and Taq2 constructs, as well as random starting material, with Taq pol. The constructs are shown in Table 1. Taq1 and Taq2 were selected from round 7 of SELEX (Fig. 1). In each gel, the concentrations of Taq pol were 0, 6.25, 12.5, 25, 50, 100, and 200 nM. The apparent affinities for the constructs using gel analysis (referred to as Kd,app,gelshift) were 36 ± 6 nM and 6.3 ± 0.1 nM for Taq1 and Taq2, respectively (averages from 3 experiments ± SD). The starting material bound too weakly to determine a Kd value.

Mutational analysis was performed on the Taq1 and Taq2 sequences to determine which regions of the sequences were important for high-affinity binding to Taq (Tables 2 and 3 show data obtained for Taq1 and Taq2, respectively). In general, the results showed that both the absolute sequence and the location of the conserved regions of Taq1 and Taq2 were necessary for tight binding. To test the role of the T3Pcore sequence in binding of Taq to Taq1, a construct was created with substitutions at primer nucleotides flanking the T3Pcore (“Modified sequences surrounding the T3Pcore” in Table 2), as well as a second construct with a completely modified T3Pcore sequence (“Modified T3Pcore” in Table 2). In gel shift analysis, both constructs bound Taq less tightly than Taq1 did, but the modified T3Pcore construct bound much less tightly and was essentially equivalent to the control, while the construct with an intact T3Pcore but modified surrounding sequences bound only modestly less strongly. This indicated that the T3Pcore was an important factor in the enzyme’s affinity for the construct but that other sequences surrounding this region enhanced binding to a smaller extent. This was not surprising, as the conserved region of Taq1 extended beyond the eight-nucleotide T3Pcore (Fig. 2). By mutating 4-bp segments of the T3Pcore, it was found that the bases matching positions −3 to +1 of the T3Pcore (5′-AAAG-3′ in the primer strand) were the largest contributors to tight binding (“4-bp modification B” in Table 2); mutation of these bases disrupted binding, while mutating the −7 to −4 bases (5′-CACT-3′ in the primer strand, “4-bp modification A” in Table 2) did not significantly affect binding. Intriguingly, small changes to the T3Pcore were sometimes more deleterious to binding than large ones: a single change of the −2 T3Pcore base (to mimic the phage T7 promoter core sequence, “T3Pcore to T7Pcore” in Table 2) completely disrupted strong binding. Also, shifting the position of the core sequence by even a single base relative to the 3′ primer terminus completely abolished tight binding in the gel shift assay (“+1 shift” in Table 2). Interestingly, Taq polymerase bound a construct that contained the entire T3 RNAP promoter (“Full T3 RNAP promoter sequence” in Table 2) as poorly as it did the random control, suggesting that a construct with the entire promoter sequence is in a conformation that disrupts high-affinity Taq binding, despite the presence of the T3Pcore.
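The relationship between the Taq1 variants and the promoter numbering used in the text can be checked mechanically. The following sketch compares three of the Table 2 primer sequences (random-region bases only) with Taq1 and reports each substitution as a promoter position, where numbering runs −7 to −1 and then +1 with no zero. The helper names are hypothetical and the code is an illustration, not part of the authors' analysis.

```python
TAQ1 = "CCCAGTCCACACTAAAGCATA"  # Taq1 random-region primer bases (Table 2)
VARIANTS = {
    "4-bp modification A": "CCCAGTCCAACAGAAAGCATA",
    "4-bp modification B": "CCCAGTCCACACTCCCTCATA",
    "T3Pcore to T7Pcore":  "CCCAGTCCACACTATAGCATA",
}

# Index of the first T3Pcore base in Taq1; this base is promoter position -7.
CORE_START = TAQ1.index("CACTAAAG")

def promoter_position(index):
    """Map a 0-based primer index to promoter numbering (..., -2, -1, +1, ...)."""
    rel = index - CORE_START      # 0 at position -7
    return rel - 7 if rel < 7 else rel - 6

def changed_positions(variant):
    """Promoter positions at which `variant` differs from Taq1."""
    if len(variant) != len(TAQ1):
        raise ValueError("variant must be the same length as Taq1")
    return [promoter_position(i)
            for i, (a, b) in enumerate(zip(TAQ1, variant)) if a != b]

for name, seq in VARIANTS.items():
    print(name, changed_positions(seq))
```

Run on these sequences, modification A maps to positions −7 through −4, modification B to −3 through +1, and the T7-like construct to the single base at −2, matching the description of the constructs in the text.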

In contrast to the results with Taq1, all tested modifications of Taq2 significantly disrupted binding (Table 3). Modification of the 8-bp region found in all three sequences (underlined in the Taq2 sequence in Table 3) reduced binding ~8-fold, as did modification of the bases upstream and downstream of this region, although some binding enhancement relative to the control was observed when only upstream bases were modified (“Modified noncore/nonoverhang nucleotides” in Table 3). As with Taq1, even a one-base shift of the sequence relative to the 3′ primer terminus completely abolished tight binding (“+1 shift” in Table 3).

Filter binding assays show that Taq1 and Taq2 bind extremely tightly to Taq incomparison to other sequences. Since protein-nucleic acid complexes for someproteins may be weakened in gel environments, other parameters of these sequenceswere evaluated using nitrocellulose filter binding assays (Table 4). Consistent with this,even the random starting material bound to Taq polymerase strongly in filter bindingassays with a measured Kd of 14 � 8 pM at 23°C. The Kd of Taq1 and Taq2 was less thana few picomolar under these conditions, making it too small to accurately calculate inthis assay. Binding half-lives were also calculated; at 23°C, Taq1 and 2 bound Taq witha half-life of greater than 8 h, and in fact, almost no dissociation of the sequences fromTaq was detected after 8 h for either sequence. In contrast, the random starting materialhad a binding half-life of �20 min. At 60°C, all the sequences showed more-rapid

TABLE 2 Effects of modifying Taq1 sequence on affinity for Taq polymerase

Taq1 modifications Sequencea

Fold decreasein affinityb

Taq1 (no modification) 5=-gcctgcaggtcgactctagaCCCAGTCCACACTAAAGCATA NA3=-cggacgtccagctgagatctGGGTCAGGTGTGATTTCGTATCTGT

Random 5=-gcctgcaggtcgactctagaNNNNNNNNNNNNNNNNNNNNN �53=-cggacgtccagctgagatctNNNNNNNNNNNNNNNNNNNNNNNNN

Modified T3Pcore 5=-gcctgcaggtcgactctagaCCCAGTCCACCAGCCCTCATA �53=-cggacgtccagctgagatctGGGTCAGGTGGTCGGGAGTATCTGT

Modified sequences surrounding the T3Pcore 5=-gcctgcaggtcgactctagaAAACTGAACAACTAAAGCATA 23=-cggacgtccagctgagatctTTTGACTTGTTGATTTCGTATCTGT

Full T3 RNAP promoter sequence 5=-gcctgcaggtcgactctagaAATTAACCCTCACTAAAGGGAG �53=-cggacgtccagctgagatctTTAATTGGGAGTGATTTCCCTCTGTC

Modification of sequences 5= of the T2Pcore 5=-gcctgcaggtcgactctagaAAACTGAACAACTAAAGCATA 23=-cggacgtccagctgagatctTTTGACTTGTTGATTTCGTATCTGT

4-bp modification A 5=-gcctgcaggtcgactctagaCCCAGTCCAACAGAAAGCATA 13=-cggacgtccagctgagatctGGGTCAGGTTGTCTTTCGTATCTGT

4-bp modification B 5=-gcctgcaggtcgactctagaCCCAGTCCACACTCCCTCATA 33=-cggacgtccagctgagatctGGGTCAGGTGTGAGGGAGTATCTGT

�1 shift 5=-cctgcaggtcgactctagaCCCAGTCCACACTAAAGCATAG �53=-ggacgtccagctgagatctGGGTCAGGTGTGATTTCGTATCTGTC

T3Pcore to T7Pcore 5=-gcctgcaggtcgactctagaCCCAGTCCACACTATAGCATA �53=-cggacgtccagctgagatctGGGTCAGGTGTGATATCGTATCTGT

aLowercase letters are bases derived from the fixed region of the starting material, while bases derived from the random region are capitalized. “N” signifies that thebase was random in the starting material (either A, T, C, or G), while “N” is the complement of that base in the template strand. Underlined regions highlightchanges from the Taq1 sequence, while the region of Taq1 corresponding to the T3 RNAP core is underlined in Taq1.

bAffinities measured in gel shift assays were measured with 2 nM primer-template and under the conditions described in Materials and Methods. The derived Kd

values were useful for comparisons between these sequences but were not determined under conditions that would necessarily yield accurate Kd values. Therefore,they are referred to as apparent Kd values that are specific for the gel shift experiment (Kd,app,gelshift). Values listed are approximate changes relative to the Taq1sequence (measured Kd,app,gelshift, 36 � 6 nM; averages from at least 3 experiments � SD), with higher numbers indicating fold lower affinity. NA, not applicable(reference sequence).

Fenstermacher et al. Journal of Bacteriology

April 2018 Volume 200 Issue 7 e00579-17 jb.asm.org 6

on Septem

ber 28, 2020 by guesthttp://jb.asm

.org/D

ownloaded from

Page 7: An Evolutionary/Biochemical Connection between Promoter ...cleotides (5=-CACTAAAG-3=) that matched the phage T3 RNAP “core” promoter. Both primers dramatically outcompeted primers

dissociation from Taq, consistent with previous results showing that binding of Taq toprimer-templates is less stable at higher temperature (18, 19). The random control inthis case bound with a half-life of �4.4 � 1.2 min, while Taq1 showed about 10-fold-more-stable binding. Taq2 bound about 100 times more stably than the control at thistemperature, consistent with tighter binding than Taq1 in the gel shift assay (Table 1and Fig. 3). Since Taq2 was G-C rich near the 3= end of the primer strand and thereforemore stable at higher temperature, it was possible that this, rather than a sequence-specific intrinsic binding advantage, was responsible for the extremely stable binding

TABLE 3 Effects of modifying Taq2 sequence on affinity for Taq polymerase

Taq2 modification Sequencea

Fold decreasein affinityb

Taq2 5=-gcctgcaggtcgactctagaCCCCAATTTGCGAACGTGCCT NA3=-cggacgtccagctgagatctGGGGTTAAACGCTTGCACGGACAGC

Random 5=-gcctgcaggtcgactctagaNNNNNNNNNNNNNNNNNNNNN �303=-cggacgtccagctgagatctNNNNNNNNNNNNNNNNNNNNNNNNN

Modified 8-bp core 5=-gcctgcaggtcgactctagaCCCCAATTTGCGCCATGTAAT 83=-cggacgtccagctgagatctGGGGTTAAACGCGGTACATTACAGC

Modified noncore nucleotides 5=-gcctgcaggtcgactctagaAAAACCGGGTATAACGTGCCG �303=-cggacgtccagctgagatctTTTTGGCCCATATTGCACGGCACTA

Modified noncore/nonoverhang nucleotides 5=-gcctgcaggtcgactctagaAAAACCGGGTATAACGTGCCT 73=-cggacgtccagctgagatctTTTTGGCCCATATTGCACGGACAGC

�1 shift 5=-cctgcaggtcgactctagaCCCCAATTTGCGAACGTGCCTG �303=-ggacgtccagctgagatctGGGGTTAAACGCTTGCACGGACAGCC

4-bp modification A 5=-gcctgcaggtcgactctagaCCCCAATTTGCGCCATTGCCT �303=-cggacgtccagctgagatctGGGTCAGGTTGTCTTTCGTATCAGC

4-bp modification B 5=-gcctgcaggtcgactctagaCCCCAATTTGCGAACGGTAAT �303=-cggacgtccagctgagatctGGGTCAGGTGTGAGGGAGTATCAGC

aLowercase letters are bases derived from the fixed region of the starting material, while bases derived from the random region are capitalized. “N” signifies that thebase was random in the starting material (either A, T, C, or G), while “N” is the complement of that base in the template strand. Underlined regions highlightchanges from the Taq2 sequence, while the region of Taq2 shared by all 3 sequences recovered from SELEX (Fig. 2) is underlined in Taq2.

bAffinities measured in gel shift assays were measured with 2 nM primer-template and under the conditions described in Materials and Methods. The derived Kd

values were useful for comparisons between these sequences but were not determined under conditions that would necessarily yield accurate Kd values. Therefore,they are referred to as apparent Kd values that are specific for the gel shift experiment (Kd,app,gelshift). Values listed are approximate changes in affinity compared withthe Taq2 sequence (measured Kd,app,gelshift, 6.3 � 0.1 nM; averages from at least 3 experiments � SD), with higher numbers indicating fold lower affinity. NA, notapplicable (reference sequence).

TABLE 4 Binding parameters at 23°C and 60°C for selected Taq sequences

Assay temp and construct namea Kd (pM)b koff (min�1)b t1/2 (min)c

23°CControl (starting material) 14 � 8 0.034 � 0.004 20 � 2Taq1 ND ND �480Taq2 ND ND �480

60°CControl (starting material) ND 0.159 � 0.056 4.4 � 1.2Taq1 ND 0.015 � 0.003 46 � 8Taq2 ND 0.0013 � 0.0002 533 � 71Taq1-core mod ND 0.124 � 0.020 5.6 � 0.8Taq2 (�1 shift) ND 0.198 � 0.157 3.5 � 1.5

aRefer to Tables 1, 2, and 3 for construct information.bkoff and Kd values were calculated using filter binding assays as described in Materials and Methods. Resultswere averages from at least 3 experiments � SD. Kd values for Taq1 and Taq2 at 23°C were too low tomeasure with the assay, while values could not be accurately measured at 60°C due to temperaturefluctuations that occur during application to the filter of the sample and washing of the filters. ND, notdetermined.

ᶜHalf-life values (t1/2) for the binding of Taq to the construct were calculated from koff using the equation t1/2 = 0.693/koff.
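The half-life conversion in footnote c can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis code; the function name and the dictionary are illustrative, and the koff values are the 60°C means copied from Table 4.

```python
import math

def half_life_min(koff_per_min):
    """Half-life (min) for first-order dissociation with rate constant koff (min^-1)."""
    # ln(2) is about 0.693, the constant used in footnote c of Table 4
    return math.log(2) / koff_per_min

# Mean koff values at 60°C, copied from Table 4 (min^-1)
koff_60c = {
    "Control (starting material)": 0.159,
    "Taq1": 0.015,
    "Taq2": 0.0013,
    "Taq1-core mod": 0.124,
}

for name, koff in koff_60c.items():
    print(f"{name}: t1/2 = {half_life_min(koff):.1f} min")
```

The computed values can be compared directly with the t1/2 column of Table 4; small differences are expected because the table reports means of replicate experiments.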

DNA Polymerases Recognize Phage Promoter Elements Journal of Bacteriology

April 2018 Volume 200 Issue 7 e00579-17 jb.asm.org 7


at 60°C. However, the Taq2 “�1 shift” sequence (see above and Table 3), which differs from Taq2 by just a single nucleotide, bound poorly in gel shift experiments (Table 3) and had a half-life at 60°C of only 3.5 ± 1.5 min, essentially identical to the random sequence control half-life (Table 4). This suggests that the relatively G-C-rich nature of Taq2 plays no role in the observed results. Technical issues made it impossible to accurately measure Kd values by filter binding at 60°C, but the half-life measurements indicate that the selected sequences bind much more tightly than random primer-template sequences, even at an elevated temperature.

The selected Taq1 and Taq2 sequences dramatically outcompete other sequences in PCRs. The isolation of high-affinity binding sequences provided an opportunity to test the potential effects of these sequences on PCRs. Results indicated that 20-nucleotide primers mimicking Taq1 (5′-CCAGTCCACACTAAAGCATA-3′) and Taq2 (5′-CCCAATTTGCGAACGTGCCT-3′) dramatically outcompeted other primers with similar thermodynamic binding properties in PCRs (see Fig. S1 and S2 and the accompanying descriptions in the supplemental material). The Taq1 and Taq2 primers were also effective in PCRs over a wide range of binding temperatures (see Fig. S3 in the supplemental material).

Exo−Kl polymerase selected a high-affinity sequence containing a conserved region from the T7 polymerase core sequence. A second commercially important bacterial DNA polymerase was also analyzed for sequence-specific binding preferences. In this case, the exonuclease-minus version of Klenow polymerase was used in order to avoid degradation of the starting material that would occur in the presence of Mg2+ with the exonuclease-containing enzyme. After 7 rounds of PT-SELEX (Fig. 1), the enriched pool bound Exo−Kl several times better than a random pool based on gel shift analysis (Kd,app,gelshift of 90 ± 10 nM versus >500 nM), suggesting that it contained high-affinity sequences. Round 6 material bound with approximately the same affinity as round 7 material, indicating that the selection was essentially complete. The round 7 pool was cloned and sequenced, and 12 isolates were recovered, of which 11 showed significant identity over a large region (Fig. 5). Interestingly, these 11 sequences contained a second BbsI site in an orientation opposite to the engineered site in the primer region (Fig. 1 and 5A). We later found that due to incomplete cleavage by the

FIG 5 Alignment of recovered round 7 Exo−Kl selected sequences. (A) The starting material hybrid and original BbsI cut site (which generates a 41:45 nucleotide primer/template ratio) and the second selected cut site (which yields a 36:45 nucleotide primer/template ratio [as indicated by an asterisk] when combined with the first site) are shown. See Fig. 1 and Results for details. (B) The 11 sequences isolated from round 7 containing a portion of the T7 promoter region aligned using MacVector. Sequences are from the random region of the primer strand, and the nine bases corresponding to the single-stranded template overhang are shaded in darker gray. Note that those sequences are the complement of the bases that were present in the template overhang (Table 1). A consensus sequence is shown at the bottom with the T7 promoter element underlined. The first sequence corresponds to the Exo−Kl-1 sequence that was examined further.

Fenstermacher et al. Journal of Bacteriology

April 2018 Volume 200 Issue 7 e00579-17 jb.asm.org 8


restriction enzyme, this allowed a nick on each strand that resulted in a 9-nucleotide overhang on the 5′ end of the template instead of the intended 4-nucleotide overhang (generating 36-nucleotide primer/45-nucleotide template pairs) (Tables 1 and 5). Subsequent testing revealed that reducing the length of the 5′ overhang by more than 3 nucleotides significantly disrupted binding (“36-nt primer-42-nt template” in Table 5), suggesting a strong selective pressure for the location of this second BbsI site. In fact, Exo−Kl was unable to preferentially gel shift any sequences that were tested containing 4-nucleotide overhangs of the type that the protocol was designed to produce (Fig. 1).

Intriguingly, 7 of the cloned round 7 sequences contained a region identical to a part of the T7 phage RNAP consensus promoter sequence (bases −6 to +1 of the T7 polymerase promoter core sequence [T7Pcore] [17], 5′-ACTATAG-3′), while three others contained only a single nucleotide difference in that region (Fig. 5B). Note that the Taq1 sequence above contained a related region from the T3 promoter (nucleotides −7 to +1). In the phage RNAP promoters, this region is highly conserved between different RNAPs (a single base differentiates the T3 promoter from T7 and SP6); bases −4 to �3 are where a single-stranded bubble forms for the initiation of transcription by phage RNAP (17). The T7-like sequences selected by Exo−Kl contained four of the bases (−6 to −3 of the T7Pcore) in the duplex region and three bases (−2 to +1 of the T7Pcore) in the single-stranded template overhang of the primer-template hybrid. A consensus sequence, constructed using the most common bases from the 11 highly similar

TABLE 5 Effects of modifying Exo−Kl-1 sequence on affinity for Exo−Kl polymerase

Exo−Kl-1 modification          Sequenceᵃ                                            Fold decrease in affinityᵇ

Exo−Kl                         5′-gcctgcaggtcgactctagaCAACCATCGAAGACTA              NA
                               3′-cggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTCCGT

Random substrate               5′-gcctgcaggtcgactctagaNNNNNNNNNNNNNNNN              >4
                               3′-cggacgtccagctgagatctNNNNNNNNNNNNNNNNNNNNNNNNN

36-nt primer-42-nt template    5′-gcctgcaggtcgactctagaCAACCATCGAAGACTA              2.5
                               3′-cggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTG___

36-nt primer-44-nt template    5′-gcctgcaggtcgactctagaCAACCATCGAAGACTA              1
                               3′-cggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTGCC_

−1 shift                       5′-ggcctgcaggtcgactctagaCAACCATCGAAGACT              4
                               3′-ccggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTCCG

+1 shift                       5′-cctgcaggtcgactctagaCAACCATCGAAGACTAT              2
                               3′-ggacgtccagctgagatctGTTGGTAGCTTCTGATATCGTCCGTC

+4 shift                       5′-gcaggtcgactctagaCAACCATCGAAGACTATAGC              3
                               3′-cgtccagctgagatctGTTGGTAGCTTCTGATATCGTCCGTCCGT

+8 shift                       5′-gtcgactctagaCAACCATCGAAGACTATAGCAGGC              >4
                               3′-cagctgagatctGTTGGTAGCTTCTGATATCGTCCGTCCGTCCGT

Modified T7Pcore               5′-gcctgcaggtcgactctagaCAACCATCGAAGCAGC              >4
                               3′-cggacgtccagctgagatctGTTGGTAGCTTCGTCGCGAGTCCGT

Modified template overhang     5′-gcctgcaggtcgactctagaCAACCATCGAAGACTA              2
                               3′-cggacgtccagctgagatctGTTGGTAGCTTCTGATCGATGTAAG

5′ modified non-T7 sequence    5′-gcctgcaggtcgactctagaACCAACGATCCTACTA              2
                               3′-cggacgtccagctgagatctTGGTTGCTAGGATGATATCGTCCGT

ᵃLowercase letters are bases derived from the fixed region of the starting material, while bases derived from the random region are capitalized. “N” signifies that the base was random in the starting material (either A, T, C, or G), while “N” is the complement of that base in the template strand. Underlined regions highlight changes from the Exo−Kl-1 sequence.

ᵇAffinities in gel shift assays were measured with 2 nM primer-template under the conditions described in Materials and Methods. The derived Kd values were useful for comparisons between these sequences but were not determined under conditions that would necessarily yield accurate Kd values. Therefore, they are referred to as apparent Kd values that are specific for the gel shift experiment (Kd,app,gelshift). Values listed are approximate changes relative to the Exo−Kl-1 sequence (measured Kd,app,gelshift, 125 ± 25 nM; averages from at least 3 experiments ± SD), with higher numbers indicating fold lower affinity. NA, not applicable (reference sequence).


selected sequences (Fig. 5B, bottom reference sequence), was used for further testing to determine which bases contribute most to selective binding. An alignment of that sequence (named Exo−Kl-1) with the T7 and T3 RNAP promoters is shown in Fig. 3. Other than the 7 nucleotides from −6 to +1 noted above, there was no significant homology with other regions of the T7 or T3 RNAP promoters.

In gel shift assays, Exo−Kl-1 bound with a Kd,app,gelshift of ~125 nM to Exo−Kl polymerase, comparable to the overall Kd,app,gelshift of the round 7 material (Fig. 6 and Table 1). Notably, in gel shifts conducted in the absence of Mg2+, this sequence also bound Klenow wild-type polymerase, which has no mutation to eliminate the 3′→5′ exonuclease activity. Binding of this sequence was essentially identical for Klenow and Exo−Kl, indicating that the mutation in Exo−Kl played no role in the sequences that were selected in the SELEX protocol (data not shown).

Determination of important sequences for Exo−Kl binding in the selected material. Mutational analysis was performed on the Exo−Kl-1 sequence to determine which bases were most responsible for the specific binding of Exo−Kl (Table 5; see also Fig. S4 in the supplemental material for an example of a gel shift experiment). In addition to constructs with template truncations (described above), which revealed the enzyme’s preference for 9-nucleotide overhangs, two other major types of mutations were made: base substitutions and shifting of the primer-template sequences.

Modifying the 7 bases that made up the T7Pcore disrupted tight binding to Exo−Kl (Table 5, modified T7Pcore); as this sequence was found in most of the recovered sequences, the result is not surprising and implies that these bases contribute significantly to the specific interaction with the enzyme. Modifying the 9-nucleotide single-stranded template overhang sequence (which contains 3 nucleotides of the T7Pcore) (Table 5, modified template overhang) modestly decreased enzyme binding, while modifying the bases 5′ of the T7Pcore had a similar effect (Table 5, 5′ modified non-T7 sequence). Together, these results suggest that the four bases of the T7Pcore in the duplex region are the major contributors to tight binding but are aided by sequences upstream and downstream. This agrees with the observation that the region of conservation among recovered sequences is larger than just the −6 to +1 T7Pcore (Fig. 5B).

To determine if the context of the sequence relative to the primer terminus was important, constructs in which the entire random-region-derived sequences were shifted further in (+1, +4, +8) or out (−1) of the duplex region were created. While all shifts decreased binding, the +1 shift (which contained 5 bp of the T7Pcore in the duplex region and 2 nucleotides in the single-stranded region) was tolerated best; other shifts were more deleterious to binding, demonstrating that, like the selected Taq sequences, sequence context was also integral to tight binding.
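The shifted constructs can be pictured as a fixed-size window (36-nucleotide primer, 45-nucleotide template) sliding along a longer duplex. The sketch below is illustrative only: the function name is hypothetical, and the flanking bases are those read from the shift rows of Table 5, assuming that the first shift row (primer ending in GACT) is the outward shift and the second (primer ending in GACTAT) is the +1 shift.

```python
# Base-by-base complement that preserves the upper/lowercase convention of Table 5
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

# Primer (top) strand written 5'->3', assembled from the rows of Table 5
UPSTREAM = "g"                           # extra fixed-region base seen in the outward-shift row
FIXED = "gcctgcaggtcgactctaga"           # fixed region (lowercase in Table 5)
SELECTED = "CAACCATCGAAGACTATAGCAGGCA"   # 25-nt random-region-derived sequence of Exo-Kl-1
DOWNSTREAM = "GGCAGGCA"                  # complement of the extra template bases in the 8-nt shift row
TOP = UPSTREAM + FIXED + SELECTED + DOWNSTREAM

def shifted_construct(shift, primer_len=36, template_len=45):
    """Return (primer 5'->3', template 3'->5') for a window moved by `shift` nucleotides.

    Positive shifts move more of the selected sequence into the duplex region.
    """
    start = len(UPSTREAM) + shift
    if start < 0 or start + template_len > len(TOP):
        raise ValueError("shift is outside the range covered by Table 5")
    primer = TOP[start:start + primer_len]
    # Written 3'->5', the template is simply the complement of the top strand
    template = TOP[start:start + template_len].translate(COMPLEMENT)
    return primer, template

for shift in (-1, 0, 1, 4, 8):
    primer, template = shifted_construct(shift)
    print(f"{shift:+d}  5'-{primer}\n    3'-{template}")
```

With a shift of 0 the function returns the reference Exo−Kl-1 primer-template pair, and the other shifts can be compared against the corresponding rows of Table 5.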

FIG 6 Gel shift of Exo−Kl-1, as well as random starting material by Exo-minus Klenow polymerase. The Exo−Kl-1 construct was selected from round seven of PT-SELEX and is shown in Table 1. The concentrations of Exo−Kl polymerase were 0, 25, 50, 100, 200, 400, and 800 nM for the starting material and 0, 3.1, 6.3, 12.5, 25, 50, 100, 200, 400, and 800 nM for Exo−Kl-1. The apparent affinity for the Exo−Kl-1 construct using gel analysis (referred to as Kd,app,gelshift) was 125 ± 25 nM (average from 3 experiments ± SD). The starting material bound too weakly to determine an affinity.


DISCUSSION

In previous reports, we demonstrated that reverse transcriptases exhibit primer-template sequence-specific binding preferences for substrates that resemble the PPTs of the viruses (1, 2). The current report demonstrates that both Taq and Klenow polymerases also exhibit primer-template sequence-specific binding preferences. In this case, sequences related to phage RNA polymerase promoters were selected with both enzymes (Fig. 7). Unlike the PPT-like sequences selected by HIV RT, there was no clear biological role for the specific sequences that were selected; however, they did reveal a novel relationship between phage RNAPs and specific bacterial DNA polymerases that could have functional, structural, and evolutionary implications.

The evolutionary origin of single-subunit DNA-dependent RNAPs (ssRNAPs) is uncertain. In addition to phage, members of this class are found in chloroplasts, nucleus, and mitochondria. Given the presumptive bacterial origin of chloroplasts and mitochondria, it is conceivable that these enzymes have a bacterial/phage origin (this and other possibilities are discussed in detail in reference 20). However, primary sequence conservation to multisubunit RNA polymerases and DNA polymerases is low, which has made it difficult to trace the origin of this group. Despite low sequence homology, ssRNAPs share a structural core with other right-hand-shaped polymerases. Using the properties of the conserved amino acids in this core, Monttinen et al. (21) constructed a structure-based distance tree demonstrating that ssRNAPs are closely related to family A DNA polymerases, which include DNAP I and Taq. The striking structural similarities between T7 phage RNAP and the Klenow fragment of DNAP I and related polymerases (22–24), coupled with mutational analysis showing that deoxynucleoside triphosphate (dNTP) and ribonucleoside triphosphate (rNTP) usage can be altered within this group by a small number of amino acid changes (25), led Cermakian et al. (20) to hypothesize that ssRNAPs and DNAP I-like enzymes evolved by divergent evolution from a common ancestor. Our results further bolster these analyses by demonstrating the existence of a common sequence motif that can drive high-affinity binding. It is conceivable that during evolution, phage RNAPs exploited an intrinsic binding affinity of an ancestral polymerase to develop their promoters, while DNA polymerases retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology.

It is notable that the core region of phage RNAP promoters was selected by Taq and Exo−Kl (Fig. 7), as this is the most highly conserved region among phage RNAPs (17). Mutational analysis indicates that the core region (nucleotides −7 to +1) is pivotal for both initiation (maps most strongly to nucleotides −5 to �3) and RNAP binding (maps most strongly to nucleotides −12 to −5), while binding specificity is more strongly dictated by the less conserved upstream promoter sequences (17, 26, 27). The A-T-rich nature of the core region may also be important for allowing melting, which is observed in this region in crystal structures between T7 RNAP and promoter duplexes

FIG 7 T7 and T3 phage promoter sequences denoting regions of homology with Taq and Exo−Kl PT-SELEX selected sequences. The +1 bolded G residue indicates the start of transcription. Boxed regions denote sequences shared by the promoters and the selected tight-binding primer-template sequences for Exo−Kl (T7 promoter) and Taq (Taq1, T3 promoter) polymerases.


(22). Further, A-T-rich nucleic acid duplexes are known to be more flexible (28–31). As polymerases are well known to “bend” the primer-template duplex upon binding, this seems to be a possible role of the Taq1 sequence, given its positioning in the primer-template duplex (Fig. 2). A more trivial explanation is that Taq and Exo−Kl selected sequences with appropriately oriented A-T-rich flexible regions to promote binding, but this is argued against by the specific selection of the T7 and T3 core nucleotides. Since the starting pool had a 25-nucleotide random region, we calculate that only ~1 in every 700 oligonucleotides would have contained a 7-nucleotide region corresponding to the T7 or T3 core region, let alone one that was in the appropriate position in the sequence to induce strong binding. This argues for the core promoter regions having specialized properties that promote binding beyond a high A-T content.
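The order of magnitude of this estimate can be reproduced with a back-of-the-envelope calculation. The sketch below is an assumption about how such a number might be obtained, not the calculation actually used for the ~1 in 700 figure: it counts the 7-nucleotide windows that fit inside a 25-nucleotide random region and treats each as an independent chance of matching, ignoring overlaps and the fixed flanking sequences.

```python
def fraction_with_kmer(random_len, k, n_targets=1):
    """First-order estimate of the fraction of random sequences that contain
    at least one of `n_targets` specific k-mers (overlapping hits are ignored)."""
    windows = random_len - k + 1          # positions where a k-mer fits entirely
    return n_targets * windows / 4 ** k   # each window matches with probability 4^-k

one_core = fraction_with_kmer(25, 7)                    # one specific 7-mer
either_core = fraction_with_kmer(25, 7, n_targets=2)    # either of two 7-mers
print(f"one 7-mer:    about 1 in {1 / one_core:.0f}")
print(f"either 7-mer: about 1 in {1 / either_core:.0f}")
```

Under these assumptions the two estimates bracket the figure quoted in the text, which supports the point that such sequences were rare in the starting pool.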

The phage core sequences in Taq1 and Exo−Kl, as well as the unrelated sequence in Taq2, were dependent on surrounding sequences to promote strong binding, and also highly orientation dependent (Tables 2, 3, and 5). Even a one-nucleotide shift in the position of the 3′ primer terminus dramatically weakened binding. If it is assumed that the 3′ terminus represents an anchor point for a primer-dependent polymerase, then moving the terminus by even a single nucleotide would change sequences that contact various domains of the polymerase. The observation suggests that the elements of the selected sequences that induce strong binding must be positioned precisely with specific polymerase binding domains. In this regard, binding is similar to the recognition of sites by sequence-specific DNA binding proteins such as promoter-dependent polymerase, enhancer, or restriction enzymes.

Sequence logos can be used to analyze the sequence conservation and diversity between a set of related nucleotide sequences (32). A logo comparing 17 T7 RNAP bacteriophage promoter binding sites is shown in Fig. 8 along with a logo comparing the sequences recovered from round 7 of PT-SELEX with Exo−Kl. The level of relative conservation of specific nucleotides between the two alignments is striking. It is notable that the 5′-ACTATAG-3′ sequence (nucleotides 12 to 18 in the Exo−Kl alignment and −6 to 0 in the T7 RNAP alignment) shared between the proteins showed almost identical conservation in bits. That is, in the region marked by bars at the bottom of the figure, the predominant sequence (reading the topmost letter, i.e., the consensus) is ACTATAG for both the Klenow SELEX and natural T7 RNA polymerase promoters.
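For readers unfamiliar with the units, the stack height at each logo position is the conservation in bits: 2 bits (the maximum for four bases) minus the Shannon entropy of the base frequencies at that position. The sketch below illustrates this calculation on a hypothetical toy alignment; it omits the small-sample correction that logo software applies, and the function names are illustrative rather than taken from any published program.

```python
import math
from collections import Counter

def column_information(column):
    """Conservation of one alignment column in bits: 2 - H, with H the Shannon entropy."""
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy

def logo_heights(aligned_sites):
    """Per-position stack heights (bits) for equal-length aligned DNA sequences."""
    return [column_information(col) for col in zip(*aligned_sites)]

# Hypothetical toy alignment (not the sequences of Fig. 8)
toy_sites = ["ACTATAG", "ACTATAG", "ACTAAAG", "GCTATAG"]
print([round(h, 2) for h in logo_heights(toy_sites)])
```

A fully conserved position reaches the 2-bit maximum, whereas a position with all four bases at equal frequency contributes 0 bits.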

To test the resemblance quantitatively, we used the T7-like promoter model developed by Chen and Schneider (33). To build this model, 76 promoters from six T7-like bacteriophages had been combined into a single model, as shown in Fig. 4E of that paper. That model was successful in predicting T7-like promoters and led to the discovery of T7 islands, transposon-like genetic structures that apparently use T7-like promoters (34). Remarkably, the consensus of the 76-site T7-like promoter model from −6 to 0 is also 5′-ACTATAG-3′, the same as the Klenow SELEX consensus. To compare these precisely, the −6 to 0 individual information weight matrix model (35) for the 76-site model was scanned over the Klenow SELEX sequences. Every Klenow SELEX sequence had a positive information content by this evaluation, ranging from 0.2 to 12.6 bits. The second law of thermodynamics implies that positive values of individual information measured in bits imply functional binding sites (35–37), providing evidence in favor of the Klenow SELEX sequences being in the T7-like promoter class.
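The scanning step can be sketched as follows. This is a simplified illustration, not the 76-site model or the software used in the analysis: the training sites are a hypothetical toy set, the pseudocount is an assumption introduced to avoid taking the logarithm of zero, and the small-sample correction of the published method is omitted. The scanned sequence is the 25-nucleotide Exo−Kl-1 primer-strand sequence derived from Table 5.

```python
import math

def weight_matrix(sites, pseudocount=0.5):
    """Per-position weights 2 + log2 f(b, l) for equal-length aligned DNA sites.

    The pseudocount is an illustrative assumption, not part of the cited method.
    """
    matrix = []
    for column in zip(*sites):
        total = len(column) + 4 * pseudocount
        matrix.append({b: 2.0 + math.log2((column.count(b) + pseudocount) / total)
                       for b in "ACGT"})
    return matrix

def individual_information(matrix, seq):
    """Sum of the weights (bits) selected by the bases of `seq`."""
    return sum(weights[base] for weights, base in zip(matrix, seq))

def best_window(matrix, seq):
    """Scan `seq` and return (score in bits, 0-based start) of the best window."""
    width = len(matrix)
    return max((individual_information(matrix, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

# Hypothetical training sites standing in for a promoter model
toy_matrix = weight_matrix(["ACTATAG", "ACTATAG", "ACTATAG", "ACTAAAG"])
score, start = best_window(toy_matrix, "CAACCATCGAAGACTATAGCAGGCA")
print(f"best window starts at index {start}, score {score:.1f} bits")
```

In this toy setting, a sequence matching the consensus scores above zero and an unrelated sequence scores below zero, which mirrors the interpretation of positive and negative individual information given in the text.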

Sequence logos for Taq1 and T3 RNAP are shown in Fig. 9. Quantitatively, the region −7 to 0 of the 76-site individual information model matches the first 5 Taq SELEX sequences at 13 bits, while the other sequences are below zero bits, suggesting that they are a different class of molecules. In this case, the similarities are not as clear. This could be due in part to Taq polymerase selecting two diverse sequence motifs in the SELEX experiments (Fig. 2, groups 1 and 2). However, a trend between the shared sequence of Taq1 (5′-CACTAAAG-3′, nucleotides 9 to 16 for Taq and −7 to 0 for T3 RNAP) and the T3 promoter sequences is still evident, especially for the last 4 nucleotides. The similarities between the logo analyses with Klenow and Taq and phage promoters suggest that the uncommonly strong binding between the DNA polymerase


and these specific sequences may be dictated by the same parameters that specify binding of the phage RNAPs to promoter sequences. The 76-site model closely resembles the common sequences for Taq and Klenow sequences. This result is consistent with the evolutionary connection between these polymerases.

Exactly what properties of the promoter core sequence allow strong binding was not clear. Taq2, which binds even tighter than Taq1, had an A-T-rich region that did not match a phage RNAP core, but it was further upstream of the 3′ recessed terminus, and Taq2 bears little overall similarity to Taq1. This indicates that there are other sequences

FIG 8 Klenow sequences compared to T7 promoters using sequence logos. The aligned Klenow (Exo−Kl) sequences on the top correspond to those in Fig. 5. Below them are sequence logos of selected polymerase binding sites (23). A logo represents sequence conservation measured in bits at each position of the aligned sequences by the height of a stack of letters. Within each stack, the frequency of bases is shown by their relative heights. On the top of each logo stack is an “I” beam symbol that shows the likely variation of the stack height based on the number of sequences (35). To the right of each sequence (top panel) is the individual information for that sequence, in bits (35). The sequence logo for selected Klenow sequences is aligned with that of bacteriophage T7 promoters, and the common regions are marked with boxes. The open-stranded region for the DNA to which Klenow binds is shaded in light green in positions 16 to 18. The orientation of T7 polymerase on DNA has been determined (22, 41), as shown by the sine wave for which the peak represents the major groove facing the protein. Bases that exceed 1 bit are likely to represent non-B-form DNA (42) as confirmed experimentally by crystal structures (43–45). The region matches between Klenow (12 to 18) and the T7 promoter (−6 to 0) include the two possible flipping bases at −3 and −4 of the T7 promoter (circles). The last base of the common region corresponds exactly to the first base of transcription of the T7 promoter (triangle, base 0). The single-stranded region is just to the right of the potentially flipping bases and to the left of the initiation base. This correspondence suggests that the strong binding of Klenow to this sequence may be based on an opened DNA structure.


not related to phage promoters that can promote high-affinity binding, perhaps by a different mechanism. Further complicating the analysis was the inconsistent positioning of the core sequence in Taq1 and Exo−Kl-1. The T7 core region of Exo−Kl-1 was positioned over the 3′ recessed terminus with 4 nucleotides in the duplex region and 3 in the single-stranded 5′ overhang (Fig. 5). In contrast, in Taq1, the T3 core ended 5 nucleotides upstream of the 3′ recessed terminus (Fig. 2). Therefore, the core promoter regions of Taq1 and Exo−Kl-1 are shifted by several nucleotides. Although this might suggest that these sequences have different functions for the different enzymes, it is important to note that Exo−Kl was highly limited in the selection process by the apparent requirement to select a second BbsI site that could generate the longer overhangs required for Exo−Kl to gel shift the substrate (see Results and Fig. 5). In fact, the first 2 nucleotides of the selected T7 core sequence (AC) are part of the selected BbsI restriction site. Unlike the Taq1 sequence, Exo−Kl-1 would position the Klenow active site in nearly the same position relative to the core as the T7 RNAP active site is positioned in the T7-DNA crystal structure (22). It is unclear whether this positioning was forced by the prerequisite selection of the BbsI site or if there was a more complex

FIG 9 Taq sequences compared to T3 promoters using sequence logos. The aligned Taq sequences on the top correspond to those in Fig. 2A. The conservation for the aligned sequences is shown as a logo in the middle. See Fig. 7 legend for details of the alignment and the logos. The logo on the bottom is for bacteriophage T3 promoter sequences (33). A large box shows the apparent alignment between the Taq-selected sequences and the T3 promoters (33). The alignment for T3 is like that for T7. The potentially flipping bases (above the sine wave [42]) are less clear because there are fewer sequences; four common bases are marked with circles. The triangle indicates the first base of the mRNA transcript. The wave on the T3 logo is in a dashed line to indicate that we are unaware of data that assign the binding, though it almost certainly has to be the same as for T7.


interplay for selection of the T7 core sequence and BbsI site. Finally, it is also possible that phage core promoter sequences have special properties that would induce their selection by many polymerases, even non-family A and non-DNAP-like polymerases. Although this was not the case for viral RTs (2), more enzymes would have to be tested to draw any conclusions.

In addition to the biological implications discussed above, this work also demonstrated that the high-affinity Taq primers dramatically outcompeted other primers in PCRs, even when only a single high-affinity primer was included as the forward or reverse primer in a PCR (see Fig. S1 and S2 in the supplemental material). Primer bias in PCRs is known to occur and commonly results from differential hybridization kinetics of primers to target DNA (14–16) but may also be related to DNA polymerase primer specificity (38, 39). This phenomenon can complicate genome sequencing and other quantitative protocols requiring PCRs (e.g., multiplexing, transcriptome sequencing [RNA-seq], DNA sequencing [DNA-seq]), where short nonspecific primers (e.g., random hexamers) are often used in the PCR step. The extreme primer bias demonstrated in the current work was not related to the thermostability of the primers used but correlated with more-stable binding of Taq to the SELEX-selected primers (Table 4) and would therefore be consistent with the DNA polymerase primer specificity noted above. Since PT-SELEX is designed to select only the primers that bind with the highest affinity, it does not reveal information about primers that may bind preferentially but to a lesser extent, and it is not clear how such primers would affect PCRs. There may be a spectrum of different Taq binding affinities for primers, or the selected primers may represent rare sequences that bind with uncommonly high affinity. If the latter were the case, DNA polymerase primer specificity might not be a major issue in most multiplex reactions with multiple primers. In contrast, if a spectrum of different primer affinities exists, this could complicate quantitative analysis in multiplexing and other PCR protocols. In the future, we plan to use PT-SELEX to determine if other thermostable polymerases also show strong sequence bias and if the recovered sequences are related to those found for Taq.

MATERIALS AND METHODS

Materials. A 3′→5′ exonuclease-minus mutant of Klenow polymerase (referred to as Exo−Kl), containing the mutations D355A and E357A, was purchased from New England BioLabs, as were wild-type Klenow, Taq DNA polymerase, BbsI, and T4 polynucleotide kinase (T4 PNK). dNTPs were purchased from Roche Applied Sciences. Radiolabeled [γ-32P]ATP was from PerkinElmer. Sephadex G-25 spin columns were from Harvard Apparatus. Oligonucleotides were from Integrated DNA Technologies. The PCR blunt end cloning kit was from Agilent. The Miniprep DNA preparation kit was from Qiagen. Nitrocellulose filter disks (25 µm, 0.45-µm pore size, Protran BA 85) were from Whatman. The LigaFast rapid ligation kit was from Promega. All other reagents were obtained from Thermo Fisher Scientific, Inc., Sigma-Aldrich Co., or VWR. Graphs were produced and analyzed using SigmaPlot. The sequence logos and aligned sequences in Fig. 8 and 9 were generated using makelogo 9.59, alist 6.63, and alo 1.12, which are available at https://alum.mit.edu/www/toms (32).

5′-32P end labeling of DNA oligonucleotides. Twenty-five picomoles of the 41-nucleotide primer strand oligonucleotide or 500 pmol of primer 1 (see below) was 5′-32P end labeled using T4 PNK. The labeling reaction was performed at 37°C for 30 min, in accordance with the manufacturer’s protocol. Reaction mixtures were shifted to 70°C for 15 min to heat inactivate the PNK. The DNA was then centrifuged on a Sephadex G-25 column to remove any excess radiolabeled nucleotide.

Preparation of DNA-DNA duplexes. Duplexes were prepared by mixing 2 pmol of 41-nucleotide 5′-32P-end-labeled DNA from above and 2 pmol of 45-nucleotide template (see Results for a full list of sequences) in 15 µl of buffer containing 50 mM Tris-HCl (pH 8.0), 80 mM KCl, 1 mM dithiothreitol (DTT), and 0.1 mM EDTA. Reaction mixtures were placed at 80°C for 5 min and then allowed to slowly cool to room temperature prior to use.

Preparation of starting material for PT-SELEX. Approximately 400 pmol each of radiolabeled primer (5′-GCCTGCAGGTCGACTCTAGA-3′ [primer 1]) and unlabeled template [5′-GCATGAATTCCCGAAGACGC(N)25TCTAGAGTCGACCTGCAGGC-3′, where N is any base] were mixed in a 50-µl volume containing 50 mM Tris-HCl (pH 8), 1 mM DTT, and 80 mM KCl. The material was heated to 65°C for 5 min and then slow cooled to room temperature to form hybrids. The hybrid reaction mixture was diluted into a total volume of 100 µl containing 0.5 mM dNTPs, 6 mM MgCl2, and 3 U Klenow polymerase and incubated for 45 min at 37°C. The product was purified by running on a 12% native polyacrylamide gel in Tris-borate EDTA (TBE) (40) and cutting out bands corresponding to the double-stranded hybrid. The hybrid was eluted from the gel overnight in 500 µl of 25 mM Tris-HCl (pH 8) at 4°C and then passed through a 0.45-µm polyethersulfone membrane syringe filter. The material was precipitated at −20°C in 2 volumes of 100% ethanol and 1/10 volume of 3 M sodium acetate (pH 7) (final volume, ~1,500 µl) and then spun

DNA Polymerases Recognize Phage Promoter Elements Journal of Bacteriology

April 2018 Volume 200 Issue 7 e00579-17 jb.asm.org 15

on Septem

ber 28, 2020 by guesthttp://jb.asm

.org/D

ownloaded from

Page 16: An Evolutionary/Biochemical Connection between Promoter ...cleotides (5=-CACTAAAG-3=) that matched the phage T3 RNAP “core” promoter. Both primers dramatically outcompeted primers

for 30 min at �15,000 � g to pellet the DNA. The pellet was washed with 500 �l of 70% ethanol andthen dried in a Savant DNA120 speed vacuum. Collected hybrids were incubated with 20 U of BbsI at37°C for 3 h in a final volume of 100 �l using the buffer supplied by the manufacturer. Digested productswere separated from any uncut material by electrophoresis on a 12% native polyacrylamide gel andrecovered as described above. Each reaction volume typically yielded about 50 to 100 pmol of finalproduct.
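The design of the starting material can be checked with a little sequence arithmetic. The sketch below is illustrative only and is not part of the published protocol: it uses the primer 1 and template sequences given above, plus the known BbsI cleavage geometry (recognition sequence GAAGAC, cutting 2 nucleotides downstream on the strand carrying the site and 6 nucleotides downstream on the opposite strand). The function names are hypothetical. It confirms that primer 1 anneals to the 3′ fixed region of the template, that BbsI cleavage of the filled-in duplex should leave a 41-nucleotide primer strand on a 45-nucleotide template with a 4-nucleotide 5′ overhang, and it compares the number of molecules in 400 pmol with the number of possible 25-nucleotide random sequences.

```python
# Sequences are copied from the text; function names are illustrative.
PRIMER_1 = "GCCTGCAGGTCGACTCTAGA"
TEMPLATE_5P = "GCATGAATTCCCGAAGACGC"  # fixed 5' region, carries the BbsI site
TEMPLATE_3P = "TCTAGAGTCGACCTGCAGGC"  # fixed 3' region, anneals to primer 1
N_RANDOM = 25

BBSI_SITE = "GAAGAC"         # BbsI recognition sequence
TOP_CUT, BOTTOM_CUT = 2, 6   # cut offsets downstream of the site (nt)
AVOGADRO = 6.022e23


def reverse_complement(seq: str) -> str:
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq))


def strand_lengths_after_bbsi():
    """Return (primer_strand_nt, template_strand_nt) left after BbsI cleavage."""
    template = TEMPLATE_5P + "N" * N_RANDOM + TEMPLATE_3P
    site_end = template.index(BBSI_SITE) + len(BBSI_SITE)
    top_cut = site_end + TOP_CUT        # cut on the template strand
    bottom_cut = site_end + BOTTOM_CUT  # cut on the extended primer strand,
                                        # expressed in template coordinates
    return len(template) - bottom_cut, len(template) - top_cut


def library_coverage(pmol: float):
    """Return (molecules present in pmol, number of possible N25 sequences)."""
    return pmol * 1e-12 * AVOGADRO, 4.0 ** N_RANDOM


# Primer 1 is the exact reverse complement of the template's 3' fixed region.
assert reverse_complement(PRIMER_1) == TEMPLATE_3P

print(strand_lengths_after_bbsi())  # -> (41, 45)
molecules, possible = library_coverage(400)
print(f"{molecules:.2e} molecules vs {possible:.2e} possible sequences")
```

The 41- and 45-nucleotide lengths agree with the primer and template strands used in the duplex preparations described above. The coverage comparison suggests that 400 pmol samples only a fraction of the possible 25-nucleotide sequences, and only the first 21 random positions remain base paired to the primer strand after cleavage.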

PT-SELEX with Exo−Kl and Taq polymerase. An overview of the PT-SELEX process is presented in Fig. 1. For the initial selection, ∼200 pmol of cleaved hybrid was incubated with 20 pmol Exo−Kl or Taq polymerase for 1 h at room temperature in 100 μl of 50 mM Tris-HCl (pH 7), 50 mM KCl, 1 mM DTT, 10 mM MgCl2, and 50 μg/ml bovine serum albumin (BSA) (Klenow buffer) or 10 mM Tris-HCl, 50 mM KCl, and 1.5 mM MgCl2 (Taq polymerase buffer). After addition of 6× gel loading buffer, consisting of 30% (vol/vol) glycerol, 0.25% (wt/vol) bromophenol blue, and 0.25% (wt/vol) xylene cyanol FF, samples were run on a 7% native gel to separate bound and unbound materials. Bound material shifted toward the top of the gel and was recovered as described above.

After selection, the hybrids’ recessed ends were filled in using Klenow polymerase in reaction mixtures containing 0.2 U of enzyme, 50 mM Tris-HCl (pH 8), 100 μM dNTPs, 6 mM MgCl2, 80 mM KCl, and 1 mM DTT, for a total volume of 50 μl. After 10 min at 37°C, reaction mixtures were phenol extracted and precipitated with ethanol in the presence of 50 μg of glycogen. Following this, the recovered material was ligated for 20 min to a 5- to 10-fold excess of the hybrid duplex of 5′-ATAGCATGAATTCGCAGAAGACCC-3′ and 5′-GGGTCTTCTGCGAATTCATGC-3′ (no phosphate on 5′ ends), using the LigaFast rapid ligation kit. For the first round, the ligation was performed in a volume of 30 μl; subsequent rounds were performed in 10 μl as per the manufacturer’s protocol. Note that the duplex can potentially ligate to either end of the selected material; however, the vast majority of ligations will be to the end with random nucleotides (Fig. 1), as the other 5′ end is mostly nonphosphorylated. Only those primers phosphorylated during the radiolabeling step would be substrates for ligation, and these represent just a few percent of the total primers used. Even if ligation to this end does occur, the resulting PCR products would be eliminated based on their size in the PCR step below.

The entire ligation mixture from above was then PCR amplified by Taq polymerase in a 400-μl reaction mixture containing 400 pmol each of the primers 5′-GCCTGCAGGTCGACTCTAGA-3′ (32P end labeled) and 5′-GCATGAATTCGCAGAAGACCC-3′, in Taq buffer. The reaction mixtures were divided equally into 4 tubes and PCR amplified for 8 to 14 cycles of 94°C (30 s), 50°C (30 s), and 72°C (30 s), followed by a final 5-min extension at 72°C. The number of amplification cycles was controlled to prevent overamplification, which results in a smeared rather than a discrete product on a nondenaturing gel. Typically, aliquots were removed from the PCR mixtures after 8, 10, 12, or 14 cycles, added to 6× gel loading buffer, and then run on a 12% nondenaturing gel as described above. The correct-size products were excised and recovered as described above. In some rounds, the recovered products were used to make more PCR product (∼0.1 pmol of recovered product was amplified for 8 to 10 cycles) in order to ensure a yield of at least 25 pmol for the next SELEX round. Recovered material was combined, BbsI digested, purified as described above, and then subjected to the next round of SELEX. This selection process was repeated 6 times for both Taq polymerase and Exo−Kl, with subsequent rounds using an enzyme-to-recovered-material ratio (mole/mole) of ∼1/10 (rounds 2 to 4) or 1/20 (rounds 5 to 7), until no increase in binding was detected between rounds. Also, only one-half of the material from the ligation mixture (see above) was used in PCRs after round 1. After round 2, 1/5 of the material was also saved from each round for use in Kd determinations. After reaching round 7, PCR material from each selection pool was inserted into a vector and cloned into bacteria using the Strataclone Blunt Ended PCR kit from Agilent Technologies according to the manufacturer’s instructions. Isolated colonies were grown overnight in 3 ml of 50 μg/ml ampicillin-LB broth. Insert-containing plasmids were purified using a Qiagen Spin Miniprep kit and sequenced.

Determination of the equilibrium dissociation constant by gel shift assay. To determine Kd by gel shift assay, oligonucleotides based on the recovered sequences from PT-SELEX were purchased, 5′ end labeled on the primer strand (36 nucleotides for Exo−Kl and 41 nucleotides for Taq unless otherwise noted), and hybridized to the template strand (45 nucleotides unless otherwise noted) as described above. Hybrids were purified from 12% native gels as described above. The hybrids, at a final concentration of 2 nM, were mixed with various amounts of either Taq polymerase or Exo−Kl in Taq polymerase or Klenow buffer, respectively (see above). The enzyme concentrations were 0, 1.1, 3.1, 6.3, 12.5, 25, and 50 nM for Taq polymerase and 0, 25, 50, 100, 200, 400, and 800 nM for Exo−Kl, unless otherwise indicated. The enzymes and hybrids were incubated for 5 min at room temperature in a final volume of 15 μl followed by addition of 6× gel loading buffer. Samples were then run at 110 V on a 7% polyacrylamide native gel. The products were visualized on a FLA-7000 phosphorimager (Fujifilm), and the ratio of amount of product shifted to amount unshifted was quantified with Multi Gauge software. Kd was determined by plotting the fraction of material shifted ([shifted]/[shifted + unshifted]) against the concentration of the enzyme and fitting the data by nonlinear least-squares fit to the following quadratic equation:

$$[ED] = 0.5\,([E]_t + [D]_t + K_d) - 0.5\,\{([E]_t + [D]_t + K_d)^2 - 4\,[E]_t[D]_t\}^{1/2}$$

where [ED] is the concentration of the enzyme-DNA complex, [E]t is the total enzyme concentration, and [D]t is the total primer-template concentration. As these experiments used relatively large amounts of nucleic acid starting material (2 nM) and the conditions were not optimized for binding to the polymerases, the gels were used only to compare binding to different constructs. These “apparent” equilibrium dissociation constants (referred to as Kd,app,gelshift) were determined for comparisons between the various constructs.
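The quadratic binding equation can be illustrated with a short numerical sketch. This is a minimal example, not the authors' analysis, which used SigmaPlot: the fitting routine here is a simple golden-section search over log10(Kd), swapped in so that the example needs only the Python standard library, and the function names and the 5 nM test value are hypothetical. It assumes the sum of squared residuals has a single minimum within the search bounds.

```python
import math


def fraction_bound(e_total, d_total, kd):
    """[ED]/[D]t from the quadratic binding equation (all concentrations in nM)."""
    s = e_total + d_total + kd
    ed = 0.5 * s - 0.5 * math.sqrt(s * s - 4.0 * e_total * d_total)
    return ed / d_total


def fit_kd(enzyme_nm, fractions, d_total, lo=1e-3, hi=1e4, steps=200):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""

    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((fraction_bound(e, d_total, kd) - f) ** 2
                   for e, f in zip(enzyme_nm, fractions))

    a, b = math.log10(lo), math.log10(hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(steps):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free data at the Taq enzyme concentrations listed above,
# with 2 nM primer-template and an assumed Kd of 5 nM.
enzyme = [0, 1.1, 3.1, 6.3, 12.5, 25, 50]
shifted = [fraction_bound(e, 2.0, 5.0) for e in enzyme]
print(fit_kd(enzyme, shifted, 2.0))  # should recover approximately 5 nM
```

Because the primer-template concentration (2 nM) is not negligible relative to the enzyme concentrations, the quadratic form is used rather than a simple hyperbola, which would assume that free and total enzyme concentrations are equal.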

Determination of Kd, koff, and t1/2 for binding of Taq to primer-templates at 23°C (room temperature) and 60°C using nitrocellulose filter binding. For dissociation rate constant (koff) determinations, reaction mixtures were in Taq polymerase buffer (see above) containing 5 nM gel-purified radiolabeled primer-template construct and 5 nM Taq polymerase in a volume of 90 μl. To start the analysis, 500 nM (final concentration in reaction mixtures) of unlabeled primer-template was added in a volume of 10 μl in Taq buffer. Ten-microliter aliquots were removed at different time points (depending on how long the construct bound to Taq) and immediately applied to nitrocellulose filter disks under vacuum. Each disk was washed twice with 1 ml of wash buffer (25 mM Tris-HCl [pH 7.5], 10 mM KCl), air dried, and counted in a scintillation counter. A graph of bound counts versus time was constructed, and the koff value was calculated by fitting the curve to an equation for two-parameter exponential decay ($y = ae^{-bx}$, where a is the value for bound substrate at time zero and b is the koff value) using SigmaPlot. Averages from 3 or more independent experiments (± standard deviations [SD]) were used to calculate koff. The half-life (t1/2) for binding was calculated from koff using the equation t1/2 = 0.693/koff. Kd determinations used the same buffer conditions as those described above with the addition of 0.1 mg/ml BSA. The reactions were performed in 1 ml of buffer that contained 2 pM gel-purified radiolabeled primer-template construct. Different amounts of Taq polymerase were added based on pilot experiments to approximate the Kd. Material was incubated at 23°C for 10 min and then applied to nitrocellulose filters as described above. The Kd was calculated using the equation described above in “Determination of the equilibrium dissociation constant by gel shift assay.” Note that this approach could not be used to accurately assess Kd values at 60°C due to temperature fluctuations that occur during application of the sample to the filter and washing of the filter.
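The koff and half-life calculation described above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' procedure: the published analysis used a nonlinear two-parameter exponential fit in SigmaPlot, whereas this sketch uses ordinary linear regression on the natural log of the bound counts, which weights noisy late time points differently. The function names, time points, and counts are hypothetical.

```python
import math


def fit_koff(times, counts):
    """Estimate (a, koff) in y = a * exp(-koff * t) by regression on ln(y).

    All counts must be positive (background-subtracted counts at or below
    zero would need to be dropped first).
    """
    logs = [math.log(c) for c in counts]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    a = math.exp(l_mean - slope * t_mean)
    return a, -slope


def half_life(koff):
    """t1/2 = ln(2) / koff, i.e. approximately 0.693 / koff."""
    return math.log(2.0) / koff


# Synthetic, noise-free decay with an assumed koff of 0.1 per minute.
times_min = [0, 1, 2, 4, 8, 16]
bound = [1000.0 * math.exp(-0.1 * t) for t in times_min]
a, koff = fit_koff(times_min, bound)
print(a, koff, half_life(koff))
```

The half-life has the same time unit as the sampling times, so the time points should be chosen to span at least one to two expected half-lives of the complex being measured.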

Preparation of constructs for competition PCR experiments. Constructs were prepared by PCR amplification using 50 ng of plasmid pNL43 and 50 pmol of each primer for 25 cycles of 94°C (30 s), 50°C (1 min), and 72°C (1 min), followed by 1 cycle for 4 min at 72°C. Samples were electrophoresed on a 6% native polyacrylamide gel, located by UV shadowing, and purified as described above. The primers used for the reactions are described in Table S1 in the supplemental material.

Competition PCR experiments with Taq polymerase. Reactions were performed in standard Taq polymerase buffer containing 200 μM dNTPs and 2.5 units of Taq polymerase in a 100-μl final volume. Reaction mixtures contained a 448-nucleotide template construct (prepared as described above) containing a 400-nucleotide region of the HIV genome (nucleotides 336 to 735 in the HIV LAI virus sequence) that was flanked by two 24-nucleotide regions at the 5′ and 3′ ends (see Fig. S1A and S2A in the supplemental material). The 24-nucleotide regions contained 20 nucleotides that could bind to different sets of primers followed by a fixed 4-nucleotide region derived from the 5′ overhang sequences of Taq1 (5′ end of construct) or Taq2 (3′ end of construct) (Table 1). Reaction mixtures included 1 ng of template and 50 pmol of each primer, including one radiolabeled primer (5′ end primer, in each case denoted with an asterisk in Fig. S1 and S2). For competition reactions, a second primer-template set was included at the same concentration without radiolabel. Cycling conditions were 94°C (30 s), 53°C (30 s), and 72°C (30 s). Ten-microliter aliquots were removed at 10, 13, 16, 19, 22, 25, and 28 cycles and added to 2 μl of 6× agarose gel loading buffer. Samples were electrophoresed on a 1% agarose gel, fixed in 10% trichloroacetic acid (TCA), and dried onto filter paper as described previously (40). Dried gels were analyzed using a phosphorimager.

SUPPLEMENTAL MATERIAL

Supplemental material for this article may be found at https://doi.org/10.1128/JB.00579-17.

SUPPLEMENTAL FILE 1, PDF file, 1.6 MB.

ACKNOWLEDGMENTS

This work was supported in part by a grant from the National Institute of General Medical Sciences (GM116645) awarded to J.J.D. and in part by funds to T.D.S. from the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.

REFERENCES

1. DeStefano JJ, Cristofaro JV. 2006. Selection of primer-template sequences that bind human immunodeficiency virus reverse transcriptase with high affinity. Nucleic Acids Res 34:130–139. https://doi.org/10.1093/nar/gkj426.

2. Nair GR, Dash C, Le Grice SF, DeStefano JJ. 2012. Viral reverse transcriptases show selective high affinity binding to DNA-DNA primer-templates that resemble the polypurine tract. PLoS One 7:e41712. https://doi.org/10.1371/journal.pone.0041712.

3. Powell MD, Levin JG. 1996. Sequence and structural determinants required for priming of plus-strand DNA synthesis by the human immunodeficiency virus type 1 polypurine tract. J Virol 70:5288–5296.

4. Rausch JW, Le Grice SF. 2004. ‘Binding, bending and bonding’: polypurine tract-primed initiation of plus-strand DNA synthesis in human immunodeficiency virus. Int J Biochem Cell Biol 36:1752–1766. https://doi.org/10.1016/j.biocel.2004.02.016.

5. Wöhrl BM, Moelling K. 1990. Interaction of HIV-1 ribonuclease H with polypurine tract containing RNA-DNA hybrids. Biochemistry 29:10141–10147. https://doi.org/10.1021/bi00496a001.

6. Champoux JJ, Schultz SJ. 2009. Ribonuclease H: properties, substrate specificity and roles in retroviral reverse transcription. FEBS J 276:1506–1516. https://doi.org/10.1111/j.1742-4658.2009.06909.x.

7. Jacob DT, DeStefano JJ. 2008. A new role for HIV nucleocapsid protein in modulating the specificity of plus strand priming. Virology 378:385–396. https://doi.org/10.1016/j.virol.2008.06.002.

8. Julias JG, McWilliams MJ, Sarafianos SG, Alvord WG, Arnold E, Hughes SH. 2004. Effects of mutations in the G tract of the human immunodeficiency virus type 1 polypurine tract on virus replication and RNase H cleavage. J Virol 78:13315–13324. https://doi.org/10.1128/JVI.78.23.13315-13324.2004.

9. Post K, Kankia B, Gopalakrishnan S, Yang V, Cramer E, Saladores P, Gorelick RJ, Guo J, Musier-Forsyth K, Levin JG. 2009. Fidelity of plus-strand priming requires the nucleic acid chaperone activity of HIV-1 nucleocapsid protein. Nucleic Acids Res 37:1755–1766. https://doi.org/10.1093/nar/gkn1045.

10. Johnson KA. 2010. The kinetic and chemical mechanism of high-fidelity DNA polymerases. Biochim Biophys Acta 1804:1041–1048. https://doi.org/10.1016/j.bbapap.2010.01.006.

11. Steitz TA. 1999. DNA polymerases: structural diversity and common mechanisms. J Biol Chem 274:17395–17398. https://doi.org/10.1074/jbc.274.25.17395.

12. Chien A, Edgar DB, Trela JM. 1976. Deoxyribonucleic acid polymerase from the extreme thermophile Thermus aquaticus. J Bacteriol 127:1550–1557.

13. Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487–491. https://doi.org/10.1126/science.2448875.

14. Dohm JC, Lottaz C, Borodina T, Himmelbauer H. 2008. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res 36:e105.

15. Benjamini Y, Speed TP. 2012. Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res 40:e72. https://doi.org/10.1093/nar/gks001.

16. Polz MF, Cavanaugh CM. 1998. Bias in template-to-product ratios in multitemplate PCR. Appl Environ Microbiol 64:3724–3730.

17. Jorgensen ED, Durbin RK, Risman SS, McAllister WT. 1991. Specific contacts between the bacteriophage T3, T7, and SP6 RNA polymerases and their promoters. J Biol Chem 266:645–651.

18. Datta K, LiCata VJ. 2003. Thermodynamics of the binding of Thermus aquaticus DNA polymerase to primed-template DNA. Nucleic Acids Res 31:5590–5597. https://doi.org/10.1093/nar/gkg774.

19. Berezovski M, Krylov SN. 2005. Thermochemistry of protein-DNA interaction studied with temperature-controlled nonequilibrium capillary electrophoresis of equilibrium mixtures. Anal Chem 77:1526–1529. https://doi.org/10.1021/ac048577c.

20. Cermakian N, Ikeda TM, Miramontes P, Lang BF, Gray MW, Cedergren R. 1997. On the evolution of the single-subunit RNA polymerases. J Mol Evol 45:671–681. https://doi.org/10.1007/PL00006271.

21. Monttinen HA, Ravantti JJ, Poranen MM. 2016. Common structural core of three-dozen residues reveals intersuperfamily relationships. Mol Biol Evol 33:1697–1710.

22. Cheetham GM, Jeruzalmi D, Steitz TA. 1999. Structural basis for initiation of transcription from an RNA polymerase-promoter complex. Nature 399:80–83. https://doi.org/10.1038/19999.

23. Joyce CM, Steitz TA. 1994. Function and structure relationships in DNA polymerases. Annu Rev Biochem 63:777–822. https://doi.org/10.1146/annurev.bi.63.070194.004021.

24. Sousa R. 1996. Structural and mechanistic relationships between nucleic acid polymerases. Trends Biochem Sci 21:186–190. https://doi.org/10.1016/S0968-0004(96)10023-2.

25. Sousa R, Padilla R. 1995. A mutant T7 RNA polymerase as a DNA polymerase. EMBO J 14:4609–4621.

26. Imburgio D, Rong M, Ma K, McAllister WT. 2000. Studies of promoter recognition and start site selection by T7 RNA polymerase using a comprehensive collection of promoter variants. Biochemistry 39:10419–10430. https://doi.org/10.1021/bi000365w.

27. Rong M, He B, McAllister WT, Durbin RK. 1998. Promoter specificity determinants of T7 RNA polymerase. Proc Natl Acad Sci U S A 95:515–519.

28. Johnson S, Chen YJ, Phillips R. 2013. Poly(dA:dT)-rich DNAs are highly flexible in the context of DNA looping. PLoS One 8:e75799. https://doi.org/10.1371/journal.pone.0075799.

29. Scipioni A, Anselmi C, Zuccheri G, Samori B, De Santis P. 2002. Sequence-dependent DNA curvature and flexibility from scanning force microscopy images. Biophys J 83:2408–2418. https://doi.org/10.1016/S0006-3495(02)75254-5.

30. Xiao S, Zhu H, Wang L, Liang H. 2014. DNA conformational flexibility study using phosphate backbone neutralization model. Soft Matter 10:1045–1055. https://doi.org/10.1039/c3sm52345d.

31. Zhang Y, Xi Z, Hegde RS, Shakked Z, Crothers DM. 2004. Predicting indirect readout effects in protein-DNA interactions. Proc Natl Acad Sci U S A 101:8337–8341. https://doi.org/10.1073/pnas.0402319101.

32. Schneider TD, Stephens RM. 1990. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18:6097–6100. https://doi.org/10.1093/nar/18.20.6097.

33. Chen Z, Schneider TD. 2005. Information theory based T7-like promoter models: classification of bacteriophages and differential evolution of promoters and their polymerases. Nucleic Acids Res 33:6172–6187. https://doi.org/10.1093/nar/gki915.

34. Chen Z, Schneider TD. 2006. Comparative analysis of tandem T7-like promoter containing regions in enterobacterial genomes reveals a novel group of genetic islands. Nucleic Acids Res 34:1133–1147. https://doi.org/10.1093/nar/gkj511.

35. Schneider TD. 1997. Information content of individual genetic sequences. J Theor Biol 189:427–441. https://doi.org/10.1006/jtbi.1997.0540.

36. Schneider TD. 1991. Theory of molecular machines. I. Channel capacity of molecular machines. J Theor Biol 148:83–123.

37. Schneider TD. 1991. Theory of molecular machines. II. Energy dissipation from molecular machines. J Theor Biol 148:125–137.

38. Dabney J, Meyer M. 2012. Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. Biotechniques 52:87–94. https://doi.org/10.2144/000113809.

39. Pan W, Byrne-Steele M, Wang C, Lu S, Clemmons S, Zahorchak RJ, Han J. 2014. DNA polymerase preference determines PCR priming efficiency. BMC Biotechnol 14:10. https://doi.org/10.1186/1472-6750-14-10.

40. Sambrook J, Russell DW. 2001. Molecular cloning: a laboratory manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

41. Lyakhov IG, Hengen PN, Rubens D, Schneider TD. 2001. The P1 phage replication protein RepA contacts an otherwise inaccessible thymine N3 proton by DNA distortion or base flipping. Nucleic Acids Res 29:4892–4900. https://doi.org/10.1093/nar/29.23.4892.

42. Schneider TD. 2001. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation. Nucleic Acids Res 29:4881–4891. https://doi.org/10.1093/nar/29.23.4881.

43. Feklistov A, Darst SA. 2011. Structural basis for promoter-10 element recognition by the bacterial RNA polymerase sigma subunit. Cell 147:1257–1269. https://doi.org/10.1016/j.cell.2011.10.041.

44. Zhang Y, Feng Y, Chatterjee S, Tuske S, Ho MX, Arnold E, Ebright RH. 2012. Structural basis of transcription initiation. Science 338:1076–1080. https://doi.org/10.1126/science.1227786.

45. Liu X, Bushnell DA, Kornberg RD. 2011. Lock and key to transcription: sigma-DNA interaction. Cell 147:1218–1219. https://doi.org/10.1016/j.cell.2011.11.033.
