Genetic Selection for Genes Encoding Sequence-Specific DNA-Binding Proteins Author(s): Stephen J. Elledge, Paul Sugiono, Leonard Guarente and Ronald W. Davis Source: Proceedings of the National Academy of Sciences of the United States of America, Vol. 86, No. 10 (May 15, 1989), pp. 3689-3693 Published by: National Academy of Sciences Stable URL: http://www.jstor.org/stable/33940 . Accessed: 02/05/2014 21:18 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . National Academy of Sciences is collaborating with JSTOR to digitize, preserve and extend access to Proceedings of the National Academy of Sciences of the United States of America. http://www.jstor.org This content downloaded from 130.132.123.28 on Fri, 2 May 2014 21:18:29 PM All use subject to JSTOR Terms and Conditions

Genetic Selection for Genes Encoding Sequence-Specific DNA-Binding Proteins



Proc. Natl. Acad. Sci. USA Vol. 86, pp. 3689-3693, May 1989 Genetics

Genetic selection for genes encoding sequence-specific DNA-binding proteins

(spectinomycin resistance/Epstein-Barr virus nuclear antigen 1/transcriptional interference/pseudolinkage/molecular cloning)

STEPHEN J. ELLEDGE*, PAUL SUGIONO†, LEONARD GUARENTE†, AND RONALD W. DAVIS*

*Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305; and †Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139

Contributed by Ronald W. Davis, December 22, 1988

ABSTRACT We describe a genetic selection method designed to facilitate the cloning of genes encoding sequence-specific DNA-binding proteins. The strategy selects for clones expressing particular sequence-specific DNA-binding activities from a library of clones encoding other, nonspecific proteins. Specific DNA-binding sites have been placed near the start of transcription of the strong synthetic conII promoter to create promoters that can be repressed by the corresponding sequence-specific DNA-binding proteins. Transcription from the conII derivatives in the absence of repression interferes with the phenotypic expression of an adjacent drug-resistance gene, aadA. Sequence-specific DNA-binding proteins are shown to repress these promoters and alleviate transcriptional interference of aadA, resulting in drug resistance in cells expressing the appropriate DNA-binding protein.

Gene expression is primarily regulated at the level of interactions between cis-acting control sequences and their corresponding sequence-specific recognition proteins. The identification of cis-acting regulatory sequences has become fairly routine in the last decade. Likewise, the identification of sequence-specific DNA-binding proteins that interact with these sequences in vitro has also accelerated. These advances have relied primarily on several methods to measure in vitro binding of proteins to DNA, such as nitrocellulose filter binding (1), DNase I protection ("footprinting") (2), and electrophoretic mobility retardation (3). Identifying the genes that encode these sequence-specific DNA-binding proteins, especially those from higher eukaryotes, has proven to be a more difficult task. The strategy typically employed involves the purification of the protein, followed by the preparation of antibody probes or by protein sequence determination to generate nucleic acid probes (4). These probes are then used to screen the appropriate libraries for the gene of interest. A recent, less arduous approach to cloning genes encoding sequence-specific DNA-binding proteins has been to screen expression libraries by using labeled specific DNAs as ligands (5). This direct screening procedure should facilitate the isolation of regulatory genes, although it relies on the production of a functional protein that retains its DNA-binding capacity when bound to a nitrocellulose filter. This requirement may preclude identification of certain classes of DNA-binding activities.

Identification methods that involve the screening of libraries are inherently limited in their ability to isolate cDNAs present in low abundance. An identification strategy utilizing a selection method can increase the sensitivity for detection of rare clones by several orders of magnitude, thus allowing one to take advantage of larger expression libraries being produced by advances in cloning technologies. Several DNA-binding proteins from eukaryotes, e.g., GAL4 (6), FLP (7), and Sp1 (4), have been shown to function as repressors when expressed in bacteria. Based on these observations we developed a genetic selection for isolating clones that express binding proteins corresponding to specific DNA sequences. The utility of this method is contingent upon the assumption that, in general, other eukaryotic DNA-binding proteins, when expressed in Escherichia coli, will retain their DNA-binding capacity. A selection scheme based on these considerations has been developed for Salmonella typhimurium (8), an organism that is less well suited to the introduction of large expression libraries and other genetic manipulations than E. coli. Our selection utilizes E. coli and is derived from the transcriptional interference assay described by Elledge and Davis (9).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Bacteria, Plasmids, and Genetic Techniques. E. coli JM106 was used as the lacIZ- host for the lacI selection experiments. CY15075 (W3110 tna2, ΔlacU169, trpR2) (10) was used as the trpR host. JM109 (JM107 recA1) was used as the host for the Epstein-Barr virus nuclear antigen 1 (EBNA-1) selection experiments. Plasmid pRPG9 (9) was a gift from M. Kuroda. pMC9 (11) was a gift from M. Calos. pKB280 (12) was a gift from K. Bachman. pNN386, pNN387, and pNN388 were constructed as described (9).

For drug-sensitivity measurements, LB agar plates were supplemented with either chloramphenicol (Cm, 40 µg/ml), spectinomycin (Sp, 80 µg/ml), kanamycin (Km, 40 µg/ml), or ampicillin (Ap, 50 µg/ml). E. coli is not sensitive to Sp in minimal medium. All plates contained an additional 100 µg of L-tryptophan per ml. When necessary, isopropyl β-D-thiogalactopyranoside was added to a final concentration of 3 mM. For selection of pNAK28 in cells bearing the conII-11 promoter, the Sp concentration was 200 µg/ml in plates.

Construction of Regulated Promoters and Plasmids. Construction of conI-1 and conII-2 was described previously (9). The reporter plasmid for the genetic selection of the λ repressor (cI) was made in two steps. First, a blunt-ended Pst I double-stranded oligonucleotide representing OR1, of sequence 5'-TTACCTCTGGCGGTGATATGCA-3' (bold letters represent the double-stranded region), was cloned into Sma I/Pst I-cut pNN396. This places the left edge of OR1 at +2 relative to the start of transcription, creating conII-10, pNN397. Then the Not I-HindIII promoter-OR1-containing fragment was cloned into Not I/HindIII-cleaved pNN388. This plasmid, pNN398, confers the CmrSps phenotype. The reporter plasmid for the selection of EBNA-1 was also constructed in two steps. First, a double-stranded oligonucleotide containing a single oriP sequence was synthesized with BamHI-compatible ends. The top strand was 5'-GATCAGATTAGGATAGCATATGCTACCCAGATA-3' and the bottom strand was 5'-GATCTATCTGGGTAGCATATGCTATCCTAATCT-3' (the overlapping region that is double-stranded when the oligonucleotides are hybridized is shown in bold type). This double-stranded oligonucleotide was ligated into BamHI-cleaved pNN396. A clone was chosen, pNN399, which contained two tandem copies of the oligonucleotide arranged in direct repeats. Then the Not I-HindIII promoter-oriP-containing fragment was cloned into Not I/HindIII-cleaved pNN388 to create pNN400.

Abbreviations: EBNA-1, Epstein-Barr virus nuclear antigen 1; Cm, chloramphenicol; Sp, spectinomycin; Ap, ampicillin.
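The bold type that marked the double-stranded region of the oriP oligonucleotides does not survive in plain text, but the region can be recovered from the two sequences themselves. The following is a minimal sketch, not part of the original methods: the helper names (`reverse_complement`, `annealed_core`) are hypothetical, and the only inputs taken from the text are the two strand sequences and the GATC (BamHI-compatible) 5' ends.

```python
# Illustrative check of the oriP oligonucleotide pair quoted in the text.
# Function names are hypothetical; only the sequences come from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_core(top: str, bottom: str, overhang: str = "GATC"):
    """Return the double-stranded core if the two oligos anneal so that each
    keeps a single-stranded 5' overhang; otherwise return None."""
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return None
    core = top[len(overhang):]
    if reverse_complement(core) != bottom[len(overhang):]:
        return None
    return core


TOP = "GATCAGATTAGGATAGCATATGCTACCCAGATA"
BOTTOM = "GATCTATCTGGGTAGCATATGCTATCCTAATCT"

core = annealed_core(TOP, BOTTOM)
print(len(TOP), len(BOTTOM), core)
```

Under these assumptions the two 33-nucleotide strands pair over the 29 bases that follow each GATC end, which is consistent with the description of BamHI-compatible ends.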

RESULTS AND DISCUSSION

The General Selection Scheme for Sequence-Specific DNA-Binding Activities. Elledge and Davis (9) described a transcriptional interference assay in which transcription from the strong tac promoter (Ptac) can interfere with the phenotypic expression of a convergently transcribed drug-resistance gene, aadA. This gene encodes aminoglycoside 3"-adenylyltransferase, an enzyme that confers resistance to the aminoglycosides Sp and streptomycin. Transcriptional interference can be detected by measuring Sp resistance; the level of drug resistance is inversely correlated with the level of transcription from Ptac. Therefore, repression of Ptac activates expression of aadA and confers resistance to Sp. In the presence of functional lac repressor (LacI), Ptac is repressed, aadA is functional, and cells are Sp-resistant (Spr). In the presence of isopropyl β-D-thiogalactopyranoside, cells become Sp-sensitive (Sps). Therefore, lacI is genetically, but not physically, linked to Spr (pseudolinkage). The general assay is based upon the following reasoning: (i) any DNA-binding protein will act as a repressor when bound at the proper position in a prokaryotic promoter; (ii) the principle of pseudolinkage can be employed during transformations and used to specifically enrich for molecules encoding the proper sequence-specific binding protein in a mixture of molecules encoding nonspecific proteins.

The selection is illustrated in Fig. 1. First, a defined binding site for the protein of interest is introduced into the assay promoter in place of the lac operator. This modified promoter is placed on the single-copy transcriptional interference assay plasmid and introduced into E. coli. Since E. coli normally lacks the protein of interest, the modified promoter will constitutively interfere with aadA expression and render the strain Sps (Fig. 1). An expression library is then introduced into this strain. If the library contains a clone expressing a functional DNA-binding protein that recognizes the specific binding site, that particular clone will disrupt the transcriptional interference and confer the Spr phenotype on the strain (Fig. 1), while other clones encoding nonspecific proteins will remain sensitive. Therefore, by selecting for Spr on the transformation plates, clones encoding the sequence-specific binding proteins that recognize the chosen binding site will be selectively enriched. It should be noted that the resistance marker on the expression library must be coselected along with Spr.

The most significant aspect of this selection scheme is that extremely large libraries containing >10^8 recombinants may be screened by employing multiple rounds of selection. This capability is at least 2 orders of magnitude larger than the numbers of recombinants that can be readily screened in plaque assays. The limits of enrichment for a single selection step depend upon the reversion frequency of the particular modified promoter, which varies from construct to construct.
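The logic of the selection can be summarized as a small truth table. The sketch below is a deliberately simplified toy model, not the authors' procedure: it treats repression as all-or-none, ignores expression level and reversion, and uses hypothetical function names (`sp_phenotype`, `select_on_sp`). The plasmid-to-site pairings are those listed in the Fig. 2 legend.

```python
# Toy model of the Sp selection (illustrative only; names are hypothetical).
from typing import Dict, List, Optional


def sp_phenotype(operator_site: str, protein_target: Optional[str]) -> str:
    """Return 'Spr' if the clone expresses a protein that recognizes the
    operator placed in the assay promoter, otherwise 'Sps'."""
    repressed = protein_target is not None and protein_target == operator_site
    # Repression of the assay promoter relieves interference with aadA.
    return "Spr" if repressed else "Sps"


def select_on_sp(library: Dict[str, Optional[str]], operator_site: str) -> List[str]:
    """Return the clones expected to survive Sp selection in this toy model."""
    return [clone for clone, target in library.items()
            if sp_phenotype(operator_site, target) == "Spr"]


# Clone -> binding site recognized by its encoded protein (None = vector only).
library = {"pRPG9": "trpO", "pMC9": "lacO", "pBR322": None}

survivors = select_on_sp(library, "trpO")
print(survivors)
```

In this simplified picture only the clone whose protein matches the operator in the reporter strain survives, which is the pseudolinkage between the library marker and Spr that the scheme relies on.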

The Selection Scheme Functions with a Wide Variety of Prokaryotic DNA-Binding Proteins. As a preliminary test for the selection scheme described in Fig. 1, selections were designed for three prokaryotic genes, lacI, trpR, and λ cI. These genes were carried on Apr derivatives of pBR322. Target promoters were constructed for each of these DNA-binding proteins and were placed on the assay vector pNN388 (9). The sequences of these promoters are shown in Fig. 2. The repressor-encoding plasmids were introduced by transformation into their respective reporter strains and the frequencies of drug-resistant transformants were measured (Table 1).

[Figure 1 diagram: an Sps cell carrying Preg with its Op site and the convergently transcribed aadA gene with its promoter PaadA; introduction of a cDNA expression library on an Apr plasmid vector; a resulting AprSpr cell.]

FIG. 1. General strategy for genetic selection of genes encoding sequence-specific DNA-binding proteins. Rounded rectangles represent individual E. coli cells. Lines with arrows represent directions of transcription. The thin line represents DNA, and the stippled box represents the aadA gene. Op is the synthetic DNA binding site that is the target of the DNA-binding protein encoded by the gene of interest. The Op site is placed in front of the con promoter to produce Preg, a regulated promoter that interferes with the phenotypic expression of aadA; when transcription from Preg is not repressed, the cell will be Sps. PaadA represents the promoter for aadA. If, after introduction of an expression library, there exists a plasmid clone (represented by an open circle) that produces a protein (represented by the filled circles) that specifically recognizes the Op site, then the Preg promoter will be repressed. This diminishes transcriptional interference and permits expression of the aadA gene as depicted by the longer transcription arrow. The cell containing the plasmid encoding the cognate DNA-binding protein becomes Spr.

The data in Table 1 show that the ability to activate aadA is specific for the plasmid encoding the cognate sequence-specific DNA-binding protein (compare pMC9 and pRPG9 effects on conII-2). The vector alone (pBR322 or pUC8) did not confer Spr upon the host. The number of transformants within a cognate pair that survive a double Ap Sp selection is slightly lower than the number that survive Ap alone, even though all Apr colonies are Spr when the selections are applied sequentially.

Many eukaryotic transcription factors are thought to bind DNA weakly and rely on multiple binding sites and cooperativity to function in their normal environments. The bacteriophage λ repressor, encoded by the cI gene, is a prokaryotic protein that shares these properties. Therefore, the λ cI gene

conII     TATAATGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTT
          (polylinker sites, in order: KpnI, SmaI, BamHI, XbaI, SalI, PstI, SphI, HindIII; the -10 and +1 positions are marked above this sequence in the original figure)
conI-1    TATAATGTGTGGAATTAATTGTGAGCGCTCACAATTAATTCGAGCTCGGTG     lacO
conII-2   TATAATGGTACAACTAGTTAACTAGTACGCTGCAGGCATGCAAGCTT....     trpO
conII-10  TATAATGGTACCCTTACCTCTGGCGGTGATATGCAGGCATGCAAGCTT...     OR1
conII-11  TATAATGGTACCCGGG(GATCTATCTGGGTAGCATATGCTATCCTAATCT)2    oriP

FIG. 2. Sequences of the regulated promoters used in this study. The underlined sequences represent the binding sites of the cognate DNA-binding proteins. The names of these binding sites are listed below the relevant sequences. The plasmids encoding the particular cognate proteins are pMC9 (LacI) for lacO; pRPG9 (TrpR) for trpO; pKB280 (cI) for OR1; and pNAK28 (EBNA-1) for oriP.


Table 1. The Sp selection system functions for a wide variety of DNA-binding proteins

                                      % of Apr transformants*    % of Apr colonies
Plasmid or phage    Target promoter   Apr        AprSpr          that are Spr†
pMC9 (lacI)         conI-1            100        91              100
pBR322              conI-1            100        0
pRPG9 (trpR)        conII-2           100        87              100
pBR322              conII-2           100        0
pMC9 (lacI)         conII-2           100        0               0
λgt11 (cI857)       conII-10          100‡       0               0
pKB280 (cI)         conII-10          100        85              100
pBR322              conII-10          100        0
pNAK28 (EBNA-1)     conII-11          100        80              100
pUC8                conII-11          100        0               0

Pseudolinkage to Spr was tested by transformation. After heat shock, cells were incubated in LB broth while shaking for 1-1.5 hr at 37°C to allow expression of the DNA-binding protein before plating. The transformation recipient for the pRPG9 selection was CY15075 bearing the conII-2 derivative of pNN388. The transformation recipient for pMC9 was the conI-1 derivative of pNN388 (9) in JM106. The transformation recipient for pKB280 was the conII-10 derivative of pNN388 in JM107. λgt11 was lysogenized into the same strain to test for the ability of a single copy of cI857 to be selected. Lysogens were selected by infecting at a multiplicity of infection of 2 in LB broth plus 10 mM MgCl2 for 1 hr at 30°C. Cells were then plated on LB agar plates spread with 0.1 ml of λc-h80 (10^9 plaque-forming units) and incubated overnight at 30°C. Colonies that grew were tested for λ immunity by cross-streaking with wild-type λ at 30°C and were checked for temperature sensitivity at 42°C.
*Values are normalized to the number of Apr transformants.
†These percentages were calculated by replica-plating the Apr transformants after 1 day of growth on plates containing Sp or Cm. AprCms colonies were subtracted from the Apr total in this calculation. Approximately 400 colonies were screened for each transformation.
‡This value represents a normalization to λcI-h80-resistant, temperature-sensitive transductants as described in the legend.

product was tested for DNA-binding activity in this assay. A single copy of the λ cI gene, provided by lysogenizing the tester strain with λgt11, is insufficient to provide Spr with the conII-10 assay promoter (Table 1). However, pKB280, an overproducer of λ repressor (12), can confer Spr on the tester strain. This experiment demonstrates that even weak DNA-binding proteins that normally rely on multiple binding sites and cooperativity can be selected in this assay by increasing their levels in the cell.

Eukaryotic DNA-Binding Proteins Can Also Be Selected in the Spr Assay. EBNA-1 was selected as a model eukaryotic DNA-binding protein to test for DNA-binding in this assay. EBNA-1 is required for maintenance of the Epstein-Barr virus genome as an autonomously replicating plasmid in human cell lines and is also a transactivator of viral gene expression (13). The carboxyl-terminal third of EBNA-1 has been expressed in E. coli as a fusion protein from the plasmid pNAK28 and shown to encode a sequence-specific DNA-binding protein (14, 15). We constructed the test promoter conII-11 (Fig. 2) by using a sequence derived from Epstein-Barr virus that has been shown to bind the EBNA-1 fusion protein in vitro. pNAK28 produces EBNA-1 from the λ PL promoter as a fusion protein of the first 33 amino acids of the λ N protein and the terminal 191 amino acids of EBNA-1. The plasmid also carries a λ gene, cI857, that encodes a temperature-sensitive repressor of the PL promoter. Cells bearing the target plasmid pNN400 containing conII-11 become Spr in the presence of pNAK28 at 37°C, but not in the presence of pUC8 (the parent plasmid of pNAK28), which lacks the EBNA-1 sequence (Table 1). Furthermore, cells bearing pNAK28 are not Spr at 30°C, indicating that transcription of EBNA-1 is needed for resistance (data not shown). Thus, eukaryotic sequence-specific DNA-binding proteins are selectable in this assay.

A Plasmid Encoding the trp Repressor (TrpR) Can Be Enriched 100,000-Fold in One Selection Step. To test the hypothesis that molecules encoding particular DNA-binding activities can be enriched from a mixture, either pBR322, pRPG9 (trpR+), or mixtures of the two plasmids in different ratios were transformed into the trpR- strain containing the conII-2 derivative of pNN388 (Fig. 1) and selected for either Apr or AprSpr (Table 2). Alone, pBR322 did not confer Spr on the strain, but pRPG9 did at a high frequency (Table 1). To demonstrate that one can identify rare clones of a gene encoding a particular DNA-binding protein among many clones that do not, mixtures of pBR322 spiked with amounts of pRPG9 as low as 1 part in 100,000 were introduced into the tester strain and AprSpr colonies were selected. All but one of the 18 transformants were shown to contain pRPG9 by restriction analysis. The single pBR322-containing transformant at the highest dilution was probably due to transformation of a previously existing revertant in the population. This experiment demonstrates that the Spr assay can be used to identify clones encoding sequence-specific DNA-binding proteins when they are present in very low amounts in a mixture of molecules such as one might find in an expression library. Successful selections for the lac repressor and the λ repressor when present in mixtures have also been performed (data not shown).
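The enrichment figure in the heading can be checked against the numbers reported for the 10^5:1 mixture (Table 2 and the restriction analysis above). The sketch below is an illustrative back-of-the-envelope calculation, not the authors' analysis. It assumes that the weight ratio approximates the molar ratio of the two plasmids (their sizes are not given here), and the function name `fold_enrichment` is hypothetical.

```python
# Rough enrichment estimate from the reported 10^5:1 spiking experiment.
# Assumption (not from the paper): weight ratio ~ molar ratio of plasmids.


def fold_enrichment(background_per_positive: float,
                    positives: int, selected: int) -> float:
    """Ratio of the positive fraction after selection to that before it."""
    fraction_before = 1.0 / (background_per_positive + 1.0)
    fraction_after = positives / selected
    return fraction_after / fraction_before


# 17 of the 18 AprSpr transformants contained pRPG9.
fold = fold_enrichment(1e5, positives=17, selected=18)

# With about 1 x 10^6 Apr transformants per microgram of the mixture, roughly
# this many pRPG9 transformants per microgram would be expected before selection.
expected_positive_transformants = 1e6 / (1e5 + 1)

print(f"fold enrichment ~ {fold:.3g}")
print(f"expected pRPG9 transformants per ug ~ {expected_positive_transformants:.1f}")
```

Under these assumptions the estimate comes out just below 10^5, in line with the stated 100,000-fold enrichment, and the 18 colonies recovered per microgram are of the same order as the number of pRPG9 transformants expected in the mixture.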

General Considerations for Designing a Genetic Selection. The most important factor in using the genetic selection scheme is the choice of a well-characterized DNA-binding site of high affinity and near-minimal size. An in vitro assay to detect binding to this sequence is important to facilitate the subsequent characterization of putative positive clones. Ideally, a sequence should be chosen for which a footprint of the factor on that sequence has been obtained. Less well suited, but still useful, are cis-acting sequences that have been defined genetically through deletion and point mutation analysis. Occasionally, multiple binding sites are employed as was the case for oriP. It is not clear whether these additional sites are necessary or even helpful. However, if the protein of interest shows any signs of cooperative binding to a pair of sites, multiple sites should be strongly considered.

A successful strategy for operator positioning has been to place the leftmost edge of an operator near the start of transcription, between positions -5 and +5 (9). Proteins that bind DNA very tightly, such as LacI and EBNA-1, can efficiently repress when placed further away from the start of transcription than +5, but weaker binding proteins such as TrpR can only repress efficiently when located closer to the start of transcription (9). Approximately 95% of all sequences placed in this context retain sufficient promoter strength to function in the assay (data not shown). It is known that sequences in the transcribed region can contribute to promoter strength (16), but the rules governing these effects are still unknown.

Table 2. Plasmids producing specific DNA-binding activities may be recovered from mixed populations

                              Transformation frequency,
                              resistant colonies per µg
Plasmid(s)                    Apr           AprSpr
None                          0             0
pBR322                        5 x 10^5      0
pRPG9                         6 x 10^5      4 x 10^5
pBR322/pRPG9 (100:1)*         1 x 10^6      5 x 10^3
pBR322/pRPG9 (10^5:1)*        1 x 10^6      18

*Weight ratio.


To construct a regulated promoter, an oligonucleotide containing the binding site is synthesized with ends that will facilitate subcloning into the assay promoter. When designing the oligonucleotide, an attempt is made to define the sequence such that a purine, preferably an adenine or a stretch of adenines, is located at position +1, although this is clearly not essential since the conII-10 promoter has only pyrimidines from -2 to +3 and it is fully functional in this assay.

Assay promoters carrying substituted binding sites are constructed in two steps as shown in Fig. 3. First, sites are synthesized and cloned into pNN396, which contains the conII promoter. Typically, pNN396 is cut with Kpn I or Asp718 and made flush-ended with T4 DNA polymerase to prepare a blunt site at position -5 or -1, respectively. This molecule is then cleaved with Pst I and purified by agarose gel electrophoresis. It is then ligated to the synthetic double-stranded oligonucleotide, which was designed to possess one flush end and one Pst I-compatible end.

Oligonucleotide-containing subclones should be sequenced to verify their structure. This is easily accomplished because pNN396 contains a phage f1 origin. After properly modified promoters have been created, they can be placed into the transcriptional interference assay vector pNN388 by using Not I and HindIII restriction enzymes (Fig. 3). These constructs should convert the CmrSpr pNN388 into a CmrSps derivative.

After choosing the best operator sequence to function in this assay, choosing the proper vector for the construction of the expression library is the most critical decision. It is clear that higher levels of functional binding protein will result in more efficient repression, as observed for λ cI in Table 1. However, it is also clear that overproduction of a protein can have lethal consequences for E. coli. The optimal solution is to use a tightly regulated strong promoter such as Plac, Ptac, or λ PL (17, 18) and attempt the selection at two different levels of expression. Further, the greatest likelihood that a DNA-binding protein will be functional occurs when that protein is produced from its normal initiation codon. If a protein fusion must be made to provide optimal translation, then the fewest amino acids possible should be added prior to the fusion junction. Many proteins must form higher-order oligomeric structures in order to function properly and the larger the perturbation, the greater the probability of interference. Some expression vectors designed primarily for immunological purposes fuse proteins to the carboxyl terminus of β-galactosidase, creating a large fusion protein. Surprisingly, several DNA-binding proteins have been shown to retain their DNA-binding activities when fused in this manner (5, 19).

[Figure 3 diagram: a DNA binding site (Op) with Kpn I and Pst I ends is inserted into the polylinker (Kpn I, Sma I, BamHI, Xba I, Sal I, Pst I, Sph I, HindIII) of a plasmid carrying ori and f1 ori; the Not I-HindIII fragment containing Op is then transferred into pNN388 (Cmr, aadA) to give the reporter plasmid (CmrSps).]

FIG. 3. Construction of regulated promoters. A synthetic oligonucleotide containing a binding site, labeled Op, is introduced into pNN386. This figure shows an oligonucleotide with Kpn I and Pst I ends, but any combination of end sites that are compatible with the desired position of insertion can be used. Insertion of this oligonucleotide creates a new potentially regulated target promoter, Preg. This new promoter is then excised on a Not I-HindIII fragment and cloned into the single-copy assay vector pNN388. This produces the required transcriptional interference needed for selection and changes pNN388 from Spr to Sps. bla, β-lactamase gene.

After the DNA-binding sequence has been defined and placed into pNN388 to produce a CmrSps derivative, and the expression libraries to be screened are chosen, the selection can be attempted. A second round of selection can be employed to further the enrichment of the desired plasmids. One potential difficulty with the second round of selection is that some of the revertants from the first selection are due to mutations in the pNN388-based assay plasmid (e.g., increased copy-number mutants and rearrangements that alter the transcriptional interference). If a plasmid library were used, then the typical second-round selection would be to prepare plasmid DNA from the putative positives en masse and reintroduce this DNA into the assay strain by transformation, selecting for AprSpr. However, since Spr plasmids already exist in the population, they will occasionally cotransform the Apr clones and will produce secondary false positives. One simple way to circumvent this problem is to cleave with Not I prior to the second-round transformation. This will linearize the assay vector in the population. Another method is to insert the assay cassette into the host chromosome. This might be achieved either by employing homologous recombination as detailed by Winans et al. (20) or by placing the assay cassette in the λ genome and creating a lysogen.

One major advantage of having the interference-assay cassette on the chromosome is that now a plasmid bearing the regulated assay promoter driving lacZ expression on pNN387 (9) can be used as a confirming screen in the second selection step. We advise using the lacZ plasmid in the second selection step only, because the homologous promoters on the plasmid and the chromosome may interact at low frequency to produce rare revertants. Rare events will be less of a problem in the second round of selection.

A positive sign that an enrichment for a plasmid molecule encoding a DNA-binding activity has occurred is an increase in the ratio of AprSpr to Apr transformants in the second round of selection. If this has not occurred, it may be due to a high rate of reversion and a third round of selection should be performed. Potential positive clones should then be purified and treated individually.

Once pure positives have been isolated, linkage and specificity must be demonstrated. Putative positive clones must be shown to transduce the Apr phenotype with the Spr phenotype 100% of the time. Further, they must transduce Spr to only those strains bearing assay plasmids with the cognate DNA-binding site. Reporter strains containing target sites known not to bind the protein should remain Sps.

The final and most rigorous proof of having isolated the sought-after gene is the demonstration of a binding activity in vitro that matches the properties observed for the original binding factor such as binding competition, footprints, and methylation protection patterns. Ideally, one would like to demonstrate that this cDNA, when expressed in its native organism, produces the expected regulation in vivo and overproduces the identified binding factor assayed in vitro.

Conclusions. We have demonstrated the feasibility of a genetic selection scheme based on a transcriptional interference assay to isolate clones that express proteins that bind to specific DNA sequences. This selection can be employed in a transformation procedure to specifically enrich for molecules encoding the proper sequence-specific binding protein in a mixture of molecules encoding nonspecific proteins. We have shown that both prokaryotic and eukaryotic DNA-binding proteins can be detected by this assay. The advantageous features of this system are that (i) it functions in E. coli and is therefore relatively quick and easy; (ii) it has a low background reversion frequency, and multiple rounds of selection can be used to purify rare clones; (iii) it is compatible with existing λ-based cDNA expression vectors as well as plasmid vectors; (iv) the binding site is present on a single-copy plasmid or the chromosome, thereby minimizing the number of molecules of binding protein needed; (v) it is extremely sensitive, requiring DNA-binding proteins in concentrations that give only 4-fold repression (9); (vi) Sp levels may be adjusted to select for minimal or maximal levels of repression; (vii) relatively small proteins that bind DNA weakly can be selected with this system; and (viii) extremely large libraries can be completely screened.

This selection scheme has certain limitations. Eukaryotic DNA-binding proteins that require posttranslational modifications absent from E. coli will be difficult to isolate, as will binding functions requiring two different polypeptides for activity (21). Factors that have a short recognition sequence (7 base pairs or fewer) will be difficult to identify because sites identical to the recognition sequence are present on the E. coli chromosome and will compete efficiently with the site on the assay plasmid.
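The limitation on short recognition sequences can be illustrated with a rough back-of-the-envelope estimate. The sketch below is not part of the original analysis: it assumes a genome of about 4.6 million base pairs for E. coli (a figure not given in the text), equal and independent base frequencies, and a non-palindromic site counted on both strands. The function name `expected_chance_sites` is hypothetical.

```python
# Illustrative estimate only: expected number of chance matches to a short
# recognition site in a bacterial genome, assuming random sequence.

# Assumption (not from the paper): approximate E. coli K-12 genome size in bp.
GENOME_BP = 4_600_000


def expected_chance_sites(site_length: int, genome_bp: int = GENOME_BP) -> float:
    """Expected exact matches to one specific site of `site_length` bp.

    Assumes each base occurs independently with probability 1/4 and counts
    both strands, which is appropriate for a non-palindromic site.
    """
    if site_length < 1:
        raise ValueError("site_length must be at least 1")
    per_position = 0.25 ** site_length  # chance that a given start position matches
    return 2 * genome_bp * per_position  # two strands, about genome_bp starts each


if __name__ == "__main__":
    for n in (6, 7, 8, 10, 12, 16):
        print(f"{n:2d} bp site: ~{expected_chance_sites(n):.3g} chance matches expected")
```

Under these assumptions, a 7-base-pair site is expected several hundred times on the chromosome by chance, whereas a 12-base-pair site is expected less than once. Real genomes are not random sequence, so these numbers indicate only the order of magnitude of chromosomal competitor sites.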

This assay is also useful once a gene has been isolated. It can be used to rapidly identify mutations in the gene and to select revertants of these mutations. Having a functional assay in E. coli will also greatly facilitate the structure-function analysis of the gene. Further, this assay may be used to select for novel binding specificities.

We feel that this genetic selection, along with the other recently developed screens for identification of sequence-specific DNA-binding proteins, has set the stage for the systematic isolation of eukaryotic DNA-binding proteins from expression libraries. The critical task that lies ahead is the production of large cDNA expression libraries of high quality in easy-to-manipulate expression vectors.

We thank M. Kuroda, J. Mulligan, M. Sachs, and B. Konforti for critical comments on the manuscript. We wish also to thank M. Kuroda, S. Lewis, and T. Huynh for many helpful discussions and H. Singh for gifts of plasmids and oligonucleotides. S.J.E. was a Helen Hay Whitney Fellow and an American Cancer Society Senior Fellow during the course of this work. This work was supported by grants from the National Institute of General Medical Sciences (5R37-GM21891) and the National Science Foundation (DMB 8719440) to R.W.D.

1. Lin, S. & Riggs, A. D. (1973) J. Mol. Biol. 72, 671-690.
2. Galas, D. J. & Schmitz, A. (1978) Nucleic Acids Res. 5, 3157-3171.
3. Fried, M. & Crothers, D. M. (1981) Nucleic Acids Res. 9, 6505-6525.
4. Kadonaga, J. T., Carner, K. R., Masiarz, F. R. & Tjian, R. (1987) Cell 51, 1079-1090.
5. Singh, H., LeBowitz, J. H., Baldwin, A. S., Jr., & Sharp, P. (1988) Cell 52, 415-423.
6. Paulmier, N., Yaniv, M., von Wilcken-Bergmann, B. & Müller-Hill, B. (1987) EMBO J. 6, 3539-3542.
7. Lebreton, B., Prasad, P. V., Jayaram, M. & Youderian, P. (1988) Genetics 118, 393-400.
8. Benson, N., Sugiono, P., Bass, S., Mendelman, L. V. & Youderian, P. (1986) Genetics 114, 1-14.
9. Elledge, S. J. & Davis, R. W. (1989) Genes Dev. 3, 185-197.
10. Kelley, R. L. & Yanofsky, C. (1985) Proc. Natl. Acad. Sci. USA 82, 483-487.
11. Miller, J., Lebkowski, J. S., Greisen, K. S. & Calos, M. C. (1984) EMBO J. 3, 3117-3121.
12. Backman, K. & Ptashne, M. (1978) Cell 13, 65-71.
13. Reisman, D. & Sugden, B. (1986) Mol. Cell. Biol. 6, 3838-3846.
14. Rawlins, D. R., Milman, G., Hayward, S. D. & Hayward, G. S. (1985) Cell 42, 859-868.
15. Reisman, D., Yates, J. & Sugden, B. (1985) Mol. Cell. Biol. 5, 1822-1832.
16. Kammerer, W., Deuschle, U., Gentz, R. & Bujard, H. (1986) EMBO J. 5, 2995-3000.
17. Boer, H. A., Comstock, L. S. & Vasser, M. (1983) Proc. Natl. Acad. Sci. USA 80, 21-25.
18. Rosenberg, M., Ho, Y. & Shatzman, A. (1983) Methods Enzymol. 101, 123-138.
19. Johnson, S. & Herskowitz, I. (1985) Cell 42, 237-247.
20. Winans, S. C., Elledge, S. J., Mitchell, B. B. & Walker, G. C. (1985) J. Bacteriol. 161, 1219-1221.
21. Olesen, J., Hahn, S. & Guarente, L. (1987) Cell 51, 953-961.
