8
The EMBO Journal vol. 10 no. 5 pp. 1 179 - 1186, 1991 Optimal DNA sequence recognition by the Ultrabithorax homeodomain of Drosophila Stephen C.Ekker, Keith E.Young, Doris P.von Kessler and Philip A.Beachy Howard Hughes Medical Institute, Department of Molecular Biology and Genetics, The Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA Communicated by D.Nathans The 61 amino acid homeodomain is conserved among members of a family of eukaryotic DNA-binding proteins that play regulatory roles in transcription and in development. We have refined a rapid method for determining optimal DNA binding sites and have applied it to a 72 amino acid peptide containing the homeodomain of the Ultrabithorax (Ubx) homeotic gene of Drosophila. The site (5'-TTAATGG-3') is tightly bound (KD 7 x 10-11 M) by the Ubx homeodomain peptide; the four central TAAT bases of this sequence play a primary role in determining the affinity of binding, with significant secondary contributions deriving from the flanking bases. Although previously defined genomic sites contain multiple TAAT sequences with flanking bases distinct from those in the optimal binding site, we have found a new binding site with seven near-perfect repeats of the optimal sequence; this site is located in the promoter region of decapentaplegic, a probable Ubx regulatory target. The presence of a TAAT motif in the binding sites for most other homeodomain proteins suggests the existence of a conserved mechanism for recognition of this core sequence, with further specificity conferred by interactions with bases flanking this core. Key words: development/DNA recognition/Drosophila! homeodomain/Ultrabithorax Introduction The homeodomain was first recognized as a 61 codon region of similarity in the sequences of several Drosophila genes that play important roles in embryonic development. These sequences are present in many other genes from higher and lower eukaryotes whose products also appear to function in transcriptional or developmental regulation (Gehring, 1987; Scott et al., 1989). The homeodomain is sufficient for sequence-specific DNA binding activity, even without the context of a flanking polypeptide sequence (Mihara and Kaiser, 1988; Muller et al., 1988). The C-terminal half of the homeodomain resembles the DNA-contacting helix -turn - helix portion of certain prokaryotic repressors and, indeed, recently determined structures for the Antennapedia (Antp) and engrailed (en) homeodomains of Drosophila confirm a helix -turn - helix conformation for this region (Qian et al., 1989; Kissinger et al., 1990). The functional significance of this structural homology is supported by the observation that DNA sequence preferences can be redirected by amino acid changes in portions of Oxford University Press homeodomain proteins corresponding to the recognition helix of the prokaryotic helix -turn-helix motif (Hanes and Brent, 1989; Treisman et al., 1989; Percival-Smith et al., 1990). Functional studies of chimeric proteins in the Drosophila embryo indicate that regulatory specificity is determined, at least in part, by the identity of the homeodomain present within a particular protein (Gibson et al., 1990; Kuziora and McGinnis, 1989; Mann and Hogness, 1990). Our focus here is upon Ultrabithorax (Ubx), a homeotic gene within the lowest tier of the genetic control hierarchy that directs early Drosophila development. Ubx operates within the segmented framework established in the embryo by earlier-acting maternal effect and segmentation genes; its function is to specify the unique features of parasegments 5 and 6 (PS5 and PS6), which together constitute a contiguous region including the posterior thorax and a portion of the first abdominal segment (reviewed in Beachy, 1990). As a result of differential splicing, the Ubx transcription unit produces a family of closely related proteins (Beachy et al., 1985; Komfeld et al., 1989; O'Connor et al., 1988) which appear to function in the control of transcription (Johnson and Krasnow, 1990; Krasnow et al., 1989; Samson et al., 1989; Thali et al., 1988). Since each resultant protein product contains the homeodomain, each is presumably capable of sequence- specific DNA binding; this property, however, has been directly studied for only one of the Ubx protein structures (Beachy et al., 1988). Although DNA binding sites for a variety of homeodomain proteins have been identified, optimal sites have not been defined; nor have the contributions of individual bases to the affinity of binding been systematically examined. Such information is crucial for understanding how homeodomain proteins discriminate between DNA binding sites, and in particular, how homeodomain proteins might bind cooperatively or competitively to DNA in situations where a number of different homeodomain regulatory proteins are present within a cell. Such information is also essential as a basis for interpreting three-dimensional structure data concerning the molecular mechanisms of DNA sequence recognition. We have refined a method for determining the optimal binding site of a DNA-binding protein; this method provides a statistical indication of the relative importance of individual bases within the binding site. We have applied this method to a homeodomain peptide encoded by Ubx and have confirmed the validity of the statistical predictions via a biochemical analysis of several binding site sequence variants. Our results indicate that the Ubx homeodomain optimally recognizes a seven base pair sequence (5'-TTAATGG-3'), within which the central TAAT core plays a major role in determining the affinity of binding. Bases flanking the TAAT core further contribute to overall affinity and hence to sequence specificity. Since TAAT is a sequence common to many other homeodomain binding 1 79

The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

The EMBO Journal vol. 10 no. 5 pp. 1 179 - 1186, 1991

Optimal DNA sequence recognition by the Ultrabithoraxhomeodomain of Drosophila

Stephen C.Ekker, Keith E.Young,Doris P.von Kessler and Philip A.Beachy

Howard Hughes Medical Institute, Department of Molecular Biologyand Genetics, The Johns Hopkins University School of Medicine,Baltimore, MD 21205 USA

Communicated by D.Nathans

The 61 amino acid homeodomain is conserved among

members of a family of eukaryotic DNA-binding proteinsthat play regulatory roles in transcription and indevelopment. We have refined a rapid method fordetermining optimal DNA binding sites and have appliedit to a 72 amino acid peptide containing the homeodomainof the Ultrabithorax (Ubx) homeotic gene of Drosophila.The site (5'-TTAATGG-3') is tightly bound (KD

7x 10-11 M) by the Ubx homeodomain peptide; thefour central TAAT bases of this sequence play a primary

role in determining the affinity of binding, withsignificant secondary contributions deriving from theflanking bases. Although previously defined genomic sitescontain multiple TAAT sequences with flanking basesdistinct from those in the optimal binding site, we havefound a new binding site with seven near-perfect repeatsof the optimal sequence; this site is located in thepromoter region of decapentaplegic, a probable Ubxregulatory target. The presence of a TAAT motif in thebinding sites for most other homeodomain proteinssuggests the existence of a conserved mechanism forrecognition of this core sequence, with further specificityconferred by interactions with bases flanking this core.Key words: development/DNA recognition/Drosophila!homeodomain/Ultrabithorax

Introduction

The homeodomain was first recognized as a 61 codon regionof similarity in the sequences of several Drosophila genes

that play important roles in embryonic development. Thesesequences are present in many other genes from higher andlower eukaryotes whose products also appear to function intranscriptional or developmental regulation (Gehring, 1987;Scott et al., 1989). The homeodomain is sufficient forsequence-specific DNA binding activity, even without thecontext of a flanking polypeptide sequence (Mihara andKaiser, 1988; Muller et al., 1988). The C-terminal half ofthe homeodomain resembles the DNA-contactinghelix -turn- helix portion of certain prokaryotic repressorsand, indeed, recently determined structures for theAntennapedia (Antp) and engrailed (en) homeodomains ofDrosophila confirm a helix -turn - helix conformation forthis region (Qian et al., 1989; Kissinger et al., 1990). Thefunctional significance of this structural homology is

supported by the observation that DNA sequence preferencescan be redirected by amino acid changes in portions of

Oxford University Press

homeodomain proteins corresponding to the recognition helixof the prokaryotic helix -turn-helix motif (Hanes and Brent,1989; Treisman et al., 1989; Percival-Smith et al., 1990).Functional studies of chimeric proteins in the Drosophilaembryo indicate that regulatory specificity is determined, atleast in part, by the identity of the homeodomain presentwithin a particular protein (Gibson et al., 1990; Kuziora andMcGinnis, 1989; Mann and Hogness, 1990).Our focus here is upon Ultrabithorax (Ubx), a homeotic

gene within the lowest tier of the genetic control hierarchythat directs early Drosophila development. Ubx operateswithin the segmented framework established in the embryoby earlier-acting maternal effect and segmentation genes; itsfunction is to specify the unique features of parasegments5 and 6 (PS5 and PS6), which together constitute acontiguous region including the posterior thorax and aportion of the first abdominal segment (reviewed in Beachy,1990). As a result of differential splicing, the Ubxtranscription unit produces a family of closely relatedproteins (Beachy et al., 1985; Komfeld et al., 1989;O'Connor et al., 1988) which appear to function in thecontrol of transcription (Johnson and Krasnow, 1990;Krasnow et al., 1989; Samson et al., 1989; Thali et al.,1988). Since each resultant protein product contains thehomeodomain, each is presumably capable of sequence-specific DNA binding; this property, however, has beendirectly studied for only one of the Ubx protein structures(Beachy et al., 1988).

Although DNA binding sites for a variety of homeodomainproteins have been identified, optimal sites have not beendefined; nor have the contributions of individual bases tothe affinity of binding been systematically examined. Suchinformation is crucial for understanding how homeodomainproteins discriminate between DNA binding sites, and inparticular, how homeodomain proteins might bindcooperatively or competitively to DNA in situations wherea number of different homeodomain regulatory proteins arepresent within a cell. Such information is also essential asa basis for interpreting three-dimensional structure dataconcerning the molecular mechanisms of DNA sequencerecognition.We have refined a method for determining the optimal

binding site of a DNA-binding protein; this method providesa statistical indication of the relative importance of individualbases within the binding site. We have applied this methodto a homeodomain peptide encoded by Ubx and haveconfirmed the validity of the statistical predictions via abiochemical analysis of several binding site sequencevariants. Our results indicate that the Ubx homeodomainoptimally recognizes a seven base pair sequence(5'-TTAATGG-3'), within which the central TAAT core

plays a major role in determining the affinity of binding.Bases flanking the TAAT core further contribute to overallaffinity and hence to sequence specificity. Since TAAT isa sequence common to many other homeodomain binding

1 79

Page 2: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

S.C.Ekker et al.

sites, interactions with flanking bases may provide the basisfor discrimination between binding sites by differenthomeodomain proteins.

ResultsSelection of binding site oligonucleotidesOur strategy to define the optimal Ubx binding site is anamplification of the procedure described by Oliphant et al.(1989) and is similar to a method recently reported(Blackwell and Weintraub, 1990). We used an affinity matrixcontaining immobilized Ubx homeodomain peptide to selectbinding site sequences from a population of 70 baseoligonucleotides with a random sequence core. As illustratedin Figure IA, the 12 bp random sequence core within the70mer was flanked by restriction sites for cloning and endsequences homologous to two 17 base primers which wereused either for amplification in the polymerase chain reaction(PCR) or to make the 70mer double-stranded. The generalscheme (Figure IB) was to select specifically boundsequences by loading double-stranded 70mers on to the Ubxhomeodomain affinity column at low salt concentration;tightly bound DNA was then isolated by washing the columnat intermediate salt and eluting at high salt concentrations.The design of our oligonucleotides permitted amplificationby PCR, thus facilitating handling and cloning of the smallDNA quantities that remain after multiple rounds ofselection.The 72 amino acid polypeptide used to construct the

affinity matrix was expressed in E.coli using pET3c, aplasmid expression vector containing a T7 promoter(Rosenberg et al., 1987). The DNA fragment to be insertedin pET3c was generated by a PCR with a Ubx cDNA

A

Primer A__EcoRI Xhol

(N)12

template. The primers for this PCR were designed such thatappropriate initiation and termination codons and restrictionsites for cloning into pET3c were added in a single operation(Figure 2A; see Materials and methods). The plasmid usedfor Ubx homeodomain expression (pUHD-72; Figure 2B)carried tandem repeats of the homeodomain coding fragment,both of which contained the 61 codons of the Ubxhomeodomain with a 10 codon C-terminal extension (aminoacid residues 295 - 365 in the UBX LI I open reading frame;Beachy, 1990; Kornfeld et al., 1989). We do not knowwhich of these repeats was expressed, although each openreading frame was initiated by a methionine codon andfollowed by an amber chain termination codon; we suspectthat the particularly high level of homeodomain accumulation

A ende!hTol;extended homology

to abd-A

b5 U b x _ 'bxCNA3'

Ubx homeo Ddomain

B

T CGA... x A A: .

Fig. 2. Construct for expression of the Ubx homeodomain peptide. (A)PCR primers and template for generating the DNA fragmentcontaining the Ubx homeodomain fragment. Primers C and D definethe extent of the fragment and provide initiation and terminationcodons as well as restriction sites for cloning. The fragment containsthe Ubx homeodomain plus a 10 codon C-terminal extensionhomologous to the abd-A homeotic gene (details in text and inMaterials and methods). (B) Structure of the Ubx homeodomainexpression plasmid. In the final construct, pUHD-72, the sequenceencoding amino acids 295 -365 in the UBX LII open reading frame(Beachy, 1990) is present in two tandem repeats. The repeats includethe Ubx homeodomain, a 10 codon C-terminal extension as well asinitiation (Met) and termination (AMB) codons for each repeat and aShine-Dalgarno (SD) sequence upstream of the first repeat. Details ofthe construction are given in Materials and methods.

.Prime?

_5 c:En n CnEc m E m

B

+ ;f s;n ,e -:

- -S 'rpIOP - r tc

: p 3t e :

Fig. 1. General strategy for determination of optimal binding sitesequences recognized by DNA-binding proteins. (A) Oligonucleotideused for binding site selection. The 70 base oligonucleotide contained a

12 base random core, (N)12. The two restriction sites used for cloningand the two primers used for PCR are schematically represented (seeMaterials and methods for complete oligonucleotide sequences). (B)General scheme for the selection and analysis of double-strandedoligonucleotides containing binding sites.

1180

X_~II

-ualI

2 3

_iiii,! ,,,

._

4 5 6

Fig. 3. Expression and purification of the Ubx homeodomain peptide.Protein samples from uninduced (U) and induced (I) cells containingthe expression plasmid are shown in lanes 2 and 3; the polypeptidecomposition from various stages of the purification is shown in lanes5-8 (CL, cleared lysate; BR, BioRex 70 pool; MS, Mono S pool;PS, Phenyl Superose pool; 5 jig protein are loaded in each lane).Markers (M) in lanes 1 and 4 are 43, 29, 18.4, 14.3 and 6.2 kd.Samples were electrophoresed on a 15% SDS-polyacrylamide gel andstained with Coomassie blue.

mow mm..

.. .;I r -,. t -

Page 3: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

Optimal DNA binding by the Ubx homeodomain

produced by this construct may have been due to expressionof both repeats. The C-terminal extension was includedbecause it encompasses conserved residues encoded by Ubx,the abd-A homeotic gene of Drosophila, and the leechhomeodomain gene Lox-2 (Wysocka-Diller et al., 1989;Karch et al., 1990).The identity of the extended Ubx homeodomain peptide

Table I. Amount of DNA retained by the affinity column afterwashing with the indicated NaCi concentration

Round Amount of DNA retained (as % input)

0.25 M Na+a 0.3 M Na+ 0.35 M Na+ 1.0 M Na+

1 5.4 _b 02 72.3 10.0 - 0

(141)c (19)3 - (30) (8.9) 0

aOther components of the wash buffer are given in Materials andmethods.bA dash indicates that the column was not washed at the indicatedconcentration during that selection round.cValues in parenthesis are absolute amounts of DNA (in ng).

A

was confirmed by its immunoreactivity with a homeodomain-specific monoclonal antibody (data not shown) and by theconcordance between predicted and observed mobilities inSDS-PAGE (expected Mr = 9169; see Figure 3A).Purification to homogeneity from E. coli extracts (details inFigure 3B and in Materials and methods) was achieved bya combination of conventional column chromatography (Bio-Rex 70) with FPLC (Mono S, Phenyl Superose). Thepurified peptide was immobilized by attachment to cyanogenbromide activated Sepharose and this affinity matrix was usedfor three rounds of selection designed to enrich foroligonucleotides containing Ubx homeodomain binding sitesequences (details in Materials and methods). The precisedegree of enrichment for specific sequences could not bedetermined because of differences in quantity of DNA loadedat each round and because the activity and specific bindingcapacity of the matrix-bound protein was unknown. Theoccurrence of enrichment between rounds one and two wassuggested by the increase in percentage input DNA retainedafter the 0.25 M wash (see Table I). Enrichment betweenrounds two and three was more difficult to assess becauseof amplification by PCR after the second round, whichgreatly magnified the quantity of specific DNA being loadedon to the column. The occurrence of enrichment at this stagewas suggested by the greater mass of DNA retained duringround 3 after the 0.3 M wash. Amplification was again usedafter the third round of selection to facilitate cloning. Thediffering sequences of all but two of the oligonucleotides (62and 63 in Figure 4A) indicate that the PCR amplificationitself did not detectably skew the population toward aparticular sequence.

8 -. -1 -1 T A -2-- - TI - Ials ll

45 A TA A TG G C T G T C46 T A A T G A G T C C47 T C A T G G T T T T A T48T T A A C G

A

C C T C A44: T T G A A G T A A T GSO A G G G A T A A T T G CII T T A T T G C C C A T G52 A A G G T G T C A G C.53 T T T A A C G A G G A A54 T A A T G G A C T C T A55 A T T A A T G G C A C5I G G G C A C T T A A T57 A A T G A T T C T-

C G G G A C T T A A C G -59 T T A A A T T GGC A

A T A A T T G A G G T GA A T A G G C T C A

2 AA A T G G G G T C GI G3 A A A T G G G G T C GG-64 T C G T T A A T G G A CI5 G T G A C C C T A A T T

T T T T A T G G G C T T:7 T A G T T GA^ G G G G G-6: G C A G C A T A A T C a

T A A T A G C T A A G T-70 A T C A A A A T A A TT-7I T A A T A G C T A A G T-72 T T G G T T A T A A G C73 T A A T G G C C T T C C74 G G T G T T G A T G C C75 C G A G C T T T T A TG7* G G T G A T A G A T T77 A T T G T T C T A G-7: A A T T G T T A C A T7§ G C C C T A A T T G C Te A A T G G C T A G T T:i T C C C A C T T A AT-:2 T G G G T T T A A T T83 A A T G G A T T A A C-a4 G G A C T A T C G T T85 A G Gl G A G T A A T A

a T T A C T T C TA A T G87 C G T T A A G G G C A G88 C A A G G C T T A A T G

A T Gn1 e,- 0 t Ia

8S5 0 8 13 9 6 1 1 9 6 40 3 3 23 17 1 14 5 32 3 41 41 17 11 4 7 15 71 82 21 2 8 20 22 1 11 4 1088a88 74 S9 57 54 48 41 30 24

LOD 1.7 0.5 1.5 3.4 1.7 0.9 5.9 43 38 54 49 3.3 15 2.1 1.7 3.1 0.7 2.8 12

Fig. 4. Tabulation and analysis of Ubx homeodomain binding sitesequences present within affinity-selected oligonucleotides. (A) Alignedoligonucleotide sequences. The nucleotide sequences ofoligonucleotides cloned after the third round of selection are shown as

aligned by the method described in the text. Only the randomsequence core portion of each oligonucleotide is shown. The (+) and(-) respectively indicate that the EcoRI cloning site is present to theleft or to the right of the sequence shown. (B) Statistical analysis ofbase usage by position At each position (numbered as in panel A),the number of occurrences of the four bases is shown along with an

estimate of the degree of skewing from random expectation. Thisestimate was derived as described in the text and is given as a

probability in the form of a LOD score (A LOD of 5 indicates a

probability of one in l05). Seven positions clustered in the center ofthe tabulation display LOD scores higher than those of the surroundingpositions; the sixth and seventh positions show strong secondarypreferences when the base of first preference is not present. Theconsensus binding site sequence thus derived is 5'-T-T-A-A-T-G/T-G/A-3'.

I

U..- --

--ia=

*U

a

i-~~~~

Fig. 5. DNase I protection by Ubx homeodomain of synthetic andnaturally occurring DNA sequences containing the consensus bindingsite. DNase I footprinting was carried out as described in Materialsand methods. Ubx homeodomain peptide concentrations are given inpM and protected sequences are indicated by the solids bars. Withinprotected sequences, the ATTA motif is underlined. (A) Footprint of a

single consensus binding site sequence. A 15 base protection iscentered over the consensus sequence present in oligonucleotide clone54 (Figure 4A). The protein concentration required for half-maximalprotection was between 1 and 3 x 10- 10 M (lanes 3 and 4). (B)Protection of a Drosophila genomic DNA containing multiple repeatsof the consensus binding site sequence. The protected region containsseven repeats of the near-perfect consensus sequence TAATGG. Theserepeats are located near the promoter of the decapentaplegic gene(nucleotides 987- 1028 according to the numbering of Padgett et al.,1987), a likely target for regulation by Ubx protein. The proteinconcentration required for half-maximal protection was - 8 x 10- M(lane 7).

1181

8 .. *5 '. -1 ' ' T A A T Qo^ ,@2

I G A C G G C T T A A T2 G A A T G G G T A A T A3 C G A C T A A T T G C A4 C T A A T T A T T A C T5 G C G G G A C T A A TI A T C A T G G C C C C7T A A T C C C A

T A A T T G G G T G G AC C C T A A T G G G C A

10 A G G T A G T A A T11 T A A A T A T T A A TG12 G C G T A C G T A G T13 C T T A T G A C C T C -14 T A A T T G G T T C T a

I5 G T A T A A T G G C A T11 G C G G A C T T A A T (G1 7 T A A T G G G T T G G G11 T A C T T A A T G a G1§T A A T G A C T T G G T-

20 T A C T C C T TAA T T21 T T T TA T G G T T T A22 A A T

a

C A C C G G-23 C T T A A T

a

a G C A C24 G T C A (IA T T TA T25 G G G G TAA TAA A G26 . A T T TA T GT T27 C G G T (IG T TAA T2a A C T A A T G TAA T2: A T G T T C A G T.3e c T G T T A A T G A C C31 G G C G G3 T T A A T32 TAA T T A C G TA G T-33 G T A C C G G T A A T T34 TAA T T G A TT CA A35 G T T A A T T A G G T G3: G A A C G G A TAA T37 C TAA T T A C CC C C3 A G G T C T G A T G A3: A G T G T C G TAA T40 G A T G G A T T A A T41 T T T A A T G G C T A T42 T A A T G G A C T G G T43 A C C T A A T G G C C C44 T A T G G G T T T A T

0 6 .S .4 .3 -t .1 T T A

A 4 9 9 7 12 9 9 3 74C 4 5 7 7 9 1 8 13 0 3a 12 8 14 20 18 16 8 0 3T 7 8 4 7 6 12 32 76 6

Total 27 30 34 41 45 55 62 79 811

Page 4: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

S.C.Ekker et al.

A

*_ _ Mt e

_ law ._ _ - F

2 3 4 :F0

.-

60 50 40 30 20 i3 i imr

_mw moo- -. -_. --_ -O U0 -B

0 _w _so we gm MP am.

l 3 7 9m 4- 4 5 6 7~ 8 9 1 0 1 1 12 13 1415F

iK.

Fig. 6. Determination of equilibrium binding coefficients. (A)Equilibrium DNA binding as a function of homeodomain peptideconcentration. DNA mobility shift assays were as described inMaterials and methods using the indicated protein concentrations and 5

pM end-labeled DNA (from oligonucleotide clone 54, which containsthe optimal binding site; see Figure 4A). B is bound, F is free DNA.(B) Quantitative analysis. Bands from (A) were excised, counted andthe fraction of bound DNA was plotted as a function of the log ofUbx homeodomain peptide concentration. Under conditions of proteinexcess the concentration required for half-maximal binding (73 pM)may be considered an estimate of the equilibrium binding coefficient(see Table II).

Determination of a consensus binding site sequenceSequences of cloned oligonucleotides were optimally alignedby inspection, a task greatly facilitated by the presence withinmost sequences of the tetramer 5'-TAAT-3' (Figure 4A).All sequences containing a single TAAT motif were usedto derive a preliminary consensus; this preliminary consensus

was used as a guide to align optimally the remainingsequences that contained more than one complete or an

incomplete TAAT motif (the final consensus was identicalto this preliminary consensus). In several instances the bestalignment for a particular oligonucleotide produced aconsensus region that contained one or more basescontributed by sequences flanking the random core of the70mer. This occurred most frequently in certain specificalignments (e.g. 10 instances where downstream flankingsequences supplied a G at positions 6 and 7, and 12 instanceswhere a T at position 1 was supplied by upstream flankingsequences). High frequencies of specific alignmentsinvolving flanking bases as part of the consensus region are

an inherent feature of this type of approach; this feature couldcreate distortions in the final consensus through a 'piggyback'effect where other adjacent bases within flanking sequences

are made to appear important for binding. The effect was

avoided by tabulating for statistical analysis only basesderived from within the random sequence region of the70mer. As an estimate for the significance of skewing from1 182

Fig. 7. Determination of dissociation rate constants. (A) Decay ofprotein-DNA complexes as a function of time. The stabilities ofprotein-DNA complexes were assayed by DNA mobility shift assays(see Materials and methods). Complexes pre-formed with labeled DNAwere incubated in the presence of excess unlabeled competitor DNA;times of incubation are indicated in min above lanes 5-15. End-labeled DNA was from oligonucleotide clone 54, which contains theoptimal binding site (see Figure 4A). Lane 1 shows the end-labeledDNA alone, lane 2 shows a binding reaction without addedcompetitor, lane 3 shows a binding reaction loaded immediately afteraddition of competitor (time 0), and lane 4 shows a reaction wherecompetitor was added before protein. B is bound, F is free DNA. (B)Quantitative analysis. Bands from (A) were excised, counted andln(fraction DNA bound) was plotted as a function of time. Thedissociation rate constant was determined from the formula: ln(fractionDNA bound)= -kdt. Only early time points were used to minimize theeffect of reassociation.

random expectation, the number of occurrences of the fourbases at each position within the tabulation was used tocalculate the x2 statistic and a corresponding probability(expressed as a LOD score; Figure 4B). Because of themethod of tabulation, the probability scores should not betaken as absolute measures of base preference but rather as

approximate indications of the degree of constraint.Of the 19 positions tabulated, a cluster of seven centrally

located positions displayed a significantly higher degree ofconstraint with respect to base preference (Figure 4B). Twoof these seven displayed strong secondary preferences whenthe base of first preference was not present. The consensussequence thus derived is 5 '-T-T-A-A-T-G/T-G/A-3'. Withinthis consensus, the central four positions (the TAAT core)appeared to be more highly constrained, with LOD scoresranging from 38 to 54 versus LOD scores from 5.9 to 15for the three outer bases. Specific binding of the Ubxextended homeodomain peptide to a DNA restrictionfragment containing the consensus sequence was

demonstrated by the DNase I protection experiment shownin Figure 5A, with a 15 base protected region centered over

the seven base consensus sequence. Under the conditions

B

M

Page 5: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

Optimal DNA binding by the Ubx homeodomain

Table H. Ubx homeodomain binding to variants of the optimal sequence

1 2 3 4 5 6 7 KDa AGa KD kdXlOOa t12a KdT T A A T G/T G/A (pM) (kcal) (rel) (min- ') (min) (rel)

Ab T T A A T G G 73, 67C -13.7 1.00 2.3 0.1, 2.2C 30.1 1.00B T T A A T T A 83 -13.6 1.19 3.1 0.1 22.4 1.35C T T A A T a G 89 -13.5 1.27 3.3 0.2 21.0 1.43D a T A A T T G 100 -13.4 1.43 4.2 0.3 16.5 1.82E T T t A T G G 360 -12.7 5.14 5.0 0.3 13.9 2.17F T T A A g G G 360 -12.7 5.14 5.9 0.1 11.7 2.57G a c A A g c A 23 000 -10.3 329 190 0.37 83

LODd 5.9 48 38 54 49 8.8 15

aEquilibrium dissociation constants (KD), dissociation rate constants (kd), free energies (AG), and half-lives of the complexes (0lI2) were determined asdescribed in Materials and methods and illustrated in Figures 6 and 7. The kd values (except for CA and G) are given as an average of twoindependent determinations 4 the standard error.bSequences A, C, D, E and F respectively, are DNAs 54, 69, 50, 66 and 87 as given in Figure 4A. Sequences B (TAATTAGGGGGC+) and G(GACAAGCACTC+) were derived from other experiments and are aligned for maximum fit to the consensus. The small letters indicate differencesfrom the optimal binding site sequence, with upper case letters indicating occurrence of a secondary preference.CKD and kD values were also measured for a second DNA containing the optimal binding site sequence (DNA 41 as given in Figure 4A). Anaverage KD value of 70 pM was used for A to calculate free energy and the relative KD.dLOD scores are from Figure 4B.

of this experiment, where the homeodomain peptide was inmolar excess relative to the DNA fragment, the proteinconcentration required to produce half-maximal protection(1-3x10-10 M) may be considered an estimate of theequilibrium dissociation coefficient (KD).

The dependence of binding affinity upon sequencecorresponds to the degree of constraint at a particularpositionTo quantify binding affinities more accurately, we used aDNA fragment mobility shift assay. As shown in Figure 6,the KD measured for the consensus sequence was 7 x 101-I1M. By similarly measuring the affinities ofDNA fragmentsfrom other oligonucleotide clones of known sequence, wewere able to correlate binding affinities with changes atparticular positions within the consensus binding site. Afurther indication of complex stability as a function ofDNAsequence was obtained by measurement of the dissociationrate (kd) for complexes between the Ubx extendedhomeodomain peptide and various oligonucleotides (Figure7). Unlike the footprinting and mobility shift equilibriumbinding experiments, which were all carried out atapproximately physiological ionic strengths, these kineticexperiments were carried out at reduced ionic strength toslow down dissociation rates and thus bring them into ameasurable range. (The half-life of the Ubx homeodomaincomplex with the consensus binding site was < 1 min atphysiological ionic strengths.) The results from theequilibrium and kinetic experiments are summarized in TableII.The principal conclusion from our binding measurements

is that alteration of the consensus binding site sequence ata given position affects overall affinity of binding (KD) or

complex stability (kd) to an extent that correlates well withthe degree of constraint observed in our statistical analysis(Table II). Thus, single substitutions at the highly constrainedpositions within the inner core (rows E and-F) have greaterimpact on overall affinity and complex stability than do single(row C) or double substitutions (rows B and D) at positionsoutside the core. Likewise, substitution at a more highlyconstrained position within the inner core (row F) has greaterimpact than substitution at a less constrained position within

the core (row E). We also find that effects of substitutionsat multiple positions appear to be cumulative, as suggestedby comparison of the overlapping single, double and multiplesubstitutions in rows C, D and G. One interesting exceptionto these general rules is illustrated in row B, where a doublesubstitution at positions 6 and 7 has less impact than othersingle or double substitutions at positions outside the innercore (rows C and D). Even this exception, however, isconsistent with our tabulation since the bases present at bothsubstituted positions are the secondary preferences indicatedby our statistical analysis (i.e. two favorable substitutionsmay have less impact than a single unfavorable substitution).Although we have not systematically tested all possiblecombinations of single, double and multiple substitutions,the agreement of these binding data with the expectationsfrom the tabulation leads us to conclude that our statisticalanalysis has good predictive value and that the consensussequence is the optimal binding site for the Ubxhomeodomain peptide.

Finally, by comparing binding of Ubx homeodomain tosequences in rows A and G, we can assess the degree ofdiscrimination between specific and non-specific bindingsites. For complex stability at lower ionic strength, thedifference was > 80-fold, while at the higher ionic strengthused for the equilibrium measurements, the differenceappears to be > 300-fold.

Multiple repeats of a near-optimal binding site atdecapentaplegic, a probable Ubx regulatory targetWe have identified a new genomic site, bound both by intactUbx protein and the Ubx homeodomain peptide, whichcontains multiple repeats of a sequence closer to the optimalbinding site than sequences at previously reported sites(Beachy et al., 1988; see Discussion). This site is locatednear a promoter in the short vein region of thedecapentaplegic (dpp) gene, which encodes a proteinhomologous to the vertebrate growth factor TGF-3 andfunctions in a variety of developmental processes in theDrosophila embryo and larva (Padgett et al., 1987; StJohnston et al., 1990). This site contains seven near perfecttandem repeats of the sequence TAATGG, which is a closerelative of the TTAATGG optimal binding sequence for the

1 183

Page 6: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

S.C.Ekker et al.

Ubx homeodomain. As judged from the higher apparent KD(Figure SB), the dpp site is bound by Ubx homeodomainpeptide with lower affinity than the optimal site, consistentwith the absence of the first base of the Ubx consensus fromthe repeated motif at dpp. In contrast, binding affinity forthe dpp motif is greater than that for previously defined sitescontaining multiple repeats of the more distantly related TAAand TAATCG motifs (D.von Kessler and P.A.Beachy,unpublished). While the functional significance of the dppsite has not been demonstrated, recent genetic studies indicatepositive regulation of dpp expression by Ubx in the visceralmesoderm, where Ubx is present in a sharply defined stripecorresponding to parasegment 7 (Immergluck et al., 1990;Reuter et al., 1990). The presence near one of the dpppromoters of a sequence tightly bound by Ubx proteinsuggests that this regulation may be direct and provides anexample of near-optimal Ubx binding sites occurring withinthe Drosophila genome.

DiscussionThe optimal binding site we have defined differs frompreviously identified binding sites for intact Ubx protein inseveral respects, the most striking of which are the smallsize of the sequence and of its DNase I footprint (7 bpsequence; 15 nucleotide DNase I footprint). The naturallyThe naturally occurring sites, in contrast, range in size from40 to 90 bp and consist primarily of multiple tandem repeatsof simple sequence elements such as the triplet TAA or thehexanucleotide TAATCG (Beachy et al., 1988). Underlyingthese apparent differences, however, are some fundamentalsimilarities. We know, for example, that the homeodomainpeptide binds the large naturally occurring sites with highaffinity in DNAse I protection experiments (K.E.Young andP.A.Beachy, unpublished data). In addition, a site containingfour TAA repeats is sufficient for specific binding by theintact UBX L 1I protein (Beachy et al., 1988), indicatingthat the large size of naturally occurring sites is not absolutelyessential. It is of interest that the most important positionsin the optimal site we report here correspond to a core TAATtetramer; this motif occurs in multiple copies at all of theknown naturally occurring sites due to the repeating natureof sequences at these sites. Indeed, U-A, one of the naturallyoccurring Ubx binding sites (Beachy et al., 1990; see below),is capable of simultaneously binding multiple Ubxhomeodomain peptides (S.C.Ekker and P.A.Beachy,unpublished data); this suggests that the large naturallyoccurring sites contain multiple core recognition sequences.The TAAT motif is also present in binding sites reported

for many other homeodomain proteins, although sequencesoutside the TAAT core vary (reviewed by Hayashi and Scott,1990). These binding site sequences include many whichhave been shown to mediate a functional response toDrosophila homeodomain proteins, either in vivo or in vitro(Thali et al., 1988; Muller et al., 1989; Krasnow et al.,1989; Winslow et al., 1989; Samson et al., 1989; Hanesand Brent, 1989; Biggin and Tjian, 1989; Johnson andKrasnow, 1990; D.von Kessler, S.C.Ekker and P.A.Beachy,unpublished data).Thus, the TAAT tetramer appears toconstitute a common feature recognized by manyhomeodomain proteins and this notion is strongly supportedby the crystallographically determined structure of anengrailed homeodomain-DNA complex (Kissinger et al.,

1990). This structure reveals major or minor groove contactswith each base of the TAAT motif. These base-specificcontacts involve the side chains of amino acid residues whichare conserved in engrailed, Ubx and indeed most otherDrosophila and vertebrate homeodomains: Arg3 and Arg5are part of an N-terminal arm and they contact the first twobase pairs of the core in the minor groove, while Asn5 1 andIle47 are part of the recognition helix and provide majorgroove contacts with the third and fourth base pairs of theTAAT core (Kissinger et al., 1990).

It has been suggested that residue 50 (residue nine withinthe recognition helix) plays an important role in modulatingthe specificity of DNA sequence recognition byhomeodomain proteins (Hanes and Brent, 1989; Treismanet al., 1989). More recent work suggests that the side chainof residue 50 contacts a base(s) to the 3' side of the TAATcore (Kissinger et al., 1990; Otting et al., 1990; Percival-Smith et al., 1990). Since residue 50 in the Ubxhomeodomain is glutamine, our tabulation suggests thatGln5O prefers to interact with one or both G-C base pairsfollowing the TAAT core. Alternative modes of interactionby Gln5O may be possible, as suggested by the existenceof secondary base preferences at these positions. Since theN-terminal and C-terminal ends of the engrailedhomeodomain lie close to the DNA in the crystal structureof the complex (Kissinger et al., 1990), adjacent regions ofhomeodomain proteins may also be involved in specifyingsequence binding preferences. Whether the 10 amino acidC-terminal extension in the Ubx homeodomain is involvedin base-specific contacts is not known, but it is interestingin this regard that differential splicing of the Ubx primarytranscript (Beachy et al., 1985; O'Connor et al., 1988;Kornfeld et al., 1989) yields structures with alternative aminoacid sequences adjacent to the N-terminus of thehomeodomain. We are using the method presented in thiswork to determine how N- and C-terminal residues in largerUbx peptides might be involved in specific binding.

In light of the importance of the TAAT core forhomeodomain binding, it is tempting to speculate that therepeated triplet or hexamer motifs found in many naturallyoccurring sequences may be capable of binding multiplehomeodomain proteins through recognition of the TAATmotif, while the exact affinity of a protein for a particularsite might be modulated by the identity of the bases occurringbetween TAAT motifs. The character of a protein complexassociated with a specific site would then depend upon theidentity and level of homeodomain proteins expressed withina particular cell and upon the bases between core TAATelements. From our Ubx homeodomain binding studies, thecontributions of bases between core TAAT elements sumto several-fold differences in complex stability or overallaffinity. If one accepts the possibility for cooperative bindingto many core sites arranged in tandem (throughmultimerization or other protein -protein interactions), therange of affinities of homeodomain proteins (alone or incombination) for these naturally occurring binding siteclusters could be quite large, thus providing a highly specificmechanism for differential gene regulation. In support ofthis idea, the 45 bp U-A site near the Ubx promoter,originally identified as a binding site for Ubx protein (Beachyet al., 1988), has been reported to bind homeodomainproteins encoded by even-skipped (Biggin and Tjian, 1989),abdominal-A (Samson et al., 1989), and can probably be

1184

Page 7: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

Optimal DNA binding by the Ubx homeodomain

bound by other Drosophila homeodomain proteins not yettested. Given a knowledge of the nucleotide sequence at U-A or at other sites it would be interesting to know whetherthe affinities of single proteins or combinations of proteinsare predictable from a knowledge of: (i) the optimal bindingsites of individual proteins and; (ii) the energies ofinteractions between these proteins.The method we present for determination of optimal

binding site sequences is rapid and sensitive. One indicationof its sensitivity is its success in correctly predicting therelative importance of bases which, when altered, affectcomplex stability by < 10% (Table II). This sensitivity stemsfrom the infinitely renewable population of oligonucleotides,which permit many rounds of selection and thus make itpossible to pinpoint preferences for specific bases that makeminor contributions to overall binding energies. This methodtherefore should prove useful for studying the bindingspecificities of groups of closely related DNA-bindingproteins (e.g. other homeodomains) as well as for predictingthe binding affinities of closely related sites for a particularprotein. The three rounds of selection used in this workproduced a fairly heterogeneous collection of oligonucleotidesequences which can be driven to a much higher degree ofuniformity by further selection (D.von Kessler andP.A.Beachy, preliminary results). An advantage of usingfewer selections, however, is the occurrence of intermediatedegrees of constraint which permit assignments of relativeimportance to bases at particular positions.The small size of the footprint and the lack of dyad

symmetry in the optimal binding site sequence suggest thatthe Ubx homeodomain peptide binds DNA as a monomer;this is consistent with the peptide's behavior as as a monomerin gel filtration (data not shown) and with the monomerbinding observed for the engrailed and Antennapediahomeodomains (Affolter et al., 1990; Kissinger et al., 1990;Otting et al., 1990). From our mobility shift experimentsthe equilibrium dissociation coefficient (KD) for Ubxhomeodomain at physiological ionic strength appears to be

- 7 x 10-11 M. Binding as a monomer is somewhat unusualfor proteins of the helix -turn -helix class and this affinityis surprisingly high given that prokaryotic helix-turn-helixproteins or peptides bind to half-operator sites only weaklyor not at all (see for example, Hollis et al., 1988). The valueis comparable, however, to values reported recently for theinteraction with a specific DNA site of the Antennapediahomeodomain (Affolter et al., 1990). The ability ofhomeodomain peptides to independently fold and bind DNAwith such high affinity and sequence discrimination suggeststhat they constitute functional, autonomously acting unitswhose study should yield genuine insights into the propertiesof the larger proteins in which they function.

Materials and methodsPCR, sequence determination and plasmidsAll PCR amplifications (Saiki et al., 1988) were carried out in 10 mMTris-HCI (pH 8.3), 50mM KCI, 1.5 mM MgCl2, 0.01% gelatin, 50 AMeach of dNTP, 0.5 AM each of DNA primer and 0.025 units/4l Taqpolymerase (Cetus). Temperature cycling was done with the PTC-100 (MJResearch) at maximum ramp speed. DNA sequencing was by the double-strand plasmid method of Hattori and Sakaki (1986), using a Sequenasekit (US Biochemical).The 70 base oligonucleotide synthesized for use in binding site selection

was 5'-GTAAAACGACGGCCAGTGAATTCAGATCT(N)12GGATC-CCTCGAGGTCGTGACTGGGAAAAC-3' where (N)12 indicates an equal

mixture of the four bases during synthesis at 12 consecutive positions. PrimersA (5'-GTAAAACGACGGCCAGT-3') and B (5'-GTTTTCCCAGTCA-CGAC-3') were used for amplification of the 70mer by cycling 20 timesat 94°C (30 s), 58°C (30 s) and 72°C (10 s). Cycling was preceded by3 min at 94°C and followed by 10 min at 72°C. Following selection forthe presence of binding site sequences (see below), double-stranded 70merwas digested with XhoI and EcoRI and ligated into similarly digestedBluescript (Stratagene). Resulting clones contained 1, 3, or 5 inserts andsequence determination utilized the Bluescript T3, T7, or M13 primers.

Primers C (5'-ACGGCATATGCGAAGACGCGGCCGA-3') and D(5'-GATTGGATCCTACTTCTCCTGTTCGTTCA-3') were used foramplification from a Ubx cDNA template (p3712; Beachy et al., 1985) togenerate the extended homeodomain fragment: 30 cycles at 94°C (1 min),62°C (45 s), and 72°C (2 min). Cycling was preceded by 9 min at 94°C.The resulting fragment was digested with NdeI and BamHI and ligated intosimilarly digested pET3c (Rosenberg et al., 1987). Plasmid pUHD-72 wasselected from among many candidate clones because of its particularly highlevels of Ubx homeodomain production. Restriction enzyme digestion andsequence analysis showed that pUHD-72 contained two intact copies of theextended homeodomain. We can not be certain about the genesis of thisplasmid, but its structure is suggestive of a ligation event involving productswhich suffered some exonuclease digestion. (The sequencing of the pUHD-72insert was done by recloning an XbaI-BamHI fragment into Bluescript andusing the T3 and T7 primers.)

Homeodomain purificationE.coli strain BL21(DE3) pLysS carrying pUHD-72 was grown in M9ZBat 37°C to OD6W = 0.7, then induced with IPTG as described (Rosenberget al., 1987). Growth continued for 2.5 h, followed by harvest of cells,lysis, clarification and removal of nucleic acids by polyethyleneimineprecipitation as described in Beachy et al. (1988). Following thepolyethyleneimine step, the homeodomain peptide was purified bychromatography with a BioRex 70 column and with a Pharmacia MonoS FPLC column (Figure 3, lanes 6 and 7) essentially as described in Mulleret al. (1988). Pooled fractions from the Mono-S column (-0.7 M NaCl,50 mM NaPO4 (pH 7.5), 1 mM dithiothreitol, 10% glycerol) were adjustedto 1.7 M (NH4)2SO4 and further fractionated by loading 1 mg portions onto a Pharmacia Phenyl Superose HR5/5 FPLC column. The column wasdeveloped with a 20 mi linear concentration gradient from 1.7-0.0 M(NH4)2SO4, with homogeneous Ubx homeodomain eluting at - 1 M(NH4)2SO4 (Figure 3, lane 8). Pooled fractions from the Phenyl Superosecolumn were dialyzed against storage buffer (0.6 M NaCI, 50 mM NaPO4(pH 7.5), 1 mM dithiothreitol, 10% glycerol), aliquotted at a peptideconcentration of 60zM and stored at -80°C. Concentration of purifiedhomeodomain peptide was determined using an e280 of 9600M- l cm-the yield was - 1 mg/I culture.

Selection of oligonucleotides containing Ubx binding sitesUbx homeodomain peptide was conjugated to cyanogen bromide activatedSepharose 4B (Pharmacia) in storage buffer and remaining active groupswere blocked by treatment with 1.0 M Tris-HCI (pH 8.3). Couplingefficiency was >97%, and the final concentration of Ubx homeodomainwas 100 Ag/ml gel. Double-stranded 32P-labeled 70mer (10 jg) wasprepared to a specific activity of 106 c.p.m./4g by annealing to three-foldmolar excess primer B and extending with the large fragment of DNApolymerase I in the presence of all four unlabeled dNTPs (500 AM each)and [a-32P]dATP (50 Al final reaction volume). Affinity chromatographywas carried out at -22 'C with 1 ml of Ubx - Sepharose resin in a column0.7 cm in diameter and - 2.5 cm in height. All loadings of double-stranded70mer were in 0.5 ml of buffer C (50 mM Tris-HCI pH 8.0, 1mMdithiothreitol, 10 ,g/ml gelatin) supplemented with 0.1 M NaCl, 5 itg polyd(I-C), and 10 ug E.coli tRNA. After loading 7.4ytg of double-stranded70mer, round 1 of selection proceeded with successive washes of the columnwith 3 ml buffer C containing 0.1 M, 0.25 M, 0.4 M and 1.0 M NaCI.Fractions from the 0.4 M and 1.0 M washes were pooled and of the 400ng in these fractions (5.4% of input, see Table I), 195 ng were recoveredafter phenol extraction and ethanol precipitation. This material was loadedand round 2 of selection proceeded with successive washes with 3 ml bufferC containing 0.1 M, 0.25 M, 0.3 M and 1.0 M NaCI. Fractions from the1.0 M wash were pooled, concentrated and amplified by PCR (see above).Approximately 2 Ag of the amplified DNA was used for labeling accordingto the protocol for labeling 70mer described above except that primers Aand B were both used. This material was loaded and round three of selectionproceeded with successive washes with 3 ml buffer C containing 0.1 M,0.3 M, 0.35 M and 1.0 M NaCl. Fractions from the 1.0 M wash werepooled, concentrated by ethanol precipitation, amplified by PCR, digestedwith EcoRl and XhoI and cloned into a Bluescript vector (see above).

1185

Page 8: The EMBO Optimal recognition by the Ultrabithorax ...biology.hunter.cuny.edu/molecularbio/Class%20... · The EMBOJournal vol.10 no.5 pp.1 179-1186, 1991 Optimal DNAsequence recognition

S.C.Ekker et al.

DNase I footprintingThe DNA used for Figure 5A was a 260 bp NotI-AseI fragment from aBluescript clone of an oligonucleotide containing the optimal binding sitesequence (Figure 4A, sequence 54; Table II,line A). This plasmid wasdigested with NotI, treated with the large fragment of DNA polymeraseI in the presence of [a-32P]dGTP and -dCTP, digested with AseI andpurified by agarose gel electrophoresis (Sambrook et al., 1989). The DNAused in FigureSB (supplied by R.Padgett and W.Gelbart) is a - 330 bpfragment from the short vein region of the decapentaplegic gene (St Johnstonet al., 1990) that extends from nucleotide 896 to - 1230 in the numberingof Padgett et al. (1987). This fragment was 3' end-labeled with[c_-32P]dATP at an EcoRI site in flanking vector sequence and purifiedfollowing BamHI (also present in flanking vector sequence) digestion andagarose gel electrophoresis.

Binding of homeodomain peptide to labeled DNA fragments (at aconcentration of - 18 pM) proceeded for 20 min at22°C in 300 11 bindingbuffer (10 mM HEPES (pH 7.0), 150 mM potassium acetate, 2 mMMgCl2, 0.1 mM EDTA and 0.1 mM dithiothreitol). To each reaction wasadded 40 /l of a solution containing 21 mM CaC12 and 37 mM MgCl2 andthis was followed immediately by addition of 4 ng pancreatic DNaseI. DNaseI digestion was stopped after 30 s at22°C by addition of 601tl of stop solution(1.5 M NaCl, 250mM Tris-HCl pH 8.0, 100mM EDTA, 5% Na-lauroylsarcosine and 175 tsg/ml E.coli tRNA). After extraction with 4001lphenol-chloroform (1:1), nucleic acids were concentrated by ethanolprecipitation and electrophoresed through 0.4 mm 8% polyacrylamidesequencing gels (Sambrook et al., 1989).

Equilibrium and kinetic DNA binding measurementsDNAs used for mobility shift assays were 34 bp EcoRI-XhoI fragmentsfrom selected single insert Bluescript clones of 70mer oligonucleotides (seeabove). Fragments were separated in 10% polyacrylamide gels and ethanolprecipitated following elution from gel slices (Sambrook et al., 1989).Fragments were then 3' end-labeled with [a-32P]dATP and -dTTP byincubation with the large fragment of DNA polymerase I (Sambrook et al.,1989). Eachfragment was purified with a NACS column (Bethesda ResearchLabs), ethanol precipitated, extracted with phenol -chloroform (1:1), andfurther purified with a NICK column (Pharmacia). The column purificationswere essential for reproducibility of binding measurements. All bindingexperiments were carried out with freshly-thawed protein aliquots.

Binding reactions for equilibrium experiments proceeded for 20 min at22°C in 50yl footprinting binding buffer (see above) modified to also contain50,ug/ml bovine serum albumin (Miles, Pentex Fraction V) and 10%glycerol. Reactions contained the specified amounts of homeodomain proteinand end-labeled DNA fragments at concentrations <5 pM.

Binding reactions for kinetic experiments contained Ubx homeodomainand an end-labeled DNA fragment at10 nM and -0.5 nM concentrationsrespectively, all in low salt binding buffer (20 mM Tris-HCI pH 7.6, 75mM KCI, 1 mM dithiothreitol, 50 pg/ml BSA, 10% glycerol). Followingbinding for a minimum of 30 min at 22°C, 20 Al aliquots were removedand mixed with 2 11 of a solution containing annealed oligonucleotides E(5'-TAATGGTAATGGTAATGGTAATGGTAATGGTAATGG-3') and F(5'-AGACAGCCATTACCATTACCATTACCATTACCATTACCAT-TA-3') at a concentration of 330 nM duplex DNA (final concentration, 30nM). Reactions were incubated with competitor at 220C, for the timesindicated, before loading on gels.Samples from equilibrium and kinetic experiments were electrophoresed

at room temperature for 1 hat 400 Von 7.5% polyacrylamide gels (30:0.8,acrylamide to bis-acrylamide) containing 0.5 xTBE and 3% glycerol. Driedgels were subjected to autoradiography at - 80°C with intensifying screens.Bands corresponding to free and bound DNA were excised, placed directlyinto Safety SolveTM (Research Products International) liquid scintillationcocktail and counted on a Beckman LS 6800 scintillation counter. Freeenergies were calculated from the formula AG = - Rl7n(Keq), where Ris the universal gas constant and T is absolute temperature. t112 wascalculated as the time required for half the complexes to dissociate: t112= -ln(O.S)lkd.

AcknowledgementsThe authors wish to thank J.Corden, J.Nathans, C.Pabo and C.Wolbergerfor helpful suggestions at various stages during this work, R.Padgett andW.Gelbart for supplying dpp DNA, R.Weinzierl and R.White for theirgenerous gift of anti-homeodomain antibody and A.Collector andC.Wendling for oligonucleotide synthesis. S.C.E. is a predoctoral fellowof the March of Dimes Birth Defects Foundation and P.A.B. is aNeuroscience Fellow of the Alfred P.Sloan Foundation.

ReferencesAffolter,M., Percival-Smith,A., Muller,M., Leupin,W. and Gehring,W.J.

(1990) Proc. Natl. Acad. Sci. USA, 87, 4093-4097.Beachy,P.A. (1990) Trends Genet., 6, 46-51.Beachy,P.A., Helfand,S.L. and Hogness,D.S. (1985) Nature, 313,

545 -551.Beachy,P.A., Krasnow,M.A., Gavis,E.R. and Hogness,D.S. (1988) Cell,

55, 1069-1081.Biggin,M.D. and Tjian,R. (1989) Cell, 58, 433-440.Blackwell,T. K. and Weintraub,H. (1990) Science, 250, 1104-1110.Gehring,W.J. (1987) Science, 236, 1245-1252.Gibson,G., Schier,A., Lemotte,P. and Gehring,W. J. (1990) Cell, 62,

1087-1103.Hanes,S.D. and Brent,R. (1989) Cell, 57, 1275-1283.Hattori,M. and Sakaki,Y. (1986) Anal. Biochem., 152, 232-238.Hayashi,S. and Scott,M.P. (1990) Cell, 63, 883-894.Hollis,M., Valenzuela,D., Pioli,D., Wharton,R. and Ptashne,M. (1988)

Proc. Natl. Acad. Sci. USA, 85, 5834-5838.Immergluck,K., Lawrence,P.A. and Bienz,M. (1990) Cell, 62, 261-268.Johnson,F.B. and Krasnow,M.A. (1990) Genes Dev., 4, 1044-1052.Karch,F.,Bender,W. and Weiffenbach B. (1990) Genes Dev., 4,

1573-1587.Kissinger,C.R., Liu,B., Martin-Blanco,E., Kornberg,T.B. and Pabo,C.O.

(1990) Cell, 63, 579-590.Kornfeld,K., Saint,R.B., Beachy,P.A., Harte,P.J., Peattie,D.A. and

Hogness,D.S. (1989) Genes Dev., 3, 243-258.Krasnow,M.A., Saffman,E.E., Kornfeld,K. and Hogness,D.S. (1989) Cell,

57, 1031-1043.Kuziora,M.A. and McGinnis,W. (1989) Cell, 59, 563-571.Mann,R.S. and Hogness,D.S. (1990) Cell, 60, 597-610.Mihara,H. and Kaiser,E.T. (1988) Science, 242, 925-927.Muller,M., Affolter,M., Leupin,W., Otting,G., Wutrich,K. and

Gehring,W.J. (1988) EMBO J., 7, 4299-4304.Muller,M.,Thuringer,F.,Biggin,M.,Zust, B. and Bienz, M. (1989) EMBO

J., 8, 4143-4151.O'Connor,M.B., Binari,R., Perkins,L.A. and Bender,W. (1988) EMBO

J., 7, 435-445.Oliphant,A.R., Brandl,C.J. and Struhl,K. (1989) Mol. Cell. Biol., 9,

2944-2949.Otting,G., Qian,Y.Q., Billeter,M., Muiller,M., Affolter,M., Gehring,W.

J. and Wuthrich,K. (1990) EMBO J., 9, 3085-3092.Padgett,R.W., St. Johnston,R.D. and Gelbart,W.M. (1987) Nature, 325,

81-84.Percival-Smith,A., Muller,M., Affolter,M. and Gehring,W. J. (1990)EMBOJ., 9, 3967-3974.

Qian,Y.Q., Billeter,M., Otting,M., Muller,M., Gehring,W.J. andWuthrich,K. (1989) Cell, 59, 573-580.

Reuter,R.,Panganiban,G.E.F.,Hoffmann,F.M. and Scott,M.P. (1990)Development, 110, 1031-1040.

Rosenberg,A.H., Lade,B.N., Chui,D.-s., Lin,S.-W., Dunn,J.J. and

Studier,F.W. (1987) Gene, 56, 125-135.Saiki,R.K., Gelfand,D.H., Stoffel,S., Scharf,S.J., Higuchi,R., Horn,G.T.,

Mullis,K.B. and Erlich,H.A. (1988) Science, 239, 487-491.St Johnston,R.D., Hoffmann,F.M., Blackman,R.K., Segal,D., Grimaila,R.,

Padgett,R.W., Irick,H.A. and Gelbart,W.M. (1990) Genes Dev., 4,1114-1127.

Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Cold Spring Harbor

Laboratory Press, Cold Spring Harbor, New York.

Samson,M.-L., Jackson-Grusby,L. and Brent,R. (1989) Cell, 57,1045-1052.

Scott,M.P., Tamkun,J.W. and Hartzell,G.W. (1989) Biochim. Biophys.Acta, 989, 25-48.

Thali,M., Muller,M.M., DeLorenzi,M., Mattias,P. and Bienz,M. (1988)

Nature, 336, 598-601.Treisman,J., Gonczy,P., Vashishtha,M., Harris,E. and Desplan,C. (1989)

Cell, 59, 553-562.Winslow,G.M.,Hayashi,S.,Krasnow,M.,Hogness,D.S. and Scott,M.P.

(1989) Cell, 57, 1017-1030.Wysocka-Diller,J.W., Aisemberg,G.O., Baumgarten,M., Levine,M. and

Macagno,E.R. (1989) Nature, 341, 760-763.

Received on January 3, 1991; revised on February 7,1991

1186