

Structural analysis of the core COMPASS family of histone H3K4 methylases from yeast to human

Yoh-hei Takahashi a,1, Gerwin H. Westfield b,c,1, Austin N. Oleskie b, Raymond C. Trievel c, Ali Shilatifard a,2, and Georgios Skiniotis b,c,2

a Stowers Institute for Medical Research, Kansas City, MO 64110; b Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109; and c Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109

Edited by* Roger D. Kornberg, Stanford University School of Medicine, Stanford, CA, and approved October 14, 2011 (received for review June 10, 2011)

Histone H3 lysine 4 (H3K4) methylation is catalyzed by the highly evolutionarily conserved multiprotein complex known as Set1/COMPASS or MLL/COMPASS-like complexes from yeast to human, respectively. Here we have reconstituted fully functional yeast Set1/COMPASS and human MLL/COMPASS-like complex in vitro and have identified the minimum subunit composition required for histone H3K4 methylation. These subunits include the methyltransferase C-terminal SET domain of Set1/MLL, Cps60/Ash2L, Cps50/RbBP5, Cps30/WDR5, and Cps25/Dpy30, which are all common components of the COMPASS family from yeast to human. Three-dimensional (3D) cryo-EM reconstructions of the core yeast complex, combined with immunolabeling and two-dimensional (2D) EM analysis of the individual subcomplexes, reveal a Y-shaped architecture with Cps50 and Cps30 localizing on the top two adjacent lobes and Cps60-Cps25 forming the base at the bottom. EM analysis of the human complex reveals a striking similarity to its yeast counterpart, suggesting a common subunit organization. The SET domain of Set1 is located at the juncture of Cps50, Cps30, and the Cps60-Cps25 module, lining the walls of a central channel that may act as the platform for catalysis and regulative processing of various degrees of H3K4 methylation. This structural arrangement suggested that COMPASS family members function as exo-methylases, which we have confirmed by in vitro and in vivo studies.

Methylation of histone lysines, including H3K4, H3K9, H3K27, H3K36, H3K79, and H4K20, plays a crucial role in the regulation of key biological processes, such as cell cycle progression, transcription, and DNA repair (1–3). Except for H3K79 methylation, which is catalyzed by the Dot1 family proteins, all other histone lysine methylations are carried out by the SET [Su(var), Enhancer of Zeste, and Trithorax] domain-containing enzymes. While the majority of the SET domains can act as histone methyltransferases in an isolated form, the MLL/Set1 family methyltransferases and Polycomb group proteins, which respectively catalyze histone H3K4 and H3K27 methylation, must assemble within their respective complexes for maximal catalytic and biological activities (2).

The founding member of the MLL/Set1 protein family, Set1, forms a multiprotein complex named COMPASS (COMplex of Proteins ASsociated with Set1) in Saccharomyces cerevisiae (4). Set1/COMPASS was the first identified histone H3K4 methylase capable of mono-, di-, and trimethylating H3K4, and several of the COMPASS subunits are required for proper methylation (4–8). In addition to the evolutionarily conserved SET domain located at the C terminus of Set1, most associating subunits are also conserved from yeast to human, forming Set1/COMPASS and MLL/COMPASS-like complexes (9, 10).

Histone H3K4 methylation on chromatin, in particular di- and trimethylation, correlates with actively transcribed genes (9, 10). Previous work by us and others has shown that H3K4 methylation is subject to multiple layers of regulation that are highly conserved from yeast to human (9). Different subunits of yeast and human COMPASS have also been shown to regulate H3K4 di- and/or trimethylation as well as Set1 stability. These subunits include Cps60 (Ash2L in human), Cps50 (RbBP5), Cps40 (Cfp1), Cps30 (WDR5), and Cps25 (Dpy30).

The recently determined crystal structure of the MLL1/SET domain in complex with histone H3 peptide substrate (11) revealed that the active site was relatively solvent exposed due to an outward shifted inserted SET motif (iSET). The isolated MLL1/SET domain, free of any associated components of the natural MLL1 complex, exhibits a weak in vitro histone methyltransferase activity (11). However, the addition of residual components of the MLL1 complex, namely WDR5, RbBP5, Ash2L, and Dpy30, greatly stimulated catalytic activity. This observation led to a model in which the association of the MLL1 complex subunits induces reorientation of iSET to the optimal conformation of its catalytic active site (11). Given the highly conserved C-terminal catalytic SET domain between yeast Set1 and human MLL1, and the common subunit composition of the COMPASS family, the above model might be applicable to yeast COMPASS as well. However, until now there has been no structural information regarding the architectural arrangement of any COMPASS or COMPASS-like complex.

Here we present the identification, reconstitution, as well as the structural and biochemical characterization of a minimal core COMPASS for both yeast and human complexes capable of high levels of H3K4 methylase activity towards histone H3.

Results

Reconstitution of Core COMPASS. To obtain stable COMPASS for biochemical characterization, we reconstituted the yeast and human complexes through baculovirus-mediated cotransfection and overexpression of different subunit combinations in insect cells. Based on affinity purification of the recombinant Set1 and Cps proteins, we obtained nearly pure preparations of reconstituted COMPASS complexes (SI Appendix, Figs. S1–S4) that are enzymatically active in in vitro H3K4 methylation assays with full-length histone H3 (Fig. 1A). This approach enabled us to prepare any combination of wild-type (WT) or modified COMPASS components, including truncated forms of Set1, for studies that have been previously hampered by the absence of subunits such as Cps50 or Cps30 (8). Accordingly, COMPASS containing full-length Set1, truncated Set1 (762–1,080), Set1 (780–1,080), Set1 (938–1,080), or without Set1 were successfully prepared (SI Appendix, Figs. S1–S2). Curiously, full-length Set1 purified from yeast COMPASS consistently runs on SDS/PAGE as a doublet, although the reasons behind this property of Set1 or its biological significance remain unclear.

Author contributions: Y.-h.T., G.H.W., R.C.T., A.S., and G.S. designed research; Y.-h.T., G.H.W., and A.N.O. performed research; Y.-h.T., G.H.W., A.N.O., R.C.T., A.S., and G.S. analyzed data; and Y.-h.T., G.H.W., A.S., and G.S. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Freely available online through the PNAS open access option.

1 Y.-h.T. and G.H.W. contributed equally to this work.

2 To whom correspondence may be addressed. E-mail: [email protected] or skinioti@umich.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1109360108/-/DCSupplemental.

20526–20531 ∣ PNAS ∣ December 20, 2011 ∣ vol. 108 ∣ no. 51 www.pnas.org/cgi/doi/10.1073/pnas.1109360108

As a first step, we investigated the effects of specific Cps subunits on in vitro H3K4 methylation by preparing full-length Set1 (1–1,080) with various Cps combinations (Fig. 1A). Each introduced component was purified through FLAG affinity purification as shown by anti-FLAG or anti-Set1 Western blots (Fig. 1A, top and middle boxes, respectively). Consistent with our previous reports of in vivo and in vitro H3K4 methylation using cps60Δ and cps25Δ strains (8), the absence of Cps60 and Cps25 significantly impaired the catalytic activity of Set1 with the total loss of H3K4 trimethylation, and more than an 80% reduction of H3K4 mono- and dimethylation (Fig. 1A, lane 7). In contrast to earlier in vivo studies (8, 12–14), almost no monomethylation or very marginal di- and trimethylation changes of H3K4 were observed in the absence of Cps40 and Cps35 under conditions where Set1 levels are not a limiting reagent in our in vitro experiments (Fig. 1A). This study suggests that full-length Set1 stability in vivo is reduced in either cps40Δ or cps35Δ strains, compared to the strain expressing all five subunits. To this end, we found that Cps40 coeluted with other COMPASS components in the presence of the N-terminally extended region of Set1, including full-length Set1, Set1 (762–1,080), Set1 (780–1,080), but not Set1 (938–1,080) (SI Appendix, Fig. S2). Cps35 was also shown to require a region beyond 762–1,080 of Set1, because it did not coelute in the macromolecular fractions with any other truncated Set1 variants (SI Appendix, Fig. S2).

Next, we evaluated the intrinsic enzymatic activity of soluble Set1 protein in isolation. We did not detect any H3K4 methylation by Set1 alone or in the presence of putative Set1-stabilizing components Cps50 and Cps30 (Fig. 1A, lanes 2 and 3), thereby further confirming that Cps60 and Cps25 are required for H3K4 methylation activity by Set1. In the reconstituted system, Set1 (938–1,080) was sufficient in replacing the full-length protein in terms of stability and enzymatic activity (Fig. 1 A and B), except in the absence of Cps60 and Cps25 (Fig. 1A, lane 7 and B, lane 7). Whereas full-length Set1-containing COMPASS supported low levels of residual in vitro H3K4 mono- and dimethylation in the absence of Cps60 and Cps25, this activity was completely abolished in Set1 (938–1,080)-containing COMPASS.

Recent studies with H3 peptide substrates have suggested that WDR5, RbBP5, Ash2L, and Dpy30 (also known as the WRAD complex), the human homologs of Cps30, Cps50, Cps60, and Cps25, respectively, display histone H3K4 methyltransferase activity independent of the SET domain of MLL1 (15–17). Despite the highly conserved components between the yeast COMPASS and the human MLL/COMPASS-like complexes, we could not detect any H3K4 methylation signals in a reconstituted Cps30, Cps60, Cps50, and Cps25 complex in the absence of Set1 (Fig. 1 A and B; see lane 10 in both panels). This observation is consistent with our previous in vivo studies where we were unable to detect any histone H3K4 methylation in strains with SET1 deletions (8), suggesting that Set1 is the only histone H3K4 methylase in yeast, and that the yeast WRAD complex does not demonstrate substantial HMTase activity associated with it in vivo or in vitro. Indeed, our enzymatic analyses (SI Appendix, Fig. S5) demonstrated the presence of less than 1% of H3K4me1 and virtually undetectable levels of H3K4me2-3 when using WRAD as compared to wild-type Set1/COMPASS or MLL/COMPASS-like complexes. Given the extremely weak catalytic rates reported for the reconstituted WRAD complex as a SET-domain independent histone H3K4 methylase (15–17), the biological relevance of such activity needs to be further demonstrated.

Fig. 1. Recombinant Set1/COMPASS complexes and in vitro histone methyltransferase activities. (A) Purifications of recombinant COMPASS consisting of full-length Set1 with various combinations of Cps60, Cps50, Cps40, Cps35, Cps30, and Cps25. Subunit composition of purified COMPASS was confirmed by Western blotting using anti-FLAG (top box) and anti-Set1 (middle box) antibodies. In vitro H3K4 methyltransferase activities toward free histone H3 of various recombinant COMPASS complexes were examined by Western blotting using anti-H3K4me1, me2, and me3 antibodies (bottom box). (B) Recombinant COMPASS complexes were prepared using Set1 (938–1,080) instead of full-length Set1. COMPASS composition and in vitro H3K4 methyltransferase activities were analyzed in the same way as (A).

Insect cell-expressed core COMPASS also enabled us to investigate the regulation of histone H3K4 methylation by Cps50 and Cps30. Systematic analysis for in vitro H3K4 methyltransferase activity (SI Appendix, Fig. S3) shows that core COMPASS is rather insensitive to the lack of Cps30 with comparable H3K4 mono- and dimethylation, but partially defective trimethylation activity (SI Appendix, Fig. S3D, lanes 2 and 10). This finding suggests that COMPASS is enzymatically active in the absence of Cps30, but the processivity of the SET domain from H3K4 di- to trimethylation is regulated by Cps30. In contrast, reconstituted core COMPASS lacking either Set1 (938–1,080), Cps60, Cps50, or Cps25 completely lacked enzymatic activities (SI Appendix, Fig. S3D, lanes 1, 6, 11, and 12; note that the low levels of Set1 in lane 6 may also contribute to the absence of activity observed in that condition). Thus, we have identified Set1 (938–1,080), Cps60, Cps50, Cps30, and Cps25 as the necessary core COMPASS components for high levels of in vitro H3K4 mono-, di- and trimethylation using histone H3 as a substrate. Both Cps60 and Cps25 are essential for in vitro enzymatic activity regardless of the combination of other core COMPASS subunits (Fig. 1A lanes 3, 4, and 7, SI Appendix, Fig. S3D, lanes 3–8, 11–14). It is noted that the Cps60-Cps25 subcomplex (6) coeluted with Set1 (938–1,080) upon size exclusion chromatography when reconstituted together (fractions 21 to 23; SI Appendix, Fig. S4B). This result demonstrates a direct interaction between the SET domain and the Cps60-Cps25 module. In contrast, in the absence of Cps60 and Cps25, soluble Set1 (938–1,080) elutes broadly on the same size exclusion chromatography system (fractions 13 to 29; SI Appendix, Fig. S4A). Given that coexpression of neither Cps50 nor Cps30 had any effects on Set1 (938–1,080) elution profile (SI Appendix, Fig. S4C), Cps60 and Cps25 may play an additional role in the correct assembly of COMPASS through interactions with Set1.
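The subunit requirements reported above can be restated as a small lookup. The sketch below is only a qualitative summary of the in vitro results described for Set1 (938–1,080)-containing complexes; the function name `predicted_activity` and the labels "none", "partial", and "high" are illustrative assumptions, not quantities measured in the study.

```python
# Qualitative restatement of the in vitro results reported for reconstituted
# core COMPASS containing Set1 (938-1,080). Names and labels are illustrative.
ESSENTIAL = {"Set1", "Cps60", "Cps50", "Cps25"}  # loss of any one abolishes activity
CORE = ESSENTIAL | {"Cps30"}


def predicted_activity(subunits):
    """Return qualitative H3K4 me1/me2/me3 levels for a given subunit set."""
    present = set(subunits)
    if not ESSENTIAL <= present:
        # Missing Set1, Cps60, Cps50, or Cps25: no detectable activity.
        return {"me1": "none", "me2": "none", "me3": "none"}
    if "Cps30" not in present:
        # Without Cps30: me1/me2 comparable to core, me3 partially defective.
        return {"me1": "high", "me2": "high", "me3": "partial"}
    return {"me1": "high", "me2": "high", "me3": "high"}


if __name__ == "__main__":
    for missing in sorted(CORE):
        print(f"-{missing}:", predicted_activity(CORE - {missing}))
```

The lookup deliberately ignores the residual mono- and dimethylation seen with full-length Set1 in the absence of Cps60 and Cps25, since that behavior was reported only for the full-length protein.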

Electron Microscopic Mapping of Core COMPASS. To gain insight into the architecture of yeast core COMPASS, we employed electron microscopy to visualize preparations consisting of purified Set1 (938–1,080), Cps60, Cps50, Cps30, and Cps25 (Fig. 2A, fraction 18; SI Appendix, Fig. S1B). Raw images of negative stained specimen revealed a monodisperse population of complexes with similar sizes (SI Appendix, Fig. S6A). To analyze these particles, 35,527 projections were interactively selected and grouped into 100 classes by reference-free alignment and classification (SI Appendix, Fig. S7A). The two-dimensional (2D) class averages revealed flexible conformers of two major populations of particle projections: one displays a Y-shaped particle with a triangular base connecting through an extension to two adjacent oval lobes, one of which is in closer proximity to the base (Fig. 2B, right; SI Appendix, Fig. S7A); the second population reveals a similar triangular base and an arm extension connecting to a single circular doughnut-like domain with a distinct stain accumulation region in the center (Fig. 2B, left). The doughnut-shaped domain appears identical to the projection profile of a WD40 domain (inset in Fig. 2B), which defines both Cps50 and Cps30, further suggesting that a subpopulation of the reconstituted COMPASS complexes might be missing one of the two WD40 domain-containing subunits.

Fig. 2. EM mapping of core COMPASS. (A) Purification of Core Set1/COMPASS by size exclusion chromatography and analysis of the active fractions by HMTase assay. Core Set1/COMPASS consisting of Cps25, Cps60, Cps50, Set1 (938–1,080) was reconstituted and fractionated by size exclusion chromatography over a Superose-6 PC 3.2/30 (GE Healthcare). The H3K4 methylase activities of the resulting fractions were tested as shown. (B) Representative 2D averages of negative stained core COMPASS (fraction 18 in A) including Cps25, Cps60, Cps50, Set1 (938–1,080), which is denoted as SET938 for simplicity in Figs. 3 and 4, and Cps30. The inset shows the 2D projection profile of a WD40 domain for comparison (Scale bar, 10 nm). (C) Representative 2D averages of the Cps25-Cps60 assembly. (D) Representative 2D averages of negative stained Cps25-Cps60-Cps50-SET938 complex. (E) 2D average of Cps25-Cps60-Cps50-SET938 complex labeled with a Fab fragment against SET938. (F) Representative 2D class averages of negative stained human core MLL1/COMPASS-like complex. Schematic representations: Cps25/DPY30-Cps60/Ash2L (gray), SET938/MLL1 (blue), Cps50/RbBP5 (orange), and Cps30/WDR5 (green).

In order to identify each domain within the core COMPASS, we visualized individual subcomplexes by negative stain EM. We first examined the structure of the Cps60-Cps25 subcomplex, where classification and averaging (Fig. 2C; SI Appendix, Fig. S8 A and C) revealed a triangular structure with similar dimensions and shape to the triangular base found in the images of core COMPASS (Fig. 2B). This result suggests that Cps60 and Cps25 form the base of core COMPASS. We next analyzed EM images of reconstituted core COMPASS lacking the Cps30 subunit (core COMPASS/-Cps30) (SI Appendix, Fig. S8 B and D). 2D averages from this analysis show the well defined triangular base, with an arm extension and an additional oval lobe near the top of the complex (Fig. 2D). Missing, however, is the second, more distant oval lobe that is present in core COMPASS. This observation indicates that the Cps30 subunit is the more terminal of the two lobes away from the triangular base, while the lobe closer to the middle of the complex belongs to the WD40 domain of Cps50. In addition, a few averages reveal a small density extending from the connection between the Cps60-Cps25 base and the Cps50 domain, which might be attributed to the SET domain (Fig. 2D, right box). To test this hypothesis, we incubated core COMPASS/-Cps30 with a Fab fragment directed against the SET domain and analyzed the complexes by single-particle EM. 2D projection averages of Fab-labeled complexes clearly reveal the extra Fab density at the level of the region connecting the Cps60-Cps25 base and Cps50 (Fig. 2E). Next, we examined the architecture of the reconstituted human core COMPASS consisting of MLL SET domain, RbBP5, Ash2L, WDR5, and Dpy30. Negative stain 2D averages from this preparation revealed an architecture with striking resemblance to the yeast complex (Fig. 2F). Thus, both the structure and function of COMPASS and COMPASS-like complexes appear to be highly conserved from yeast to human.

3D Architecture of Core COMPASS. To further characterize the architecture of the core COMPASS, we sought to examine the three-dimensional (3D) structure of the yeast assembly. In the first step, we calculated several 3D models from negative stained particles belonging to the individual groups produced by classification (SI Appendix, Fig. S9). For this approach, we used the corresponding 60° tilted projections and the random conical tilt method to calculate initial reconstructions, which were further refined after the inclusion of 0° particle views. The 3D reconstructions of particles belonging to classes with only one globular domain connected to the Cps60-Cps25 base reveal that these particles have indeed only one WD40-like domain. In contrast, 3D reconstructions from Y-shaped particles reveal the Cps60-Cps25 module having two globular domains connected to it at different distances (SI Appendix, Fig. S9). In these 3D reconstructions, we notice variability in the disposition of the Cps60-Cps25 base as well as in the proximity between the two WD40-like lobes of Cps50 and Cps30. We reasoned that the various conformations might be the result of inherent complex flexibility, resulting in deformations due to the negative stain preparation on the carbon support of the grids. Therefore, we focused our efforts on cryo-EM of the complex in holey carbon grids, and successfully obtained particle images from specimen suspended in thin vitreous ice (SI Appendix, Fig. S6B). Because our negative stain analysis of core COMPASS indicated the presence of two major populations (+/− one WD40 domain), we subjected 21,583 cryo-EM images to multiple reference-supervised alignment (18, 19) using two initial 3D references (SI Appendix, Fig. S10A): a 3D volume representing only the Cps60-Cps25 base and one WD40-like domain (SI Appendix, Fig. S10A, left), and a 3D volume that included the base and both WD40-like domains (SI Appendix, Fig. S10A, right). This approach allowed us to effectively separate the two major populations within the cryo-EM dataset, resulting in stable particle assignments after several cycles of multireference alignment. We then employed single reference alignment and reconstruction to produce a final cryo-EM map of the core COMPASS at a resolution of 24 Å (FSC = 0.5) (SI Appendix, Fig. S10B). The cryo-EM 3D map displayed features that are clearly distinct from those of the starting reference 3D map from negative stain EM. To validate our reconstruction, we also employed different initial references for cryo-EM projection alignment, thereby producing very similar final maps (SI Appendix, Fig. S11). Furthermore, 2D classification of the cryo-EM projections (SI Appendix, Fig. S12A) revealed averages that are in very good agreement with reprojections of the 3D map (SI Appendix, Fig. S12B).
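The resolution quoted above is read from a Fourier shell correlation (FSC) curve at the 0.5 threshold. As a minimal sketch of that final step only, the function below takes an FSC curve that has already been computed and linearly interpolates the spatial frequency at which the curve first drops below the threshold. The function name, the choice of linear interpolation, and the sample curve are assumptions for illustration; the values are synthetic and are not data from this study.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.5):
    """Return the resolution (in Å) where the FSC first falls below threshold.

    freqs: spatial frequencies in 1/Å, in increasing order.
    fsc:   correlation values for the corresponding shells.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells bracketing the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Synthetic curve (hypothetical values, not from the paper).
    freqs = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
    fsc = [0.99, 0.95, 0.80, 0.60, 0.40, 0.10]
    print(resolution_at_threshold(freqs, fsc))
```

A curve that crosses 0.5 at a spatial frequency near 1/24 Å⁻¹ would correspond to the resolution reported for the core COMPASS map.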

Molecular Modeling of Core COMPASS. To obtain deeper insights into the architecture of the core COMPASS, we generated a model by docking the available or homologous crystal structures into their corresponding positions within the cryo-EM map and according to our assignment from the 2D projection analysis (Fig. 3). Due to the limits in resolution, we performed rigid body manual docking based on visual inspection of the fit. The two neighboring globular domains of the 3D map represent the two WD40 domains, with the lobe more distal to the Cps60-Cps25 base belonging to Cps30 (Figs. 3 and 4B, green) and the other to Cps50 (orange). To model Cps50, which contains a β-propeller fold and a C-terminal tail of ∼80 amino acids, we docked a homologous WD40 domain (orange; PDBID:2XL2) in its corresponding position. A recent crystal structure of WDR5, a Cps30 homolog, has been obtained with peptides of both the C-terminal tail of RbBP5 (Cps50 homolog) and the Win motif that is N-terminally adjacent to the SET domain of MLL1 (Set1 homolog) bound onto the opposite faces of WDR5 (20, 21). We thus docked the cocrystal structure (PDBID:3P4F) of the MLL1 Win peptide-WDR5-RbBP5 tail peptide in the position of Cps30 with an orientation that places the RbBP5 tail peptide close to the Cps50 position on the cryo-EM map. Our Fab labeling experiments position the SET domain in the region connecting the two WD40 domains and the Cps60-Cps25 base. Accordingly, the Set1/SET domain from the homologous crystal structure (11, 22) was docked in this position. Although yeast Set1 does not have the canonical Win motif upstream of the SET domain, we identified a similar Ala-Arg-Ser motif at positions 943–945 of the Set1 primary sequence, and found that core COMPASS harboring either Set1 (R944A) or Set1 (S945A) substitutions almost completely loses its H3K4 methyltransferase activity (SI Appendix, Fig. S13). This observation suggests that the Ala-Arg-Ser motif is important in complex formation, like the Win motif in the MLL1 core complex. Based on this result, we oriented the N terminus of Set1 close to Cps30 and opposite to Cps50 (SI Appendix, Fig. S14). According to these interactions, and given the spatial constraints of this region, we could only dock the Set1 domain in a single orientation. In this model, one side of Set1 (938–1,080) appears to extensively interact with Cps30, while another side is forming a bridge with density stemming from the Cps50 WD40 domain (Fig. 4A). Thus, our modeling places the active site of Set1 in the middle of a central channel that runs through the complex, starting from the connection between Cps60-25 and Cps50, and running adjacent to Set1, exiting behind the interface of Cps30 with Cps50 (Fig. 4 A and B). In this configuration, the peptide bound to the active site of the SET domain, as shown in the crystal structure (PDBID:2W5Z), would reside directly in the middle of the COMPASS channel (shown in red in Fig. 4A). Thus, this configuration may limit the substrates that COMPASS can recognize and methylate.

COMPASS Family Members Are Exo-Methylases. Our structural ana-lysis and modeling suggest that the centrally located active sitewithin the COMPASS channel may only be reached by flexiblepeptide terminals, indicating that COMPASS family may functionprimarily as exo- and not endo-methylases. To test this hypothesis,we engineered a Flag sequence to the N terminus of the only copy

Fig. 3. 3D cryo-EM reconstruction and modeling of core COMPASS. Different views of the cryo-EM 3D map for the Cps25-Cps60-Cps50-SET938-Cps30 COMPASS complex. Each view shows the solid rendered map accompanied by a transparent map with modeled crystal structures of the WD40 domain of Cps50 (orange), Cps30 (green), and MLL1/SET938 (cyan). The red arrow indicates the expected position of the Cps50 arm that bridges with the Cps25-Cps60 module.

Takahashi et al. PNAS ∣ December 20, 2011 ∣ vol. 108 ∣ no. 51 ∣ 20529


of histone H3 in our yeast strain and tested H3K4 methylation by Set1/COMPASS. As shown in Fig. 4C (lanes 1, 3), WT H3 can be methylated on K4 by Set1/COMPASS in vivo. However, H3 bearing a single Flag sequence on its N terminus (making the H3K4 site an internal site) is no longer methylated, although this H3 can be fully methylated on K36 and K79 (Fig. 4C, lanes 5, 6). This finding illustrates that Set1/COMPASS specifically methylates the N-terminal tail of histone H3 and that the addition of a short heterologous sequence to the substrate's N terminus, making the site of methylation an internal site, blocks methylation by Set1/COMPASS in yeast cells. To further confirm this observation in a reconstituted system, we also tested whether N-terminally 10-His-tagged H3 can be methylated either by Set1/COMPASS or MLL/COMPASS-like complex. Unlike WT H3, an internally engineered H3K4 site can no longer serve as a substrate for COMPASS (SI Appendix, Fig. S15, lanes 1–14), further suggesting that the COMPASS family members preferentially methylate N-terminal and not internal lysine sites.
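To make concrete how an N-terminal extension turns K4 into an internal site, the sketch below counts the distance of K4 from the N terminus before and after prepending the FLAG sequence (DYKDDDDK, as listed in the Fig. 4 legend). It is an illustration only and rests on two assumptions: the first ten residues of mature H3 are taken as ARTKQTARKS, and the initiator Met is treated as removed in both cases. If Met is retained on the tagged protein, K4 lies one residue further in. The function names are hypothetical.

```python
H3_TAIL = "ARTKQTARKS"  # first 10 residues of mature H3 (initiator Met assumed removed)
FLAG = "DYKDDDDK"       # FLAG epitope, as listed in the Fig. 4 legend


def extend_n_terminus(tail: str, tag: str) -> str:
    """Prepend a tag so that the native N-terminal residues become internal."""
    return tag + tail


def position_from_n_terminus(seq: str, native_pos: int, tag_len: int = 0) -> int:
    """1-based position of a native lysine counted from the (possibly new) N terminus."""
    pos = native_pos + tag_len
    if seq[pos - 1] != "K":
        raise ValueError(f"residue {pos} is {seq[pos - 1]}, not a lysine")
    return pos


wt_pos = position_from_n_terminus(H3_TAIL, 4)
tagged = extend_n_terminus(H3_TAIL, FLAG)
tagged_pos = position_from_n_terminus(tagged, 4, len(FLAG))

print(f"WT H3:   K4 is residue {wt_pos} from the N terminus")
print(f"FLAG-H3: K4 is residue {tagged_pos} from the N terminus")
```

Under these assumptions the target lysine moves from the 4th to the 12th residue from the free N terminus, which is the sense in which the tagged substrate presents an internal rather than a terminal site.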

Discussion

In this study, we established the 3D architecture of S. cerevisiae and human COMPASS complexes through characterization of the fully functional core assembly, consisting of the SET domain of Set1, Cps60/Ash2L, Cps50/RbBP5, Cps30/WDR5, and Cps25/Dpy30 (Fig. 3). In vitro reconstitution of yeast COMPASS allowed us to prepare multiple truncated forms of Set1 in combination with other subunits (Fig. 1) and to analyze the generation of H3K4 methylation by these complexes (Fig. 1; SI Appendix, Fig. S3). The characterization of core COMPASS clearly shows that five components are necessary and sufficient for all forms (mono-, di-, and tri-) of H3K4 methylation in vitro, and therefore this complex likely mediates this modification on active genes in vivo. Furthermore, the obtained 3D architecture of core COMPASS, revealing the centrally located SET domain of Set1 that directly contacts Cps50, Cps30, and the Cps60-Cps25 subcomplex, provides the fundamental architectural blueprint for its enzymatic function and stability.

The core COMPASS subunits identified here are present in all six Set1/COMPASS and MLL1-4 COMPASS-like complexes (9). While a partially reconstituted core MLL complex lacking Dpy30 (Cps25) was able to trimethylate H3K4 in vitro (23), another study showed that the presence of Dpy30 enhanced the catalytic activity of the MLL complexes (16). Recent studies (16, 24) reveal a conserved physical interaction between Cps60/Ash2 and Cps25/Dpy30 both in yeast and humans. Using both biochemical and structural data, we provide evidence that the Cps60-Cps25 subcomplex interacts directly with the SET domain of Set1 (SI Appendix, Fig. S2E) and activates the in vitro H3K4 methyltransferase activity of core COMPASS (SI Appendix, Fig. S3D). We propose that our reconstituted core yeast and human COMPASS can serve as a model for COMPASS family members in metazoans.

The single-particle EM 2D and 3D analyses of core COMPASS and its subcomplexes have provided insights into the underlying molecular mechanisms of H3K4 methylation. The position of the SET domain, immediately adjacent to each component of core COMPASS, suggests a centralized organization that reflects the importance of the SET motif in catalysis, together with complex stabilization by its closely associated subunits. The histone methyltransferase active site within the SET domain is likely located in the middle of a central channel formed by flanking Cps50, Cps30, and the Cps60-25 subcomplex, suggesting that H3K4 methylation takes place inside the channel. This interpretation provides a structural explanation for the regulation of H3K4 di- and trimethylation by Cps60-Cps25 observed in vivo and in vitro. The Cps60-Cps25 subcomplex could directly alter the structure of the SET domain to allow an inward shift of iSET, the precise positioning of which was previously proposed to allow MLL1's SET domain to trimethylate H3K4 (11). Additionally, the 3D map and modeling of the complex presented here have suggested that COMPASS family members may function as exo-methylases. We have shown that, both in a reconstituted system and in vivo, the COMPASS family does indeed prefer N-terminal lysines as substrates (Fig. 4C).

Our reconstitution studies further suggest a previously unknown regulatory role for Cps30 in the progression from H3K4 di- to trimethylation by COMPASS (SI Appendix, Fig. S3). Consistent with our in vitro findings with yeast COMPASS, metazoan Cps30 (WDR5) has been shown to be specifically required for H3K4 trimethylation, but not for dimethylation, in Xenopus laevis embryos as well as in human HOX genes (25). Although the mechanistic aspect of Cps30-mediated regulation is unclear, it is possible that Cps30, through its proximity and interactions, directly affects the structure of the SET domain in a distinct way from that of Cps60-25, leading to an optimal conformation for H3K4 trimethylation.

Repressive H3K27 methylation, which has an apparently opposite function to H3K4 methylation, is also introduced by the SET domain-containing multiprotein complex PRC2 (26). COMPASS and PRC2 share common features, such as closely related SET-containing catalytic subunits that require complex formation for activity in vivo and in vitro (27, 28), and also the presence of multiple WD40 proteins (RbBP5 and WDR5 in MLL1/COMPASS, and EED and RbAP46/48 in PRC2) (29). It will thus be interesting to investigate whether and how substrate specificities and/or enzymatic mechanisms are common or diverged between COMPASS and PRC2.

Experimental Procedures

Plasmids and Yeast Strains. Full-length Set1, Set1 (762–1,080), Set1 (780–1,080), Set1 (938–1,080), and the Cps subunits of COMPASS (Cps60, Cps50, Cps40, Cps35, Cps30, Cps25) except for Cps15 were fused with the FLAG epitope tag on their N termini and cloned into the transfer vector pBacPAK8 (Clontech). The MLL1 core complex components including MLL1-Win-SET

Fig. 4. COMPASS family members are exo-methylases. (A) Zoom-in view of the central channel formed in the complex, with the histone peptide of MLL1 (as cocrystallized) shown as a red space-filling model. (B) Schematic model of core COMPASS family architecture. The red star indicates the histone peptide-binding region of SET938. (C) The N-terminally extended version of histone H3 was constructed by inserting the FLAG sequence (DYKDDDDK) between the Met start codon and the second Ala codon. COMPASS shows no H3K4 methylase activity with the FLAG-extended H3 tails in vivo (this figure) and in vitro (SI Appendix, Fig. S15).

20530 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1109360108 Takahashi et al.


(3,745–3,969), full-length WDR5, Ash2L, RbBP5, and Dpy30 were tagged with FLAG epitope tag on their N termini and cloned into the pBacPAK8 vector in the same manner. Yeast shuffle strain YBL574 was transformed with modified pWZ414-F12 plasmid, which encodes an N-terminally FLAG-tagged hht2 gene, and used for the analysis of exo-methylation by COMPASS.

Protein Preparation. Recombinant COMPASS and MLL1 complexes were prepared through the BacPAK Baculovirus Expression System (Clontech). To prepare COMPASS complexes of various combinations of the Set1 and Cps subunits, exponentially growing Sf9 insect cell cultures were cotransfected with a mixture of viruses expressing specific combinations of COMPASS components and subsequently FLAG-purified. The MLL1 core complex was prepared in the same manner. For further details refer to SI Appendix.

In Vitro and In Vivo H3K4 Methylation Analysis. Recombinant COMPASS complexes were incubated with 0.5 μg of free histone H3 and 200 μM S-adenosylmethionine in methyltransferase reaction buffer (50 mM Tris-HCl [pH 8.8], 20 mM KCl, 5 mM MgCl2, 0.5 mM dithiothreitol) for at least 2 h at 30 °C. The methylation of histone H3 was examined by Western analysis using anti-H3K4me1, me2, and me3 specific antibodies. Histone methylation in vivo was determined by Western analysis of cleared cell lysates using anti-H3K4me1, me2, me3, H3K79me2, me3, and H3K36me3 specific antibodies.
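The final concentrations listed above can be turned into pipetting volumes with the usual dilution relation, volume of stock = final concentration × total volume / stock concentration. The sketch below does this arithmetic in Python. Only the final concentrations come from the text; the stock concentrations and the 20 μL reaction volume are assumptions chosen for illustration and are not stated in the paper.

```python
# Final concentrations (mM) taken from the methylation assay description.
FINAL_MM = {
    "Tris-HCl pH 8.8": 50.0,
    "KCl": 20.0,
    "MgCl2": 5.0,
    "DTT": 0.5,
    "SAM": 0.2,  # 200 uM S-adenosylmethionine
}

# Hypothetical stock concentrations (mM); NOT from the paper.
STOCK_MM = {
    "Tris-HCl pH 8.8": 1000.0,
    "KCl": 1000.0,
    "MgCl2": 1000.0,
    "DTT": 100.0,
    "SAM": 10.0,
}

REACTION_UL = 20.0  # assumed total reaction volume in microliters


def stock_volumes(final_mm: dict, stock_mm: dict, total_ul: float) -> dict:
    """Volume of each stock (uL) needed to reach its final concentration."""
    volumes = {}
    for name, c_final in final_mm.items():
        c_stock = stock_mm[name]
        if c_final > c_stock:
            raise ValueError(f"{name}: stock is more dilute than the target")
        volumes[name] = c_final * total_ul / c_stock
    return volumes


volumes = stock_volumes(FINAL_MM, STOCK_MM, REACTION_UL)
remaining = REACTION_UL - sum(volumes.values())  # left for enzyme, histone H3, water

for name, vol in volumes.items():
    print(f"{name:16s} {vol:5.2f} uL")
print(f"{'remaining':16s} {remaining:5.2f} uL")
```

With these assumed stocks, the buffer components and cofactor take up about 2 μL of a 20 μL reaction, leaving the rest for the recombinant complex, the 0.5 μg of histone H3, and water.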

Specimen Preparation, EM Imaging, 2D Classification, and 3D Reconstruction of Negative Stained COMPASS Complexes. Core COMPASS and subcomplexes were prepared for electron microscopy using the conventional negative staining protocol (30), and imaged at room temperature with a Tecnai T12 electron microscope operated at 120 kV using low-dose procedures. 2D reference-free alignment and classification of particle projections was performed using SPIDER (31). The random conical tilt technique (32) was used to calculate a first back projection map from individual classes using the images of the tilted specimen. FREALIGN (33) was then used for further refinement of the orientation parameters and for correction of the contrast transfer function to produce final 3D reconstructions. For further details refer to SI Appendix.

Cryo-EM Specimen Preparation, Imaging, and 3D Reconstruction. Vitrified specimen was visualized on a Tecnai F20 electron microscope (FEI) equipped with a field emission electron source operated at 200 kV. Particles from cryo-EM images were excised using Boxer [part of the EMAN 1.9 software suite] (34). A total of 21,583 particles were initially subjected to multiple reference-supervised alignment with EMAN (18, 19), and the separated particle projections were subsequently submitted to single reference refinement and 3D reconstruction. For further details refer to SI Appendix.
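The two accelerating voltages used for imaging (120 kV for negative stain, 200 kV for cryo-EM) set the electron wavelength, one of the parameters that enters any contrast transfer function correction. As a side illustration that is not taken from the paper, the standard relativistic expression λ = h / sqrt(2·m0·e·V·(1 + e·V / (2·m0·c²))) can be evaluated as follows; the function name is hypothetical.

```python
import math

# CODATA physical constants (SI units).
PLANCK_H = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31   # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
SPEED_OF_LIGHT = 299792458.0       # m / s


def electron_wavelength_pm(voltage_v: float) -> float:
    """Relativistic electron wavelength in picometers for an accelerating voltage in volts."""
    energy = ELEMENTARY_CHARGE * voltage_v  # kinetic energy in joules
    rest_energy = ELECTRON_MASS * SPEED_OF_LIGHT ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy)))
    return PLANCK_H / momentum * 1e12


for kv in (120, 200):
    print(f"{kv} kV: {electron_wavelength_pm(kv * 1e3):.3f} pm")
```

The values come out near 3.35 pm at 120 kV and 2.51 pm at 200 kV, far shorter than the resolution of the reconstructions discussed here.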

ACKNOWLEDGMENTS. We thank Dr. J.F. Couture for his comments and insight into the assembly of COMPASS. We are also grateful to Laura Shilatifard for editorial assistance. G.W. is supported by the National Institute of General Medical Sciences (NIGMS) Molecular Biophysics Training Grant GM008270-23. R.C.T. is supported by R01-GM073839. Studies in the Shilatifard laboratory are supported by funds provided from the National Institutes of Health R01-CA150265 and R01-GM069905. G.S. is supported by R01-DK090165.

1. Bhaumik SR, Smith E, Shilatifard A (2007) Covalent modifications of histones during development and disease pathogenesis. Nat Struct Mol Biol 14:1008–1016.

2. Shilatifard A (2006) Chromatin modifications by methylation and ubiquitination: implications in the regulation of gene expression. Annu Rev Biochem 75:243–269.

3. Zhang Y, Reinberg D (2001) Transcription regulation by histone methylation: interplay between different covalent modifications of the core histone tails. Genes Dev 15:2343–2360.

4. Miller T, et al. (2001) COMPASS: a complex of proteins associated with a trithorax-related SET domain protein. Proc Natl Acad Sci USA 98:12902–12907.

5. Krogan NJ, et al. (2002) COMPASS, a histone H3 (Lysine 4) methyltransferase required for telomeric silencing of gene expression. J Biol Chem 277:10753–10755.

6. Roguev A, et al. (2001) The Saccharomyces cerevisiae Set1 complex includes an Ash2 homologue and methylates histone 3 lysine 4. EMBO J 20:7137–7148.

7. Schlichter A, Cairns BR (2005) Histone trimethylation by Set1 is coordinated by the RRM, autoinhibitory, and catalytic domains. EMBO J 24:1222–1231.

8. Schneider J, et al. (2005) Molecular regulation of histone H3 trimethylation by COMPASS and the regulation of gene expression. Mol Cell 19:849–856.

9. Shilatifard A (2008) Molecular implementation and physiological roles for histone H3 lysine 4 (H3K4) methylation. Curr Opin Cell Biol 20:341–348.

10. Eissenberg JC, Shilatifard A (2010) Histone H3 lysine 4 (H3K4) methylation in development and differentiation. Dev Biol 339:240–249.

11. Southall SM, Wong PS, Odho Z, Roe SM, Wilson JR (2009) Structural basis for the requirement of additional factors for MLL1 SET domain activity and recognition of epigenetic marks. Mol Cell 33:181–191.

12. Dehe PM, et al. (2006) Protein interactions within the Set1 complex and their roles in the regulation of histone 3 lysine 4 methylation. J Biol Chem 281:35404–35412.

13. Lee JS, et al. (2007) Histone crosstalk between H2B monoubiquitination and H3 methylation mediated by COMPASS. Cell 131:1084–1096.

14. Nedea E, et al. (2008) The Glc7 phosphatase subunit of the cleavage and polyadenylation factor is essential for transcription termination on snoRNA genes. Mol Cell 29:577–587.

15. Cao F, et al. (2010) An Ash2L/RbBP5 heterodimer stimulates the MLL1 methyltransferase activity through coordinated substrate interactions with the MLL1 SET domain. PLoS One 5:e14102.

16. Patel A, Dharmarajan V, Vought VE, Cosgrove MS (2009) On the mechanism of multiple lysine methylation by the human mixed lineage leukemia protein-1 (MLL1) core complex. J Biol Chem 284:24242–24256.

17. Patel A, Vought VE, Dharmarajan V, Cosgrove MS (2011) A novel non-SET Domain multi-subunit methyltransferase required for sequential nucleosomal histone H3 methylation by the mixed lineage leukemia protein-1 (MLL1) core complex. J Biol Chem 286:3359–3369.

18. Brink J, et al. (2004) Experimental verification of conformational variation of human fatty acid synthase as predicted by normal mode analysis. Structure 12:185–191.

19. Menetret JF, et al. (2005) Architecture of the ribosome-channel complex derived from native membranes. J Mol Biol 348:445–457.

20. Odho Z, Southall SM, Wilson JR (2010) Characterization of a novel WDR5-binding site that recruits RbBP5 through a conserved motif to enhance methylation of histone H3 lysine 4 by mixed lineage leukemia protein-1. J Biol Chem 285:32967–32976.

21. Avdic V, et al. (2011) Structural and biochemical insights into MLL1 core complex assembly. Structure 19:101–108.

22. Takahashi YH, et al. (2009) Regulation of H3K4 trimethylation via Cps40 (Spp1) of COMPASS is monoubiquitination independent: implication for a Phe/Tyr switch by the catalytic domain of Set1. Mol Cell Biol 29:3478–3486.

23. Dou Y, et al. (2006) Regulation of MLL1 H3K4 methyltransferase activity by its core components. Nat Struct Mol Biol 13:713–719.

24. South PF, Fingerman IM, Mersman DP, Du HN, Briggs SD (2010) A conserved interaction between the SDI domain of Bre2 and the Dpy-30 domain of Sdc1 is required for histone methylation and gene expression. J Biol Chem 285:595–607.

25. Wysocka J, et al. (2005) WDR5 associates with histone H3 methylated at K4 and is essential for H3 K4 methylation and vertebrate development. Cell 121:859–872.

26. Cao R, et al. (2002) Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298:1039–1043.

27. Kuzmichev A, Nishioka K, Erdjument-Bromage H, Tempst P, Reinberg D (2002) Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev 16:2893–2905.

28. Nekrasov M, Wild B, Muller J (2005) Nucleosome binding and histone methyltransferase activity of Drosophila PRC2. EMBO Rep 6:348–353.

29. Kuzmichev A, Jenuwein T, Tempst P, Reinberg D (2004) Different EZH2-containing complexes target methylation of histone H1 or nucleosomal histone H3. Mol Cell 14:183–193.

30. Ohi M, Li Y, Cheng Y, Walz T (2004) Negative staining and image classification – powerful tools in modern electron microscopy. Biol Proced Online 6:23–34.

31. Frank J, et al. (1996) SPIDER and WEB: processing and visualization of images in 3D electron microscopy and related fields. J Struct Biol 116:190–199.

32. Radermacher M, Wagenknecht T, Verschoor A, Frank J (1987) Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit of Escherichia coli. J Microsc 146:113–136.

33. Grigorieff N (2007) FREALIGN: high-resolution refinement of single particle structures. J Struct Biol 157:117–125.

34. Ludtke SJ, Baldwin PR, Chiu W (1999) EMAN: semiautomated software for high-resolution single-particle reconstructions. J Struct Biol 128:82–97.

