

GYF Domain Proteomics Reveals Interaction Sites in Known and Novel Target Proteins*

Michael Kofler, Kathrin Motzny, and Christian Freund‡

GYF domains are conserved eukaryotic adaptor domains that recognize proline-rich sequences. Although the structure and function of the prototypic GYF domain from the human CD2BP2 protein have been characterized in detail, very little is known about GYF domains from other proteins and species. Here we describe the binding properties of four GYF domains of various origins. Phage display in combination with SPOT analysis revealed the PPG(F/I/L/M/V) motif as a general recognition signature. Based on these results, the proteomes of human, yeast, and Arabidopsis thaliana were searched for potential interaction sites. Binding of several candidate proteins was confirmed by pull-down experiments or yeast two-hybrid analysis. The binding epitope of the GYF domain from the yeast SMY2 protein was mapped by NMR spectroscopy and led to a structural model that accounts for the different binding properties of SMY2-type GYF domains and the CD2BP2-GYF domain. Molecular & Cellular Proteomics 4:1797–1811, 2005.

The GYF domain is a protein interaction domain ubiquitously expressed in eukaryotic species (1, 2). It belongs to the functional class of proline-rich sequence (PRS)¹ recognition domains such as SH3 (3, 4), WW (5), EVH1 (6), and UEV domains (7, 8) as well as profilin (9). Although the sequence requirements have been investigated in detail for most of the PRS-binding domains, little is known about the recognition code of GYF domains, and the biological role of GYF domain-containing proteins remains largely unknown. So far, an in-depth analysis has been performed solely for the GYF domain of CD2BP2. For this GYF domain a role in CD2 receptor-dependent T cell signaling has been observed (1, 10), and the identification of CD2BP2 as part of the U5 snRNP (11, 12) and as interaction partner of the core splicing protein SmB/B′ (13) suggests an independent role in splicing or splicing-associated processes. The structure of the complex of the CD2BP2-GYF domain with a CD2-derived proline-rich peptide defined a set of mostly conserved aromatic amino acids of the domain to act as the primary contact site for the interaction (2, 10). Analysis of the binding properties revealed two classes of ligands for CD2BP2-GYF. The CD2 class shows a charge dependence of binding and is characterized by the recognition motif PPGX(R/K), whereas the so-called PPGW class requires a hydrophobic residue directly C-terminal to the PPG core, and its recognition motif was identified as PPG(W/F/Y/M/L) (39). Sequence alignment suggests that the GYF domain of CD2BP2 belongs to a GYF domain subfamily that is characterized by a tryptophan at position 8 and an extended loop between β-strands 1 and 2 (10) (see Fig. 1). Most GYF domains contain an aspartate at position 8 and a shorter β1-β2 loop (Fig. 1), forming a second subfamily. Here we describe the analysis of four GYF domains from yeast, plant, and man that belong to the second subfamily, which we name the SMY2 subfamily of GYF domains. SMY2 was originally cloned as a suppressor of the myo2-66 mutation in the motor protein MYO2 in Saccharomyces cerevisiae (14), and our investigation of the GYF domains of SMY2 and three other proteins shows that all sequences recognized by the different GYF domains converge on the PPG motif followed by one of the hydrophobic residues Phe, Ile, Leu, Met, and Val (Φ). Except for this PPGΦ motif there is little dependence on additional flanking residues within the respective ligands.
The consensus motifs that were derived by a combination of phage display and substitution analysis allowed us to identify natural target sites. Peptides comprising these sites were tested for binding by membrane SPOT analysis. Several interactions were further investigated by fluorescence and NMR titration experiments revealing binding epitopes and affinities for the novel ligands. Yeast two-hybrid analysis and pull-down experiments confirmed the association of selected proteins with GYF domains under more physiological conditions and allowed the placement of GYF domain-containing proteins into known functional contexts. In S. cerevisiae, the GYF domains of SMY2 and its paralog YPL105C (named SYH1 for SMY2 homolog 1) interact with proteins that have been implicated in pre-mRNA branch point binding (MSL5) or regulation of translation (EAP1). For the human O75137 protein (also indicated as PERQ2 because of its homology to human PERQ amino acid rich with GYF domain protein 1 (PERQ1)), our work identified the core splicing proteins SmB/B′ and snRNP-N as the most likely interaction partners, whereas the Arabidopsis thaliana Q9FMM3-GYF domain selects for sequences in the homolog of the human transcription regulator CNOT4 (accordingly termed GYN4, GYF domain-containing protein binding to Not4 homolog). GYN4 and SYH1 contain internal binding sites for their respective GYF domains, and we show that intramolecular association precludes intermolecular binding in the case of GYN4.

From the Protein Engineering Group, Forschungsinstitut für Molekulare Pharmakologie and Freie Universität Berlin, Robert-Rössle-Str. 10, 13125 Berlin, Germany

Received, May 4, 2005, and in revised form, August 9, 2005
Published, MCP Papers in Press, August 23, 2005, DOI 10.1074/mcp.M500129-MCP200

¹ The abbreviations used are: PRS, proline-rich sequence; EVH1, Ena-Vasp homology 1; GFP, green fluorescent protein; GYN4, GYF domain-containing protein binding to Not4; GYN4-PR, GYN4-GYF, comprising the C-terminal proline-rich motif; NPWBP, Npw38-binding protein; PERQ1, PERQ amino acid rich with GYF domain protein 1; PERQ2, PERQ amino acid rich with GYF domain protein 2; PR-SYH1, SYH1-GYF, comprising the N-terminal proline-rich motif; SYH1, SMY2 homolog 1; SH3, Src homology 3; snRNP, small nuclear ribonucleoprotein; SWAN, SH3/WW domain anchor protein in the nucleus; UEV, ubiquitin E2 (ubiquitin carrier protein) variant; HA, hemagglutinin; SGD, Saccharomyces Genome Database; eIF, eukaryotic initiation factor; E3, ubiquitin-protein isopeptide ligase; TAIR, The Arabidopsis Information Resource.

© 2005 by The American Society for Biochemistry and Molecular Biology, Inc. Molecular & Cellular Proteomics 4.11. This paper is available on line at http://www.mcponline.org

MATERIALS AND METHODS

Constructs—The estimated domain borders of the GYF domains of proteins PERQ2 (Swiss-Prot accession number O75137; residues 531–596), GYN4 (Swiss-Prot accession number Q9FMM3; residues 546–604), SMY2 (Swiss-Prot accession number P32909; residues 243–340), and SYH1 (Swiss-Prot accession number Q02875; residues 150–226) were obtained from sequence alignments, and the corresponding fragments were subcloned into expression vectors. For GYN4 and SYH1, longer constructs containing an internal PRS (GYN4 residues 546–619, termed GYN4-PR, and SYH1 residues 141–226, termed PR-SYH1; see Fig. 1) were also cloned into protein expression vectors. All constructs were amplified from the following DNA clones. PERQ2 was derived from the cDNA clone HJ03496 and GYN4 was derived from genomic P1 clone MBD2 (kind gifts from Takahiro Nagase and Satoshi Tabata, Kazusa DNA Research Institute, respectively), whereas SMY2 and SYH1 were amplified from genomic DNA of yeast strain S288C. For GST fusion protein expression, the fragments of PERQ2, SYH1, and SMY2 were cloned into pGEX4T-1 (Amersham Biosciences) via BamHI and XhoI restriction sites, and the fragments of GYN4 were cloned via BamHI and NotI restriction sites. For yeast two-hybrid screens the GYF domains of PERQ2 and GYN4 were cloned into the bait vector pGBKT7 (Clontech) via NcoI and BamHI and via NcoI and NotI restriction sites, respectively. For the yeast two-hybrid analysis of selected candidates, the same fragment of PERQ2 was also introduced into the prey vector pGADT7 (Clontech). Fragments of the candidates NPWBP (Swiss-Prot accession number Q9Y2W2; residues 388–551), SmB (Swiss-Prot accession number P14678-2; residues 148–231), and SWAN (Swiss-Prot accession number Q9NTZ6; residues 700–869) were amplified from the following I.M.A.G.E. Consortium (Lawrence Livermore National Laboratory) cDNA clones (15) obtained from the Deutsches Ressourcenzentrum für Genomforschung GmbH: IRALp962P0114Q2 (NPWBP), IMAGp998D118415Q3 (SmB), and IRALp962K1725Q2 (SWAN). The respective I.M.A.G.E. Consortium clone identification numbers are 3829990, 3445210, and 3956772. These fragments, comprising the proline-rich regions of the respective proteins, and the cytoplasmic tail of CD2 (Swiss-Prot accession number P06729; residues 245–351) were inserted into pGBKT7 via NcoI and NotI restriction sites or, in the latter case, via EcoRI and BamHI restriction sites.

The His6-tagged GYF domain of SMY2 was expressed from a modified pET28 vector, containing an N-terminal His6 tag, followed by a thrombin cleavage site and BamHI and XhoI restriction sites, which allowed cloning similar to that for the GST fusion construct; for His6-tagged PERQ2, the corresponding fragment was cloned into the pTFT74 vector (16) via NcoI and HindIII sites with an N-terminal His6 tag introduced via PCR.

Cloning of the focused library RKRSHRXXPPPXXXVQ into PC89 was similar to the procedure described elsewhere (17). PC89 and the PC89 nonapeptide library (X9) were a gift from Gianni Cesareni (Dipartimento di Biologia, Università di Roma).

Protein Preparation—Proteins were expressed in Escherichia coli BL21 (DE3-pLys S) and purified from the soluble fraction after sonication. GST-GYF and His6-tagged GYF domains were purified by affinity chromatography using glutathione-Sepharose and Ni²⁺-nitrilotriacetic acid-agarose, respectively, according to the manufacturer's manual (Amersham Biosciences) and dialyzed against PBS. To obtain NMR samples for titration experiments and backbone resonance assignments of the SMY2-GYF domain, cells were grown on defined media supplemented with [15N]NH4Cl and/or [13C]glucose. The His6 tag of SMY2-GYF and GST tag of SYH1-GYF were removed by thrombin cleavage (Calbiochem, 10 units/mg of protein, at 4 °C and 16 °C overnight in PBS, respectively), and the domains were purified by subsequent gel filtration (Superdex® 75, Amersham Biosciences) in PBS.

Phage Display—Phage displaying the nonapeptide (X9) or the focused peptide library fused to the major capsid protein were produced by transforming E. coli XL-1 Blue cells with PC89 constructs followed by superinfection with the VCS-M13 helper phage (Stratagene). After overnight incubation in 2× YT medium in the presence of ampicillin and kanamycin (30 °C at 270 rpm), phage particles were purified by three successive polyethylene glycol/NaCl precipitations (18). Library screening was performed as follows. 30–50 µl of GST-GYF-loaded glutathione-Sepharose 4B beads (Amersham Biosciences) were incubated with 5 × 10⁹ to 5 × 10¹¹ infectious particles at 4 °C overnight in PBS. After washing three times with PBS, bound phages were eluted with 100 mM glycine HCl, pH 2.2, and neutralized with 2 M Tris. For phage amplification, E. coli XL-1 Blue cells were infected with eluted phage followed by superinfection with helper phage and subsequent incubation as described above. After six rounds of panning, the inserts of selected phage were sequenced to identify their displayed peptides.

SPOT Analysis—Peptides were synthesized on Whatman 50 paper using an Auto-Spot Robot (ASP 222, Invartis AG, Köln, Germany). Peptide synthesis using Fmoc (N-α-(9-fluorenylmethyloxycarbonyl)) chemistry on β-alanine functionalized cellulose membranes was performed according to standard protocols (19). Membranes were probed with GST-GYF protein as described elsewhere (20). Briefly, the membranes were incubated with GST-GYF (40 µg/ml) overnight. After washing with PBS, bound GST fusion protein was detected with rabbit polyclonal anti-GST antibody (Z-5, Santa Cruz Biotechnology) and horseradish peroxidase-coupled anti-rabbit IgG antibodies (Rockland). An enhanced chemiluminescence substrate (SuperSignal West Pico, Pierce) was used for detection by a LumiImager™ (Roche Applied Science).

NMR Spectroscopy—The NMR experiments were performed at 297 or 299 K on either a Bruker DRX600 or DMX750 instrument equipped with standard triple resonance probes. Data processing and analysis were carried out with the XWINNMR (Bruker) and SPARKY (40) software packages. Backbone assignment of the SMY2-GYF domain was based on the CBCA(CO)NH (21), the CBCANH (22), and the HNCO (23) experiments of SMY2-GYF in PBS. In the NMR titration experiments, increasing amounts of synthetic peptides (see Table IV) were added to a 0.2 mM sample of the 15N-labeled CD2BP2- or SMY2-GYF domain. The gradual change of chemical shifts in the heteronuclear single quantum coherence spectra allowed the resonances of the ligand-bound GYF domain to be unambiguously assigned. The sum of the chemical shift changes for 15N and 1H atoms in a sample with peptide was determined as $((\Delta^{1}\mathrm{H})^{2} + (\Delta^{15}\mathrm{N})^{2})^{1/2}$, where $\Delta^{1}\mathrm{H}$ is in units of 0.1 ppm and $\Delta^{15}\mathrm{N}$ is in units of 0.5 ppm. Curve fitting and KD calculations were performed with the Microcal™ Origin™ program assuming a simple two-state binding model.
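The combined shift measure and the two-state fit described above can be illustrated with a short Python sketch. It is not the Origin procedure used in the study: the 1:1 binding isotherm with ligand depletion and the grid search over KD are assumptions chosen to keep the example dependency-free, and all function names and numbers are hypothetical.

```python
import math

def combined_shift(d_h_ppm, d_n_ppm, h_unit=0.1, n_unit=0.5):
    """Combined 1H/15N shift change; each axis is expressed in its own unit
    (0.1 ppm for 1H, 0.5 ppm for 15N) before taking the root of the squares."""
    return math.hypot(d_h_ppm / h_unit, d_n_ppm / n_unit)

def bound_fraction(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (exact quadratic solution).
    All three arguments must share one concentration unit."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, ligand_concs, shifts, kd_grid):
    """Least-squares KD from a grid of candidate values (assumed approach).
    For each KD the maximal shift is obtained in closed form."""
    best = None
    for kd in kd_grid:
        f = [bound_fraction(p_total, l, kd) for l in ligand_concs]
        d_max = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
        sse = sum((d_max * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

if __name__ == "__main__":
    # Synthetic, noise-free titration of a 200 µM (0.2 mM) sample; values are made up.
    ligands = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
    shifts = [2.0 * bound_fraction(200.0, l, 50.0) for l in ligands]
    print(fit_kd(200.0, ligands, shifts, [float(k) for k in range(1, 501)]))
```

With real data a nonlinear least-squares routine would normally replace the grid, but the grid makes the dependence of the fit on KD easy to inspect.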


Fluorescence Titration—Fluorescence of GYF domains or GST-GYF fusions in the case of GYN4 (3 µM in PBS) was excited at 280 nm in the presence of increasing amounts of peptides (see Table IV) at 25 °C on a PerkinElmer Life Sciences LS-50B fluorometer, and the emission spectra were recorded between 300 and 400 nm. Centroid shifts were calculated using the software SpecWin (a kind gift of Sebastian Modersohn). Binding data were analyzed as described above.
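The text does not describe how SpecWin computes a centroid shift. A common definition, assumed here, is the intensity-weighted mean emission wavelength; the Python sketch below uses that definition with hypothetical function names and toy spectra.

```python
def spectral_centroid(wavelengths_nm, intensities):
    """Intensity-weighted mean wavelength of an emission spectrum (assumed definition)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no positive total intensity")
    return sum(w * i for w, i in zip(wavelengths_nm, intensities)) / total

def centroid_shift(wavelengths_nm, free_spectrum, bound_spectrum):
    """Centroid of the peptide-containing sample minus centroid of the free domain."""
    return (spectral_centroid(wavelengths_nm, bound_spectrum)
            - spectral_centroid(wavelengths_nm, free_spectrum))

if __name__ == "__main__":
    # Toy three-point spectra in the 300-400 nm window; intensities are made up.
    wavelengths = [300.0, 350.0, 400.0]
    print(centroid_shift(wavelengths, [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]))
```

The shift obtained at each peptide concentration would then serve as the observable in the same two-state binding analysis used for the NMR titrations.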

Yeast Two-hybrid Experiments—Yeast two-hybrid experiments were performed using MATCHMAKER GAL4 two-hybrid system 3 according to the manufacturer's manual (Clontech). For library screens, the pGBKT7 bait construct encoding the GAL4 DNA-binding domain fused to the GYF domain of PERQ2 or GYN4, either with or without its PRS extension, was introduced into the yeast strain AH109 followed by the transformation with the human lung cDNA library in the pGAD vector (Clontech) or the Horwitz and Ma A. thaliana two-hybrid library (Arabidopsis Biological Resource Center) (24), respectively. Plasmids of cotransformants growing on synthetic defined medium deficient for His, Leu, and Trp or Ala, His, Leu, and Trp were rescued from yeast according to a modified MATCHMAKER protocol. After incubation with lyticase and lysis of cells with SDS and freeze-thawing, lysates were mixed with N3 buffer of the Qiagen plasmid isolation kit, and plasmid preparation followed the protocol thereof. Selected candidates were sequenced to identify the polypeptide interacting with the respective GYF domains. For the analysis of suggested interaction partners and for reconfirming candidates from the yeast two-hybrid screen, the corresponding bait/prey vector combinations were introduced into yeast and cultured on synthetic defined medium as described above.

Lysate Preparation and GST Pull-down Assay—Yeast strain BY4741 was transformed with Yep352-5��3xHA/EAP1 (25) (a gift from Nahum Sonenberg) or GAL-MSL5-HA(3) (26) (a gift from Michael Rosbash). Exponentially growing cells were harvested at A600 of 0.7–1.4, washed with 0.9% NaCl, and resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 1% DMSO, 100 mM NaCl, 1 mM EDTA, 1 mM PMSF, 1 tablet/100 ml Complete Mini EDTA-free protease inhibitor mixture, Roche Applied Science) supplemented with PMSF to a final concentration of 3.4 mM and Complete Mini EDTA-free protease inhibitor mixture to ∼2.8 tablets/10 ml. Cells were lysed by vortexing three times for 1 min with ∼3 g of glass beads (425–600 µm, acid washed, Sigma)/g of cell wet weight. Cell debris was removed by centrifugation (20 min at 16,000 rpm), and lysates were snap frozen.

25 µl of glutathione-Sepharose 4B beads (Amersham Biosciences), loaded with GST or GST fusion proteins of the respective GYF domains (SYH1 and SMY2), were incubated with 100-µl lysates at 4 °C overnight in the absence or presence of 1 mM competing peptide MSL5L1 (see Table IV). Beads were washed three times with PBS. Bound proteins were eluted in SDS sample buffer, separated by SDS-PAGE, and transferred onto nitrocellulose membrane. HA-tagged MSL5 or EAP1 was detected by probing the membrane with anti-HA antibody (BD Biosciences) and a horseradish peroxidase-conjugated secondary antibody. The immunoblots were then developed as described for the SPOT analysis.

RESULTS

Phage Display—GYF domains from evolutionarily distant species of the eukaryotic kingdom were chosen for analysis. The sequence alignment of these domains is shown in Fig. 1 and highlights the conserved amino acids of the N-terminal part of the domains. In the case of GYN4 and SYH1, two GYF domain constructs of varying length were used to account for the presence of an internal PRS that is localized directly adjacent to the anticipated domain borders (Fig. 1). The phage libraries applied for the screen were of the format X2PPPX3 or X9, and the peptides were expressed as gene VIII fusion proteins (17). Individual clones were sequenced after six rounds of panning, and the results are summarized in Table I. As can be seen, the obtained sequence motifs for both libraries are similar, and the preference for the PPG core motif is observed for all of the GYF domains. A strict requirement for a third consecutive proline is not seen in the case of the X9 library, which is in agreement with previous structural data of the CD2BP2-GYF domain·CD2 peptide complex, showing that the PPG motif contributes most of the van der Waals interactions (10). The general requirement for the PPG motif suggests that the binding mode for the four GYF domains and CD2BP2-GYF is similar. The requirement for only two prolines raises the question, however, whether formation of the polyproline II helix is a prerequisite for binding of ligands to GYF domains in general.

FIG. 1. Alignment of the GYF domains used in this study in comparison with the CD2BP2-GYF domain. Conserved residues are highlighted as white letters on a black background. Residues characteristic for the two subfamilies of GYF domains are shown as white letters on a gray background. Sequences and numbers in brackets belong to the respective single chain constructs used in this study (GYN4-PR and PR-SYH1) or analyzed previously (CD2BP2). Note that for CD2BP2, two single chain constructs are displayed, either with an N-terminal or with a C-terminal extension (27). Numbers above the alignment refer to the CD2BP2-GYF domain; numbers flanking the sequences indicate the position within the respective full-length proteins. For CD2BP2-GYF single chain constructs, linker residues and the GYF domain-binding motif from CD2 are depicted as small italicized letters in regular type and in bold on a gray background, respectively. Residues depicted as bold capitals on a gray background comprise intramolecular binding motifs for the GYF domains of GYN4 and SYH1. Secondary structure elements are depicted above the alignment.

For the position directly C-terminal to the PPG motif, the GYF domains of SMY2, SYH1, and PERQ2 seem to accept most hydrophobic amino acids with a preference for leucine or isoleucine. For the GYN4-GYF domain (containing amino acids 546–604 of the full-length protein), phenylalanine is the preferred amino acid at this position, whereas a longer construct (GYN4-PR; amino acids 546–619) that includes the proline-rich sequence KSGPPPGFTG failed to select for peptides defining a consensus (Table I and data not shown). Interestingly, the internal PRS comprises the consensus motif PPGF for this GYF domain as can be seen from Table I. Because previous experiments from our group have shown that a C-terminally linked proline-rich ligand can bind intramolecularly to the CD2BP2-GYF domain (Fig. 1) (27), we suggest that the intramolecular binding of the ligand in GYN4-PR masks the GYF domain binding site accordingly. In contrast, the presence of the internal PRS directly N-terminal to the GYF domain in SYH1 does not interfere with phage selection (Table I). This is in agreement with experiments of CD2BP2-GYF domain single chain constructs, showing that a short linker sequence is not sufficient to allow the intramolecular interaction between an N-terminal binding motif and the GYF domain to take place (27).

TABLE I
Phage display results

Phage display results for various GYF domains after six rounds of panning. Glutathione bead-coupled GST-GYF domain fusions from the proteins indicated in the first row were used for the selection procedure. PR-SYH1 includes the N-terminal intramolecular PRS. The triple proline stretch in the focused library as well as prolines of the binding motif from the X9 library are depicted in white letters on a gray background. Residues other than proline that show significant presence in both screens or that are characterized by chemical properties similar to the frequently found amino acids are shown in white on a black background. Based on this classification the strict consensus sequence for each GYF domain was defined as shown in the row "Strict Consensus." For the relaxed consensus, additional amino acid types at the last position of the consensus were included to comprise also the essential binding motifs derived from substitution analysis (last row). For both strict consensus and relaxed consensus, the second position following the PPG motif was excluded from all consensus sequences due to substitution analysis results (see Fig. 2).

Analysis of Phage Display Peptides—Substitution analysis of individual peptides selected by phage display was conducted to identify residues within the peptide that are important for binding. To this end, peptides were synthesized on a cellulose membrane, and the binding to GST-GYF domains was analyzed by anti-GST antibody detection (Fig. 2). The substitution analysis highlights the requirement for the PPG(F/I/L/V) motif for the GYF domains of PERQ2, SMY2, and SYH1. A large hydrophobic amino acid is clearly preferred as the most C-terminal amino acid of the motif in these cases, whereas GYN4-GYF exclusively binds to the PPGF motif. For all four domains, amino acids outside of the central PPGΦ motif of the phage display-derived peptides are not critical for binding. This suggests that hydrophobic interactions almost entirely account for the observed binding behavior. In summary, these results identify the general recognition code for the SMY2 subfamily of GYF domains to be PPG(F/I/L/M/V) with subtle domain-dependent variations in specificity (Table I and Fig. 2).
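The layout of such a substitution array (one row per peptide position, one column per replacing residue, preceded by a wild-type column) can be sketched in Python. The peptide used here is a hypothetical example rather than one of the selected phage sequences, and the column order is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; column order is assumed

def substitution_array(peptide, alphabet=AMINO_ACIDS):
    """Return one row per position: [wild type, analog with residue 1, analog with residue 2, ...]."""
    rows = []
    for pos in range(len(peptide)):
        row = [peptide]  # WT control spot in the first column
        for residue in alphabet:
            row.append(peptide[:pos] + residue + peptide[pos + 1:])
        rows.append(row)
    return rows

def unique_analogs(rows):
    """Distinct sequences on the membrane, wild type included."""
    return {seq for row in rows for seq in row}

if __name__ == "__main__":
    rows = substitution_array("SPPGLT")  # hypothetical six-residue peptide
    print(len(rows), len(rows[0]), len(unique_analogs(rows)))
```

A position tolerates substitution when most spots in its row still bind; the PPG core and the following hydrophobic residue are the rows where binding is lost.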

Proteome Analysis for GYF Domain Binding Sites—Thephage display results allowed us to define the recognitionsignature for each individual domain (Table I, strict consen-sus). Strict consensus sequences included the most fre-quently found PPGX motifs or motifs with amino acids at

position X that show physicochemical properties similar to the frequently found amino acid. The relaxed consensus motifs comprised the essential binding motifs derived from the substitution analysis. Additional positions were not taken into account for either the strict or the relaxed consensus because the substitution analysis showed that the PPGX motif is the major determinant for binding. We performed database searches in the Swiss-Prot/TrEMBL, The Arabidopsis Information Resource (TAIR), and SGD databases for the different GYF domains with relaxed and strict consensus motifs (Table I) following a strategy that has also been used by others (28). For the yeast GYF domains, the search was performed with the relaxed consensus PPG(A/C/F/G/I/L/M/V/W/Y). For the human PERQ2-GYF domain and the A. thaliana GYF domain of GYN4, the relaxed consensus motifs PPG(A/E/F/H/I/L/M/S/T/W/Y) and PPG(F/I/L/M/V/W/Y) were applied in the database search, respectively. In the latter two cases, motifs were only selected if present twice in the protein with a maximal linker distance of 40 amino acids. This accounts for avidity effects of tandem binding motifs and reduces the number of database hits below 1000. Based on our phage display results, database searches were also performed with the strict motifs PPGL and PPGF for the GYF domains of PERQ2 and GYN4, respectively, to identify optimal individual binding motifs. All human proteins (except for procollagens, collagens, and hypothetical proteins) were considered in the search for PERQ2. We found 349 non-redundant motifs in 152 proteins for the relaxed and 166 non-redundant motifs in 157 proteins for the strict consensus. Correspondingly, for A. thaliana the relaxed motif was found 34 times in 15 proteins, and the strict consensus was found 227 times in 221 proteins of the A. thaliana proteome. For S. cerevisiae, 168 motifs in 153 sequences containing the relaxed consensus were analyzed. To test binding, the peptides obtained from the database searches were synthesized on a membrane and incubated with the respective GST-GYF domain constructs. Fig. 3 shows the results for the various domains and clearly indicates that many of the sequences deduced from the database bind to the GYF domains (Supplemental Table 1). To validate the results from the proteome analysis, several interaction candidates for the different GYF domains were chosen for further analysis according to signal strength and number of consensus motifs present within the proteins (Table II). In addition, fluorescence and NMR titrations of GYF domains with selected peptides were performed. The dissociation constants for the GYF domain-peptide interactions are in the range of 4–300 μM (see Table IV and Supplemental Fig. 1) when a simple two-state binding model was assumed.

FIG. 2. Substitution analysis of peptides that were selected by phage display. All possible single substitution analogs of the peptide were synthesized on a membrane. The single letter code above each column indicates the amino acid that replaces the corresponding wild-type residue; rows define the position of the substitution within the peptide. Spots in the leftmost column (WT) are identical and represent the wild-type peptide. The membranes were incubated with GST-GYF fusion constructs of the indicated proteins and processed as described under “Materials and Methods.” PR-SYH1 includes the N-terminal intramolecular PRS of the protein (Fig. 1).
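The consensus-based database search described above can be illustrated with a short sketch. It is not the search tool used in this study; it only assumes that each relaxed consensus can be written as a regular expression and that the linker is counted as the residues between the end of one PPGX motif and the start of the next. The function and variable names are hypothetical.

```python
import re

# Relaxed consensus motifs quoted in the text, written as regular expressions.
RELAXED_CONSENSUS = {
    "yeast": "PPG[ACFGILMVWY]",
    "PERQ2": "PPG[AEFHILMSTWY]",
    "GYN4": "PPG[FILMVWY]",
}
MOTIF_LENGTH = 4   # length of a PPGX motif
MAX_LINKER = 40    # maximal linker distance in amino acids


def find_motifs(sequence, pattern):
    """Return the 0-based start positions of all matches of pattern."""
    # A lookahead is used so that overlapping matches would also be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence)]


def has_tandem_motifs(sequence, pattern, max_linker=MAX_LINKER):
    """True if two neighbouring motifs are at most max_linker residues apart.

    The linker is counted as the residues between the end of one motif and
    the start of the next (an assumption of this sketch).
    """
    starts = find_motifs(sequence, pattern)
    return any(
        0 <= second - (first + MOTIF_LENGTH) <= max_linker
        for first, second in zip(starts, starts[1:])
    )


# MSL5L1 peptide sequence from Table IV, searched with the yeast consensus.
peptide = "SIAPPPGLSGPPGFSN"
print(find_motifs(peptide, RELAXED_CONSENSUS["yeast"]))
print(has_tandem_motifs(peptide, RELAXED_CONSENSUS["yeast"]))
```

Applied to whole proteomes, the same two functions would be run over every sequence in a FASTA file, keeping proteins for which `has_tandem_motifs` is true when the tandem criterion applies.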

FIG. 3. Search for GYF domain binding sites in natural proteins. Swiss-Prot/TrEMBL, TAIR, or the SGD databases were searched for sequences meeting the relaxed or strict consensus motifs of the GYF domains of PERQ2, GYN4, SYH1, and SMY2. Selected candidates were synthesized on a membrane and incubated with the GST-GYF fusion constructs of the indicated proteins. The consensus sequences that were used as profiles in the respective searches are depicted above the membranes. Peptide spots comprising the internal recognition motif for GYN4 and SYH1, respectively, are circled.

TABLE II
Candidates from the proteome analysis

Selected candidates obtained from the proteome SPOT analysis of proline-rich peptides comprising the identified recognition motifs are shown. For the different GYF domains, candidates were selected from the respective proteome according to binding strength and number of motifs present. Portions of the proteins that comprise the putative binding sites are shown, and motifs fitting the indicated relaxed or strict consensus motifs are depicted as underlined or bold and underlined letters, respectively. The indicated fragments of candidates for PERQ2 were tested for binding in the yeast two-hybrid system with CD2 and lamin C as controls. Numbers in parentheses below the protein names refer to the position of the cloned region within the full-length protein. The two candidates MSL5 and EAP1 were subjected to pull-down analysis with SYH1 and SMY2.

Pull-down Experiments: Novel Interaction Partners of the Yeast GYF Domains—From the proteome analysis a large set of interactions was derived for the individual GYF domains. For the two yeast GYF domains of SMY2 and SYH1, the most likely interaction candidates are MSL5, EAP1, and PRP8 with seven, four, and three motifs fitting the relaxed consensus, respectively (Table II), whereas three other proteins contain the motif twice. SPOT analysis confirmed that several of the PRSs in the yeast proteins MSL5 and EAP1 present bona fide binding sites for SMY2 and SYH1 in vitro, and avidity effects likely contribute to enhanced binding in vivo. To validate MSL5 and EAP1 as interaction partners, we performed pull-down experiments with the two GST-tagged yeast GYF domains and lysates of cells overexpressing HA-tagged MSL5 or EAP1.

The experiment confirmed the interaction between the yeast GYF domains and both MSL5 and EAP1 (Fig. 4, lanes 3 and 5), and the interactions could be competed out with the proline-rich peptide MSL5L1 (Fig. 4, lanes 4 and 6, and see Table IV). This establishes that the PRSs of MSL5 and EAP1 mediate the direct interaction with SMY2 and SYH1. Interestingly, MSL5 has been reported as an interaction partner for both SMY2 and SYH1 in a yeast two-hybrid screen (see the SGD at www.yeastgenome.org), further supporting our findings. EAP1 was described as a protein binding to the cap-binding protein eIF4E (25), and subsequently a role in translational attenuation in response to vesicular transport defects has been reported (29). Because SMY2 was cloned as a suppressor of the myo2-66 mutation of type V myosin (14), a motor protein playing a decisive role in vesicular transport, this interaction establishes a particularly interesting link between transport processes and splicing or translational control. From a biochemical point of view, the binding of identical target proteins by SMY2- and SYH1-GYF is not surprising because the two GYF domains are highly homologous. The biological implications of this redundancy are not clear, but the identified interaction partners hint at a role of SMY2 and/or SYH1 in processes that are coupled to the transport of splicing factors and translational attenuators.

Yeast Two-hybrid Analysis: Novel Interaction Partners of the PERQ2- and GYN4-GYF Domains—For the GYF domain of the human PERQ2 protein we chose the three nuclear proteins SmB, NPWBP, and SWAN for further investigations by yeast two-hybrid analysis. As can be seen from Fig. 5, an interaction with PERQ2-GYF could be established for all three proteins, and the results show that the proposed interaction can take place under more physiological conditions. In addition to this knowledge-driven identification of binding partners, we performed a yeast two-hybrid screen for the human PERQ2-GYF and the A. thaliana GYN4-GYF domain. Unbiased screening of a human lung cDNA library for interaction partners of the PERQ2-GYF domain resulted in the identification of a number of putative interactors, including the small nuclear ribonucleoproteins SmB/B′, the closely related snRNP-associated protein N (snRNP-N, SmN), U1 snRNP protein C, and the SWAN protein (Fig. 6A and Table III). These findings are in good agreement with the interaction partners that were proposed by the knowledge-based strategy (Figs. 3 and 5) described elsewhere (28) because all of the above mentioned proteins contain the phage display-derived recognition motif (Table I). SmB/B′ was previously identified as a binding partner of CD2BP2 (13), and it will be interesting to further investigate whether SmB/B′ and SmN also attract the PERQ2-GYF domain in vivo. Such an interaction would imply that the two GYF domain-containing proteins converge in the recognition of proteins of the SmB/SmN family. Because U1 snRNP protein C and SWAN also represent nuclear proteins and because NPWBP colocalizes with splicing factors (30), a function of PERQ2-GYF in splicing or splicing-associated processes is conceivable. Initial localization experiments with GFP-PERQ2-GYF support this notion, showing a predominant nuclear localization (data not shown). However, the mouse homolog of PERQ2, GIGYF2, was identified as a binding partner of the cytosolic adaptor protein Grb10 by a yeast two-hybrid screen. Because human Grb10, in contrast to mouse Grb10, is devoid of a PPGΦ recognition motif, the function of the human PERQ2 and the mouse GIGYF2 protein may well be different.

For the GYN4-GYF domain of A. thaliana, the yeast two-hybrid screen resulted in the selection of At2g28540.1, At3g45630.1, At5g60170.1, and At5g65410.1 (Fig. 6B and Table III). All of these proteins contained the PPGF motif that was also selected by the phage display screen (Table I). Interestingly, the first three proteins obtained from the yeast two-hybrid analysis contained two PPGF motifs spaced by six or seven amino acids (Table III), a pattern previously found in the CD2BP2-GYF ligand CD2 (1). These proteins were also identified by our proteome analysis as potential interaction partners (Fig. 3, strict consensus) and show that the two complementary methods result in the identification of an overlapping set of proteins. One protein, At3g45630.1, has the same domain composition as the CNOT4 protein but a low overall sequence homology to it (39.5% similarity). CNOT4 has been described to be part of the CCR4·NOT transcription complex (31). It is a potential transcriptional repressor and possesses E3 protein ubiquitin ligase activity. The biochemical characterization of this complex in plants remains elusive, but our data hint toward an involvement of GYF domain-containing proteins in transcriptional regulation of A. thaliana gene expression. Based on these findings, the binding of the human CNOT4 protein to GYF domain-containing proteins should be further tested for its potential biological relevance because the human protein contains the PERQ2-GYF domain recognition motif PPGL.

FIG. 4. Pull-down of HA-tagged protein with yeast GYF domains. Lysate of yeast overexpressing HA-tagged MSL5 (A) or EAP1 (B) was incubated with GST- or GST-GYF-loaded glutathione beads in the absence (−) or presence (+) of 1 mM competing MSL5L1 peptide. Bound proteins were separated by 12% SDS-PAGE, transferred onto a cellulose membrane, and probed with anti-HA antibody. Lane 1 contains lysate, lane 2 contains GST-loaded beads, and lanes 3–6 contain GST-GYF-loaded beads.

FIG. 5. Yeast two-hybrid analyses of selected candidates. Bait plasmids containing fragments of proteins selected from the proteome analysis (Table II) were co-transformed with the prey vector construct of the PERQ2-GYF domain. Empty bait and prey vectors were used as controls. Colonies were replated on non-selective (−Leu, −Trp), medium stringency (−His, −Leu, −Trp), and high stringency media (−Ade, −His, −Leu, −Trp). Plates were analyzed after 4 days of incubation at 30 °C. Borders of cloned fragments of the different proteins are indicated in parentheses.

NMR Analysis of the SMY2/MSL5 Interaction—To obtain a detailed molecular picture of the interaction between a GYF domain of the SMY2 subfamily of GYF domains and a PRS, backbone assignments of NMR heteronuclear triple resonance spectra of the 13C/15N isotope-labeled SMY2-GYF domain were performed. This allowed us to map the binding site for the MSL5S1 peptide (Table IV). The overlay of spectra of isolated SMY2-GYF (red) with the SMY2-GYF domain in the presence of equimolar amounts (green) and a 10-fold excess (blue) of MSL5S1 peptide is shown in Fig. 7A. The deduced NH chemical shift changes of residues of the SMY2- and CD2BP2-GYF domain upon addition of MSL5S1 and CD2S peptides, respectively, are depicted in Fig. 7B. Most of the residues that show large chemical shift changes are part of the characteristic bulge-helix-bulge motif within the N-terminal half of the domains and are indicated as green bars. The three residues Leu-289, Gln-290, and Ile-291 in the C-terminal half of SMY2-GYF display significant chemical shift changes, but the homologous residues in CD2BP2-GYF only show small changes (Fig. 7B, blue bars). Because large chemical shift changes are indicative of residues that are close in space to the ligand, the data can be qualitatively used to compare the binding epitopes of SMY2 (or homologous SMY2-type GYF domains) and CD2BP2-GYF. The structure of one SMY2-type GYF domain was recently deposited in the Protein Data Bank (GYF domain from the A. thaliana protein Q9FT92, Protein Data Bank code 1WH2),2 and based on sequence comparisons (Fig. 1) we plotted the results for the SMY2-GYF domain (Fig. 7B) onto the surface of the GYF domain of Q9FT92 (Fig. 7C). Residues forming the novel binding epitope, as depicted in blue in Fig. 7, B and C, are partially solvent-exposed in the case of SMY2-like GYF domains but not in CD2BP2-GYF, which might explain the differences seen in the binding preferences for the two types of GYF domains (see “Discussion”).
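Fig. 7B reports weighted geometrical differences of the 1H and 15N chemical shifts. A minimal sketch of how such a combined value could be computed is given here. The weighting factor of 0.1 for the 15N dimension and the threshold are assumptions of this sketch, not values taken from this study, and the residue labels and shift values are hypothetical.

```python
import math

N_WEIGHT = 0.1  # assumed scaling of 15N shift changes relative to 1H


def weighted_shift_difference(delta_h, delta_n, n_weight=N_WEIGHT):
    """Combine 1H and 15N shift changes (ppm) into one weighted distance."""
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def perturbed_residues(shift_changes, threshold):
    """Return residues whose combined shift change exceeds threshold."""
    return [
        residue
        for residue, (delta_h, delta_n) in shift_changes.items()
        if weighted_shift_difference(delta_h, delta_n) > threshold
    ]


# Hypothetical (1H, 15N) shift changes in ppm, for illustration only.
example = {"res1": (0.12, 0.9), "res2": (0.08, 0.5), "res3": (0.01, 0.05)}
print(perturbed_residues(example, threshold=0.05))
```

Residues returned by `perturbed_residues` would correspond to the colored bars in a plot such as Fig. 7B, with the threshold chosen by the experimenter.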

Intramolecular Recognition of Proline-rich Motifs by GYF Domains—The two proteins GYN4 and SYH1 display internal sequence motifs that match the consensus of the respective GYF domain. Fig. 3 shows that the two GYF domains recognize these motifs when synthesized on a membrane. The presence of the internal PRS in the GYF domain construct PR-SYH1 had no influence on phage display (Table I) or pull-down experiments (Fig. 4), whereas the KSGPPPGFTG sequence of GYN4 interfered with binding in phage display (data not shown) and yeast two-hybrid experiments (Fig. 6). In contrast, the shorter GYN4-GYF construct allowed us to define a recognition motif (Table I) and to identify interaction partners (Fig. 6). Misfolding of the domain due to the proline-rich extension could be ruled out because the GST fusion protein of GYN4-PR was able to bind to peptides synthesized on a membrane where concentrations are high enough to compete out the intramolecular interaction (data not shown).

2 N. Nameki, S. Koshiba, M. Inoue, T. Kigawa, and S. Yokoyama, unpublished data.

FIG. 6. Analysis of yeast two-hybrid screening results. Clones of interaction partners for PERQ2-GYF (A) or GYN4-GYF (B) selected by yeast two-hybrid screens were retransformed with different bait constructs and spread on non-selective (−Leu, −Trp) or high stringency medium (−Ade, −His, −Leu, −Trp). Plates were analyzed after 4 days of incubation at 30 °C. The bait constructs containing the GYF domains of PERQ2, GYN4, or GYN4 containing the internal PRS are indicated by PERQ2, GYN4, and GYN4-PR, respectively. Empty bait and prey vectors as well as the bait vector containing the unrelated protein lamin C (Lam) serve as negative controls. Names or accession numbers of interaction partners are shown on the left side of the panels.


TABLE III
Yeast two-hybrid screening results

Interaction partners for PERQ2- and GYN4-GYF identified by yeast two-hybrid screening of a human lung and an A. thaliana cDNA library, respectively. Protein codes refer to Swiss-Prot/TrEMBL and TAIR accession codes and names. In the column “Total/Individual,” the total number of identified clones encoding the same interaction partner found on selective plates and the number of individual clones of the respective interaction partners are shown. Residue numbers and sequences of the protein fragments identified in all clones for the individual proteins are listed in the columns “Region” and “Sequence,” respectively. For 10 of the 15 clones of At2g28540.1, the insert was too long to allow complete sequencing of the indicated fragments with the standard forward sequencing primer.

Taken together, these findings suggest that interaction in cis between the GYN4-GYF domain and the internal PRS masks its binding site, whereas for the SYH1 protein the intramolecular sequence does not influence the interaction competence of its GYF domain. Further support for an intramolecular encounter between GYN4-GYF and its internal PRS is derived from fluorescence titration experiments (Fig. 8). For the GYF domain-only construct (residues 546–604 of GYN4), a KD value of ~4 μM for binding to the GYN4 peptide was obtained (Table IV). The presence of the internal recognition motif within the construct GYN4-PR decreased the affinity to the titrated GYN4 peptide by a factor of 8. In addition, the spectra of GYN4-PR and of the shorter construct in its bound form display similar centroid positions (Fig. 8). The most likely explanation for the different behavior of GYN4-PR is the intramolecular binding of a large fraction of the GYF domain to the internal PRS. Additional binding is only observed at a large excess of externally added peptide. The assumption of an intramolecular interaction is substantiated by studies on artificial CD2BP2-GYF single chain constructs with N- or C-terminally linked interaction sites (Fig. 1) that were used to obtain NMR restraints for structural investigations of the CD2BP2/CD2 interaction (27). Only C-terminal linkage resulted in the formation of a soluble, intramolecular interaction and supported the view that the CD2 peptide SHRPPPPGHRV binds exclusively in one orientation to the CD2BP2-GYF domain (27). Similarly, the linker between the GYF domain and the C-terminal interaction site within the GYN4 protein is of sufficient length to enable the autoinhibited conformation, whereas the attachment of a PRS directly N-terminal to the GYF domain, as it is present within the SYH1 protein, excludes such an encounter. The observation of intramolecular masking is likely to be of functional relevance and might prevent unwanted low affinity interactions from taking place. Co-compartmentalization of a protein containing a high affinity binding site can be envisaged to outcompete the intramolecular interaction under appropriate conditions. In addition, post-translational modification of residues close to the binding site, for example phosphorylation or methylation, might act as a regulatory switch for the GYF domain-mediated interactions.
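The 8-fold loss of apparent affinity for GYN4-PR can be related to the fraction of autoinhibited domain with a simple worked example. It assumes a hypothetical three-state scheme that is not fitted in this study: the domain is either open or closed onto its internal PRS (equilibrium constant K_intra = [closed]/[open]), and the external peptide binds only the open form, so that KD(apparent) = KD(open) × (1 + K_intra). The KD values of 4 and 31 μM are those listed in Table IV for GYN4 and GYN4-PR.

```python
def intramolecular_equilibrium(kd_open, kd_apparent):
    """Estimate K_intra and the closed fraction from two KD values.

    Assumes KD(apparent) = KD(open) * (1 + K_intra), i.e. the external
    ligand binds only the open form (an assumption of this sketch).
    """
    k_intra = kd_apparent / kd_open - 1.0
    fraction_closed = k_intra / (1.0 + k_intra)
    return k_intra, fraction_closed


# KD values in micromolar for GYN4 (domain only) and GYN4-PR (Table IV).
k_intra, closed = intramolecular_equilibrium(kd_open=4.0, kd_apparent=31.0)
print(f"K_intra = {k_intra:.2f}, closed fraction = {closed:.2f}")
# prints "K_intra = 6.75, closed fraction = 0.87"
```

Under these assumptions, roughly 87% of the GYN4-PR domain would be occupied by its own PRS in the absence of external peptide, which is in line with the interpretation that a large fraction of the domain is bound intramolecularly.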

DISCUSSION

In this first comprehensive study of GYF domains of the SMY2 subfamily, it is shown that PRS recognition of these domains converges on the four-amino acid recognition motif PPGΦ. Because all of the analyzed GYF domains contain the conserved signature (Y/F/W)X(Y/F)X6–11GPFX4(M/V/I)X2WX3GYF, it is very likely that the hydrophobic cavity formed mainly by the aromatic side chains of this signature constitutes the main site of interaction with the PPGΦ moiety. Although there is no three-dimensional complex structure available for a GYF domain of the large SMY2 subfamily, the high conservation of the peptide-binding signature of GYF domains allows us to put our new results into a structural context and to compare the recognition code defined in this study with the well characterized CD2BP2-GYF·CD2 peptide complex (see below). Defining the recognition rules sets the stage for further functional investigations of GYF domain-containing proteins. Although our results reveal a considerable number of possible protein targets for GYF domains, yeast two-hybrid analysis and pull-down experiments indicate that the selected interactions could take place under more physiological conditions.

PERQ2—It was of some surprise to find the core splicing protein SmB/B′ as the most prominent interaction partner obtained from the yeast two-hybrid analysis of the PERQ2-GYF domain. SmB/B′ contains numerous proline-rich motifs in its C-terminal tail that represent several binding sites for GYF domains. It was previously shown that CD2BP2 and SmB/B′ interact and colocalize within the same nuclear compartment and that the PPPGMR motifs within the SmB/B′ tail represent the target sites for CD2BP2-GYF (2, 13). According to our data, PERQ2-GYF can also bind to these PPPGMR motifs. It is therefore plausible that CD2BP2 and PERQ2 can simultaneously interact with SmB/B′ under conditions where SmB/B′ is not limiting. Alternatively, CD2BP2 could be the major interaction partner for SmB/B′ in most cells, whereas PERQ2 could preferentially bind to SmN, the brain-specific variant of SmB/B′. Because the two human proteins CD2BP2 and PERQ2 display no sequence homology in regions outside the GYF domain, functional redundancy of the two proteins is unlikely. Rather, the two proteins may utilize the SmB/B′ recruitment in different functional contexts. The third human protein containing a GYF domain is PERQ1. Its GYF domain shows a high degree of sequence homology to PERQ2-GYF and is therefore expected to exhibit very similar binding specificities. Competition for the same interaction partners, for example SmB/B′, is a possible scenario for PERQ1 and PERQ2 proteins, and the presence of homologous regions in the two proteins suggests similar but not necessarily identical functions. The role of SmB/B′ as part of the seven-protein core of all snRNP complexes is firmly established and supports its role as an important structural component of the spliceosomal machinery (32). A recent report identified SmB/B′ at the sites of initial adhesion of certain primary cells, so-called spreading initiation centers (33). Strikingly, no mRNA was found at these sites, opening the possibility that SmB/B′ and also other Sm proteins play an important role in processes other than splicing. It is also known that the snRNPs undergo a complex maturation process in Homo sapiens that includes shuttling between the nuclear and cytoplasmic compartment (32), and the GYF domain-containing proteins could be involved in transport and/or localization of SmB/B′ and associated proteins. Interestingly, Bedford et al. (34, 35) have described the proline-rich motifs RPP and PPR within NPWBP and SmB/B′ as interaction sites for the WW domains of FBP21 and FBP30. Furthermore, these motifs are also recognized by the SH3 domains of Fyn and p85, and assembly of different signaling proteins with these sequences has been suggested (34). Our results extend the list of proline-rich sequence recognition domains with similar or overlapping binding motifs to include subgroups of WW, SH3, and GYF domains. Several different signaling pathways could converge on the protein SmB/B′ mediated by the interactions of FBP21 and FBP30 WW domains, SH3 domain-containing proteins, and CD2BP2- and PERQ2-GYF. However, additional information concerning the subcellular localization of PERQ2 is required to validate the proposed functions of PERQ2 associated with SmB/B′ or its variants.

Yeast GYF Domains—In yeast, two-hybrid experiments have previously identified MSL5 as an interaction partner of the homologous proteins SMY2 and SYH1 (see www.yeastgenome.org/). Our analysis shows that the GYF domains of the two proteins strongly interact with the PPGΦ motif-containing MSL5 C-terminal tail. MSL5 is the yeast branch point-binding protein that is involved in spliceosome assembly (36). Global analysis of yeast protein localization suggests a predominant nuclear occurrence of MSL5, whereas SMY2 and SYH1 are mostly found in the cytoplasm (37). A candidate for a cytoplasmic interaction partner of SMY2 and SYH1 is EAP1, which could also be detected by pull-down experiments of cellular lysates (Fig. 4). EAP1 was originally found as a translational inhibitor protein that competes with eIF4G and p20 for interaction with the eIF4E cap-binding protein. In addition, EAP1 has a separate function in mediating genetic stability (38), and a recent report highlights the involvement of EAP1 in the attenuation of translation in cells with mutations in the secretory pathway (29). Further investigations should address the question of whether the two GYF domain-containing proteins are involved in transport processes that depend on the cap structure of mRNA.

Comparing Different Classes of GYF Domains—The SMY2- and CD2BP2-GYF domains represent two subfamilies of GYF domains. All the GYF domains investigated here belong to the SMY2 type. This large subfamily is characterized by a short β1-β2 loop and an aspartate as the last residue of the β1 strand (position 8), whereas the CD2BP2-GYF domain contains a tryptophan (Trp-8) at this position. Trp-8 contributes to binding of the CD2 ligand SHRPPPPGHRV by directly contacting glycine 8 and arginine 10 of this peptide. At the same time, the bulky Trp-8 side chain of the domain shields the conserved Tyr-6 and Phe-34 residues from solvent. A recent NMR structure of the A. thaliana Q9FT92-GYF domain (Protein Data Bank code 1WH2) shows that Asp-8 allows a partial solvent exposure of the conserved aromatic residues Tyr-6 and Phe-34, which are now able to contribute to the hydrophobic surface that most likely represents the interaction site for the proline-rich ligand. Fig. 7C shows a comparison of the molecular surface of the CD2BP2-GYF domain and the Q9FT92-GYF domain from A. thaliana. The binding epitope for Q9FT92-GYF/ligand interactions is not known; however, alignment of the SMY2- and Q9FT92-GYF domain (Fig. 1) was used to color the suggested Q9FT92 epitope based on the results for SMY2 (Fig. 7B). The substitution W8D in Q9FT92 extends the hydrophobic cleft in the A. thaliana GYF domain, probably allowing the ligand to bind along its main axis as indicated by the red line (Fig. 7C). This mode of binding is supported by the observation of significant chemical shift changes for residues 38–40 of the SMY2-GYF domain (indicated as a blue surface) when interacting with the MSL5S1 peptide. A reorganization of the binding pocket by the W8D replacement could explain the preference for the large hydrophobic residues Phe, Leu, and Ile that was found in the consensus motifs of all GYF domains analyzed in this study. On the other hand, our results with CD2BP2-GYF show that the PPGW motif represents an optimal binding motif (39) that is rarely found in the phage display sequences of the SMY2 subfamily of GYF domains. We propose that the tryptophan of the ligand forms a “stacked” aromatic interaction with the Trp-8 of CD2BP2-GYF at the surface of the domain, whereas it is too bulky to be optimally placed in the deep binding pocket of the SMY2-GYF subfamily. In conclusion, the data suggest a common binding mode for PPGΦ-containing ligands of the SMY2 subfamily of GYF domains. For the CD2BP2-GYF domain, two classes of ligands were identified, the PPGW class and the PPGX(R/K) class (CD2 class), the latter requiring a positively charged amino acid within the flanking regions of the PPG core. The PPGW class for the CD2BP2-GYF domain and the PPGΦ class of ligands are charge-independent and dominated by hydrophobic interactions. However, although stacked aromatic interactions at the surface of the domain probably characterize the CD2BP2-GYF/ligand interaction, the hydrophobic residue of the PPGΦ motif is expected to insert into the extended hydrophobic cleft of the SMY2-GYF domain family (Fig. 7C).

FIG. 7. SMY2-GYF epitope mapping and comparison with the CD2BP2-GYF binding epitope. A, overlay of 15N-1H correlation spectra of the isolated SMY2-GYF domain (red), the SMY2-GYF domain upon equimolar addition of the MSL5S1 peptide (green), and the SMY2-GYF domain in the presence of a 10-fold excess of MSL5S1 peptide (blue). NH resonances of residues with weighted geometrical differences of the chemical shifts colored in green or blue in B are labeled according to residue type and number. B, combined chemical shift changes of CD2BP2-GYF and SMY2-GYF for their respective ligands, CD2S and MSL5S1. The weighted geometrical differences of the chemical shifts for each residue of the domains upon addition of a 10-fold excess of peptide are plotted against the corresponding sequence. Residues of both domains (black for CD2BP2-GYF and red for SMY2-GYF) have been aligned for a better comparison. The key residues that define the conserved GYF domain signature are depicted as white letters on a black background. Weighted geometrical differences of the chemical shifts of residues comprising the common binding epitope of CD2BP2-GYF and SMY2-GYF are depicted as green bars. Blue bars belong to residues that extend the binding epitope in SMY2-GYF. The NMR titration experiments were performed at 297 K with a 0.2 mM sample of the 15N-labeled CD2BP2- or SMY2-GYF domain in 50 mM sodium phosphate buffer, pH 6.3, and PBS, pH 7.3, respectively. C, proposed model of ligand binding for the SMY2 subfamily in comparison with the known orientation of the CD2BP2-GYF ligand CD2. Because the structure of the SMY2-GYF domain is not known so far, the binding epitope of SMY2-GYF·MSL5S1 was plotted onto the structure of Q9FT92-GYF based on sequence homology (Fig. 1). The known orientation of a CD2 class ligand bound to CD2BP2-GYF (left) and the suggested orientation of a ligand for the SMY2 domain (right) are depicted as a red line above the indicated epitopes. The binding epitopes are color coded as in B. Residues Lys-7, Pro-19, and Pro-33 in CD2BP2-GYF and the corresponding residues in Q9FT92 are also depicted in green because they are part of the binding epitope. Prolines cannot be observed in 15N heteronuclear single quantum coherence spectra, whereas the backbone NH resonance of Lys-7 displayed line broadening and could therefore not be followed during the titration.

FIG. 8. Fluorescence titration of the GYN4-GYF domain with the intramolecular proline-rich peptide of GYN4. Increasing amounts of the peptide were added to GST fusions of GYN4-GYF excluding (GYN4; gray diamonds) or including (GYN4-PR; black triangles) this intramolecular recognition sequence. Fitting curves (gray and black) were calculated with the Microcal Origin program assuming a simple two-state binding model.

TABLE IV
Titration data of different GYF domains

Different GYF domains were tested with various ligands. KD values were either determined by using NMR chemical shift changes (NMR) or the shift of the fluorescence emission spectrum (Fluorescence), defined by the change in the position of the centroid, of the respective GYF domain upon ligand addition. KD values were calculated assuming a two-state binding model (see also Supplemental Fig. 1).

GYF domain   Ligand     Sequence                      KD in μM (NMR, Fluorescence)
PERQ2        CD2S       SHRPPPPGHRV                   302
             CD2L       HPPPPPGHRSQAPSHRPPPPGHRVQH    162
             SmB2       GTPMGMPPPGMRPPPPGMRGLL        �20
             snRNPA     MPPPGMIPPPGLAPGQIPPGAM        �33
             AKNA       VSMKPPGFQAS                   �100
SMY2         MSL5S1     SSIAPPPGLSG                   43, 14
             MSL5L1     SIAPPPGLSGPPGFSN              25, 8
             MSL5L2     DINKPTPPGLQGPPGL              47, 32
SYH1         MSL5S1     SSIAPPPGLSG                   12
             MSL5L1     SIAPPPGLSGPPGFSN              5
             MSL5L2     DINKPTPPGLQGPPGL              27
GYN4         Internal   AKSGPPPGFTGAKQN               4
GYN4-PR      Internal   AKSGPPPGFTGAKQN               31
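The KD values in Table IV and the fitting curves in Fig. 8 rest on a simple two-state (1:1) binding model. The sketch below shows one way such a model could be evaluated and fitted. It is not the Microcal Origin procedure used in this study: it assumes that the observed signal (chemical shift change or centroid shift) has already been normalized to a fraction bound, and it replaces nonlinear regression with a plain grid search. All concentrations and data points are hypothetical.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of domain bound to ligand in a 1:1 two-state model.

    Solves the mass-action quadratic, so ligand depletion is accounted for.
    All three arguments must share the same concentration unit.
    """
    s = p_total + l_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


def fit_kd(p_total, ligand_concs, observed, kd_grid):
    """Return the KD from kd_grid with the smallest sum of squared residuals."""
    def sse(kd):
        return sum(
            (fraction_bound(p_total, l_total, kd) - obs) ** 2
            for l_total, obs in zip(ligand_concs, observed)
        )
    return min(kd_grid, key=sse)


# Hypothetical titration: 10 uM domain, noise-free data simulated with KD = 14 uM.
domain_conc = 10.0
ligand_series = [5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
simulated = [fraction_bound(domain_conc, l, 14.0) for l in ligand_series]
print(fit_kd(domain_conc, ligand_series, simulated, kd_grid=range(1, 101)))
```

With real titration data a least-squares optimizer would normally replace the grid search, and the signal amplitude at saturation would be fitted together with KD.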

Acknowledgments—We are thankful to Gianni Cesareni for providing the pC89 vector and the randomized X9 phage library. We are also grateful to Angelika Ehrlich for peptide membrane synthesis, M. Beyermann for peptide synthesis, and Sebastian Modersohn for sharing the SpecWin software. Furthermore, we thank Nahum Sonenberg and Michael Rosbash for providing EAP1 and MSL5 clones, respectively. We acknowledge Benjamin Horwitz, Hong Ma, and the Arabidopsis Biological Resource Center for providing the A. thaliana Horwitz and Ma two-hybrid cDNA library and Takahiro Nagase and Satoshi Tabata from the Kazusa DNA Research Institute for providing the cDNA clone HJ03496 and the genomic P1 clone MBD2, respectively. We also thank Christine Lang for the yeast strain BY4741.

* This work was supported by Deutsche Forschungsgemeinschaft Grant FR 1325/2-1 (to C. F.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

□S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.

‡ To whom correspondence should be addressed. Tel.: 49-030-94793-181; Fax: 49-030-94793-189; E-mail: [email protected].

REFERENCES

1. Nishizawa, K., Freund, C., Li, J., Wagner, G., and Reinherz, E. L. (1998) Identification of a proline-binding motif regulating CD2-triggered T lymphocyte activation. Proc. Natl. Acad. Sci. U. S. A. 95, 14897–14902

2. Freund, C., Dotsch, V., Nishizawa, K., Reinherz, E. L., and Wagner, G. (1999) The GYF domain is a novel structural fold that is involved in lymphoid signaling through proline-rich sequences. Nat. Struct. Biol. 6, 656–660

3. Mayer, B. J., Hamaguchi, M., and Hanafusa, H. (1988) A novel viral oncogene with structural similarity to phospholipase C. Nature 332, 272–275

4. Stahl, M. L., Ferenz, C. R., Kelleher, K. L., Kriz, R. W., and Knopf, J. L. (1988) Sequence similarity of phospholipase C with the non-catalytic region of src. Nature 332, 269–272

5. Bork, P., and Sudol, M. (1994) The WW domain: a signalling site in dystrophin? Trends Biochem. Sci. 19, 531–533

6. Niebuhr, K., Ebel, F., Frank, R., Reinhard, M., Domann, E., Carl, U. D., Walter, U., Gertler, F. B., Wehland, J., and Chakraborty, T. (1997) A novel proline-rich motif present in ActA of Listeria monocytogenes and cytoskeletal proteins is the ligand for the EVH1 domain, a protein module present in the Ena/VASP family. EMBO J. 16, 5433–5444

7. Sancho, E., Vila, M. R., Sanchez-Pulido, L., Lozano, J. J., Paciucci, R., Nadal, M., Fox, M., Harvey, C., Bercovich, B., Loukili, N., Ciechanover, A., Lin, S. L., Sanz, F., Estivill, X., Valencia, A., and Thomson, T. M. (1998) Role of UEV-1, an inactive variant of the E2 ubiquitin-conjugating enzymes, in in vitro differentiation and cell cycle behavior of HT-29-M6 intestinal mucosecretory cells. Mol. Cell. Biol. 18, 576–589

8. Pornillos, O., Alam, S. L., Davis, D. R., and Sundquist, W. I. (2002) Structure of the Tsg101 UEV domain in complex with the PTAP motif of the HIV-1 p6 protein. Nat. Struct. Biol. 9, 812–817

9. Carlsson, L., Nystrom, L. E., Sundkvist, I., Markey, F., and Lindberg, U. (1977) Actin polymerizability is influenced by profilin, a low molecular weight protein in non-muscle cells. J. Mol. Biol. 115, 465–483

10. Freund, C., Kuhne, R., Yang, H., Park, S., Reinherz, E. L., and Wagner, G. (2002) Dynamic interaction of CD2 with the GYF and the SH3 domain of compartmentalized effector molecules. EMBO J. 21, 5985–5995

11. Hartmuth, K., Urlaub, H., Vornlocher, H. P., Will, C. L., Gentzel, M., Wilm, M., and Luhrmann, R. (2002) Protein composition of human prespliceosomes isolated by a tobramycin affinity-selection method. Proc. Natl. Acad. Sci. U. S. A. 99, 16719–16724

12. Laggerbauer, B., Liu, S., Makarov, E., Vornlocher, H. P., Makarova, O., Ingelfinger, D., Achsel, T., and Luhrmann, R. (2005) The human U5 snRNP 52K protein (CD2BP2) interacts with U5–102K (hPrp6), a U4/U6.U5 tri-snRNP bridging protein, but dissociates upon tri-snRNP formation. RNA 11, 598–608

13. Kofler, M., Heuer, K., Zech, T., and Freund, C. (2004) Recognition sequences for the GYF domain reveal a possible spliceosomal function of CD2BP2. J. Biol. Chem. 279, 28292–28297

14. Lillie, S. H., and Brown, S. S. (1992) Suppression of a myosin defect by a kinesin-related gene. Nature 356, 358–361

15. Lennon, G., Auffray, C., Polymeropoulos, M., and Soares, M. B. (1996) The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression. Genomics 33, 151–152

16. Freund, C., Ross, A., Guth, B., Pluckthun, A., and Holak, T. A. (1993) Characterization of the linker peptide of the single-chain Fv fragment of an antibody by NMR spectroscopy. FEBS Lett. 320, 97–100

17. Felici, F., Castagnoli, L., Musacchio, A., Jappelli, R., and Cesareni, G. (1991) Selection of antibody ligands from a large library of oligopeptides expressed on a multivalent exposition vector. J. Mol. Biol. 222, 301–310

18. Yamamoto, K. R., Alberts, B. M., Benzinger, R., Lawhorne, L., and Treiber, G. (1970) Rapid bacteriophage sedimentation in the presence of polyethylene glycol and its application to large-scale virus purification. Virology 40, 734–744

19. Kramer, A., and Schneider-Mergener, J. (1998) Synthesis and screening of peptide libraries on continuous cellulose membrane supports. Methods Mol. Biol. 87, 25–39

20. Heuer, K., Kofler, M., Langdon, G., Thiemke, K., and Freund, C. (2004) Structure of a helically extended SH3 domain of the T cell adapter protein ADAP. Structure (Camb.) 12, 603–610

21. Grzesiek, S., and Bax, A. (1992) Correlating backbone amide and side-chain resonances in larger proteins by multiple relayed triple resonance NMR. J. Am. Chem. Soc. 114, 6291–6293

22. Grzesiek, S., and Bax, A. (1992) An efficient experiment for sequential backbone assignment of medium-sized isotopically enriched proteins. J. Magn. Reson. 99, 201–207

23. Kay, L. E., Ikura, M., Tschudin, R., and Bax, A. (1990) 3-Dimensional triple-resonance NMR-spectroscopy of isotopically enriched proteins. J. Magn. Reson. 89, 496–514

24. Fan, H. Y., Hu, Y., Tudor, M., and Ma, H. (1997) Specific interactions between the K domains of AG and AGLs, members of the MADS domain family of DNA binding proteins. Plant J. 12, 999–1010

25. Cosentino, G. P., Schmelzle, T., Haghighat, A., Helliwell, S. B., Hall, M. N., and Sonenberg, N. (2000) Eap1p, a novel eukaryotic translation initiation factor 4E-associated protein in Saccharomyces cerevisiae. Mol. Cell. Biol. 20, 4604–4613

26. Abovich, N., and Rosbash, M. (1997) Cross-intron bridging interactions in the yeast commitment complex are conserved in mammals. Cell 89, 403–412

27. Freund, C., Kuhne, R., Park, S., Thiemke, K., Reinherz, E. L., and Wagner, G. (2003) Structural investigations of a GYF domain covalently linked to a proline-rich peptide. J. Biomol. NMR 27, 143–149

28. Landgraf, C., Panni, S., Montecchi-Palazzi, L., Castagnoli, L., Schneider-Mergener, J., Volkmer-Engert, R., and Cesareni, G. (2004) Protein interaction networks by proteome peptide scanning. PLoS Biol. 2, 94–103

29. Deloche, O., de la Cruz, J., Kressler, D., Doere, M., and Linder, P. (2004) A membrane transport defect leads to a rapid attenuation of translation initiation in Saccharomyces cerevisiae. Mol. Cell 13, 357–366

30. Craggs, G., Finan, P. M., Lawson, D., Wingfield, J., Perera, T., Gadher, S., Totty, N. F., and Kellie, S. (2001) A nuclear SH3 domain-binding protein that colocalizes with mRNA splicing factors and intermediate filament-containing perinuclear networks. J. Biol. Chem. 276, 30552–30560

31. Albert, T. K., Hanzawa, H., Legtenberg, Y. I., de Ruwe, M. J., van den Heuvel, F. A., Collart, M. A., Boelens, R., and Timmers, H. T. (2002) Identification of a ubiquitin-protein ligase subunit within the CCR4-NOT transcription repressor complex. EMBO J. 21, 355–364

32. Will, C. L., and Luhrmann, R. (2001) Spliceosomal UsnRNP biogenesis, structure and function. Curr. Opin. Cell Biol. 13, 290–301

33. de Hoog, C. L., Foster, L. J., and Mann, M. (2004) RNA and RNA binding proteins participate in early stages of cell spreading through spreading initiation centers. Cell 117, 649–662

34. Bedford, M. T., Frankel, A., Yaffe, M. B., Clarke, S., Leder, P., and Richard, S. (2000) Arginine methylation inhibits the binding of proline-rich ligands to Src homology 3, but not WW, domains. J. Biol. Chem. 275, 16030–16036

35. Bedford, M. T., Reed, R., and Leder, P. (1998) WW domain-mediated interactions reveal a spliceosome-associated protein that binds a third class of proline-rich motif: the proline glycine and methionine-rich motif. Proc. Natl. Acad. Sci. U. S. A. 95, 10602–10607

36. Rutz, B., and Seraphin, B. (2000) A dual role for BBP/ScSF1 in nuclear pre-mRNA retention and splicing. EMBO J. 19, 1873–1886

37. Huh, W. K., Falvo, J. V., Gerke, L. C., Carroll, A. S., Howson, R. W., Weissman, J. S., and O’Shea, E. K. (2003) Global analysis of protein localization in budding yeast. Nature 425, 686–691

38. Chial, H. J., Stemm-Wolf, A. J., McBratney, S., and Winey, M. (2000) Yeast Eap1p, an eIF4E-associated protein, has a separate function involving genetic stability. Curr. Biol. 10, 1519–1522

39. Kofler, M. M., Motzny, K., Beyermann, M., and Freund, C. (2005) Novel interaction partners of the CD2BP2-GYF domain. J. Biol. Chem., in press

40. Goddard, T. D., and Kneller, D. G. (2000) SPARKY 3, Version 3.1, University of California, San Francisco, CA
