

Mapping low-affinity/high-specificity peptide–protein interactions using ligand-footprinting mass spectrometry

Benjamin W. Parkerᵃ, Edward J. Gonczᵃ, David T. Kristᵇ, Alexander V. Statsyukᵇ,ᶜ, Alexey I. Nesvizhskiiᵈ,ᵉ, and Eric L. Weissᵃ,¹

ᵃDepartment of Molecular Biosciences, Northwestern University, Evanston, IL 60208; ᵇChemistry of Life Processes Institute, Department of Chemistry, Northwestern University, Evanston, IL 60208; ᶜDepartment of Pharmacological and Pharmaceutical Sciences, College of Pharmacy, University of Houston, Houston, TX 77204; ᵈDepartment of Pathology, University of Michigan, Ann Arbor, MI 48109; and ᵉDepartment of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109

Edited by Natalie G. Ahn, University of Colorado, Boulder, CO, and approved September 10, 2019 (received for review November 16, 2018)

Short linear peptide motifs that are intracellular ligands of folded proteins are a modular, incompletely understood molecular interaction language in signaling systems. Such motifs, which frequently occur in intrinsically disordered protein regions, often bind partner proteins with modest affinity and are difficult to study with conventional structural biology methods. We developed LiF-MS (ligand-footprinting mass spectrometry), a method to map peptide binding sites on folded protein domains that allows consideration of their dynamic disorder, and used it to analyze a set of D-motif peptide–mitogen-activated protein kinase (MAPK) associations to validate the approach and define unknown binding structures. LiF-MS peptide ligands carry a short-lived, indiscriminately reactive cleavable crosslinker that marks contacts close to ligand binding sites with high specificity. Each marked amino acid provides an independent constraint for a set of directed peptide–protein docking simulations, which are analyzed by agglomerative hierarchical clustering. We found that LiF-MS provides accurate ab initio identification of ligand binding surfaces and a view of potential binding ensembles of a set of D-motif peptide–MAPK associations. Our analysis provides an MKK4–JNK1 structural model, which has thus far been crystallographically unattainable, a potential alternate binding mode for part of the NFAT4–JNK interaction, and evidence of bidirectional association of MKK4 peptide with ERK2. Overall, we find that LiF-MS is an effective noncrystallographic way to understand how short linear motifs associate with specific sites on folded protein domains at the level of individual amino acids.

docking interactions | peptide ligands | MAP kinases | disordered protein | mass spectrometry

SLiMs (short linear motifs) are small regions of larger polypeptide chains, usually 3 to 14 amino acids long, that interact with folded protein domains in a biochemically specific and functionally meaningful way (1, 2). A speculative estimate suggests there are over a million SLiMs in the human proteome, with 150,000 distinct functional classes (3); specific SLiMs may be linked to disease processes (4). Interaction of SLiMs with specific folded domains, referred to as “peptide docking,” occurs in diverse cellular processes such as transcription and signaling. Exacting analysis of intracellular peptide ligands, notably of SH2 and SH3 domains (5–7) and mitogen-activated protein kinase (MAPK) peptide docking sites (8, 9), provides a picture of short molecular patterns that bind folded domains dynamically, often with high specificity but modest affinity. In general, peptide ligand SLiMs do not have a stable structure when not bound to an interacting partner, and they adopt diverse conformations that are difficult to predict when associated with binding surfaces (10, 11). This structural flexibility makes identifying and characterizing sites of peptide ligand–protein interaction exceptionally challenging. Moreover, SLiMs frequently occur in rapidly evolving “intrinsically disordered” polypeptide regions, and it has become clear that the presence, distribution, and precise amino acid sequences of SLiMs change quickly over evolution (12). The number of classes of functional peptide ligands present in eukaryotic cells thus remains obscure, and SLiMs represent a dimly understood language of macromolecular interaction.

A significant number of ligand–protein interactions are not understood in biochemical and structural detail (3). This is partly because this kind of analysis largely relies on X-ray crystallography and NMR structure determination, which are technically challenging and often do not account for conformational flexibility present in modest-affinity interactions between peptide motifs and folded protein domains (8, 13). It is helpful to have complementary experimental ways to characterize peptide ligand binding. Hydrogen–deuterium exchange/mass spectrometry (MS), for example, uses quantitative MS/MS to reveal direct and indirect structural effects of peptide binding (14). Chemical modification approaches, which have provided deep insight into the structure of DNA–protein complexes and RNA (15, 16), could also provide useful structural information. However, conventional crosslinking reagents cannot be used in this way because

Significance

Short peptides can act as chemical “words” that bind specific sites on folded proteins. These interactions underlie a large range of dynamic phenomena, but weak binding and conformational heterogeneity of the peptides make them difficult to study. We developed a simple technique, ligand-footprinting mass spectrometry, that maps peptide ligand binding on folded domains with a cleavable crosslinker that leaves behind covalent marks of distinct mass. We used this approach to characterize functionally crucial binding of D-motif peptides to human mitogen-activated protein kinases (MAPKs), validating the method and revealing information about binding sites and conformational dynamics of these peptides as they “dock” with their MAPK partner. Overall, this approach will help decode a heretofore dimly understood language of dynamic protein interaction.

Author contributions: B.W.P., E.J.G., A.V.S., and E.L.W. designed research; B.W.P., E.J.G., and D.T.K. performed research; D.T.K., A.V.S., and A.I.N. contributed new reagents/analytic tools; B.W.P., A.V.S., A.I.N., and E.L.W. analyzed data; and B.W.P. and E.L.W. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Published under the PNAS license.

Data deposition: The data reported in this paper have been deposited in the MassIVE database, ftp://massive.ucsd.edu/MSV000083748/ (accession no. MSV000083748).

¹To whom correspondence may be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1819533116/-/DCSupplemental.

First published October 2, 2019.

www.pnas.org/cgi/doi/10.1073/pnas.1819533116 PNAS | October 15, 2019 | vol. 116 | no. 42 | 21001–21011

BIOCHEMISTRY


they are selective for side-chain primary amines or sulfhydryls, which are not necessarily close to relevant binding surfaces. Recently, however, new crosslinking reagents that are highly reactive and relatively nonspecific have proven useful in protein–protein interaction studies and show promise for protein structure mapping (17, 18).

We have developed ligand-footprinting mass spectrometry (LiF-MS), a simple and robust way to map protein surfaces that bind specific peptide motifs by combining ligand-directed chemical modification with computational modeling of peptide binding. The approach uses a highly reactive photoactivated crosslinker, tethered to peptide ligands of interest, that can be cleaved to leave small chemical marks at crosslink sites (Fig. 1A). In addition to a sulfur-reactive iodoacetamide for attachment to cysteine in a peptide ligand, this reagent comprises 1) a diazirine moiety that crosslinks relatively nonspecifically to nearby molecules and 2) a sulfamide cleavable by acid treatment, which releases the peptide ligand and leaves a covalent butanol modification at the site of crosslinking (Fig. 1A). Diazirine is the smallest commonly used photoactivatable group (19) and an ideal crosslinker for ligand-directed labeling of peptide binding sites. Upon irradiation with long-wavelength ultraviolet (UV) light, it produces a carbene that quickly reacts with nearby C–C, C–H, O–H, and other bonds. The carbene also reacts very quickly with solvent and thus consistently crosslinks only to protein surfaces near which it is frequently positioned (19, 20). Covalent marks left after crosslinker sulfamide cleavage are identifiable by MS, providing constraints for independent computational models of peptide binding. Peptides of interest may also contain other functional elements, such as biotin, that facilitate conjugate detection and affinity purification (Fig. 1B).

Peptide ligand binding to MAPKs is paradigmatic of functional short motif docking. In particular, association of “D-motif” peptides in substrates and partner proteins with a conserved groove on the MAPK C-lobe plays an important role in these kinases’ signaling. Analysis of vertebrate JNK1, ERK2, and p38α association with D-motif peptides has identified mechanisms of peptide binding and subtype specificity (8, 9, 21–25).

Fig. 1. Overview of the LiF-MS workflow. (A) General structure of the synthetic docking peptides used in this study. Peptides are alkylated with an SN2 reaction between the crosslinker iodoacetamide group and an added cysteine within the peptide. Irreversible crosslinking upon UV irradiation followed by acid cleavage yields a 72-Da butanol mark on amino acids on the interacting protein. (B) LiF-MS footprinting methodology. Biotinylated peptides are crosslinked to folded protein domains, the crosslinker is cleaved, and LC-MS/MS is used to detect butanol-modified residues. The resulting marked residues are mapped in silico to identify the potential docking site. (C) LiF-MS can detect conformational dynamics exhibited by disordered docking peptides. Cartoon of potential peptide conformations is shown. Each population of peptide conformation can be detected by LC-MS/MS and then used to map a radius of interaction.
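The mass shifts quoted for the mark can be checked with simple arithmetic. The sketch below assumes that the butanol mark adds a net C4H8O to the modified residue and that the sulfated form reported later in the paper (+152 Da) adds a further SO3. Both compositions are assumptions inferred from the nominal mass shifts, not formulas stated in the text. The element masses are standard monoisotopic values.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "S": 31.97207117}


def formula_mass(composition):
    """Sum monoisotopic masses for a composition such as {"C": 4, "H": 8, "O": 1}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# Assumed net composition added by the butanol mark (hypothetical, see lead-in).
BUTANOL_MARK = {"C": 4, "H": 8, "O": 1}
# Assumed net composition added by the sulfated butanol mark (hypothetical).
SULFATED_MARK = {"C": 4, "H": 8, "O": 4, "S": 1}

butanol_shift = formula_mass(BUTANOL_MARK)
sulfated_shift = formula_mass(SULFATED_MARK)

print(f"butanol mark:  +{butanol_shift:.4f} Da")
print(f"sulfated mark: +{sulfated_shift:.4f} Da")
```

Under these assumptions the two shifts round to the nominal 72 Da and 152 Da values used in the figure captions.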


Intriguingly, D-motifs interact with cognate MAPKs in different ways: MKK6 is hypothesized to form a loop when binding p38α (8, 9), and the NFAT4 D-motif peptide appears to bind JNK1 with 2 different conformations (8). The MKK6 and NFAT4 D-motifs selectively associate with p38α and JNK1, respectively. In contrast, the MKK4 D-motif binds JNK and ERK/p38α families with similar affinity (8, 21, 24, 25): The ligand is hypothesized to adopt an MKK6-like conformation when in complex with ERK and p38α kinases and an NFAT4-like conformation when bound to JNK1 (8). However, no MKK4–MAPK crystal structure has been solved.

We used LiF-MS to characterize D-motif peptide binding to human JNK1, ERK2, and p38α MAP kinases to validate the approach and explore unresolved questions about these important peptide docking associations. We evaluated MKK6, NFAT4, and MKK4 D-motif binding with partner MAPKs. There are relevant crystal structures of NFAT4 and MKK6 complexes, providing a test of the LiF-MS method’s ability to independently arrive at a similar structure and perhaps define plausible noncrystallized conformations. As there is no crystal structure available of MKK4 D-motif bound to a MAPK, we hypothesized our methodology might shed light on the unusual binding mechanism of this D-motif, most notably providing information to help assess the idea that the MKK4 D-motif peptide ligand readily adopts different conformations to allow binding to divergent MAPKs (8, 9).

Here we show that LiF-MS analysis recapitulates D-motif–MAPK crystal structures with striking accuracy, placing the MKK6 D-motif in the previously verified ED site of p38α in distinct conformations and strongly validating the approach (8, 26, 27). LiF-MS provides structural information about the MKK4 D-motif’s binding of different MAPKs, revealing distinct binding modes for JNK1 and p38α, supporting prior hypotheses (8, 9), and indicating a previously unobserved conformation of the NFAT4 D-motif bound to JNK1. Moreover, we find clear evidence that the MKK4 D-motif engages in both “forward” and “reverse” binding of ERK2. Using crosslinking data to guide computational docking of D-motifs dramatically increases the accuracy of binding site identification, even if there is significant off-target crosslinking. Overall, this straightforward approach can provide useful information about binding site, structural organization, and conformational flexibility of intrinsically disordered peptide ligands that form complexes with folded protein domains (Fig. 1C).

Results

LiF-MS Methodology: Expected Results. We attached the crosslinker described above to synthetic biotinylated peptides containing SLiMs of interest, in this case different D-motifs (Fig. 2A) (8). Incubating with the protein of interest (in this case, different MAPKs) followed by UV irradiation should produce a population of target protein covalently bound to the peptide, with negligible crosslinking to other proteins that lack a binding site for the peptide. Trypsinization of the conjugated complex can be followed by biotin affinity enrichment of crosslinked tryptic peptides to boost the signal. Crosslink cleavage at low pH should produce a set of tryptic peptides covalently modified by a mass tag, which can be used to map the motif’s binding site by liquid chromatography (LC)-MS (Fig. 1; also see Materials and Methods and SI Appendix, Fig. S1), with location of mass-tagged amino acids mapped to the protein in silico. Because SLiM interactions with folded domains can have high conformational heterogeneity (28, 29), LiF-MS should produce a set of protein-motif crosslinks consistent with the peptide ligand’s dynamic motion in its binding pocket, as well as the conformational flexibility of the crosslinker and linker amino acids (Fig. 1C).
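The in silico mapping step can be pictured as translating a mark's position within a tryptic peptide into a residue number in the full-length protein. The following is a minimal sketch under stated assumptions: the function name `map_mark_to_protein`, the toy sequence, and the rule of rejecting peptides that occur more than once are all hypothetical and are not taken from the paper's pipeline.

```python
def map_mark_to_protein(protein_seq, tryptic_peptide, position_in_peptide,
                        first_residue_number=1):
    """Return the protein residue number of a marked position in a tryptic peptide.

    position_in_peptide is 1-based. Returns None if the peptide is absent from
    the protein or occurs more than once (ambiguous mapping).
    """
    if not 1 <= position_in_peptide <= len(tryptic_peptide):
        raise ValueError("position lies outside the peptide")
    start = protein_seq.find(tryptic_peptide)
    if start == -1:
        return None  # peptide not found in this protein
    if protein_seq.find(tryptic_peptide, start + 1) != -1:
        return None  # peptide occurs more than once
    return first_residue_number + start + position_in_peptide - 1


# Hypothetical toy sequence, not a real MAPK.
protein = "MSGRAKLDEFGHIKPQRSTVWYK"
peptide = "LDEFGHIK"

# The third residue of the peptide (E) carries the mark in this example.
print(map_mark_to_protein(protein, peptide, 3))
```

Repeating this for every modified peptide yields the list of marked residue numbers that is then placed on the protein structure.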

Crosslinking Synthetic D-Motifs to Purified MAPKs. We used LiF-MS to examine interaction of D-motif peptides with the human MAPKs JNK1, ERK2, and p38α. In an initial analysis to assess bulk crosslinking of docking peptides to these MAPKs, we attached the cleavable diazirine crosslinker described in Fig. 1 to synthetic peptides containing different D-motifs and biotin to facilitate affinity purification and detection (Fig. 2A) (8). As noted, crystal structures of some of these complexes are available while others are structurally uncharacterized. We found that JNK1, ERK2, and p38α were significantly biotinylated upon irradiation with UV light, consistent with diazirine-mediated covalent crosslinking of the D-motif peptide (Fig. 2 B and C). We also observed a very small but consistent amount of UV-independent biotinylation of MAPKs, likely due to ambient light.

Diazirine-mediated protein crosslinking was previously shown to occur when the moiety is positioned very close to a protein’s surface by covalent tethering or association with a tightly bound partner (19). In contrast, SLiMs often bind folded protein domains with relatively modest affinity. Thus, we evaluated selectivity of D-motif peptide–MAPK crosslinking over nonspecific protein by performing UV crosslinking reactions using JNK1 and a molar excess of bovine serum albumin (BSA), with increasing concentrations of MKK4 and NFAT4 peptide (Fig. 2D). Strikingly, while MKK4 D-motif–JNK1 crosslinking occurred, there was little to no MKK4 peptide crosslinking to BSA, with weak evidence of a small amount of nonspecific biotinylation of BSA at the highest peptide concentration (20 μM; Fig. 2D). Similarly, we found minimal BSA biotinylation with more hydrophobic NFAT4 peptides, which also crosslinked well with JNK1 (Fig. 2D). Thus, diazirine-linked peptide ligands like MAPK D-motifs form crosslinks with bona fide binding partners with little background crosslinking to other proteins.

LiF-MS Confirms Published Structural and Biochemical Data. To assess the utility of LiF-MS we used the approach to define the binding of the MKK6 D-motif with the MAPK p38α. Two

Fig. 2. UV-dependent crosslinking of D-motif peptides to MAP kinases. (A) Sequences of peptides used in this study. Orange B, biotin; yellow, cysteines (added) used to conjugate the crosslinker. The core D-motif is bolded. (B and C) MAP kinases are crosslinked to the indicated peptides. MKK4 peptides I and II crosslink to purified JNK1 (B), ERK2 and p38α (C) in a UV-dependent manner. The indicated peptide was incubated with purified kinase in the presence or absence of UV irradiation. The crosslinked, biotinylated peptide/protein was detected by Western blot with an anti-biotin antibody. (D) Concentration-dependent crosslinking of MKK4 peptides I and II (Left) and NFAT4 peptides I and II (Right) to JNK1. (Left) JNK1 (10 μM) plus the indicated amount of crosslinker-containing MKK4 D-motif peptide were irradiated with UV in the absence or presence of 40 μM BSA. (Right) JNK1 (40 μM) was incubated in the presence of the indicated amount of NFAT4 peptides I and II. BSA was added as indicated at a concentration of 500 μM. Blots were probed with IRDye 800CW streptavidin (LI-COR) to detect crosslinked protein.


MKK6–p38α crystal structures have been solved with the D-motif in almost identical conformations (Protein Data Bank [PDB] ID codes 2Y8O and 5ETF) (8, 26). In both structures the N terminus of the MKK6 D-motif is not resolved, although its predicted conformation has been extrapolated based on other structures (9) and biochemical data (27, 30). We synthesized the shortened D-motif from MKK6 with an N-terminal cysteine for crosslinker attachment (Fig. 2A) and mapped its binding to p38α using LiF-MS. Notably, MS/MS analysis indicated that a very narrow range of p38α residues (160 to 165) were frequently crosslinked (Fig. 3A) above a false discovery rate (FDR) cutoff of 2.44%. These residues are very close to the N terminus of the MKK6 D-motif, which is highly consistent with published crystal structures and the predicted binding conformation of MKK6 in which the D-motif’s N terminus loops back toward the C terminus (Fig. 3 B and C) (9). The crosslinked region represents the “ED site,” a D-motif docking site secondary to the common docking (CD) groove in ERK2 and p38α that is required for the interaction of ERK2/p38α with its target D-motifs (9, 26, 27, 30).

Crystal structures of the NFAT4 D-motif bound to JNK1 have also been solved (8), and we performed LiF-MS on JNK1 bound to the NFAT4 D-motif ligand to further test the technique. We synthesized 2 NFAT4 D-motif peptides with cysteines for crosslinker attachment located at the N and C terminus (termed NFAT4 peptide I and II, respectively) (Fig. 2A). We found NFAT4 peptide I (N-terminal) crosslinked 2 sites that do not overlap with the published crystal structure (SI Appendix, Fig. S2A, Top). The crosslinks at JNK1 residues 79 to 85 do not appear to be out of reach of the length of the peptide in the structures and are close to the D site despite not overlapping with electron density in NFAT4 crystallography data. We suggest this is a third conformation of the NFAT4 D-motif interacting with JNK1 in solution not found in previously crystallized samples. Interestingly, this conformation is similar to ERK2/p38α binding of D-motifs (9). Importantly, the N terminus of NFAT4 D-motif is flexible, with substantially different conformations in the 2 published crystal structures (see SI Appendix, Fig. S2C; compare PDB ID codes 2XS0 and 2XRW). LiF-MS with NFAT4 peptide II (C-terminal crosslinker) highlighted residues 161 to 166 that are close to the crystallographically supported C terminus of NFAT4 peptide (SI Appendix, Fig. S2A, Bottom). These residues are shared between the N- and C-terminal D-motifs (NFAT4 peptide I and II) and are within range of NFAT4 in the crystal structure, further suggesting the observed binding modes are not artifacts.

We also found a third region with statistically significant crosslinks that is inconsistent with D-site binding in all NFAT4–JNK1 LiF-MS datasets (see SI Appendix, Fig. S2B, labeled **). This area, which roughly corresponds to JNK1 residues 245 to 250, is far from JNK1’s D site and close to a region corresponding to ERK2’s DEF binding pocket, another canonical docking site that binds the conserved motif FxFP (31, 32). These specific “off target” crosslinks formed whether we placed the crosslinker on the N or C terminus of the NFAT4 peptide; addition of 30 mg/mL of BSA did not reduce this crosslinking, suggesting that it results from a specific interaction (SI Appendix, Fig. S3, compare A and B). We interpret this interaction as an in vitro artifact, possibly arising from combined association of a peptide subregion and the prereaction crosslinker itself. Notably, putatively artifactual crosslink sites were significantly outnumbered by ones that are strongly consistent with D-site binding.

Fig. 3. LiF-MS verifies the MKK6–p38α interaction architecture. (A) Confidence scores of crosslinked p38α residues (above the indicated FDR) are shown across the length of the protein. Peptides marked with sulfated butanol (+152 Da) were scored (Materials and Methods). In scoring diagrams for Figs. 3–5, residues crosslinked qualitatively near the known D site are labeled in red; “off-target” residues qualitatively distal from the D site are gray. (B and C) The highest-confidence residues map to the published MKK6 binding site on p38α (PDB ID code 5ETF). MKK6 bound to p38α from 2 published structures (PDB ID codes 5ETF, blue, and 2Y8O, magenta) is shown overlaid for reference; the dashed blue line and arrowhead represent the predicted location of the MKK6 peptide N terminus (see ref. 9). Side chains of footprinted residues are shown in red. (B) Overview of the p38α structure with footprinted regions highlighted. (C) Magnified views of the footprinted residues close to the CD groove and ED site (Top, cartoon view; Bottom, surface view). Note that FDR cutoffs are established based on the location of a decoy in a list of top-ranked PSMs and thus differ between datasets, as in the rest of this paper. FDR cutoffs for each dataset are established as shown in SI Appendix, Fig. S1.
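One plausible reading of a decoy-based cutoff is sketched below: PSMs are ranked by score, target PSMs are accepted until the first decoy is reached, and the error rate is estimated as one decoy divided by the number of targets ranked above it. This is an illustrative assumption only; the actual procedure is the one shown in SI Appendix, Fig. S1 and may differ. The function name `cutoff_at_first_decoy` and the toy scores are hypothetical.

```python
def cutoff_at_first_decoy(psms):
    """Accept target PSMs ranked above the first decoy.

    psms: iterable of (score, is_decoy) pairs; higher scores are better.
    Returns (accepted_target_scores, estimated_fdr). The estimate is one decoy
    divided by the number of accepted targets, or None when it is undefined.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    accepted = []
    decoy_seen = False
    for score, is_decoy in ranked:
        if is_decoy:
            decoy_seen = True
            break
        accepted.append(score)
    if not decoy_seen or not accepted:
        # Without a decoy, or with a decoy at the very top, no ratio is defined.
        return accepted, None
    return accepted, 1 / len(accepted)


# Hypothetical toy dataset: four targets outrank the first decoy.
toy_psms = [(9.1, False), (8.7, False), (7.2, True), (8.0, False),
            (7.5, False), (6.9, False)]
accepted_scores, estimated_fdr = cutoff_at_first_decoy(toy_psms)
print(accepted_scores, estimated_fdr)
```

Because the first decoy falls at a different rank in each dataset, a rule of this kind naturally produces a different cutoff for each experiment, in line with the caption's remark.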


LiF-MS Maps the JNK1 D Site with the MKK4 D-Motif. We next mapped the interaction site of MKK4 on purified JNK1 using peptide I (Fig. 4A) and the LiF-MS workflow described in Fig. 1. We used COMET (33, 34) to identify crosslinks in the resulting MS/MS data. This revealed a set of peptide spectral matches (PSMs) matching JNK1 as described in SI Appendix, Fig. S1. We scored these data, mapping the top consecutive clusters of crosslinked residues with an FDR of 0.45% (Fig. 4A). Since a structure of JNK1 complexed with MKK4 is not available, we mapped the location of each butanol-marked residue onto one of the published JNK1–NFAT4 structures (PDB ID code 2XS0) (8), comparing the location of the marked residues to the NFAT4 binding site. Importantly, while we identified peptides both proximal to and distal from the docking site, the highest-scoring peptides consistently mapped to the negatively charged CD groove close to the putative N terminus of MKK4 (Fig. 4A). We also identified high-scoring peptides from a loop that is distinct from the CD groove but still potentially within range of an interaction with the flexible N terminus of the peptide (Fig. 4A). This area, JNK1 residues 284 to 288, is consistent with the NFAT4 peptide’s conformation in the second published NFAT4–JNK1 crystal structure, PDB ID code 2XRW (8). We hypothesize this loop is mobile in solution and could provide a second part of a binding site that contacts the peptide. Overall, LiF-MS

Fig. 4. JNK1 and p38α’s docking sites mapped with MKK4 peptides. MKK4 peptides I and II were incubated with purified JNK1 or p38α, crosslinked, and analyzed by LiF-MS. (A) Length-dependent mapping of MKK4 peptides with JNK1. Confidence scores of crosslinked JNK residues (above the indicated FDR) are shown across the length of the protein. Peptides marked with butanol (+72 Da) were scored (Materials and Methods) and high-confidence clusters of modified residues near the D site are shown in red. Off-target crosslinks distant from the D site are shown in gray and marked with an asterisk (*). The longer MKK4 peptide I (Top) marks the edge of the D site (residues 324 to 331), whereas the shorter peptide II (Bottom) marks residues 128 to 132 closer to the center of the cleft, as seen when mapped onto the JNK1 structure with the NFAT4 peptide (blue) from PDB ID code 2XS0 shown for reference. (B) LiF-MS confirms MKK4 binds p38α in an MKK6-like binding mode. MKK4 peptide I was crosslinked to purified p38α. Confidence scores of p38α crosslinked residues (above the indicated FDR) are shown across the length of the protein. Peptides marked with butanol (+72 Da) were scored and high-confidence clusters of modified residues near the D site are shown in red. Below the graph, the resulting residues were mapped to p38α with MKK6 peptide (blue) from PDB ID code 5ETF as reference. The predicted N-terminal location of the MKK6 D-motif is shown as a dashed blue line and arrowhead (see ref. 9). (C) Confidence diagram of MKK4 peptide II crosslinked to p38α. In all cases FDR cutoffs for each dataset are established as shown in SI Appendix, Fig. S1.

Parker et al. PNAS | October 15, 2019 | vol. 116 | no. 42 | 21005


can identify both the site and conformational flexibility of short motif binding.

Shorter and Longer MKK4 D-Motifs Provide Length-Dependent JNK1 Binding-Site Information. As linear motifs have much smaller binding surfaces than folded domains, peptide–protein interactions are much more prone to disruption by even small changes including point mutations, modified amino acids, and crosslinker placement (35). To determine the effect of peptide length and crosslinker placement on our footprinting data, we synthesized MKK4 peptide II (Fig. 2A) in which the core D-motif–crosslinker distance was shortened by 1 amino acid. This peptide successfully crosslinked to all 3 kinases studied (Fig. 2 B and C).

We then mapped the binding of peptide II onto JNK1 (Fig. 4 A, Bottom), comparing the results with peptide I (Fig. 4 A, Top). We found residues 128 to 132 in JNK1 were labeled with peptide II but not peptide I at the high end of the confidence score. Conversely, residues 324 to 331 were found at high confidence in the peptide I dataset but not in peptide II. The increased proximity of the cysteine, which is the point of crosslinker attachment, to the “core” motif in peptide II is reflected when these datasets are compared, as the primary crosslinked region is farther toward the center of the NFAT4 D-motif in the crystal structure (Fig. 4A). Thus, for JNK1, the shorter peptide II revealed additional information about the MKK4–JNK1 binding surface.

The MKK4 D-Motif Binds p38α in an MKK6-Like Conformation. MKK4’s D-motif exhibits unusually plastic binding and can interact with both JNK1 and p38α, which contain differently shaped binding grooves (8, 9). Previously published biochemical data suggest MKK4 binds to p38α in an MKK6-like binding conformation (8), distinct from its binding to JNK1. We mapped the binding of MKK4 peptide I (Fig. 2A) onto p38α using LiF-MS. The mapped binding groove was highly similar to the observed MKK6 dataset (Fig. 4B, compare with Fig. 3). This is consistent with the MKK6-like binding mode in which the peptide interacts with the ED site near the kinase’s hinge region (8, 30). Notably, crosslinking with the shorter MKK4 peptide II did not significantly change the distribution of labeled residues on p38α (Fig. 4C) when compared with peptide I (Fig. 4B).

LiF-MS Suggests That the MKK4 D-Motif Binds ERK2 Bidirectionally. ERK2 can bind MKK4’s D-motif. However, ERK2 does not appear to be an MKK4 regulatory target in vivo, and this association has a lower affinity than MAPK–D-motif associations of clear physiological relevance (8, 22, 24). To better understand this interaction, we used LiF-MS to map binding of MKK4 D-motif peptide I onto ERK2’s crystal structure (36). As expected, we found covalent modifications near ERK2’s CD site, similar to both JNK1 and p38α (Fig. 5A, compare with Fig. 3). Surprisingly, we found that a region on the kinase located on the opposite end of the D site was also marked extensively (Fig. 5A, “reverse site,” purple). Crosslinks in these regions were consistent with the MKK4 D-motif binding in the opposite (C→N) orientation in ERK2.

We hypothesized that reverse-orientation ligand binding reflected relatively nonspecific interactions due to saturation of the ERK2 docking site. We performed LiF-MS using ERK2 with a range of MKK4 peptide concentrations and graphed the crosslinked residues (Fig. 5B). Again, we found marked residues on both the predicted and the reverse sites at all MKK4 peptide concentrations. Interestingly, at the highest concentrations a third, non-D-site–located region near the DEF binding cleft began to predominate in the signal (Fig. 5B, marked *). This region was not as strongly enriched in the other 2 datasets. While these experiments indicate that MKK4 may bind ERK2 in 2 different orientations, the biochemical and in vivo significance of this observation remains unclear. Overall, these ERK2 experiments establish LiF-MS as an effective way to study effects of varying parameters such as peptide length and concentration on a potentially difficult-to-study peptide–protein interaction.

LiF-MS Crosslink Sites Dramatically Improve Computational Peptide Docking. Algorithmically modeling the binding of peptide ligands to folded protein domains is computationally daunting and also largely inaccurate without foreknowledge of peptide binding sites. We hypothesized that locations of ligand-directed crosslinks identified through LiF-MS may provide sufficient information about peptide binding sites to significantly increase accuracy of computational peptide docking (Fig. 6). We tested this using CABS-dock (11), a global docking method that allows flexibility of both peptide ligand and binding protein. Briefly, naïve CABS-dock first uses energy-minimizing trajectories of 10 peptides placed randomly around the binding protein to generate 10,000 peptide–protein configurations, then divides the most plausible 1,000 of these binding models into 10 clusters using k-medoid clustering. The algorithm then provides model structures representing each of these 10 clusters, assigning each model a “quality” score based on its underlying cluster’s average rmsd (lower is better). The best-scoring structure is considered the final peptide docking model (11). Importantly, CABS-dock can be constrained to a specific region of the binding protein, limiting placement of the initial 10 peptides. When such constraints are based on information from experimentally characterized peptide docking sites, CABS-dock produces structural models with significantly better scores.

To determine if LiF-MS crosslink sites provide useful CABS-dock constraints, we used the approach to model JNK1 binding to the MKK4 D-motif peptide. We compared simulations constrained to 20 high-confidence crosslinked residues from the LiF-MS analysis of the JNK1–MKK4 peptide interaction to an unconstrained control (Fig. 6A and Materials and Methods). Simulated peptides were constrained to the unbound (apo) JNK1 structure PDB ID code 3PZE (37). We performed 3 independent CABS-dock simulations for each crosslink, producing 30 peptide binding models for each site, and took the best-scoring structure (lowest cluster rmsd) as representative of that crosslinking site (Fig. 6B). The top-scoring model from simulations run with crosslinking constraints had significantly lower cluster rmsd than the best model from unconstrained control simulations. Moreover, 5 of the 20 crosslink sites used as constraints produced model clusters with an rmsd of ≤5.5 Å, sufficient for near-native modeling by applications such as PepFlexDock (Fig. 6B) (38). We further found that anchoring the peptide to JNK1 residues 128, 131, and 325 gave the best models when compared with the reference control (Fig. 6C). Similar results were observed when the D-motif-bound JNK1 structure PDB ID code 2XS0 was used for constraint (SI Appendix, Fig. S4A). For a reference structure we used NFAT4–JNK1 (PDB ID code 2XS0) (8) as a template to model the conformation of the JNK1–MKK4 D-motif complex.

When analyzing the full trajectories (all 10,000 models per replicate, 30,000 models total per crosslinked site), gains in rmsd were more modest, though still significant, when compared with the unconstrained control. Importantly, constraining MKK4 to several crosslinked sites in both bound and unbound JNK1 structures produced models with rmsd <3 Å, considered to be “high-quality” by the CABS-dock scoring prediction (11), whereas controls were ≥3 Å (SI Appendix, Fig. S4B). We note that D-motif binding (Ste7 D-motif to Fus3) modeled poorly with unconstrained CABS-dock (39). Using crosslinked sites as constraints substantially increases accuracy of computational modeling.
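The per-site selection described above (pool the models from replicate runs, keep the lowest score for each constraint site, then flag sites at or below the 5.5 Å threshold quoted in the text) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site labels, the score values, and the function names are hypothetical and are not CABS-dock output or part of its interface.

```python
def best_score_per_site(scores_by_site):
    """Keep the lowest (best) score among all models pooled for each site."""
    return {site: min(scores) for site, scores in scores_by_site.items()}


def sites_within(best_scores, threshold=5.5):
    """Return the sites whose best score is at or below the threshold (in Å)."""
    return sorted(site for site, score in best_scores.items() if score <= threshold)


# Made-up rmsd values (Å) for three hypothetical constraint sites.
# A real analysis would pool 30 values per site (3 runs x 10 models).
toy_scores = {
    "site_A": [4.9, 6.3, 7.1],
    "site_B": [5.2, 5.8, 9.0],
    "site_C": [8.4, 7.7, 10.2],
}

best = best_score_per_site(toy_scores)
near_native = sites_within(best)
print(best)
print(near_native)
```

Taking the minimum per site mirrors the "lower is better" convention of the score; a different summary (for example, the median across replicates) would be an equally simple change if robustness to a single lucky model were a concern.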

Highly Ranked Crosslinking-Generated Constraints Can Be Used for Docking Cleft Discovery. While constraining D-motifs to the crosslinked residues on the folded kinase domain during CABS-dock simulations improves the distribution of solutions consistent with


previously published crystal structures (Fig. 6), we sought to use the constraint data for ab initio docking site discovery. We ran CABS-dock simulations with constraints between known crosslinked kinase sites on the folded kinase structures and the indicated D-motif sequences (Fig. 7). One CABS-dock simulation was run per crosslink constraint and the top 1,000 solutions were combined. After an initial filtering to remove structures far from the constraints, subensembles were divided by hierarchical clustering into 10 clusters by rmsd using the R cluster package (40) (Fig. 7A). Individual coordinates of the ensemble backbone atoms were averaged to create a representative coordinate set of the cluster. This is mapped, along with residues up to 7 Å away from each point, as beads on the reference crystal structures as shown in Fig. 7A. Color of the average backbone point indicates variance in the clustered ensemble of peptides: Orange is less variable and red is more variable.

We performed binding cleft discovery on several LiF-MS datasets. The NFAT4 D-motif ligand is a 14-residue peptide with a bipartite binding interface. We started by modeling JNK1 binding of only the last 6 amino acids of this D-motif using the set of crosslink sites produced by NFAT4 peptide II, which has a C-terminal crosslinker, as constraints (SI Appendix, Fig. S2). In this analysis we did not use likely off-target crosslinks in JNK1 residues 245 to 249 from the NFAT4 datasets as constraints, instead including them in a different approach described below. Mapping these residues produced a structure that superimposes accurately with the NFAT4–JNK1 crystal structure (Fig. 7B, Far Left). We then performed a 2-constraint model with the N terminus of the full-length NFAT4 D-motif constrained to the peptide I (N-terminal) specific crosslinked residues and the C terminus constrained to peptide II residues; while not evident in the superimposed crystal structure, this may still reflect the unique binding mode of NFAT4 in our LiF-MS experiments (Fig. 7 B, Center Left). Mapping N-terminally constrained MKK4 peptide II and MKK6 D-motifs onto the JNK1 and p38α reference crystal structures using the highest-ranked crosslinking data created structures representative of the expected binding modes (Fig. 7 B, Center Right and Far Right). While the MKK6 model does not match the visible crystallography data (Fig. 7B, Far Right), the computationally discovered binding cleft is consistent with the theoretical placement of the N terminus of the MKK6 D-motif and similar peptides (8, 9) and overlaps with the ED site (27, 30). Here we show the predicted location of the

Fig. 5. LiF-MS reveals bidirectional binding of MKK4 to ERK2. The MKK4 peptide I–ERK2 interaction was mapped with LiF-MS. (A) Confidence scores of crosslinked ERK2 residues (above the indicated FDR) are shown across the length of the protein. Peptides marked with butanol (+72 Da) were scored and high-confidence clusters of modified residues near the D site are shown in red or purple. Off-target crosslinks distant from the D site are shown in gray and marked with an asterisk (*). Below the graph, the MKK6 peptide structure from PDB ID code 5ETF is shown in blue as a reference on the ERK2 structure (PDB ID code 2GPH), as this most closely represents the binding conformation of the MKK4–ERK2 complex (see ref. 8). The N terminus of the reference peptide MKK6 is marked with a yellow dot and the C terminus with a green dot. Red residues represent modifications at the predicted D site based on binding in the N-terminal direction; purple residues represent the discovered “reverse site” consistent with MKK4 binding in the opposite orientation. Residue ranges labeled in gray are qualitatively judged as “off-target” based on distance from the known D site. (B) The reverse ERK2 docking site is crosslinked independently of MKK4 peptide I concentration. Three separate LiF-MS experiments were performed at the indicated peptide concentrations. The labeled clusters representing the forward (red) and reverse (purple) D-motif binding above the indicated FDR cutoff are graphed across the length of ERK2. FDR cutoffs for each dataset are established as shown in SI Appendix, Fig. S1.


MKK6 D-motif’s N-terminal region, which is not visible in the crystal structure, as a starred blue footprint near the resolved portion of the binding cleft.

We next asked if running CABS-dock simulations for all crosslinked residues above the FDR cutoff, rather than just the highest-confidence residues, can correctly identify peptide ligand binding sites. For this we turned to agglomerative hierarchical clustering of CABS-dock peptide models, using rmsd of peptide models with one another as a distance metric. Hypothetically, with authentic peptide binding the diazirine crosslinker is positioned near the ligand binding site, contacting a set of protein surfaces within a radius defined by ligand-constrained molecular mobility. This should produce multiple crosslink sites close to the “correct” docking site, which represents a point of energy convergence for the simulation, while off-target interactions should produce fewer statistically significant crosslinks. As a result, the cluster of LiF-MS/CABS-dock peptide models that share conformational features with each other and the bona fide binding structure should be larger than clusters of models constrained by off-target crosslinks.

We evaluated this idea by combining JNK–NFAT4 crosslinks from the NFAT4 peptide I and II datasets, which have a number of off-target crosslink sites, and ran CABS-dock simulations with NFAT4 constrained to the respective residues (SI Appendix, Fig. S5A). We used all residues in both datasets as constraints for a total of 32 simulations, with one simulation per constraint. We then compiled the top 10 CABS-dock output models and clustered using agglomerative hierarchical clustering (SI Appendix, Fig. S5B). This produced 4 major subclusters, and we averaged coordinates from the largest cluster and mapped these positions on JNK1 (SI Appendix, Fig. S5C). The cluster of averaged peptides accurately locates the JNK1 D site and overlaps with the known NFAT4 location (SI Appendix, Fig. S5 C, Bottom; compare with NFAT4 cartoon). Importantly, this includes CABS-dock simulations with constraints to multiple “off-target” crosslinks such as JNK1 26 to 27 and 245 to 250 (SI Appendix, Fig. S2 A and B). Also notably, this analysis does not consider the “quality score” of the CABS-dock models.

We further demonstrated the robustness of this technique by repeating it on 2 LiF-MS datasets, MKK6–p38α and MKK4 peptide II–JNK1 (SI Appendix, Fig. S6). In each case, constrained residues above the FDR selected independently of score or location still correctly identified the D site. In both tested cases, the largest cluster among 4 or 5 subclusters accurately identified the D site in both MAPKs (SI Appendix, Fig. S6). Most notably, combining LiF-MS and CABS-dock binding simulation with agglomerative hierarchical clustering allowed correct identification of the peptide ligand binding site from the MKK4 peptide II–JNK1 dataset, which has a significant number of off-target crosslinks (Fig. 4A). Thus, this discovery method is less sensitive to “off-target” in vitro interactions.

Discussion

Chemical crosslinking has been an important approach for probing the makeup and structure of macromolecular complexes, notably

Fig. 7. LiF-MS–generated constraints can be used for ab initio docking cleft discovery. (A) Docking and ensemble selection overview. The top 1,000 models from CABS-dock simulations constrained to crosslinked residues are aggregated. After discarding solutions with an rmsd greater than 6 to 7 Å from the butanol constraints, the remaining models are binned using agglomerative hierarchical clustering and the coordinates of the backbone atoms averaged (shown as beads, Right). To predict interacting residues on the kinase surface, amino acids within 7 Å were highlighted in red-brown. (B) Interacting cleft discovery using 4 different peptide modes. In all JNK1 models, PDB ID code 2XS0 is shown as a reference with NFAT4 in blue. In the p38α model, PDB ID code 5ETF is shown as a reference with MKK6 in blue. (Far Left) The hydrophobic last 6 residues of the bipartite NFAT4 D-motif were mapped using a C-terminal constraint from Glu-14 (magenta) onto crosslinked residues (red). (Center Left) The full-length NFAT4 D-motif was mapped using combined data from both the N- and C-terminal crosslinkers. N-terminal crosslinked residues were constrained to Leu-1 and C-terminal crosslinked residues were constrained to Glu-14 (magenta) of the D-motif sequence. (Center Right) MKK4 modeled onto JNK1 using the highest-ranked crosslinked residue cluster from the MKK4 peptide II experiment as simulation constraints (Fig. 4A). (Far Right) MKK6 modeled onto the p38α structure 5ETF using the crosslinked residue cluster from MKK6 as constraints (Fig. 3). The predicted location of the MKK6 D-motif N terminus, which is not visible in the crystal structure, is shown as a blue footprint and marked with an asterisk (*).

Fig. 6. LiF-MS–generated constraints assist in molecular docking. (A) Overview of modeling procedure. MKK4 peptide I was constrained individually to a set of high-confidence crosslinks (Materials and Methods) and docked with CABS-dock along with an unconstrained control. The top 10 peptide structures are scored against a reference. (B) The MKK4 D-motif was docked to JNK1 (PDB ID code 3PZE) constrained to the indicated amino acid residues alongside an unconstrained experiment (Unconstrained). Independent CABS-dock runs were completed for each individual residue. Models in the top 10 were scored by backbone rmsd against a reference structure and the highest-quality models from 3 independent runs compiled (30 models total). Models with sufficient quality for near-native modeling are marked with an asterisk (*). (C) The top-3 scoring models are shown overlaid on the reference MKK4 D-motif (Ref, blue) model which was used for rmsd scoring. The peptide color indicates which JNK1 amino acid was used for constraint. Shown in red are crosslinked residues from the JNK1–MKK4 LiF-MS experiment.


RNA (16, 41). Its usefulness for characterization of protein structure has been limited by the relatively specific reactivity of reagents, which only allow crosslinking of glutamine, lysine, and cysteine. Thus, with less capability to map the structure and dynamics of folded domains and especially intrinsically disordered regions, most chemical crosslinking analysis of proteins has focused on identifying interaction partners and domain-level organization. Traditional crosslinking tools provided key information about the structure of p53 (42, 43). Development of crosslinkers with nonspecific reactivity like diazirine and benzophenone has dramatically increased the usefulness of MS for analysis of structure and interactions of folded proteins and intrinsically disordered regions (44).

LiF-MS, which uses a cleavable diazirine crosslinker attached to peptide ligands of interest, provides a method to identify sites where peptide ligands bind folded protein domains and map the fine structure of these interfaces. In this approach, the peptide ligand provides site specificity to otherwise nonselective diazirine crosslinking, and cleavage of the crosslinker leaves behind a chemical mark at the reaction site. Importantly, our analysis indicates that diazirine crosslinking is highly ligand-directed, with the vast majority occurring close to the site of ligand binding. The very short lifetime of carbene in solution probably helps minimize the frequency of off-target crosslinks. We found that the small mass tag left behind after crosslinker cleavage is easily detectable in MS/MS data with an appropriate search strategy. The diazirine crosslinker can be combined with affinity tags like biotin, allowing straightforward enrichment of peptides containing crosslink-generated marks after proteolytic digestion for MS. This signal amplification is technically important because on-target diazirine crosslinking is inefficient, also probably attributable to fast reaction of carbene with solvent. We believe this may be crucial for mapping interactions of SLiMs with folded proteins, which often have low affinity.

As this work was under review, a method using diazirine linked to a disulfide-based cleavable crosslinker was reported in analysis of the well-defined BID–MCL1 complex and interactions of the chaperone Skp with a substrate protein OmpA (45). Rather than cleavage by acid to leave a butanol or butanol derivative, this approach uses thiol to detect crosslink sites. The interactions probed in this work have affinities 100 to 1,000 times higher than the D-motif–MAPK complexes we mapped using LiF-MS, and the analysis does not simulate identification of unknown binding sites. However, while the approaches are not directly comparable as described, they clearly illustrate the broad potential of ligand-directed cleavable crosslinkers with nonselective chemistry. Indeed, a different cleavable diazirine-based crosslinker, encoded into protein by codon suppression, has been used to map conformational changes of the acid chaperone HdeA (46). Cleavage of this crosslinker with hydrogen peroxide leaves a residual mark that can be modified with dyes or other functional groups (47, 48). The butanol group produced in LiF-MS cannot be modified, but we believe easy synthesis of large amounts of prey peptide through cysteine alkylation (Fig. 1A) rather than codon suppression gives LiF-MS an advantage when specifically mapping transient interactions using purified components.

In our D-motif–MAPK analysis we found that LiF-MS consistently produced clusters of high-confidence crosslinks close to known D-motif peptide binding regions. In some cases we also saw significant crosslinking in discrete places far from the previously identified ligand binding site. We consider it unlikely that this reflects a novel biologically important D-motif binding mode. Rather, we hypothesize that low-affinity “off-target” interaction of a specific protein surface with a peptide ligand subregion increases the likelihood that the short-lived carbene will form crosslinks nearby before it reacts with solvent. Consistent with this, more such crosslink sites appeared with increasing ligand concentration (Fig. 5B). Importantly, in all cases we saw significantly stronger evidence of crosslinking close to previously confirmed D-motif peptide binding sites: Clusters of amino acids exhibited high-confidence modification by MS, and there were multiple distinct clusters near the D-motif binding site consistent with flexible positioning of the diazirine-bearing region. Intriguingly, the presence of consistent “off-target” crosslinking sites in some experiments suggests that LiF-MS can detect very weak interactions that have some specificity; we do not yet know the lower affinity boundary of binding interactions the approach can detect in vitro.

We developed 2 modeling strategies that use LiF-MS crosslinking data to identify likely peptide binding sites. In both, we used crosslink sites as distance constraints, stipulating that all solutions must place the crosslinker-bearing portion of the peptide within an appropriate distance of the crosslink site. In the first approach we used only clusters of crosslinked residues with highly ranked confidence scores. This accurately predicted the binding clefts of the MKK4–JNK1 and MKK6–p38α interactions, predicting docking clefts along the length of MKK6 or MKK4 D-motif peptides (Fig. 7B). Moreover, the N-terminal LiF-MS dataset predicts a potentially novel NFAT4 D-motif–JNK1 binding mode. In the second approach we used all sites above the FDR cutoff as constraints, regardless of their location on the structure or crosslinking status of neighboring residues, and then analyzed the full set of structural models by hierarchical clustering. As noted, most crosslinking occurred near the D-motif binding site in all experiments; consistent with this, the largest coherent cluster identified MAPK D sites (SI Appendix, Figs. S5 and S6). Overall, we found that the first approach provides superior accuracy of peptide structural detail, while the second is robust to off-target crosslinking or “noisy” data and may be more useful for initial identification of ligand binding sites in newly identified peptide docking interactions.

Information from LiF-MS should significantly enhance interpretation of data from other structural techniques. For example, site-specific footprinting maps produced by LiF-MS may be combined with NMR data to produce a more confident picture of dynamic contacts between an unstructured region and a folded domain. Additionally, conformations not resolved in X-ray crystal structures may be detected with LiF-MS, as with the mode of NFAT4–JNK1 supported by our analysis (SI Appendix, Fig. S2). Moreover, LiF-MS may further be used in tandem with hydrogen–deuterium exchange to verify the broad location of a peptide ligand–folded domain interaction, as both methods provide data strongly influenced by conformational dynamics of macromolecular assemblies.

Notably, LiF-MS indicates the location of a single labeled “point” on the peptide studied and thus can determine both the binding conformation and directionality of a disordered peptide within a binding site on a protein domain. Here we used peptide ligands with N- or C-terminal crosslinker to demonstrate that the MKK4 D-motif binds JNK1 and p38α in divergent conformations, confirming a previous hypothesis based on structural prediction (8, 9). Interestingly, most of MKK6’s basic N terminus is invisible in the p38α–MKK6 crystal structure, suggesting flexible binding and intrinsic disorder that may be similar for the MKK4 interaction (26).

LiF-MS provided excellent evidence for bidirectional interaction of MKK4’s D-motif with ERK2 that would be difficult to obtain using other biophysical methods. While the MKK4 ligand binds ERK2 with significantly lower affinity than D-motif–MAPK associations of known functional significance, we readily detected it and saw evidence of both docking peptide orientations across a range of peptide ligand concentrations. We consider it unlikely that ERK2’s D site promiscuously binds peptides with both reduced affinity and sequence specificity. Rather, we speculate that “reverse” binding of MKK4 to ERK2 is nonproductive for signaling, evolving in combination


with affinity-lowering interface differences to limit off-pathway MKK4–ERK2 signaling.

Overall, LiF-MS as detailed here is a straightforward method with unique ability to identify and map mobile interactions of peptide motifs with their specific binding sites on structured protein domains. The approach complements other structural methods, notably providing useful guidance to computational analysis of ligand–protein interactions. We anticipate that LiF-MS and related chemical modification techniques will prove important for understanding dynamic structures of intrinsically disordered protein regions and their sequence-specific associations with stably folded protein regions.

Materials and Methods

Protein Preparation, Crosslinking, and MS. SI Appendix, Supplementary Methods details methods we used to perform LiF-MS analysis of D-motif peptide–MAPK binding. This includes MAPK protein production, peptide crosslinking, and mass spectrometric analysis.

Crosslinked Residue Scoring and Mapping. We calculated confidence scores for each PSM by taking the reciprocal of the expect value (“E-value”) returned by COMET (SI Appendix, Fig. S1). We then added scores of distinct PSMs with crosslinked residues in the same location in the protein sequence to create a “confidence” value for each residue of the protein using the following formula:

$$C_r = \sum_{p=1}^{n} \left( E_p^{-1} \right),$$

where $C_r$ is the confidence value C of a crosslink at residue r in the protein amino acid sequence, $E_p$ is the E-value of PSM p, which contains a crosslink at residue r, and n is the total number of PSMs containing crosslinks at location r in the protein sequence. We dynamically established an FDR cutoff on a per-dataset basis, based on the location of the first decoy in a list of sorted E-values, rather than using a flat cutoff, because crosslinker signal strength varies greatly between datasets (SI Appendix, Fig. S1).
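The scoring rule above can be illustrated with a minimal Python sketch, assuming each PSM has already been reduced to a crosslinked residue position and an E-value. The function names, the tuple layout, and the toy values are hypothetical; the cutoff helper reflects one plausible reading of "location of the first decoy in a list of sorted E-values" and is not the authors' pipeline, which is described in SI Appendix, Fig. S1.

```python
from collections import defaultdict


def residue_confidence(psms):
    """Sum reciprocal E-values per crosslinked residue (C_r = sum over p of 1/E_p).

    psms: iterable of (residue_position, e_value) pairs, one per PSM.
    """
    scores = defaultdict(float)
    for residue, e_value in psms:
        scores[residue] += 1.0 / e_value
    return dict(scores)


def targets_above_first_decoy(ranked_psms):
    """Return E-values of target PSMs that rank better than the first decoy.

    ranked_psms: iterable of (e_value, is_decoy) pairs in any order.
    Assumption: PSMs are ranked by ascending E-value and everything from
    the first decoy onward is rejected.
    """
    accepted = []
    for e_value, is_decoy in sorted(ranked_psms, key=lambda item: item[0]):
        if is_decoy:
            break
        accepted.append(e_value)
    return accepted


# Toy input with made-up residue positions and E-values.
toy_psms = [(10, 0.001), (10, 0.002), (12, 0.05)]
scores = residue_confidence(toy_psms)

toy_ranked = [(0.05, False), (0.001, False), (0.08, True), (0.002, False), (0.09, False)]
accepted = targets_above_first_decoy(toy_ranked)
print(scores)
print(accepted)
```

Because the score is a sum of reciprocals, a single PSM with a very small E-value can dominate a residue's confidence, which is one reason a per-dataset cutoff is more natural here than a fixed score threshold.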

CABS-Dock Modeling of D-Motif Peptide Binding Using LiF-MS Constraints. For all computational modeling of peptide binding we performed CABS-dock simulations (39) using each amino acid from the indicated ranges as a point of constraint in an individual simulation run, adding a 12.0-Å flexible chain to account for the approximate length of the postreaction crosslinker. For initial analysis of MKK4–JNK1 association we used the top 3 highly ranked clusters of 4 or more crosslinked residues from experimental binding of MKK4 peptide I (GKRKALKLNFAN) to JNK1 as CABS-dock constraints, as well as JNK1 residue 165 from the fourth cluster. We used the crystal structure of JNK1 without the peptide present (PDB ID code 3PZE) (5), Rosetta FlexPepDock (38) for refinement, and the Bio3D R package (49) to calculate rmsds. Notably, repeating this approach using a JNK1 structure from a D-motif bound conformation (PDB ID code 2XS0) (6) produced lower model quality (SI Appendix, Fig. S4A).
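For readers unfamiliar with the rmsd measure used to compare peptide models, the following is a minimal sketch of the calculation. It assumes that both coordinate sets list the same atoms in the same order and already share one receptor frame, so no superposition is performed; it is a plain-Python illustration with made-up coordinates and does not reproduce the Bio3D implementation or its options.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy alpha-carbon traces (assumed values): the model is shifted 2 Å along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 2.0), (3.8, 0.0, 2.0), (7.6, 0.0, 2.0)]
deviation = rmsd(model, reference)
```

Because every atom in the toy model is displaced by the same 2 Å, the rmsd equals that displacement.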

Binding Cleft Discovery Using LiF-MS Constraints. For the C terminus of NFAT4 we constrained the C-terminal end of the sequence LYLPLE to JNK1 residues 161 to 166, and for full-length NFAT4 D-motif (LERPSRDHLYLPLE) we combined JNK1 residues 81 to 85 and 161 to 166 as N-terminal constraints with 161 to 166 as C-terminal constraints. We simulated all combinations of N- and C-terminal constraints for a total of 30 CABS-dock runs. For MKK4–JNK1 docking we used JNK1 residues 128 to 132 as N-terminal constraints for MKK4 peptide II (KRKALKLNFAN). For MKK6–p38α docking we constrained the N terminus of the MKK6 D-motif (KKRNPGLKIPK) to p38α residues 160 to 165. In all cases CABS-dock compiled the top 1,000 peptide conformation solutions in each simulation, and we removed peptides lacking at least one backbone alpha carbon within an rmsd of 6 to 8 Å of the crosslinked constraint residues. We analyzed the distribution of conformational states by complete-linkage agglomerative hierarchical clustering with the cluster and Bio3D R packages (40, 49), using interensemble rmsd of peptide binding structures as the distance metric. We then averaged the coordinates of alpha backbone carbons from the most populous cluster of peptide models and mapped these positions onto kinase structures, retaining SD information for each point.
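The clustering and averaging steps described above can be sketched in plain Python. This is an illustration only, not the analysis code from the study, which used the cluster and Bio3D R packages: the `cutoff` parameter for stopping the merges, the function names, and the toy coordinates are all assumptions, and the distance-based filtering of models is omitted.

```python
import math
from statistics import mean, pstdev


def rmsd(a, b):
    """RMSD between two equal-length alpha-carbon traces in a shared frame."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def complete_linkage(models, cutoff):
    """Merge clusters until the closest pair is farther apart than cutoff."""
    clusters = [[i] for i in range(len(models))]

    def dist(c1, c2):
        # Complete linkage: cluster distance is the largest pairwise RMSD.
        return max(rmsd(models[i], models[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, a, b = min(
            (dist(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cutoff:  # hypothetical stopping rule, not from the source
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def average_trace(models, members):
    """Per-position mean and SD of x, y, z across the cluster members."""
    means, sds = [], []
    for k in range(len(models[members[0]])):
        axes = list(zip(*(models[i][k] for i in members)))  # xs, ys, zs
        means.append(tuple(mean(axis) for axis in axes))
        sds.append(tuple(pstdev(axis) for axis in axes))
    return means, sds


# Toy two-residue peptide models (assumed values): three similar, one outlier.
models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0)],
    [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0)],
    [(20.0, 0.0, 0.0), (23.8, 0.0, 0.0)],
]
clusters = complete_linkage(models, cutoff=3.0)
largest = max(clusters, key=len)            # most populous cluster
means, sds = average_trace(models, largest)
```

Complete linkage scores a pair of clusters by its most dissimilar members, so a merged cluster never contains two models farther apart than the merge distance. That tends to produce compact groups of similar binding poses, and the per-coordinate SD then indicates how tightly each averaged position is defined.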

We employed a different approach to identify peptide binding sites using all crosslink sites that clear the FDR cutoff (SI Appendix, Figs. S5 and S6). To model JNK1–NFAT4 D-motif binding we combined consideration of NFAT4 peptide I (N-terminal crosslinker) and NFAT4 peptide II (C-terminal crosslinker). We used NFAT4 peptide I–JNK1 residues 26 to 27, 79 to 85, 161 to 167, and 245 to 251 and NFAT4 peptide II–JNK1 residues 26 to 27, 161 to 166, and 245 to 249 as constraints and compiled all NFAT4 peptide binding simulations for a total of 320 models. For JNK1 binding of MKK4 peptide I we used JNK1 residues 28 to 32, 128 to 132, 199 to 203, 237 to 241, 281 to 288, and 324 to 331 as constraints. For MKK6–p38α we used residues 160 to 165 as constraints. We performed agglomerative hierarchical clustering (complete linkage method) using rmsd between peptide models as the distance metric.

ACKNOWLEDGMENTS. We thank Felipe da Veiga Leprevost for help with MS/MS data analysis, Gergő Gógl and Attila Reményi for reagents, and Jennifer Brace for critical review and editing of the manuscript. This research was supported by the National Institute of General Medical Sciences of the NIH grant R01GM084223 to E.L.W. and in part by NIH grants R01GM94231 and U24CA210967 to A.I.N. This research used facilities supported by National Cancer Institute CCSG P30 CA060553 awarded to the Robert H. Lurie Comprehensive Cancer Center (Northwestern University Proteomics Core Facility and Structural Biology Facility) and the National Resource for Translational and Developmental Proteomics supported by P41 GM108569 (Northwestern University Proteomics Core Facility).

1. N. E. Davey et al., Attributes of short linear motifs. Mol. Biosyst. 8, 268–281 (2012).

2. N. London, D. Movshovitz-Attias, O. Schueler-Furman, The structural basis of peptide-protein binding strategies. Structure 18, 188–199 (2010).

3. P. Tompa, N. E. Davey, T. J. Gibson, M. M. Babu, A million peptide motifs for the molecular biologist. Mol. Cell 55, 161–169 (2014).

4. B. Uyar, R. J. Weatheritt, H. Dinkel, N. E. Davey, T. J. Gibson, Proteome-wide analysis of human disease mutations in short linear motifs: Neglected players in cancer? Mol. Biosyst. 10, 2626–2642 (2014).

5. T. Kaneko et al., Loops govern SH2 domain specificity by controlling access to binding pockets. Sci. Signal. 3, ra34 (2010).

6. T. Kaneko, S. S. Sidhu, S. S. C. Li, Evolving specificity from variability for protein interaction domains. Trends Biochem. Sci. 36, 183–190 (2011).

7. B. J. Mayer, SH3 domains: Complexity in moderation. J. Cell Sci. 114, 1253–1263 (2001).

8. Á. Garai et al., Specificity of linear motifs that bind to a common mitogen-activated protein kinase docking groove. Sci. Signal. 5, ra74 (2012).

9. A. Zeke et al., Systematic discovery of linear binding motifs targeting an ancient protein interaction surface on MAP kinases. Mol. Syst. Biol. 11, 837 (2015).

10. K. Van Roey et al., Short linear motifs: Ubiquitous and functionally diverse protein interaction modules directing cell regulation. Chem. Rev. 114, 6733–6778 (2014).

11. M. Kurcinski, M. Jamroz, M. Blaszczyk, A. Kolinski, S. Kmiecik, CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site. Nucleic Acids Res. 43, W419–W424 (2015).

12. N. E. Davey, M. S. Cyert, A. M. Moses, Short linear motifs–Ex nihilo evolution of protein regulation. Cell Commun. Signal. 13, 43 (2015).

13. W. Peti, R. Page, Molecular basis of MAP kinase regulation. Protein Sci. 22, 1698–1710 (2013).

14. S. R. Marcsisin, J. R. Engen, Hydrogen exchange mass spectrometry: What is it and what can it tell us? Anal. Bioanal. Chem. 397, 967–972 (2010).

15. J. B. Lucks et al., Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc. Natl. Acad. Sci. U.S.A. 108, 11063–11068 (2011).

16. K. M. Weeks, Advances in RNA structure analysis by chemical probing. Curr. Opin. Struct. Biol. 20, 295–304 (2010).

17. D. T. Krist, A. V. Statsyuk, Catalytically important residues of E6AP ubiquitin ligase identified using acid-cleavable photo-cross-linkers. Biochemistry 54, 4411–4414 (2015).

18. A. Belsom, M. Schneider, O. Brock, J. Rappsilber, Blind evaluation of hybrid protein structure analysis methods based on cross-linking. Trends Biochem. Sci. 41, 564–567 (2016).

19. L. Dubinsky, B. P. Krom, M. M. Meijler, Diazirine based photoaffinity labeling. Bioorg. Med. Chem. 20, 554–570 (2012).

20. J. Wang, J. Kubicki, H. Peng, M. S. Platz, Influence of solvent on carbene intersystem crossing rates. J. Am. Chem. Soc. 130, 6604–6609 (2008).

21. A. J. Bardwell, L. Bardwell, Two hydrophobic residues can determine the specificity of mitogen-activated protein kinase docking interactions. J. Biol. Chem. 290, 26661–26674 (2015).

22. A. J. Bardwell, E. Frankson, L. Bardwell, Selectivity of docking sites in MAPK kinases. J. Biol. Chem. 284, 13165–13173 (2009).

23. L. Bardwell, Mechanisms of MAPK signalling specificity. Biochem. Soc. Trans. 34, 837–841 (2006).

24. D. T. Ho, A. J. Bardwell, M. Abdollahi, L. Bardwell, A docking site in MKK4 mediates high affinity binding to JNK MAPKs and competes with similar docking sites in JNK substrates. J. Biol. Chem. 278, 32662–32672 (2003).

25. D. T. Ho, A. J. Bardwell, S. Grewal, C. Iverson, L. Bardwell, Interacting JNK-docking sites in MKK7 promote binding and activation of JNK mitogen-activated protein kinases. J. Biol. Chem. 281, 13169–13179 (2006).

21010 | www.pnas.org/cgi/doi/10.1073/pnas.1819533116 Parker et al.


26. E. Pellegrini et al., Structural basis for the subversion of MAP kinase signaling by an intrinsically disordered parasite secreted agonist. Structure 25, 16–26 (2017).

27. T. Tanoue, R. Maeda, M. Adachi, E. Nishida, Identification of a docking groove on ERK and p38 MAP kinases that regulates the specificity of docking interactions. EMBO J. 20, 466–479 (2001).

28. S.-H. Lee et al., Understanding pre-structured motifs (PreSMos) in intrinsically unfolded proteins. Curr. Protein Pept. Sci. 13, 34–54 (2012).

29. J. G. Olsen, K. Teilum, B. B. Kragelund, Behaviour of intrinsically disordered proteins in protein-protein complexes with an emphasis on fuzziness. Cell. Mol. Life Sci. 74, 3175–3183 (2017).

30. T. Tanoue, M. Adachi, T. Moriguchi, E. Nishida, A conserved docking motif in MAP kinases common to substrates, activators and regulators. Nat. Cell Biol. 2, 110–116 (2000).

31. C. A. Dimitri, W. Dowdle, J. P. MacKeigan, J. Blenis, L. O. Murphy, Spatially separate docking sites on ERK2 regulate distinct signaling events in vivo. Curr. Biol. 15, 1319–1324 (2005).

32. X. Liu et al., A conserved motif in JNK/p38-specific MAPK phosphatases as a determinant for JNK1 recognition and inactivation. Nat. Commun. 7, 10879 (2016).

33. J. K. Eng, T. A. Jahan, M. R. Hoopmann, Comet: An open-source MS/MS sequence database search tool. Proteomics 13, 22–24 (2013).

34. J. K. Eng et al., A deeper look into Comet–Implementation and features. J. Am. Soc. Mass Spectrom. 26, 1865–1874 (2015).

35. H. Okada et al., Peptide array X-linking (PAX): A new peptide-protein identification approach. PLoS One 7, e37035 (2012).

36. T. Zhou, L. Sun, J. Humphreys, E. J. Goldsmith, Docking interactions induce exposure of activation loop in the MAP kinase ERK2. Structure 14, 1011–1019 (2006).

37. V. Oza et al., Discovery of checkpoint kinase inhibitor (S)-5-(3-fluorophenyl)-N-(piperidin-3-yl)-3-ureidothiophene-2-carboxamide (AZD7762) by structure-based design and optimization of thiophenecarboxamide ureas. J. Med. Chem. 55, 5130–5142 (2012).

38. B. Raveh, N. London, O. Schueler-Furman, Sub-angstrom modeling of complexes between flexible peptides and globular proteins. Proteins 78, 2029–2040 (2010).

39. M. Blaszczyk et al., Modeling of protein-peptide interactions using the CABS-dock web server for binding site search and flexible docking. Methods 93, 72–83 (2016).

40. M. Maechler, P. Rousseeuw, A. Struyf, M. Hubert, K. Hornik, Cluster: Cluster analysis basics and extensions. R package version 2.1.0 (2019).

41. B. T. Chait, M. Cadene, P. D. Olinares, M. P. Rout, Y. Shi, Revealing higher order protein structure using mass spectrometry. J. Am. Soc. Mass Spectrom. 27, 952–965 (2016).

42. C. Arlt et al., An integrated mass spectrometry based approach to probe the structure of the full-length wild-type tetrameric p53 tumor suppressor. Angew. Chem. Int. Ed. Engl. 56, 275–279 (2017).

43. C. Arlt, C. H. Ihling, A. Sinz, Structure of full-length p53 tumor suppressor probed by chemical cross-linking and mass spectrometry. Proteomics 15, 2746–2755 (2015).

44. A. Sinz, Cross-Linking/mass spectrometry for studying protein structures and protein-protein interactions: Where are we now and where should we go from here? Angew. Chem. Int. Ed. Engl. 57, 6390–6396 (2018).

45. J. E. Horne et al., Rapid mapping of protein interactions using tag-transfer photocrosslinkers. Angew. Chem. Int. Ed. Engl. 57, 16688–16692 (2018).

46. Y. Yang et al., Genetically encoded protein photocrosslinker with a transferable mass spectrometry-identifiable label. Nat. Commun. 7, 12299 (2016).

47. Y. Yang et al., Genetically encoded releasable photo-cross-linking strategies for studying protein-protein interactions in living cells. Nat. Protoc. 12, 2147–2168 (2017).

48. S. Lin et al., Genetically encoded cleavable protein photo-cross-linker. J. Am. Chem. Soc. 136, 11860–11863 (2014).

49. B. J. Grant, A. P. C. Rodrigues, K. M. ElSawy, J. A. McCammon, L. S. D. Caves, Bio3d: An R package for the comparative analysis of protein structures. Bioinformatics 22, 2695–2696 (2006).

Parker et al. PNAS | October 15, 2019 | vol. 116 | no. 42 | 21011
