
Supporting Information - PNAS · 2016-11-01

Supporting Information
Li et al. 10.1073/pnas.1611228113

SI Results

In agreement with crystallographic observations, during SAXS analysis of STAT6CF, the Dmax of STAT6CF decreased from 164.2 to 150.9 Å after binding of DNA. Furthermore, the Rg and Vporod calculated from SAXS data also decreased after DNA binding (Fig. S7), indicating an obvious change in morphology of the STAT6CF dimer induced by DNA binding. Moreover, Kratky plots generated from SAXS data reveal a bump around s = 0.1 Å−1 for STAT6CF compared with the STAT6CF-N4 complex, suggesting that the DNA-free form of phosphorylated STAT6CF is probably more flexible than the STAT6CF-N4 complex in the mid-q region (Fig. S7). This observation also indicates that the conformation of STAT6CF could be stabilized by DNA.

To further investigate the DNA binding dynamics, MD simulations of the STAT6CF-N4 complex and apo STAT6CF (after removal of DNA from the STAT6CF-N4 complex structure) were performed. The total potential energy remained steady during the simulation (Fig. S6 C and D). The stabilized structures from the simulations do not deviate significantly from the respective crystal structures. This observation suggests that the crystal structures could be reasonable representatives of these molecules in their physiological states. Using the distance between the two N-terminal coiled coil domains as the reaction coordinate, the one-dimensional free energy profile was calculated. The structure of apo STAT6CF shows large structural flexibility, as revealed by the large distance fluctuation between the apo STAT6CF state and the DNA-bound state (Fig. S6 C and D). In contrast, the tendency of the protein to undergo conformational change is significantly reduced in the STAT6CF-DNA complex. These simulation results clearly indicate a decreased structural flexibility of STAT6CF after DNA binding.

Thus, the crystal structures, SAXS analysis, and MD simulation experiments indicate that the conformation of STAT6CF changes substantially after DNA binding. This conformation of DNA-bound STAT6CF is stable compared with the unliganded STAT6CF.
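The text does not state how the one-dimensional free energy profile was obtained from the coiled coil distance. A common approach, assumed here purely for illustration, is Boltzmann inversion of the distance histogram, F(d) = −kT ln P(d). The sketch below uses hypothetical names (`free_energy_profile`, `bin_width`) and toy distances, and is not the authors' protocol.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(distances, bin_width, temperature=300.0):
    """Boltzmann-invert a distance histogram: F = -kT ln P, shifted so min(F) = 0.

    Returns a dict mapping the left edge of each populated bin to its
    relative free energy in kcal/mol. Empty bins are simply absent.
    """
    if not distances:
        raise ValueError("no distances supplied")
    counts = Counter(math.floor(d / bin_width) for d in distances)
    total = len(distances)
    kt = K_B * temperature
    raw = {b * bin_width: -kt * math.log(c / total) for b, c in counts.items()}
    f_min = min(raw.values())
    return {edge: f - f_min for edge, f in sorted(raw.items())}


# Toy distances (in angstroms), not data from the simulations.
toy_profile = free_energy_profile([10.1, 10.2, 10.3, 11.5], bin_width=1.0)
for edge, energy in toy_profile.items():
    print(f"{edge:6.1f} A  {energy:.3f} kcal/mol")
```

In such a profile, a narrow well corresponds to a restrained distance (as described for the DNA-bound complex), whereas a broad or shallow well corresponds to large distance fluctuations.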

SI Discussion

In the crystal structure of the STAT6CF-DNA complex, two molecules of STAT6CF form a homotypic dimer to bind one duplex of DNA. Each protomer of STAT6 within a dimer binds half of the palindromic DNA. However, the DNA sequences used for crystallization are not palindromic in a strict sense. Consequently, there is a possibility that the DNAs may have packed in random orientations, similar to the crystallographic average observed in STAT1CF-DNA and STAT3CF-DNA complex structures (9, 10). In our particular case, the moderate resolution of the two protein-DNA complexes prevents us from drawing similar conclusions. However, this ambiguity does not impair the interpretation of the structures. Most of the interactions during DNA binding occur between STAT6 and the sugar-phosphate backbone. Only one interaction of STAT6, with G14 (N4 site)/G13 (N3 site), is directed toward a base. Incidentally, this base is located within the palindromic region.

Recently, STAT6 was shown to participate in immune responses initiated by host cells during viral infections. TBK1 was shown to phosphorylate S407 of the DBD of STAT6 (5). However, our crystal structures of STAT6 reveal that S407 is buried and not exposed on the surface. Specifically, the side chain of S407 points toward the interior of the protein and forms two hydrogen bonds with its neighboring residues V386 and L408 (Fig. S8A). The location and conformation of S407, as well as the microenvironment around it, are highly conserved in all of the STAT structures reported thus far (PDB ID codes 1BF5, 1BG1, 1Y1U, 1YVL, 3CWG, and 4E68). Therefore, residue S407 is not likely to be accessible for phosphorylation by any kinase in the conformations of the protein observed in the crystal structures. Furthermore, a luciferase reporter-based assay demonstrated that an S407E mutation in STAT6, which was generated to simulate the phosphorylated state, compromised the ability of STAT6 to activate the reporter gene in both IL-4 and antiviral signaling pathways. Another mutant of STAT6, S407A, retains only partially the ability of the WT protein to activate the reporter gene in an IL-4–dependent manner (Fig. S8B). The S407E and S407A mutants of STAT6CF are insoluble when expressed using the E. coli-based expression system, limiting any further structural and biochemical characterization of these mutants. The S407 mutants of STAT6 may therefore have lost their function due to misfolding of the protein. Analysis of the structures of STAT proteins indicates that if S407 is indeed phosphorylated during the antiviral response, the DNA-binding domain would have to undergo a large conformational change to bring S407 to the surface so that it could be accessed by a kinase. Given the conformational diversity of STAT proteins, this could be possible. Further structural and biochemical studies are required to shed more light on the role of additional phosphorylation sites of STAT6.

SI Materials and Methods

Protein Expression and Purification. The human STAT6CF (aa 123–658) was cloned into the pMCSG7 expression vector using the ligation-independent cloning method as described previously (27, 28). In brief, the clone expressed STAT6CF with an N-terminal 6x His-tag. Phosphorylation of the protein was achieved by coexpressing STAT6CF with the tyrosine kinase receptor domain of Elk in the E. coli BL21 (DE3) TKB1 strain (Agilent Technology) as described previously (17, 18). Harvested cells were suspended in PBS and lysed by ultrasonication. Soluble recombinant protein present in the clarified supernatant was isolated and purified using a Ni-NTA column (Qiagen) followed by further purification using a heparin column. The phosphorylated protein was eluted from the heparin column with ∼400 mM NaCl and then incubated with TEV protease overnight at 16 °C to remove the His-tag. The tag-less protein was loaded on a Superdex S200 column (GE Healthcare) equilibrated with 20 mM Hepes (pH 7.0), 200 mM NaCl, 0.5 mM EDTA, 10 mM MgCl2, and 4 mM DTT (Fig. S3A). Pure, phosphorylated STAT6CF eluted in a single peak. Fractions containing the protein were pooled and concentrated for crystallization and other experiments. Unphosphorylated STAT6CF was overexpressed in E. coli BL21 (DE3) and purified by Ni2+ affinity chromatography followed by gel filtration chromatography using the same procedures as described for phosphorylated STAT6CF. The phosphorylated core fragment of human STAT1 (aa 132–713, STAT1CF) was cloned, expressed, and purified using the same protocols as described for phosphorylated STAT6CF.

Crystallization. Phosphorylated STAT6CF was concentrated to 20 mg/mL. The protein concentration was estimated using a NanoDrop 2000 machine (Thermo Scientific). Phosphorylated STAT6CF was screened for crystallization using commercially available sparse matrix screens. Crystallization drops containing protein and precipitant (1:1 ratio) were dispensed by a Mosquito robot (TTP LabTech). Optimization of the crystals was performed manually in hanging drops containing 1.0 μL protein mixed with 1.0 μL mother liquor. Crystals for data collection were grown in 0.1 M imidazole (pH 7.5), 0.2 M Li2SO4, and 12% (wt/vol) PEG3000.

Single-stranded DNAs used in the study were chemically synthesized by Sangon Biotech. Double-stranded DNAs were produced by annealing. Nucleic acids were dissolved in the gel filtration chromatography buffer. All of the STAT6CF-DNA complexes were obtained by incubating the purified phosphorylated STAT6CF (10 mg/mL) with annealed oligonucleotide duplexes at a molar ratio of 1:1.2 for 1 h in an ice bath. A variety of DNAs of different lengths were used for formation of protein-DNA complexes. These complexes were screened for crystallization. Phosphorylated STAT6CF in complex with the 22-bp N4 site duplex and 21-bp N3 site duplex formed crystals that were suitable for data collection. Crystals of both the complexes were grown in 0.1 M citrate (pH 5.6), 0.1 M NaCl, 20% (vol/vol) isopropyl alcohol, and 8% (wt/vol) PEG4000.

Data Collection and Structure Determination. All crystals were harvested, cryoprotected in the mother liquor containing an additional 25% (vol/vol) glycerol, and then flash-frozen at 100 K in liquid nitrogen. The diffraction data for STAT6CF were collected on beamline BL17U1 at the Shanghai Synchrotron Radiation Facility (SSRF). The data for the STAT6CF and N4 site DNA complex (STAT6CF-N4 complex) were collected on beamline BL19U of SSRF. The data for the STAT6CF and N3 site DNA complex (STAT6CF-N3 complex) were collected on beamline 23ID-C of the General Medical Sciences and Cancer Institutes Structural Biology Facility–Collaborative Access Team (GM/CA-CAT), Argonne National Laboratory. All datasets were indexed, integrated, and scaled using the HKL2000 software package (29). All three structures were determined by MR. The structure of STAT6CF was solved first using the structure of mouse STAT5 (PDB ID code 1Y1U) as the search model during MR (13). Subsequent structures of the two complexes of STAT6CF with DNA were solved by MR using our own STAT6CF structure as the search model. All three structures were manually improved in Coot (30), alternating with refinement of each structure in Phenix Refine (31). The details of data collection and refinement statistics are listed in Table S3.

ITC. ITC measurements were performed on an iTC200 calorimeter (Microcal). All experiments were carried out in the buffer containing 20 mM Hepes (pH 7.0), 200 mM NaCl, 10 mM MgCl2, 4 mM DTT, and 0.5 mM EDTA at 25 °C. Phosphorylated STAT6CF and its mutants were placed into the sample chamber, and different DNAs were added using the syringe with 20 successive additions of 2 μL for 4 s (the first injection was 0.5 μL), and the injection interval was 120 s. The data analysis was performed using ORIGIN software (Microcal).

SPR Assay. The interactions between STAT1CF or STAT6CF and DNAs were further explored using Biacore 3000 at 25 °C. Running buffer was composed of 20 mM Hepes (pH 7.0), 150 mM NaCl, 1 mM MgCl2, and 0.1% Tween-20. DNAs labeled with biotin at the 5′ end were immobilized onto two ligand channels of a SA sensor chip (GE Healthcare) with ∼50 RU each. First, we made a blank injection of the running buffer, followed by injections of the proteins at different concentrations (0.625 nM to 5.12 μM) in the running buffer at a flow rate of 30 μL/min for 60 s. Each injection was followed by 120-s dissociation and 15-s regeneration with 0.1% SDS. The sensorgrams obtained were fit simultaneously after subtracting a reference blank of running buffer using Biacore 3000 evaluation software to obtain the association (ka) and dissociation (kd) rate constants. The kinetic KD was calculated from these two rate constants (KD = kd/ka). An equilibrium analysis of the data was also performed to calculate the KD.
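The relation between the fitted rate constants and the kinetic KD can be sketched as follows. The rate constants below are hypothetical placeholders, not measured values. The source gives only the endpoints of the concentration range; the twofold dilution series generated here is an assumption that happens to be consistent with those endpoints.

```python
import math


def kinetic_kd(ka_per_molar_per_s, kd_per_s):
    """Equilibrium dissociation constant (M) from on and off rate constants."""
    if ka_per_molar_per_s <= 0:
        raise ValueError("association rate constant must be positive")
    return kd_per_s / ka_per_molar_per_s


def twofold_series(lowest_nM, n_points):
    """Concentrations (nM) of an assumed twofold dilution series, ascending."""
    return [lowest_nM * 2 ** k for k in range(n_points)]


# Hypothetical rate constants, chosen only to illustrate the arithmetic.
example_kd = kinetic_kd(ka_per_molar_per_s=1.0e5, kd_per_s=1.0e-3)
print(f"KD = {example_kd:.2e} M")

# Assumed series: 14 twofold steps starting at 0.625 nM.
series = twofold_series(0.625, 14)
print(f"{series[0]} nM to {series[-1]} nM")
```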

Dual Luciferase Assay. Full-length STAT6 (STAT6FL) with an N-terminal Flag-tag was cloned into pCDNA3.1/hygro(−) vector between the XhoI and KpnI restriction sites. HEK293T cells were transfected using polyethylenimine (PEI) with p4×STAT6-Luc2P (a STAT6-Firefly luciferase plasmid purchased from Addgene; 35554) along with renilla-luciferase plasmid as a transfection control and WT or mutated STAT6FL plasmids. The STAT6FL-Firefly luciferase plasmid contains four copies of the STAT6 N4 site response element 5′-TTCccaaGAA-3′. An N3 site response element 5′-TTCcaaGAA-3′ was obtained through site-directed mutagenesis. The lowercase letters in the sequences indicate the spacer sequences between the consensus sequences. Cells were treated 24 h after transfection with IL-4 (10 ng/mL) or transfected with Poly(dA:dT) (Invivogen) by Lipofectamine 2000 (2 μg/μL) for 2 h. Cell lysates were collected and analyzed using the Promega Dual-Luciferase Reporter Assay System (32). Mean ratio luciferase/renilla light unit activities are shown for triplicate samples.
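The normalization described here (firefly/renilla ratio averaged over triplicates, then expressed relative to WT as in Fig. S8B) can be sketched in a few lines. The function names and light-unit readings are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def mean_ratio(firefly, renilla):
    """Mean firefly/renilla ratio over replicate wells."""
    if len(firefly) != len(renilla) or not firefly:
        raise ValueError("need equally sized, non-empty replicate lists")
    return mean(f / r for f, r in zip(firefly, renilla))


def percent_of_wt(sample_ratio, wt_ratio):
    """Express a sample ratio as percent of the WT ratio."""
    return 100.0 * sample_ratio / wt_ratio


# Hypothetical light-unit readings for triplicate wells.
wt = mean_ratio([200.0, 400.0, 600.0], [100.0, 100.0, 100.0])
mutant = mean_ratio([100.0, 200.0, 300.0], [100.0, 100.0, 100.0])
print(f"mutant activity: {percent_of_wt(mutant, wt):.1f}% of WT")
```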

AUC. Analytical sedimentation velocity experiments were conducted using a ProteomeLab XL-I system (Beckman Coulter) according to our previous report (32). Briefly, an An-60Ti rotor was used to centrifuge protein samples suspended in 20 mM Hepes (pH 7.0), 200 mM NaCl, 0.5 mM EDTA, 10 mM MgCl2, and 4 mM DTT. Unphosphorylated STAT6CF was centrifuged at 55,000 rpm with an A280nm of ∼0.7, whereas phosphorylated STAT6CF and the STAT6CF-N4 complex were centrifuged at 30,000 rpm with an A280nm of ∼0.7. A set of 93 scans was collected at 1-min intervals. Sedfit software was used for size distribution analysis with a continuous c(s) distribution model with the default parameters. After interpretation and refinement of the results, the distribution was displayed and exported with a confidence level (F-ratio) of 0.9. The molecular weights (MWs) of the globular proteins were obtained by converting the S values with the f/f0 ratio in Sedfit. The percentage distribution of each peak was obtained by S peak integration in Sedfit.

Mutagenesis. Mutagenesis was performed using a QuikChange site-directed mutagenesis kit following the manufacturer’s instructions (Agilent Technology). All of the recombinant plasmids were sequenced to verify the clones. Mutants were overexpressed and purified following the same procedures as described for the phosphorylated STAT6CF.

TSA. TSA was conducted using 0.05 mg/mL STAT6CF with or without 50 μM N4 site DNA (CS4) in 20 mM Hepes (pH 7.0), 200 mM NaCl, 10 mM MgCl2, 4 mM DTT, and 0.5 mM EDTA supplemented with a 1:1,000 dilution of SYPRO Orange dye (Invitrogen). The experimental procedure and data analysis are the same as previously described (28).

SAXS. SAXS data for unphosphorylated STAT6CF, phosphorylated STAT6CF, and its complex with N4 site DNA were collected on beamline 12.3.1 at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory, and the data were treated as previously described (28, 33, 34). Each sample was measured at three exposures (0.5, 1.0, and 2.0 s) at 10 °C and three concentrations (2.5, 5.0, and 10.0 mg/mL) in a buffer composed of 20 mM Hepes (pH 7.0), 200 mM NaCl, 0.5 mM EDTA, 10 mM MgCl2, and 4 mM DTT. The scattering intensity I(s) was measured for s values (s = 4πsinθ/λ, where 2θ is the scattering angle) ranging from 0.01 to 0.3 Å−1. The data were integrated, scaled, and buffer subtracted to obtain the standard scattering curves. Scattering curves at a concentration of 2.5 mg/mL and an exposure time of 0.5 s were used for further calculation. The initial Rg values from the Guinier plot analysis, the pair distance distribution function, P(r), curves, and the MWs were calculated as described in ref. 34.
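As an illustration of the two quantities named in this paragraph, the sketch below computes s from the scattering angle and estimates Rg from the standard Guinier approximation, ln I(s) = ln I(0) − (Rg² s²)/3, by a linear least-squares fit. It is a generic sketch on synthetic data; the software and fitting range actually used follow ref. 34 and are not reproduced here, and all function names are hypothetical.

```python
import math


def momentum_transfer(two_theta_rad, wavelength_A):
    """s = 4*pi*sin(theta)/lambda, with 2*theta the scattering angle."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength_A


def guinier_fit(s_values, intensities):
    """Fit ln I versus s^2 with a straight line; return (Rg, I0)."""
    x = [s * s for s in s_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic low-angle data generated from an assumed Rg of 40 A.
s_grid = [0.010 + 0.002 * k for k in range(11)]
synthetic = [1000.0 * math.exp(-(40.0 ** 2) * s * s / 3.0) for s in s_grid]
rg, i0 = guinier_fit(s_grid, synthetic)
print(f"Rg = {rg:.2f} A, I(0) = {i0:.1f}")
```

The Guinier approximation only holds at small s (commonly taken as s·Rg below about 1.3), which is why the synthetic grid stops at 0.03 Å−1.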


MD Simulation. All of the simulations were performed by GROMACS 1.4.9. In the simulation system, counter ions were added for charge neutralization. To prepare for the simulations, the system temperature was increased from 0 to 300 K in 5 ns with harmonic restraints on the protein and N4 site DNA, followed by several nanoseconds of equilibration. The temperature was kept close to 300 K by using a V-rescale thermostat, whereas the pressure was maintained at 1 bar using the Berendsen barostat method. The particle mesh Ewald (PME) algorithm was applied to calculate electrostatic interactions. The Linear Constraint Solver (LINCS) was used to constrain the bond lengths.

The principal component analysis (PCA) was carried out to address the collective motions of STAT6 protein by using the positional covariance matrix C of the atomic coordinates and its eigenvectors. The elements of the positional covariance matrix C are defined by the following equation:

$$C_{ij} = \left\langle \left(x_i - \langle x_i \rangle\right)\left(x_j - \langle x_j \rangle\right) \right\rangle \quad (i, j = 1, 2, 3, \ldots, 3N), \qquad [1]$$

where x_i is the Cartesian coordinate of the ith Cα atom, N is the number of Cα atoms considered, and ⟨x_i⟩ represents the time average over all of the configurations obtained in the simulation.
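Eq. 1 can be written out directly in code. The sketch below is a minimal pure-Python illustration on a toy trajectory, with each frame flattened to its 3N Cα coordinates; the function name and the data are hypothetical, and it is not the analysis tool used for the simulations.

```python
def covariance_matrix(frames):
    """Positional covariance matrix of Eq. 1.

    frames: list of configurations, each a flat list of 3N coordinates.
    The angle brackets of Eq. 1 are taken as the average over all frames.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same number of coordinates")
    means = [sum(f[i] for f in frames) / n_frames for i in range(dim)]
    return [
        [
            sum((f[i] - means[i]) * (f[j] - means[j]) for f in frames) / n_frames
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Toy trajectory: two frames of a single atom (3 coordinates).
toy_frames = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
for row in covariance_matrix(toy_frames):
    print(row)
```

The collective motions are then obtained from the eigenvectors of C; for a real trajectory this step would normally be delegated to a linear algebra routine such as `numpy.linalg.eigh`, which is omitted here to keep the sketch dependency-free.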

Statistical Analysis. The significance of differences between groups exhibiting similar variance was evaluated using the Student t test.
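For reference, the equal-variance (pooled) form of the Student t statistic for two groups can be sketched as below. The source does not say whether the test was paired or unpaired, or one- or two-tailed; the unpaired form is assumed here, and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Unpaired Student t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes similar variance in both groups.
    """
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical triplicate measurements for two conditions.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"t = {t_stat:.3f}, df = {dof}")
```

Converting t to a P value requires the t distribution; in practice a statistics library routine such as `scipy.stats.ttest_ind` would typically be used for that step.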

Fig. S1. DNA sequences preferred by STAT proteins based on the existing experimental determination of STAT DNA-binding motifs (7, 35).


Fig. S2. Multiple sequence alignment of STAT6 and other STAT proteins produced by ClustalW and ESPript (espript.ibcp.fr/ESPript/ESPript/). Every 10 residues are indicated with a dot (·) shown above the sequences. Strictly conserved residues are boxed in white on a red background, and highly conserved residues are boxed in red on a white background. The key residue (H415 in STAT6) for distinguishing N4 and N3 site DNA and the strictly conserved tyrosine phosphorylation site (Y641 in STAT6) are highlighted by a black box and a green box, respectively. The core fragment of STAT6 (aa 123–658) used in the study is marked.


Fig. S3. Characterization of STAT6CF. (A) Comparison of elution profiles of unphosphorylated STAT6CF (monomer), phosphorylated STAT6CF (dimer), and STAT6CF-N4 complex during gel filtration chromatography. (B) AUC analysis of unphosphorylated STAT6CF, phosphorylated STAT6CF, and STAT6CF-N4 complex shows that unphosphorylated STAT6CF forms a monomer, whereas phosphorylated STAT6CF forms a dimer in solution. (C) TSA results of unphosphorylated STAT6CF and phosphorylated STAT6CF with or without DNA. Both unphosphorylated and phosphorylated STAT6CF had high melting temperature, but only phosphorylated STAT6CF formed a complex with DNA, which resulted in a higher melting temperature than the protein alone. (D) ITC results show that phosphorylated STAT6CF interacts with DNA with a molar ratio of 2:1. (E) Identification of phosphorylation of Y641 by MS. The identified peptide is “DGRGYVPATIK” phosphorylated at Y5, Charge: +3 and monoisotopic m/z: 419.53912 Da.


Fig. S4. STAT6CF dimer formation and its complex with N4 DNA. (A) Cartoon representation of STAT6CF homodimer, shown from the front view (Upper) and the top view (Lower). The coloring scheme for each domain is the same as in Fig. 1B. The N and C termini of one protomer are labeled with N and C, and missing loops are shown by dashed lines. (B) Superimposition of STAT6CF and previously reported unphosphorylated STAT1CF (PDB ID code 1YVL), STAT3CF (PDB ID code 3CWG), and STAT5CF (PDB ID code 1Y1U) protomers is shown. (C) Drawing depicting details of the interactions between the phosphorylated tail of Y641 from one molecule and the SH2 domain of another molecule. The phosphorylated tail is shown in green; the residues directly participating in the hydrogen bonding interactions are colored magenta. (B) Drawing depicting details of the interactions between two phosphorylated tail fragments as an antiparallel β-sheet. The residue K647 was placed at the center of the dimer interface. (C) Superimposition of STAT6 over STAT1 shows that STAT6CF (magenta) has a shorter C-terminal loop than STAT1CF (green). The C-terminal loops of both proteins are highlighted by black ellipse. (D) Surface electrostatic potential representation of STAT6CF bound with 22-bp N4 site DNA. A positively charged area inserts into the major groove of dsDNA. Blue, positively charged; red, negatively charged; white, neutral. (E) A simulated annealing (SA) omit electron density map (2Fo-Fc), contoured at 1.0σ, of N4 site dsDNA bound to phosphorylated STAT6CF. (F) Electron density map (2Fo-Fc), contoured at 1.0σ, of H415 at DNA binding domain of STAT6. (G) DNAs in STAT1/STAT3 complex structures are bent, forming a 140° angle. (H) DNA molecules in STAT6CF complex are straight and connected to each other end to end, depicting an appearance of a long continuous stretch of DNA in the crystal packing. (I) Residues of the DNA binding interface of STAT6 are highly conserved. Amino acid conservation of STAT6CF via 150 homologs was displayed on STAT6CF-N4 structure using ConSurf (consurftest.tau.ac.il/). (J) Superimposition of N4 site DNA bound STAT6 with STAT5 (PDB ID code 1Y1U) showing the position of side chain of H415 in STAT6 (cyan) and H471 in STAT5 (orange). (K) Structures of apo STAT6CF and its complex with N4 and N3 site DNA were aligned with STAT1-N3 DNA structure using SH2 domain as reference (indicated by an arrow). A larger angle at the intersection where monomers meet is observed in STAT6CF structures (Inset). Magenta, DNA-free STAT6CF; green, N3 site DNA-bound STAT6CF; cyan, N4 site DNA-bound STAT6CF; blue, N3 site DNA-bound STAT1CF.


Fig. S5. Crystal structure of STAT6CF-N3 complex and its comparison with STAT6CF-N4 complex structure. (A) Cartoon diagram of the STAT6CF-N3 complex. Colors of each domain are the same as in Fig. 1B. (B) Drawing depicting details of the STAT6CF-N3 interface. The side chains of residues (A chain) donating hydrogen bonds are shown by sticks in lemon and hydrogen bonds are shown in gray dash. The conserved palindromic bases (TTC/GAA) are shown in orange. (C) A simulated annealing (SA) omit electron density map (2Fo-Fc), contoured at 1.0σ, of N3 site DNA in STAT6CF-N3 complex. (D) Electron density map (2Fo-Fc), contoured at 1.0σ, of H415 at DNA binding domain of STAT6. Hydrogen bond formed by H415 and G13 of DNA chain is shown in gray dash. (E) Conformational changes between STAT6CF-N3 and N4 site DNA complexes. The DNA binding domains are aligned together in one protomer of the STAT6CF dimer and the difference in the movement of H415-base interactions in the DNA binding domains in another protomer of the STAT6CF dimer is shown (in bottom view).


Fig. S6. (A) Our STAT6CF-N3 site DNA complex structure compared with previously reported N3 site DNA bound structures using the DNA binding domain as the reference. The structures show very little divergence and the residues, H415 in STAT6, N460 in STAT1, and N466 in STAT3 (indicated by red circles), are almost at the same place as in the two protomers of the dimer. (B) The comparison of DNA binding domain of STAT6CF-N4 site DNA complex structure (cyan) and STAT1CF-N3 site DNA complex structure (blue) is shown as cartoon representation. The steric hindrance between residue N417 in STAT6 and the DNA base (the fifth T) in the N3 site DNA bound STAT1 (PDB ID code 1BF5) are shown as spheres. (C and D) The conformation of STAT6CF is stabilized by DNA binding. Cartoon display of the region selected for molecular dynamics simulation calculation (red circle) on the model containing missing loops generated by MODELER program based on our STAT6CF-N4 complex (C) and time series of the distance between N-terminal coiled coil domains from molecular dynamics simulation (Upper) and one-dimensional free energy profile as the function of N-terminal coiled coil domain distance for Apo-STAT6CF (black) and STAT6CF-N4 complex (red), respectively (Lower). The dotted lines in the upper panel show the observed distances, derived from crystal structures of phosphorylated STAT6CF (black) and STAT6CF-N4 complex (red).


Fig. S7. SAXS analysis of unphosphorylated STAT6CF, phosphorylated STAT6CF, and STAT6CF-N4 complex. Scattering curves (A) and P(r) distribution (B) of unphosphorylated STAT6CF (black), phosphorylated STAT6CF (red), and STAT6CF-N4 complex (green) are indicated. The Inset in A shows Guinier plots. (C) Kratky plots. (D and E) Experimental scattering profiles (in blue) of the 2.5 mg/mL phosphorylated STAT6CF (D) and STAT6CF-N4 complex (E) vs. the ideal scattering profiles (in green) of their corresponding crystal structures (phosphorylated STAT6CF, χ² = 2.089; STAT6CF-N4 complex, χ² = 1.060, respectively) calculated using CRYSOL (36). Also shown in these figures are the ribbon models of the X-ray crystal structures superposed onto the molecular envelopes based on SAXS data calculated by program DAMMIF (37). These envelopes were calculated from the average of 16 DAMMIF runs with P1 symmetry. The crystal structures were superimposed onto their corresponding envelopes using ref. 38. (F) Parameters derived from SAXS curves.
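
For readers less familiar with the Guinier and Kratky representations named in this legend, the following is a minimal sketch of the two standard transforms. It is not the software used in the study. It assumes the scattering vector is supplied as q in Å⁻¹ (the text uses s, whose definition may differ), and the function names are hypothetical. The Guinier relation ln I(q) = ln I(0) - Rg²q²/3 holds only at low angles (roughly q·Rg below about 1.3), so only low-q points should be passed to the fit.

```python
import math

def kratky(q, intensity):
    """Return (q, q^2 * I(q)) pairs, the quantities plotted in a Kratky plot."""
    return [(qi, qi * qi * ii) for qi, ii in zip(q, intensity)]

def guinier_fit(q, intensity):
    """Least-squares line through ln I versus q^2.

    Returns (Rg, I0). The caller is responsible for passing only
    points inside the Guinier region.
    """
    if len(q) != len(intensity) or len(q) < 2:
        raise ValueError("need at least two matching (q, I) points")
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data are not in a Guinier regime")
    # slope = -Rg^2 / 3 and intercept = ln I(0)
    return math.sqrt(-3.0 * slope), math.exp(intercept)
```

In a Kratky plot, a compact folded particle typically gives a bell-shaped peak that returns toward the baseline, whereas added flexibility raises the curve at higher angles.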


Fig. S8. S407 of STAT6 is not accessible for phosphorylation. (A) Side chain of S407 points toward the interior of STAT6 and forms two hydrogen bonds with the residues V386 and L408, which was confirmed by electron density map (2Fo-Fc), contoured at 1.0σ, of S407. (B) HEK 293T cells were transfected with a 4 × STAT6 luciferase (Luc) reporter, renilla (Ren) reporter, and pcDNA3.1 empty (NTC), STAT6FL-WT, STAT6FL-S407E, or STAT6FL-S407A. After 24 h, cells were stimulated with IL-4 (10 ng/mL) or transfected with poly(dA:dT) (2 μg/μL) for 2 h. Mean ratio luciferase/renilla light units activity is shown for triplicate samples. Normalized results are presented as percent activity relative to the activity in cells transfected with STAT6FL-WT.
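
The normalization described in panel B can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: it assumes the luciferase/renilla ratio is taken per replicate and then averaged (the legend does not say whether ratios or raw readings were averaged first), the function names are hypothetical, and the numbers are placeholders rather than data from the study.

```python
from statistics import mean

def mean_ratio(luciferase, renilla):
    """Mean of per-replicate luciferase/renilla ratios (assumed order of operations)."""
    if len(luciferase) != len(renilla) or not luciferase:
        raise ValueError("need equal-length, non-empty replicate lists")
    return mean(l / r for l, r in zip(luciferase, renilla))

def percent_of_reference(ratios, reference):
    """Express each condition's mean ratio as a percentage of the reference condition."""
    ref = ratios[reference]
    return {name: 100.0 * value / ref for name, value in ratios.items()}

# Placeholder readings for illustration only; these are NOT data from the study.
ratios = {
    "STAT6FL-WT": mean_ratio([200.0, 220.0, 180.0], [10.0, 10.0, 10.0]),
    "STAT6FL-S407A": mean_ratio([100.0, 110.0, 90.0], [10.0, 10.0, 10.0]),
}
relative = percent_of_reference(ratios, "STAT6FL-WT")
```

With this scheme the reference condition is 100% by construction, so each mutant is read directly as a fraction of wild-type activity.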

Table S1. List of the RMSD and sequence identity of STAT6CF with structures of STAT proteins

PDB ID code  Protein     Phosphorylation  Protein stoichiometry  Species  Identity (%)  Z score*  RMSD of Cα (Å)*
1BF5         STAT1-DNA   Yes              Dimer                  Human    30            29.8      2.7
1BG1         STAT3b-DNA  Yes              Dimer                  Mouse    31            29.0      2.6
4E68         STAT3b-DNA  No               Dimer                  Mouse    31            28.6      2.6
1YVL         STAT1       No               Monomer                Human    27            29.9      2.4
3CWG         STAT3       No               Monomer                Mouse    30            30.7      2.3
1Y1U         STAT5a      No               Dimer                  Mouse    46            34.3      2.2
1UUR         Dd-STATa    Yes              Dimer                  Mold     31            18.7      5.1
1UUS         Dd-STATa    Yes              Dimer                  Mold     31            17.1      4.3

*Data were derived from DALI service using STAT6CF structure as reference.

Table S2. DNA sequences used for crystallization, ITC, and SPR in this study

Name    Sequence                         Uses
CS4     5′-TGATTTCctagGAAGACA-3′         ITC, SPR
CS4-22  5′-ATGGATTTCctagGAAGACAGA-3′     Crystallization
CS-21   5′-ATGGATTTCctgGAAGACAGA-3′      Crystallization
IHG     5′-CGACTTCccaaGAACAGA-3′         ITC
M67     5′-TGCATTTCccgTAAATCT-3′         ITC, SPR
T1      5′-CGCTTTCcccTAAATGG-3′          ITC

The lowercase letters for sequences indicate the spacer sequences between the consensus sequences.
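
Because the spacer is encoded by letter case, it can be recovered mechanically from the sequences in Table S2. The sketch below is illustrative only. The helper names are hypothetical, and labeling a site N3 or N4 by its spacer length is an assumption based on how those terms are used in the text.

```python
import re

def spacer_of(sequence):
    """Return the single lowercase spacer run in a sequence such as 'TGATTTCctagGAAGACA'.

    The sequence is expected without the 5'- and -3' end labels.
    """
    runs = re.findall(r"[acgt]+", sequence)
    if len(runs) != 1:
        raise ValueError(f"expected exactly one lowercase spacer, found {len(runs)}")
    return runs[0]

def site_label(sequence):
    """Label a site by spacer length, e.g. 'N4' for a four-base spacer (assumed convention)."""
    return f"N{len(spacer_of(sequence))}"

# Sequences copied from Table S2.
TABLE_S2 = {
    "CS4": "TGATTTCctagGAAGACA",
    "CS4-22": "ATGGATTTCctagGAAGACAGA",
    "CS-21": "ATGGATTTCctgGAAGACAGA",
    "IHG": "CGACTTCccaaGAACAGA",
    "M67": "TGCATTTCccgTAAATCT",
    "T1": "CGCTTTCcccTAAATGG",
}
labels = {name: site_label(seq) for name, seq in TABLE_S2.items()}
```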


Table S3. Data collection and refinement statistics

Crystal                     STAT6CF                 STAT6CF-N4 complex      STAT6CF-N3 complex

Data collection
PDB ID code                 4Y5U                    4Y5W                    5D39
X-ray source                SSRF/BL17U1             SSRF/BL19U              GM/CA-CAT/23ID-C
Detector distance (mm)      360                     620                     480
Number of images            360                     360                     720
Oscillation width (°)       0.5                     1.0                     0.5
Wavelength (Å)              0.98                    0.98                    0.98
Space group                 P212121                 P1                      P1
a, b, c (Å)                 66.15, 94.32, 179.36    68.30, 94.24, 147.78    68.40, 94.70, 145.64
α, β, γ (°)                 90.00, 90.00, 90.00     99.85, 101.73, 89.93    79.62, 78.31, 89.58
Mosaicity (°)               1.13                    0.59                    0.60
No. protein molecules/ASU   2                       4                       4
No. DNA chains/ASU          0                       4                       4
Resolution range (Å)        50.00–2.70 (2.80–2.70)  50.00–3.10 (3.21–3.10)  50.00–3.20 (3.26–3.20)
Rsym (%)                    8.7 (51.5)              11.0 (65.4)             8.2 (69.5)
Mean I/σ(I)                 17.68 (4.10)            11.2 (1.89)             16.7 (1.22)
Completeness (%)            94.9 (95.3)             98.0 (95.2)             99.0 (98.9)
Redundancy                  5.2 (5.3)               3.5 (3.4)               3.4 (3.1)

Refinement
Resolution (Å)              46.36–2.70              38.81–3.10              44.12–3.20
No. reflections             28,460                  57,478                  57,110
Rwork/Rfree (%)             18.51/25.29             24.96/27.80             21.03/23.61
No. atoms                   7,227                   17,098                  17,397
No. protein atoms           7,051                   15,057                  15,469
No. DNA atoms               0                       1,804                   1,722
No. water                   176                     237                     206
Mean B (Å²)                 21.0                    69.0                    118.0
Wilson B (Å²)               23.2                    65.58                   98.70
RMSDs
  Bond lengths (Å)          0.008                   0.008                   0.006
  Bond angles (°)           1.237                   1.064                   1.321
Ramachandran analysis
  Favored region (%)        97.19                   96.47                   96.38
  Allowed region (%)        2.34                    3.10                    3.15
  Outliers (%)              0.47                    0.43                    0.47

The numbers in parentheses represent the values for the highest-resolution shell.

Movie S1. Conformational change between apo STAT6CF, STAT6CF-N3, and N4 site DNA complex structures.

