Improvement of cryo-EM maps by density modification
Thomas C. Terwilliger¹,², Steven J. Ludtke³, Randy J. Read⁴, Paul D. Adams⁵,⁶, Pavel V. Afonine⁵
¹ Los Alamos National Laboratory, Los Alamos NM 87545 USA
² New Mexico Consortium, Los Alamos NM 87544 USA
³ Baylor College of Medicine, Houston, TX 77030 USA
⁴ Cambridge Institute for Medical Research, Cambridge, CB2 0XY, UK
⁵ Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA
⁶ Department of Bioengineering, University of California Berkeley, Berkeley, CA, USA
Abstract
A density modification procedure for improving maps produced by single-particle electron cryo-microscopy is presented. The theoretical basis of the method is identical to that of maximum-likelihood density modification, previously used to improve maps from macromolecular X-ray crystallography. Two key differences from applications in crystallography are that the errors in Fourier coefficients are largely in the phases in crystallography but in both phases and amplitudes in electron cryo-microscopy, and that half-maps with independent errors are available in electron cryo-microscopy. These differences lead to a distinct approach for combination of information from starting maps with information obtained in the density modification process. The applicability of density modification theory to electron cryo-microscopy was evaluated using half-maps for apoferritin at a resolution of 3.1 Å and a matched 1.8 Å reference map. Error estimates for the map obtained by density modification were found to closely agree with true errors as estimated by comparison with the reference map. The density modification procedure was applied to a set of 104 datasets where half-maps, a full map and a model all had been deposited. The procedure improved map-model correlation and increased the visibility of details in the maps. The procedure requires two unmasked half-maps and a sequence file or other source of information on the volume of the macromolecule that has been imaged.
bioRxiv preprint doi: https://doi.org/10.1101/845032; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Single-particle electron cryo-microscopy (cryo-EM) is rapidly becoming the dominant technique for determination of large three-dimensional structures of macromolecules and their complexes [1]. The result of a cryo-EM analysis is a three-dimensional map that reflects the electric potential of the macromolecule [2] and that has map values and an appearance closely related to those of maps obtained from X-ray crystallography [3]. In both cryo-EM and in macromolecular crystallography, the accuracy of the map is an important characteristic. In macromolecular crystallography, the amplitudes of Fourier coefficients are generally measured accurately and the phases are poorly estimated. It is common practice in that field to carry out a procedure known as density modification to reduce the errors in the phases and thereby improve the resulting map [4-7]. The source of new information in crystallographic density modification is prior knowledge about expected values in a map. For example, the probability distribution of map values may be known, or there may be knowledge about specific features in the map such as a flat solvent region. Information about the true density in part or all of the map can be used to obtain improved estimates of the phases, and these improved phases lead to an improved map everywhere, not just where the information was applied [6].
In cryo-EM a form of density modification may be applied during the process of image reconstruction. The macromolecule typically occupies only a small part of the volume of the reconstruction, and during reconstruction noise is removed from the part of the map that is outside the macromolecule [8,9]. This can improve the map in the region of the macromolecule and is related to the “solvent flattening” aspect of crystallographic density modification [10]. Local denoising [11] or filtering [12,13] procedures are often applied to improve the interpretability of cryo-EM maps. A procedure for histogram-matching and resolution filtering has also been developed [14]. Though the overall process of density modification as implemented in a crystallographic setting is thought to be inappropriate for cryo-EM [15], it has been suggested that the general concept could be adapted and applied [8,15]. Here we show that a version of density modification with the same theoretical basis as crystallographic density modification, but with key differences reflecting the differences between crystallographic and cryo-EM maps, can be used to improve cryo-EM maps.
There are several frameworks for density modification that could be applied to cryo-EM maps [4]. Here we use maximum-likelihood density modification, as it makes a clear distinction between information coming from the original data and information that comes from expectations about the features in the map [5]. The process for map improvement by maximum-likelihood density modification involves identifying how the current Fourier coefficients can be changed so as to increase the plausibility of the map (expressed as a likelihood), while retaining compatibility with the original experimental map (see Methods).
Cryo-EM maps differ in fundamental ways from crystallographic maps, and the actual process of density modification cannot be applied in the same way in the two situations. As detailed in Methods, one key difference is that in crystallography only the amplitudes of Fourier coefficients are measured directly (phases are generally estimated indirectly by comparison of amplitudes measured at different wavelengths or from slightly differing crystals), while in electron cryo-microscopy both phases and amplitudes are directly available from experiment. Another is that half-maps with relatively independent errors are available in electron cryo-microscopy [16] but not in crystallography.
We tested the applicability of density modification theory to cryo-EM by applying it to a map of apoferritin at a reported resolution of 3.1 Å (EM Data Bank entry EMD-20028 [17]). We used a matched 1.8 Å reference map (EMD-20026) to evaluate the error estimates that make up a key part of the density modification process and to test the effects of density modification on map quality.
The density modification procedure for the 3.1 Å map was to be carried out using Fourier coefficients to a resolution of 2.5 Å, so we first checked the accuracy of the 1.8 Å map up to this resolution by calculating the Fourier shell correlation (FSC) of independent half-maps [16,18]. Fig. 1A shows that the two half-maps corresponding to the full 1.8 Å map have an FSC value above 0.97 at all inverse resolutions of up to 0.4 Å⁻¹ (corresponding to a resolution of 2.5 Å). Half-dataset FSC values can be used to estimate the expected correlation of Fourier coefficients for a map to Fourier coefficients representing the true map using the formula [18],
$$\mathrm{FSC_{ref}} = \left[ \frac{2\,\mathrm{FSC}}{1 + \mathrm{FSC}} \right]^{1/2} \tag{1}$$
Here we use the notation FSCref to emphasize that this is an FSC-based resolution-dependent estimate of map similarity to a reference (ideal) map (this usage corresponds to the previous use [18] of Cref and is similar to the use of the notation CC* in crystallography [19]). According to Eq. (1), an FSC value of 0.97 corresponds to a value of FSCref = 0.99, indicating (aside from systematic errors affecting both half-maps [15]) that up to a resolution of 2.5 Å, the 1.8 Å map closely matches a perfect map of this structure.
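As an illustration of Eq. (1), the conversion from a half-map FSC to the estimated correlation with the true map can be sketched in a few lines of Python. This is a minimal sketch and not part of the software used in this work; the function names are hypothetical, and returning zero for non-positive FSC values is an assumption made only for this example.

```python
import math


def fsc_ref(fsc_half: float) -> float:
    """Estimated correlation of the full map to the true map, Eq. (1)."""
    if fsc_half <= 0.0:
        # Eq. (1) is only meaningful for positive half-map correlations;
        # returning 0 here is a choice made for this sketch.
        return 0.0
    return math.sqrt(2.0 * fsc_half / (1.0 + fsc_half))


def resolution_from_inverse(s: float) -> float:
    """Convert an inverse resolution (1/Angstrom) to a resolution (Angstrom)."""
    return 1.0 / s


# Half-map FSC of 0.97, as quoted in the text for the 1.8 Angstrom map.
print(round(fsc_ref(0.97), 2))
# Inverse resolution of 0.4 per Angstrom, as quoted in the text.
print(round(resolution_from_inverse(0.4), 2))
```

With the values quoted above, a half-map FSC of 0.97 gives an FSCref of about 0.99, and an inverse resolution of 0.4 Å⁻¹ corresponds to 2.5 Å.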
We applied our density modification procedure to the two half-maps from the 3.1 Å dataset, yielding two intermediate map-phasing half-maps, two density-modified half-maps, and a final density-modified full map. As described in Methods, each Fourier coefficient in each map-phasing half-map is obtained individually by adjusting it to maximize the likelihood (believability) of a map calculated from this coefficient plus all other (constant) Fourier coefficients in the corresponding original half-map. These map-phasing half-maps are then recombined with the original half-maps using a resolution-dependent weighting approach to yield density-modified half-maps. Finally, the density-modified half-maps are averaged to produce a single density-modified map. As a part of the density modification process, estimates are obtained of the Fourier shell correlations FSCref between the initial full map and a true map, and also estimated correlations FSCref between the density-modified map and a true map. The estimated Fourier shell correlation between the initial full map and a true map comes from applying Eq. (1) to a half-dataset FSC (such as the one shown in Fig. 1A for the 3.1 Å map). Fig. 1B illustrates these estimated resolution-dependent map accuracy values (FSCref) for the 3.1 Å map (orange dots) and shows that they are very similar to actual map accuracy (the Fourier correlation between the 3.1 Å and reference 1.8 Å maps, blue line).
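The Fourier shell correlation used throughout this comparison can be written compactly for a single resolution shell. The sketch below uses the standard definition of the FSC (the real part of the summed cross term, normalized by the amplitudes of the two sets) and assumes that the complex Fourier coefficients belonging to one shell have already been extracted into two lists. The function name and the toy values are hypothetical and are not taken from the software or data used here.

```python
import math


def shell_fsc(f1, f2):
    """Fourier shell correlation between two equal-length lists of complex
    Fourier coefficients belonging to one resolution shell."""
    if len(f1) != len(f2) or not f1:
        raise ValueError("need two non-empty lists of equal length")
    # Real part of sum(F1 * conj(F2)) over the shell.
    cross = sum((a * b.conjugate()).real for a, b in zip(f1, f2))
    power1 = sum(abs(a) ** 2 for a in f1)
    power2 = sum(abs(b) ** 2 for b in f2)
    denom = math.sqrt(power1 * power2)
    return cross / denom if denom > 0 else 0.0


# Toy shell: a common "signal" plus different small errors in each half-map.
signal = [1 + 2j, -3 + 0.5j, 0.5 - 1j, 2 - 2j]
half1 = [s + e for s, e in zip(signal, [0.1, -0.2j, 0.1j, -0.1])]
half2 = [s + e for s, e in zip(signal, [-0.1j, 0.1, -0.1, 0.2j])]
print(shell_fsc(half1, half2))
```

Applying this function shell by shell to two half-maps gives a half-dataset FSC curve, and applying it to a map and a reference map gives the kind of map-to-reference curve shown in Fig. 1B.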
For the density-modified map, values of estimated Fourier shell correlation to a true map, FSCref, are estimated from resolution-dependent error estimates. These error estimates are in turn based on correlations among the two original half-maps and the two map-phasing half-maps (see Methods). Fig. 1B displays estimates of resolution-dependent map quality FSCref for the density-modified map (purple triangles) and shows that they are very similar to the actual Fourier shell correlation values between the density-modified map and the 1.8 Å reference map (green line). Fig. 1B also shows that the values of both estimated Fourier correlation FSCref and actual Fourier correlation to the 1.8 Å reference map are consistently improved over the inverse resolution range from about 0.3 to 0.4 Å⁻¹. This means that in the resolution range between about 2.5 Å and 3.3 Å the density-modified map is more accurate than the original map.
Figure 1. Density modification of apoferritin 3.1 Å map and evaluation using apoferritin 1.8 Å map. A. Fourier shell correlation (FSC) curves for EMD-20028 (3.1 Å) and EMD-20026 (1.8 Å). B. Orange circles are estimated resolution-dependent map quality (FSCref) of original 3.1 Å map (see text). Blue line is the FSC of the original 3.1 Å map to the 1.8 Å map. Purple triangles are FSCref of density-modified 3.1 Å map, and the green line is the FSC of the density-modified 3.1 Å map to 1.8 Å map. Horizontal lines are drawn at FSC values of zero and ½. Vertical dotted lines are drawn at inverse resolutions of 1/2.78 Å⁻¹ and 1/2.97 Å⁻¹. C, D. Density modification of apoferritin 1.8 Å map, carrying out density modification without auto-sharpening the starting half-maps and using a density-modification resolution of 1.8 Å. C. Deposited map with resolution-dependence matched to that of the density-modified map using the Phenix [20] tool auto_sharpen with the external_map_sharpening method. D. Density-modified map. Contours in C and D are drawn to enclose equal volumes [21]. The X-ray structure of human apoferritin (PDB entry 3ajo [22]) is shown after docking in the deposited map and re-refinement against the density-modified map. Arrows indicate the locations of F81 (multiple conformations in 3ajo) and Y27 where the aromatic rings are clear after density modification.
In a real case, there will normally be no reference map for comparison. The analysis in Fig. 1B indicates that it is reasonable to use values of estimated Fourier correlation FSCref, obtained from the correlations between original and density-modified half-datasets, as rough estimates of Fourier correlation to the true map.
Figs. 1C and D illustrate the visual effects of density modification of a high-resolution map, that of human apoferritin at 1.8 Å. The density-modified map in D shows high-resolution details that are not present in the resolution-matched map in C, including clear density for the aromatic rings of F81 and Y27. Taken as a whole, Fig. 1 indicates that application of density modification to apoferritin maps at 3.1 Å or 1.8 Å improves them in a significant way that is in general agreement with that expected from the error model for density modification described in Methods.
We next tested the generality of maximum-likelihood density modification of cryo-EM maps by applying it to 104 sets of half-maps and full maps available from the EM Data Bank (EMDB [17]). The main focus of this test was on maps in the resolution range of 2 Å to 4.5 Å where we developed the parameters and expected the procedure to work, but we included maps at lower resolution (up to 8 Å) as well. To evaluate the effects of density modification, we calculated the Fourier shell correlation [16] of each map (before and after density modification) to the deposited atomic model from the Protein Data Bank (PDB [23]), after re-refining the model to the map to be evaluated (see Methods). The purpose of the re-refinement was to create similar biases for the analysis of original and density-modified maps and yield a comparison that was fair. Using the map-model FSC values for an original and density-modified map, we generated two metrics that reflect different aspects of relative accuracies of the two maps. The first metric was the resolution at which the map-model FSC falls to approximately ½, an indication of a resolution where there is substantial information present, and the second was the average FSC (in the same resolution range for the two maps), a measure of overall quality of each map. Such metrics of map quality have many limitations [18,24-26] and are therefore far less useful than a direct comparison with an essentially ideal map as in Fig. 1B. However, if they are fair they can at least give a general idea as to whether the method is useful.
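The two metrics described above can be sketched as follows for an FSC curve sampled in shells of inverse resolution. This is an illustrative sketch only: the use of linear interpolation to locate the crossing of ½, the unweighted mean over shells, the function names, and the example curves are all assumptions made for this example rather than details of the analysis in this work.

```python
def resolution_at_threshold(inv_res, fsc, threshold=0.5):
    """Resolution (Angstrom) where the FSC curve first drops below
    `threshold`, by linear interpolation between shells; None if it never does."""
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            s_cross = inv_res[i - 1] + frac * (inv_res[i] - inv_res[i - 1])
            return 1.0 / s_cross
    return None


def mean_fsc(inv_res, fsc, d_cut):
    """Unweighted mean FSC over shells with resolution no finer than d_cut.
    A small tolerance keeps a shell lying exactly at the cutoff."""
    s_max = 1.0 / d_cut + 1e-9
    kept = [c for s, c in zip(inv_res, fsc) if s <= s_max]
    return sum(kept) / len(kept) if kept else float("nan")


# Hypothetical map-model FSC curves on the same shells (inverse resolution, 1/A).
shells = [0.10, 0.20, 0.30, 0.40]
original = [0.95, 0.80, 0.40, 0.10]
modified = [0.96, 0.90, 0.70, 0.30]

# Same cutoff for both maps: 5/6 of a stated resolution of 3.0 Angstrom.
d_cut = 3.0 * 5.0 / 6.0
for name, curve in (("original", original), ("modified", modified)):
    print(name, resolution_at_threshold(shells, curve), mean_fsc(shells, curve, d_cut))
```

Using the same shells and the same cutoff for both maps mirrors the requirement above that the average FSC be taken over the same resolution range for the two maps.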
Fig. 2A illustrates the resolution at which the map-model FSC falls to approximately ½ for original maps (blue dots) and density-modified maps (orange circles), plotted as a function of the reported resolution of the original maps. The resolution at which the map-model FSC falls to approximately ½ is generally improved (resulting in a smaller value) by density modification over the entire range of reported resolutions, though for maps with reported resolution worse than 4.5 Å the variability in this metric is quite large.
Fig. 2B shows the mean map-model Fourier shell correlation for each original map (blue dots) and density-modified map (orange circles), again plotted as a function of reported resolution. This metric also generally improves (is increased) after density modification but there is substantial variability in the amount of improvement (average improvement with density modification was 0.07, standard deviation of 0.08). Overall, both the resolution at which the map-model FSC falls to approximately ½ and average FSC generally improve with density modification but there is considerable variability, particularly at lower resolution.
To examine the visual effects of density modification in specific cases, Fig. 2 panels C-F show two matched pairs of deposited maps and corresponding density-modified maps. To make the visual comparison as fair as possible, each deposited map was sharpened automatically to match the resolution-dependence of the density-modified map, and contours for matched maps were chosen to enclose equal volumes [21] as in Fig. 1. Figs. 2C and 2D show a loop region (residues A793-A804) that is poorly resolved in the β-galactosidase map at a resolution of 2.2 Å (EMDB entry 2984 [27]; PDB entry 5a1a; Fig. 2C and Supplementary Fig. 1A), but is clear in the density-modified map (Fig. 2D). Figs. 2E and 2F show a 5.8 Å cryo-EM map of a guanylate cyclase [28] (EMDB entry 20282, PDB entry 6pas). In the original map (Fig. 2E, resolution matched to the density-modified map) helices are essentially featureless tubes of density, while in the density-modified map (Fig. 2F) the helices have clear periodicity corresponding to the helical repeat. Sharpening of the original map does not yield a more interpretable map (Supplementary Fig. 1B). In each case the density-modified map shows substantially more detail than the original map.
Figure 2. Application of density modification to maps from the EM Data Bank. A. Estimated resolution at which map-model FSC falls to approximately ½ for original maps (blue dots) and density-modified maps (orange dots) based on map-model FSC using models refined against the maps being examined. B. Mean map-model FSC for original maps (blue dots) and density-modified maps (orange dots), calculated up to an inverse resolution corresponding to 5/6 the stated resolution of the map (the value of 5/6 the map resolution is a typical value of the resolution used in density modification). C, D. Deposited and density-modified maps for β-galactosidase (2.2 Å, EMDB 2984). The resolution-dependence of the deposited map is matched to that of the density-modified map, and contours include equal volumes. Supplementary Fig. S1A shows the same region for the deposited map for EMDB 2984. E, F. Deposited and density-modified maps of a guanylate cyclase [28] at 5.8 Å (EMDB entry 20282, PDB entry 6pas). Supplementary Fig. S1B shows the same region for a sharpened version of the deposited map.
A limitation in our current implementation of density modification for cryo-EM is the assumption of relatively uniform noise levels throughout the region outside the macromolecule, while actual maps typically have noise levels that vary with distance from the macromolecule. As described in Methods, using a sub-volume (box) containing the macromolecule and a small region around it in density modification reduces the non-uniformity in noise levels but does not eliminate it. We investigated whether a reconstruction method that produced more uniform noise would improve density modification. We processed a subset of images available for β-galactosidase (EMPIAR-10061 [27,29]) with two different procedures using EMAN2 [30], yielding maps with resolutions of about 3.9 Å. The first procedure was a standard reconstruction with default parameters including a Gaussian kernel with a resolution-dependent width, except that no final masking was applied to the half-maps. A full map with masking was also generated from these half-maps and used for comparison. The second procedure was a reconstruction with a fixed-width Gaussian kernel, expected to yield a more uniform noise distribution in real space, to test whether the non-uniformity in noise in a cryo-EM map is a limiting factor. The maps obtained from each procedure were then density modified. Each map was then evaluated based on Fourier shell correlation to the deposited 2.2 Å map for β-galactosidase (EMD 2984 [27]), superimposed on the reconstructions. The resolution at which this Fourier shell correlation falls to approximately ½ was used as a quality measure of the corresponding map.
Simple averaging of the half-maps from the standard reconstruction gave a map where the Fourier shell correlation falls to ½ at a resolution of 4.0 Å based on a comparison to the deposited 2.2 Å map. Density modification of these half-maps gave a map for which this resolution was 3.9 Å. Simple averaging of the half-maps with more uniform noise yielded a map for which the resolution where the Fourier shell correlation falls to ½ was 3.9 Å. This was improved by density modification to a value of 3.7 Å.
Figure 3. Effect of a reconstruction procedure yielding more uniform noise on the outcome of density modification. A. Standard reconstruction procedure applied to a subset of images for β-galactosidase (EMPIAR-10061). B. Density-modified version of the map in A. C. Density-modified version of reconstruction designed to improve the uniformity of noise in the map. D. Deposited 2.2 Å map of β-galactosidase, superimposed on the map in A and low-pass filtered at a resolution of 3.5 Å. Each map was automatically sharpened using model-based sharpening with the deposited model for the 2.2 Å map of β-galactosidase (5a1a) superimposed on the map in A. Contours were chosen to enclose equal volumes in all four maps [21].
A portion of the initial full map is shown in Fig. 3A. For comparison, Fig. 3D shows the deposited high-resolution (2.2 Å) map, low-pass filtered at a resolution of 3.5 Å. The density-modified maps (Fig. 3B and 3C) are both more similar to the high-resolution, low-pass filtered map than the initial full map is (Fig. 3A), with the map obtained by density modification of the half-maps with more uniform noise being the clearer of the two maps and the most similar to the high-resolution, low-pass filtered map (Fig. 3C). The observation that the reconstruction protocol with relatively uniform noise produced the clearest map indicates that such a protocol may be particularly well-suited for density modification. In general, density modification is most likely to improve maps that have not been masked and in which the macromolecule is surrounded by a solvent region that retains the original noise from the reconstruction. The density modification process can then use that noise in the solvent to identify errors in Fourier coefficients and thereby to reduce noise in the region of the macromolecule.
The density modification procedure described here is fully automatic and requires only two half-maps, an optional nominal resolution, and information about the molecular volume of the macromolecule (such as a sequence file or molecular mass). Each of the 104 datasets analyzed in Fig. 2 took from 1 to 100 CPU-minutes (average of 12 minutes) on 2.3 GHz AMD processors. All of these analyses were carried out with default parameters in “quick” mode, supplying a sequence file and the reported resolution, and carrying out one cycle of density modification. In many of these cases, improved maps are obtained with additional cycles of density modification. In addition to a final density-modified map, the procedure yields two density-modified half-maps. As noted in Methods, these maps may have some correlations introduced by the masking effects of density modification, but with this caveat they can in principle be further processed with local sharpening [31], weighted combination of half-maps and other methods for optimizing the final map.
There are numerous extensions to the methods that are described here that could improve the outcome of density modification. In particular, the analysis could include information about the macromolecule from other sources such as models built using the maps or fitted into them [32,33]. Density modification including symmetry not used in the reconstruction process could be carried out as well [34]. The procedure could allow starting half-maps that have errors that are correlated with each other or that have different expected errors, and errors that do not follow Gaussian distributions. It could be carried out using just one map or more than two “half-maps”. Maps could be density modified without boxing by introducing a location-dependent expectation for map values outside the macromolecule [4]. Errors could be estimated in regions of reciprocal space or anisotropically rather than in shells of resolution. The density modification step could also be carried out by other methods, for example solvent flipping [35]. The molecular composition could be calculated by analysis of the map, such as using local histograms of map values, allowing the identification of unexpected components. More generally, the entire analysis described here could be extended to any situation where a map of any dimensionality has errors that are at least partially independent in the Fourier domain and in which some information about expected values in the map is available. As suggested some time ago [8], a density modification procedure such as the one described here could be incorporated into the overall process of iterative cryo-EM map refinement as well.
Methods
Maps and models
The maps used to generate Fig. 1 are apoferritin maps EMD-20026 and EMD-20028 and their associated half-maps. These maps have reported resolutions of 1.75 Å and 3.08 Å, respectively, and we refer to them as the “1.8 Å” and “3.1 Å” apoferritin maps. The estimates of resolution for these maps based on comparison of masked half-maps obtained in this work are slightly different, presumably due to different masking procedures, with values of 1.93 Å and 2.97 Å, respectively. The model shown in Fig. 1 is derived from PDB entry 3ajo [22] and has been superimposed on the 1.8 Å map and refined with the Phenix tool real_space_refine [36] against the density-modified 1.8 Å map.
The 104 sets of data used in Fig. 2 were chosen from the EMDB using data at resolutions from 2 Å to 8 Å that had associated half-datasets and matching models in the PDB.
Calculations of map-model Fourier shell correlations were done with a soft mask around the atoms in the model. The soft mask was calculated as a mask around each atom with a radius given by the atomic radius of the atom plus twice the resolution of the map, followed by smoothing of the map with a Gaussian smoothing function with a falloff to 1/e of twice the resolution of the corresponding map. Analyses in Fig. 2 were carried out using the Phenix [20] tool resolve_cryo_em, providing two half-maps, the reported resolution of the dataset, and the sequence of the macromolecule obtained from the deposited corresponding model. All 104 analyses were done using the same version (3689) and the same parameters except for resolution and sequence.
The data for Figures 1-3 are available as an Excel worksheet in Supplementary Data I, and the (sharpened) original and density-modified maps shown in Figs. 1-2 along with Chimera scripts to display them are available online at: http://phenix-online.org/phenix_data/terwilliger/denmod_2020/.
Procedure for evaluation of map quality
We used an automated procedure to evaluate map accuracy and to choose matching map contours for display so that map comparisons would be as fair as possible. For evaluation of map accuracy we calculated Fourier shell correlations between a map and an atomic model refined against that map. The rationale for this procedure is that the atomic models available from the PDB are normally already refined against the deposited map. This necessarily biases the map-model FSC calculation. To make a comparison with a new map, the model is first re-refined against the original map. Then the re-refined model is refined against the new map before FSC calculation. This is intended to lead to similar biases for models refined against original and density-modified maps, leading to a relatively fair comparison between maps.
Our largely automated procedure for evaluation and display of one map was then (1) refinement of the corresponding model from the PDB using that map, (2) boxing the map with a rectangular box around the model with soft edges, (3) calculation of map-model FSC, (4) sharpening the map based on the map-model FSC [18,37], (5) calculation of the resolution at which the map-model FSC falls to approximately ½, and (6) calculation of average map-model FSC [38] up to a resolution of 5/6 the stated resolution of the map (i.e., 0.83 d_min). Then to compare a density-modified and original map visually, the maps were visually examined and a region of the map and contour level for the density-modified map were chosen where differences from the original map were clear. The contour level for the original map was then chosen automatically to yield the same enclosed volumes in the two maps [21]. This contour level for the original map was always close to that obtained by simply adjusting it to make about half the surface the color of the original map and half the color of the density-modified map when the two maps are displayed at the same time in Chimera [39]. Finally, keeping the same contour levels, the maps in Fig. 3 were masked 3 Å around the atoms in the region to be displayed to make it easier to see the region of interest. For the maps in Figs. 1 and 2, an additional step was added in which the resolution-dependence of the original map was matched to that of the density-modified map (using the Phenix tool auto_sharpen with the external_map_sharpening method) to make the comparison of maps as fair as possible.
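The idea of choosing a contour level for one map so that it encloses the same volume as a chosen contour in another map can be illustrated with a short sketch. It assumes that both maps are sampled on grids with the same number of points, so that equal enclosed volume is equivalent to an equal fraction of grid points above the contour level. The function names and the toy map values are hypothetical, and this is not the procedure of ref. 21 itself.

```python
def volume_fraction_above(values, level):
    """Fraction of grid points whose map value exceeds `level`."""
    return sum(1 for v in values if v > level) / len(values)


def matching_contour_level(values, target_fraction):
    """Contour level for which about `target_fraction` of grid points lie above."""
    ordered = sorted(values, reverse=True)
    n_inside = round(target_fraction * len(ordered))
    n_inside = max(1, min(n_inside, len(ordered) - 1))
    # Place the level midway between the last enclosed and first excluded value.
    return 0.5 * (ordered[n_inside - 1] + ordered[n_inside])


# Toy "maps" flattened to lists of grid values (hypothetical numbers).
modified = [0.0, 0.1, 0.2, 1.5, 2.0, 3.0, 0.05, 0.3]
original = [0.2, 0.4, 0.1, 0.9, 1.1, 0.8, 0.3, 0.5]

# Contour chosen by eye for the density-modified map.
target = volume_fraction_above(modified, 1.0)
# Matching contour for the original map, enclosing the same fraction of points.
level = matching_contour_level(original, target)
print(target, level, volume_fraction_above(original, level))
```

In a real map many grid points can share nearly identical values, so the enclosed volumes would match only approximately.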
Errors in Fourier coefficients representing cryo-EM maps
We assume that the distribution of errors in Fourier coefficients representing cryo-EM maps can be represented by a two-dimensional Gaussian in the complex plane [8]. This assumption is evaluated in Fig. 4, which compares Fourier coefficients for apoferritin from the 3.1 Å and 1.8 Å maps analyzed in Fig. 1. Fourier coefficients for the shell of resolution from 3.0 Å to 3.1 Å were calculated for each map after boxing the maps around the fitted model used in Fig. 1. The Fourier coefficients for the 1.8 Å map were treated as perfect values. These values were multiplied by the correlation coefficient between the two sets of Fourier coefficients and subtracted from the Fourier coefficients from the 3.1 Å map to yield estimates of the errors in the 3.1 Å map. Fig. 4 shows histograms of these errors along directions parallel and perpendicular to the Fourier coefficients from the 3.1 Å map. In each case a Gaussian distribution is fitted to these histograms and is shown as well. It can be seen that the errors are not quite Gaussian and are not quite the same in the two directions, but that a Gaussian is a good first approximation. In this example, the normalized errors perpendicular to the Fourier coefficients from the 3.1 Å map have a mean of zero and a standard deviation of 0.63, while those parallel have a mean of 0.1 and a standard deviation of 0.70.
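The decomposition of errors into components parallel and perpendicular to the observed Fourier coefficients can be sketched as follows. The sketch follows the description above (reference coefficients scaled by the correlation coefficient and subtracted from the observed coefficients), but it assumes that the two sets of coefficients are already on a common scale and it does not attempt to reproduce the normalization used for Fig. 4, which is not specified here. Function names and example values are hypothetical.

```python
import math


def correlation(f_a, f_b):
    """Correlation coefficient between two lists of complex coefficients."""
    cross = sum((a * b.conjugate()).real for a, b in zip(f_a, f_b))
    norm = math.sqrt(sum(abs(a) ** 2 for a in f_a) * sum(abs(b) ** 2 for b in f_b))
    return cross / norm


def error_components(f_obs, f_ref):
    """Errors in f_obs relative to cc * f_ref, split into components
    parallel and perpendicular to each observed coefficient."""
    cc = correlation(f_obs, f_ref)
    parallel, perpendicular = [], []
    for obs, ref in zip(f_obs, f_ref):
        if abs(obs) == 0:
            continue  # direction undefined for a zero coefficient
        err = obs - cc * ref
        unit = obs / abs(obs)            # unit vector along the observed coefficient
        rotated = err * unit.conjugate()  # rotate so that `obs` lies on the real axis
        parallel.append(rotated.real)
        perpendicular.append(rotated.imag)
    return parallel, perpendicular


# Toy example: two reference coefficients and two "observed" ones with errors.
f_ref = [1 + 0j, 1 + 0j]
f_obs = [1 + 1j, 1 - 1j]
par, perp = error_components(f_obs, f_ref)
print(par, perp)
```

Histograms of the two returned lists, accumulated over all coefficients in a shell, correspond in spirit to the two distributions compared with Gaussians in this section.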
Figure 4. Analysis of distribution of errors in Fourier coefficients for apoferritin 3.1 Å map. A. Estimated errors in Fourier coefficients parallel to coefficients for 3.1 Å map. B. Errors perpendicular to coefficients for 3.1 Å map. Solid lines in each case correspond to a Gaussian fitted to the values shown (see text).
Procedure for density modification of cryo-EM maps
Density modification of a cryo-EM map is based on the maximum-likelihood formalism that we developed previously for crystallographic density modification [5]. There are two important differences, however. One is that the starting probability distributions for Fourier coefficients (those available before density modification) are very different in the two cases. The other is that typically in a cryo-EM experiment, two independent half-maps are available (two maps with errors that are uncorrelated).
Maximum-likelihood density modification has two overall steps. In the first step a probability distribution (called the "map-phasing" probability distribution) is obtained for each Fourier coefficient. This map-phasing distribution has errors which, in an optimal situation, are independent of the errors in the corresponding Fourier coefficient in the starting map. In the second step the map-phasing probability distribution for each Fourier coefficient is recombined with the starting information about that Fourier coefficient to yield a new "density modified" estimate of that Fourier coefficient.
Thefirststageisthesameincrystallographicandcryo-EMcases.Itstartswithamap
representedbyFouriercoefficients.Itrequiresafunctionthatdescribeshowthelikelihood
(believability)ofthatmapwouldchangeifthevaluesinthemapchange5.Thislikelihood
functionmightforexamplesaythatifthevaluesinthemapintheregionoutsidethe
macromoleculeallmovetowardsacommonvalue,thebelievabilityincreases.Itmightalsosay
thatifthedistributionofvaluesintheregionofthemacromoleculebecomesclosertoan
idealizeddistribution,thatmap'sbelievabilityimproves.Aspecificexampleofalikelihood
functionthathasboththesepropertieshasbeendescribed5(Eq.17inthisreference).Given
suchamapandlikelihoodfunction,itispossibletocalculatea"map-phasing"probability
distributionanditsmaximumorweightedmeanforeachFouriercoefficient40.Thisyieldsa
"map-phasing"map.
Themap-phasingmaphastheimportantpropertythatthenewestimateofaFourier
coefficientdoesnotdependatallonthevalueofthatFouriercoefficientinthestartingmap40.
Thisrathernon-intuitivesituationispossiblebecausethemap-phasingprobabilitydistribution
foraparticularFouriercoefficientcomesonlyfromalltheotherFouriercoefficientsandthe
characteristicsofthemapasreflectedinthelikelihoodfunction.Inothermethodsofdensity
modificationsuchassolventflippingasimilareffectisobtainedbyspecificallyremovingthe
informationcorrespondingtotheoriginalmap4,35.Itshouldbenotedhoweverthatiftheother
FouriercoefficientshaveinformationabouttheerrorsintheFouriercoefficientinquestion(for
example through previous density modification or by masking of the map around the
molecule), that information can indeed affect the map-phasing estimate of the Fourier
coefficient of interest.
The key differences in implementation between crystallographic and cryo-EM cases arise in the
second step. First, the information about the Fourier coefficient that is available before density
modification is very different in the two cases. In the crystallographic case, the amplitude of
each Fourier coefficient is typically known quite accurately (often in the range of 5-30%
uncertainty) and there may be some information about the phase (this might range from no
information to phase uncertainties in the range of 45 degrees). The resulting distribution of
likely values for a particular Fourier coefficient might essentially be a ring of relatively constant
amplitude or a "boomerang" with partially-defined phase and relatively constant amplitude. In
contrast, in the cryo-EM case, phase and amplitude are both uncertain, and the distribution of
likely values before density modification can be represented by a two-dimensional Gaussian in
the complex plane8.
This qualitative and very substantial difference in the form of the distribution of likely values for
a Fourier coefficient prior to density modification means that when recombining information
between the starting map and map-phasing distributions, different approaches are best suited
to the two situations. For crystallographic applications, recombination essentially amounts to
testing possible values of the phase of a Fourier coefficient at constant amplitude for
consistency with prior and map-phasing distributions. In contrast, for cryo-EM applications as
described here, recombination consists of calculating the product of two 2-dimensional
distributions and finding the maximum of the resulting distribution. If the distributions are
Gaussian, this amounts to a simple weighted average of the prior and map-phasing Fourier
coefficients.
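For Gaussian distributions the recombination has a closed form, which can be sketched in a few lines. This is a minimal illustration, assuming each estimate of a Fourier coefficient is an isotropic two-dimensional Gaussian in the complex plane described by a complex mean and a single variance. The function name and the example variances are hypothetical and are not taken from the Phenix implementation.

```python
def combine_gaussian_estimates(f_prior, var_prior, f_map_phasing, var_map_phasing):
    """Multiply two isotropic 2-D Gaussians in the complex plane.

    The product of two Gaussians is itself Gaussian; its maximum (and mean)
    is the inverse-variance weighted average of the two input means.
    """
    w_prior = 1.0 / var_prior
    w_map_phasing = 1.0 / var_map_phasing
    total = w_prior + w_map_phasing
    f_combined = (w_prior * f_prior + w_map_phasing * f_map_phasing) / total
    var_combined = 1.0 / total
    return f_combined, var_combined


# Example with made-up values: the more precise estimate (smaller variance)
# pulls the combined coefficient towards itself.
f_new, var_new = combine_gaussian_estimates(1.0 + 1.0j, 1.0, 3.0 + 3.0j, 0.25)
```

The combined variance is always smaller than either input variance, which is the sense in which independent map-phasing information can improve on the starting map.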
The second key difference is that cryo-EM analyses are typically carried out in a way that yields
two half-maps with largely independent errors. This means that overall mean-square values of
errors can be estimated in a straightforward way in bins or shells of resolution by comparison of
Fourier coefficients from the two half-maps and from the two map-phasing half-maps (see
below).
The overall procedure for density modification of two half-maps is then: (1) average the two
half-maps and sharpen/blur the resulting map based on resolution-dependent half-map
correlation18,37 to obtain an optimized starting full map, (2) calculate target histograms (see
below) for the macromolecule and non-macromolecule regions of this full map, (3) use the
histograms and Fourier coefficients representing each half-map in the first step of density
modification to obtain a map-phasing probability distribution for each Fourier coefficient for
that half-map, and (4) calculate a weighted average of values of each Fourier coefficient
obtained from the two starting half-maps and the map-phasing maps obtained from them to
yield a "density-modified" map along with corresponding weighted half-maps and
resolution-dependent estimates of the accuracies and resolution of each map. The optimal weighting is
discussed below in terms of a simple error model. Finally (5) the entire process can be
repeated, using the density modified half-maps from one cycle in step (3) of the next cycle. It is
also possible (but not done by default in our current procedure) to use the full density modified
map from one cycle to obtain histograms for the next cycle. The process is concluded after a
specified number of cycles. For the examples in this work only one cycle was carried out
(additional cycles did improve density modification for many of the 104 cases in Fig. 2,
however).
One variation on the overall procedure can sometimes improve density modification. This is to
carry out density modification starting directly with the deposited maps instead of
auto-sharpening them37 in step (1) above. This is controlled by the keyword
density_modify_unsharpened_maps=True in the Phenix tool resolve_cryo_em.
Target histograms of density distributions
A key element of the maximum-likelihood density modification procedure is the use of target
histograms representing the expected distribution of map values for the “true” (desired) map in
the region containing the macromolecule and in the region outside it5,14. These histograms can
be obtained in any of several ways. One is to use a map or maps corresponding to high-quality
structures that are already determined. Another is to use histograms used previously for
crystallographic analyses. The default method used here is to use histograms based on crystal
structures, with an option to use histograms derived from the full map obtained by averaging
the two current half-maps. These averaged histograms have the advantage that they are
automatically at the correct resolution and represent macromolecule and surrounding region in
just the same way as the half-map to be analyzed, but in our tests the histograms from
crystallographic analyses resulted in the largest improvements so these are used by default.
Error model for analysis of FSC curves and its use in optimizing weights and estimating
correlations to true maps.
We use a simple error model with the following assumptions:
A. Starting half-maps have errors that are uncorrelated between half-maps. This
assumption is based on the construction of half-maps, in which they are derived from
independent subsets of the data. There are however some aspects of half-map
construction that could lead to correlation of errors, including the use of the same
reference in some stages of analysis and masking of the maps18.
B. Density-modified half-maps have errors that are uncorrelated between half-maps and
they may also have errors that are correlated with the corresponding starting half-maps
and errors that are correlated with each other. Errors correlated with the
corresponding starting half-maps could come from the density modification procedure
not yielding fully independent information. Errors correlated with the other
density-modified half-map could come from masking effects introduced from solvent flattening
procedures in density modification.
C. Mean square values of errors are resolution-dependent. This assumption simplifies the
analysis of errors by describing the errors in terms of resolution and allowing them to be
estimated in shells of resolution.
D. Mean square errors for each member of a pair of half-maps are the same. This
assumption comes from the construction of half-maps, where they typically come from
equal numbers of images.
E. Errors have two-dimensional Gaussian distributions with mean expected values of zero.
This simplifies the analysis.
F. Fourier coefficients representing individual half-maps are equal to the true Fourier
coefficients plus uncorrelated errors unique to that map and correlated errors shared
among two or more maps. This yields a simple form for the Fourier coefficients that is
amenable to estimation of errors from correlations of Fourier coefficients.
These assumptions yield a simple error model where Fourier coefficients for the original and
density-modified half-maps can be represented as:
Original half-maps:
$F_{1a} = F + g_a + s_a$    (2a)
$F_{1b} = F + g_b + s_b$    (2b)
Density-modified half-maps:
$F_{2a} = F + g_a + a + b_a$    (3a)
$F_{2b} = F + g_b + a + b_b$    (3b)
In this description, $F$ represents the true value of one Fourier coefficient. There are estimates of
$F$ that come from each half-map and each density-modified half-map. $F_{1a}$ and $F_{1b}$ represent
Fourier coefficients for half-maps a and b, and $F_{2a}$ and $F_{2b}$ represent Fourier coefficients for
density-modified half-maps a and b. The terms $s_a$ and $s_b$ represent uncorrelated errors in
half-maps $F_{1a}$ and $F_{1b}$. The mean square values of each are $S$: $\langle s_a^2 \rangle = \langle s_b^2 \rangle = S$, and the mean values
of all errors in this analysis are zero. The term $g_a$ represents errors that are correlated between
half-map $F_{1a}$ and its corresponding density-modified half-map $F_{2a}$ (present in half-map $F_{1a}$ and
not corrected by density modification), and the term $g_b$ represents errors correlated between
$F_{1b}$ and $F_{2b}$. The mean square values of $g_a$ and $g_b$ are $\langle g_a^2 \rangle = \langle g_b^2 \rangle = C$. The terms $b_a$ and $b_b$
represent uncorrelated errors in half-maps $F_{2a}$ and $F_{2b}$, where $\langle b_a^2 \rangle = \langle b_b^2 \rangle = B$. The term $a$
represents errors correlated between half-maps $F_{2a}$ and $F_{2b}$, where $\langle a^2 \rangle = A$. As it is assumed
that errors are resolution-dependent, the estimates of mean square errors ($A$, $B$, $C$, $S$) are in
turn assumed to be resolution-dependent and in our procedure they are estimated in shells of
resolution.
For simplicity in notation, we assume below that the Fourier coefficients for each half-map are
normalized in such a way that the mean square value of $F$ is unity. As the following calculations
only involve correlation coefficients, the overall scale on Fourier coefficients has no effect on
the values obtained, so this simplification does not affect the outcome of the analysis.
Using this error model and normalization, the expected values of correlations between
half-maps can be calculated. These are as follows, where the brackets represent expected values,
and the notation CC(x,y) represents the correlation coefficient relating values of x and y.
The expected correlation between half-maps a and b is given by,
$\langle CC(F_{1a},F_{1b}) \rangle = \langle (F+g_a+s_a)(F+g_b+s_b) \rangle / [\langle (F+g_a+s_a)^2 \rangle \langle (F+g_b+s_b)^2 \rangle]^{1/2}$    (4)
which reduces to,
$\langle CC(F_{1a},F_{1b}) \rangle = 1/(1+C+S)$    (5a)
Similarly, correlation between density-modified half-maps a and b is given by:
$\langle CC(F_{2a},F_{2b}) \rangle = (1+A)/(1+A+B+C)$    (5b)
Cross-correlation between half-map a and density-modified half-map a (and also between
corresponding maps b) is given by:
$\langle CC(F_{1a},F_{2a}) \rangle = (1+C)/[(1+C+S)(1+C+A+B)]^{1/2}$    (5c)
Cross-correlation between half-map a and density-modified half-map b (and also between
half-map b and density-modified half-map a) is given by:
$\langle CC(F_{1a},F_{2b}) \rangle = 1/[(1+C+S)(1+C+A+B)]^{1/2}$    (5d)
As there are four relationships and four parameters describing errors, the relationships can be
used to estimate the values of the errors $A$, $B$, $C$, and $S$, leading to the formulas:
$A = [CC(F_{1a},F_{1b})\,CC(F_{2a},F_{2b}) / CC(F_{1a},F_{2b})^2] - 1$    (6a)
$B = [CC(F_{1a},F_{1b}) - CC(F_{1a},F_{1b})\,CC(F_{2a},F_{2b}) - CC(F_{1a},F_{2a})\,CC(F_{1a},F_{2b}) + CC(F_{1a},F_{2b})^2] / CC(F_{1a},F_{2b})^2$    (6b)
$C = [CC(F_{1a},F_{2a}) / CC(F_{1a},F_{2b})] - 1$    (6c)
$S = [1/CC(F_{1a},F_{1b})] - [CC(F_{1a},F_{2a}) / CC(F_{1a},F_{2b})]$    (6d)
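The relationships in Eqs. (5a)-(5d) and their inversion in Eqs. (6a)-(6d) can be checked numerically for a single resolution shell. The sketch below is illustrative only: the function names and the error levels are made up, and the mean square value of F is taken as unity as in the text. The last expression, Eq. (6d), is (1 + C + S) minus (1 + C) and therefore returns the uncorrelated error S of the original half-maps.

```python
import math


def expected_correlations(A, B, C, S):
    """Expected half-map correlations from Eqs. (5a)-(5d), with <F^2> = 1."""
    d1 = 1.0 + C + S        # mean square of an original half-map coefficient
    d2 = 1.0 + A + B + C    # mean square of a density-modified half-map coefficient
    cc_1a_1b = 1.0 / d1
    cc_2a_2b = (1.0 + A) / d2
    cc_1a_2a = (1.0 + C) / math.sqrt(d1 * d2)
    cc_1a_2b = 1.0 / math.sqrt(d1 * d2)
    return cc_1a_1b, cc_2a_2b, cc_1a_2a, cc_1a_2b


def estimate_errors(cc_1a_1b, cc_2a_2b, cc_1a_2a, cc_1a_2b):
    """Invert the four correlations to A, B, C, S using Eqs. (6a)-(6d)."""
    x2 = cc_1a_2b ** 2
    A = cc_1a_1b * cc_2a_2b / x2 - 1.0
    B = (cc_1a_1b - cc_1a_1b * cc_2a_2b - cc_1a_2a * cc_1a_2b + x2) / x2
    C = cc_1a_2a / cc_1a_2b - 1.0
    S = 1.0 / cc_1a_1b - cc_1a_2a / cc_1a_2b
    return A, B, C, S


# Round trip for one shell with made-up error levels (A, B, C, S).
true_errors = (0.1, 0.3, 0.2, 0.5)
recovered = estimate_errors(*expected_correlations(*true_errors))
```

With observed rather than expected correlations the recovered values are noisy, which is why smoothing and additional assumptions are used in the high-resolution shells.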
As noted in more detail below, for shells at high resolution the uncertainties in the correlations
such as $CC(F_{1a},F_{1b})$ can be large compared to the correlations themselves. In these situations the
values of correlations are smoothed and additional assumptions are made about the
relationships among the error estimates in order to reduce the number of parameters that need
to be obtained from the data at that resolution.
After estimation of errors, all estimates of $F$ can be averaged with resolution-dependent
weighting factors $w$. Based on the assumption of equal mean square errors in members of a
pair of half-maps, the weights on each half-map in a pair are always equal. The recombined
(density-modified) estimate ($G$) of $F$ is then given by,
$G = w F_1 + (1-w) F_2$    (7a)
where $w$ is the weight on the original averaged half-maps ($F_1$) and $(1-w)$ is the weight on
the averaged density-modified half-maps ($F_2$), and the averaged maps are given by:
$F_1 = \tfrac{1}{2}(F_{1a} + F_{1b})$    (7b)
$F_2 = \tfrac{1}{2}(F_{2a} + F_{2b})$    (7c)
The weight $w$ that maximizes the expected correlation of the estimated Fourier coefficient, $G$,
with the true one, $F$, is:
$w = (2A+B)/(2A+B+S)$    (8)
and the estimated correlation of $G$ with $F$, the correlation of the final estimate of the Fourier
coefficient with the true one, represented18 by $FSC_{ref}$, is:
$FSC_{ref} = [2(A + \tfrac{1}{2}B + \tfrac{1}{2}S) / [A(C+S+2) + B(\tfrac{1}{2}C + \tfrac{1}{2}S + 1) + (\tfrac{1}{2}C + 1)S]]^{1/2}$    (9)
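Eqs. (8) and (9) follow from the error model by writing the mean square error of G as a function of w and minimizing it. The sketch below is a numerical check under that reading, with the mean square value of F taken as unity. The function names and the example error levels are hypothetical.

```python
import math


def optimal_weight(A, B, S):
    """Weight on the averaged original half-maps, Eq. (8)."""
    return (2.0 * A + B) / (2.0 * A + B + S)


def error_variance(w, A, B, C, S):
    """Mean square error of G = w*F1 + (1-w)*F2 under the error model.

    F1 carries errors C/2 + S/2 and F2 carries C/2 + A + B/2. The C/2 term
    is shared by F1 and F2, so it is not reduced by the weighting.
    """
    return 0.5 * C + w ** 2 * 0.5 * S + (1.0 - w) ** 2 * (A + 0.5 * B)


def fsc_ref(A, B, C, S):
    """Expected correlation of G with the true coefficient, Eq. (9)."""
    num = 2.0 * (A + 0.5 * B + 0.5 * S)
    den = A * (C + S + 2.0) + B * (0.5 * C + 0.5 * S + 1.0) + (0.5 * C + 1.0) * S
    return math.sqrt(num / den)


A, B, C, S = 0.1, 0.3, 0.2, 0.5
w = optimal_weight(A, B, S)
# With <F^2> = 1, CC(G, F) = 1 / sqrt(1 + error variance).
cc_direct = 1.0 / math.sqrt(1.0 + error_variance(w, A, B, C, S))
```

For these values the direct calculation and Eq. (9) agree, and moving w away from the value given by Eq. (8) increases the error variance.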
As mentioned above, assumptions are made to allow estimation of correlated and uncorrelated
errors from FSC plots for resolution shells where uncertainties in correlation estimates are
large. These additional assumptions are:
A. In the highest resolution shell considered there is no remaining signal and all
correlations of Fourier coefficients are due to correlated errors. The limiting resolutions
of FSC plots in this analysis are set in such a way that there is little signal at those
resolutions and correlations are largely due to noise and correlated errors. FSC plots are
calculated in shells of resolution ($d$). The highest resolution in error analyses considered
($d_{min}$) is the resolution used for density modification multiplied by a fixed ratio (typically
5/6). As the resolution used for density modification is normally about 0.5-1 Å finer than
the nominal resolution of the overall map, for a 4 Å map this high-resolution $d_{min}$ would
typically be in the range of 2.5 Å to 3 Å.
B. In high-resolution shells where there is substantial uncertainty in the estimates of errors
(typically where half-map correlations are less than about 0.05), ratios of correlated to
uncorrelated errors are assumed to be the same as those estimated in lower-resolution
shells.
C. For shells of resolution ($d$) where the values of FSC are below a fixed minimum FSC
(typically FSC_min=0.2), smoothed FSC values are calculated by fitting the observed values
to a simple exponential function with one free parameter. The function used is
$FSC = (FSC_{d_1} - FSC_{d_{min}}) \exp(-H/d^2) + FSC_{d_{min}}$. The free parameter is $H$ (the fall-off with
$1/d^2$). $FSC_{d_1}$ is the value of FSC at resolution $d_1$, the highest resolution where FSC is
higher than a fixed minimum value (typically 0.2). $FSC_{d_{min}}$ is the estimated FSC at the
highest resolution in the analysis. As noted above it is assumed that any non-zero FSC
found at this resolution is due to correlated errors in the analysis.
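The one-parameter fit in assumption C can be sketched as follows. This is an illustration only: it takes the function exactly as printed above and estimates H by a log-linear least-squares fit, which is an assumption made here for simplicity and not necessarily the fitting method used in resolve_cryo_em. The function names and the synthetic shell values are hypothetical.

```python
import math


def smoothed_fsc(d, h, fsc_d1, fsc_d_min):
    """Model FSC at resolution d for fall-off parameter h (function as printed)."""
    return (fsc_d1 - fsc_d_min) * math.exp(-h / d ** 2) + fsc_d_min


def fit_falloff(d_values, fsc_values, fsc_d1, fsc_d_min):
    """Least-squares estimate of H.

    Taking logs gives ln[(FSC - fsc_d_min) / (fsc_d1 - fsc_d_min)] = -H / d**2,
    a straight line through the origin in x = 1/d**2 with slope -H.
    Shells with FSC <= fsc_d_min have no usable logarithm and are skipped.
    """
    sum_xy = 0.0
    sum_xx = 0.0
    for d, fsc in zip(d_values, fsc_values):
        ratio = (fsc - fsc_d_min) / (fsc_d1 - fsc_d_min)
        if ratio <= 0.0:
            continue
        x = 1.0 / d ** 2
        sum_xy += x * math.log(ratio)
        sum_xx += x * x
    return -sum_xy / sum_xx


# Synthetic shells generated from the model itself with H = 5.0.
shells = [3.0, 2.8, 2.6, 2.4, 2.2]
observed = [smoothed_fsc(d, 5.0, 0.2, 0.02) for d in shells]
h_fit = fit_falloff(shells, observed, 0.2, 0.02)
```

With real data the observed FSC values scatter around the curve, so the fitted curve rather than the raw values is what would be passed on to the error estimates.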
Real-space weighting and weighting of individual Fourier coefficients in calculation of the final map
An option available at the end of a cycle of density modification is to apply a local weighting
scheme to the final combination of original and density-modified maps. The idea is to identify
local accuracy in the original map from local similarity between original half-maps, and also
local accuracy in the density modified maps from local similarity between density-modified
half-maps. The procedure for one pair of half-maps is to subtract the maps, square the resulting
map, and smooth the squared map with a smoothing radius typically given by twice the
resolution of the map to give a local variance for those half-maps. Then a local weight for each
set of half-maps is obtained as the inverse of the local variance of those half-maps. These local
weights are then scaled to yield an average local weight of unity and then are applied to the
individual half-maps before they are averaged. A second option for recombination of maps is
to weight individual Fourier coefficients based on the estimated variance for these coefficients.
The variance for an individual coefficient is estimated from the four Fourier coefficients
representing the four half-maps available at the end of the procedure (the two original
half-maps and the two density-modified half-maps). These two procedures are not applied by
default but can improve maps in some cases.
Overall spectral scaling
The procedure at this point yields weights ($w$, $1-w$) for the two pairs of half-maps that are
combined to yield a density-modified map, and an estimate of the quality of the map
coefficients in this resolution shell ($FSC_{ref}$). This calculation is carried out in shells of resolution
(typically 100 shells) and allows calculation of Fourier coefficients in each shell of resolution. A
final resolution-dependent weighting (spectral scaling) is then applied to the Fourier
coefficients. There are several options for this final scaling.
One option (the default, controlled by the keyword final_scale_with_fsc_ref=True) begins by
applying a scale factor ($A_j$) for each shell $j$ to yield a constant value of average amplitudes of
Fourier coefficients in each shell (essentially creating an E-map in crystallographic terminology;
note that although scale factors are calculated in shells of resolution, they are applied as a
smooth function of inverse resolution). We note that this procedure has a significant
disadvantage in that the Fourier coefficients in the lowest-resolution shells typically have very
high variation in magnitude compared to those in higher-resolution shells, so that no single
scale factor really can be suitable. After this normalization step, Fourier coefficients are
multiplied by the value of $FSC_{ref}$ in that shell, corresponding to the approach often used to
sharpen a map based on its corresponding half-maps18.
A second option for final scaling is similar to the first, except that in the first step the scale
factor applied is the square root of the scale factor ($A_j$) described above. This option is an
attempt to reduce the effect of scaling on the low-resolution information and can improve the
low-resolution map-model correlation found at the conclusion of density modification in some
cases. This procedure is controlled by the keyword geometric_scale=True and is the default if
final_scale_with_fsc_ref=False.
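The first two scaling options can be illustrated with per-shell multipliers. This is a simplified sketch: it assumes the average amplitude and FSCref are already known for each shell, it takes the scale factor A_j to be the reciprocal of the shell's average amplitude, and it applies the factors shell by shell rather than as a smooth function of inverse resolution. The function name and example numbers are hypothetical; only the geometric_scale keyword name comes from the text.

```python
import math


def shell_scale_factors(mean_amplitudes, fsc_ref_values, geometric_scale=False):
    """Per-shell multipliers for the final map coefficients.

    mean_amplitudes: average Fourier amplitude in each resolution shell.
    fsc_ref_values:  estimated FSCref in each shell.
    geometric_scale: if True, use the square root of the normalizing factor A_j.
    """
    factors = []
    for amplitude, fsc in zip(mean_amplitudes, fsc_ref_values):
        a_j = 1.0 / amplitude        # brings the shell's average amplitude to 1
        if geometric_scale:
            a_j = math.sqrt(a_j)     # milder normalization (second option)
        factors.append(a_j * fsc)    # then down-weight by FSCref
    return factors


# Two shells: a strong low-resolution shell and a weaker high-resolution shell.
default_factors = shell_scale_factors([4.0, 1.0], [1.0, 0.5])
geometric_factors = shell_scale_factors([4.0, 1.0], [1.0, 0.5], geometric_scale=True)
```

In this example the square-root option scales the strong low-resolution shell down by a factor of 2 instead of 4, which is the sense in which it reduces the effect of scaling on low-resolution information.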
A third option, one that can be carried out after either of the first two, is to apply a
resolution-dependent scale factor to all Fourier coefficients that corresponds to the
resolution-dependence of a calculated “typical” protein structure (we used the 2.2 Å map EMD-2984 of
β-galactosidase to calculate this resolution dependence). This is controlled by the keyword
spectral_scaling=True and is not carried out by default.
Boxing of cryo-EM maps
In our procedure, a rectangular solid portion of a cryo-EM map that contains the
macromolecule of interest is cut out from the map and is used in the analysis. This “boxing” of
the map is carried out with a “soft” (Gaussian) mask with a smoothing radius typically equal to
the resolution of the map to reduce the introduction of correlations in Fourier coefficients
between different maps boxed in the same way18. The edges of the box are chosen using
bounds in each direction identified using a low-resolution (typically 20 Å) mask calculated from
the full map with a volume based on the expected molecular volume of the macromolecule.
Typically, a buffer of 5 Å is added to the bounds in each direction to yield a box that has
dimensions 10 Å bigger than the size in each direction of the macromolecule.
There are two important effects of boxing. One is to reduce the variation of noise in the map in
the region outside the macromolecule. In a typical cryo-EM map there is substantial noise
(fluctuation in map values not representing the macromolecule) near the macromolecule, and
progressively less further from the macromolecule (the variation in noise levels may also be
more complicated). This variation in noise levels is largely due to the use of procedures that
smooth Fourier coefficients in reciprocal space with the effect of masking around the
macromolecule30. In our procedure it is assumed that the distribution of map values in the
region outside the macromolecule can be represented by a simple histogram. As there is a
distance-dependent variation in the level of noise in unboxed cryo-EM maps, our procedure can
be made more applicable by boxing the maps.
The second effect of boxing the map is to reduce the correlation of Fourier coefficients in the
map. If a small object is placed in a large box and Fourier coefficients are calculated
representing the object in the box, coefficients with similar indices (neighboring Fourier
coefficients) will be highly correlated41,42. The significance of correlations between Fourier
coefficients is that errors may be correlated as well, resulting in map-phasing Fourier
coefficients that are not fully independent from the original Fourier coefficients. Boxing reduces
the empty volume of the map and reduces this correlation.
Effect of masking in density modification due to solvent flattening
An indirect effect of density modification is masking of the map. Density modification includes
a step in which noise in the solvent region of the map is reduced, and this process can lead to a
map that is partially masked. A consequence of full masking (setting values to a constant value
outside the mask) is that correlations with a perfect map, with a model-based map, or between
half-maps16 are increased relative to an unmasked map. We carried out a test to evaluate
whether this effect contributes in a substantial way to the improvement in resolution by
density modification shown in Fig. 1B. The test consisted of repeating the analysis in Fig. 1B
except that instead of actually carrying out density modification, each initial half-map was
simply multiplied by the probabilistic mask that was to be used in identifying the protein region
for density modification, and only one cycle was carried out (this can be done using the
keyword control_no_denmod in the Phenix tool resolve_cryo_em). In Fig. 1B the resolution at
which the map-model FSC falls to approximately ½ changed with density modification from 2.98
Å to 2.77 Å. In the test analysis using just masking, this value changed by only 0.01 Å (from 2.98
Å to 2.97 Å), indicating that the masking effect is very small in this case.
Resolution cutoff used for density modification.
In order to include information at high resolution, the resolution ($d_{dm}$) of Fourier coefficients
used in the density modification procedure is typically finer than the resolution of the full map.
The relationship between the resolution $d$ of a map and the optimal resolution $d_{dm}$ for density
modification is not obvious, so we used an analysis of 51 half-maps from the EMDB and
associated models from the PDB to develop an empirical relationship. The empirical function
was obtained by carrying out the entire density modification procedure for each dataset, each
with a range of values of $d_{dm}$. Then the average FSC between map and model-based map was
calculated for each analysis and a simple function was developed for choosing the resolutions
where density modification was optimal. This function, developed for resolutions between 2.4
Å and 5 Å, was:
$d_{dm} = 2.4 + 0.99(d-3) - 0.2(d-3)^2$    (10)
At a resolution $d$ = 2.4 Å, this yields $d_{dm}$ = 1.9. For maps with resolution finer than 2.4 Å, we
simply subtract 0.5 Å from the resolution to yield $d_{dm}$, except that $d_{dm}$ is never allowed to be
less than ½ $d$. For resolutions greater than 5 Å the optimal resolution for density modification
has not been identified. In Fig. 2 Eq. (10) is used.
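The rule for choosing the density modification resolution can be written as a small function. This is a sketch that uses the constants exactly as printed in Eq. (10). With these constants the polynomial evaluates to about 1.73 Å at d = 2.4 Å rather than the 1.9 Å quoted in the text, so the coefficients should be checked against the Phenix source before being relied upon. The function name and the choice to raise an error above 5 Å are assumptions made for illustration.

```python
def density_modification_resolution(d):
    """Resolution cutoff d_dm (in Å) for a map of nominal resolution d (in Å)."""
    if d < 2.4:
        # Finer than 2.4 Å: subtract 0.5 Å, but never go below d / 2.
        return max(d - 0.5, 0.5 * d)
    if d <= 5.0:
        # Eq. (10), constants as printed.
        return 2.4 + 0.99 * (d - 3.0) - 0.2 * (d - 3.0) ** 2
    raise ValueError("no empirical rule is given for resolutions coarser than 5 Å")


d_dm_3 = density_modification_resolution(3.0)    # polynomial branch
d_dm_2 = density_modification_resolution(2.0)    # subtract-0.5 branch
d_dm_08 = density_modification_resolution(0.8)   # limited to d / 2
```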
Three options for choice of resolution cutoff for density modification $d_{dm}$ are available in the
current implementation of density modification. One is directly specifying $d_{dm}$, one is using Eq.
10 to estimate $d_{dm}$, and the last is to try various values of $d_{dm}$ and choose the one that leads to
the most favorable estimated improvement in the resolution where $FSC_{ref}$ is ½ based on Eq. 9.
Adjustable parameters
There are many adjustable parameters in our procedure but all have default values (default
values for version 3689 of Phenix were used in all 104 tests in Fig. 2). Some of the parameters
that can substantially affect the results and that a user might vary if the initial results are not
optimal are listed here. The resolution used for density modification is not fully optimized and
can affect the outcome substantially. Additional cycles of density modification can improve the
outcome in many cases. The choice of final scaling procedure (final_scale_with_fsc_ref) can
affect the density-modified map, as can the choice of half-map sharpening at the start of
density modification (density_modify_unsharpened_maps). The number of shells of resolution
used in the calculation of correlations between Fourier coefficients has a default of 100; more
shells can potentially improve the accuracy by not grouping coefficients that have very different
values simply due to resolution-dependent variation but could reduce it due to fewer
coefficients in a calculation. The optional use of spectral scaling, real-space weighting, and
individual weighting of Fourier coefficients at the end of the procedure can sometimes affect
the resulting map.
Histogram-matching
We examined whether map improvement can also be obtained by using histogram-matching43
without density modification. With histogram-matching alone, a new map value at each grid
point is obtained based on the distributions of expected and observed values in the map and
the value at that grid point. With histogram-matching as a part of density modification, this
histogram-matched map is not used directly as the new map, but rather it is used as a
likelihood target that indicates the plausibility of a candidate map. This allows information
from all parts of the map to be combined to yield new estimates of each Fourier coefficient.
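Histogram-matching on its own can be illustrated with a rank-based quantile mapping. This is a generic sketch of the technique, not the specific algorithm of reference 43 or of Phenix. It assumes the map values are supplied as a flat list and that the expected distribution is given as a sample of target values. The function name and the numbers are hypothetical.

```python
def histogram_match(values, target_values):
    """Rank-based histogram matching.

    Each input value is replaced by the target value at the same quantile,
    so the output follows the target distribution while keeping the rank
    order of the input.
    """
    target_sorted = sorted(target_values)
    n, m = len(values), len(target_sorted)
    order = sorted(range(n), key=lambda i: values[i])
    matched = [0.0] * n
    for rank, index in enumerate(order):
        # Map rank 0..n-1 onto a position 0..m-1 in the sorted target values.
        position = round(rank * (m - 1) / (n - 1)) if n > 1 else (m - 1) // 2
        matched[index] = target_sorted[position]
    return matched


# Four grid-point values matched to a four-value target sample.
matched = histogram_match([0.3, -1.2, 0.9, 0.1], [10.0, 20.0, 30.0, 40.0])
```

The mapping acts on each grid point independently, using only the value at that point and the two distributions, which is why by itself it cannot combine information from different parts of the map.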
To test whether the map improvement obtained with density modification could be obtained
using real-space methods alone, we applied histogram-matching43 to the 1.8 Å apoferritin map
shown in Fig. 1C. Supplementary Fig. S2 illustrates that our histogram-matching approach
improves this map, but density modification improves this map considerably more. Panel A
shows the average of deposited half-maps, sharpened to match density-modified map in panel
C. Panel B shows the histogram-matched version of the map in A, and panel C shows the
density-modified map. Note the considerable improvement in clarity in the ring of F81 in the
density-modified map.
Effects of density modification on individual half-maps
Supplementary Fig. S3 shows the individual original half-maps and the two density-modified
half-maps produced by the 1.8 Å apoferritin density modification example shown in
Supplementary Fig. S2. It can be seen that the original half-maps have similar clarity but are
different, and that both half-maps are improved by density modification and are again
somewhat different.
Sensitivity of density modification to parameters
We tested the sensitivity of density modification to two parameters that seemed likely to have
substantial effects on the procedure, the box size, and the resolution used for density
modification. In each case we used the 3.1 Å apoferritin map analyzed in Fig. 1. Supplementary
Fig. S4 shows the effect of varying the resolution used for density modification on the quality of
density modification, as measured by the estimated improvement in resolution and actual
improvement (change in the resolution where the map-model FSC was approximately ½). It can
be seen that the density modification resolution has a small effect on the actual improvement
over the range of 2 Å – 2.6 Å but, as expected, the improvement becomes smaller as the
density modification resolution approaches the nominal resolution of the map. The estimated
resolution has more variability but a similar relationship.
We investigated the effect of box size by carrying out density modification using different size
boxes to extract the molecule, ranging from a box just the size of the molecule (defined as the
region with expected volume where density is highest), to a box with edges 30 Å bigger in each
direction (the default is a box 10 Å bigger than the molecule). Over this range of box sizes the
resulting resolution where the FSC to the model equaled ½ varied by a small amount (ranging
from 2.76 Å to 2.83 Å for all boxes except the smallest one, which had a value of 2.92 Å), with
the best resolution of 2.76 Å obtained with a box 4 Å bigger than the smallest one. This indicates
that at least in this case the exact box size is not very important but that a small box may be
slightly better than a bigger one.
Changes in map-model metrics after density modification
We examined whether common model and map-model metrics were substantially different
using density-modified maps compared to the original maps. Supplementary Fig. S5 shows that
map-model correlation (calculated at a resolution given by 0.83 times the nominal resolution of
the starting map to emphasize high-resolution information) generally improves, but that the
other metrics examined (rotamer outliers, ClashScore44, Ramachandran % favored, EMRinger45
scores) did not change in a systematic way over the resolution range of 2 Å – 4.5 Å where our
method is designed to apply.
Supplementary Fig. S6 shows the local map-model correlation for EMD-7544 and associated
PDB entry 6coy46 before and after density modification. The density modified map has
generally higher map-model correlation but the difference varies somewhat by location along
the chain.
Effect of applying multiple cycles of density modification
Supplementary Fig. S7 illustrates the effect of multiple cycles of density modification on the average
map-model FSC obtained. Panel A corresponds to Fig. 2B after 5 cycles of density modification.
Panel B compares the average map-model FSC after one cycle with the same metric after 5 cycles.
It can be seen that additional cycles lead to a substantial improvement in a number of cases, while in
many other cases there is little effect after additional cycles, and in 2 cases there is substantial
worsening.
Softwareavailability
AlltheproceduresdescribedinthisworkareavailableusingthePhenixtoolresolve_cryo_emin
versions3689andlaterofthePhenixsoftwaresuite20.
Acknowledgements
This work was supported by the NIH (grant GM063210 to PDA, RJR and TT and grant
R01-GM080139 to SJL), the Wellcome Trust (grant 20947/Z/17/Z to RJR), and the Phenix Industrial
Consortium. This work was supported in part by the US Department of Energy under Contract
No. DE-AC02-05CH11231 at Lawrence Berkeley National Laboratory.
Author contributions
SL carried out image processing of test datasets to evaluate varying reconstruction procedures,
RJR and TCT contributed ideas on the form of errors in cryo-EM, PA developed tools for the
testing infrastructure, TCT developed the software for error analysis, and PDA and TCT
supervised the work.
Author information
The authors declare no competing financial interests.
Correspondence and requests for materials should be addressed to:
References
1 Nogales, E. The development of cryo-EM into a mainstream structural biology technique. Nat Methods 13, 24-27, (2016).
2 Marques, M. A., Purdy, M. D. & Yeager, M. CryoEM maps are full of potential. Current Opinion in Structural Biology 58, 214-223, (2019).
3 Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. Cryo-EM map interpretation and protein model-building using iterative map segmentation. Protein Sci, (2019).
4 Cowtan, K. Recent developments in classical density modification. Acta Crystallographica Section D 66, 470-478, (2010).
5 Terwilliger, T. Maximum-likelihood density modification. Acta Crystallographica Section D 56, 965-972, (2000).
6 Wang, B. C. Resolution of phase ambiguity in macromolecular crystallography. Methods Enzymol 115, 90-112, (1985).
7 Podjarny, A. D., Rees, B. & Urzhumtsev, A. G. in Crystallographic Methods and Protocols (eds Christopher Jones, Barbara Mulloy, & Mark R. Sanderson) 205-226 (Humana Press, 1996).
8 Scheres, S. H. A Bayesian view on cryo-EM structure determination. J Mol Biol 415, 406-418, (2012).
9 Sindelar, C. V. & Grigorieff, N. Optimal noise reduction in 3D reconstructions of single particles using a volume-normalized filter. J Struct Biol 180, 26-38, (2012).
10 Cheng, Y., Grigorieff, N., Penczek, P. A. & Walz, T. A primer to single-particle cryo-electron microscopy. Cell 161, 438-449, (2015).
11 Ramlaul, K., Palmer, C. M. & Aylett, C. H. S. A Local Agreement Filtering Algorithm for Transmission EM Reconstructions. Journal of Structural Biology 205, 30-40, (2019).
12 Chen, S. et al. High-resolution noise substitution to measure overfitting and validate resolution in 3D structure determination by single particle electron cryomicroscopy. Ultramicroscopy 135, 24-35, (2013).
13 Cardone, G., Heymann, J. B. & Steven, A. C. One number does not fit all: Mapping local variations in resolution in cryo-EM reconstructions. Journal of Structural Biology 184, 226-236, (2013).
14 Spiegel, M., Duraisamy, A. K. & Schröder, G. F. Improving the visualization of cryo-EM density reconstructions. Journal of Structural Biology 191, 207-213, (2015).
15 Murshudov, G. N. in Methods in Enzymology Vol. 579 (ed R. A. Crowther) 277-305 (Academic Press, 2016).
16 Rosenthal, P. B. & Rubinstein, J. L. Validating maps from single particle electron cryomicroscopy. Current Opinion in Structural Biology 34, 135-144, (2015).
17 Lawson, C. L. et al. EMDataBank.org: unified data resource for CryoEM. Nucleic Acids Res 39, D456-D464, (2011).
18 Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J Mol Biol 333, 721-745, (2003).
19 Karplus, P. A. & Diederichs, K. Linking Crystallographic Model and Data Quality. Science 336, 1030-1033, (2012).
20 Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallographica Section D 75, 861-877, (2019).
21 Urzhumtsev, A., Afonine, P. V., Lunin, V. Y., Terwilliger, T. C. & Adams, P. D. Metrics for comparison of crystallographic maps. Acta Crystallographica Section D 70, 2593-2606, (2014).
22 Masuda, T., Goto, F., Yoshihara, T. & Mikami, B. The universal mechanism for iron translocation to the ferroxidase site in ferritin, which is mediated by the well conserved transit site. Biochemical and Biophysical Research Communications 400, 94-99, (2010).
23 Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res 28, 235-242, (2000).
24 van Heel, M. & Schatz, M. Fourier shell correlation threshold criteria. Journal of Structural Biology 151, 250-262, (2005).
25 van Heel, M. & Schatz, M. Reassessing the Revolution’s Resolutions. bioRxiv, 224402, (2017).
26 Afanasyev, P. et al. Single-particle cryo-EM using alignment by classification (ABC): the structure of Lumbricus terrestris haemoglobin. IUCrJ 4, 678-694, (2017).
27 Bartesaghi, A. et al. 2.2 Å resolution cryo-EM structure of β-galactosidase in complex with a cell-permeant inhibitor. Science 348, 1147, (2015).
28 Horst, B. G. et al. Allosteric activation of the nitric oxide receptor soluble guanylate cyclase mapped by cryo-electron microscopy. eLife 8 (2019).
29 Iudin, A., Korir, P. K., Salavert-Torres, J., Kleywegt, G. J. & Patwardhan, A. EMPIAR: a public archive for raw electron microscopy image data. Nature Methods 13, 387-388, (2016).
30 Tang, G. et al. EMAN2: An extensible image processing suite for electron microscopy. Journal of Structural Biology 157, 38-46, (2007).
31 Jakobi, A. J., Wilmanns, M. & Sachse, C. Model-based local density sharpening of cryo-EM maps. Elife 6, (2017).
32 Skubak, P. et al. A new MR-SAD algorithm for the automatic building of protein models from low-resolution X-ray data and a poor starting model. IUCrJ 5, 166-171, (2018).
33 Terwilliger, T. Improving macromolecular atomic models at moderate resolution by automated iterative model building, statistical density modification and refinement. Acta Crystallographica Section D 59, 1174-1182, (2003).
34 Bricogne, G. Geometric sources of redundancy in intensity data and their use for phase determination. Acta Crystallographica Section A 30, 395-405, (1974).
35 Abrahams, J. P. & Leslie, A. G. Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Crystallogr D Biol Crystallogr 52, 30-42, (1996).
36 Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallographica Section D 74, 531-544, (2018).
37 Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. Automated map sharpening by maximization of detail and connectivity. Acta Crystallographica Section D 74, 545-559, (2018).
38 Brown, A. et al. Tools for macromolecular model building and refinement into electron cryo-microscopy reconstructions. Acta Crystallogr D Biol Crystallogr 71, 136-153, (2015).
39 Pettersen, E. F. et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 25, 1605-1612, (2004).
40 Terwilliger, T. Map-likelihood phasing. Acta Crystallographica Section D 57, 1763-1775, (2001).
41 Sousa, D. & Grigorieff, N. Ab initio resolution measurement for single particle structures. J Struct Biol 157, 201-210, (2007).
42 Shaikh, T. R., Hegerl, R. & Frank, J. An approach to examining model dependence in EM reconstructions using cross-validation. Journal of Structural Biology 142, 301-310, (2003).
43 Zhang, K. Y. J., Cowtan, K. & Main, P. in Methods in Enzymology Vol. 277 53-64 (Academic Press, 1997).
44 Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta crystallographica. Section D, Biological crystallography 66, 12-21, (2010).
45 Barad, B. A. et al. EMRinger: side chain–directed model and map validation for 3D cryo-electron microscopy. Nature Methods 12, 943-946, (2015).
46 Park, E. & MacKinnon, R. Structure of the CLC-1 chloride channel from Homo sapiens. eLife 7 (2018).
Supplementary Figures
Supplementary Fig. S1. Analysis of non-density-modified maps to examine whether simple modifications (sharpening, using deposited maps) would yield maps that appear similar to the density-modified maps. A. Deposited map showing the same region as depicted in Fig. 2C. This map also shows poor density for the loop. Varying the sharpening of the map did not yield connected density as in Fig. 2D. B. Sharpened version of the original map shown in Fig. 2E.
Supplementary Fig. S2. Histogram matching of EMD-20026. A. Average of half-maps, sharpened to match density-modified map in C. B. Histogram-matched version of map in A, sharpened as in A. C. Density-modified map. All contours set to enclose equal volumes. Maps masked around atoms in model.
Supplementary Fig. S3. Comparison of half-maps 1 and 2 for density modification shown in Fig. S2. All maps masked and sharpened to match density-modified map in Fig. S2C. A and C, half-maps 1 and 2. B and D, density-modified half-maps 1 and 2.
Supplementary Fig. S4. Effect of density modification resolution on actual and estimated change in resolution where FSC is ½. Half-dataset maps for EMD-20028 were density modified as in Fig. 1, but varying the resolution for density modification from 2 to 2.9 Å. The estimated change in resolution (calculated from the error analysis) and the actual change in resolution (calculated from the FSC to the high-resolution EMD-20026 map) are shown.
Supplementary Fig. S5. Change in model/map metrics after density modification. Analyses are plotted as a function of resolution for the 104 datasets shown in Fig. 2. A. Map-model correlation. B. Rotamer outliers. C. ClashScore. D. Ramachandran percentage in favored region. E. EMRinger scores.
Supplementary Fig. S6. Map-model residue correlation smoothed in 10-residue windows for EMD-7544 (PDB entry 6coy).
Supplementary Fig. S7. Effect of applying 5 cycles of density modification vs 1. A. Average map-model FSC after 5 cycles of density modification (compare with Fig. 2B). B. Change in average map-model FSC between one cycle of density modification and 5 cycles.