
V7 SS 2006

Membrane Bioinformatics – Part II

V7 – Positioning of TM proteins in membrane

In the absence of high-resolution 3D structures, an important cornerstone for the functional analysis of any membrane protein is an accurate topology model.

Topology model: describes the number of TM spans and the orientation of the protein relative to the lipid bilayer.

Topology models can be generated by sequence-based prediction or by time-consuming experimental approaches.

2 medium length assignments

tutorials Fri 13.00 - 14.30


Idea: generate a reference point, e.g. the location of a protein's C terminus.

In E. coli, attach alkaline phosphatase (PhoA), which is active only in the periplasm, or green fluorescent protein (GFP), which fluoresces only in the cytoplasm.

TMHMM: 1000 of 4288 predicted E. coli genes are inner membrane proteins.

737 genes encode proteins with > 100 residues and ≥ 2 TM helices.

714 were suitable for cloning into phoA and gfp fusion vectors.

Both fusions could be obtained for 573 genes, one fusion for an additional 92 genes.

(1) Global Topology Analysis

Daley et al. Science 308, 1321 (2005)


Using homology, 601 proteins could be assigned a topology.

For 71 of these, the location of the C terminus was already established.

The results agreed except for 2 cases. The error rate is therefore ~ 1%.

TMHMM alone predicts the correct C-terminal location for 78% of the 601 proteins.

By providing unambiguous C-terminal locations, the TMHMM reliability score increases for 526 proteins and decreases for 75 proteins.

Global Topology Analysis

Daley et al. Science 308, 1321 (2005)


Functional categorization of E.coli inner membrane proteome

Daley et al. Science 308, 1321 (2005)

- clear trend for Nin-Cin topologies (even number of TM helices)

- largest functional category is transport proteins, many with 6 or 12 TM helices

- most proteins with unknown function have 6 TM helices


Idea: transfer the experimental data set from PhoA and GFP fusions to homologous proteins. Data on 608 proteins.

204 annotated eubacterial and 21 archaeal genomes in March 2005: 658,210 sequences.

BLAST searches (E-value < 10^-5) give 30,744 sequence hits where TMHMM predicts ≥ 1 TM helix.

A second BLAST query with these 30,744 sequences gives 17,111 "secondary homologs".

Extend predictions by sequence homology

Granseth et al., J.Mol.Biol. 352, 489 (2005)


Unconstrained vs. constrained prediction

Granseth et al., J.Mol.Biol. 352, 489 (2005)

(a) Unconstrained TMHMM predictions for the full set of 158,182 sequences with ≥ 1 predicted TM helix (grey bars) and constrained predictions for the 51,208 sequences for which the C-terminal location or the location of an internal residue could be annotated (black bars).

The number of proteins with each topology is shown; Cin topologies are plotted upwards, Cout downwards. The number of Cout proteins with a single TM helix (39,322) is off-scale.

The unconstrained algorithm predicts too many proteins as Cout.

(b) TMHMM predictions for the 51,208 annotated sequences before (grey bars) and after (black bars) constraining the predictions with the location of the annotated residue.


Most TM proteins are expected to adopt only one topology in the membrane.

Global topology analysis of the E. coli inner membrane proteome identified 5 dual-topology candidates: EmrE, SugE, CrcB, YdgC, YnfA.

All are quite small (~ 100 aa), contain 4 strongly predicted TM segments, contain only a few K and R residues and have a very small (K + R) bias.

(2) Dual-topology proteins?

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)

(a) A dual-topology protein inserts into the membrane in two opposite directions. As nearly all helix-bundle membrane proteins have a higher number of lysine (K) and arginine (R) residues in cytoplasmic (in) than in periplasmic (out) loops (the 'positive-inside' rule), dual-topology proteins are expected to have very small (K + R) biases.

Rectangles: TM segments

Black dots: K and R residues


Without solving their 3D structures, how can one prove that a protein has dual topology?

Such a protein would be particularly sensitive to the addition or removal of a single positively charged residue in a loop or tail.

Measure the activities of two different, C-terminally fused reporter proteins:

PhoA (only enzymatically active when in the periplasm)

GFP (fluorescent only when in the cytoplasm).

Concentrate on the N terminus and the first loop.

Dual-topology proteins?

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)
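
The reporter logic on this slide can be written down as a small decision rule. The sketch below is only an illustration: the function name is hypothetical, and reducing each assay to a yes/no signal is an assumption (real measurements are continuous activities that need thresholds or normalisation).

```python
def c_terminal_location(phoa_active: bool, gfp_fluorescent: bool) -> str:
    """Infer the C-terminal location from two C-terminally fused reporters.

    PhoA is enzymatically active only in the periplasm;
    GFP fluoresces only in the cytoplasm.
    """
    if phoa_active and not gfp_fluorescent:
        return "Cout"          # C terminus in the periplasm
    if gfp_fluorescent and not phoa_active:
        return "Cin"           # C terminus in the cytoplasm
    if phoa_active and gfp_fluorescent:
        return "mixed"         # both signals: compatible with a dual-topology candidate
    return "undetermined"      # neither reporter gives a signal


print(c_terminal_location(True, False))
print(c_terminal_location(True, True))
```

A "mixed" readout on its own does not prove dual topology; the slide's point is that such proteins should also respond strongly to single charge mutations.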


(a) The wt YdgE-PhoA fusion is active and the wt YdgE-GFP fusion is inactive, so the C terminus is in the periplasm (Cout).

wt YdgF behaves oppositely (Cin).

These 2 proteins are topologically stable.

(b – d) The C-terminal orientation of EmrE, SugE, CrcB, YnfA and YdgC is highly sensitive to charge mutations.

For 14 of 19 charge mutations, both PhoA and GFP activities change in the direction expected from the change in (K + R) bias.

Charge mutations shift the orientations of dual-topology TM proteins

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)


Pfam searches in 174 fully sequenced bacterial genomes for homologs (E < 10^-10) to SugE, EmrE, YdgE, CrcB, YnfA, YdgC and YdgO/YdgL.

Create multiple sequence alignment with ClustalW.

Use TMHMM to predict the positions of TM helices.

Obtain a consensus TM helix prediction and compute (K + R) biases for individual proteins. 10 residues from each of the flanking TM helices were included to allow for possible misprediction of the exact positions of the loop ends.

Dual-topology homologs occur as gene pairs or singletons

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)
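
A minimal sketch of the (K + R) bias computation described on this slide. The function name, the 0-based half-open coordinates, and the sign convention (K + R in "in" loops minus K + R in "out" loops) are assumptions made for illustration; the `flank` parameter mimics the inclusion of residues from the flanking TM helices.

```python
def kr_bias(sequence, tm_segments, first_loop_inside=True, flank=0):
    """(K + R) bias: K/R count in 'in' loops minus K/R count in 'out' loops.

    tm_segments: sorted list of (start, end) TM helix intervals, 0-based, half-open.
    first_loop_inside: True if the N-terminal tail is cytoplasmic (Nin).
    flank: residues taken from each flanking TM helix and counted with the loop.
    """
    # Loop boundaries: N-terminal tail, inter-helix loops, C-terminal tail.
    bounds = [0] + [pos for seg in tm_segments for pos in seg] + [len(sequence)]
    loops = [(bounds[i], bounds[i + 1]) for i in range(0, len(bounds), 2)]

    bias = 0
    inside = first_loop_inside
    for start, end in loops:
        lo = max(0, start - flank)
        hi = min(len(sequence), end + flank)
        count = sum(1 for aa in sequence[lo:hi] if aa in "KR")
        bias += count if inside else -count
        inside = not inside  # loops alternate between the two sides of the membrane
    return bias


# Toy protein: tail "MKKR", helix, loop "DD", helix, tail "K"
toy = "MKKR" + "L" * 20 + "DD" + "L" * 20 + "K"
print(kr_bias(toy, [(4, 24), (26, 46)], flank=10))
```

A bias close to zero is what the slides associate with dual-topology candidates; paired homologs would be expected to show biases of opposite sign.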


Interpretation: SMR and CrcB occur as closely spaced pairs or as singletons.

Paired genes encode homologous proteins with opposite (K + R) bias.

Dual-topology homologs occur as gene pairs or singletons

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)


Most likely evolutionary scenario: a single dual-topology protein undergoes gene duplication, the two resulting proteins become fixed in opposite orientations and finally fuse into a single polypeptide.

An internally duplicated protein with opposite topology

Rapp et al., Nat.Struct.Biol. 13, 112 (2006)


Global topology analysis of the E. coli inner membrane proteome showed that ca. 20–25% of the TM proteins have ≥ 10 TM helices.

These are often involved in transport of small molecules across the membrane.

Many of these proteins will have buried helices. Can we identify those?

Develop an empirical helix burial function f based on a few assumptions.

(i) residues in buried helices are more conserved because of structural and functional constraints.

(ii) the residue composition of the buried helices is different from the composition of helices facing the lipid environment.

(iii) the difference between the minimal and maximal values of conservation entropy for every position in MSAs of TM helices should be smaller in buried helices than in lipid-exposed helices because of the homogeneous environment.

(3) Prediction of buried TM helices

Adamian & Liang, Proteins 63, 1 (2006)
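
Assumptions (i) and (iii) both rest on a per-position conservation entropy computed from a multiple sequence alignment. The sketch below shows one common way to obtain it (Shannon entropy of each alignment column, gaps ignored). The helper names are hypothetical, and the exact entropy definition used by Adamian & Liang is not given on the slide, so this is an assumption rather than their implementation.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in nats) of one alignment column; gap characters are ignored."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def helix_entropy_profile(msa, start, end):
    """Entropies of alignment columns start..end-1, i.e. the positions of one TM helix."""
    return [column_entropy([seq[i] for seq in msa]) for i in range(start, end)]


def mean_and_range(values):
    """Average entropy and (max - min) spread over the helix positions."""
    return sum(values) / len(values), max(values) - min(values)


# Toy alignment of one 3-residue stretch in four sequences
msa = ["AAL", "AGL", "AGI", "AGV"]
profile = helix_entropy_profile(msa, 0, 3)
print(profile, mean_and_range(profile))
```

Under the slide's assumptions, a buried helix would show both a lower mean entropy and a smaller spread than a lipid-exposed helix from the same alignment.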


f: burial function

s: average entropy of all residue positions of the TM helix

l: average lipophilicity

k: sorted entropy values of all residue positions in a helix of length d, for helices 1 ... n of the TM protein

Problems: the average entropy depends on the number of sequences in the MSA, so the method needs MSAs with exactly the same set of sequences from the same set of species.

Also, the stability of different membrane proteins in the lipid environment may differ.

Account for ambiguity in the definition of TM helix ends.

Burial Function

Adamian & Liang, Proteins 63, 1 (2006)

$$f = f(k, s, l)$$

$$s = \frac{s_1 + s_2 + \dots + s_d}{d}, \qquad l = \frac{l_1 + l_2 + \dots + l_d}{d}$$


Ranking of TM helices by burial function and robustness

Adamian & Liang, Proteins 63, 1 (2006)


(a) TM helices TM4, TM5, TM6, TM8 form core, consistent with prediction.

(b) TM4, TM10 are most buried.

(c) the prediction of TM8 as buried can be explained by a tightly bound cardiolipin molecule identified in the X-ray structure.

Examples of buried TM helices that are correctly predicted

Adamian & Liang, Proteins 63, 1 (2006)


Is the method applicable to TM proteins where only sequence data is available?

Test on the structure of the Leu transporter.

TMHMM predicts 12 TM helices. Good overlap with X-ray helices.

Problem: no additional sequences exist that are annotated as Na+-dependent Leu transporters.

LeuTAa has 3 significantly buried helices: 1, 6 and 8.

1 and 6 are true positives, 2 is a false positive, 8 is a false negative.

Test ranking results

Adamian & Liang, Proteins 63, 1 (2006)


Experimental techniques to study orientation of proteins in membranes

chemical modification

spin-labeling

fluorescence quenching

X-ray scattering

neutron diffraction

electron cryomicroscopy

NMR

polarized infrared spectroscopy.

Desirable to complement by computational methods, e.g. explicit-solvent molecular dynamics ... up to simplified approaches that minimize the protein transfer energy from water to a hydrophobic slab (used as a membrane model).

(4) Positioning of proteins in membranes

Lomize et al. Prot.Sci. 15, 1318 (2006)


important parameters

Lomize et al. Prot.Sci. 15, 1318 (2006)


Model the protein as a rigid body that floats freely in the planar hydrocarbon core of a lipid bilayer.

Calculation of transfer energy

Lomize et al. Prot.Sci. 15, 1318 (2006)

$$\Delta G_{transfer}(z_0, d, \varphi, \tau) = \sum_i ASA_i \, \sigma_i^{W\text{-}M} \, f(z_i)$$

ASA_i: accessible surface area of atom i, computed with NACCESS

σ_i^{W-M}: solvation parameter of atom i (transfer energy of the atom from water to the membrane interior in kcal/(mol·Å²))

f(z_i): interfacial water concentration profile with λ = 0.9 Å

$$f(z_i) = \frac{1}{1 + e^{(|z_i| - z_0)/\lambda}}$$
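
To make the transfer-energy sum concrete, here is a minimal sketch that evaluates a sigmoidal profile f(z) = 1 / (1 + exp((|z| - z0) / λ)) and the sum of ASA_i · σ_i · f(z_i) over atoms. The function names and the atom tuples are hypothetical, and the σ value in the example is a placeholder chosen for illustration, not a published solvation parameter. Only λ = 0.9 Å is taken from the slide.

```python
import math

LAMBDA = 0.9  # Å, width of the interfacial profile (value from the slide)


def interfacial_profile(z, z0, lam=LAMBDA):
    """f(z) = 1 / (1 + exp((|z| - z0) / lam)): ~1 inside the slab, ~0 in water."""
    x = (abs(z) - z0) / lam
    if x > 700.0:          # avoid overflow in exp() for atoms very far from the slab
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def transfer_energy(atoms, z0, lam=LAMBDA):
    """Sum of ASA_i * sigma_i * f(z_i) over atoms given as (asa, sigma, z) tuples.

    asa in Å^2, sigma in kcal/(mol Å^2), z in Å measured from the slab centre.
    """
    return sum(asa * sigma * interfacial_profile(z, z0, lam)
               for asa, sigma, z in atoms)


# Placeholder atoms: (ASA, sigma, z); sigma values are illustrative only
atoms = [(10.0, -0.02, 0.0), (10.0, -0.02, 15.0), (10.0, -0.02, 30.0)]
print(transfer_energy(atoms, z0=15.0))
```

An atom at the slab centre contributes almost its full ASA · σ, an atom exactly at |z| = z0 contributes half, and an atom far out in water contributes essentially nothing.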


ionization of charged residues

Residues that are typically charged in soluble proteins may become neutral in the

hydrophobic inside of the bilayer!

The ionization/protonation energies of charged residues are described by the

Henderson-Hasselbalch equation:

Lomize et al. Prot.Sci. 15, 1318 (2006)

$$\Delta G_{ioniz} = 2.3\,RT\,\left|\mathrm{pH} - \mathrm{p}K_a\right| \quad \text{at pH} = 7$$

Residue   average pKa in proteins   ΔG_ioniz [kcal/mol]
Arg       12.0                      6.9
Lys       10.4                      4.7
Asp        3.4                      4.9
Glu        4.1                      4.0
His        6.6                      0.6
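
The tabulated ionization energies can be checked with a few lines of arithmetic, using ΔG_ioniz = ln(10) · RT · |pH − pKa| with ln(10) ≈ 2.3. The temperature is not stated on the slide, so T = 298.15 K is an assumption here, and the function name is hypothetical. With that choice the computed values land within about 0.1 kcal/mol of the table.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

# Average pKa values in proteins, as listed on the slide
PKA = {"Arg": 12.0, "Lys": 10.4, "Asp": 3.4, "Glu": 4.1, "His": 6.6}


def ionization_energy(pka, ph=7.0, temperature=298.15):
    """Cost (kcal/mol) of neutralising a group: ln(10) * R * T * |pH - pKa|."""
    return math.log(10) * R_KCAL * temperature * abs(ph - pka)


for residue, pka in PKA.items():
    print(f"{residue}: {ionization_energy(pka):.2f} kcal/mol")
```

The absolute value covers both cases: deprotonating a base (Arg, Lys) and protonating an acid (Asp, Glu) at pH 7.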


use deterministic 2-step search strategy:

(1) grid scan to determine a set of low-energy combinations of the variables z0, d, φ, τ; grid steps: 0.5 Å for z0 and d, 5° for φ, 2° for τ

(2) local energy minimization (Davidon-Fletcher-Powell method) starting from the low-energy points

Also consider the energetically best rotation of solvent-exposed charged side chains (e.g. Lys and Arg) that are situated close to the calculated boundaries and could be rotated away from the hydrophobic core

Global energy optimization

Lomize et al. Prot.Sci. 15, 1318 (2006)
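
The two-step strategy (coarse grid scan, then local refinement from the best grid points) can be sketched on a toy energy surface. Everything below is illustrative: the function names and the toy energy are hypothetical, only two variables are scanned, and the local step is a simple derivative-free pattern search used as a stand-in. It is not the Davidon-Fletcher-Powell method named on the slide.

```python
import itertools


def frange(start, stop, step):
    """Inclusive list of grid values from start to stop."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def grid_scan(energy, axes, keep=3):
    """Step 1: evaluate the energy on the full grid, return the `keep` lowest points."""
    return sorted(itertools.product(*axes), key=energy)[:keep]


def local_minimize(energy, point, steps, shrink=0.5, tol=1e-4):
    """Step 2 stand-in: pattern search that halves the step when nothing improves."""
    point, steps = list(point), list(steps)
    best = energy(point)
    while max(steps) > tol:
        improved = False
        for i in range(len(point)):
            for delta in (steps[i], -steps[i]):
                trial = point[:]
                trial[i] += delta
                e = energy(trial)
                if e < best:
                    point, best, improved = trial, e, True
        if not improved:
            steps = [s * shrink for s in steps]
    return point, best


def two_step_search(energy, axes, steps, keep=3):
    """Refine each low-energy grid point and return the overall best (point, energy)."""
    starts = grid_scan(energy, axes, keep)
    return min((local_minimize(energy, p, steps) for p in starts), key=lambda r: r[1])


def toy_energy(p):
    """Hypothetical smooth surface with its minimum at z0 = 14.8 Å, tilt = 7.3°."""
    z0, tilt = p
    return (z0 - 14.8) ** 2 + 0.01 * (tilt - 7.3) ** 2


axes = [frange(10.0, 20.0, 0.5), frange(0.0, 90.0, 2.0)]  # 0.5 Å and 2° grid steps
best_point, best_energy = two_step_search(toy_energy, axes, steps=(0.5, 2.0))
print(best_point, best_energy)
```

Starting the local search from several grid points rather than only the single best one reduces the risk of ending in a local minimum when the real energy surface is rugged.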


Which solvation parameters to use?

Results with the cyclohexane (chx) and decadiene (dcd) parameter sets agree well with experiment; octanol (oct) agrees poorly.

Lomize et al. Prot.Sci. 15, 1318 (2006)


features of model

slightly different parameter sets should be applied for proteins in detergents and bilayers

ΔG_transfer should not include contributions of atoms that face internal polar cavities of TM proteins and that do not directly interact with surrounding bulk lipid

( mention results of Sam)

Otherwise, the orientation of many β-barrels and pore-forming transporters would be computed incorrectly

Lomize et al. Prot.Sci. 15, 1318 (2006)


Main features of model

necessary and sufficient approximations for reproducing the experimental data

(1) the lipid bilayer is represented as a planar hydrophobic slab with adjustable thickness and a narrow interfacial area with a sigmoidal polarity profile

(2) proteins are considered as rigid bodies with flexible side chains; their transfer energies are minimized with respect to 4 variables

(3) the transfer free energy is calculated at an all-atom level using atomic solvation parameters determined for the water-decadiene system

(4) neglect explicit electrostatic interactions, account for neutralization of charged residues

(5) eliminate contributions of pore-facing atoms

The model depends only on 5 atomic solvation parameters (N, O, S, sp2 C, sp3 C), one constant λ, and the ionization energies of charged groups.

All can be obtained independently from experimental sources.

Verify the method for 24 TM proteins of known 3D structure whose spatial positions in bilayers have been studied experimentally.

Lomize et al. Prot.Sci. 15, 1318 (2006)


Average tilt angles

(a) hydrophobic thickness matches well (table 2)

Lomize et al. Prot.Sci. 15, 1318 (2006)

(b) the calculated tilt values are in excellent agreement with NMR data; they also correlate well with ATR-FTIR data (table 3), although the experimental values are systematically larger. Orientational disorder in the experiments?


Membrane penetration depths

Lomize et al. Prot.Sci. 15, 1318 (2006)


Introduction

Lomize et al. Prot.Sci. 15, 1318 (2006)


Membrane core boundaries

Lomize et al. Prot.Sci. 15, 1318 (2006)


application to all other 109 TM protein complexes

80 α-helical

28 β-barrels

gramicidin dimer

control set:

20 water-soluble proteins

32 monotopic and peripheral proteins

Application to all TM proteins from the PDB

Lomize et al. Prot.Sci. 15, 1318 (2006)


Peripheral and monotopic proteins have low penetration depths.

Calculated tilt angles vary from 0° - 6°.

TM proteins tend to be nearly perpendicular to the membrane, although the individual helices are on average tilted by 21°.

Application to membrane proteins

Lomize et al. Prot.Sci. 15, 1318 (2006)


Biological membranes differ

Lomize et al. Prot.Sci. 15, 1318 (2006)


Fluctuations are larger for TM proteins with a smaller TM perimeter.

Fluctuations around energy minimum

Lomize et al. Prot.Sci. 15, 1318 (2006)