DOMAIN, TERTIARY, AND QUATERNARY STRUCTURE OF PROTEINS

Levels of protein structure organization

Between secondary and tertiary structure

• Supersecondary structure: arrangement of elements of same or different secondary structure into motifs; a motif is usually not stable by itself.

• Domains: A domain is an independent unit, usually stable by itself; it can comprise the whole protein or a part of the protein.

Example of a β-hairpin: tryptophan zipper (1LE0)

Example of a β-hairpin in bovine pancreatic trypsin inhibitor (BPTI).

Example of a protein with two β-hairpins: erabutoxin from sea snake venom.

Example of a β-meander: α-spectrin SH3 domain (1BK2)

Helix Hairpin

Alpha alpha corner (L7.24)

Because of its high content of acidic amino-acid residues with side chains pointing into the loop, the EF-hand motif constitutes a calcium-binding scaffold in troponin, calmodulin, etc.

Helix E

helix F

Troponin C with four EF motifs that bind calcium ions.

The Helix-Turn-Helix motif

• This motif is characteristic of proteins binding to the major groove of DNA.

• The proteins containing this motif recognize palindromic DNA sequences.

• The second helix is responsible for nucleotide sequence recognition.
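The palindromic sequences mentioned above are palindromes in the reverse-complement sense: both strands read the same in the 5'→3' direction. A minimal Python sketch of that check follows. The function names are illustrative, and the test sequence GAATTC is used only as a familiar palindromic site, not as a target of any particular helix-turn-helix protein.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True if the sequence is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    print(is_palindromic("GAATTC"))   # True
    print(is_palindromic("GATTACA"))  # False
```

A sequence of odd length can never pass this test, because its central base would have to be its own complement.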

The α-helix-β-hairpin motif (zinc finger)

β-α-β motif (very important and very frequent)

Hydrophobic core between α-helix and β-sheet

The Greek Key Motif

The Greek-key motif as seen in proteins

Domains: classification criteria

1. Functionality (performing a biological function, or a role in the formation and stabilization of globular structure)

2. Solubility:

• Globular proteins and protein domains (water soluble)

• Membrane proteins and domains (lipid soluble)

• Fibrillar proteins (insoluble)

3. Content of secondary structure:

• α

• β (parallel and antiparallel)

• α/β

• α+β

• high disulfide-bridge or metal content.

Protein domains

There is no consensus method to determine the domains and a structure is often dissected into domains in an arbitrary and intuitive manner. However, the following features apply:

- a domain is a potentially independent folding unit,

- a domain is a potentially independent structural (and functional) unit, but it is covalently connected with the remaining part of the protein,

- the domain sequence is often conserved, and the same or similar sequences can be encountered in other proteins with the same domain(s),

- domains often perform specific functions (e.g., nucleotide or saccharide binding),

- a domain interface often functions as the active center.

A protein molecule can consist of a single or of several domains.

http://pawsonlab.mshri.on.ca/

• Domains form an important level in the hierarchical organisation of the three-dimensional structure of globular proteins, although not all proteins can be described as multidomain structures.

Domains: example

• Domains of recently evolved proteins are frequently encoded by exons, reflecting gene fusion of simpler modules. For example, in the case of hepatocyte growth factors and plasminogens, a number of kringle domains are present.

• For a recent review on domain insertion. Domain swapping between two protomers is not uncommon (for example in the case of diphtheria toxin).

Example of division of a protein into domains: human Hsp70 chaperone

Human cystatin C: domain swapping

Domain identification algorithms

Schultz’s method: neighborhood correlation criterion

The Go algorithm: interdomain distances are larger than intradomain distances
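The distance criterion stated above can be illustrated for the simplest case, a chain cut into two continuous segments. The sketch below is a toy illustration under that assumption, not the published Go algorithm: the function names, the scoring ratio, and the coordinates are all hypothetical. For every allowed cut point it compares the mean distance between the two segments with the mean distance inside them and keeps the cut with the largest ratio.

```python
import math
from itertools import combinations


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def split_score(coords, cut):
    """Mean interdomain distance divided by mean intradomain distance."""
    first, second = coords[:cut], coords[cut:]
    intra = [math.dist(a, b)
             for segment in (first, second)
             for a, b in combinations(segment, 2)]
    inter = [math.dist(a, b) for a in first for b in second]
    return mean(inter) / mean(intra)


def best_cut(coords, min_size=2):
    """Cut index that best separates the chain into two compact segments."""
    cuts = range(min_size, len(coords) - min_size + 1)
    return max(cuts, key=lambda cut: split_score(coords, cut))


if __name__ == "__main__":
    # Hypothetical C-alpha positions: two compact groups 30 units apart.
    toy = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),
           (30, 0, 0), (31, 0, 0), (30, 1, 0), (31, 1, 0)]
    print(best_cut(toy))  # 4
```

Restricting the search to a single cut is the main simplification here; it cannot describe discontinuous domains or proteins with more than two domains.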

The Rose algorithm: based on the deviation of the long axes of the fragments from protein mean plane; works for continuous domains            

Cα-Cα distances between secondary structures are represented in the form of average values termed 'proximity indices' and the secondary structural organisation is indicated in the form of dendrograms. An example is shown for the case of calmodulin.
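As a rough illustration of the averaging step, the sketch below computes a proximity index for each pair of secondary-structure elements. It assumes that the index is the plain mean over all residue pairs of the two elements. The element names and coordinates are hypothetical, and the dendrogram construction itself is not shown.

```python
import math


def proximity_index(element_a, element_b):
    """Mean C-alpha to C-alpha distance between two secondary-structure elements.

    Each element is a list of (x, y, z) C-alpha coordinates.
    """
    distances = [math.dist(a, b) for a in element_a for b in element_b]
    return sum(distances) / len(distances)


def proximity_table(elements):
    """Proximity index for every unordered pair of named elements."""
    names = list(elements)
    return {(p, q): proximity_index(elements[p], elements[q])
            for i, p in enumerate(names)
            for q in names[i + 1:]}


if __name__ == "__main__":
    # Hypothetical one-residue elements, chosen so the distance is easy to check.
    elements = {"helix_1": [(0, 0, 0)], "strand_1": [(3, 4, 0)]}
    print(proximity_table(elements))  # {('helix_1', 'strand_1'): 5.0}
```

A table of this kind is the natural input for a hierarchical clustering step that would produce the dendrogram.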

The Crippen algorithm: based on dissection of residues according to interresidue distances into clusters

Specific nodes in these dendrograms are identified as tertiary structural clusters of the protein; these include supersecondary structures and domains. A ratio of the average proximity indices (ignoring inter-cluster distances) to the average of all proximity indices, weighted for the aggregation of small sub-clusters and termed the disjoint factor, is employed as a discriminatory parameter to identify automatically clusters representing individual domains. An example of domains identified in glutathione reductase is shown below:

The domains identified by this clustering method may not correspond to the functional domains proposed. The "disjoint factor" gives a measure of the extent of interaction between domains and has been used to classify domains into one of three types: disjoint, interacting and conjoint. Based on the magnitude of the disjoint factor, domains are classified as those with sparse inter-domain interfaces (disjoint), intermediate interactions (interacting) and elaborate interfaces (conjoint). An example of the three types is shown below.
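The ratio underlying the disjoint factor can also be illustrated numerically. The sketch below computes only the unweighted ratio: the mean of the proximity indices within clusters divided by the mean of all proximity indices. The weighting for small sub-clusters is omitted, no cut-off values for the three domain types are assumed, and all names and numbers are hypothetical.

```python
def unweighted_disjoint_ratio(proximity, cluster_of):
    """Mean within-cluster proximity index divided by the mean of all indices.

    proximity:  {(element_a, element_b): proximity index}
    cluster_of: {element: cluster label}
    This leaves out the sub-cluster weighting described in the text.
    """
    all_values = list(proximity.values())
    intra = [value for (a, b), value in proximity.items()
             if cluster_of[a] == cluster_of[b]]
    if not intra:
        raise ValueError("no pair of elements shares a cluster")
    return (sum(intra) / len(intra)) / (sum(all_values) / len(all_values))


if __name__ == "__main__":
    # Hypothetical proximity indices for four helices in two clusters.
    proximity = {("H1", "H2"): 8.0, ("H3", "H4"): 10.0,
                 ("H1", "H3"): 30.0, ("H1", "H4"): 32.0,
                 ("H2", "H3"): 28.0, ("H2", "H4"): 30.0}
    cluster_of = {"H1": "A", "H2": "A", "H3": "B", "H4": "B"}
    print(round(unweighted_disjoint_ratio(proximity, cluster_of), 3))  # 0.391
```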

Classification of three-dimensional structures of proteins

Richardson’s classification

α – α-helices are the only or dominant secondary-structure elements (e.g., ferritin, myoglobin)

β – β-sheets are the only or dominant elements (e.g., lipocalin)

α/β – contain strongly interacting helices and sheets

α+β – contain weakly interacting or separated helices and sheets

Structural Classification Of Proteins

This is a hierarchical classification scheme with the following 4 levels:

1. Families – a family comprises proteins related structurally, evolutionarily, and functionally.

2. Superfamilies – a superfamily comprises families substantially related by structure and function.

3. Folds – superfamilies with a common topology of the main portion of the chain.

4. Classes – groups of folds characterized by secondary structure: α (mainly α-helices), β (mainly β-sheets), α/β (α-helices and β-sheets strongly interacting), α+β (α-helices and β-sheets weakly interacting or not interacting), multidomain proteins (non-homologous proteins with very diverse folds).

SCOP classification

[ http://scop.mrc-lmb.cam.ac.uk/scop/ ]

SCOP Classification Statistics

SCOP: Structural Classification of Proteins, 1.75 release. 38221 PDB entries (23 Feb 2009). 110800 domains. 1 literature reference (excluding nucleic acids and theoretical models).

Class                               Folds  Superfamilies  Families
All alpha proteins                    284            507       871
All beta proteins                     174            354       742
Alpha and beta proteins (a/b)         147            244       803
Alpha and beta proteins (a+b)         376            552      1055
Multi-domain proteins                  66             66        89
Membrane and cell surface proteins     58            110       123
Small proteins                         90            129       219
Total                                1195           1962      3902
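As a quick consistency check on these statistics, the per-class counts can be summed and compared with the reported totals. The sketch below simply transcribes the numbers listed above; the variable names are illustrative.

```python
# Counts per SCOP class as (folds, superfamilies, families), release 1.75.
scop_counts = {
    "All alpha proteins": (284, 507, 871),
    "All beta proteins": (174, 354, 742),
    "Alpha and beta proteins (a/b)": (147, 244, 803),
    "Alpha and beta proteins (a+b)": (376, 552, 1055),
    "Multi-domain proteins": (66, 66, 89),
    "Membrane and cell surface proteins": (58, 110, 123),
    "Small proteins": (90, 129, 219),
}
reported_total = (1195, 1962, 3902)

# Sum each column across all classes.
computed_total = tuple(sum(column) for column in zip(*scop_counts.values()))

if __name__ == "__main__":
    print(computed_total)                    # (1195, 1962, 3902)
    print(computed_total == reported_total)  # True
```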

CATH classification (Class (C), Architecture (A), Topology (T), Homologous superfamily (H))

Four hierarchy levels:

1. Class (Level C) – according to the content of secondary structure: α, β, α-β (α/β and α+β), or weak or undefined secondary structure.

2. Architecture (Level A) – overall orientation of secondary structure elements, independent of their connectivity.

3. Topology (Level T) – based on fold type.

4. Homologous superfamilies (Level H) – high homology indicating a common ancestor:

- > 30% sequence identity, OR

- > 20% sequence identity and 60% structural homology, OR

- > 60% structural homology, with the similar domains having similar functions.
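The three alternative criteria for Level H can be written as a single boolean test. The sketch below is only a restatement of the list above, not CATH's actual assignment procedure. The function and parameter names are hypothetical, and reading "60% structural homology" in the second criterion as "at least 60%" is an assumption.

```python
def same_homologous_superfamily(seq_identity, structural_homology,
                                similar_function):
    """Apply the three alternative Level H criteria.

    seq_identity and structural_homology are percentages (0-100);
    similar_function is True when the domains have similar functions.
    """
    return (
        seq_identity > 30
        # Assumption: "60% structural homology" is read as "at least 60%".
        or (seq_identity > 20 and structural_homology >= 60)
        or (structural_homology > 60 and similar_function)
    )


if __name__ == "__main__":
    print(same_homologous_superfamily(35, 10, False))  # True  (first criterion)
    print(same_homologous_superfamily(25, 60, False))  # True  (second criterion)
    print(same_homologous_superfamily(10, 70, True))   # True  (third criterion)
    print(same_homologous_superfamily(10, 70, False))  # False
```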

• Class (C): derived from secondary structure content; assigned automatically.

• Architecture (A): describes the gross orientation of secondary structures, independent of connectivity.

• Topology (T): clusters structures according to their topological connections and numbers of secondary structures.

• Homologous superfamily (H)

[ http://www.biochem.ucl.ac.uk/bsm/cath_new/ ]

A "periodic table" of protein structures

W. Taylor, Nature, 416, 6881, 657-660 (2002)

"Menagerie" of known protein folds

α-helical structures

ROP: two packed helices

1rop - RASMOL

Antiparallel four-α-helix bundle

15° twist of helix axes

1fha - RASMOL

Example: ferritin

2mhr - RASMOL 1fha - RASMOL

The heme binding sites are located between the α-helices

G-protein coupled receptors: antiparallel 7-helix bundles

1bac, 1brd - RASMOL

Each helix is about 20 residues long

Bacteriorhodopsin: theoretical model

Photosynthetic reaction center

1prc - RASMOL

α-helical structures with the Greek key topology

• Domains of this type occur mainly in globin proteins.

• They are built of two layers of α-helices.

• The helix directions in the two layers are often nearly perpendicular (orthogonal helix packing).

• The domain somewhat resembles a cylinder formed by helices twisted by 0° to 45° relative to its axis.

• The most common connection pattern between the helices is the sequence +3, -1, -1, -1, found in the Greek key motif.

1mba - RASMOL

DNA-binding proteins

1hdd - RASMOL 1lmb - RASMOL

Large α-helical proteins

1SLYa - RASMOL

horseshoe

The jellyroll topology

Example of a protein with jellyroll topology: Carbohydrate-Binding Module Family 28 from Clostridium josui Cel5A (3ACI)

Example of a β-barrel (red fluorescent protein; 3NED)

Example of a β-propeller motif: thermostable PQQ-dependent soluble aldose sugar dehydrogenase (3DAS)

The β-helix

Quaternary structure is the result of subunit associations

Fig. 5-15

1RLZ (leucine zipper)

1G6U: artificial homodimer (three-helix bundle)

Homo-multimers

It is far more common to find copies of the same tertiary domain associating non-covalently. Such complexes are usually, though not always, symmetrical. Because proteins are inherently asymmetrical objects, the multimers almost always exhibit rotational symmetry about one or more axes. The majority of the enzymes of the metabolic pathways seem to aggregate in this way, forming dimers, trimers, tetramers, pentamers, hexamers, octamers, decamers, dodecamers (or even tetradecamers in the case of the chaperonin GroEL).

Hetero-multimers

In this case, different tertiary domains aggregate to form a unit. The photoreaction centre is a good example.

Sometimes several domains are found in a single enzyme complex, either in a single polypeptide chain or as an association of separate chains. Often the domains have related functions: for instance, one domain is responsible for binding, another for regulation, and a third for enzymatic activity. Cellobiohydrolase provides an example of such a protein. It is not uncommon to find the same chain more than once in a protein complex. A good example is the F-1 ATPase.

Two (further) steps in the biosynthetic pathway of tryptophan (in S. typhimurium) are catalysed by tryptophan synthase, which consists of two separate chains, designated α and β, each of which is effectively a distinct enzyme.

The biologically active unit is a hetero-tetramer composed of 2 α and 2 β units.

We sometimes find slightly different versions of the same protein associating. Thus, haemoglobin has both an A-chain and a B-chain, which come together to form a hetero-dimer. Two copies of this then associate to form the normal haemoglobin tetramer, which is equivalent to an A-dimer associating with a B-dimer.

Also, two different chains can associate to form a bigger secondary structure. This is the case in pea lectin, where a very large β-sheet is made of strands coming from different protein chains.

Amyloid fibril formed from the Abeta peptide
