CHAPTER 1 INTRODUCTION - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/33909/6/06_chapter1.pdf1.2 COMMON POST TRANSLATIONAL MODIFICATIONS – GLYCOSYLATION, PHOSPHORYLATION

CHAPTER 1

INTRODUCTION

1.1 POST TRANSLATIONAL MODIFICATION OF PROTEINS

The diversity in nature’s repertoire of proteins arises from

differences in the arrangement of amino acids. This diversity is enhanced by post

translational modifications to perform several biological functions. Post

translational modifications are covalent processing events that alter the

properties of a protein by proteolytic cleavage or by addition of modifying

groups to one or more amino acids. Such processing events modulate

biological processes by influencing protein activity, localization, turnover,

and interactions with other proteins (Mann and Jensen 2003). The proteome

diversification by covalent modification occurs in both prokaryotes and

eukaryotes; in the latter it is much more extensive in terms of types of

modifications and frequency of occurrence (Walsh et al 2005). The most

common types of covalent protein modifications include phosphorylation,

glycosylation, disulfide bond formation, acylation (such as ε-N-acetylation,

N-myristoylation, S-palmitoylation, mono- and polyubiquitylation) and

alkylation (such as N-methylation and S-prenylation) (Walsh et al 2005).

Apart from the above well-characterized and abundant covalent

modifications, there are many additional classes of enzymatic modification of

proteins that expand the metabolic and signaling capacities of organisms.

These include protein hydroxylation, sulfur transfer, ADP-ribosylation,

carboxylation, phosphopantetheinylation etc. (Walsh et al 2005, Yarbrough

and Orth 2009, Walsh and Jeffries 2006).

1.2 COMMON POST TRANSLATIONAL MODIFICATIONS –

GLYCOSYLATION, PHOSPHORYLATION AND LIPID

MODIFICATION OF PROTEINS

Of the several post-translational modifications, glycosylation is the

most common post-translational process in eukaryotes and accounts for 1-2%

of the proteins encoded by the human genome (Walsh and Jeffries, 2006).

Most of the cell surface and secreted proteins are glycoproteins (Ashford and

Platt 1999). In this type of modification, oligosaccharides are attached

co-translationally to specific asparagine (N-linked) or serine/threonine

(O-linked) residues; for N-linked glycosylation the consensus sequence

Asn-X-Ser/Thr is essential (where X can be any amino acid except proline),

whereas sites of O-glycosylation show no specific amino acid sequence

(Ashford and Platt 1999). The sugar moieties of glycoproteins affect both the

structural and functional properties of the protein, such as protein folding and

conformation, stability to denaturation, solubility and resistance to proteolysis

as well as key biological properties such as receptor binding, modulation of

enzyme activity and cellular recognition events (Walsh and Jeffries 2006).
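
The N-linked consensus described above (Asn-X-Ser/Thr, with X being any residue except proline) can be illustrated with a short pattern search. The following is a minimal sketch, not a validated predictor: the function name and the example peptide are hypothetical, and a sequence match only marks a candidate site, since actual glycosylation depends on cellular context.

```python
import re

# Asn-X-Ser/Thr with X != Pro. The lookahead makes the match zero-width,
# so overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sequons(seq):
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical peptide, used only to exercise the pattern.
print(find_n_glyc_sequons("MNGTANPSNNST"))
```

In this made-up peptide the Asn at position 6 is skipped because it is followed by proline. No comparable pattern is given for O-linked sites, since the text states that they show no specific sequence.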

Another important post translational modification is

phosphorylation of proteins, generally recognized as a fundamental

mechanism by which the intracellular events are modulated (Morandell et al

2006). The process is reversible, enabling the cells to respond to myriad

signals. In eukaryotes, phosphorylation usually occurs on Ser, Thr, and Tyr

residues, whereas in prokaryotes it occurs on the basic amino acid residues

His, Arg or Lys. The reversible phosphorylation in many enzymes and

receptors results in a conformational change, causing them to become either

activated or deactivated, and thereby controlling protein activity within the

cells. For example, caspases, the key degradative enzymes that function in the

apoptotic process, are activated upon phosphorylation.

Covalent attachment of lipids to proteins is an essential post

translational mechanism occurring in both eukaryotes and prokaryotes. It was

first demonstrated in studies of the Escherichia coli murein lipoprotein by

Braun and Rehn (1969). The discovery was soon followed by the identification

of fatty acids linked to viral glycoproteins, fungal mating factors and to

GTP-binding proteins (Baumann and Menon 1985). The eukaryotic lipid

modification of proteins attracted most of the attention and was intensely

studied (Yalovsky et al 1999). Eukaryotic lipidation involves the addition of

myristoyl, palmitoyl, diphytanyl or cholesterol moieties, conferring a wide range

of lipophilicity. These can be added at the amino terminus, the carboxy

terminus, or at internal residues via ester, thioester, thioether, or amide bonds,

or through mediating elements; activated intermediary carriers such as acyl carrier

protein also take part in lipid acylation (Walsh et al 2005). The following is a

brief account of our current understanding of protein lipidation.

1.2.1 Eukaryotic Lipid Modification

Unlike in prokaryotes, lipid modification in eukaryotes is diverse,

with 10-50% of all proteins possibly being modified by lipids belonging to

isoprenoids (15-carbon farnesyl or 20-carbon geranylgeranyl groups) or

saturated fatty acyl groups (palmitoyl, myristoyl) or

glycosylphosphatidylinositol (GPI) (Hooper and Jeffrey Mcilhinney 1999).

These lipids tether the soluble proteins to membranes and allow protein-

protein interactions and transduction of signals. Lipoproteins have also been

implicated in a variety of other cellular and extracellular events like

embryogenesis, pattern formation, protein trafficking through the secretory

pathway and evasion of the immune response by infectious parasites

(Yalovsky et al 1999). The different types of lipid modification of proteins

seen in eukaryotes are briefly described below.

1.2.1.1 Prenylation

Among all the lipid modification mechanisms in eukaryotes,

prenylation of proteins is extensively studied (Gelb et al 2006). Anchoring

proteins to cellular membranes aids several protein-protein interactions that

mediate signals for growth from cell surface receptors to nuclear transcription

factors (Yalovsky et al 1999, Gelb et al 2006). In protein prenylation, either a

farnesyl or a geranylgeranyl moiety is transferred to a C-terminal cysteine of

the target protein. The three enzymes that carry out prenylation are protein

farnesyltransferase (FTase), protein geranylgeranyltransferase type I

(GGTase-I) and protein geranylgeranyltransferase type II (GGTase-II), also

known as Rab GGTase (Zhang and Casey 1996, Hougland and Fierke 2009).

Protein prenyltransferases recognize the “CaaX” box at the

C-terminus, which is the signature sequence, and transfer a prenyl group from either

farnesyl pyrophosphate or geranylgeranyl pyrophosphate to the sulfhydryl group

of cysteine (Zhang and Casey 1996). Subsequently, the last three amino acids

(two aliphatic residues and the C-terminal residue) are removed by a

prenyl protein–specific endoprotease and the α-carboxyl group of prenylated cysteine is

methylated by a prenyl protein–specific methyltransferase. Farnesyltransferase

recognizes CaaX boxes where X is Met, Ser, Gln, Ala, or Cys,

whereas geranylgeranyltransferase-I recognizes CaaX boxes with X as Leu or

Phe and transfers geranylgeranyl groups to the cysteine. GGTase-II transfers

two geranylgeranyl groups, with each attached to separate cysteines, and in

these cases there is no C-terminal carboxyl methylation observed (Zhang and

Casey 1996).
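
The CaaX rule described above lends itself to a small classifier. The sketch below is illustrative only and rests on stated assumptions: the function name is hypothetical, the "aliphatic" set is taken as Ala, Val, Leu and Ile, the farnesyl X residues are those listed in the text, and only Leu is used as the geranylgeranyl X residue (the set can be widened through the parameter). GGTase-II substrates, which carry two modified cysteines, are outside its scope.

```python
def classify_caax(seq, aliphatic="AVLI", ftase_x="MSQAC", ggtase1_x="L"):
    """Classify a protein by its C-terminal CaaX box.

    Returns "farnesyl", "geranylgeranyl", or None when no CaaX box
    (as defined by the parameter sets) is found at the C-terminus.
    """
    tail = seq.upper()[-4:]
    if len(tail) < 4 or tail[0] != "C":
        return None  # no cysteine at position -4
    if any(aa not in aliphatic for aa in tail[1:3]):
        return None  # the two "a" positions must be aliphatic (assumed set)
    x = tail[3]
    if x in ftase_x:
        return "farnesyl"        # X residues listed for farnesyltransferase
    if x in ggtase1_x:
        return "geranylgeranyl"  # assumption: Leu only, by default
    return None


# Made-up C-terminal sequences, for illustration only.
for s in ("GSGSCVLS", "GSGSCLVL", "GSGSCKKS"):
    print(s, classify_caax(s))
```

Real prenyltransferase specificity is more permissive than this strict rule, so a result of None should be read as "not matched by the sketch" rather than "not prenylated".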

1.2.1.2 Myristoylation

In myristoylation, myristate, a relatively rare 14-carbon fatty acid, is

transferred cotranslationally from myristoyl-CoA to the amino group of the

N-terminal glycine residue of the target protein by the enzyme myristoyl-

CoA: protein N-myristoyltransferase (NMT) (Resh 1994). Myristate can be

attached to the N-terminal glycine in synthetic peptides. Myristoylated

proteins play a vital role in membrane targeting and signal transduction (Resh

1994).

1.2.1.3 Cholesterol Modification

Cholesterol modification is a C-terminal posttranslational modification of a family of

signaling proteins referred to as hedgehog (Hh) proteins found in insects,

vertebrates, and other multicellular organisms (Mann and Beachy 2000).

These are involved in the patterning of diverse tissues during development.

Addition of cholesterol to Hh proteins proceeds via an autoproteolytic internal

cleavage reaction at the -Gly-Cys-Phe- tripeptide motif, characteristic of Hh

precursors and attachment of cholesterol to the C-terminal Gly (Mann and

Beachy 2000).

1.2.1.4 Glycosylphosphatidylinositol (GPI) Modification

Biosynthesis of GPI-linked proteins occurs in the endoplasmic

reticulum and involves complex biosynthetic processes. GPI anchored

proteins are linked at their carboxy terminus through a phosphodiester linkage

of phosphoethanolamine to a trimannosyl-non-acetylated glucosamine

(Man3-GlcN) core. The reducing end of GlcN is linked to

phosphatidylinositol (PI). PI is then anchored through another phosphodiester

linkage to the cell membrane through its hydrophobic region (Low et al

1986). These glycolipid-modified proteins function as cell surface receptors,

cell adhesion molecules, cell surface hydrolases, complement regulatory

proteins, protozoal surface molecules etc. (Simons and Toomre 2000).

1.2.2 Prokaryotic Lipid Modification

In prokaryotes, lipid modification has been extensively studied

in bacteria, as is the case in this study. In order to face the challenge of

placing proteins for diverse functions at the membrane-aqueous interface,

bacteria have evolved a novel way of lipid modifying the N-terminus of such

proteins to anchor them to the membrane. These lipoproteins take part in sensory

signaling, adhesion, pathogenesis, conjugation, protein transport, support and

integrity of cells including growth, division and spore formation (Babu et al

2006).

In archaea, though lipid-modified proteins have been reported in

many species, the mechanism of lipid addition and the types of lipid groups

added are not clearly understood. Mass-spectrometry analysis of halocyanin

from Natronomonas pharaonis revealed the presence of N-acetyl

S-diphytanyl Cys as the N-terminal amino acid (Mattar et al 1994). Recently,

an iron-binding protein, a DsbA-like thioredoxin domain protein and a maltose

binding protein in Haloferax volcanii were demonstrated to be lipoproteins

(Gimenez et al 2007). However, no archaeal homologue of bacterial

lipoprotein biosynthetic enzymes has been identified.

This thesis deals with the bacterial type of lipid modification, owing to

its biological significance and its potential in several biotechnological

applications.

1.3 BACTERIAL LIPID MODIFICATION AND BACTERIAL

LIPOPROTEINS

In 1969, Braun and Rehn identified a part of the outer

membrane proteins that was insoluble when treated with alkali but soluble in a

chloroform-methanol (2:1) solvent system. This observation led to the classic

discovery of bacterial lipoproteins. The lipid covalently attached to the outer

membrane protein was later identified as a diacylglyceryl group. This moiety

was found attached via a thioether linkage to the sulfhydryl group of N-

terminal Cys and the α-amino group of diacylglyceryl modified-cysteine was

fatty acylated (Hantke and Braun 1973) (Figure 1.1). Although this chemical

structure of lipoproteins was in general found to be ubiquitous in bacteria, a few

variants were identified: in Borrelia burgdorferi an acetyl moiety replaces one

of the acyl moieties in the thioether-linked lipid group (Beermann et al 2000), and

in a few Gram-positives such as Bacillus spp., the amide-linked fatty acid is

missing (Tjalsma et al 1999). In the halophilic archaeon Natronomonas pharaonis,

a structurally similar N-acyl S-diphytanyl moiety is the

lipid moiety (Mattar et al 1994).

Figure 1.1 Structure of N-acyl S-diacylglyceryl Cysteine, the common

N-terminal modification among bacterial lipoproteins

1.3.1 Roles of Bacterial Lipoproteins

Currently, about 35,000 lipoproteins from about 750 bacteria have

been identified or predicted, and this number is bound to increase as more

bacterial genomes are sequenced. Roughly

50-100 lipoproteins occur in a bacterium (Babu et al 2006). The broad

functions performed by the lipoproteins at the membrane boundary of a cell can

be depicted as in Figure 1.2. Right from the attachment for colonization to

division and sporulation, the lipoproteins play crucial roles in the bacterial

viability and proliferation (Sutcliffe and Russell 1995). An account of such

roles is given below.

Figure 1.2 Bacterial lipoproteins carry out diverse functions at the

membrane-aqueous interfaces

1.3.1.1 Bacterial lipoproteins in structural integrity

Mutations in Lpp make the cells hypersensitive to various toxic

chemicals such as detergents and cause release of periplasmic proteins to the

extracellular medium, making the cells leaky (Suzuki et al 1978,

Yem and Wu 1978). Pal (peptidoglycan-associated lipoprotein) of

Gram-negative bacteria is essential for stability of the cell envelope (Cascales et al

2002). Mutations that prevented lipid modification of NlpI, a new lipoprotein in

E. coli (nlpI::cm), made E. coli osmotically sensitive and impaired

septum formation, thus making cells appear segmented (Ohara et al 1999).

Temperature-sensitive apolipoprotein N-acyltransferase [lnt(Ts)] mutants of

Salmonella were reported to be non-flagellate at 42°C. These mutants

defective in lipoprotein biosynthesis affected lipid modification of FlgH, the

L-ring subunit of the flagellar basal body (Dailey and Macnab 2002).

1.3.1.2 Bacterial lipoproteins as adhesins

Adherence to a surface is key to colonization or formation of

biofilm on a variety of surfaces. This is often carried out by the surface

lipoproteins that function as adhesins and mediate molecular cross-talk at the

cell-surface interface. A laminin-binding lipoprotein (Lmb) mediates

attachment of Streptococcus agalactiae to human laminin (Spellerberg et al

1999). SsaB, a 34.7-kDa lipoprotein of Streptococcus sanguis, is an adhesin that

interacts with a salivary receptor and is possibly involved in coaggregation with

Actinomyces naeslundii (Jenkinson 1992). CsgG is a lipoprotein involved in

the regulation of the formation of curli, adhesive surface fibres produced by

Escherichia coli and Salmonella for biofilm formation (Römling et al 1998).

1.3.1.3 Bacterial lipoproteins as binding/transport proteins

Vital functions like nutrient-uptake are essentially carried out at the

membranous surface. Such functions are also significant in niche-based

adaptation. Substrate-binding lipoproteins of ABC transport systems represent

~40% of the putative lipoproteins in Gram-positive bacteria (Hutchings et al

2009). For example, a 45-kDa substrate-binding lipoprotein of the

cyanobacterium, Synechococcus sp. strain PCC 7942 is crucial to the transport

of nitrate and nitrite (Maeda and Omata 1997). Oligopeptide pheromone

signals of Enterococci are generated from proteolytic processing of

lipoprotein signal peptides and taken up by lipoprotein-dependent

oligopeptide ABC permeases (Chandler and Dunny 2004). OppA lipoprotein

in Bacillus subtilis is reported to mediate peptide transport along with other

transporters (Perego et al 1991, Rudner et al 1991).

1.3.1.4 Bacterial lipoproteins in spore formation

Germination of B. subtilis spores normally begins with the binding

of specific nutrients by specific receptors, GerAC, GerBC, GerKC, GerD

(Moir and Smith 1990). Among these, GerD is a proven lipoprotein, which is

important in spores’ rapid response to nutrients, either by directly interacting

with nutrient receptors or performing some signal transduction essential for

germination (Pelczar and Setlow 2008).

1.3.1.5 Bacterial lipoproteins in signaling system

In E. coli, NlpE, an outer membrane lipoprotein, is essential to

mediate surface-induced activity of a two-component signal transduction

pathway that responds to stresses that affect cell envelope by activating genes

encoding periplasmic protein folding and degrading factors (Manson et al

1998). RcsF, a putative outer membrane lipoprotein, mediates the signaling to

the sensor RcsC, a component of His-Asp phospho-relay system in

γ-Proteobacteria that is involved in signaling outer membrane defects

(Castanié-Cornet et al 2006).

1.3.1.6 Bacterial lipoproteins in protein secretion, folding and localization

DsbA, a thiol-disulphide oxidoreductase in Staphylococcus sp., is a

lipoprotein involved in disulphide bond formation of secreted protein

substrates (Heras et al 2008). YidC, a membrane insertase, is a putative

lipoprotein in several Gram-positives that is involved in translocation of

protein substrates across the cytoplasmic membrane (Serek et al 2004). LolB, an outer

membrane-specific lipoprotein receptor, binds specifically to outer

membrane lipoproteins bound to LolA and is involved in the localization of such

lipoproteins (Tanaka et al 2001).

1.3.1.7 Bacterial lipoproteins in electron transfer processes

Cytochromes in Bacillus (c551) and Heliobacterium gestii (c553),

which are involved in electron transfer, are lipoproteins (Sutcliffe and

Harrington 2002). Cytochrome c oxidase subunit II (CtaC proteins) and QoxA

menaquinol oxidase have been shown to be lipoproteins (Antelmann et al

2001). The B. subtilis Sco1 (YpmQ) accessory protein involved in

cytochrome c oxidase assembly is also a lipoprotein (Andrews et al 2005).

A ResA homologue is a putative thioredoxin-like lipoprotein, which together

with the DsbD/CcdA family of electron-transfer proteins mediates reduction of

the apocytochrome c for insertion of the prosthetic heme group (Sutcliffe and

Hutchings 2007).

1.3.1.8 Bacterial lipoproteins in pathogenesis

Bacterial lipoproteins play a crucial role in host-pathogen

interactions, from surface adhesion for effective colonization to delivery of

virulence factors into the host cytoplasm. Outer surface lipoproteins of

B. burgdorferi possess cytokine stimulatory properties. One such protein,

OspA, is a potent stimulant of nuclear factor kappa B (Wooten et al 1996).

A surface-associated lipoprotein of Streptococcus pneumoniae, putative

proteinase maturation protein A (PpmA), is involved in colonization during

infection (Hermans et al 2006). Lipoproteins in Staphylococcus aureus

induced inflammation by TLR2 signaling in murine peritoneal macrophages.

Brucella abortus expresses the outer membrane lipoprotein Omp19 (L-Omp19), which

activates human neutrophils (Zwerdling et al 2009). MxiM, a lipoprotein of

the type III secretory pathway in Shigella flexneri is important for delivering

invasins into host cytoplasm (Schuch and Maurelli 1999). Surface

lipoproteins of Mycoplasma are expressed upon infection. One such

lipoprotein MAA1 of Mycoplasma arthritidis has been shown to be essential

for colonization of joint tissues in the early infectious stage (Washburn et al

2000). Recently, a mutant of Listeria monocytogenes lacking the signal peptidase

specific for lipoprotein biosynthesis was found to be ineffective in

phagosomal escape (Poupet et al 2003).

Antibiotic resistance-conferring β-lactamases in Bacillus cereus,

Bacillus sp. strain 170, Bacillus licheniformis, Mycobacterium sp. and

Staphylococcus aureus are reported as lipoproteins (Sutcliffe and Russell

1995).

1.3.2 Signal Sequence of Bacterial Lipoprotein Precursors

The first kind of covalent lipid modification was discovered in the

major outer membrane lipoprotein, Lpp; this stimulated further investigations

to elucidate its biosynthesis (Braun and Rehn 1969, Hantke and Braun 1973).

Inouye and co-workers (1977) identified a 20-amino acid extension at the

amino terminus of the lipoprotein when its purified mRNA was translated

in vitro. Its signal sequence was found to be similar to a typical signal peptide,

with an N-terminal positively charged region consisting of 5 amino acids

(n-region) followed by a hydrophobic segment of 9 amino acids (h-region)

and a cleavage region (c-region) (Inouye et al 1977).

The positively charged N-region is essential for secretion of

prolipoprotein. Removal of the basic amino acid residues or their substitution

with negatively charged residues hampered translocation and caused

cytoplasmic accumulation of prolipoproteins (Vlasuk et al 1983). Inouye et al

(1983) proved the essentiality of cysteine at the +1 position by replacing it

with glycine and showing accumulation of unmodified prolipoproteins. The

remarkable flexibility was shown by several mutational studies by others

(Inouye et al 1984). Replacing the Gly preceding Cys with bulkier residues

showed that Thr set the limit, with slow lipid modification, while Val or Leu resulted

in the accumulation of unmodified prolipoprotein.

With growing interest in this novel modification, many more

lipoproteins were identified and their precursors studied (Pollitt et al 1986).

The complete analysis of 25 signal sequences by Hayashi and Wu

(1990) and of 75 signal sequences subsequently by Braun and Wu

(1993) established the common tripartite nature of the signal

peptide with a characteristic consensus sequence in the cleavage region,

L[AS][GA]C instead of the Ala-X-Ala sequence which is commonly

identified in the cleavage region of normal secretory proteins. The newly

identified consensus sequence in prolipoproteins was termed as “lipobox” and

it serves as a signature for differentiating lipoproteins from non-lipoproteins

(Babu and Sankaran 2002). Currently, typical lipoprotein signal sequences are

identified with N-terminal region containing 5 to 7 amino acids with

minimum of one positively charged amino acid, but majority containing two,

the length of hydrophobic region varying between 7-22 uncharged amino

acids, the c-region has a consensus [LVI][ASTVI][GAS]C (Figure 1.3) and

the invariant lipid-modified Cys at +1 position. In the lipobox, Leu is

favoured at –3 position (81%), followed by Val (9%); the –2 position is more

flexible as uncharged, polar and non-polar residues occur [Ala (29%),

Ser (27%), Thr (13%), Val (10%) and Ile (8%)]; Gly (43%) and Ala (39%)

are preferred at the –1 position, and Ser, which defines the size limit for lipid

modification has been observed in 14% of the cases (Figure 1.4).
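
The signal-sequence features described above can be turned into a rough screening heuristic. The sketch below is a simplification under stated assumptions, not an established prediction tool: the function name, the 40-residue search window and the example sequences are hypothetical, and only two of the described features are checked (a positively charged residue within the first 7 residues, and a [LVI][ASTVI][GAS]C lipobox).

```python
import re

# Lipobox consensus from the text: [LVI][ASTVI][GAS]C, with Cys at +1.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox(precursor, window=40):
    """Return lipobox details for a precursor sequence, or None.

    window is an assumed upper bound on signal-peptide length.
    """
    seq = precursor.upper()
    if not any(aa in "KR" for aa in seq[:7]):
        return None  # n-region should carry at least one positive charge
    m = LIPOBOX.search(seq[:window])
    if m is None:
        return None
    cys = m.end() - 1  # 0-based index of the lipid-modified Cys (+1)
    return {"lipobox": m.group(), "cys_index": cys, "mature": seq[cys:]}


# Toy precursor sequence, invented for illustration.
print(find_lipobox("MKRTKLLAVAVLGSTLLAGCSNAKIDQ"))
```

The heuristic ignores the length and composition of the h-region, so it will over-predict; it is meant only to make the consensus concrete.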

Figure 1.3 Tripartite structure of lipoprotein signal sequence with

positively charged n-region, hydrophobic ‘h’ region and c-

terminal cleavage region

Figure 1.4 Frequency of amino acids in lipobox (Babu et al 2006)

1.3.3 Bacterial Lipid Modification - Pathway

The similarity in fatty acid composition of murein lipoprotein to

that of bulk phospholipids of E. coli pointed to the possible donors of fatty

acyl groups. Pulse-chase experiments confirmed this and showed that the

O- and N-acyl moieties were derived from phosphatidylglycerol and from any of the

phospholipids [phosphatidylethanolamine (PE), phosphatidylglycerol (PG)

and cardiolipin (CL)], respectively (Tokunaga et al 1982).

The discovery of globomycin, a fungal antibiotic, and its use in

the study of lipoprotein biosynthesis greatly advanced the

understanding of this pathway (Inukai et al 1978). The

accumulation of a number of lipid modified prolipoproteins of different sizes

in inner and outer membranes of E. coli suggested the existence of a common

biosynthetic pathway for lipoproteins (Hussain et al 1980, Ichihara et al

1981).

A comprehensive understanding of prolipoproteins prompted

in vitro studies using [35S]methionine-labeled unmodified prolipoprotein as a

substrate for lipid modification. Accordingly, based on in vitro and

in vivo studies, a common biosynthetic pathway for lipoproteins in E. coli was

postulated (Chattopadhyay and Wu 1977) in which diacylglyceryl

modification of the Cys residue in the lipobox of prolipoproteins precedes the

processing of the lipid-modified prolipoproteins by a specific endopeptidase

called prolipoprotein signal peptidase (Tokunaga et al 1983). This lipoprotein

signal sequence cleavage is preceded by two enzymatic reactions: attachment of a

non-acylated glycerol moiety to the cysteine by prolipoprotein

phosphatidylglycerol glyceryl transferase, followed by O-acylation of the

hydroxyl group of glycerol by phospholipid acyl transferase (Tokunaga et al

1982).

An in vitro peptide assay with the N-terminal 24 amino acids of

Braun’s prolipoprotein, designed by Sankaran and Wu (1994), experimentally

proved the transfer of diacylglyceryl moiety from phosphatidylglycerol to the

sulfhydryl group of cysteine residue with a concomitant formation of

sn-glycerol 1-phosphate. This new assay led to alteration in the proposed

biosynthetic pathway and accordingly the enzyme that catalyzes the first step

of lipid modification was named phosphatidylglycerol:prolipoprotein

diacylglyceryl transferase (Lgt). Lgt transfers a diacylglyceryl

moiety from phosphatidylglycerol (PG) to the invariant cysteine in the

lipobox of a prolipoprotein. The signal sequence in the diacylglyceryl

modified prolipoprotein is cleaved subsequently by the second enzyme,

lipoprotein signal peptidase (Lsp) to form apolipoprotein. The amino group in

apolipoprotein generated from signal peptidase action is fatty acylated by the

enzyme apolipoprotein N-acyltransferase (Lnt), resulting in N-acyl

S-diacylglyceryl-modified lipoprotein (Sankaran and Wu 1994) (Figure 1.5).

Figure 1.5 Bacterial lipoprotein biosynthetic pathway (Sankaran and

Wu 1994) showing the conversion of pre-protein into

lipoprotein sequentially catalyzed by three inner membrane

enzymes

1.3.3.1 Phosphatidylglycerol:prolipoprotein diacylglyceryl transferase

Phosphatidylglycerol:prolipoprotein diacylglyceryl transferase

(Lgt) catalyzes the first committed step of bacterial lipoprotein biosynthetic

pathway. The enzyme transfers the diacylglyceryl moiety of

phosphatidylglycerol to the thiol group of the invariant cysteine in the lipobox of

prolipoproteins with concomitant release of glycerol-1-phosphate (Sankaran

and Wu 1994). A temperature sensitive (ts) mutant of Salmonella

typhimurium accumulated unmodified prolipoprotein in the cytoplasm at 42°C

but not at 30°C. Sequencing of the complementing 1.4-kilobase DNA insert

from S. typhimurium revealed an ORF of 291 amino acids, which is

immediately 5’ to the thyA gene and allelic to umpA of E. coli (Gan et al

1993).

After the role of Lgt in bacterial lipoprotein biosynthesis was identified

by Sankaran and Wu (1994), much research was carried

out to understand its structure-function relationship. Analysis of the primary

sequences of Lgt from phylogenetically distant species, such as Escherichia

coli, Salmonella typhimurium, Staphylococcus aureus and Haemophilus

influenzae revealed a significant degree of homology and conservation with

about 24% identity and 47% similarity (Gan et al 1995). The alignment of

these Lgt sequences revealed a conserved region, 103-HGGLIG-108, indicating its possible

involvement in the active site (Qi et al 1995). The enzyme contained hydrophobic

segments interspersed with charged hydrophilic segments, rich in Arg among

Gram-negative organisms and in Arg and Lys among Gram-positives, and a pI value

of 10.4 was thus deduced. The enzyme was found to be inactivated by

diethylpyrocarbonate with a second-order rate constant of 18.6 M⁻¹ s⁻¹, and

this inactivation was reversible with hydroxylamine at pH 7, thus pointing

Page 18: CHAPTER 1 INTRODUCTION - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/33909/6/06_chapter1.pdf1.2 COMMON POST TRANSLATIONAL MODIFICATIONS – GLYCOSYLATION, PHOSPHORYLATION

18

towards the involvement of a single modifiable residue, His or Tyr in its

activity. Accordingly, site-directed mutagenesis studies indicated role of

His-103 and Tyr-235 was crucial for Lgt activity. Consequently, deletion or

modification of these residues inactivated the enzyme (Sankaran et al 1997).
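The reported rate constant can be translated into a time scale under pseudo-first-order conditions. The short Python sketch below assumes that diethylpyrocarbonate (DEPC) is in large excess over the enzyme and ignores hydrolysis of DEPC in buffer; the 1 mM reagent concentration is a hypothetical value chosen only for illustration and is not taken from the cited study.

```python
import math

K2 = 18.6  # second-order rate constant quoted in the text, in M^-1 s^-1


def residual_activity(depc_molar, seconds, k2=K2):
    """Fraction of Lgt activity left, assuming pseudo-first-order inactivation."""
    k_obs = k2 * depc_molar  # observed first-order rate constant, s^-1
    return math.exp(-k_obs * seconds)


def half_life(depc_molar, k2=K2):
    """Time in seconds for activity to fall to 50% at a fixed DEPC concentration."""
    return math.log(2) / (k2 * depc_molar)


if __name__ == "__main__":
    depc = 1e-3  # hypothetical DEPC concentration (1 mM)
    print(f"half-life at 1 mM DEPC: {half_life(depc):.1f} s")
    print(f"activity left after 60 s: {residual_activity(depc, 60):.2f}")
```

Because the observed rate constant is the product of the second-order constant and the reagent concentration, doubling the assumed DEPC concentration halves the predicted half-life.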

The role of lgt in bacterial growth and viability was established through mutational studies. Null mutations of lgt are lethal in Gram-negative bacteria such as E. coli and Salmonella (Qi et al 1995), whereas Gram-positive lgt mutants remain viable (Leskela et al 1999). This indispensability of Lgt in Gram-negative bacteria has precluded the study of virulence in lipoprotein-processing mutants of Gram-negative pathogens. In Gram-positive pathogens, however, studies of lgt mutants revealed that virulence is not attenuated in every case (Leskela et al 1999, Pettit et al 2001, Stoll et al 2005). Deletion of lgt in Listeria monocytogenes impaired intracellular growth in human epithelial (Caco-2) and mouse fibroblast (3T3) cell lines (Baumgärtner et al 2007). In contrast, lgt mutants of Streptococcus agalactiae (Bray et al 2009, Henneke et al 2006) and Staphylococcus aureus (Wardenburg et al 2006) showed hypervirulent phenotypes in mouse models of infection. Thus, in lgt mutants there may be a strain-specific balance between effects on immune activation and the functional impairment caused by the loss of lipoprotein lipidation.

In a global topology analysis of the Escherichia coli inner membrane proteome, Lgt was identified as a transmembrane protein (Daley et al 2005). However, a simple and precise radioactive assay indicated that Lgt is peripherally associated with the inner membrane (Selvan and Sankaran 2008).

1.3.3.2 Lipoprotein signal peptidase

Among the enzymes involved in lipoprotein biosynthesis, lipoprotein signal peptidase (Lsp) was the first to be identified and studied in detail (Dev and Ray 1984). Lsp, a specific endopeptidase, recognizes diacylglyceryl-modified prolipoprotein and cleaves the signal peptide to yield apolipoprotein (Sankaran and Wu 1995). The identification of the pentapeptide antibiotic globomycin, and of its ability to inhibit the processing of murein prolipoprotein to lipoprotein, is considered one of the significant contributions to the understanding of the lipoprotein biosynthetic pathway in bacteria (Inukai et al 1978). In globomycin-treated cells, translocation of Lpp to the outer membrane was arrested and its lipid-modified precursor accumulated in the inner membrane; the accumulated precursor contained covalently linked glyceride (Hussain et al 1980).

The involvement of a dedicated signal peptidase in the cleavage of the lipoprotein signal sequence was identified in 1982 by Tokunaga et al. Around the same time, diacylglyceryl modification of the prolipoprotein was shown to be a prerequisite for cleavage by the lipoprotein-specific signal peptidase. Because over-expression of lipoprotein signal peptidase increases globomycin resistance, a clone containing plasmid pLC3-13 was isolated and subcloned into pBR322 to generate plasmid pMT52 (Tokunaga et al 1983, Yamagata et al 1983). This plasmid was used to complement a temperature-sensitive lipoprotein signal peptidase mutant of E. coli, which enabled mapping of the lsp gene between 0.5 and 0.6 min on the E. coli genome (Regue et al 1984, Tokunaga et al 1985). The deduced amino acid sequence of Lsp contains 164 amino acids, with a molecular weight of 18 kDa. Lsp was deduced to be an integral membrane protein with four membrane-spanning segments connected by two periplasmic loops and one positively charged cytoplasmic loop (Munoa et al 1991). Lsp was also reported to be a novel aspartic protease (Sankaran and Wu 1995).

A biochemical assay for Lsp was developed by Dev and Ray in 1984. The substrate was [35S]-labeled diacylglyceryl-modified prolipoprotein prepared from globomycin-treated E. coli B cells. The assay also demonstrated that globomycin inhibits the prolipoprotein signal peptidase non-competitively, with a Ki of 36 nM (Dev et al 1985). It was recently reported that Lsp can cleave even unmodified prolipoprotein substrates in Listeria monocytogenes, indicating that the pathway may not always follow a strict sequence (Baumgärtner et al 2007). Likewise, a Streptococcus agalactiae lgt mutant showed cleavage of the ScaA lipoprotein precursor at the Lsp cleavage site, indicating activity towards unmodified substrates in some Gram-positive bacteria (Bray et al 2009). Lsp mutants of several Gram-positive pathogens have shown attenuated virulence (Zhao and Wu 1992, Mei et al 1997, Tjalsma et al 1999), and Lsp mutants of Mycobacterium tuberculosis showed reduced growth in macrophages cultured in vitro (Sander et al 2004). lsp mutants of Streptococcus agalactiae, Streptococcus equi and Streptococcus pneumoniae failed to activate immune responses via TLR2 (Henneke et al 2006).
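For orientation, the Ki reported above can be related to the fraction of Lsp activity that remains at a given globomycin concentration. The Python sketch below assumes pure non-competitive inhibition, in which the inhibitor lowers Vmax without changing Km, so that the remaining fraction is 1 / (1 + [I]/Ki) regardless of substrate concentration. The concentrations in the usage lines are arbitrary illustrations, not experimental values.

```python
KI_NM = 36.0  # Ki for globomycin quoted in the text, in nM


def fraction_active(inhibitor_nm, ki_nm=KI_NM):
    """Remaining activity under pure non-competitive inhibition: 1 / (1 + [I]/Ki)."""
    if inhibitor_nm < 0:
        raise ValueError("inhibitor concentration must be non-negative")
    return 1.0 / (1.0 + inhibitor_nm / ki_nm)


if __name__ == "__main__":
    for conc in (0, 36, 360):  # illustrative globomycin concentrations, in nM
        print(f"{conc:>4} nM globomycin -> {fraction_active(conc):.2f} of Vmax")
```

Under this model the activity is halved when the inhibitor concentration equals Ki, which is a convenient way to read the reported value.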

1.3.3.3 Phospholipid:apolipoprotein transacylase

Phospholipid:apolipoprotein transacylase (Lnt) catalyzes the transfer of an acyl moiety to the amino group of the apolipoprotein through an amide linkage, with concomitant release of lysophospholipid. The acyl donor for this reaction can be any phospholipid present in the inner membrane (Sankaran et al 2005).

This enzyme, which catalyzes the conversion of apolipoprotein to mature lipoprotein, was detected by an in vitro assay using [35S]methionine-labeled apolipoprotein as the substrate. The mature lipoprotein generated by enzymatic conversion of apolipoprotein was estimated by densitometric scanning of the autoradiogram (Gupta and Wu 1991). Further studies revealed that phosphatidylethanolamine is not essential for the N-acylation of apolipoprotein and the subsequent formation of lipoprotein; other major phospholipids, such as phosphatidylglycerol and cardiolipin, can also serve as donors of the fatty acid in the N-acylation of apolipoproteins (Gupta et al 1991).

Gupta et al isolated a temperature-sensitive mutant of Salmonella typhimurium, SE5312, which accumulated apolipoprotein at 42 °C. The mutant, defective in N-acyltransferase activity, was complemented by a gene allelic to cutE of E. coli (Gupta et al 1995). Mapping of this mutation placed the lnt gene in the 14-17 min region of the Salmonella typhimurium chromosome (Rogers et al 1991). The lethality caused by loss of Lnt activity was attributed to the retention of apo-Lpp in the cytoplasmic membrane, implying that Lnt activity is essential for proper localization of outer membrane lipoproteins.

Although biochemical analysis of Braun's lipoprotein expressed in Bacillus subtilis and of lipoprotein preparations from Staphylococcus aureus revealed N-acylation, BLAST searches did not identify homologues of Lnt in Gram-positive Firmicutes (Hayashi et al 1985, Navarre et al 1996). Streptomyces coelicolor does carry an Lnt homologue, but the gene (SCO1336) failed to complement an E. coli lnt depletion strain. Topology mapping of Lnt with β-galactosidase and alkaline phosphatase fusions indicated the presence of six membrane-spanning segments (Robichon et al 2004). The deduced amino acid sequence comprises 512 amino acids, with an estimated molecular mass of 56 kDa. The optimum pH was in the range of 6.5 to 7.4, and appreciable activity was reported up to 60 °C (Sankaran et al 1995). Lnt is classified as a member of the nitrilase superfamily, whose members share a Glu-Lys-Cys catalytic triad (Pace and Brenner 2001). Seven conserved residues of Lnt were identified as essential, and a structural model was predicted on this basis. The essential residues are E267, K335 and C387, which form the potential catalytic triad; Y388 and E389, which contribute to the hydrophobic pocket that contains the active site; and W237 and E343, which lie away from the active site and are expected to open and close upon the binding and release of phospholipid and/or apolipoprotein (Vidal-Ingigliardi et al 2007).

1.3.4 Translocation of Bacterial Lipoproteins Across Inner

Membrane

Secretory proteins are synthesized in the cytoplasm. To reach their destinations outside the cytoplasm, they must be recognized and targeted by a protein secretion system. The major route for protein transport across the cytoplasmic membrane is the ‘Sec’ machinery, which translocates secretory proteins in an unfolded state (Pugsley 1993). A more recently identified protein translocation system, the Twin Arginine Translocase (TAT) pathway, exclusively exports pre-folded or fast-folding secretory proteins (Berks 1996, Sargent et al 1998, Thomas et al 2001).

Bacterial prolipoproteins, which are synthesized in the cytoplasm, are known to be translocated via Sec (Sugai and Wu 1992). Because the enzymes for modification and processing reside in the inner membrane, the association between the Sec and lipoprotein biosynthetic machineries has been of interest, but it has not been adequately probed. It has been shown, however, that mutants impaired in secretion are also impaired in lipid modification (Sugai and Wu 1992). Although TAT has been implicated in the translocation of prolipoproteins, its role has not been adequately studied or understood (Lee et al 2006). A detailed account of the Sec and TAT pathways and their roles in the translocation of bacterial prolipoproteins is given below.

1.3.4.1 The common Sec pathway

The Sec pathway is the only protein translocation pathway known to be conserved in all three domains of life. It exports proteins in an unfolded state across the cytoplasmic membrane through a series of steps, either post-translationally or co-translationally (Mitra et al 2006). In bacteria, the Sec translocase is a stable heterotrimeric complex, SecYEG, comprising the three integral membrane proteins SecY, SecE and SecG. This complex associates with the auxiliary protein complex SecDFYajC and with YidC. SecA, a dimeric ATPase, is located on the cytoplasmic side of the SecYEG complex (Mitra et al 2006). SecB is an acidic homotetrameric chaperone, organized as a dimer of dimers, that wraps around the pre-protein and prevents its premature folding (Driessen 2001). The nascent polypeptide chain emerging from the ribosome is mostly routed to the Sec translocase in a SecB-dependent manner. In SecB-independent targeting, pre-proteins are delivered as ribosome-bound nascent chains (RNCs) by the signal recognition particle (SRP) (Mitra et al 2006).

The SecB-bound pre-protein facilitates an electrostatic interaction between SecB and SecYEG-associated SecA. This interaction allows transfer of the pre-protein from SecB to SecA upon binding of ATP at the interface of the two nucleotide-binding folds (NBF1 and NBF2) of SecA (Fekkes et al 1998). The energy from ATP hydrolysis, together with the proton motive force, drives translocation of pre-proteins through the SecYEG core (Mitra et al 2006). In co-translational translocation, SRP interacts with the pre-protein to form a ribosome nascent chain complex (RNC). The complex is targeted to the SRP receptor, FtsY, which in turn is bound to translocation-competent SecYEG. Upon interaction with the receptor, the RNC is transferred to the SecYEG core, a step that requires GTP hydrolysis (Driessen 2001). The precursor proteins reaching this core are thought to be pumped across the membrane barrier using the proton motive force (Driessen 2001, Mitra et al 2006) (Figure 1.6).

Figure 1.6 Schematic overview of Sec translocation (Keyzer et al 2003)

showing the association of nascent polypeptide of secretory

protein with ‘Sec’ complex

1.3.4.2 Discovery of Twin Arginine Translocase (TAT) pathway

The Twin Arginine Translocase (TAT) pathway was discovered only recently (1995), in plant thylakoids (Chaddock et al 1995, Berks 1996, Clark and Theg 1998). It functions in a radically different way from the Sec translocase. Translocation is independent of nucleoside triphosphate hydrolysis and depends solely on the proton gradient, and the pathway is therefore also referred to as the ∆pH pathway (Cline et al 1992, Alder and Theg 2003). Extensive studies of this export mechanism in thylakoids revealed that the signal peptides of the exported proteins contain a common and essential twin-arginine motif preceding the hydrophobic region (Chaddock et al 1995). Berks (1996) observed that certain bacterial periplasmic proteins contain a conserved twin-arginine motif at the n-h boundary, as in thylakoid pre-proteins, implying the existence of ∆pH-driven translocation in bacteria. One such protein, the cofactor-requiring enzyme trimethylamine N-oxide reductase (TMAO reductase), was found to fold only in the presence of molybdenum and then to be exported to the periplasm in a Sec-independent manner (Santini et al 1998). Around the same time, the first component of the ∆pH system was identified in maize. Its homologues were later identified in E. coli, where they are encoded by two distinct genes: one belongs to a four-gene operon and the other is unlinked (Sargent et al 1998). The products of these genes are required for the Sec-independent export of a range of proteins with a twin-arginine motif in their signal sequences (Bogsch et al 1997, Sargent et al 1998). The genes were named tatA, in the putative tatABCD operon, and tatE (Sargent et al 1998, Hicks et al 2003). Later, Hynds and coworkers (1998) reported that the ∆pH pathway can export tightly folded proteins in thylakoids.

Characteristic signal sequences of proteins translocated via the TAT pathway: Signal sequences that target proteins to the TAT machinery conform to the overall tripartite structure but have additional features that distinguish them from Sec signal peptides. The most striking is the consensus motif S/T-R-R-X-F-L at the n-h boundary, in which the consecutive Arg residues are almost invariant and X is any polar amino acid (Berks 1996) (Figure 1.7). Substitution of either arginine residue with lysine appears to block transport. Nevertheless, in rare cases such substitutions have been observed in putative TAT substrates. The -1 position is predominantly occupied by Ser, Thr, Gly, Asp or Asn, with Ser occurring in more than 50% of the known sequences (Lee et al 2006). Site-directed mutagenesis of conserved residues in the motif revealed that Phe and, to a lesser extent, Leu are important for TAT targeting (Stanley et al 2000). TAT signal sequences are longer, at 28-56 amino acids, than Sec signal sequences, at 18-26 amino acids (Berks 1996). The additional length is largely due to an extended n-region (Berks 1996). The h-region is less hydrophobic than that of Sec signal peptides, owing to a higher occurrence of Gly and Thr and a significantly lower abundance of Leu residues (Berks 1996, Cristobal et al 1999). The c-region is characterized by the presence of basic amino acids, a feature that is uncommon among Sec signal sequences. This feature, together with the degree of hydrophobicity of the h-region, acts as a “Sec-avoidance” signal (Berks 1996, Bogsch et al 1997).

Figure 1.7 Features of a typical TAT signal peptide from

E. coli, TorA highlighting the characteristic TAT-

recognition sequence between n and h regions, and the

cleavage region preceded with positively charged residues,

the ‘Sec’-avoidance signal (Lee et al 2006)

Components of TAT Pathway: In E. coli, four genes, tatA, tatB, tatC and tatE, were identified as encoding the integral membrane proteins that constitute the TAT components (Lee et al 2006). The tatA, tatB and tatC genes form an operon with a fourth, promoter-distal gene, tatD, whereas tatE is monocistronic (TatD, a soluble protein with DNase activity, was later found to have no role in the TAT pathway) (Sargent et al 1998, Wexler et al 2000). tatE is a cryptic gene duplication of tatA and codes for a functionally equivalent protein. In Gram-positives, unlike Gram-negatives, the tatB gene is missing; they have only homologues of the tatA and tatC genes (Berks et al 2003).

In E. coli, the minimum components required for TAT translocation are TatA, TatB and TatC (Tha4, Hcf106 and cpTatC in chloroplasts, respectively) (Behrendt et al 2004, Lee et al 2006). TatA and TatB are proteins of 9.6 kDa and 18.4 kDa, respectively, each with a hydrophobic transmembrane α-helix at the N-terminus followed by an amphipathic α-helix located on the cytoplasmic side of the membrane. TatC is a 28.9-kDa protein with six TM regions (Allen et al 2002). It is an essential component of the TAT system, as deletion of tatC completely abolishes TAT-dependent transport (Bogsch et al 1998, Allen et al 2002). Detergent-solubilized membranes of E. coli cells over-expressing TAT components yielded complexes of ~600 kDa that contain varying numbers of TatA (4 to 100; average 25) but a strict 1:1 stoichiometric ratio of TatB to TatC (de Leeuw et al 2002, Oates et al 2003).

Alami and coworkers (2003) used site-specific cross-linking to show that TatC interacts with the consensus -RR- motif, whereas TatB interacts with the entire length of the signal sequence, along the hydrophobic region and into the adjacent mature region. These studies thus revealed that TatC forms the primary recognition site of the TAT translocase. TatA was shown to associate transiently with TatBC only in the presence of a TAT substrate and a transmembrane proton gradient (Alami et al 2003). On binding to the TatBC complex at the time of translocation, TatA polymerizes to form a translocation channel of variable pore size that can accommodate folded substrates of up to 70 Å. To prevent ion leakage, the TatA protomers form a tight seal around the substrate and export it across the membrane in an iris-type fashion (Gohlke et al 2005) (Figure 1.8).

Figure 1.8 Schematic overview of TAT pathway showing Tat

components and translocation of pre-folded secretory

proteins (Lee et al 2006)

Substrates of TAT pathway: The TAT pathway transports substrates that must fold in the cytoplasm. The majority of the TAT-dependent proteins identified are cofactor-requiring proteins such as hydrogenases, formate dehydrogenases, nitrate reductases, trimethylamine N-oxide (TMAO) reductases and dimethyl sulfoxide (DMSO) reductases, all of which function in anaerobic respiration (Berks et al 2003, Lee et al 2006). These proteins acquire their cofactors and fold in the cytoplasm, which renders them Sec-incompatible (Santini et al 1998). Although some cofactor-binding sites, such as those for flavin adenine dinucleotide or copper, are also found in proteins exported through the Sec pathway, the preference shifts to TAT translocation for proteins with additional or more complex copper-binding sites (Stanley et al 2000, Berks et al 2003). Azurins, pseudoazurins, plastocyanins and rusticyanins contain Sec signals and single copper-binding sites, whereas homotrimeric copper nitrite reductases, in which each subunit contains two cupredoxin domains, are TAT-dependent (Berks et al 2003). Proteins with iron clusters are also common TAT substrates. The [Fe] hydrogenases and the [Ni-Fe] hydrogenases also use TAT for export (Dubini and Sargent 2003). Reductive dehalogenases from anaerobic bacteria require cobalamin as a cofactor in addition to a [4Fe-4S] cluster and are exported via the TAT pathway (Berks et al 2003).

Methylamine dehydrogenase is a periplasmic enzyme with tryptophan tryptophylquinone as its cofactor, formed by covalent linkage of the indole moieties of two tryptophan residues. The enzyme has two subunits, α and β. The α subunit has a Sec signal, whereas the β subunit carries a TAT signal, suggesting that the latter is exported via the TAT pathway (Berks et al 2003). However, in this case at least some post-translational modification and folding of this subunit is presumed to occur in the periplasm after the transport step. Not all TAT substrates are cofactor-containing proteins: in E. coli, SufI is a proven TAT substrate of unknown function. Amidase A and amidase C, enzymes involved in cell division, are translocated via the TAT system but are exported neither with cofactors nor as multimers in a “hitch-hiker” mechanism (Bernhardt and de Boer 2003, Ize et al 2003, Lee et al 2006).

Prediction of TAT substrates: Rose and coworkers (2002) developed TATFIND 1.1 to predict TAT substrates in Halobacterium NRC-1; the program finds the position and sequence of the TAT motif and also the length and hydrophobicity of the uncharged region that follows it. However, this version generated many false positives and was subsequently refined. The refined version, TATFIND 1.2, is more stringent and was trained mainly on putative haloarchaeal TAT substrates (Dilks et al 2003). This program predicts TAT substrates based on two criteria:

(i) The presence of an (X₋₁)R₀R₊₁(X₊₂)(X₊₃)(X₊₄) motif within the first 35 amino acids of the protein, where X represents a defined set of permitted residues,

(ii) The presence of an uncharged stretch of at least 13 amino acids downstream of the R₀R₊₁.
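The two criteria above can be sketched in a few lines of Python. This is a simplified illustration and not the published program: the permitted residue sets for the X positions are not given here, so they are exposed as an optional, hypothetical `permitted` argument that accepts any residue by default; treating Asp, Glu, Lys and Arg as the charged residues and scanning the whole downstream sequence for the uncharged stretch are likewise assumptions.

```python
CHARGED = set("DEKR")  # assumption: residues counted as charged


def meets_twin_arginine_criteria(seq, window=35, min_uncharged=13, permitted=None):
    """Return True if `seq` satisfies simplified versions of criteria (i) and (ii).

    `permitted` optionally maps a motif offset relative to R0 (-1, 2, 3 or 4)
    to the set of residues allowed there; offsets not listed accept anything.
    """
    permitted = permitted or {}
    seq = seq.upper()
    # seq[i] is R0 and seq[i + 1] is R+1; the motif spans i-1 .. i+4 and
    # must lie entirely within the first `window` residues.
    for i in range(1, min(window, len(seq)) - 4):
        if seq[i:i + 2] != "RR":
            continue
        if any(seq[i + off] not in allowed for off, allowed in permitted.items()):
            continue
        # Criterion (ii): look for a long enough uncharged run after R0R+1.
        run = 0
        for residue in seq[i + 2:]:
            run = 0 if residue in CHARGED else run + 1
            if run >= min_uncharged:
                return True
    return False


if __name__ == "__main__":
    toy = "MSSRRDFLKAAGLAALAGASLLPAVA"  # hypothetical signal peptide
    print(meets_twin_arginine_criteria(toy))
```

Keeping the residue sets as a parameter makes it possible to plug in the published sets without changing the scanning logic.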

Later, Bendtsen and coworkers (2005) developed a publicly available program, TatP, which, in contrast to TATFIND, combines pattern matching with machine learning and generates fewer false-positive predictions. The program searches potential TAT substrates for the regular expression RR.[FGAVML][LITMVF], where '.' means any amino acid. The expression was derived from an ungapped multiple sequence alignment of the positive training set in which the positions of the two arginines were kept fixed. The pattern was found in 97% of the sequences of the positive training set. The program also features neural networks for identifying the cleavage site and for determining the amino acid specificity for a TAT signal peptide.
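The pattern-matching step alone can be reproduced with Python's standard `re` module, as in the sketch below. It covers only the regular expression quoted above and not the neural networks of TatP; the function name and the test sequence are hypothetical and serve only as an illustration.

```python
import re

# Regular expression quoted in the text; '.' matches any residue.
TAT_PATTERN = re.compile(r"RR.[FGAVML][LITMVF]")


def find_tat_motifs(seq):
    """Return (1-based position of the first Arg, matched 5-mer) for each hit."""
    # A lookahead is used so that overlapping candidates are all reported.
    lookahead = re.compile(f"(?=({TAT_PATTERN.pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MSSRRDFLKAAGLAALAGA"  # hypothetical sequence, not a real protein
    print(find_tat_motifs(toy))
```

A hit from this expression is only a candidate: the position of the motif within the signal peptide and the properties of the downstream region still have to be assessed separately.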

1.3.4.3 Role of Sec and TAT pathways in translocation of

prolipoproteins

Although bacterial lipoproteins are in general known to be exported via the Sec translocase, the nature of the association between the export and modification machineries is not clear (Sugai and Wu 1992, Kamalakkannan et al 2004). The role of the Sec pathway in the secretion of prolipoproteins was first understood from studies of Lpp processing in mutants lacking the SecA and SecF components. In these mutants, the prolipoprotein was localized to the cytoplasmic membrane but was not modified with diacylglycerol. From these results it was concluded that the early steps in protein export are common to prolipoprotein and non-lipoprotein precursors (Watanabe et al 1988). Later, Sugai and Wu (1992) reported that temperature-sensitive mutants of secA, secD, secE and secY, but not secB, accumulated unmodified murein prolipoprotein in the cytoplasm, indicating that functional Sec components other than SecB are necessary for modification. These studies confirmed that non-lipoprotein and lipoprotein precursors are routed via a common protein transport system and diverge only in the modification and processing reactions, which are late events in the export process.

Experimental as well as bioinformatic analyses have revealed the importance of the TAT machinery in the translocation of various proteins, but little is known about its use in prolipoprotein export (Bolhuis 2002, Gimenez et al 2007). The presence of a lipobox sequence together with the TAT motif in protein sequences of Streptomyces coelicolor, Legionella pneumophila and Haloferax volcanii suggests the existence of TAT-dependent lipoproteins (De Buck et al 2004, Dilks et al 2005, Gimenez et al 2007). Using TAT mutants of Streptomyces coelicolor, the putative TAT lipoprotein substrates peptidylprolyl cis-trans isomerase, a putative sugar-binding protein, an iron-sulfur binding protein and a putative secretory protein were shown to be TAT-dependent (Widdick et al 2006). Site-directed mutagenesis of the TAT signals of the iron-binding protein, the DsbA-like thioredoxin domain protein and the maltose-binding protein of Haloferax volcanii resulted in their accumulation in the cytoplasm. Further, the lipoprotein signal peptidase inhibitor globomycin inhibited the maturation of these putative TAT substrates (Gimenez et al 2007). Curiously, the TAT substrate [NiFeSe] hydrogenase (HysAB) of Desulfovibrio vulgaris Hildenborough has the TAT box in the signal sequence of the small subunit and the lipobox in the N-terminal region of the large subunit. Mass-spectrometric data supporting lipid modification of this protein were reported recently (Valente et al 2007). Although these examples point to a possible involvement of the TAT pathway in lipid modification, no study has yet clearly investigated TAT-dependent lipid modification in bacteria. Moreover, bacterial lipid modification strategies have so far been studied only with Sec (Kamalakkannan et al 2004).

1.3.5 Sorting of Bacterial Lipoproteins

After translocation and modification at the inner membrane, lipoproteins are localized to either the inner or the outer membrane. The first seminal step towards understanding bacterial lipoprotein localization was made by Yamaguchi et al (1988). They demonstrated that beta-lactamase is localized to the outer membrane when fused to the signal peptide and the first 9 amino acid residues of mature Lpp. However, when this 9-residue sequence was replaced with the first 12 residues of lipoprotein-28, an inner membrane lipoprotein, the enzyme was found exclusively in the inner membrane. The fusion enzyme shifted to the outer membrane when the second amino acid residue (Asp) was replaced with Ser, suggesting a crucial role for the second amino acid in lipoprotein localization. Later, it was shown that the residue at position 3 also influences the Asp-dependent inner-membrane retention of lipoproteins (Gennity and Inouye 1991). To determine the role of individual amino acids in membrane sorting, especially at the +2 position, Seydel et al (1999) systematically substituted various amino acids at position 2 of an indicator protein, Lipo-MalE. Using this system, they reported that Asp, Glu, Phe, Gly, His, Lys, Asn, Pro, Arg and Trp at the +2 position function as inner membrane targeting signals, whereas Ala, Cys, Ile, Leu, Met, Gln, Ser, Thr and Val function as outer membrane targeting signals.
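The +2 assignments listed above amount to a simple lookup table, sketched below in Python using one-letter residue codes. Two points are assumptions rather than statements from the cited work: Gln (Q) is taken to be the outer-membrane residue listed between Met and Ser, and any residue that is not listed at all (for example Tyr) is returned as unassigned. The function name and the example sequences are hypothetical.

```python
# One-letter codes for the +2 residues listed in the text.
INNER = set("DEFGHKNPRW")
OUTER = set("ACILMQSTV")  # assumption: Q (Gln) is the residue between Met and Ser


def predicted_membrane(mature_seq):
    """Predict sorting from residue +2 of a mature lipoprotein (Cys is +1)."""
    seq = mature_seq.upper()
    if len(seq) < 2 or seq[0] != "C":
        raise ValueError("a mature lipoprotein must start with the lipidated Cys")
    residue = seq[1]
    if residue in INNER:
        return "inner"
    if residue in OUTER:
        return "outer"
    return "unassigned"  # residue not mentioned in the list, e.g. Tyr


if __name__ == "__main__":
    for toy in ("CSAA", "CDAA", "CYAA"):  # hypothetical mature N-termini
        print(toy, "->", predicted_membrane(toy))
```

Such a table captures only the +2 position; it does not account for the influence of position 3 or of other structural features on sorting.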

The mechanism underlying lipoprotein localization became clear with the discovery of the lipoprotein localization (Lol) factors. Five Lol proteins are involved in targeting lipoproteins to the outer membrane: a periplasmic chaperone, LolA; an outer membrane lipoprotein receptor, LolB; and an ATP-binding cassette (ABC) transporter, the LolCDE complex (Takeda et al 2003).

The LolCDE complex recognizes outer membrane lipoproteins and releases them from the inner membrane (Narita et al 2002). This recognition and sorting depends on the amino acids at positions 2 and 3 (Masuda et al 2002). Strong inner membrane retention, or “LolCDE avoidance”, occurs with Asp at position 2 and Asp, Glu, Gln or Asn at position 3 (Masuda et al 2002, Hara et al 2003). The LolCDE complex recognizes the three acyl moieties of lipoproteins; for avoidance, a negative charge within a certain distance of the Cα of the second residue is required. Electrostatic and steric complementarity between Asp at position 2 and phospholipids carrying a positive charge is responsible for the “LolCDE avoidance” mechanism (Hara et al 2003). The degree of avoidance decreases significantly with His, Lys, Cys, Ile, Ala or Thr at position 3. Such considerations led to the speculation that a tight lipoprotein-phosphatidylethanolamine complex with five acyl chains cannot be accommodated in LolCDE and that LolCDE is therefore avoided.
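The Asp-dependent rule described above can be written as a small classifier, sketched below in Python. It encodes only the residues named in this paragraph; the "intermediate" label for position-3 residues that are not mentioned, the function name and the example sequences are assumptions made for illustration.

```python
STRONG_AT_3 = set("DEQN")    # with Asp at +2: strong inner-membrane retention
WEAKER_AT_3 = set("HKCIAT")  # with Asp at +2: avoidance significantly reduced


def lolcde_avoidance(mature_seq):
    """Classify the +2/+3 signal of a mature lipoprotein (lipidated Cys is +1)."""
    seq = mature_seq.upper()
    if len(seq) < 3 or seq[0] != "C":
        raise ValueError("expected a mature sequence starting with Cys")
    pos2, pos3 = seq[1], seq[2]
    if pos2 != "D":
        return "none"  # no Asp at +2: the Asp-dependent rule does not apply
    if pos3 in STRONG_AT_3:
        return "strong"
    if pos3 in WEAKER_AT_3:
        return "weak"
    return "intermediate"  # assumption: position-3 residues not named in the text


if __name__ == "__main__":
    for toy in ("CDDS", "CDKS", "CSSN"):  # hypothetical mature N-termini
        print(toy, "->", lolcde_avoidance(toy))
```

Separating the position-2 check from the position-3 sets mirrors the way the rule is stated: Asp at +2 is necessary for this form of avoidance, and the residue at +3 modulates its strength.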

The mechanism of interaction among the Lol factors to localize

lipoproteins to outer membrane was recently understood. The hydrophobic

cavity of LolA and perhaps even that of LolB, undergoes opening and closing

upon the binding and release of lipoproteins, respectively (Takeda et al 2003).

The strength of the hydrophobic interaction of these factors with lipoproteins

was found to be critical for efficient vectorial transfer of lipoproteins from

LolA to LolB (Takeda et al 2003). The hydrophobic cavity of LolA opens

upon binding to the target lipoprotein and aligns with that of LolB at a

minimal distance facilitating the transfer of lipoprotein from LolA to LolB.

The lipoprotein transfer from LolA to LolB occurs in a mouth-to-mouth

manner. LolB flips through its N-terminal region, and allows the target

lipoprotein to anchor at the inner side of the outer membrane via its three acyl chains

(Okuda and Tokuda 2009) (Figure 1.9).

Figure 1.9 Sorting of bacterial lipoproteins by lipoprotein localization

(Lol) system (Okuda and Tokuda 2009)

Recently, lipid modification of a periplasmic enzyme, apyrase, carrying the

outer membrane targeting signal “Ser” at the +2 position resulted in its inner

membrane localization (Kamalakkannan et al 2004). Further investigation of this

observation pointed to the involvement of additional factors, such as amphipathic

β-structures, in outer membrane targeting of lipoproteins (Kamalakkannan

2005). These structures are characteristic features of Gram-negative outer

membrane proteins like OmpA, OmpC, OmpF, LamB and PhoE and span the

outer membrane with alternating charged, polar and hydrophobic residues.

This structure ensures that they are not retained in the inner membrane and

keeps the proteins soluble during their transport through the periplasmic space

(Pugsley 1993, Terada et al 2001, Narita et al 2004). In agreement with these

findings, a bioinformatics study from our lab showed that, of 81 outer

membrane lipoproteins analyzed, 62% possessed amphipathic

β-structure. Among 84 inner membrane lipoproteins, only 32%

possessed amphipathic β-structure, suggesting that both amphipathic β-structure

and the amino acids adjacent to the lipid modification site in the mature sequence

would dictate lipoprotein targeting (Kamalakkannan 2005). In outer

membrane lipoproteins that lacked amphipathic β-structure, as in the case of

apyrase, about 52% contained “Gln” at position 2, followed by “Ser”

(14%) and “Ala” (10%) (Figure 1.10). This study pointed out that “Gln” at the

+2 position could possibly serve as an outer membrane targeting signal of

lipoproteins without amphipathic β-structure in the mature sequence.

Figure 1.10 Frequency of Amino Acids (%) at +2 position in outer

membrane lipoproteins without amphipathic β-structures

(Kamalakkannan 2005)

1.4 BACTERIAL LIPID MODIFICATION AS A POTENTIAL

PROTEIN ENGINEERING TOOL

Bacterial lipid modification is important for its biological effects, and

its potential for several practical applications is gradually being realized. The

N-terminal lipid moiety of bacterial lipoproteins imparts hydrophobicity

without affecting the protein function. This property is useful in several

biotechnological applications, such as ELISA, biosensors and targeted drug

delivery systems, where proteins are required to bind hydrophobic surfaces.

The enhanced antigenicity of lipoproteins aids in developing better vaccine

candidates against several diseases. In fact, conversion of a peptide into a

lipopeptide is now a widely adopted strategy for achieving superior antigenicity.

Bacterial lipoproteins are also immensely useful in surface-display of proteins

that are beneficial in bioremediation, whole-cell vaccines and in developing

combinatorial libraries (Kamalakkannan et al 2005). The potential

applications of bacterial lipid modification are detailed below (Figure 1.11).

Figure 1.11 Many applications of bacterial lipoproteins, from enhancing

ELISA sensitivity to cell-surface display (Kamalakkannan

et al 2005)

1.4.1 Enhanced Binding of Lipid-Modified Proteins on Hydrophobic

Surfaces

The specific immobilization of proteins upon surfaces has the

potential to revolutionize both the study of their natural properties and their

utilization in novel, self-assembling nanostructures (Terrettaz et al 2002).

Patterned proteins have potential applications in molecular biosensors and

protein arrays and such immobilized protein devices have tremendous

applications in diagnostics and environmental sensing (Bertone and Snyder

2005). Typical formats use glass supports, polystyrene or latex surfaces for

immobilizing proteins, which in general are hydrophilic and as a result bind

poorly to these hydrophobic surfaces (Bertone and Snyder 2005). Nonspecific

adsorption of proteins to a solid support and simple chemical coupling are the

most popular methods. The latter includes noncovalent adsorption to poly-L-

lysine, polyvinylidene difluoride (PVDF) and nitrocellulose, cross-linking via

aldehyde or epoxy (MacBeath and Schreiber 2000) to surface lysine residues,

histidine tag (Klenkar et al 2006), avidin (Delehanty and Ligler 2002), or

glutathione-S-transferase (GST) (Jung et al 2005) based immobilization using

fusion proteins. Conjugating reactive groups such as an imidothioester carrying a

hydrophobic moiety forms a hydrophobic amidine derivative of the protein

and allows protein binding. Coupling fatty acyl groups to the exposed sulfhydryl

and amino groups of the target protein using bifunctional reagents has been

reported to facilitate effective binding of target proteins to hydrophobic

surfaces (Chaffey et al 2008).

However, these methods have serious drawbacks, such as the

requirement for high protein concentrations, ineffective adsorption

and denaturation of the protein upon binding. To overcome these drawbacks and to

generate bioactive surfaces, a method to attach proteins via a lipid anchor

synthesized post translationally was patented recently (Anderson and Mauro

2004). The hydrophobic affinity of lipoproteins and their self-assembly

properties are being exploited for generating self-assembled monolayers that

have potential advantages in sensor instrumentation and nanobiotechnology

(Reichel et al 1999). In fact, work by several groups has demonstrated the

wide potential of self-assembling monolayers (SAM) of immobilized

amphiphiles incorporating small peptides (Zhang et al 1999, Miura et al 2000,

Huang et al 2003). However, a significant obstacle to the further development

of such technologies is the lack of methods that enable the anchoring of large

soluble protein molecules to these surfaces. Currently the best available

method involves fusing the protein of interest to a membrane protein scaffold

which self-assembles with SAM (Terrettaz et al 2002, Shah et al 2007).

Bacterial lipoproteins appear promising for generating such patterned arrays.

1.4.2 Efficient Liposomal Integration of Lipid-Modified Forms for

Use in Targeted Drug Delivery

Lipid modification to immobilize proteins onto hydrophobic

surfaces was demonstrated by Laukkanen et al (1993) with an Lpp-scFv fusion.

The fusion antibody was reported to be incorporated into proteoliposomes,

displaying specific hapten-binding activity and retaining its antigen-binding

property. Immunoliposomes based on such a lipid-tagged, single-chain antibody Fv fragment (scFv)

against the human transferrin receptor showed

promising efficacy for systemic p53 tumor suppressor gene therapy in a

human breast cancer metastasis model (Xu et al 2002). An IgG-binding

protein, the β-domain of protein-A from Staphylococcus aureus, was modified

through bacterial lipid modification by fusing it with the Lpp signal and 9

N-terminal amino acids of the mature sequence, in order to use a single protein-A

bound immunoliposome preparation against a variety of antigens (Shigematsu et al

1999). The lipid-tagged protein and its soluble counterpart did not show

any significant difference in activity and specificity. However, the lipid-modified

protein showed more stable integration with liposomes than its soluble form. The

poor transportability of hydrophilic proteins across biological membranes

is improved by acylation of protein molecules, as acyl moieties show high

membrane affinity and low toxicity. For example, acylated RNase A was

reported to cross the blood-brain barrier (Chopineau et al 1998) and

palmitoylated chicken cystatin was rapidly internalized into the cell and

caused a complete loss of cathepsin B activity (Kočevar et al 2007). The

influence of lipidation, though not through bacterial lipid modification, on the

translocation of relatively long peptides comprising ligands for different

cytoplasmic pharmacological targets was also demonstrated (Thiam et al

1999). Protein kinase C-α, ε and ζ pseudosubstrates (Eichholtz et al 1993) and

a 5 kDa peptide derived from murine IFNγ (95-132) (Thiam et al 1999)

permeated across the cytoplasmic membrane consequent to lipid

modification. N-terminal monoacylated RNase A prepared using reversed

micelles as microreactors crossed an in vitro model of the blood-brain barrier

(Chopineau et al 1998).

1.4.3 Adjuvant Property of Lipid Moiety of Bacterial Lipoproteins

Aids in Developing Efficacious Vaccines

The triacyl chains of N-acyl-S-diacylglyceryl cysteine, a feature

ubiquitous in bacteria, are responsible for the immunogenicity of bacterial

lipoproteins. The outer membrane lipoprotein OspA of Borrelia burgdorferi

has recently been licensed in the US as a

vaccine against Lyme disease. Animals immunized with the full-length OspA

(lipidated form) were shown to be protected against B. burgdorferi challenge

(Chang et al 1995). A Mannheimia haemolytica chimeric protein vaccine

composed of the major surface-exposed epitope of the outer membrane

lipoprotein PlpE and the neutralizing epitope of leukotoxin has also been described (Ayalew et al

2008). Enhanced protection against bovine tuberculosis was possible on

administering a vaccine consisting of BCG and culture filtrate proteins (CFP)

combined with an adjuvant formulation that included a lipopeptide,

Pam3Cys-SKKKK (Pam3CSK4), a synthetic triacylated lipopeptide that exerts its

adjuvant activity through TLR2. The combination induced significant levels of

protection against challenge with a virulent strain of Mycobacterium bovis that

were superior to those obtained with BCG alone (Wedlock et al 2008).

A dipalmitoylated lipopeptide containing the pp65 495–503 CTL epitope

from the human cytomegalovirus (HCMV) immunodominant matrix protein

pp65 covalently linked to a universal T helper epitope induced systemic CTL

responses in HLA-A transgenic mice. This study effectively demonstrated the

use of lipopeptides as potent adjuvants for mucosal immunization (Ben

Mohamed et al 2002a). Recently, domain III of the dengue virus envelope

protein (E3) was fused with the 40 N-terminal residues of Ag473,

a lipoprotein from Neisseria meningitidis. The resulting lipo-immunogen

(rlipo-D1E3) was expressed in high yield as a lipoprotein and was found to

elicit stronger anti-E3 and virus neutralizing antibody responses in animal

studies (Chen et al 2009). Lipidated cytotoxic T-lymphocyte epitopes of

proteins derived from viruses such as HIV, HCV/HBV and influenza, were

reported as potent vaccine candidates for diseases such as AIDS, hepatitis and

influenza, respectively (Ben Mohamed et al 2002b).

1.4.4 Surface-Display as Bacterial Lipoproteins for Bioremediation,

Vaccine Development and Other Biotechnological Applications

Bacterial lipoproteins are targeted either to the outer leaflet of the inner

membrane or to the inner or outer leaflet of the outer membrane. Among these,

surface-display of lipoproteins has powerful applications. Francisco et al

(1992) developed a tripartite fusion protein consisting of the signal sequence

and the first 9 amino acids of Lpp, residues 46 to 159 of the outer

membrane porin OmpA, and the entire mature sequence of β-lactamase. This fusion

displayed β-lactamase on the outer surface, indicating that specific signals could

aid in surface display of proteins. This was soon followed by surface display

of a variety of proteins such as bacterial endoglucanase, a cellulose-binding

domain, and scFv (single chain fragment variable) antibodies (Earhart 2000).

Phytochelatins (40 aa) for adsorption (bioaccumulation), PE DIII antigen,

the extracellular domain of human ErbB2 and IL2-Ra (237 aa) for selection of

phage antibodies have been displayed on the surface of E. coli using the

Lpp-OmpA fusion tags. A peptidoglycan-associated lipoprotein (PAL) fused

to an antibody fragment (scFv) specific to the herbicide and environmental

pollutant atrazine, has been successfully targeted to the cell surface of

Escherichia coli (Wu et al 2008). Organophosphorus hydrolase (OPH) was

fused to the Lpp-OmpA fusion system and expressed on the surface of E. coli.

The surface-exposed OPH could degrade parathion and paraoxon at seven-fold

higher rates than intracellular OPH (Chen and Georgiou 2002). Synthetic

genes encoding several metal-chelating phytochelatin analogs were

linked to the Lpp-OmpA fusion gene and displayed on the surface of

E. coli, where they increased the accumulation of cadmium, which has adverse

effects on the environment (Bae et al 2000). Highly efficient selection of

phage antibodies was mediated by display of antigens fused to

Lpp–OmpA on live bacteria (Benhar et al 2000). Specific adhesion of

whole cells to cellulosic materials with high affinity has been demonstrated

by anchoring the cellulose-binding domain (CBD) from Cellulomonas fimi on

the surface of E. coli using Lpp-OmpA fusions (Chen and Georgiou 2002).

1.4.5 Strategies for Protein Engineering using Bacterial Lipid

Modification

Generic vector systems for expression of lipoproteins in

Gram-negative organisms have been developed to explore the potential

applications of lipid modification. In this regard, an oprI-based generic vector

system was first developed for the expression of lipoproteins in the outer

membrane (Cornelis et al 1996). Later, the lacI gene coding for the LacI repressor

was introduced to repress leaky expression (Cote Sierra et al 1998), and

appreciable quantities of target protein could be achieved.

Several fusion strategies to lipid-modify proteins have

been demonstrated and exploited for several potential applications, as

described above. However, these strategies involved large fusions and therefore could

impede proper protein folding, and the added fusions may give rise to

unwanted immune responses. Also, the structure-function integrity of

functional proteins like enzymes was not comprehensively investigated. In

this regard, recently, Kamalakkannan et al (2004) successfully demonstrated

lipid modification of a heterologous non-lipoprotein, apyrase, a periplasmic

ATP diphosphohydrolase from Shigella flexneri. The engineering of apyrase

to lipid-modified forms did not affect either its specific activity or its kinetics.

The lipid modification was demonstrated using two different strategies: the

first replaced the apyrase signal sequence with the signal sequence of Lpp and

one amino acid of mature Lpp, and the second replaced the

c-region of the signal peptidase I-specific signal of apyrase with a lipobox

sequence. Surprisingly, the lipid-modified apyrase from the first strategy was

found localized to the inner membrane and not the outer membrane as expected.

In addition to in vivo methods to modify proteins with lipids,

in vitro lipid modification was also demonstrated. A synthetic peptide

corresponding to signal peptide and the first three amino acids of Lpp was

modified with diacylglycerol derived from radiolabeled phosphatidylglycerol

(Sankaran and Wu 1994). More recently, a prototype bioreactor for lipid

modification using immobilized Lgt enzyme was developed. This reactor

could convert 65% of the synthetic peptide substrate into lipopeptide in 7 h.

This in vitro lipid modification can be exploited for potential applications

such as production of lipopeptides for prophylactics or self-assembled

monolayers in sensor-based applications (Selvan 2008).

1.4.6 Bacterial Lipoprotein Databases and Prediction Tools

Owing to their importance in protein engineering and metabolic

engineering applications, predictive rules to identify such target lipoproteins

were established. The characteristic consensus lipobox found in bacterial

lipoprotein signal sequence was used for lipoprotein predictions. The

consensus (LVI)(ASTG)(GA)|C, requiring only one match to the first two

positions, was able to differentiate lipoprotein signal peptides from SPaseI-cleaved

signal peptides (von Heijne 1989), and this was employed to predict

lipoproteins in PSORT. Another lipoprotein prediction algorithm, the Prosite

pattern PS00013, identifies

{DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C, where {DERK}(6) does not allow the four

amino acids in the first six positions (position −10 to −5 relative to the

cleavage site) and cysteine must be between position 15 and 35, with at least

one lysine or arginine in one of the first seven positions of the signal peptide

(Falquet et al 2002). A finer expression, G+Lpp,

[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C, with

minimal false positives was derived for Gram-positive bacteria using about 33

experimentally verified lipoproteins from Gram-positive bacteria. Juncker and

coworkers established a method based on a Hidden Markov Model (HMM),

trained on SPaseI-cleaved proteins, lipoproteins, and cytoplasmic and

transmembrane proteins. The method could classify a sequence as a lipoprotein signal

peptide, a SPaseI-cleaved signal peptide, or a protein without a signal

sequence (cytoplasmic or transmembrane) with very low error rates. The

HMM is also able to identify the cleavage sites in both SPaseI- and

SPaseII-specific signal peptides (Juncker et al 2003). With the identification of several

bacterial lipoproteins the knowledge about bacterial lipid modification and

lipoproteins has been compiled into an exclusive database, DOLOP (Babu

and Sankaran 2002, Babu et al 2006). The database hosts a list of identified

lipoproteins from 234 completely sequenced genomes, classified into eight

groups: structural proteins, binding proteins, transporters, adhesins,

toxins, antigens, enzymes and interesting factors. Recently, the database was

updated with lipoproteins from currently sequenced genomes, and their

superfamily assignments were also provided. The knowledgebase also offers

lipoprotein prediction, primary sequence analysis, signal sequence analysis,

a search facility and information exchange. The lipoprotein predictive rules

are as follows:

(i) The sequence should start with Met, followed by one or more positively charged

residues (Lys or Arg) within the first five to seven residues.

(ii) The h-region should contain 7 to 22 residues.

(iii) The consensus sequence [LVI][ASTVI][GAS][C] should

occur within the first 40 residues from the N-terminal end.
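The three rules listed above can be sketched as a short Python function. This is a minimal illustration and not the DOLOP implementation: the function name `predict_lipobox_cys` and its parameters are hypothetical, and the rules do not state exactly where the h-region begins and ends, so the sketch assumes that it runs from the residue after the last Lys or Arg in the N-terminal window up to the residue before the lipobox. The example sequences in the usage note are illustrative inputs and should be checked against a database entry before being relied upon.

```python
import re

# Rule (iii): lipobox consensus [LVI][ASTVI][GAS][C].
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def predict_lipobox_cys(precursor, n_window=7, h_min=7, h_max=22, limit=40):
    """Return the 1-based position of the candidate lipid-modified Cys, or None.

    n_window, h_min, h_max and limit mirror the numbers quoted in rules
    (i) to (iii); the h-region boundaries used here are an assumption.
    """
    seq = precursor.upper()
    # Rule (i): Met at position 1 and at least one Lys/Arg among the
    # following residues of the first n_window positions.
    if not seq.startswith("M"):
        return None
    positives = [i for i in range(1, min(n_window, len(seq))) if seq[i] in "KR"]
    if not positives:
        return None
    # Assumption: the h-region starts right after the last Lys/Arg found above.
    h_start = positives[-1] + 1
    # Rule (iii): the whole lipobox must lie within the first `limit` residues.
    for match in LIPOBOX.finditer(seq[:limit]):
        # Assumption: the h-region ends just before the lipobox.
        h_len = match.start() - h_start
        # Rule (ii): the h-region should contain h_min to h_max residues.
        if h_min <= h_len <= h_max:
            return match.end()  # 0-based end index == 1-based Cys position
    return None


if __name__ == "__main__":
    # N-terminus of the E. coli Lpp precursor as commonly quoted (illustrative).
    print(predict_lipobox_cys("MKATKLVLGAVILGSTLLAGCSSNAKIDQ"))
    # A signal-peptide-like sequence without any Cys (illustrative).
    print(predict_lipobox_cys("MKKTAIAIAVALAGFATVAQAAPKD"))
```

Returning the Cys position instead of a plain yes or no keeps the cleavage site available, since the mature lipoprotein begins at that Cys.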

1.5 OVERVIEW OF THE THESIS

Though lipoprotein biosynthesis is a vital post translational

mechanism in bacteria and has potential applications in biotechnology,

important aspects of this unique mechanism are not clearly understood,

especially lipoprotein translocation and its targeting to either of the

membranes.

Translocation of prolipoproteins is, at best, understood to

involve Sec translocation, but the mechanistic association between the

translocation and lipoprotein biosynthetic machineries is not known

(Sugai and Wu 1992, Kamalakkannan et al 2004). Recently, certain proteins

were found to fold rapidly, could not be routed via Sec and required the TAT

pathway (Thomas et al 2001). However, the fate of such fast-folding

lipoproteins is not known. Hence, understanding the role of this new pathway

in lipoprotein biosynthesis would provide useful knowledge on the principles

of bacterial lipid modification and to its utility as a protein engineering tool.

In this regard, Enhanced Green Fluorescent Protein (EGFP) was chosen as a

convenient model protein for the study.

The Green Fluorescent Protein, GFP, from the jellyfish Aequorea,

is a well-known biomarker used in monitoring genetic alterations. The

protein’s fluorescence requires no cofactor other than the fluorophore

resulting from cyclization and oxidation of the -Ser65-Tyr66-Gly67- sequence to

form a 4-(p-hydroxybenzylidene)-imidazolidin-5-one structure. Enhanced

fluorescence relative to wild-type GFP (EGFP) was achieved by substituting a Leu

residue for the Phe residue at position 64 (Yang et al 1996, Heim et al 1994).

The eleven strands of β-sheet in GFP form an antiparallel barrel with short

helices forming lids on both the ends. The fluorophore is inside the can, as

part of the distorted α-helix, which runs along the axis of the cylinder. The

spontaneous oxidation of the fluorophore-forming amino acids (see above)

around a tightly folded barrel justifies its fast-folding kinetics, which prevents

its translocation via the Sec pathway (Thomas et al 2001).

Another aspect of lipoprotein maturation that requires better

understanding is lipoprotein targeting. The elucidation of crystal structures

of the lipoprotein localization factors provided

clues to the mechanism, but the primary structure requirements

that govern such targeting (+2 position) remained vague. In our recent protein

engineering study, the lipid-modified apyrase was retained in the inner

membrane despite fusing it with the lpp signal sequence of the prototypical

outer membrane lipoprotein and with the known outer membrane targeting

amino acid, “Ser” at +2 position. Furthermore, the bioinformatics study from

our lab pointed to the requirement of secondary structures like amphipathic

β-structures in addition to the “+2 amino acids” for lipoprotein targeting

(Kamalakkannan 2005). Those lacking this feature had “Gln” at the +2 position.

Apyrase, a four-helix bundle protein (Babu et al 2002) devoid of amphipathic

β-structure, was taken as a model for testing this hypothesis. This enzyme from

virulent Shigella is an ATP-diphosphohydrolase, which sequentially

hydrolyzes nucleoside triphosphates to corresponding diphosphates and then

monophosphates and does not further hydrolyze monophosphates, unlike

normal phosphatases. The enzyme can be readily assayed using a whole-cell

colorimetric pyrophosphate hydrolysis assay. The 27 kDa periplasmic protein

is synthesized as a precursor with a signal peptidase I cleavage site (-A-N-A)

in Shigella. Successful lipid modification of the enzyme in E. coli did not

affect its structure-function integrity even after the N-terminal

modification (Kamalakkannan et al 2004).

In addition to providing new knowledge on bacterial lipoprotein

biosynthesis, a very important application of this post translational

engineering was also investigated. Engineering proteins or peptides for

improved binding onto hydrophobic surfaces is significant in ELISA and

sensor applications. In this regard, the thesis has investigated the potential of

bacterial lipid modification for such an application using a hydrophilic model

protein, human interferon gamma, which is known to coat ELISA surfaces

poorly.

Human Interferon Gamma is a 14 kDa, highly hydrophilic α-helical

protein. Expression of this commercially important glycoprotein in E. coli

produced non-glycosylated forms, which still had significant diagnostic and

therapeutic value. Monitoring its levels in blood provides a useful

index, but owing to its hydrophilic nature it coats

hydrophobic surfaces, as in ELISA, poorly, and its poor antigenicity poses difficulties

in raising antibodies, the diagnostic reagent.

Taken together, this thesis aimed to understand the important

aspects of bacterial lipoprotein biosynthesis, and the findings have unraveled

new facts about bacterial lipoprotein biosynthesis, translocation, targeting and

evolutionary adaptation. This new knowledge will aid

protein engineering and metabolic engineering applications. The study carried out is

outlined below.

The first chapter of the thesis deals with the background

information on lipid modification, emphasizing the bacterial type of lipid

modification. The chapter elaborates on the prerequisites for lipoprotein

biosynthesis, the available knowledge on prolipoprotein translocation, the

biosynthetic pathway and sorting of lipoproteins. The next aspect of the

chapter focuses on the use of this knowledge for developing in vivo and

in vitro protein engineering strategies. The potential applications using these

strategies are also described in detail providing examples. Based on the

present understanding as supported by the literature, the objectives of this

study were framed in the final section of this chapter.

The methodology and the resources used in the study to

execute the objectives are dealt with in the second chapter. The study in general

employed the common biochemical and molecular biology techniques. A few

methods that were slightly modified for specific application are also described

in this chapter.

The results obtained from the experiments carried out in the study

are described in the next chapter. The first section of this chapter provides the

results pertaining to targeting of lipid-modified apyrase to the outer

membrane. The second and third sections describe the results on the necessity

and the role of TAT pathway in lipoprotein biosynthesis of fast-folding

proteins. The results obtained from an extensive computational biology study

to understand more about TAT-dependent lipoprotein biosynthesis are

elaborated in the fourth section of this chapter. The final section deals with

the results on one of the potential applications of bacterial lipid modification.

The enhanced binding efficiency of lipid-modified human interferon gamma

on hydrophobic surfaces is shown in this chapter.

The fourth chapter discusses the important findings of the study

with the support of literature. The first and the second sections of this chapter

describe how the current findings have improved our knowledge of

bacterial lipoprotein biosynthesis, especially in translocation and modification

of fast-folding lipoproteins. The next section elaborates on the significance of

TAT-dependent lipoproteins as niche-based adaptation. The extension of this

new knowledge to several protein and metabolic engineering applications

is discussed in the fourth section of this chapter. The high binding

efficiency of lipid-modified human interferon gamma protein on hydrophobic

surfaces and its mode of binding to such surfaces are discussed in the final

section of this chapter.

1.6 OBJECTIVES

Based on the potential applications of bacterial lipid modification

as a novel protein engineering tool, the objectives of this study were:

(i) To investigate the outer membrane targeting signals for bacterial lipoproteins.

(ii) To evaluate the existing protein engineering strategy for lipid modifying Sec-incompatible fast-folding proteins.

(iii) To analyze the role of the Sec-independent Twin Arginine Translocation (TAT) pathway in bacterial lipoprotein biosynthesis.

(iv) To develop a novel TAT-based protein engineering strategy for lipid modifying fast-folding proteins.

(v) To study the properties imparted by lipid modification using Human Interferon Gamma as a model protein.