
PROF. BALASUBRAMANIAN SATHYAMURTHY 2015 EDITION MBH – 202 MOLECULAR BIOLOGY

Contact for your free pdf & job opportunities: [email protected] or 8050673426

FOR MSC MICROBIOLOGY STUDENTS

2014 ONWARDS

Biochemistry scanner

THE IMPRINT MBH – 202 MOLECULAR BIOLOGY (THEORY)

As per Bangalore University (CBCS) Syllabus

2014 Edition

BY: Prof. Balasubramanian Sathyamurthy

Supported By:

Ayesha Siddiqui

Kiran K.S.

THE MATERIALS FROM “THE IMPRINT (BIOCHEMISTRY SCANNER)” ARE NOT FOR COMMERCIAL OR BRAND BUILDING. HENCE ONLY ACADEMIC CONTENT WILL BE PRESENT INSIDE. WE THANK ALL THE CONTRIBUTORS FOR ENCOURAGING THIS.

BE GOOD – DO GOOD & HELP OTHERS



DEDICATION

I dedicate this material to my spiritual guru Shri Raghavendra swamigal,

parents, teachers, well wishers and students who always increase my

morale and confidence to share my knowledge to reach all beneficiaries.

PREFACE

Biochemistry scanner ‘THE IMPRINT’ consists of solved Bangalore University question papers from the last ten years, prepared with the syllabus and examination pattern of the University in mind. The content taken from the reference books has been presented in simple language for better understanding.

The author, Prof. Balasubramanian Sathyamurthy, has 15 years of teaching experience and has taught in 5 Indian universities, including Bangalore University; more than 20 students have earned university rankings under his guidance. THE IMPRINT is a genuine effort by the students to help their peers with their examinations, using the strategy that they themselves have used successfully. These final year M.Sc students have proven their mettle in university examinations and are College / University rank holders. This is truly for the students, by the students. We thank all the contributors for their valuable suggestions in bringing out this book. We hope this will be appreciated by students and teachers alike. Suggestions are welcome.

For any comments, queries, and suggestions and to get your free copy write

us at [email protected] or call 8050673426.



CONTRIBUTORS:

CHETAN ABBUR ANJALI TIWARI

AASHITA SINHA ASHWINI BELLATTI

BHARATH K CHAITHRA

GADIPARTHI VAMSEEKRISHNA KALYAN BANERJEE

KAMALA KISHORE

KIRAN KIRAN H.R

KRUTHI PRABAKAR KRUPA S

LATHA M MAMATA

MADHU PRAKASHHA G D MANJUNATH .B.P

NAYAB RASOOL S NAVYA KUCHARLAPATI

NEHA SHARIFF DIVYA DUBEY

NOOR AYESHA M PAYAL BANERJEE

POONAM PANCHAL PRAVEEN

PRAKASH K J M PRADEEP.R

PURSHOTHAM PUPPALA DEEPTHI

RAGHUNATH REDDY V RAMYA S

RAVI RESHMA

RUBY SHA SALMA H.

SHWETHA B S SHILPI CHOUBEY

SOUMOUNDA DAS SURENDRA N

THUMMALA MANOJ UDAYASHRE. B

DEEPIKA SHARMA

EDITION : 2015

PRINT : Bangalore

CONTACT : [email protected] or 8050673426



BANGALORE UNIVERSITY SYLLABUS (REVISED 2014) M.SC MICROBIOLOGY II SEMESTER

MBH – 202 MOLECULAR BIOLOGY UNIT – 1 Concepts of Molecular Biology: (10 hrs)

Introduction, flow of information, central dogma of molecular biology.

Structure of DNA, DNA polymorphism (A, B, Z DNA), Structure and function of

different types of RNA.

DNA damage and repair: Types of DNA damage – deamination, oxidative

damage, alkylation, pyrimidine dimers. Repair pathways – photoreactivation,

excision repair, post replication repair, SOS repair, methyl directed

mismatch repair, very short patch repair.

Unit – 2 DNA Replication: (10 hrs)

DNA replication in prokaryotes and viruses (The rolling circle and M13

bacteriophages replication), asymmetric replication, looped, rolling circle,

semiconservative replication, primer or template, concatemer formation – P1.

Origin of replication, replication fork – leading and lagging strands, enzymes

involved at different steps of replication. Fidelity of replication.

Extrachromosomal replicons.

Unit – 3 Transcription: (10 hrs)

Transcription factors and machinery, formation of initiation complex,

transcription activators and repressors, RNA polymerases. Initiation, elongation

and termination. Heat shock response. Inhibitors of RNA synthesis and their

mechanism. Polycistronic and monocistronic mRNA. Control of elongation and

termination. Alternate sigma factors. Post transcriptional modifications of

mRNA – capping, editing, splicing, polyadenylation, modifications of tRNA and

rRNA.

Unit – 4 Translation: (10 hrs)

Genetic Code – Features and character, Wobble hypothesis. Ribosome

assembly, Initiation factors and their regulation, formation of initiation

complex, Initiation, elongation and termination of polypeptide chain, elongation

factors and releasing factors, translational proof reading, inhibitors of



translation and their mechanism, post translational modification of proteins –

glycosylation. Control of translation in eukaryotes. Differences between

prokaryotes and eukaryotes

Unit – 5 Regulation of gene expression: (10 hrs)

Transcriptional control. Operon concept, catabolite repression. Inducible and

repressible systems. Negative gene regulation – E. coli lac operon; Positive

regulation – E. coli ara operon; Regulation by attenuation – his and trp

operons, anti – termination - N protein and nut sites, DNA binding sites, DNA

binding protein, enhancer sequences, identification of protein binding site on

DNA. Maturation and processing of RNA – Methylation, cutting and

modification of tRNA degradation system.

Unit – 6 Control of gene expression at transcription and translation level:

Regulation of phages, viruses, prokaryotic and eukaryotic gene expression, role

of chromatin in regulating gene expression.

Gene silencing: Transcriptional and post transcriptional gene silencing – RNAi

pathway (siRNA and miRNA). (6 hrs)

References:

1. Robert F. Weaver. (2009). Molecular Biology, 4th Edition. McGraw-Hill.

2. B.B. Buchanan. (2007). Biochemistry and Molecular Biology of Plants. I.K.

International Publishing House Ltd. New Delhi.

3. Chris R. Calladine, Horace R. Drew, Ben F. Luisi and Andrew A. Travers. (2006)

Understanding DNA (3rd Ed.). Academic Press.

4. Raymond F Gesteland. (2006). The RNA World, Third Edition. I.K. International

Publishing House.

5. Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts,

Peter Walter, (2002). Molecular Biology of the Cell. Garland Pub. 4th Ed.

6. Twyman R.M., (1998). Advanced Molecular Biology. 1st Ed. Viva Books Pvt

Ltd., New Delhi.



7. Joset F., Michel G, (1993). Prokaryotic Genetics, Genome Organization,

Transfer and Plasticity, Boston. Blackwell.

8. Adams R.L.P, (1992). DNA Replication. IPL Oxford, England.

9. Streips and Yasbin, (1991). Modern Microbial Genetics. Wiley Ltd.

10. Thomas D. Brock, (1990). The Emergence of Bacterial Genetics, CSH lab Press.

11. Mark Ptashne, (1986). A Genetic Switch. Gene Control and Phage λ. Cell Press

and Blackwell Scientific Publications.



UNIT – 1 CONCEPTS OF MOLECULAR BIOLOGY: Introduction, flow of information, central dogma of molecular biology. Structure of DNA, DNA polymorphism (A, B, Z DNA), Structure and function of different types of RNA. DNA damage and repair: Types of DNA damage – deamination, oxidative damage, alkylation, pyrimidine dimers. Repair pathways – photoreactivation, excision repair, post replication repair, SOS repair, methyl directed mismatch repair, very short patch repair.

INTRODUCTION

The central dogma defines the paradigm of molecular biology. Genes are

perpetuated as sequences of nucleic acid, but function by being expressed in

the form of proteins. Replication is responsible for the inheritance of genetic

information. Transcription and translation are responsible for its conversion

from one form to another.

FLOW OF INFORMATION: CENTRAL DOGMA OF MOLECULAR BIOLOGY

The figure below illustrates the roles of replication, transcription, and

translation, viewed from the perspective of the central dogma:

The perpetuation of nucleic acid may involve either DNA or RNA as the genetic

material. Cells use only DNA. Some viruses use RNA, and replication of viral

RNA occurs in the infected cell.

The expression of cellular genetic information usually is unidirectional.

Transcription of DNA generates RNA molecules that can be used further only to

generate protein sequences; generally they cannot be retrieved for use as

genetic information. Translation of RNA into protein is always irreversible.



The central dogma states that information in nucleic acid can be perpetuated or transferred, but the transfer of information into protein is irreversible.

The genomes of all living organisms consist of duplex DNA. Viruses have

genomes that consist of DNA or RNA; and there are examples of each type that

are double-stranded (ds) or single-stranded (ss). Details of the mechanism used

to replicate the nucleic acid vary among the viral systems, but the principle of

replication via synthesis of complementary strands remains the same, as

illustrated in the figure.

Double-stranded and single-stranded nucleic acids both replicate by synthesis

of complementary strands governed by the rules of base pairing.

Cellular genomes reproduce DNA by the mechanism of semi-conservative

replication. Double-stranded virus genomes, whether DNA or RNA, also

replicate by using the individual strands of the duplex as templates to

synthesize partner strands.

Viruses with single-stranded genomes use the single strand as a template to

synthesize a complementary strand; and this complementary strand in turn is

used to synthesize its complement, which is, of course, identical with the

original starting strand. Replication may involve the formation of stable double-stranded

intermediates or may use double-stranded nucleic acid only as a

transient stage.
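The two-round replication of a single-stranded genome described above can be sketched in a few lines. This is only an illustrative sketch: the function name `complementary_strand` and the example sequence are assumptions made here, and the model ignores enzymes, primers and RNA genomes.

```python
# Watson-Crick pairing rules for DNA (A pairs with T, G pairs with C).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the complementary strand, written 5'->3'.

    The partner strand is antiparallel, so the input is read in reverse
    while each base is swapped for its pairing partner.
    """
    return "".join(DNA_PAIRS[base] for base in reversed(strand))


# Hypothetical single-stranded genome, used only for illustration.
genome = "ATGCCGTTA"

# Round 1: the genome is the template for a complementary strand.
minus_strand = complementary_strand(genome)

# Round 2: the complementary strand is the template for its own complement.
plus_strand = complementary_strand(minus_strand)

print("complement:", minus_strand)
print("second round restores the original:", plus_strand == genome)
```

Taking the complement twice always returns the starting sequence, which is why the second round of synthesis regenerates the original genome.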

The restriction to unidirectional transfer from DNA to RNA is not absolute. It is

overcome by the retroviruses, whose genomes consist of single-stranded RNA

molecules. During the infective cycle, the RNA is converted by the process of

reverse transcription into a single-stranded DNA, which in turn is converted

into a double-stranded DNA. This duplex DNA becomes part of the genome of

the cell, and is inherited like any other gene. So reverse transcription allows a

sequence of RNA to be retrieved and used as genetic information. The existence

of RNA replication and reverse transcription establishes the general principle

that information in the form of either type of nucleic acid sequence can be

converted into the other type.
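As a rough illustration of this interconversion, the sketch below treats transcription and reverse transcription as base-pairing operations on strings. The function names and the short sequence are hypothetical, chosen only for this example; real reverse transcription involves primers, RNase H activity and second-strand synthesis, none of which is modelled.

```python
# Pairing rules when a DNA template directs RNA synthesis, and vice versa.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """DNA -> RNA: build the RNA complementary to a DNA template (5'->3')."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


def reverse_transcribe(rna: str) -> str:
    """RNA -> DNA: build the single-stranded DNA complementary to an RNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


# Hypothetical fragment of a retroviral RNA genome.
viral_rna = "AUGGCUUAA"

# Reverse transcription stores the RNA sequence as DNA ...
cdna = reverse_transcribe(viral_rna)

# ... and transcribing that DNA recovers the original RNA sequence.
print("cDNA:", cdna)
print("RNA recovered:", transcribe(cdna) == viral_rna)
```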

Throughout the range of organisms, with genomes varying in total content over

a 100,000-fold range, a common principle prevails. The DNA codes for all the

proteins that the cell(s) of the organism must synthesize; and the proteins in

turn (directly or indirectly) provide the functions needed for survival.

In viruses, the nucleic acid codes for the protein(s) needed to package the genome and

also for any functions additional to those provided by the host cell that are

needed to reproduce the virus during its infective cycle.

STRUCTURE OF DNA

The salient features of the Watson-Crick model for the commonly found form of DNA (B-DNA) are:

1. DNA molecule consists of two helical polynucleotide chains which are coiled

around (or wrapped about) a common axis in the form of a right handed double

helix. The two helices are wound in such a way so as to produce 2 interchain

spacings or grooves, a major or wide groove (width 12 Å, depth 8.5 Å) and a

minor or narrow groove (width 6 Å, depth 7.5 Å).

2. The two grooves arise because the glycosidic bonds of a base pair are not

diametrically opposite each other.



3. The minor groove contains the pyrimidine O-2 and the purine N-3 of the base

pair, and the major groove is on the opposite side of the pair.

4. Each groove is lined by potential hydrogen bond donor and acceptor atoms.

5. The two helices wind along the molecule parallel to the phosphodiester

backbones.

6. The phosphate and deoxyribose units are found on the periphery of the helix,

whereas the purine and pyrimidine bases occur in the centre.

7. The planes of the bases are perpendicular to the helix axis.

8. The planes of the sugars are almost at right angles to those of the bases.

9. The diameter of the helix is 20 Å. The bases are 3.4 Å apart along the helix axis

and are related by a rotation of 36 degrees. Therefore, the helical structure

repeats after 10 residues on each chain, i.e., at intervals of 34 Å. In other

words, each turn of the helix contains 10 nucleotide residues.

10. The two chains are held together by hydrogen bonds between pairs of bases.

11. Adenine always pairs with thymine by 2 hydrogen bonds and guanine with

cytosine with 3 hydrogen bonds. This specific positioning of the bases is called

base complementarity.

12. The individual hydrogen bonds are weak in nature, but the large number of them

involved in the DNA molecule confers stability on it. It is now thought that the

stability of the DNA molecule is primarily a consequence of van der Waals

forces between the planes of stacked bases.

13. Base complementarity of the polynucleotide chain.

14. An important feature of the double helix is the specificity of the pairing of

bases. Pairing always occurs between adenine and thymine and between

guanine and cytosine.

Steric factor: The steric restriction is imposed by the regular helical nature of the sugar-

phosphate backbone of each polynucleotide chain.

Hydrogen-bonding factor: The base pairing is further restricted by hydrogen-bonding requirements. The

hydrogen atoms in purine and pyrimidine bases have well-defined positions.



Adenine cannot pair with cytosine because there would be two hydrogen atoms

near one of the bonding positions and none at the other. Similarly, guanine

cannot pair with thymine.
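The pairing rules above (A with T through 2 hydrogen bonds, G with C through 3, and no A–C or G–T pairs) can be expressed as a small lookup. This is a minimal sketch; the function name `duplex_hydrogen_bonds` and the convention of writing the partner strand base-by-base under the first strand are assumptions made for the example.

```python
# Hydrogen bonds per allowed base pair, as stated in the text.
HYDROGEN_BONDS = {
    ("A", "T"): 2,
    ("T", "A"): 2,
    ("G", "C"): 3,
    ("C", "G"): 3,
}


def duplex_hydrogen_bonds(strand: str, partner: str) -> int:
    """Count hydrogen bonds in a duplex.

    `partner` is written base-by-base opposite `strand` (that is, in the
    3'->5' direction), so position i of one strand faces position i of
    the other. Any pair outside the Watson-Crick rules is rejected.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must have the same length")
    total = 0
    for base, opposite in zip(strand, partner):
        if (base, opposite) not in HYDROGEN_BONDS:
            raise ValueError(f"{base}-{opposite} is not an allowed base pair")
        total += HYDROGEN_BONDS[(base, opposite)]
    return total


# Two A-T pairs (2 bonds each) and two G-C pairs (3 bonds each).
print(duplex_hydrogen_bonds("ATGC", "TACG"))
```

A G-C-rich duplex therefore carries more hydrogen bonds per base pair than an A-T-rich one of the same length.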

DNA POLYMORPHISM (A, B, Z DNA)

Characteristics | A-DNA | B-DNA | C-DNA | Z-DNA
Conditions | 75% relative humidity; Na+, K+, Cs+ ions | 92% relative humidity; low ion strength | 60% relative humidity; Li+ ions | Very high salt concentration
Shape | Broadest | Intermediate | Narrow | Narrowest
Helix sense | Right-handed | Right-handed | Right-handed | Left-handed
Helix diameter | 25.5 Å | 23.7 Å | 19.0 Å | 18.4 Å
Rise per base pair (‘h’) | 2.3 Å | 3.4 Å | 3.32 Å | 3.8 Å
Base pairs/helix turn (‘n’) | 11 | 10.4 | 9.33 | 12 (= 6 dimers)
Helix pitch (h × n) | 25.30 Å | 35.36 Å | 30.97 Å | 45.60 Å
Rotation / base pair | +32.72° | +34.61° | +38.58° | –60° (per dimer)
Base pair tilt | 19° | 1° | 7.8° | 9°
Glycosidic bond | anti | anti | - | anti for C, T; syn for A, G
Major groove | Narrow and very deep | Wide and quite deep | - | Flat
Minor groove | Very broad and shallow | Narrow and quite deep | - | Very narrow and deep
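The helix pitch and rotation rows of the table follow from the rise per base pair (h) and the number of base pairs per turn (n): pitch = h × n and rotation per base pair = 360° / n. The sketch below recomputes them from the h and n values in the table; the function and dictionary names are assumptions made for this example. The computed rotations agree with the table if its values are truncated rather than rounded to two decimals.

```python
# (rise per base pair in angstroms, base pairs per helix turn), from the table.
HELIX_FORMS = {
    "A-DNA": (2.3, 11),
    "B-DNA": (3.4, 10.4),
    "C-DNA": (3.32, 9.33),
    "Z-DNA": (3.8, 12),
}


def helix_pitch(rise: float, bp_per_turn: float) -> float:
    """Length of one full helical turn along the axis, in angstroms."""
    return rise * bp_per_turn


def rotation_per_base_pair(bp_per_turn: float) -> float:
    """Magnitude of the average twist between neighbouring base pairs, in degrees."""
    return 360.0 / bp_per_turn


for form, (rise, bp_per_turn) in HELIX_FORMS.items():
    pitch = helix_pitch(rise, bp_per_turn)
    twist = rotation_per_base_pair(bp_per_turn)
    print(f"{form}: pitch = {pitch:.2f} A, twist = {twist:.2f} deg per base pair")
```

For Z-DNA the table quotes the rotation per dinucleotide repeat (–60°, with 6 dimers per turn) and the helix is left-handed, so the per-base-pair magnitude printed here is half of that figure and carries no sign.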

STRUCTURE AND FUNCTION OF DIFFERENT TYPES OF RNA

Ribonucleic acid (RNA), like DNA, is a long, unbranched macromolecule

consisting of nucleotides joined by 3’ to 5’ phosphodiester bonds. The number

of ribonucleotides in RNA ranges from as few as 75 to many thousands.

Types of RNA

In all procaryotic and eucaryotic organisms, 3 general types of RNAs are found:

ribosomal, transfer and messenger RNAs. Each of these polymeric forms serves

as an extremely important informational link between DNA, the master carrier of

information and proteins. The 3 types of RNA molecules differ from each other

by size, function and general stability.

Ribosomal RNA (rRNA) or Insoluble RNA



It is the most stable form of RNA and is found in ribosomes. It has the highest molecular weight and is sedimented when a cell homogenate containing 10⁻² M Mg²⁺ is centrifuged at high speed (100,000 × g for 120 minutes).

In the bacterium Escherichia coli, there are 3 kinds of rRNA, called 23S, 16S and 5S rRNA because of their sedimentation behaviour. These have molecular weights of 1,200,000, 550,000 and 36,000 respectively.

One molecule of each of these 3 types of rRNA is present in each ribosome. Ribosomal RNA is the most abundant of all types of RNA and makes up about 80% of the total RNA of a cell.

Ribosomal RNA represents about 40-60% of the total weight of ribosomes.

Ribosomes | Subunit | rRNA
Procaryotic ribosomes | 30S | 16S
Procaryotic ribosomes | 50S | 5S, 23S
Eucaryotic ribosomes | 40S | 18S
Eucaryotic ribosomes | 60S | 5S, 5.8S, 28S

rRNA has G-C contents more than 50%. The rRNA molecule appears as a single

unbranched polynucleotide strand (= primary structure). At low ionic strength,

the molecule shows a compact rod with random coiling. But at high ionic

strength, the molecule reveals the presence of compact helical regions with

complementary base pairing and looped outer region ( = secondary structure).

The helical structure results from a folding back of a single-stranded polymer

at areas where hydrogen bonding is possible because of short lengths of



complementary structures. The double helical secondary structures in RNA can

form within a single RNA molecule or between 2 separate RNA molecules. RNAs

can often assume even more complex shapes as in bacteria.

Transfer RNA (tRNA) or Soluble RNA (sRNA):

Transfer RNA is the smallest polymeric form of RNA. These molecules seem to be generated by the nuclear processing of a precursor molecule. In abundance, tRNA comes next to rRNA and amounts to about 15% of the total RNA of the cell. The tRNA remains dissolved in solution after a broken-cell suspension is centrifuged at 100,000 × g for several hours. The tRNA molecules serve a number of functions, the most important of which is to act as specific carriers of activated amino acids to specific sites on the protein-synthesizing templates.

Common structural features of tRNAs

All tRNA molecules have a common design and consist of 3 folds, giving them the

shape of a cloverleaf with four arms; the longer tRNAs have a short fifth or

extra arm. The actual 3-dimensional structure of a tRNA looks more like a

twisted L than a cloverleaf.

All tRNA molecules are unbranched chains containing from 73 to 93

ribonucleotide residues, corresponding to molecular weights between 24,000

and 31,000

They contain from 7 to 15 unusual modified bases. Many of these unusual

bases are methylated or dimethylated derivatives of A, U, G and C.

Methylation prevents the formation of certain base pairs so that some of the

bases become accessible for other interactions. Methylation imparts

hydrophobic character to some portions of tRNA molecules which may be

important for their interaction with the synthetases and with ribosomal

proteins.



The 5’ end of tRNAs is phosphorylated. The 5’ terminal residue is usually

guanylate (pG).

The base sequence at the 3’ end of all tRNAs is CCA. All amino acids bind to

this terminal adenosine via the 3’-OH group of its ribose.

About 50% of the nucleotides in tRNAs are base-paired to form double helices.

There are 5 groups of bases which are not base-paired. These 5 groups, of which 4 form

‘loops’, are:

The 3′ CCA terminal region,

The ribothymine-pseudouracil-cytosine ( = TΨC) loop,

The ‘extra arm’ or little loop, which contains a variable number of residues,

The dihydrouracil ( = DHU) loop, which contains several dihydrouracil residues,

and

The anticodon loop, which consists of 7 bases with the sequence, 5′ —

pyrimidine — pyrimidine —X —Y—Z — modified purine — variable base — 3′

The 4 loops are recognition sites. Each tRNA must have at least two such

recognition sites : one for the activated amino acid-enzyme complex with which

it must react to form the aminoacyl-tRNA and another for the site on a



messenger RNA molecule which contains the code (codon) for that particular

amino acid.

A unique similarity among all tRNA molecules is that the overall distance from

CCA at one end to the anticodon at the other end is constant. The difference in

nucleotide numbers in various tRNA molecules is, in fact, compensated for by

the size of the “extra arm”, which is located between the anticodon loop and the

TΨC loop.

Messenger RNA (mRNA) or Template RNA

Messenger RNA is the most heterogeneous in size and stability among all the types of RNA. It has a large molecular weight approaching 2 × 10⁶ and amounts to about 5% of the total RNA of a cell. It is synthesized on the surface of a DNA template. Thus, it has a base sequence complementary to DNA and carries genetic information or ‘message’ (hence its nomenclature) for the assembly of amino acids from DNA to ribosomes, the site of protein synthesis.

In procaryotic cells, mRNA is metabolically unstable with a high turnover rate, whereas it is rather stable in eucaryotes. It is synthesized by DNA-dependent RNA polymerase. On account of its heterogeneity, mRNA varies greatly in chain length. Since few proteins contain fewer than 100 amino acids, the mRNA coding for these proteins must have at least 100 × 3 or 300 nucleotide residues.



In E. coli, the average size of mRNA is 900 to 1,500 nucleotide units. If an mRNA

carries the code for the synthesis of a single protein molecule, it is called

monocistronic, and if it codes for more than one kind of protein, it is known

as polycistronic, as in Escherichia coli.
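The length arithmetic used above (3 nucleotides per amino acid) can be written out as a short sketch. The function names are assumptions made for this example, and the calculation deliberately ignores the stop codon and any untranslated regions, so it gives a lower bound on mRNA length and an upper bound on coding capacity.

```python
CODON_LENGTH = 3  # nucleotides that specify one amino acid


def min_coding_nucleotides(amino_acids: int) -> int:
    """Smallest number of nucleotides that can code for the given residues."""
    return amino_acids * CODON_LENGTH


def max_amino_acids(nucleotides: int) -> int:
    """Largest number of residues a coding stretch of this length can specify."""
    return nucleotides // CODON_LENGTH


# A 100-residue protein, the lower limit mentioned in the text.
print(min_coding_nucleotides(100))

# Coding capacity of the average E. coli mRNA sizes quoted in the text,
# assuming (unrealistically) that every nucleotide is coding.
print(max_amino_acids(900), max_amino_acids(1500))
```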

The mRNAs are single-stranded and complementary to the template strand of

their respective structural genes. Although both types of mRNA molecules

(prokaryotic and eukaryotic) are synthesized with a triphosphate group at the

5′ end, there is a basic difference between the two: the eukaryotic mRNA

molecules, especially those of mammals, have some peculiar characteristics.

The 5′ end of mRNA is ‘capped’ by a 7-methylguanosine triphosphate which is

linked to an adjacent 2′-O-methylribonucleoside at its 5′-hydroxyl through the

3 phosphates. The other end of most mRNA molecules, the 3′ hydroxyl end, has

attached a polymer of adenylate residues, 20–250 nucleotides in length.

DNA DAMAGE AND REPAIR

The integrity of DNA is under constant assault from radiation, chemical

mutagens, and spontaneously arising changes. In spite of this onslaught of

damaging agents, the rate of mutation remains remarkably low, thanks to the

efficiency with which DNA is repaired. It has been estimated that fewer than

one in a thousand DNA lesions becomes a mutation; all the others are

corrected.

DNA repair is possible largely because the DNA molecule consists of two

complementary strands. DNA damage in one strand can be removed and

accurately replaced by using the undamaged complementary strand as a

template.

General features:

Most DNA repair mechanisms require two nucleotide strands of DNA because

most replace whole nucleotides, and a template strand is needed to specify the

base sequence. The complementary, double-stranded nature of DNA not only

provides stability and efficiency of replication, but also enables either strand to

provide the information necessary for correcting the other.



Many types of DNA damage can be corrected by more than one pathway of

repair. This redundancy testifies to the extreme importance of DNA repair to

the survival of the cell. It ensures that almost all mistakes are corrected. If a

mistake escapes one repair system, it’s likely to be repaired by another system.

TYPES OF DNA DAMAGE

DNA is by no means the inert substance that might be supposed from naive

consideration of genome stability. Rather, the reactive environment of the cell,

the presence of a variety of toxic substances, and exposure to UV or ionizing

radiation subjects it to numerous chemical insults that excise or modify bases

and alter sugar–phosphate groups.

Indeed, some of these reactions occur at surprisingly high rates. For example,

under normal physiological conditions, the glycosidic bonds of ~10,000 of the 3

billion purine nucleotides in each human cell hydrolyze spontaneously each

day.

Figure: The types and sites of chemical damage to which DNA is normally susceptible in vivo. Red arrows indicate sites subject to oxidative attack, blue arrows indicate sites subject to spontaneous hydrolysis, and green arrows indicate sites subject to nonenzymatic methylation by S-adenosylmethionine. The width of an arrow is indicative of the relative frequency of the reaction.
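To put the depurination figure quoted above (~10,000 of the 3 billion purine nucleotides per cell per day) in proportion, the following sketch converts it into a per-site daily fraction and a cumulative count. The names are assumptions made for this example, and the calculation assumes a constant rate with no repair, which is not what happens in a living cell.

```python
PURINES_PER_CELL = 3_000_000_000   # purine nucleotides per human cell (from the text)
DEPURINATIONS_PER_DAY = 10_000     # spontaneous glycosidic bond hydrolyses per day


def daily_fraction_lost(events_per_day: int, sites: int) -> float:
    """Fraction of sites hit per day, assuming every site is equally likely."""
    return events_per_day / sites


def expected_events(events_per_day: int, days: int) -> int:
    """Cumulative number of events over a period, if none were repaired."""
    return events_per_day * days


fraction = daily_fraction_lost(DEPURINATIONS_PER_DAY, PURINES_PER_CELL)
print(f"fraction of purines lost per day: {fraction:.2e}")
print(f"unrepaired lesions after one year: {expected_events(DEPURINATIONS_PER_DAY, 365):,}")
```

Each individual purine is very unlikely to be lost on a given day, yet the absolute number of lesions accumulates quickly, which is why continuous repair is needed.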

DEAMINATION

For some time after the essential functions of nucleic acids had been

elucidated, there seemed no apparent reason for nature to go to the



considerable metabolic effort of using thymine in DNA and uracil in RNA when

these substances have virtually identical base pairing properties. This enigma

was solved by the discovery of cytosine’s penchant for conversion to uracil by

deamination, either via spontaneous hydrolysis, which is estimated to occur

~120 times per day in each human cell, or by reaction with nitrites.

If U were the normal DNA base, the deamination of C would be highly

mutagenic because there would be no indication of whether the resulting

mismatched G·U base pair had originally been G·C or A·U. Since T is DNA’s

normal base, however, any U in DNA is almost certainly a deaminated C. U’s

that occur in DNA are efficiently excised by uracil–DNA glycosylase [UDG; also

called uracil N-glycosylase (UNG)] and then replaced by C through base excision repair (BER).

UDG also has an important function in DNA replication. dUTP, an intermediate

in dTTP synthesis, is present in all cells in small amounts. DNA polymerases

do not discriminate well between dUTP and dTTP so that, despite the low dUTP

level that cells maintain, newly synthesized DNA contains an occasional U.

These U’s are rapidly replaced by T through BER. However, since excision

occurs more rapidly than repair, all newly synthesized DNA is

fragmented. When Okazaki fragments were first discovered, it therefore seemed

that all DNA was synthesized discontinuously. This ambiguity was resolved with

the discovery of E. coli defective in UDG. In these ung⁻ mutants, only about half

of the newly synthesized DNA is fragmented, strongly suggesting that DNA’s

leading strand is synthesized continuously.

OXIDATIVE DAMAGE

The repair pathways considered to this point generally work only for lesions in

double-stranded DNA, the undamaged strand providing the correct genetic

information to restore the damaged strand to its original state. However, in

certain types of lesions, such as double-strand breaks, double-strand cross-links,

or lesions in a single-stranded DNA, the complementary strand is itself

damaged or is absent. Double-strand breaks and lesions in single-stranded

DNA most often arise when a replication fork encounters an unrepaired DNA



lesion. Such lesions and DNA cross-links can also result from ionizing

radiation and oxidative reactions.

ALKYLATION

The exposure of DNA to alkylating agents such as N-methyl-N′-nitro-N-nitrosoguanidine

(MNNG) yields, among other products, O6-alkylguanine

residues.

The formation of these derivatives is highly mutagenic because on replication,

they frequently cause the incorporation of thymine instead of cytosine.

PYRIMIDINE DIMERS

UV radiation of 200 to 300 nm promotes the formation of a cyclobutyl ring

between adjacent thymine residues on the same DNA strand to form an

intrastrand thymine dimer. Similar cytosine and thymine–cytosine dimers are likewise formed but at lesser

rates. Such cyclobutane pyrimidine dimers (CPDs) locally distort DNA’s

base-paired structure such that it can be neither transcribed nor replicated.

Indeed, a single thymine dimer, if unrepaired, is sufficient to kill an E. coli cell.

Figure: The cyclobutylthymine dimer that forms on UV irradiation of two adjacent thymine residues on a DNA strand. The ~1.6-Å-long covalent bonds joining the thymine rings (red) are much shorter than the normal 3.4-Å spacing between stacked rings in B-DNA, thereby locally distorting the DNA.

REPAIR PATHWAYS

Types:

Direct repair systems, as the name suggests, act directly on damaged

nucleotides, converting each one back to its original structure.

Excision repair involves excision of a segment of the polynucleotide containing

a damaged site, followed by resynthesis of the correct nucleotide sequence by a

DNA polymerase.

Mismatch repair corrects errors of replication, again by excising a stretch of

single-stranded DNA containing the offending nucleotide and then repairing the

resulting gap.

Recombination repair is used to mend double-strand breaks.

PHOTO – REACTIVATION



Cyclobutyl dimers are repaired by a light-dependent direct system called

photoreactivation. In E. coli, the process involves the enzyme called DNA

photolyase (more correctly named deoxyribodipyrimidine photolyase). When

stimulated by light with a wavelength between 300 and 500 nm, the enzyme

binds to cyclobutyl dimers and converts them back to the original monomeric

nucleotides.

Photoreactivation is a widespread but not universal type of repair: it is known

in many but not all bacteria and also in quite a few eukaryotes, including some

vertebrates, but is absent in humans and other placental mammals.



A similar type of photoreactivation involves the (6–4) photoproduct photolyase and

results in repair of (6–4) lesions. Neither E. coli nor humans have this enzyme but it

is possessed by a variety of other organisms.

Photolyases generally contain two cofactors that serve as light-absorbing agents, or chromophores. One of the chromophores is always FADH⁻. In E. coli and yeast, the other chromophore is a folate. The reaction mechanism entails the generation of free radicals. DNA photolyases are not present in the cells of placental mammals (which include humans).

1. A blue-light photon (300 to 500 nm wavelength) is absorbed by the folate chromophore, MTHFpolyGlu, which functions as a photoantenna.

2. The excitation energy passes to FADH⁻ in the active site of the enzyme.

3. The excited flavin (*FADH⁻) donates an electron to the pyrimidine dimer (shown here in a simplified representation) to generate an unstable dimer radical.

4. Electronic rearrangement restores the monomeric pyrimidines.

5. The electron is transferred back to the flavin radical to regenerate FADH⁻.

EXCISION REPAIR

Every cell has a class of enzymes called DNA glycosylases that recognize

particularly common DNA lesions (such as the products of cytosine and

adenine deamination) and remove the affected base by cleaving the N-glycosyl

bond. This cleavage creates an apurinic or apyrimidinic site in the DNA,

commonly referred to as an AP site or abasic site.

The table lists the enzymes / proteins and their functions.

Each DNA glycosylase is generally specific for one type of lesion. Uracil DNA

glycosylases, for example, found in most cells, specifically remove from DNA

the uracil that results from spontaneous deamination of cytosine.



Bacteria generally have just one type of uracil DNA glycosylase, whereas

humans have at least four types, with different specificities—an indicator of the

importance of uracil removal from DNA.

The most abundant human uracil glycosylase, UNG, is associated with the

human replisome, where it eliminates the occasional U residue inserted in

place of a T during replication.

The deamination of C residues is 100-fold faster in single-stranded DNA than

in double-stranded DNA, and humans have the enzyme hSMUG1, which

removes any U residues that occur in single-stranded DNA during replication

or transcription.

Two other human DNA glycosylases, TDG and MBD4, remove either U or T

residues paired with G, generated by deamination of cytosine or 5-



methylcytosine, respectively. Other DNA glycosylases recognize and remove a

variety of damaged bases, including formamidopyrimidine and 8-

hydroxyguanine (both arising from purine oxidation), hypoxanthine (arising

from adenine deamination), and alkylated bases such as 3-methyladenine and

7-methylguanine.

Glycosylases that recognize other lesions, including pyrimidine dimers, have

also been identified in some classes of organisms.

Once an AP site has formed, another group of enzymes must repair it. The

repair is not made by simply inserting a new base and re-forming the N-glycosyl

bond. Instead, the deoxyribose 5’-phosphate left behind is removed and

replaced with a new nucleotide. This process begins with AP endonucleases, enzymes that cut the DNA strand containing the AP site.

The position of the incision relative to the AP site (5’ or 3’ to the site) varies with

the type of AP endonuclease.

A segment of DNA including the AP site is then removed, DNA polymerase I

replaces the DNA, and DNA ligase seals the remaining nick. In eukaryotes,

nucleotide replacement is carried out by specialized polymerases, as described

below.

A DNA glycosylase recognizes a damaged base and cleaves between the base and

deoxyribose in the backbone.

An AP endonuclease cleaves the phosphodiester backbone near the AP site.

DNA polymerase I initiates repair synthesis from the free 3′ hydroxyl at the

nick, removing a portion of the damaged strand and replacing it with

undamaged DNA.

The nick remaining after DNA polymerase I has dissociated is sealed by DNA

ligase.
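The four steps above can be followed on a toy sequence. This is a minimal sketch, not a model of the real enzymes: the sequences, the `_` symbol for an AP site, and the function names are all assumptions made for illustration, and the endonuclease, polymerase and ligase steps are collapsed into one function.

```python
# Toy model of base-excision repair of uracil (illustrative only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def glycosylase(strand):
    """Uracil DNA glycosylase step: replace each U with an AP site, written '_'."""
    return strand.replace("U", "_")

def repair_ap_sites(strand, partner):
    """AP endonuclease, polymerase and ligase steps collapsed into one:
    each AP site is refilled with the base complementary to the partner strand."""
    return "".join(
        COMPLEMENT[p] if s == "_" else s
        for s, p in zip(strand, partner)
    )

damaged = "ATGUCGA"   # hypothetical strand; the C at index 3 has been deaminated to U
partner = "TACGGCT"   # undamaged strand, written so that partner[i] pairs with damaged[i]

with_ap = glycosylase(damaged)
repaired = repair_ap_sites(with_ap, partner)
print(with_ap)    # strand carrying the AP site
print(repaired)   # strand after repair synthesis
```

The partner strand is written in aligned order rather than in the usual 5′ to 3′ convention, purely so that paired positions share an index.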

EXCISION REPAIR - NUCLEOTIDE EXCISION REPAIR

DNA lesions that cause large distortions in the helical structure of DNA

generally are repaired by the nucleotide-excision system, a repair pathway

critical to the survival of all free-living organisms.



In nucleotide-excision repair, a multisubunit enzyme hydrolyzes two

phosphodiester bonds, one on either side of the distortion caused by the lesion.

In E. coli and other prokaryotes, the enzyme system hydrolyzes the fifth phosphodiester bond on the 3′ side and the eighth phosphodiester bond on the 5′ side to generate a fragment of 12 to 13 nucleotides (depending on whether the lesion involves one or two bases). In humans and other eukaryotes, the enzyme system hydrolyzes the sixth phosphodiester bond on the 3′ side and the twenty-second phosphodiester bond on the 5′ side, producing a fragment of 27 to 29 nucleotides.
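The E. coli fragment size follows from simple counting, sketched below. The counting rule is an assumption of this sketch: bonds are numbered outward from the lesion, so cutting the k-th bond leaves k - 1 undamaged nucleotides on that side. The function name is hypothetical.

```python
def excised_fragment_length(bond_5prime, bond_3prime, lesion_bases):
    """Length of the excised oligonucleotide.

    Cutting the k-th phosphodiester bond away from the lesion leaves
    k - 1 undamaged nucleotides on that side of the fragment.
    """
    return (bond_5prime - 1) + lesion_bases + (bond_3prime - 1)

# E. coli: eighth bond on the 5' side, fifth bond on the 3' side.
for lesion_bases in (1, 2):
    print("E. coli fragment:", excised_fragment_length(8, 5, lesion_bases))
```

With the E. coli positions this gives 12 and 13 nucleotides, matching the text. The same count applied to the human positions (22nd and 6th bonds) gives 27 or 28, so it does not reproduce the upper end of the quoted 27 to 29 range.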

Following the dual incision, the excised oligonucleotides are released from the

duplex and the resulting gap is filled by DNA polymerase I in E. coli and by DNA

polymerase ε in humans. DNA ligase seals the nick.

In E. coli, the key enzymatic complex is the ABC excinuclease, which has three

subunits, UvrA (Mr 104,000), UvrB (Mr 78,000), and UvrC (Mr 68,000). The

term “excinuclease” is used to describe the unique capacity of this enzyme

complex to catalyze two specific endonucleolytic cleavages, distinguishing this

activity from that of standard endonucleases.

A complex of the UvrA and UvrB proteins (A2B) scans the DNA and binds to the

site of a lesion. The UvrA dimer then dissociates, leaving a tight UvrB-DNA

complex. UvrC protein then binds to UvrB, and UvrB makes an incision at the

fifth phosphodiester bond on the 3′ side of the lesion. This is followed by a

UvrC-mediated incision at the eighth phosphodiester bond on the 5′ side. The

resulting 12 to 13 nucleotide fragment is removed by UvrD helicase.

The short gap thus created is filled in by DNA polymerase I and DNA ligase.

This pathway is a primary repair route for many types of lesions, including



cyclobutane pyrimidine dimers, 6-4 photoproducts, and several other types of

base adducts including benzo[a]pyrene-guanine, which is formed in DNA by

exposure to cigarette smoke. The nucleolytic activity of the ABC excinuclease is

novel in the sense that two cuts are made in the DNA. The mechanism of

eukaryotic excinucleases is quite similar to that of the bacterial enzyme,

although 16 polypeptides with no similarity to the E. coli excinuclease subunits

are required for the dual incision.

The general pathway of nucleotide-excision repair is similar in all organisms.

An excinuclease binds to DNA at the site of a bulky lesion and cleaves the

damaged DNA strand on either side of the lesion.

The DNA segment—of 13 nucleotides (13 mer) or 29 nucleotides (29 mer)—is

removed with the aid of a helicase.

The gap is filled in by DNA polymerase.

The remaining nick is sealed with DNA ligase.

POST REPLICATION REPAIR

When the E. coli replication machinery encounters certain nucleotide adducts,

such as pyrimidine dimers, it stops replicating and reinitiates about 1000 base

pairs beyond the adduct, generating a single-stranded gap that contains a

damaged nucleotide. At the same time, a normal duplex is produced from the

complementary strand. Thus, replication of a damaged duplex gives rise to one

duplex with two normal strands and one partial duplex with a lesion in one

strand and a gap in the other. The duplex with the defect is repaired by a

process that involves both recombination and excision repair.

The RecA protein forms a helical filament at the post-replication gap and

promotes homologous pairing with the intact sister duplex. This is followed by

reciprocal strand exchange, so that the gap is “transferred” from the damaged

duplex to the undamaged duplex, concomitant with the formation of a Holliday intermediate. The latter is resolved by a resolvase encoded by the ruvABC or rusA genes.

Filling in the gap by DNA polymerase, using the intact strand as template,

yields two uninterrupted duplexes, one of which still contains a damaged base

which can now be eliminated by a conventional excision repair reaction.

At present, there is no evidence that such a single-strand gap across damage is

generated by the replication machinery of mammalian cells. Hence, post-

replication repair has a different meaning in these cells, namely, the

elimination of base lesions from DNA following replication of the damaged

strand. In mammalian cells, the damaged strand can be converted into a

duplex by “translesion synthesis” or by “template switching.” In translesion

synthesis, the replication machinery simply synthesizes across the damaged

base, frequently by inserting the wrong base. In template switching, the

replication fork stops at the lesion site on the damaged strand, but continues

DNA synthesis on the undamaged strand.

Then the newly synthesized strand is used as template for the strand of the

sister duplex that had been blocked by the lesion. Once the synthesis (error-



free) past the lesion is accomplished, the nascent strand complementary to the

damaged strand switches back to its parental strand. The end result of

translesion replication and template switching is, again, the creation of two

duplexes with no discontinuities. The lesion that remains following replication

is eventually removed by excision repair. Even if the lesion is not removed by

excision repair, however, the post-replication mechanisms outlined above can

be repeated through many rounds of replication and consequently aid cell

survival by ensuring the inheritance of uninterrupted duplexes to the daughter

cells. In contrast to the bacterial system, however, the eukaryotic post-

replication repair phenomenon remains ill-defined. For example, because of

multiple origins of replication in eukaryotes, small post-replication gaps may

be generated by utilizing adjacent replication origins, and the resulting gaps

may be processed as in prokaryotes.

(a) In post-replication repair a lesion in one strand leads to gap formation. The gap is invaded by the complementary strand from the sister duplex. Following further processing by nucleases and DNA polymerases, a Holliday junction is formed, which is resolved by RuvABC resolvase. The remaining lesion in the duplex is then removed by excision repair. (b) In cross-link repair, (A)BC excinuclease makes dual incisions in one strand, and the cross-linked oligomer is displaced by recombination, which generates a Holliday structure and a “dangling” oligomer cross-linked to the duplex. This structure is recognized by the excinuclease and is released by dual incisions. The Holliday structure is resolved and the gaps resulting from recombination are filled in by polymerases and ligated. (c) In double-strand break repair, the RecBCD helicase/nuclease unwinds the duplex and generates a structure which can be processed by the RecA strand transfer activity. Further action of RecBCD and perhaps nucleases generates a double-Holliday structure which is resolved by resolvases.

SOS REPAIR

Extensive DNA damage in the bacterial chromosome triggers the induction of

many distantly located genes. This response, called the SOS response, provides

another good example of coordinated gene regulation.

Many of the induced genes are involved in DNA repair.



The key regulatory proteins are the RecA protein and the LexA repressor. The

LexA repressor (Mr 22,700) inhibits transcription of all the SOS genes, and

induction of the SOS response requires removal of LexA.

Mechanism:

The LexA protein is the repressor in this system, which has an operator site

near each gene. Because the recA gene is not entirely repressed by the

LexA repressor, the normal cell contains about 1,000 RecA monomers.

When DNA is extensively damaged (e.g., by UV light), DNA replication is halted

and the number of single-strand gaps in the DNA increases.

RecA protein binds to this damaged, single-stranded DNA, activating the

protein’s coprotease activity.

While bound to DNA, the RecA protein facilitates cleavage and inactivation of

the LexA repressor. When the repressor is inactivated, the SOS genes,

including recA, are induced; RecA levels increase 50- to 100-fold.

METHYL DIRECTED MISMATCHED REPAIR

The mismatch repair (MMR) system is responsible for removal of base mismatches caused by spontaneous and induced base deamination, oxidation, methylation and replication errors. The main targets of MMR are base mismatches such as G/T (arising from deamination of 5-methylcytosine), G/G, A/C and C/C. MMR binds not only to spontaneously occurring base mismatches but also to various chemically induced DNA lesions, such as alkylation-induced O6-methylguanine paired with cytosine or thymine, 1,2-intrastrand (GpG) cross-links generated by cisplatin, UV-induced photoproducts, purine adducts of benzo[a]pyrene-7,8-dihydrodiol-9,10-epoxides, 2-aminofluorene or N-acetyl-2-aminofluorene, and 8-oxoguanine. The importance of MMR in maintaining genomic stability and reducing mutation load is clearly illustrated by MMR deficiency syndromes such as HNPCC. The steps by which MMR proceeds are as follows:

The list of enzymes/proteins is given in the table.

Recognition of DNA lesions:

The recognition of mismatches or chemically modified bases is performed by the so-called MutSα complex, which binds to the lesions. MutSα is composed of the MutS homologous proteins MSH2 and MSH6 (also known as GT-binding protein, GTBP). For efficient binding to mismatches, phosphorylation of the MutSα complex is required. MSH2 can also form a complex with the mismatch repair protein MSH3. This complex is designated MutSβ. Depending on the binding partner, the heterodimers have different substrate specificities and, therefore, play different roles in mismatch repair. Thus, the MutSα complex is able to bind to base–base mismatches and to insertion/deletion mismatches, whereas MutSβ is only capable of binding to insertion/deletion mismatches.

Strand discrimination:



Presently, it is not clear how MMR discriminates between the parental and the newly synthesised DNA strand. It is supposed that the daughter strand is identified by non-ligated single-strand breaks (SSB) arising during replication. The problem with this model is that the SSB and the mismatch can be separated from each other by a great distance. How then can MutSα recognize both the SSB and the mismatch? An answer could be provided by the studies concerning the role of ATP during MMR. Both proteins (MSH2 and MSH6) contain ATP/ADP-binding sites. Mutation of these sites leads to attenuation of MMR activity but not to abrogation of GT binding. Two models are under consideration concerning the role of ATP/ADP binding and ATP hydrolysis.

In the molecular switch model it is assumed that the MutSα–ADP complex is responsible for the recognition and binding of the mismatch (‘active state’). Binding to a mismatch triggers the ADP → ATP transition and stimulates the intrinsic ATPase activity, leading to conformational changes and the formation of a hydrolysis-independent sliding clamp. This sliding clamp passively diffuses from the mismatch and signals the dissociation of the MMR proteins from the DNA (‘inactive state’). In this model, hydrolysis of ATP by MutSα provokes conformational changes and thereby enables the binding of MutLα. In addition, dissociation of MutSα from the DNA depends on ATP binding and not hydrolysis.

In the hydrolysis-driven translocation model, MutSα uses the energy gained by ATP hydrolysis to translocate actively along the DNA from the site of mismatch recognition to a site responsible for signaling the strand specificity (most likely an SSB). The assembly of the MutLα complex occurs at this signaling site.

Excision and repair synthesis:

Upon binding to the mismatch, MutSα associates with another heterodimeric complex (MutLα), consisting of the MutL homologous mismatch repair proteins MLH1 and PMS2. The excision of the DNA strand containing the mispaired base is performed by exonuclease I and the new synthesis by Pol δ. Whether or not MMR is inducible by genotoxic stress is still a matter of debate. The promoter of MSH2 harbours a p53-binding site and was found to be inducible upon co-transfection with p53 and Fos/Jun. An increase of MSH2 mRNA in genotoxin-exposed cells, however, still needs to be demonstrated. Treatment of cells with alkylating agents such as MNNG provoked nuclear translocation of MSH2/MSH6 and an increase of MutSα mismatch binding activity. Therefore, both transcriptional and post-translational mechanisms appear likely to be involved in the regulation of MMR.

Importance:

The importance of the MutSL system for mismatch repair is indicated by the

high rate at which it is found to be defective in human cancers. Loss of this

system leads to an increased mutation rate.



Unit – 2 DNA REPLICATION

DNA replication in prokaryotes and viruses (the rolling circle and M13 bacteriophage replication), asymmetric replication, looped, rolling circle, semiconservative replication, primer or template, concatemer formation – P1. Origin of replication, replication fork – leading and lagging strands, enzymes involved at different steps of replication. Fidelity of replication. Extrachromosomal replicons.

DNA REPLICATION IN PROKARYOTES

INITIATION:

The synthesis of a DNA molecule can be divided into three stages: initiation,

elongation, and termination, distinguished both by the reactions taking place

and by the enzymes required.

oriC plays an important role in the initiation of replication.

ORIGIN OF REPLICATION (oriC)

The E. coli replication origin, oriC, consists of 245 bp; it bears DNA sequence

elements that are highly conserved among bacterial replication origins.

The key sequences of interest here are two series of short repeats: three repeats

of a 13 bp sequence and four repeats of a 9 bp sequence.

Arrangement of sequences in the E. coli replication origin, oriC.

Although the repeated sequences (shaded in color) are not identical, certain

nucleotides are particularly common in each position, forming a consensus

sequence.

In positions where there is no consensus, N represents any of the four

nucleotides. The arrows indicate the orientations of the nucleotide sequences.



The timing of replication initiation is affected by DNA methylation and

interactions with the bacterial plasma membrane.

The oriC DNA is methylated by the Dam methylase, which methylates the N6

position of adenine within the palindromic sequence (5’)GATC.

The oriC region of E. coli is highly enriched in GATC sequences—it has 11 of

them in its 245 bp, whereas the average frequency of GATC in the E. coli

chromosome as a whole is 1 in 256 bp.
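The two figures quoted above can be compared directly. The sketch below assumes equal base frequencies, which is where the 1 in 256 expectation for a 4 bp site comes from; the short test sequence and the function name are hypothetical and are not the real oriC sequence.

```python
def count_sites(sequence, site="GATC"):
    """Count occurrences of site in sequence, allowing overlaps."""
    return sum(
        1 for i in range(len(sequence) - len(site) + 1)
        if sequence[i:i + len(site)] == site
    )

expected_spacing = 4 ** len("GATC")   # one site per 256 bp if all bases are equally likely
observed_spacing = 245 / 11           # oriC: 11 GATC sites in 245 bp
enrichment = expected_spacing / observed_spacing

print(expected_spacing)               # 256
print(round(observed_spacing, 1))     # 22.3
print(round(enrichment, 1))           # 11.5
print(count_sites("TTGATCAAGATCGG"))  # 2 (made-up sequence)
```

On these numbers oriC carries a GATC site roughly every 22 bp, about 11 to 12 times denser than the chromosome average.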

Immediately after replication, the DNA is hemimethylated: the parent strands

have methylated oriC sequences but the newly synthesized strands do not.

The hemimethylated oriC sequences are now sequestered for a period by

interaction with the plasma membrane.

After a time, oriC is released from the plasma membrane, and it must be fully

methylated by Dam methylase before it can again bind DnaA. Regulation of

initiation also involves the slow hydrolysis of ATP by DnaA protein, which



cycles the protein between active (with bound ATP) and inactive (with bound

ADP) forms on a timescale of 20 to 40 minutes.

At least nine different enzymes or proteins participate in the initiation phase of

replication. They open the DNA helix at the origin and establish a prepriming

complex for subsequent reactions.

DnaA protein

The crucial component in the initiation process is the DnaA protein.

A single complex of four to five DnaA protein molecules binds to the four 9 bp

repeats in the origin, then recognizes and successively denatures the DNA in

the region of the three 13 bp repeats, which are rich in A=T pairs.

This process requires ATP and the bacterial histone-like protein HU.

After this, other proteins come into the picture and continue the process.

About 20 DnaA protein molecules, each with a bound ATP, bind at the four 9

bp repeats. The DNA is wrapped around this complex.

The three A=T-rich 13 bp repeats are denatured sequentially.

Hexamers of the DnaB protein bind to each strand, with the aid of DnaC

protein.

The DnaB helicase activity further unwinds the DNA in preparation for priming

and DNA synthesis.

ELONGATION

DNA POLYMERASES OF E. COLI:

Properties of DNA polymerase I and III:

These two enzymes have fundamental properties that carry critical implications for DNA replication. All DNA polymerases synthesize DNA only in the 5′ to 3′ direction, adding dNTPs to the 3′ hydroxyl group of a growing chain.

DNA polymerases can add a new deoxyribonucleotide only to a preformed

primer strand that is hydrogen bonded to the template; they are not able to

initiate DNA synthesis de novo by catalyzing the polymerization of free

dNTPs. In this respect, DNA polymerases differ from RNA polymerases, which

can initiate the synthesis of a new strand of RNA in the absence of a primer.
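The two properties above (growth only at the 3′ end, and the need for a paired primer) can be sketched as follows. This is an illustration under stated assumptions, not a model of the enzyme: the function name and sequences are hypothetical, and the primer is written with DNA letters for simplicity even though primers in the cell are usually RNA.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5, primer_5to3):
    """Extend a primer 5'->3' along a template written 3'->5'.

    The primer must already be base-paired to the start of the template;
    with no primer there is no 3' hydroxyl and synthesis cannot begin.
    """
    if not primer_5to3:
        raise ValueError("DNA polymerase cannot start without a primer")
    for base, partner in zip(primer_5to3, template_3to5):
        if COMPLEMENT[partner] != base:
            raise ValueError("primer is not paired to the template")
    new_strand = primer_5to3
    for partner in template_3to5[len(primer_5to3):]:
        new_strand += COMPLEMENT[partner]   # each nucleotide is added at the 3' end
    return new_strand

print(extend_primer("TACGGCTA", "ATG"))
```

Calling the function with an empty primer raises an error, which mirrors the statement that DNA polymerases cannot initiate synthesis de novo.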

Mechanism of DNA polymerase I and III:

Introduction:

The E. coli genome encodes several DNA polymerases, including DNA polymerase I, II and III (Pol I, II and III).

DNA polymerase I or Pol I:

This was discovered by Nobel laureate Arthur Kornberg in E. coli in 1957 and is also called the Kornberg enzyme.

It is a single polypeptide with a molecular weight of 109 kDa.

There are about 400 molecules of the enzyme in a single bacterial cell.



These are roughly spherical, with a diameter of 6.5 nm, and are metalloenzymes that contain Zn2+.

Pol I does not carry out the bulk of DNA synthesis; rather, it functions mainly in proofreading and DNA repair.

The enzyme has the following biological functions:

a. 5′ to 3′ exonuclease activity:

This activity resides in the smaller fragment of DNA pol I.

This activity is responsible for removing the primer from the 5′ end of the newly synthesized chain.

It also plays an important role in the DNA repair mechanism.

Thymine dimers occur in DNA when the cell is exposed to ultraviolet light; such dimers interfere with the movement of the replication fork and block replication.

Therefore, the 5′ to 3′ exonuclease activity of pol I can correct such DNA damage by excision of pyrimidine dimer regions.

b. 3′ to 5′ exonuclease activity:

It involves the elimination of mismatched base pairs at the primer terminus; thus it functions as a proofreading enzyme.

The large fragment of polymerase I, known as the Klenow fragment (mol. wt = 68 kDa), has this activity.

Mismatched base pairs that arise during polymerization are corrected by the 3′ to 5′ exonuclease activity.

c. 5′ to 3′ polymerase activity:

The activity of this enzyme helps in the synthesis of small fragments of DNA and thus takes part in repair synthesis.

This helps in filling the gaps that result from removal of RNA primers.

Klenow fragment:

DNA polymerase I is not the primary enzyme of replication; instead it performs

a host of clean-up functions during replication, recombination, and repair. The

polymerase’s special functions are enhanced by its 5’→3’ exonuclease activity.

This activity, distinct from the 3’→5’ proofreading exonuclease is located in a

structural domain that can be separated from the enzyme by mild protease

treatment. When the 5’→3’exonuclease domain is removed, the remaining



fragment (Mr 68,000), the large fragment called the Klenow fragment, retains the polymerization and proofreading activities. The 5′→3′ exonuclease activity of intact DNA polymerase I can replace a segment of DNA (or RNA) paired to the template strand, in a process known as nick translation.

DNA polymerase III:

It is also known as replicase and is chiefly involved in DNA synthesis in the 5′ to 3′ direction.

It is the principal replicative DNA polymerase of E. coli.

This enzyme in its active form is associated with 9 proteins to form a holoenzyme having mol. wt. 140 kDa.

The smallest aggregate of subunits having enzyme activity is known as the “core enzyme”.

It has both 5′ to 3′ polymerization activity and 3′ to 5′ exonuclease activity.

Leading and lagging strand:

A replication fork (Growing point) is the point at which strands of parental

duplex DNA are separated so that replication can proceed.

A complex of proteins including DNA polymerase is found at the fork.

When the circular DNA chromosome of E. coli is copied, replication begins at a single point, the origin. Synthesis occurs at the replication fork, the place at which the DNA helix is unwound and individual strands are replicated.

Two replication forks move outward from the origin until they have copied the whole replicon, that portion of the genome that contains an origin and is replicated as a unit. When the replication forks move around the circle, a structure shaped like the Greek letter theta (θ) is formed. Finally, since the bacterial chromosome is a single replicon, the forks meet on the other side and two separate chromosomes are released.

In both bacteria and mammals, replication forks originate at a structure called a replication bubble, a local region where the two strands of the parental DNA helix have been separated from each other to serve as templates for DNA synthesis.



Events occurring

During replication the DNA double helix must be unwound to generate separate single strands. Helicases, which bind to AT-rich regions of DNA called replication origins, are responsible for DNA unwinding. These enzymes use energy from ATP to unwind short stretches of helix just ahead of the replication fork. Once the strands have separated, they are kept single through specific binding with single-stranded DNA binding proteins (SSBs).

Rapid unwinding can lead to tension and formation of supercoils or supertwists

in the helix. The tension generated by unwinding is relieved, and the

unwinding process is promoted by enzymes known as topoisomerases.

DNA gyrase is an E. coli topoisomerase that removes the supertwists produced

during replication.

DNA is probably replicated continuously by DNA polymerase III when the

leading strand is copied. Lagging strand replication is discontinuous, and the

fragments are synthesized in the 5′ to 3′ direction just as in leading strand

synthesis.

First, a special RNA polymerase called a primase synthesizes a short RNA

primer, usually around 10 nucleotides long, complementary to the DNA. It

appears that the primase requires the assistance of several other proteins, and

the complex of the primase with its accessory proteins is called the

primosome.

DNA polymerase III holoenzyme then synthesizes complementary DNA

beginning at the 3′ end of the RNA primer.

In order for DNA polymerases to move along and copy a duplex DNA, helicase must sequentially unwind the duplex and topoisomerase must remove the supercoils that form.

A major complication in the operation of a DNA replication fork arises from two properties: the two strands of the parental DNA duplex are antiparallel, and DNA polymerases (like RNA polymerases) can add nucleotides to the growing new strands only in the 5′→3′ direction.

Synthesis of one daughter strand, called the leading strand, can proceed continuously from a single RNA primer in the 5′→3′ direction, the same direction as movement of the replication fork. The problem comes in synthesis of the other daughter strand, called the lagging strand.

A cell accomplishes lagging strand synthesis by synthesizing a new primer

every few hundred bases or so on the second parental strand, as more of the

strand is exposed by unwinding. Each of these primers, base-paired to their

template strand, is elongated in the 5’→3’ direction, forming discontinuous

segments called Okazaki fragments.
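The discontinuous pattern can be sketched with a simple coordinate model. Everything here is an assumption for illustration: positions are counted from the origin outward along the lagging-strand template, the fragment size of 300 stands in for "every few hundred bases", and the function name is hypothetical.

```python
def okazaki_fragments(exposed_length, fragment_size):
    """Return (primer_site, end_site) pairs in the order they are made.

    Positions count from the origin in the direction the fork moves.
    Each fragment is primed at its fork-side end and is elongated back
    toward the origin, i.e. opposite to fork movement.
    """
    fragments = []
    for origin_side in range(0, exposed_length, fragment_size):
        fork_side = min(origin_side + fragment_size, exposed_length)
        fragments.append((fork_side, origin_side))
    return fragments

print(okazaki_fragments(1000, 300))
# [(300, 0), (600, 300), (900, 600), (1000, 900)]
```

Each new fragment begins further from the origin than the last, yet every fragment grows back toward the origin. This is why the lagging strand has to be assembled from separate pieces while the leading strand can be made continuously.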

Ligation or Nick translation:

The 5′ to 3′ exonuclease activity at a single-strand break (a nick) can occur simultaneously with polymerization. That is, as a 5′-P nucleotide is removed, a replacement can be made by the polymerizing activity. Since pol I cannot form a bond between a 3′-OH group and a 5′-monophosphate, the nick moves along the DNA molecule in the direction of synthesis. This movement is called nick translation.

The steps of the process are as follows:



In this process, an RNA or DNA strand paired to a DNA template is

simultaneously degraded by the 5’ to 3’ exonuclease activity of DNA polymerase

I and replaced by the polymerase activity of the same enzyme.

These activities have a role in both DNA repair and the removal of RNA primers

during replication (both described later).

The strand of nucleic acid to be removed (either DNA or RNA) is shown in

green, the replacement strand in red. DNA synthesis begins at a nick (a broken

phosphodiester bond, leaving a free 3’ hydroxyl and a free 5’ phosphate).

Polymerase I extends the nontemplate DNA strand and moves the nick along the

DNA—a process called nick translation.

A nick remains where DNA polymerase I dissociates, and is later sealed by

another enzyme.
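The movement of the nick can be followed on a toy duplex. This is a minimal sketch with assumed names and sequences: the old strand is written in lower case, replaced nucleotides in upper case, and removal plus replacement are treated as a single step.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def nick_translate(template_3to5, strand_5to3, nick, steps):
    """Move a nick `steps` positions in the 5'->3' direction.

    strand_5to3[:nick] ends in the free 3' hydroxyl and strand_5to3[nick:]
    begins with the free 5' phosphate. At each step one downstream
    nucleotide is removed (5'->3' exonuclease) and a new one is added
    (polymerase). Replaced positions are written in upper case.
    """
    strand = list(strand_5to3)
    for _ in range(steps):
        if nick >= len(strand):
            break
        strand[nick] = COMPLEMENT[template_3to5[nick]]   # old nucleotide out, new one in
        nick += 1
    return "".join(strand), nick

new_strand, new_nick = nick_translate("TACGGCTA", "atgccgat", nick=2, steps=3)
print(new_strand, new_nick)
```

The strand keeps its length and sequence, but the nick has moved three positions downstream. It remains unsealed until a ligase acts on it.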

TERMINATION OF REPLICATION

DNA replication stops when the polymerase complex reaches a termination

site on the DNA in E. coli. The Tus protein binds to these Ter sites and halts

replication. In many prokaryotes, replication stops randomly when the forks

meet.

Eventually, the two replication forks of the circular E. coli chromosome meet

at a terminus region containing multiple copies of a 20 bp sequence called

Ter (for terminus). The Ter sequences are arranged on the chromosome to

create a sort of trap that a replication fork can enter but cannot leave. The

Ter sequences function as binding sites for a protein called Tus (terminus

utilization substance). The Tus-Ter complex can arrest a replication fork

from only one direction.

Only one Tus-Ter complex functions per replication cycle—the complex first

encountered by either replication fork. Given that opposing replication forks generally halt when they collide, Ter sequences do not seem essential, but they may prevent overreplication by one replication fork in the event that the other is delayed or halted by an encounter with DNA damage or some other obstacle.

When either replication fork encounters a functional Tus-Ter complex, it

halts; the other fork halts when it meets the first (arrested) fork.



The final few hundred base pairs of DNA between these large protein

complexes are then replicated (by an as yet unknown mechanism),

completing two topologically interlinked (catenated) circular chromosomes.

DNA circles linked in this way are known as catenanes. Separation of the

catenated circles in E. coli requires topoisomerase IV (a type II

topoisomerase). The separated chromosomes then segregate into daughter

cells at cell division. The terminal phase of replication of other circular

chromosomes, including many of the DNA viruses that infect eukaryotic

cells, is similar.

DNA REPLICATION IN VIRUSES

In the replication of various human adenoviruses, entry takes place via interactions of

the fiber knob with specific receptors on the surface of a susceptible cell

followed by internalization via interactions between the penton base and

cellular integrins. After uncoating, the virus core is delivered to the nucleus,

which is the site of virus transcription, DNA replication, and assembly.

Virus infection mediates the shutdown of host DNA synthesis and later RNA

and protein synthesis. Transcription of the adenovirus genome by host RNA

polymerase II involves both DNA strands of the genome and initiates (in HAdV-

2) from five early (E1A, E1B, E2, E3, and E4), two intermediate, and the major

late (L) promoter.

All primary transcripts are capped and polyadenylated, with complex splicing

patterns producing families of mRNAs. In primate adenoviruses, one or two VA

RNA genes are usually present upstream from the main pTP coding region.

These are transcribed by cellular RNA polymerase III and facilitate translation

of late mRNAs and blocking of the cellular interferon response.

Corresponding VA RNA genes have not been identified in nonprimate

adenoviruses, although a nonhomologous VA RNA gene has been mapped in

some aviadenoviruses near the right end of the genome. More generally, the

replication of aviadenoviruses has been shown to involve significantly different

pathways from those characterized in human adenoviruses.



This is not unexpected, given the considerable differences in gene layout

between nonconserved regions of the genome.

About 40 different polypeptides (the largest number being in fowl adenoviruses

and the smallest in siadenoviruses) are produced. Almost a third of these

compose the virion, including a virus-encoded cysteine protease.

1. Adsorption of virions to the cell surface

2. Entry by endocytosis

3. Transport to the cell nucleus (route and mechanism not yet known);

4. Uncoating;

5. Transcription to produce early region mRNAs;

6. Translation to produce early proteins (T antigens);

7. Viral DNA replication;

8. Transcription to produce late region mRNAs;

9. Translation to produce late proteins (capsid proteins);

10. Assembly of progeny virions in the nucleus;

11. Entry of virions into cytoplasmic vesicles (mechanism unknown);

12. Release of virions from the cell by fusion of membrane vesicles with the plasma

membrane;

13. Released virion. Virions are most likely also released from cells at cell death

when virions have an opportunity to leak out of the nucleus.



In nonpermissive cells, the first six steps occur normally, but viral DNA

replication cannot occur and subsequent events do not take place.

M13 BACTERIOPHAGE REPLICATION

In DNA replication, the DNA polymerase cannot initiate the synthesis of a new

DNA strand and must rely on a priming device.

In general, an RNA primer is synthesized at or near a replication origin to start

synthesis of the leading strand. However, a DNA primer terminus can be

generated by a nuclease-generated nick at a specific place in some circular

duplex DNA, and replication will then proceed unidirectionally, as shown in

Figure. This mode of replication is called rolling circle replication and is found

for replication of the replicative form (RF) of single-stranded bacteriophage

genomes of Gram-negative bacteria and of the multicopy plasmids of Gram-

positive bacteria.

Rolling circle replication is also observed in the late stage of the replication of

the lambda phage genome and in the process of the conjugative transfer of

bacterial plasmids.



DNA synthesis initiates using the free 3′-OH end at the nick as a primer, and a

replication fork proceeds around the template. In the process, the newly

synthesized strand displaces the old strand from the template. In the case of

replication of the RF form of single-stranded phage genomes and of plasmids

of Gram-positive bacteria, the displaced old strand is cleaved off after one

round of replication and is converted into the circular, double-stranded form.
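The displacement step can be sketched by treating the circular template as a string that starts at the nick. The 7-letter "genome" and both function names are hypothetical and used only for illustration. Running the fork for more than one round without cleavage is included to show how tandem copies would arise.

```python
def rolling_circle(genome, rounds):
    """Return the displaced strand after `rounds` trips around the circle.

    The circular template is written as a linear string starting at the
    nick. Each trip of the fork displaces one more unit-length copy.
    """
    return genome * rounds

def cut_unit_lengths(displaced, unit_length):
    """Cleave a displaced strand into unit-length pieces."""
    return [
        displaced[i:i + unit_length]
        for i in range(0, len(displaced), unit_length)
    ]

genome = "GATTACA"                    # made-up sequence
single = rolling_circle(genome, 1)    # cleaved after one round
tandem = rolling_circle(genome, 3)    # three rounds without cleavage
print(single)
print(cut_unit_lengths(tandem, len(genome)))
```

One round yields a single unit-length strand. Several rounds yield a head-to-tail repeat that can be cut back into unit lengths.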

In contrast, in phage lambda replication, the replication fork proceeds for a number of revolutions around the template without cleavage of the displaced strand, and the displaced strand becomes double-stranded as it is peeled off. The linear concatemer thus created is cleaved into unit-length genomes and packaged into the phage particles. In the conjugation process of plasmids, the displaced strand is transferred into the new cell.

ASYMMETRIC REPLICATION

LOOPED

Mitochondrial DNA replication:

The origins of replicons in both prokaryotic and eukaryotic chromosomes are

static structures: they comprise sequences of DNA that are recognized in

duplex form and used to initiate replication at the appropriate time.

Initiation requires separating the DNA strands and commencing bidirectional

DNA synthesis. A different type of arrangement is found in mitochondria.

Replication starts at a specific origin in the circular duplex DNA. But initially
only one of the two parental strands (the H strand in mammalian
mitochondrial DNA) is used as a template for synthesis of a new strand.

Synthesis proceeds for only a short distance, displacing the original partner
(L) strand, which remains single-stranded. The condition of this region gives
rise to its name as the displacement or D loop.

DNA polymerases cannot initiate synthesis, but require a priming 3' end.
Replication at the H strand origin is initiated when RNA polymerase
transcribes a primer. 3' ends are generated in the primer by an endonuclease
that cleaves the DNA-RNA hybrid at several discrete sites.

The endonuclease is specific for the triple structure of DNA-RNA hybrid plus
the displaced DNA single strand. The 3' end is then extended into DNA by
the DNA polymerase.

A single D loop is found as an opening of 500-600 bases in mammalian
mitochondria. The short strand that maintains the D loop is unstable and
turns over; it is frequently degraded and resynthesized to maintain the
opening of the duplex at this site.

Some mitochondrial DNAs possess several D loops, reflecting the use of
multiple origins. The same mechanism is employed in chloroplast DNA,
where (in higher plants) there are two D loops.

To replicate mammalian mitochondrial DNA, the short strand in the D loop is
extended. The displaced region of the original L strand becomes longer,
expanding the D loop.

This expansion continues until it reaches a point about two-thirds of the way
around the circle. Replication of this region exposes an origin in the displaced
L strand. Synthesis of an H strand initiates at this site, which is used by a
special primase that synthesizes a short RNA.

The RNA is then extended by DNA polymerase, proceeding around the
displaced single-stranded L template in the opposite direction from L-strand
synthesis.

Because of the lag in its start, H-strand synthesis has proceeded only a third
of the way around the circle when L-strand synthesis finishes.

This releases one completed duplex circle and one gapped circle, which
remains partially single-stranded until synthesis of the H strand is completed.
Finally, the new strands are sealed to become covalently intact.

The existence of D loops exposes a general principle. An origin can be a
sequence of DNA that serves to initiate DNA synthesis using one strand as
template.

The opening of the duplex does not necessarily lead to the initiation of
replication on the other strand. In the case of mitochondrial DNA replication,
the origins for replicating the complementary strands lie at different
locations. Origins that sponsor replication of only one strand are also found
in the rolling circle mode of replication.

ROLLING CIRCLE

Phage ØX174 consists of a single-stranded circular DNA, known as the plus (+)

strand. A complementary strand, called the minus (-) strand, is synthesized.

This action generates the duplex circle shown at the top of the figure, which is

then replicated by a rolling circle mechanism.

The duplex circle is converted to a covalently closed form, which becomes

supercoiled. A protein coded by the phage genome, the A protein, nicks the (+)



strand of the duplex DNA at a specific site that defines the origin for

replication.

After nicking the origin, the A protein remains connected to the 5' end that it

generates, while the 3' end is extended by DNA polymerase.

The structure of the DNA plays an important role in this reaction, for the DNA

can be nicked only when it is negatively supercoiled.

The A protein is able to bind to a single-stranded decamer fragment of DNA

that surrounds the site of the nick. This suggests that the supercoiling is

needed to assist the formation of a single-stranded region that provides the A

protein with its binding site. (An enzymatic activity in which a protein cleaves
duplex DNA and binds to a released 5' end is sometimes called a relaxase.) The
nick generates a 3'-OH end and a 5'-phosphate end (covalently attached to the
A protein), both of which have roles to play in ØX174 replication.

Using the rolling circle, the 3'-OH end of the nick is extended into a new chain.

The chain is elongated around the circular (-) strand template, until it reaches

the starting point and displaces the origin. Now the A protein functions again.

It remains connected with the rolling circle as well as to the 5' end of the

displaced tail, and it is therefore in the vicinity as the growing point returns

past the origin. So the same A protein is available again to recognize the origin

and nick it, now attaching to the end generated by the new nick.

The cycle can be repeated indefinitely. Following this nicking event, the

displaced single (+) strand is freed as a circle. The A protein is involved in the

circularization. In fact, the joining of the 3' and 5' ends of the (+) strand

product is accomplished by the A protein as part of the reaction by which it is

released at the end of one cycle of replication, and starts another cycle.

The A protein has an unusual property that may be connected with these

activities. It is cis-acting in vivo. (This behavior is not reproduced in vitro, as
can be seen from its activity on any DNA template in a cell-free system.) The
implication is that in vivo the A protein synthesized by a particular genome can
attach only to the DNA of that genome. We do not know how this is
accomplished. However, its activity in vitro shows how it remains associated
with the same parental (-) strand template.

The A protein has two active sites; this may allow it to cleave the "new" origin

while still retaining the "old" origin; then it ligates the displaced strand into a

circle. The displaced (+) strand may follow either of two fates after

circularization.

During the replication phase of viral infection, it may be used as a template to

synthesize the complementary (-) strand. The duplex circle may then be used

as a rolling circle to generate more progeny.

During phage morphogenesis, the displaced (+) strand is packaged into the

phage virion.

SEMICONSERVATIVE REPLICATION

Definition

Each DNA strand serves as a template for the synthesis of a new strand,
producing two new DNA molecules, each with one new strand and one
old strand. This is semiconservative replication.

Processing



In the semiconservative mode, first proposed by Watson and Crick, each
parental DNA strand serves as a template for one new or daughter strand, and
as each new strand is formed, it is hydrogen-bonded to its parent template.
Thus, as replication proceeds, the parental double helix unwinds and then
rewinds again into two new double helices, each of which contains one
original parental strand and one newly formed daughter strand.

Experimental proof: Meselson- Stahl experiment

Aim: To prove that DNA replication of double stranded DNA follows semiconservative

mode of replication.

Principle: If the parental DNA carries a "heavy" density label because the organism has been

grown in medium containing a suitable isotope such as 15N, its strands can be

distinguished from those that are synthesized when the organism is transferred

to a medium containing normal "light" isotopes e.g. 14N. When DNA was

extracted from bacteria and its density measured by centrifugation, the DNA

formed bands corresponding to its density depicting the amount of parental

and newly synthesized DNA during the process of replication.

Procedure: A simple method was developed by the scientists by which the parental and

daughter strands could be distinguished.

Culture of bacteria ( E. coli ) was grown for many generations in growth

medium containing 15N- labeled NH4Cl as sole source of nitrogen ( called a

heavy medium ).

In this way the parent DNA was labeled with the heavy isotope 15N, thereby
increasing the density of the DNA.

The cells were transferred to a medium containing the common isotope of
nitrogen, 14N (light medium). At various times after transfer, samples of the cells were

collected and the DNA was isolated.

The DNA molecules were fragmented during isolation.



In the semiconservative mode, after one generation all the daughter molecules
would have one 15N and one 14N strand; such a duplex is called a hybrid molecule.
Hence all the daughter molecules would have the same density (hybrid density),
namely midway between that of (15N15N) and (14N14N) molecules.

When DNA was extracted from the bacteria and its density measured by
centrifugation in CsCl as a function of time after the change from heavy to light
medium, all the DNA had a hybrid density after one round of replication,
indicating that the semiconservative mode is correct.

The second experiment confirmed the structure of the (15N14N) DNA found after
one generation. In this experiment the hybrid DNA was denatured by heating to
100 °C and centrifuged in CsCl.

The heated DNA yielded two bands having the densities of denatured
single-stranded 15N and 14N DNA, showing that DNA of hybrid density did in
fact consist of one 14N and one 15N strand.

Result: During the two generations, the DNA formed bands corresponding to its

density— heavy for parental, hybrid for the first generation, and half hybrid

and half light in the second generation.
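The band pattern in the result can be reproduced with a little arithmetic. The sketch below is only an illustration, assuming an idealised culture in which every duplex replicates exactly once per generation after the shift to light medium; the function name `band_fractions` is hypothetical and not part of the original experiment.

```python
from fractions import Fraction

def band_fractions(generations):
    """Fractions of heavy, hybrid and light duplexes after the given
    number of generations in light (14N) medium, assuming strictly
    semiconservative replication of a fully 15N-labelled culture."""
    if generations == 0:
        return {"heavy": Fraction(1), "hybrid": Fraction(0), "light": Fraction(0)}
    # Each parental 15N strand survives intact and is paired with a 14N
    # strand, so 2 of the 2**n duplexes descended from one parent are hybrid.
    hybrid = Fraction(2, 2 ** generations)
    return {"heavy": Fraction(0), "hybrid": hybrid, "light": 1 - hybrid}

for n in range(4):
    bands = band_fractions(n)
    print(n, {name: str(value) for name, value in bands.items()})
```

Under these assumptions generation 1 is entirely hybrid and generation 2 is half hybrid and half light, matching the bands described above; a conservative mode would instead keep a heavy band in every generation.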



PRIMER

Introduction

A primer is a strand segment (complementary to the template) with a free
3’-hydroxyl group to which a nucleotide can be added. The free 3’ end of the
primer is called the primer terminus. It is required during the initiation
process of replication.

Characteristic features

A part of the new strand must already be in place, as all DNA
polymerases can only add nucleotides to a preexisting strand.

Most primers are oligonucleotides. These are RNA rather than DNA.

A specialized RNA polymerase called primase forms a short RNA primer
complementary to the unwound template strand.

TEMPLATE:

Introduction

All DNA polymerases require a template for DNA replication. It is required
during the initiation process of replication.

Characteristic features

It is complementary to the newly synthesized strand in replication.

The polymerization reaction is guided by a template DNA strand according to

the base-pairing rules.

As predicted by Watson and Crick: where a guanine is present in the
template, a cytosine deoxynucleotide is added to the new strand, and where
an adenine is present, a thymine is added, and vice versa.



The two DNA strands are antiparallel, thus the strand serving as the

template is read from its 3’end toward its 5’end.
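The base-pairing rules and the antiparallel reading direction can be made concrete with a short sketch. It is an illustration only; the function name `synthesize_new_strand` and the example sequence are hypothetical.

```python
# Watson-Crick pairing rules for DNA.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_new_strand(template_5_to_3):
    """Return the new strand, written 5'->3', for a template written 5'->3'."""
    # The template is read from its 3' end toward its 5' end, so walk the
    # template string backwards while the new strand grows 5'->3'.
    return "".join(PAIRING[base] for base in reversed(template_5_to_3))

print(synthesize_new_strand("GATTACA"))  # TGTAATC
```

Reversing the template before pairing is what encodes the antiparallel arrangement: the first base added to the new strand pairs with the 3'-terminal base of the template.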

CONCATEMER FORMATION – P1

REPLICATION FORK – LEADING AND LAGGING STRANDS

Introduction:

A replication fork (Growing point) is the point at which strands of parental

duplex DNA are separated so that replication can proceed.

A complex of proteins including DNA polymerase is found at the fork.

When the circular DNA chromosome of E. coli is copied, replication begins at a
single point, the origin. Synthesis occurs at the replication fork, the place
at which the DNA helix is unwound and individual strands are replicated.
Two replication forks move outward from the origin until they have copied the
whole replicon, that portion of the genome that contains an origin and is
replicated as a unit. When the replication forks move around the circle, a
structure shaped like the Greek letter theta (θ) is formed. Finally, since the
bacterial chromosome is a single replicon, the forks meet on the other side and
two separate chromosomes are released.

In both bacteria and mammals replication forks originate at a structure called
a replication bubble, a local region where the two strands of the parental DNA
helix have been separated from each other to serve as templates for DNA
synthesis.

Events occurring

During replication the DNA double helix must be unwound to generate separate
single strands. Helicases, which bind at AT-rich regions of DNA called
replication origins, are responsible for DNA unwinding. These enzymes
use energy from ATP to unwind short stretches of helix just ahead of
the replication fork. Once the strands have separated, they are kept single
through specific binding with single-stranded DNA binding proteins (SSBs).

Rapid unwinding can lead to tension and formation of supercoils or supertwists

in the helix. The tension generated by unwinding is relieved, and the

unwinding process is promoted by enzymes known as topoisomerases.

DNA gyrase is an E. coli topoisomerase that removes the supertwists produced

during replication.

DNA is probably replicated continuously by DNA polymerase III when the

leading strand is copied. Lagging strand replication is discontinuous, and the

fragments are synthesized in the 5′ to 3′ direction just as in leading strand

synthesis.

First, a special RNA polymerase called a primase synthesizes a short RNA

primer, usually around 10 nucleotides long, complementary to the DNA. It

appears that the primase requires the assistance of several other proteins, and

the complex of the primase with its accessory proteins is called the

primosome.

DNA polymerase III holoenzyme then synthesizes complementary DNA

beginning at the 3′ end of the RNA primer.

In order for DNA polymerases to move along and copy a duplex DNA, helicase
must sequentially unwind the duplex and topoisomerase must remove the
supercoils that form.

A major complication in the operation of a DNA replication fork arises from two
properties: the two strands of the parental DNA duplex are antiparallel, and
DNA polymerases (like RNA polymerases) can add nucleotides to the growing
new strands only in the 5’→3’ direction.

Synthesis of one daughter strand, called the leading strand, can
proceed continuously from a single RNA primer in the 5’→3’ direction, the same
direction as movement of the replication fork. The problem comes in synthesis
of the other daughter strand, called the lagging strand.



A cell accomplishes lagging strand synthesis by synthesizing a new primer

every few hundred bases or so on the second parental strand, as more of the

strand is exposed by unwinding. Each of these primers, base-paired to their

template strand, is elongated in the 5’→3’ direction, forming discontinuous

segments called Okazaki fragments.
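How many primers the lagging strand needs follows directly from the fragment length. The sketch below is a rough illustration, assuming a fixed fragment length of 200 bases as a stand-in for "every few hundred bases"; the function name `okazaki_fragments` and the numbers are hypothetical.

```python
def okazaki_fragments(template_length, fragment_length=200):
    """Split an exposed lagging-strand template into (start, end) spans,
    one span per primer and Okazaki fragment (end is exclusive)."""
    return [
        (start, min(start + fragment_length, template_length))
        for start in range(0, template_length, fragment_length)
    ]

fragments = okazaki_fragments(1000, fragment_length=400)
print(fragments)       # [(0, 400), (400, 800), (800, 1000)]
print(len(fragments))  # number of primers needed: 3
```

By comparison, the leading strand over the same stretch needs only the single primer laid down at the origin.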

ENZYMES INVOLVED AT DIFFERENT STEPS OF REPLICATION



FIDELITY OF REPLICATION

Fidelity of Polymerases: DNA polymerases exhibit varying degrees of fidelity, ranging from one misincorporation
per 5000 to one per 10⁷ nucleotides polymerized.

Those that incorporate the proper templated nucleotide at high efficiency are

termed high-fidelity enzymes, and those that frequently misinsert a nucleotide

are termed low-fidelity. Several polymerases contain a 3′-5′ exonuclease

subdomain (ie, a proofreading subunit) which increases the fidelity of the

enzyme by approximately 10- to 100-fold.

The fidelity of polymerases is determined by one of several procedures. Fidelity

of DNA synthesis was initially measured by utilizing polynucleotide templates

consisting of only one or two types of nucleotides, such as an alternating poly

d(A-T) template, and measuring the extent of misincorporation of radioactive

cytosine or guanine nucleotides.

Greater sensitivity has been obtained with biological reversion assays, in which
misincorporation by DNA polymerase results in the conversion of an amber
mutation (i.e., a stop codon) in a plasmid into a codon that allows an active,
full-length protein to be made.

The forward mutational assays developed more recently offer the additional

advantage of determining the mutational spectrum, that is, the types of

misincorporated nucleotides catalyzed by the polymerase.

LacZ has been most extensively utilized in these forward mutational assays as

a reporter gene for studies on the mutational spectrum of DNA polymerases.

Upon transformation of the copied plasmid (which encodes the LacZ gene) into
E. coli and plating the transformed bacteria in the presence of X-gal (which is
converted to a blue-staining metabolite by the protein encoded by the LacZ
gene, β-galactosidase), the fidelity is determined simply by counting the
number of blue and white colonies resulting from functional (nonmutated) or
nonfunctional (mutated) LacZ gene, respectively. Sequencing the LacZ gene
mutants determines the mutational spectrum. The fidelity of incorporation is
also determined kinetically by comparing the ratio kcat/Km of the incorrect
nucleotide to that of the correct nucleotide; this ratio directly reflects the
efficiency of nucleotide incorporation.

As a second step, the same assay measures the fidelity of extension by using

primers that terminate in a noncomplementary nucleotide and measuring the

incorporation of complementary nucleotides onto the end of this primer.
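Both fidelity measures described above come down to simple ratios. The sketch below is an illustration only: the function names are hypothetical, and the kinetic constants and colony counts are invented numbers chosen to make the arithmetic easy to follow, not measured values.

```python
def catalytic_efficiency(kcat, km):
    """Efficiency of nucleotide incorporation, kcat/Km."""
    return kcat / km

def misinsertion_frequency(kcat_wrong, km_wrong, kcat_right, km_right):
    """Ratio of kcat/Km for the incorrect nucleotide to that for the correct one."""
    return (catalytic_efficiency(kcat_wrong, km_wrong)
            / catalytic_efficiency(kcat_right, km_right))

def mutant_fraction(blue_colonies, white_colonies):
    """Fraction of colonies carrying a nonfunctional (mutated) LacZ gene."""
    return white_colonies / (blue_colonies + white_colonies)

# Hypothetical values: the incorrect nucleotide is incorporated more slowly
# (lower kcat) and bound more weakly (higher Km) than the correct one.
f = misinsertion_frequency(kcat_wrong=0.02, km_wrong=400.0,
                           kcat_right=20.0, km_right=4.0)
print(f"misinsertion frequency ~ {f:.1e}")
print(f"mutant fraction = {mutant_fraction(9950, 50):.3f}")
```

A smaller misinsertion frequency means a higher-fidelity enzyme. The white-colony fraction is a cruder readout, since it counts only those errors that inactivate LacZ.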

Processivity: Processivity refers to the number of nucleotides incorporated per binding event

of the polymerase with the template-primer complex. The processivity values of

different polymerases range from one nucleotide to about ten thousand. The

processivities of several polymerases involved in genomic replication are

enhanced upon binding to a second protein, termed the processivity factor.

For example, to fulfill their roles efficiently during DNA replication in
eukaryotes, DNA polymerases δ and ε associate with proliferating cell nuclear
antigen (PCNA), a homotrimer of 36-kDa subunits that forms a
“sliding clamp”. Phage T4 gene 45 protein and the E. coli β subunit similarly augment
the processivity of T4 DNA pol and pol III, respectively, by acting as “sliding
clamps” bound to the polymerase, thus preventing its dissociation from DNA.

PROOFREADING:

One mechanism intrinsic to virtually all DNA polymerases is a separate 3’→5’
exonuclease activity that double-checks each nucleotide after it is added. This
nuclease activity permits the enzyme to remove a newly added nucleotide and
is highly specific for mismatched base pairs.

If the polymerase has added the wrong nucleotide, translocation of the enzyme

to the position where the next nucleotide is to be added is inhibited. This

kinetic pause provides the opportunity for a correction. The 3’→5’ exonuclease

activity removes the mispaired nucleotide, and the polymerase begins again.



This activity, known as proofreading, is not simply the reverse of the

polymerization reaction because pyrophosphate is not involved.

The polymerizing and proofreading activities of a DNA polymerase can be

measured separately.



Proof reading improves the inherent accuracy of the polymerization reaction

10² to 10³ fold.

In the monomeric DNA polymerase I, the polymerizing and proofreading

activities have separate active sites within the same polypeptide.

When base selection and proofreading are combined, DNA polymerase leaves
behind one net error for every 10⁶ to 10⁸ bases added. Yet the measured
accuracy of replication in E. coli is higher still.

The additional accuracy is provided by a separate enzyme system that repairs
the mismatched base pairs remaining after replication.
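The figures quoted above imply an accuracy for base selection alone. The short calculation below is a sketch that assumes the two contributions simply multiply, and that the low end of one range pairs with the low end of the other; the function name is hypothetical.

```python
def bases_per_error_before_proofreading(bases_per_net_error, proofreading_gain):
    """Bases polymerized per error by base selection alone, assuming the
    net accuracy is base-selection accuracy times the proofreading gain."""
    return bases_per_net_error // proofreading_gain

# One net error per 10**6 bases with a 10**2-fold proofreading gain.
print(bases_per_error_before_proofreading(10**6, 10**2))  # 10000
# One net error per 10**8 bases with a 10**3-fold proofreading gain.
print(bases_per_error_before_proofreading(10**8, 10**3))  # 100000
```

On these assumptions base selection alone makes roughly one error per 10⁴ to 10⁵ bases, and proofreading supplies the remaining two to three orders of magnitude.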

EXTRACHROMOSOMAL REPLICONS

Two basic types of extrachromosomal replicons are found in bacterial cells:

Plasmids are small circular double-stranded DNA molecules which individually

contain very few genes. Their existence is intracellular, being vertically

distributed to daughter cells following host cell division, but they can be

transferred horizontally to neighboring cells during bacterial conjugation.

Natural examples include plasmids which carry the sex factor (F) and those

which carry drug-resistance genes.

Bacteriophages are viruses which infect bacterial cells. DNA-containing

bacteriophages often have genomes containing double-stranded DNA which

may be circular or linear. Unlike plasmids, they can exist extracellularly. The

mature virus particle (virion) has its genome encased in a protein coat so as to

facilitate adsorption and entry into a new host cell.

In order for naturally occurring replicons to be used as vector molecules for
cell-based DNA cloning, various modifications need to be made. Similarly, the

host cells that are used for cloning are specialized cells whose genotype has

been selected to optimize their use in DNA cloning. Typically, cloning systems

are designed to ensure that joining of the foreign DNA fragment occurs at a

unique location in the vector molecule. Additionally, they have in-built

selection systems so that cells which contain the relevant vector molecule can

be specifically selected. In many cases, there are additional screening systems

to ensure detection and propagation of cells containing recombinant DNA.



UNIT – 3 TRANSCRIPTION: Transcription factors and machinery, formation of initiation complex, transcription activators and repressors, RNA polymerases. Initiation, elongation and termination. Heat shock response. Inhibitors of RNA synthesis and their mechanism. Polycistronic and monocistronic mRNA. Control of elongation and termination. Alternate sigma factors. Post transcriptional modifications of mRNA – capping, editing, splicing, polyadenylation, modifications of tRNA and rRNA.

TRANSCRIPTION FACTORS AND MACHINERY

The transcription reaction can be divided into four stages, in which a bubble is
created, RNA synthesis begins, the bubble moves along the DNA, and finally is
terminated:

FORMATION OF INITIATION COMPLEX

Template recognition begins with the binding of RNA polymerase to the
double-stranded DNA at a promoter to form a "closed complex". Then the strands of
DNA are separated to form the "open complex" that makes the template strand
available for base pairing with ribonucleotides.

The transcription bubble is created by a local unwinding that begins at the site

bound by RNA polymerase.

Initiation describes the synthesis of the first nucleotide bonds in RNA. The

enzyme remains at the promoter while it synthesizes the first ~9 nucleotide

bonds. The initiation phase is protracted by the occurrence of abortive events,

in which the enzyme makes short transcripts, releases them, and then starts

synthesis of RNA again.

The initiation phase ends when the enzyme succeeds in extending the chain

and clears the promoter. The sequence of DNA needed for RNA polymerase to

bind to the template and accomplish the initiation reaction defines the promoter.

Abortive initiation probably involves synthesizing an RNA chain that fills the

active site. If the RNA is released, the initiation is aborted and must start

again. Initiation is accomplished if and when the enzyme manages to move

along the template to move the next region of the DNA into the active site.



During elongation the enzyme moves along the DNA and extends the growing

RNA chain. As the enzyme moves, it unwinds the DNA helix to expose a new

segment of the template in single-stranded condition. Nucleotides are

covalently added to the 3' end of the growing RNA chain, forming an RNA-DNA

hybrid in the unwound region. Behind the unwound region, the DNA template

strand pairs with its original partner to reform the double helix. The RNA

emerges as a free single strand. Elongation involves the movement of the

transcription bubble by a disruption of DNA structure, in which the template

strand of the transiently unwound region is paired with the nascent RNA at the

growing point.

Termination involves recognition of the point at which no further bases should

be added to the chain. To terminate transcription, the formation of

phosphodiester bonds must cease, and the transcription complex must come

apart. When the last base is added to the RNA chain, the transcription bubble

collapses as the RNA-DNA hybrid is disrupted, the DNA reforms in duplex

state, and the enzyme and RNA are both released. The sequence of DNA

required for these reactions defines the terminator.



TRANSCRIPTION ACTIVATORS AND REPRESSORS

ACTIVATORS:

Regulatory DNA-binding proteins are multi-functional. Aside from their DNA-

binding property, they also have the ability to register regulatory signals and

transmit these on to the transcription apparatus.

Specific DNA Binding: Regulatory DNA-binding proteins generally display specific and selective
DNA-binding capacity. In this way, only those genes which possess a copy of a
particular DNA-binding element are subjected to regulation by the
corresponding binding protein.

Registering a Regulatory Signal: Activation and Inactivation: A regulatory DNA-binding protein possesses structural elements for the

registration of incoming signal, which leads to a change in concentration of the

active binding protein. The activation (or inactivation) of the binding protein

can be connected with a change in the ability to bind DNA, or can influence the



capacity of the protein to interact with the transcription apparatus and with

chromatin-modifying proteins.

Communication with the Transcription Apparatus: The DNA-binding protein must be capable of transmitting signals to the

transcription apparatus via protein-protein interactions. Distinct regions of

transcription factors contain interaction motifs that bind to and recruit protein

components of the transcription apparatus. DNA binding alone can be ascribed

the function of increasing the effective concentration of the transcription

regulator at the site of the transcription apparatus.

REPRESSORS

Regulatory DNA-binding proteins are controlled by a multitude of mechanisms.

These controls operate at the level of the concentration of the binding protein

or they act on preexisting DNA-binding proteins by post-translational

mechanisms.

In the latter case the control may influence the DNA-binding activity of the

protein or it may change the ability of the protein to communicate with the

transcription apparatus or with chromatin components.

Binding of Effector Molecules: Low-molecular-weight effectors are commonly employed in bacteria to change

the DNA-binding activity of repressors or transcriptional activators and to

control the amount of active DNA-binding proteins. This type of regulatory

mechanism is frequently used for metabolic pathways, as in, for example, the

biosynthesis and degradation of amino acids. The effector molecules represent

components arising from the particular metabolic pathway. The goal of this

regulation is to adjust the transcription rate to the current demand of the gene

product. The binding of low-molecular-weight effectors to regulatory DNA-

binding protein can lead to an increase or decrease in the affinity of the protein

for its recognition sequence.

The strategies and mechanisms of action of effector molecules on regulatory
DNA-binding proteins can be elucidated using the example of the Trp repressor
of E. coli. In this system binding of the effector increases the affinity of the
binding protein for its DNA element.

Regulation of the Trp operon in E. coli.

The Trp repressor requires tryptophan (Trp) in order to bind its cognate
DNA-binding element. In the absence of tryptophan, the Trp repressor cannot bind to the
regulatory sequence and is therefore inactive. Upon an increase in the tryptophan
concentration, tryptophan binds to the Trp repressor and transforms it into a
binding-proficient form. The DNA-bound Trp repressor prevents the
transcription of the structural genes, and the biosynthesis of tryptophan is
halted.
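The logic of this switch can be written as a two-state sketch. It is a deliberately simplified illustration that treats tryptophan as simply present or absent and ignores concentrations and any other layer of control; the function names are hypothetical.

```python
def trp_repressor_binds_operator(tryptophan_present):
    """The Trp repressor is binding-proficient only with tryptophan bound."""
    return tryptophan_present

def structural_genes_transcribed(tryptophan_present):
    """The structural genes are transcribed only when no repressor is bound."""
    return not trp_repressor_binds_operator(tryptophan_present)

for trp in (False, True):
    print(f"tryptophan present: {trp!s:5}  "
          f"repressor bound: {trp_repressor_binds_operator(trp)!s:5}  "
          f"genes transcribed: {structural_genes_transcribed(trp)}")
```

The end product therefore switches off its own synthesis, which is the sense in which the effector adjusts the transcription rate to current demand.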

RNA POLYMERASES

RNA-dependent RNA polymerase (RdRP or RDR), or RNA replicase, is an
enzyme (EC 2.7.7.48) that catalyzes the replication of RNA from an RNA
template.

This is in contrast to a typical DNA-dependent RNA polymerase, which

catalyzes the transcription of RNA from a DNA template.

RNA-dependent RNA polymerase (RdRp) is an essential protein encoded in the

genomes of all RNA-containing viruses with no DNA stage.

It catalyses synthesis of the RNA strand complementary to a given RNA
template. The RNA replication process is a two-step mechanism.

First, the initiation step of RNA synthesis begins at or near the 3' end of the
RNA template by means of a primer-independent (de novo) mechanism, or a
primer-dependent mechanism that utilizes a viral protein genome-linked (VPg) primer.

For synthesis of an RNA strand complementary to one of two DNA strands in a

double helix, the DNA is transiently unwound.

MOLECULAR COMPOSITION

About 17 bp are unwound at any given time. RNA polymerase and the bound

transcription bubble move from left to right along the DNA as shown;

facilitating RNA synthesis. The DNA is unwound ahead and rewound behind as

RNA is transcribed. Red arrows show the direction in which the DNA must

rotate to permit this process. As the DNA is rewound, the RNA-DNA hybrid is

displaced and the RNA strand extruded. The RNA polymerase is in close

contact with the DNA ahead of the transcription bubble, as well as with the

separated DNA strands and the RNA within and immediately behind the

bubble. A channel in the protein funnels new nucleoside triphosphates (NTPs)

to the polymerase active site. The polymerase footprint encompasses about 35

bp of DNA during elongation.



Catalytic mechanism of RNA synthesis by RNA polymerase. The addition of nucleotides involves an attack by the 3’-hydroxyl group at the
end of the growing RNA molecule on the α phosphate of the incoming NTP. The

reaction involves two Mg2+ ions, coordinated to the phosphate groups of the

incoming NTP and to three Asp residues (Asp460, Asp462, and Asp464 in the

β’ subunit of the E. coli RNA polymerase), which are highly conserved in the

RNA polymerases of all species. One Mg2+ ion facilitates attack by the

3’hydroxyl group on the α phosphate of the NTP; the other Mg2+ ion facilitates

displacement of the pyrophosphate; and both metal ions stabilize the

pentacovalent transition state.

INITIATION

The steps of initiation are as follows:

The enzyme recognizes a region called a promoter, which lies just “upstream” of the gene. Initiation of RNA synthesis at random points in a DNA molecule would be an

extraordinarily wasteful process. Instead, an RNA polymerase binds to specific

sequences in the DNA called promoters, which direct the transcription of

adjacent segments of DNA (genes).



In E. coli, RNA polymerase binding occurs within a region stretching from about

70 bp before the transcription start site to about 30 bp beyond it. By

convention, the DNA base pairs that correspond to the beginning of an RNA

molecule are given positive numbers, and those preceding the RNA start site

are given negative numbers. The promoter region thus extends between

positions -70 and +30.

Analyses and comparisons of the most common class of bacterial promoters
(those recognized by an RNA polymerase holoenzyme containing σ70) have
revealed similarities in two short sequences centered about positions -10 and
-35.

These sequences are important interaction sites for the σ70 subunit. Although

the sequences are not identical for all bacterial promoters in this class, certain

nucleotides that are particularly common at each position form a consensus sequence.



The consensus sequence at the -10 region is (5’) TATAAT (3’); the consensus

sequence at the - 35 region is (5’) TTGACA (3’).

A third AT-rich recognition element, called the UP (upstream promoter)

element, occurs between positions -40 and -60 in the promoters of certain

highly expressed genes.

The UP element is bound by the α subunit of RNA polymerase. The efficiency

with which an RNA polymerase binds to a promoter and initiates transcription

is determined in large measure by these sequences, the spacing between them,

and their distance from the transcription start site.

The pathway of transcription initiation consists of two major parts, binding and

initiation, each with multiple steps. First, the polymerase binds to the

promoter, forming, in succession, a closed complex (in which the bound DNA is

intact) and an open complex (in which the bound DNA is intact and partially

unwound near the -10 sequence).

Second, transcription is initiated within the complex, leading to a

conformational change that converts the complex to the elongation form,

followed by movement of the transcription complex away from the promoter

(promoter clearance). Any of these steps can be affected by the specific makeup

of the promoter sequences. The σ subunit dissociates as the polymerase enters

the elongation phase of transcription.

E. coli has other classes of promoters, bound by RNA polymerase holoenzymes

with different σ subunits. An example is the promoters of the heat-shock genes.

The products of this set of genes are made at higher levels when the cell has

received an insult, such as a sudden increase in temperature. RNA polymerase

binds to the promoters of these genes only when σ70 is replaced with the σ32

(Mr 32,000) subunit, which is specific for the heat-shock promoters.

By using different σ subunits the cell can coordinate the expression of sets of

genes, permitting major changes in cell physiology.

The polymerase binds tightly to the promoter and causes localized melting, or

separation, of the two DNA strands within the promoter. At least 12 bp are

melted.



Next, the polymerase starts building the RNA chain. The substrates, or

building blocks, it uses for this job are the four ribonucleoside triphosphates: ATP, GTP, CTP, and UTP. The first, or initiating, substrate is usually a purine

nucleotide.

After the first nucleotide is in place, the polymerase joins a second nucleotide

to the first, forming the initial phosphodiester bond in the RNA chain.

Several nucleotides may be joined before the polymerase leaves the promoter

and elongation begins.

ELONGATION

In the region being transcribed, the DNA double helix is unwound by about a

turn to permit the DNA’s template (antisense) strand to form a short segment of DNA–RNA

hybrid double helix with the RNA’s 3’ end. As the RNAP advances along the

DNA template (here to the right), the DNA unwinds ahead of the RNA’s growing

3’ end and rewinds behind it, thereby stripping the newly synthesized RNA

from the template (antisense) strand.
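The base-pairing rule behind the DNA–RNA hybrid can be illustrated with a minimal Python sketch. It assumes the template strand is supplied in the 3'→5' direction, so that the RNA comes out 5'→3'; the function name is hypothetical and the sequence is invented.

```python
# Pairing of template DNA bases with incoming ribonucleotides.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA (written 5'->3') that pairs with a template
    strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5.upper())


print(transcribe("TACGGT"))
```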

One way this might occur is by the RNAP following the path of the template

strand about the DNA double helix, in which case the transcript would become

wrapped about the DNA once per duplex turn.

A second and more plausible possibility is that the RNA moves in a straight line

while the DNA rotates beneath it. In this case the RNA would not wrap around

the DNA but the DNA would become overwound ahead of the advancing

transcription bubble and unwound behind it (consider the consequences of

placing your finger between the twisted DNA strands in this model and pushing

toward the right). The model presumes that the ends of the DNA, as well as the

RNAP, are prevented from rotating by attachments within the cell.

TERMINATION

There are two types of terminators in E. coli. A core enzyme can terminate in

vitro at certain sites in the absence of any other factor. These sites are called

intrinsic terminators.

Rho-dependent terminators are defined by the need for addition of rho factor

(ρ) in vitro; and mutations show that the factor is involved in termination in

vivo.

Intrinsic terminators have two structural features: a

hairpin in the secondary structure; and a region that is rich in U residues at



the very end of the unit. Both features are needed for termination. The hairpin

usually contains a G-C-rich region near the base of the stem. The typical

distance between the hairpin and the U-rich region is 7-9 bases. There are

~1100 sequences in the E. coli genome that fit these criteria, suggesting that

about half of the genes have intrinsic terminators.
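The two features of an intrinsic terminator can be searched for with a simple Python sketch. All names and numeric thresholds (stem length, loop sizes, GC fraction, length of the U run) are assumptions chosen for illustration; only the hairpin plus U-rich pattern and the allowed gap of up to 9 bases come from the description above.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    return "".join(PAIR[b] for b in reversed(rna))


def find_intrinsic_terminators(rna, stem=6, loops=range(3, 9), min_gc=0.5,
                               max_gap=9, u_run=6, min_u=5):
    """Return (hairpin_start, hairpin_end, u_region_start) tuples.

    A candidate is a perfect stem (assumed length `stem`, GC fraction
    of at least `min_gc`) closed by a short loop, followed within
    `max_gap` bases by a window of `u_run` bases holding at least
    `min_u` U residues.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        if sum(b in "GC" for b in left) / stem < min_gc:
            continue
        for loop in loops:
            r0 = i + stem + loop
            right = rna[r0:r0 + stem]
            if len(right) < stem or right != reverse_complement(left):
                continue
            end = r0 + stem
            for gap in range(max_gap + 1):
                window = rna[end + gap:end + gap + u_run]
                if len(window) == u_run and window.count("U") >= min_u:
                    hits.append((i, end, end + gap))
                    break
    return hits


# Invented sequence: GC-rich stem, 4-base loop, then a run of U residues.
example_rna = "AA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "AA"
print(find_intrinsic_terminators(example_rna))
```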

Rho-dependent:

Rho factor is an essential protein in E. coli. It functions solely at the stage of

termination. It is a ~275 kD hexamer of identical subunits. The subunit has an

RNA-binding domain and an ATP hydrolysis domain. Rho is a member of the

family of hexameric ATP-dependent helicases that function by passing nucleic

acid through the hole in the middle of the hexamer formed from the RNA-

binding domains of the subunits. Rho functions as an ancillary factor for RNA

polymerase; typically its maximum activity in vitro is displayed when it is

present at ~ 10% of the concentration of the RNA polymerase.

Rho-dependent terminators account for about half of E. coli terminators. They

were discovered in phage genomes, where they have been most fully

characterized. The sequences required for rho-dependent termination are 50-

90 bases long and lie upstream of the termination site. Their common feature

is that the RNA is rich in C residues and poor in G residues.
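The "rich in C, poor in G" criterion can be expressed as a sliding-window count, as in the Python sketch below. The window length of 70 matches the 50-90 base range given above, but the 30% and 15% cut-offs are arbitrary assumptions for illustration, not published thresholds.

```python
def c_rich_g_poor_windows(rna, window=70, min_c=0.30, max_g=0.15):
    """Return (start, C_fraction, G_fraction) for every window whose
    composition is C-rich and G-poor.

    Assumption: the cut-offs min_c and max_g are illustrative only.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - window + 1):
        w = rna[i:i + window]
        c_frac = w.count("C") / window
        g_frac = w.count("G") / window
        if c_frac >= min_c and g_frac <= max_g:
            hits.append((i, round(c_frac, 2), round(g_frac, 2)))
    return hits


# Invented transcript: a G-rich flank, a C-rich G-free stretch, a G-rich flank.
transcript = "G" * 20 + "CCAU" * 20 + "G" * 20
starts = [start for start, _, _ in c_rich_g_poor_windows(transcript)]
print(starts)
```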



A rho-dependent terminator has a sequence rich in C and poor in G preceding

the actual site(s) of termination.

The sequence is shown in the form of the RNA. It represents the 3' end of the

RNA.

An individual rho factor acts processively on a single RNA substrate. Rho's key

function is its helicase activity, for which energy is provided by an RNA-

dependent ATP hydrolysis. The initial binding site for rho is an extended (~70

nucleotide) single-stranded region in the RNA upstream of the terminator. Rho

binds to RNA and then uses its ATPase activity to provide the energy to

translocate along the RNA until it reaches the RNA-DNA helical region, where it

unwinds the duplex structure.

Rho-independent termination:

The signal of arrest at the end of the gene is provided by a G-C-rich

palindromic sequence, followed by a poly-A sequence; transcription of these



sequences leads to the formation of a stable hairpin directly followed by a poly-

U sequence.

Once the hairpin structure has induced pausing of the polymerase, the poly-

U/poly-A heteroduplex allows further release of the transcript and the enzyme.

Transcription of palindromic and polyA sequences from the factor-independent terminator leads to the formation of a hairpin structure (h) and to a poly-U sequence, respectively. The hairpin induces pausing of the polymerase; together with the adjacent polyA–polyU hybrid, this promotes transcription arrest and release of the RNA transcript and the RNA polymerase.

HEAT SHOCK RESPONSE

The cellular response to heat shock includes the transcriptional up-regulation

of genes encoding heat shock proteins (HSPs) as part of the cell's internal

repair mechanism. They are also called stress-proteins and respond to heat,

cold and oxygen deprivation by activating several cascade pathways. HSPs are

also present in cells under perfectly normal conditions. Some HSPs, called

chaperones, ensure that the cell’s proteins are in the right shape and in the

right place at the right time. For example, HSPs help new or misfolded proteins

to fold into their correct three-dimensional conformations, which is essential

for their function. They also shuttle proteins from one compartment to another

inside the cell, and target old or terminally misfolded proteins to proteases for



degradation. Heat shock proteins are also believed to play a role in the

presentation of pieces of proteins (or peptides) on the cell surface to help the

immune system recognize diseased cells.

The up-regulation of HSPs during heat shock is generally controlled by a single

transcription factor; in eukaryotes this regulation is performed by heat shock

factor (HSF), while σ32 is the heat shock sigma factor in Escherichia coli.

The σ subunit of the E. coli RNA polymerase holoenzyme is a specificity factor

that mediates promoter recognition and binding. Most E. coli promoters are

recognized by a single σ subunit (Mr 70,000), σ70. Under some conditions,

some of the σ70 subunits are replaced by another specificity factor. One

notable case arises when the bacteria are subjected to heat stress, leading to

the replacement of σ70 by σ32 (Mr 32,000). When bound to σ32, RNA

polymerase is directed to a specialized set of promoters with a different

consensus sequence.

These promoters control the expression of a set of genes that encode the heat-

shock response proteins. Thus, through changes in the binding affinity of the

polymerase that direct it to different promoters, a set of genes involved in

related processes is coordinately regulated.

INHIBITORS OF RNA SYNTHESIS AND THEIR MECHANISM

The enzymes that transcribe DNA into RNA (DNA-dependent RNA

polymerases) have structural differences in eukaryotic and prokaryotic cells, as

indicated by the fact, among others, that there are substances which inhibit

their function selectively in prokaryotic cells (streptolydigin and the ansa

antibiotics, such as rifamycins and streptovaricin) and in eukaryotic cells (α-

amanitin).

Ansa antibiotics inhibit the initiation of RNA synthesis, whereas streptolydigin

interferes with RNA elongation. Among ansa antibiotics, rifamycins have been

studied more extensively.



In vitro activity, showed better pharmacokinetic properties in vivo. Rifampicin,

3-(4-methylpiperazinoiminomethyl) rifamycin SV, has been selected for the oral

treatment of various bacterial infections.

Specific inhibitors of eukaryotic transcriptase

α-Amanitin. α-Amanitin is a highly toxic cyclic octapeptide, isolated from the

poisonous fungus Amanita phalloides. It is a potent specific inhibitor of DNA-

dependent RNA polymerase II of eukaryotes, while it does not inhibit nucleolar

polymerase I and polymerase III of eukaryotes and bacterial RNA polymerase.

The enzymatic reaction is blocked immediately after adding the inhibitor, which

seems to act at the stage of RNA-chain elongation. The eukaryotic RNA

polymerase from yeast is much less sensitive to the action of α-amanitin than

the mammalian enzyme.

Its polypeptidic nature could constitute a suitable model for the synthesis and

testing of analogous polypeptidic compounds, in order to obtain information

concerning the part of the molecule of α-amanitin responsible for the binding

to RNA polymerase II of eukaryotes.

Specific inhibitors of prokaryotic transcriptase

Streptolydigin is an antibiotic produced by Streptomyces. It exhibits in vitro

activity primarily against streptococci, diplococci and clostridia and is relatively

nontoxic. It acts by binding and thus specifically inhibiting bacterial RNA

polymerase.

Only at high concentrations of the drug is the initiation process affected,

because the formation of the first phosphodiester bond is also inhibited.

Rifamycins, tolypomycins and streptovaricins are very active against Gram-positive

bacteria and mycobacteria. Rifamycins selectively inhibit the synthesis

of all cellular RNA in sensitive bacteria, because they are potent inhibitors of

the bacterial DNA-dependent RNA polymerase.

POLYCISTRONIC AND MONOCISTRONIC mRNA

Polycistronic mRNA is an mRNA that encodes several proteins and is

characteristic of many bacterial and chloroplast mRNAs. Polycistronic mRNAs

consist of a leader sequence which precedes the first gene. The gene is followed



by an intercistronic region and then another gene. A trailer sequence follows

the last gene in the mRNA. Examples of polycistronic transcripts are found in

the chloroplast. One region that exhibits a group of different polycistronic

messages from the same region is the psbB/psbH/petB/petD region. The

following table lists the genes, their products and the complex of which the

product is a part.

Gene    Product                                Complex

psbB    51 kD chl a binding protein of PSII    PSII

psbH    10 kD phosphoprotein of PSII           PSII

petB    cytochrome b6                          Cytochrome

petD    subunit 4 of cytochrome b6/f           Cytochrome

Although the genes are co-transcribed, the ratio of the two complexes varies

in the light and the dark as well as between the mesophyll and the bundle

sheath cells. Thus some sort of regulation must exist. At least 15 different

mRNAs are produced from this gene cluster.

Monocistronic mRNA is an mRNA that encodes only one protein; nearly all

eukaryotic mRNAs are monocistronic. The development of the mature

monocistronic eukaryotic transcript involves several different processing steps.

These steps are:

5' capping

3' polyadenylation

Splicing together of exons if introns are present

Each of these steps is a post-transcriptional modification step. Thus the

original transcript is not the same as the final product. All of the post-

transcriptional steps occur in the nucleus of the cell and the resultant product,

the mRNA, is transported to the cytoplasm for translation.

CONTROL OF ELONGATION AND TERMINATION

Repressors bind to specific sites on the DNA. In prokaryotic cells, such binding

sites, called operators, are generally near a promoter. RNA polymerase

binding, or its movement along the DNA after binding, is blocked when the



repressor is present. Regulation by means of a repressor protein that blocks

transcription is referred to as negative regulation. Repressor binding to DNA

is regulated by a molecular signal (or effector), usually a small molecule or a

protein, that binds to the repressor and causes a conformational change. The

interaction between repressor and signal molecule either increases or

decreases transcription. In some cases, the conformational change results in

dissociation of a DNA-bound repressor from the operator (Fig. a).

Transcription initiation can then proceed unhindered. In other cases,

interaction between an inactive repressor and the signal molecule causes the

repressor to bind to the operator (Fig.b).

In eukaryotic cells, the binding site for a repressor may be some distance from

the promoter; binding has the same effect as in bacterial cells: inhibiting the

assembly or activity of a transcription complex at the promoter. Activators

provide a molecular counterpoint to repressors; they bind to DNA and enhance

the activity of RNA polymerase at a promoter; this is positive regulation. Activator binding sites are often adjacent to promoters that are bound weakly

or not at all by RNA polymerase alone, such that little transcription occurs in

the absence of the activator. Some eukaryotic activators bind to DNA sites,

called enhancers, which are quite distant from the promoter, affecting the rate

of transcription at a promoter that may be located thousands of base pairs

away. Some activators are normally bound to DNA, enhancing transcription

until dissociation of the activator is triggered by the binding of a signal

molecule (Fig.c).

In other cases the activator binds to DNA only after interaction with a signal

molecule (Fig. d). Signal molecules can therefore increase or decrease

transcription, depending on how they affect the activator.
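The four cases (Fig. a to d) reduce to a small truth table, sketched here in Python. This is a deliberately simplified on/off model with hypothetical function names: it assumes the signal molecule simply flips whether the regulator is bound to DNA, and it ignores graded binding and basal transcription.

```python
def regulator_bound(binds_without_signal, signal_present):
    """Assumption: the signal molecule flips the DNA-binding state."""
    return binds_without_signal != signal_present


def transcription_on(kind, binds_without_signal, signal_present):
    """Negative regulation: a bound repressor blocks transcription.
    Positive regulation: a bound activator is needed for transcription."""
    bound = regulator_bound(binds_without_signal, signal_present)
    if kind == "repressor":
        return not bound
    if kind == "activator":
        return bound
    raise ValueError("kind must be 'repressor' or 'activator'")


cases = {
    "a": ("repressor", True),   # signal releases the repressor
    "b": ("repressor", False),  # signal makes the repressor bind
    "c": ("activator", True),   # signal releases the activator
    "d": ("activator", False),  # signal makes the activator bind
}
for label, (kind, binds_without_signal) in cases.items():
    for signal in (False, True):
        state = transcription_on(kind, binds_without_signal, signal)
        print(label, kind, "signal" if signal else "no signal", "->", state)
```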



ALTERNATE SIGMA FACTORS

Role of sigma factor:

Initiation requires tight binding only to particular sequences (promoters), while

elongation requires close association with all sequences that the enzyme

encounters during transcription.

The association with sigma factor changes at initiation: sigma factor is either

released following initiation or changes its association with core enzyme so that

it no longer participates in DNA binding. Because there are fewer molecules of

sigma than of core enzyme, the utilization of core enzyme requires that sigma

recycles. This occurs immediately after initiation in about one third of cases;

presumably sigma and core dissociate at some later point in the other cases.

Irrespective of the exact timing of its release from core enzyme, sigma factor is

involved only in initiation.



When sigma factor is released from core enzyme, it becomes immediately

available for use by another core enzyme. Whether sigma is released or remains

more loosely associated with core enzyme, the core enzyme in the ternary

complex is bound very tightly to DNA. It is essentially "locked in" until

elongation has been completed. When transcription terminates, the core

enzyme is released. It is then "stored" by binding to a loose site on DNA. If it

has lost its sigma factor, it must find another sigma factor in order to

undertake a further cycle of transcription. Core enzyme has a high intrinsic

affinity for DNA, which is increased by the presence of nascent RNA. But its

affinity for loose binding sites is too high to allow the enzyme to distinguish

promoters efficiently from other sequences. By reducing the stability of the

loose complexes, sigma allows the process to occur much more rapidly; and by

stabilizing the association at tight binding sites, the factor drives the reaction

irreversibly into the formation of open complexes. When the enzyme releases

sigma (or changes its association with it), it reverts to a general affinity for all

DNA, irrespective of sequence, that suits it to continue transcription.

Conformational changes of σ factor:

Sigma factor has domains that recognize the promoter DNA. As an independent

polypeptide, sigma does not bind to DNA, but when holoenzyme forms a tight

binding complex, σ contacts the DNA in the region upstream of the startpoint.

This difference is due to a change in the conformation of sigma factor when it

binds to core enzyme.

The N-terminal region of free sigma factor suppresses the activity of the DNA-

binding region; when sigma binds to core, this inhibition is released, and it

becomes able to bind specifically to promoter sequences. The inability of free

sigma factor to recognize promoter sequences may be important: if σ could

freely bind to promoters, it might block holoenzyme from initiating

transcription.

Sporulation involves successive changes in the sigma factors that control the initiation specificity of RNA polymerase. The cascades in the forespore (left) and the mother cell (right) are related by signals passed across the septum (indicated by horizontal arrows).

POST TRANSCRIPTIONAL MODIFICATIONS OF m-RNA

Many of the RNA molecules in bacteria and virtually all RNA molecules in

eukaryotes are processed to some degree after synthesis. Some of the most

interesting molecular events in RNA metabolism occur during this

postsynthetic processing. A newly synthesized RNA molecule is called a

primary transcript. Perhaps the most extensive processing of primary

transcripts occurs in eukaryotic mRNAs and in tRNAs of both bacteria and

eukaryotes.

The primary transcript for a eukaryotic mRNA typically contains sequences

encompassing one gene, although the sequences encoding the polypeptide may

not be contiguous. Noncoding tracts that break up the coding region of the

transcript are called introns, and the coding segments are called exons.



In a process called splicing, the introns are removed from the primary

transcript and the exons are joined to form a continuous sequence that

specifies a functional polypeptide. Eukaryotic mRNAs are also modified at each

end.
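The outcome of splicing, introns removed and exons joined in order, can be shown with a minimal Python sketch. It models only the result, not the chemistry; the function name, the sequence, and the exon coordinates are invented for illustration.

```python
def splice(primary_transcript, exons):
    """Join exon segments in order.

    `exons` is a list of (start, end) half-open index pairs; everything
    between consecutive exons is treated as intron and discarded.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


# Invented primary transcript: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAAUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature)
```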

A modified residue called a 5’ cap is added at the 5’ end. The 3’ end is cleaved,

and 80 to 250 A residues are added to create a poly(A) “tail.”

In effect, a eukaryotic mRNA, as it is synthesized, is ensconced in an elaborate

complex involving dozens of proteins. The composition of the complex changes

as the primary transcript is processed, transported to the cytoplasm, and

delivered to the ribosome for translation.

The primary transcripts of prokaryotic and eukaryotic tRNAs are processed by

the removal of sequences from each end (cleavage) and in a few cases by the

removal of introns (splicing). Many bases and sugars in tRNAs are also

modified; mature tRNAs are replete with unusual bases not found in other

nucleic acids.

CAPPING

Most eukaryotic mRNAs have a 5’ cap, a residue of 7-methylguanosine linked

to the 5’ terminal residue of the mRNA through an unusual 5’, 5’-triphosphate

linkage.



The 5’ cap helps protect mRNA from ribonucleases. The cap also binds to a

specific cap-binding complex of proteins and participates in binding of the

mRNA to the ribosome to initiate translation.

The 5’ cap is formed by condensation of a molecule of GTP with the

triphosphate at the 5’ end of the transcript. The guanine is subsequently

methylated at N-7, and additional methyl groups are often added at the 2’

hydroxyls of the first and second nucleotides adjacent to the cap.

The methyl groups are derived from S-adenosylmethionine. All these reactions

occur very early in transcription, after the first 20 to 30 nucleotides of the

transcript have been added. All three of the capping enzymes, and through

them the 5’ end of the transcript itself, are associated with the RNA polymerase

II CTD until the cap is synthesized. The capped 5’ end is then released from the

capping enzymes and bound by the cap-binding complex.



EDITING

Certain mRNAs from a variety of eukaryotic organisms have been found to

differ from their corresponding genes in several unexpected ways, including

C→U and U→C changes, the insertion or deletion of U residues, and the

insertion of multiple G or C residues. The most extreme examples of this

phenomenon, which occur in the mitochondria of trypanosomes (whose DNA

encodes only 20 genes), involve the addition and removal of up to hundreds of

U’s to and from 12 otherwise untranslatable mRNAs. The process whereby a

transcript is altered in this manner is called RNA editing because it originally

seemed that the required enzymatic reactions occurred without the direction of

a nucleic acid template and hence violated the central dogma of molecular

biology.

Eventually, however, a new class of trypanosomal mitochondrial transcripts

called guide RNAs (gRNAs) was identified. gRNAs, which consist of 40 to 80

nucleotides, have 3’ oligo(U) tails, an internal segment that is precisely

complementary to the edited portion of the pre-edited mRNA (if G·U pairs,

which are common in RNAs, are taken to be complementary), and a 10- to 15-nt

so-called anchor sequence near the 5’ end that is largely complementary in

the Watson–Crick sense to a segment of the mRNA that is not edited.

An unedited transcript presumably associates with the corresponding gRNA via

its anchor sequence.

Then, in a process mediated by the appropriate enzymatic machinery in a ~20S

RNP named the editosome, the gRNA’s internal segment is used as a template



to “correct” the transcript, thereby yielding the edited mRNA. Insertion editing

requires at least three enzymatic activities that, somewhat surprisingly, are

encoded by nuclear genes.

1. An endonuclease, which cleaves the pre-edited mRNA on the 5’ side of the insertion point at a mismatch between the gRNA and the pre-edited mRNA;

2. Terminal uridylyltransferase (TUTase), which inserts the new U(s); and

3. An RNA ligase, which reseals the RNA.

Deletion requires similar enzymatic apparatus, with the exceptions that the endonuclease cleaves the RNA being edited on the 3’ side of the U(s) to be deleted and TUTase is replaced by 3’-U-exonuclease (3’-U-exo), which excises the U(s) at the deletion site.

A single gRNA mediates the editing of a block of 1 to 10 sites. Thus, the

genetic information specifying an edited mRNA is derived from two or more

genes.



SPLICING

There are four classes of introns. The first two, the group I and group II introns,

differ in the details of their splicing mechanisms but share one surprising

characteristic: they are self-splicing—no protein enzymes are involved. Group I

introns are found in some nuclear, mitochondrial, and chloroplast genes

coding for rRNAs, mRNAs, and tRNAs. Group II introns are generally found in

the primary transcripts of mitochondrial or chloroplast mRNAs in fungi, algae,

and plants.

Splicing mechanism of group I introns. The nucleophile in the first step may

be guanosine, GMP, GDP, or GTP. The spliced intron is eventually degraded.

Splicing mechanism of group II introns. The chemistry is similar to that of

group I intron splicing, except for the identity of the nucleophile in the first

step and formation of a lariat-like intermediate, in which one branch is a

2’,5’-phosphodiester bond.



Splicing mechanism in mRNA primary transcripts.

RNA pairing interactions in the formation of spliceosome complexes. The U1

snRNA has a sequence near its 5’ end that is complementary to the splice site

at the 5’ end of the intron. Base pairing of U1 to this region of the primary

transcript helps define the 5’ splice site during spliceosome assembly (Ψ is

pseudouridine)

U2 is paired to the intron at a position encompassing the A residue (shaded

pink) that becomes the nucleophile during the splicing reaction. Base pairing of

U2 snRNA causes a bulge that displaces and helps to activate the adenylate,

whose 2’ OH will form the lariat structure through a 2’,5’-phosphodiester bond.

Assembly of spliceosomes:

The U1 and U2 snRNPs bind, then the remaining snRNPs (the U4/U6 complex

and U5) bind to form an inactive spliceosome. Internal rearrangements convert

this species to an active spliceosome in which U1 and U4 have been expelled

and U6 is paired with both the 5’ splice site and U2. This is followed by the

catalytic steps, which parallel those of the splicing of group II introns.

Coordination of splicing with transcription provides an attractive mechanism

for bringing the two splice sites together.



POLYADENYLATION

Addition of the poly(A) tail to the primary RNA transcript of eukaryotes.

Pol II synthesizes RNA beyond the segment of the transcript containing the

cleavage signal sequences, including the highly conserved upstream sequence

(5’)AAUAAA.



1. The cleavage signal sequence is bound by an enzyme complex that includes an

endonuclease, a polyadenylate polymerase, and several other multisubunit

proteins involved in sequence recognition, stimulation of cleavage, and

regulation of the length of the poly(A) tail.

2. The RNA is cleaved by the endonuclease at a point 10 to 30 nucleotides 3’ to

(downstream of) the sequence AAUAAA.

3. The polyadenylate polymerase synthesizes a poly(A) tail 80 to 250 nucleotides

long, beginning at the cleavage site.
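Steps 1 to 3 can be sketched as a string operation in Python. The function name and the default offset and tail length are hypothetical choices within the ranges stated above (10 to 30 nucleotides downstream, 80 to 250 A residues); the real cleavage site also depends on other sequence elements that this sketch ignores.

```python
SIGNAL = "AAUAAA"  # conserved upstream cleavage signal (from the text)


def cleave_and_polyadenylate(rna, offset=20, tail_len=200):
    """Cut `offset` nucleotides 3' of the first AAUAAA and add a poly(A) tail.

    Returns None if no signal is found or the transcript is too short.
    The defaults are illustrative values inside the ranges in the text.
    """
    if not 10 <= offset <= 30:
        raise ValueError("offset should be 10 to 30 nucleotides")
    if not 80 <= tail_len <= 250:
        raise ValueError("tail length should be 80 to 250 nucleotides")
    pos = rna.find(SIGNAL)
    if pos == -1:
        return None
    cut = pos + len(SIGNAL) + offset
    if cut > len(rna):
        return None
    return rna[:cut] + "A" * tail_len


# Invented transcript that extends 40 nt beyond the signal.
transcript_3prime = "GGGG" + SIGNAL + "C" * 40
processed = cleave_and_polyadenylate(transcript_3prime, offset=20, tail_len=100)
print(len(processed))
```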

MODIFICATIONS OF tRNA

tRNAs are commonly synthesized as precursor chains with additional material

at one or both ends. The extra sequences are removed by combinations of

endonucleolytic and exonucleolytic activities.

One feature that is common to most tRNAs is that the three nucleotides at the

3' terminus, always the triplet sequence CCA, are not coded in the genome, but

are added as part of tRNA processing. The 5' end of tRNA is generated by a



cleavage action catalyzed by the enzyme ribonuclease P. The enzymes that

process the 3' end are best characterized in E. coli, where an endonuclease

triggers the reaction by cleaving the precursor downstream, and several

exonucleases then trim the end by degradation in the 3'-5' direction. The

reaction also involves several enzymes in eukaryotes. It generates a tRNA that

needs the CCA trinucleotide sequence to be added to the 3' end.

The addition of CCA is the result solely of an enzymatic process, that is, the

enzymatic activity carries the specificity for the sequence of the trinucleotide,

which is not determined by a template. There are several models for the

process, which may be different in different organisms. In some organisms, the

process is catalyzed by a single enzyme.

One model for its action proposes that a single enzyme binds to the 3' end, and

sequentially adds C, C, and A, the specificity at each stage being determined by

the structure of the 3' end. Other models propose that the enzyme has different

active sites for CTP and ATP.

In other organisms, different enzymes are responsible for adding the C and A

residues, and they function sequentially. When a tRNA is not properly

processed, it attracts the attention of a quality control system that degrades it.



This ensures that the protein synthesis apparatus does not become blocked by

nonfunctional tRNAs.

MODIFICATIONS OF rRNA

The seven E. coli rRNA operons all contain one (nearly identical) copy of each of

the three types of rRNA genes. Their polycistronic primary transcripts, which

are ~5500 nucleotides in length, contain 16S rRNA at their 5’ ends followed by

the transcripts for 1 or 2 tRNAs, 23S rRNA, 5S rRNA, and, in some rRNA

operons, 1 or 2 more tRNAs at the 3’ end.

The steps in processing these primary transcripts to mature rRNAs were

elucidated with the aid of mutants defective in one or more of the processing

enzymes. The initial processing, which yields products known as pre-rRNAs, commences while the primary transcript is still being synthesized. It consists of

specific endonucleolytic cleavages by RNase III, RNase P, RNase E, and RNase F at the sites indicated in Fig.

The base sequence of the primary transcript suggests the existence of several

base-paired stems. The RNase III cleavages occur in a stem consisting of

complementary sequences flanking the 5’ and 3’ ends of the 23S segment as

well as that of the 16S segment. Presumably, certain features of these stems

constitute the RNase III recognition site. The 5’ and 3’ ends of the pre-rRNAs

are trimmed away in secondary processing steps through the action of RNases D, M16, M23, and M5 to produce the mature rRNAs. These final cleavages

only occur after the pre-rRNAs become associated with ribosomal proteins.



UNIT – 4 TRANSLATION

Genetic Code – Features and character, Wobble hypothesis. Ribosome assembly, Initiation factors and their regulation, formation of initiation complex, Initiation, elongation and termination of polypeptide chain, elongation factors and releasing factors, translational proof reading, inhibitors of translation and their mechanism, post translational modification of proteins – glycosylation. Control of translation in eukaryotes. Differences between prokaryotes and eukaryotes.

GENETIC CODE

The sequence of bases that encodes a functional protein molecule is called a

gene. And the genetic code is the relation between the base sequence of a

gene and the amino acid sequence of the polypeptide whose synthesis the gene

directs. In other words, the specific correspondence between a set of 3 bases

and 1 of the 20 amino acids is called the genetic code.

J.D. Burke (1970) defined genetic code in the following words, “The genetic

code for protein synthesis is contained in the base sequence of DNA. ... The

genetic code is a code for amino acids. Specifically, it is concerned with what

codons specify what amino acids.”

The genetic code is the key that relates, in Crick’s words, “...the two great

polymer languages, the nucleic acid language and the protein language.”

The “letters” in the “language” were found to be the bases; the “words” (codons)

are groups of bases; and the “sentences” and “paragraphs” equate with groups

of codons (Eldon J. Gardner, 1968).

Thus,

Letters                     Bases

Words                       Groups of bases (i.e., codons)

Sentences and Paragraphs    Groups of codons

The basic problem of such a genetic code is to indicate how information written

in a four-letter language (four nucleotides or nitrogen bases of DNA) can be

translated into a twenty-letter-language (twenty amino acids of proteins). The



group of nucleotides that specifies one amino acid is a code word or codon. The

simplest possible code is a singlet code (a code of single letter) in which one

nucleotide codes for one amino acid. Such a code is inadequate for only four

amino acids could be specified.

A doublet code (a code of two letters) is also inadequate because it could

specify only sixteen (4 × 4) amino acids, whereas a triplet code (a code of three

letters) could specify sixty four (4 × 4 × 4) amino acids. Therefore, it is likely

that there may be 64 triplet codes for 20 amino acids. The possible singlet,

doublet and triplet codes, which are customarily represented in terms of “mRNA

language” (mRNA is a complementary molecule which copies the genetic

information during its transcription), can be illustrated as in Figure.
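The counting argument above can be checked with a minimal sketch. It simply enumerates every code word of a given length over the four mRNA bases; the function name `code_words` is a hypothetical helper introduced only for this illustration.

```python
from itertools import product

BASES = "UCAG"  # the four mRNA bases


def code_words(length):
    """Return every possible code word of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


for name, length in [("singlet", 1), ("doublet", 2), ("triplet", 3)]:
    words = code_words(length)
    enough = len(words) >= 20  # 20 amino acids must be specified
    print(f"{name}: {len(words)} code words, enough for 20 amino acids: {enough}")
```

Only the triplet case gives at least 20 code words, which is the reasoning behind a three-letter code.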

FEATURES AND CHARACTERS

The genetic code is endowed with many characteristic properties which have

actually been proved by definite experimental evidences. These are described

below:

Triplet nature

As outlined earlier, singlet and doublet codes are not adequate to code for 20

amino acids; therefore, it was pointed out that triplet code is the minimum

required. But it could be a quadruplet code or of a higher order. As pointed out



above, in a triplet code of 64 codons, there is an excess of (64 – 20) = 44 codons

and, therefore, more than one codon is present for the same amino acid.

This excess will be still greater if more than three-letter words are used. In a

quadruplet code there will be 4^4 (4 × 4 × 4 × 4) = 256 possible codons. An

account of the 20 amino acids along with their corresponding codons is

presented below:

2 amino acids (Met, Trp) ... have 1 codon each = 2

9 amino acids (Asn, Asp, Cys, Gln, Glu, His, Lys, Phe, Tyr) ... have 2 codons each = 18

1 amino acid (Ile) ... has 3 codons = 3

5 amino acids (Ala, Gly, Pro, Thr, Val) ... have 4 codons each = 20

3 amino acids (Arg, Leu, Ser) ... have 6 codons each = 18

3 terminator codons = 3

Total: 20 amino acids and 3 terminators = 64 codons
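The tally above can be reproduced with a short sketch built on the standard genetic code. The 64-letter string is the usual compact listing of the standard code in U, C, A, G order, with one-letter amino acid symbols and "*" for a terminator; the variable names are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a terminator.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Count how many codons specify each amino acid (or terminator).
codons_per_residue = Counter(CODON_TABLE.values())

# Group the residues by their number of codons.
groups = {}
for residue, n in codons_per_residue.items():
    groups.setdefault(n, []).append(residue)

for n in sorted(groups):
    print(n, "codon(s):", " ".join(sorted(groups[n])))
print("total codons:", sum(codons_per_residue.values()))
```

In the printed groups, the three-codon class contains both Ile (I) and the terminator symbol, matching the separate Ile and terminator rows of the list.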

Amino acids with similar structural properties tend to have related codons.

Thus, aspartic acid codons (GAU, GAC) are similar to glutamic acid codons

(GAA, GAG); the difference being exhibited only in the third base (toward 3′

end).

Similarly, the codons for the aromatic amino acids phenylalanine (UUU, UUC),

tyrosine (UAU, UAC) and tryptophan (UGG) all begin with uracil (U).

The first two bases of all the 4 codons assigned to each of the 5 amino acids

are similar: GC for alanine, GG for glycine, CC for proline, AC for threonine and

GU for valine.



All codons with U in the second position specify hydrophobic amino acids (Ile,

Leu, Met, Phe, Val).

All the charged amino acids except Arg (i.e., Asp, Glu and Lys) have A in the

second position of their codons.

All the acidic (Asp, Glu) and basic (Arg, Lys) amino acids have A or G as the

second base.

Degeneracy

The code is degenerate which means that the same amino acid is coded by

more than one base triplet. Degeneracy, as used here, does not imply lack of

specificity in protein synthesis. It merely means that a particular amino acid

can be directed to its place in the peptide chain by more than one base triplet.

For example, the three amino acids arginine, leucine and serine each have six

synonymous codons. A non-degenerate code would be one where there is one

to one relationship between amino acids and the codons, so that from the 64

codons, 44 will be useless or nonsense codons. It has been definitely shown

that there are no nonsense codons. The codons which were initially called

nonsense codons were later shown to mean stop signals. The code degeneracy is

basically of 2 types: partial and complete.

In partial degeneracy, the first two nucleotides are identical but the third (i.e.,

3′ base) nucleotide of the degenerate codon differs; for example, CUU and CUC

code for leucine.



Complete degeneracy occurs when any of the 4 bases can take third position

and still code for the same amino acid; for example, UCU, UCC, UCA and UCG

all code for serine.

Degeneracy of genetic code has certain biological advantages. For example, it

permits essentially the same complement of enzymes and other proteins to be

specified by the microorganisms varying widely in their DNA base composition.

Degeneracy also provides a mechanism of minimizing mutational lethality.

Degeneracies occur frequently in the third letter of the codon. Exceptions are,

however, arginine (Arg), leucine (Leu) and serine (Ser) which have 2 groups of

codons or triplets, which differ in either the first base only (Arg, Leu) or in both

the first and second bases (Ser).

Nonoverlapping

The genetic code is nonoverlapping, i.e., the adjacent codons do not overlap. A

nonoverlapping code means that the same letter is not used for two different

codons. In other words, no single base can take part in the formation of more

than one codon. Fig. 30–4 shows that an overlapping code can mean coding for

four amino acids from six bases. In actual practice, six bases code for not more

than two amino acids. As an illustration, an end-to-end sequence of 5′

UUUCCC 3′ on mRNA will code only 2 amino acids, i.e., phenylalanine (UUU)

and proline (CCC).
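The difference between the two reading schemes can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the codon table is the standard code written with one-letter amino acid symbols (F = Phe, S = Ser, P = Pro).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a terminator.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def read_nonoverlapping(mrna):
    """Split the message into successive triplets; each base is used once."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def read_overlapping(mrna):
    """Hypothetical overlapping code: a new triplet starts at every base."""
    return [mrna[i:i + 3] for i in range(len(mrna) - 2)]


message = "UUUCCC"
for label, reader in [("nonoverlapping", read_nonoverlapping),
                      ("overlapping", read_overlapping)]:
    codons = reader(message)
    print(label, codons, [CODON_TABLE[c] for c in codons])
```

The nonoverlapping reading yields two codons (Phe, Pro), whereas the hypothetical overlapping reading would extract four triplets from the same six bases.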

Commaless

There is no signal to indicate the end of one codon and the beginning of the

next. The genetic code is commaless (or comma-free). A commaless code means

that no codon is reserved for punctuations or the code is without spacers or

space words. There are no intermediary nucleotides (or commas) between the

codons. In other words, we can say that after one amino acid is coded, the

second amino acid will be automatically coded by the next three letters and

that no letters are wasted for telling that one amino acid has been coded and

that second should now be coded.

Non-ambiguity



By non-ambiguous code, we mean that there is no ambiguity about a

particular codon. A particular codon will always code for the same amino acid.

In an ambiguous code, the same codon could code for two or more than two

different amino acids. Such is not the case. While the same amino acid can be

coded by more than one codon (the code is degenerate), the same codon shall not

code for two or more different amino acids (non-ambiguous). Under abnormal

conditions, however, a codon may be misread so that it specifies more than one

amino acid. For example, the UUU codon usually codes for phenylalanine but, in

the presence of streptomycin, may also code for isoleucine, leucine or serine.

Universality

The genetic code applies to all modern organisms with only very minor

exceptions. Although the code is based on work conducted on the bacterium

Escherichia coli, it is valid for other organisms. This important characteristic

of the genetic code is called its universality. It means that the same sequences

of 3 bases encode the same amino acids in all life forms from simple

microorganisms to complex, multicelled organisms such as human beings.

Consider any codon. It codes for the same amino acid from the smallest

organism to the largest, plant or animal. Thus, UUU codes for phenylalanine

and GUC for valine in all living things, from amoeba to ape, bacteria to the

banyan tree, and from cabbage to kings. The genetic code which was first

developed in the bacteria about 3 billion (300 crore) years ago has not

undergone any change and has been preserved in its almost original form in

the course of evolution. In other words, the code is a conservative one, i.e., the

code was fixed early in the course of evolution and has been maintained to the

present day.

Polarity

The genetic code has polarity, that is, the code is always read in a fixed

direction, i.e., in the 5′ → 3′ direction. It is apparent that if the code is read in

opposite direction (i.e., 3′ → 5′), it would specify 2 different proteins, since the

codon would have reversed base sequence:
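As a minimal sketch of this point, the same six-base message can be translated in both directions. The message `AUGGCA` is an arbitrary example chosen for the illustration, and the table is the standard code in one-letter symbols (M = Met, A = Ala, T = Thr, V = Val).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a terminator.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate successive triplets of the string, left to right."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))


message = "AUGGCA"                   # written 5' -> 3'
forward = translate(message)         # normal 5' -> 3' reading
backward = translate(message[::-1])  # the same bases read 3' -> 5'
print(forward, backward)
```

Read 5′ → 3′ the message gives Met-Ala; read in the opposite direction the triplets become ACG and GUA, which give Thr-Val.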



Chain Initiation Codons

The triplets AUG and GUG play double roles in E. coli. When they occur in

between the two ends of a cistron (intermediate position), they code for the

amino acids methionine and valine, respectively in an intermediate position in

the protein molecule. But when they occur immediately after a terminator

codon, they act as “chain initiation” (C.I.) signals or “starter codons” for the

synthesis of a polypeptide chain. It has also been shown that the initiating

methionine molecule should be found in the formylated state. This makes a

distinction between the initiating methionine and the methionine at internal

position. The methionine when required at internal position should not be

formylated. Also while formyl methionine is carried by tRNAfMet, there is a

separate species of tRNA for internal methionine and it is designated as

tRNAmMet.

Chain Termination Codons

The 3 triplets UAA, UAG, UGA do not code for any amino acid. They were

originally described as non-sense codons, as against the remaining 61 codons,

which are termed as sense codons. The so-called non-sense codons have now

been found to be of “special sense”. When any one of them occurs immediately

before the triplet AUG or GUG, it causes the release of the polypeptide chain

from the ribosome. Hence, the use of the term ‘non-sense’ is unfortunate.

These special-sense codons perform the function of punctuating genetic

message like a full stop at the end of a sentence. They are also called chain

termination codons because these codons are used by the cell to signal the

natural end of translation of a particular peptidyl chain. However, their

inclusion in any mRNA results in the abrupt termination of the message at the

point of their location even though the polypeptide chain has not been

completed. The codons UAA and UAG were discovered in bacteria and were

respectively associated with the ochre and amber mutations. Hence, UAA is



also called ochre and UAG is also known as amber (because an investigator

who studied the properties of this codon belonged to the Bernstein family, and

Bernstein means amber in German). UGA is also called opal. They resulted in

the formation of incomplete polypeptide chains. UGA is the usual terminator

codon in all cases.

WOBBLE HYPOTHESIS

Wobble hypothesis states that the pairing between codon and anticodon at the

first two codon positions always follows the usual rules, but that exceptional

wobbles occur at the third position. Wobbling occurs because the conformation

of the tRNA anticodon loop permits flexibility at the first base of the anticodon.

This single change creates a pattern of base pairing in which A can no longer

have a unique meaning in the codon (because the U that recognizes it must

also recognize G). Similarly, C also no longer has a unique meaning (because

the G that recognizes it also must recognize U).

It is therefore possible to recognize unique codons only when the third bases

are G or U; this option is not used often, since UGG and AUG are the only

examples of the first type, and there is none of the second type. (G-U pairs are

common in RNA duplex structures. But the formation of stable contacts



between codon and anticodon, when only 3 base pairs can be formed, is more

constrained, and thus G-U pairs can contribute only in the last position of the

codon.)
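The pairing pattern described above can be sketched in a few lines. The dictionary follows Crick's wobble rules for the first (5′) base of the anticodon; the entry for inosine (I) comes from those rules and is not discussed in the text above, so treat it as an added assumption. Function names are hypothetical.

```python
# First (5') anticodon base -> third (3') codon bases it can pair with.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}
# Standard Watson-Crick pairs, used at codon positions 1 and 2.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read_by(anticodon):
    """Return the codons (5'->3') recognized by an anticodon written 5'->3'."""
    first, second, third = anticodon
    # Codon base 1 pairs with anticodon base 3; codon base 2 with anticodon base 2.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    return [prefix + base for base in WOBBLE[first]]


def uniquely_readable_third_bases():
    """Third codon bases that some anticodon base reads exclusively."""
    return sorted(bases for bases in WOBBLE.values() if len(bases) == 1)


print(codons_read_by("GAA"))   # a Phe anticodon
print(codons_read_by("CAU"))   # the Met anticodon
print(uniquely_readable_third_bases())
```

The last call returns G and U, in line with the statement that a codon can be recognized uniquely only when its third base is G or U.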

RIBOSOME ASSEMBLY

The ribosomes in the cytoplasm of eukaryotic cells (other than mitochondrial

and chloroplast ribosomes) are substantially larger and more complex than

bacterial ribosomes.

They have a diameter of about 23 nm and a sedimentation coefficient of 80S.

They also have 2 subunits, which vary in size between species but on an

average are 60S and 40S. The small subunit (40S) contains a single 18S rRNA

molecule and the large subunit (60S) contains a molecule each of 5S, 5.8S and

28S rRNAs. Altogether, eukaryotic ribosomes contain over 80 different proteins.

Thus, a eukaryotic ribosome contains more proteins in each subunit and also

has an additional RNA (5.8S) in the larger 60S subunit.

The ribosomes of mitochondria and chloroplasts are different from those in the

cytoplasm of eukaryotes. They are more like bacterial ribosomes. In fact, there



are many similarities between protein synthesis in mitochondria, chloroplasts,

and bacteria.

Polypeptide synthesis takes place on the head and platform regions of the 30S

subunit and the upper half of the 50S subunit (translational domain). The

mRNAs and tRNAs attach to the 30S subunit, and the peptidyl transferase site

(where peptide bond formation occurs) is associated with the central

protuberance of the larger 50S subunit.

ORGANELLE RIBOSOMES.

Organelle ribosomes are distinct from the ribosomes of the cytosol, and take

varied forms. In some cases, they are almost the size of bacterial ribosomes

and have 70% RNA; in other cases, they are only 60S and have <30% RNA. The



ribosome possesses several active centers, each of which is constructed from a

group of proteins associated with a region of ribosomal RNA. The active centers

require the direct participation of rRNA in a structural or even catalytic role.

Some catalytic functions require individual proteins, but none of the activities

can be reproduced by isolated proteins or groups of proteins; they function

only in the context of the ribosome.

Two types of information are important in analyzing the ribosome. Mutations

implicate particular ribosomal proteins or bases in rRNA in participating in

particular reactions. Structural analysis, including direct modification of

components of the ribosome and comparisons to identify conserved features in

rRNA, identifies the physical locations of components involved in particular

functions. Bacterial ribosomes have three sites that bind aminoacyl-tRNAs, the

aminoacyl (A) site, the peptidyl (P) site, and the exit (E) site. Both the 30S

and the 50S subunits contribute to the characteristics of the A and P sites,

whereas the E site is largely confined to the 50S subunit. The initiating (5’)AUG

is positioned at the P site, the only site to which fMet-tRNAfMet can bind.

The fMet-tRNAfMet is the only aminoacyl-tRNA that binds first to the P site;

during the subsequent elongation stage, all other incoming aminoacyl-tRNAs

(including the Met-tRNAMet that binds to interior AUG codons) bind first to the

A site and only subsequently to the P and E sites. The E site is the site from

which the “uncharged” tRNAs leave during elongation. Factor IF-1 binds at the

A site and prevents tRNA binding at this site during initiation.



INITIATION FACTORS AND THEIR REGULATION - FORMATION OF INITIATION COMPLEX

Amino acid activation:

Introduction:

Two important events must occur even before translation initiation can take

place. One of these prerequisites is to generate a supply of aminoacyl-tRNAs (tRNAs with their cognate amino acids attached). In other words, amino acids

must be covalently bound to tRNAs. This process is called tRNA charging; the

tRNA is said to be “charged” with an amino acid. Another preinitiation event is

the dissociation of ribosomes into their two subunits. This is necessary

because the cell assembles the initiation complex on the small ribosomal

subunit, so the two subunits must separate to make this assembly possible.

tRNA Charging:

All tRNAs have the same three bases (CCA) at their 3′-ends, and the terminal

adenosine is the target for charging. An amino acid is attached by an ester

bond between its carboxyl group and the 2′- or 3′-hydroxyl group of the

terminal adenosine of the tRNA, as shown in Figure.

Structure of aminoacyl-tRNA synthetases:

The activation of amino acids takes place in the cytosol, where the 20

different amino acids are esterified to their corresponding tRNAs by aminoacyl-

tRNA synthetases. In most organisms, there is generally one aminoacyl-tRNA

synthetase (also called aminoacyl-tRNA ligase or simply activation enzyme) for

each amino acid. However, for amino acids that have 2 or more corresponding

tRNAs, the same aminoacyl-tRNA synthetase usually aminoacylates all of them.



In E. coli, the only exception to this rule is lysine, for which there are

two aminoacyl-tRNA synthetases. There is only one tRNA for lysine (tRNALys) in E. coli, and the

biological rationale for the presence of two Lys-tRNA synthetases is not yet

understood.

All the aminoacyl-tRNA synthetases have been divided into 2 classes, based on

distinctions in their structure and on differences in reaction mechanisms.



Function of aminoacyl-tRNA synthetases:

Step 1 is formation of an aminoacyl adenylate, which remains bound to the

active site. In the second step the aminoacyl group is transferred to the tRNA.

The mechanism of this step is somewhat different for the two classes of

aminoacyl-tRNA synthetases.

For class I enzymes, the aminoacyl group is transferred initially to the

2′-hydroxyl group of the 3′-terminal A residue (step 2a), then to the 3′-hydroxyl group

by a transesterification reaction (step 3a). For class II enzymes, the aminoacyl group

is transferred directly to the 3′-hydroxyl group of the terminal adenylate (step 2b).

INITIATION

The initiation of polypeptide synthesis in bacteria requires (1) the 30S

ribosomal subunit, (2) the mRNA coding for the polypeptide to be made, (3) the

initiating fMet-tRNAfMet, (4) a set of three proteins called initiation factors (IF-

1, IF-2, and IF-3), (5) GTP, (6) the 50S ribosomal subunit, and (7) Mg2+.

Formation of the initiation complex takes place in three steps.

Step 1:

The 30S ribosomal subunit binds two initiation factors, IF-1 and IF-3. Factor

IF-3 prevents the 30S and 50S subunits from combining prematurely. The

mRNA then binds to the 30S subunit. The initiating (5’) AUG is guided to its

correct position by the Shine-Dalgarno sequence in the mRNA. This

consensus sequence is an initiation signal of four to nine purine residues, 8 to

13 nucleotides to the 5’ side of the initiation codon.



The sequence base-pairs with a complementary pyrimidine-rich sequence near

the 3’ end of the 16S rRNA of the 30S ribosomal subunit.

This mRNA-rRNA interaction positions the initiating (5’) AUG sequence of the

mRNA in the precise position on the 30S subunit where it is required for

initiation of translation. The particular (5’)AUG where fMet-tRNAfMet is to be

bound is distinguished from other methionine codons by its proximity to the

Shine-Dalgarno sequence in the mRNA.
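How proximity to a purine-rich signal could single out one AUG among several can be sketched as below. This is a toy model under stated assumptions, not the ribosome's actual recognition mechanism: the signal is approximated as any run of four to nine purines, the distance is measured from the start of that run to the A of the AUG, and the test message and function name are invented for the illustration.

```python
import re

# Assumption: approximate the Shine-Dalgarno signal as a run of 4 to 9 purines.
PURINE_RUN = re.compile(r"[AG]{4,9}")


def find_start_codons(mrna, min_gap=8, max_gap=13):
    """Return positions of AUGs that lie min_gap..max_gap nt downstream
    of the start of a purine run."""
    starts = []
    for match in re.finditer(r"(?=AUG)", mrna):
        aug = match.start()
        for run in PURINE_RUN.finditer(mrna[:aug]):
            if min_gap <= aug - run.start() <= max_gap:
                starts.append(aug)
                break
    return starts


# Invented message: a decoy AUG, a purine-rich signal, a spacer, then the true start.
mrna = "CCAUGCC" + "AGGAGG" + "UUUCC" + "AUGGCUUAA"
print(find_start_codons(mrna))
```

The decoy AUG near the 5′ end has no purine run upstream and is skipped; only the AUG that follows the signal at the right distance is reported.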


Step 2:



In the second step of the initiation process, the complex consisting of the 30S

ribosomal subunit, IF-3, and mRNA is joined by both GTP-bound IF-2 and the

initiating fMet-tRNAfMet. The anticodon of this tRNA now pairs correctly with

the mRNA’s initiation codon.

Step 3:



This large complex combines with the 50S ribosomal subunit; simultaneously,

the GTP bound to IF-2 is hydrolyzed to GDP and Pi, which are released from

the complex. All three initiation factors depart from the ribosome at this point.

Completion of the steps in Figure produces a functional 70S ribosome called

the initiation complex, containing the mRNA and the initiating fMet-tRNAfMet.

The correct binding of the fMet-tRNAfMet to the P site in the complete 70S

initiation complex is assured by at least three points of recognition and

attachment: the codon-anticodon interaction involving the initiation AUG fixed

in the P site; interaction between the Shine-Dalgarno sequence in the mRNA

and the 16S rRNA; and binding interactions between the ribosomal P site and

the fMet-tRNAfMet. The initiation complex is now ready for elongation.

Initiation of protein synthesis: Eukaryotes

Translation is generally similar in eukaryotic and bacterial cells; most of the

significant differences are in the mechanism of initiation. Eukaryotic mRNAs

are bound to the ribosome as a complex with a number of specific binding

proteins. Several of these tie together the 5’ and 3’ ends of the message.

At the 3’ end, the mRNA is bound by the poly(A) binding protein (PAB).

Eukaryotic cells have at least nine initiation factors. A complex called eIF4F,

which includes the proteins eIF4E, eIF4G, and eIF4A, binds to the 5’ cap

through eIF4E. The protein eIF4G binds to both eIF4E and PAB, effectively

tying them together.



The protein eIF4A has an RNA helicase activity. It is the eIF4F complex that

associates with another factor, eIF3, and with the 40S ribosomal subunit. The

efficiency of translation is affected by many properties of the mRNA and

proteins in this complex, including the length of the 3’ poly(A) tract (in most

cases, longer is better). The initiating (5’) AUG is detected within the mRNA not

by its proximity to a Shine-Dalgarno-like sequence but by a scanning process:

a scan of the mRNA from the 5’ end until the first AUG is encountered,

signaling the beginning of the reading frame. The eIF4F complex is probably

involved in this process, perhaps using the RNA helicase activity of eIF4A to

eliminate secondary structure in the 5’ untranslated portion of the mRNA.

Scanning is also facilitated by another protein, eIF4B. The roles of the various

bacterial and eukaryotic initiation factors in the overall process are

summarized in Table. The mechanism by which these proteins act is an

important area of investigation.
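The scanning process described above can be sketched as a simple search from the 5′ end. This is a minimal illustration that ignores the cap, the initiation factors and any sequence context around the AUG; the function names and the example message are assumptions made here.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def scan_for_start(mrna):
    """Move from the 5' end one nucleotide at a time; return the index
    of the first AUG, or None if there is none."""
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] == "AUG":
            return i
    return None


def reading_frame(mrna):
    """Return the codons from the first AUG up to (not including) the first
    in-frame termination codon."""
    start = scan_for_start(mrna)
    if start is None:
        return []
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


mrna = "GCCACCAUGGCUUGAAUGAAA"  # invented example, written 5' -> 3'
print(scan_for_start(mrna), reading_frame(mrna))
```

The first AUG sets the reading frame, so the second AUG further downstream is never used as a start in this model.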

Summary of translation initiation in eukaryotes. (a) The eIF3 factor converts the 40S ribosomal subunit to 40SN, which

resists association with the 60S ribosomal particle and is ready to accept the

initiator aminoacyl-tRNA. (b) With the help of eIF2, Met-tRNAiMet binds to the

40SN particle, forming the 43S complex. (c) Aided by eIF4, the mRNA binds to

the 43S complex, forming the 48S complex. (d) The eIF5 factor helps the 60S

ribosomal particle bind to the 48S complex, yielding the 80S complex that is

ready to begin translating the mRNA.



ELONGATION

The third stage of protein synthesis is elongation. Again, our initial focus is on

bacterial cells. Elongation requires (1) the initiation complex described above,

(2) aminoacyl-tRNAs, (3) a set of three soluble cytosolic proteins called

elongation factors (EF-Tu, EF-Ts, and EF-G in bacteria), and (4) GTP. Cells

use three steps to add each amino acid residue, and the steps are repeated as

many times as there are residues to be added.

Elongation Step 1: Binding of an Incoming Aminoacyl-tRNA

In the first step of the elongation cycle, the appropriate incoming

aminoacyl-tRNA binds to a complex of GTP-bound EF-Tu. The resulting

aminoacyl-tRNA–EF-Tu–GTP complex binds to the A site of the 70S initiation complex. The GTP

is hydrolyzed and an EF-Tu–GDP complex is released from the 70S ribosome.

The EF-Tu–GTP complex is regenerated in a process involving EF-Ts and GTP.



Elongation Step 2: Peptide Bond Formation

A peptide bond is now formed between the two amino acids bound by their

tRNAs to the A and P sites on the ribosome. This occurs by the transfer of the

initiating N-formylmethionyl group from its tRNA to the amino group of the

second amino acid, now in the A site. The α-amino group of the amino acid in

the A site acts as a nucleophile, displacing the tRNA in the P site to form the

peptide bond. This reaction produces a dipeptidyl-tRNA in the A site, and the

now “uncharged” (deacylated) tRNAfMet remains bound to the P site. The

tRNAs then shift to a hybrid binding state, with elements of each spanning two

different sites on the ribosome, as shown in Figure. The enzymatic activity that



catalyzes peptide bond formation has historically been referred to as peptidyl transferase and was widely assumed to be intrinsic to one or more of the

proteins in the large ribosomal subunit. It is now known that this reaction is catalyzed by the 23S rRNA.

Elongation Step 3: Translocation

In the final step of the elongation cycle, translocation, the ribosome moves one

codon toward the 3′ end of the mRNA. This movement shifts the anticodon of

the dipeptidyl-tRNA, which is still attached to the second codon of the mRNA,

from the A site to the P site, and shifts the deacylated tRNA from the P site to

the E site, from where the tRNA is released into the cytosol. The third codon of

the mRNA now lies in the A site and the second codon in the P site. Movement

of the ribosome along the mRNA requires EF-G (also known as translocase) and

the energy provided by hydrolysis of another molecule of GTP.

A change in the three-dimensional conformation of the entire ribosome results

in its movement along the mRNA. Because the structure of EF-G mimics the

structure of the EF-Tu–tRNA complex, EF-G can bind the A site and

presumably displace the peptidyl-tRNA. The ribosome, with its attached

dipeptidyl-tRNA and mRNA, is now ready for the next elongation cycle and

attachment of a third amino acid residue. This process occurs in the same way

as addition of the second residue. For each amino acid residue correctly added

to the growing polypeptide, two GTPs are hydrolyzed to GDP and Pi as the

ribosome moves from codon to codon along the mRNA toward the 3’ end.
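The bookkeeping of the three elongation steps can be followed with a small sketch. It tracks only which codon sits in the P and E sites, the length of the growing peptide and the number of GTPs hydrolyzed; the function name and the tuple layout are assumptions made for this illustration.

```python
def elongation_cycles(codons):
    """Simulate elongation on a list of codons, starting from an initiation
    complex that holds the first codon in the P site.

    Returns one tuple per cycle:
    (codon whose tRNA is now in the E site, codon in the P site,
     peptide length, cumulative GTPs hydrolyzed)."""
    p_site = codons[0]
    peptide_length = 1      # the initiating fMet
    gtp_used = 0
    log = []
    for codon in codons[1:]:
        a_site = codon      # step 1: aminoacyl-tRNA binds the A site
        gtp_used += 1       #         (GTP hydrolyzed on EF-Tu)
        peptide_length += 1 # step 2: peptide bond formation
        e_site = p_site     # step 3: translocation moves the deacylated tRNA
        p_site = a_site     #         to the E site and the peptidyl-tRNA to P
        gtp_used += 1       #         (GTP hydrolyzed with EF-G)
        log.append((e_site, p_site, peptide_length, gtp_used))
    return log


for cycle in elongation_cycles(["AUG", "UUU", "CCC"]):
    print(cycle)
```

Each added residue costs two GTPs in this accounting, one at the binding step and one at translocation.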

The polypeptide remains attached to the tRNA of the most recent amino acid to

be inserted. This association maintains the functional connection between the

information in the mRNA and its decoded polypeptide output. At the same

time, the ester linkage between this tRNA and the carboxyl terminus of the

growing polypeptide activates the terminal carboxyl group for nucleophilic

attack by the incoming amino acid to form a new peptide bond. As the existing

ester linkage between the polypeptide and tRNA is broken during peptide bond

formation, the linkage between the polypeptide and the information in the

mRNA persists, because each newly added amino acid is still attached to its

tRNA.



Elongation of protein synthesis: Eukaryotes

The elongation cycle in eukaryotes is quite similar to that in prokaryotes. Three

eukaryotic elongation factors (eEF1α, eEF1βγ, and eEF2) have functions

analogous to those of the bacterial elongation factors (EF-Tu, EF-Ts, and EF-G,

respectively). Eukaryotic ribosomes do not have an E site; uncharged tRNAs

are expelled directly from the P site.

TERMINATION OF POLYPEPTIDE CHAIN

Termination, the fourth stage of polypeptide synthesis, is signaled by the

presence of one of three termination codons in the mRNA (UAA, UAG, and

UGA), immediately following the final coded amino acid.

In bacteria, once a termination codon occupies the ribosomal A site, three

termination factors, or release factors—the proteins RF-1, RF-2, and RF-3—

contribute to (1) hydrolysis of the terminal peptidyl-tRNA bond; (2) release of the

free polypeptide and the last tRNA, now uncharged, from the P site; and (3)

dissociation of the 70S ribosome into its 30S and 50S subunits, ready to start

a new cycle of polypeptide synthesis.

RF-1 recognizes the termination codons UAG and UAA, and RF-2 recognizes

UGA and UAA. Either RF-1 or RF-2 (depending on which codon is present)

binds at a termination codon and induces peptidyl transferase to transfer the

growing polypeptide to a water molecule rather than to another amino acid.

The release factors have domains thought to mimic the structure of tRNA, as

shown for the elongation factor EF-G in Figure 27–25b. The specific function of

RF-3 has not been firmly established, although it is thought to release the

ribosomal subunit. In eukaryotes, a single release factor, eRF, recognizes all

three termination codons.
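The codon specificity of the two bacterial release factors named above can be written as a small lookup. This is only a sketch of the stated recognition pattern; the dictionary and function names are assumptions made for the illustration.

```python
# Termination codons recognized by each bacterial release factor (as stated above).
RELEASE_FACTORS = {
    "RF-1": {"UAG", "UAA"},
    "RF-2": {"UGA", "UAA"},
}


def factors_for(codon):
    """Return the release factors that recognize the codon in the A site."""
    return sorted(name for name, codons in RELEASE_FACTORS.items() if codon in codons)


for codon in ["UAA", "UAG", "UGA", "AUG"]:
    print(codon, factors_for(codon))
```

UAA is read by both factors, UAG only by RF-1 and UGA only by RF-2; a sense codon such as AUG is read by neither.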



Termination of protein synthesis: Eukaryotes

Eukaryotes have two release factors: eRF1, which recognizes all three

termination codons, and eRF3, a ribosome-dependent GTPase that helps eRF1

release the finished polypeptide.

ELONGATION FACTORS AND RELEASING FACTORS

TRANSLATIONAL PROOF READING



Introduction:

The aminoacylation of tRNA performs two functions: the activation of an amino

acid for peptide bond formation and attachment of the amino acid to an

adaptor tRNA which directs its placement within a growing polypeptide. In fact,

the identity of the amino acid attached to a tRNA is not checked on the

ribosome and attaching the correct amino acid to each tRNA is, therefore,

essential to the fidelity of protein synthesis as a whole.

Screening and editing:

The correct translation of genetic message depends on the high degree of

specificity of aminoacyl-tRNA synthetases. These enzymes are highly sensitive in

their recognition of the amino acid to be activated and of the prospective tRNA

acceptor. A very high specificity is indeed necessary during the Stage I (i.e.,

activation of amino acids), in order to avoid errors in the biosynthesis of

proteins, because once the aminoacyl-tRNA is formed, there is no longer any

control mechanism in the cell to verify the nature of the amino acid and it is

therefore not possible to reject an amino acid which would have been

incorrectly bound. Consequently, the amino acid would be erroneously

incorporated in the protein molecule.

Therefore, the aminoacyl-tRNA synthetases, in vivo, must either not commit

any errors or be able to rectify them. The tRNA molecules, that accept different

amino acids, have differing base sequences, and hence they can be easily

recognized by the synthetases. But the synthetases must, in particular, be able

to distinguish between two amino acids of very similar structure; and some of

them have this capacity. For example, isoleucine (Ile) differs from valine (Val)

only in having an additional methylene (-CH2-) group.

The additional binding energy contributed by this extra —CH2— group favours

the activation of isoleucine (to form Ile-AMP) over valine by a factor of 200. The

concentration of valine in vivo is about 5 times that of isoleucine, and so valine

would be mistakenly incorporated in place of isoleucine 1 in every 40 times.

However, the observed error frequency in vivo is only 1 in 3,000, indicating that

there must be subsequent editing steps to enhance fidelity. In fact, the Ile-tRNA



synthetase corrects its own errors, i.e., in the presence of tRNAIle, the Val-AMP

formed is hydrolyzed (but not Ile-AMP), thus preventing an erroneous

aminoacylation (i.e., a misacylation) of tRNAIle.
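The error estimate quoted above follows from simple arithmetic, sketched here with exact fractions. The numbers are those given in the text; the variable names, and the idea of expressing the editing step as a single "gain" factor, are assumptions made for the illustration.

```python
from fractions import Fraction

discrimination = 200        # Ile is favoured over Val 200-fold at activation
val_to_ile_ratio = 5        # Val is about 5 times more abundant in vivo

# Expected rate of Val being activated in place of Ile without any editing.
expected_error = Fraction(val_to_ile_ratio, discrimination)

# Error frequency actually observed in vivo.
observed_error = Fraction(1, 3000)

# Factor by which the later editing steps must reduce the error rate.
editing_gain = expected_error / observed_error

print("expected error without editing:", expected_error)
print("observed error:", observed_error)
print("improvement due to editing:", editing_gain)
```

The initial selection alone would give 1 error in 40; reaching 1 in 3,000 requires the editing steps to improve fidelity about 75-fold.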

Furthermore, this hydrolytic reaction frees the synthetase for the activation

and transfer of Ile, the correct amino acid. Hydrolysis of Ile-AMP, the desired

intermediate, is however avoided because the hydrolytic site is just large

enough to accommodate Val-AMP, but too small to allow the entry of Ile-AMP.

Thus, most aminoacyl-tRNA synthetases contain two sites: the acylation or

synthetic site and the hydrolytic site. And the entire system is forced through

two successive “filters”, rather than one, thereby increasing the potential

fidelity by a power of 2. The first filter is the synthetic site on synthetases which brings about the initial amino acid binding and activation to aminoacyl-

AMP.

The second filter is the separate active site or hydrolytic site on synthetases

which catalyzes deacylation of incorrect aminoacyl-AMPs. The synthetic site

rejects amino acids that are larger than the correct one because there is

insufficient room for them, whereas the hydrolytic site destroys activated

intermediates that are smaller than the correct species. Hydrolytic proofreading



is central to the fidelity of many aminoacyl-tRNA synthetases, as it is to DNA

polymerases. In addition to proofreading after formation of the aminoacyl-AMP

intermediate, most aminoacyl-tRNA synthetases are also capable of hydrolyzing

the ester linkage between amino acids and tRNAs in aminoacyl-tRNAs.

This hydrolysis is greatly accelerated for incorrectly-charged tRNAs, providing

yet a third filter to further enhance the fidelity of the overall process. In

contrast, in a few aminoacyl-tRNA synthetases that activate amino acids that

have no close structural relatives, little or no proofreading occurs; in these

cases, the active site can sufficiently discriminate between the proper amino

acid and incorrect amino acids. Proofreading is costly in energy and time and

hence is selected in the course of evolution only when fidelity must be enhanced.
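The two-filter ("double sieve") logic described above can be sketched as a pair of size checks applied in order. This is a toy model under stated assumptions: side-chain carbon count stands in for size, the two site limits are hypothetical values chosen for an Ile-specific synthetase, and the names `Fate` and `double_sieve` are illustrative only. Real discrimination also depends on shape and chemistry, which this sketch ignores.

```python
from enum import Enum


class Fate(Enum):
    REJECTED = "rejected at the synthetic site (too large)"
    HYDROLYZED = "activated, then hydrolyzed at the hydrolytic site (too small)"
    ACCEPTED = "activated and kept (correct size)"


def double_sieve(size, synthetic_limit, hydrolytic_limit):
    """Pass one amino acid through the two filters in order.

    size             -- crude size measure of the amino acid side chain
    synthetic_limit  -- largest size that fits the synthetic site
    hydrolytic_limit -- largest size that fits the hydrolytic site
    """
    if size > synthetic_limit:
        return Fate.REJECTED
    if size <= hydrolytic_limit:
        return Fate.HYDROLYZED
    return Fate.ACCEPTED


# Side-chain carbon counts, used here only as a crude stand-in for size.
SIDE_CHAIN_CARBONS = {"Val": 3, "Ile": 4, "Phe": 7}

# Hypothetical limits for an Ile-specific synthetase: Ile just fits the
# synthetic site, and Val just fits the hydrolytic site.
for name, size in SIDE_CHAIN_CARBONS.items():
    fate = double_sieve(size, synthetic_limit=4, hydrolytic_limit=3)
    print(f"{name}: {fate.value}")
```

Only an amino acid that passes the first check and fails the second survives, which is why two independent sites give better discrimination than one.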

The overall error rate of protein synthesis (~1 mistake per 10^4 amino acids

incorporated) is not nearly as low as for DNA replication, perhaps because a

mistake in a protein is erased by destroying the protein and is not passed onto

future generations. This degree of fidelity is sufficient to ensure that most

proteins contain no mistakes and that the large amount of energy required

to synthesize a protein is rarely wasted.

INHIBITORS OF TRANSLATION AND THEIR MECHANISM

List of inhibitors in translational initiation:

Streptomycin

Streptomycin, which was discovered by Selman Waksman in 1944, is a

medically important member of a family of antibiotics known as

aminoglycosides that inhibit prokaryotic ribosomes in a variety of ways. It is a

highly basic trisaccharide and, at higher concentrations, interferes with the

binding of fMet-tRNA to ribosomes and thereby prevents the correct initiation

of protein synthesis. And at relatively low concentrations, streptomycin also

leads to a misreading of the genetic code on the mRNA and inhibits initiation of

the polypeptide chain. If poly U is the template, Ile (AUU) is incorporated in

addition to Phe (UUU). An extensive series of experiments revealed that a single

protein in the 30S subunit, namely protein S12, is the determinant of

streptomycin sensitivity.



Tetracyclines

Tetracycline and its derivatives are broad-spectrum antibiotics that inhibit

protein synthesis by blocking the A site on the ribosome so that the binding of

aminoacyl-tRNAs is inhibited; the nascent polypeptide chain remains in the P

site and can react normally with puromycin, another antibiotic inhibitor.

List of inhibitors in translational elongation:

Chloramphenicol, CAP (= Chloromycetin)

Chloramphenicol, the first of the “broad-spectrum” antibiotics, inhibits peptidyl

transferase activity on the large subunit of prokaryotic ribosomes. However, its

clinical uses are limited to only severe infections because of its toxic side

effects, which are caused, at least in part, by the chloramphenicol sensitivity of

mitochondrial ribosomes. It is a classic inhibitor of protein synthesis in bacteria

and acts, at relatively low concentrations on bacterial (also mitochondrial and

chloroplast) ribosomes by blocking peptidyl transfer by interfering with the

interactions of ribosomes with A site-bound aminoacyl-tRNAs, but does not



affect cytosolic protein synthesis in eukaryotes. Of the various possible optical

isomers, only the D(–)-threo form shows significant inhibitory activity.

Cycloheximide (= Actidione)

It is a potent fungicidal antibiotic and blocks the peptidyl transferase of 80S

eukaryotic ribosomes but not that of 70S bacterial (also mitochondrial and

chloroplast) ribosomes. Contrary to chloramphenicol, cycloheximide affects

only ribosomes in the cytosol. The difference in the sensitivity of protein

synthesis to these two drugs provides a powerful way to determine in which cell

compartment a particular protein is translated.
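The diagnostic use of these two drugs amounts to a small decision table, sketched below. The function name and the wording of the returned labels are illustrative; the logic assumes clean, all-or-nothing inhibition, which real experiments only approximate.

```python
def translation_compartment(blocked_by_chloramphenicol, blocked_by_cycloheximide):
    """Infer where a protein is translated from its drug sensitivity.

    Each argument is True if synthesis of the protein stops in the
    presence of that drug.
    """
    if blocked_by_cycloheximide and not blocked_by_chloramphenicol:
        return "cytosol (80S ribosomes)"
    if blocked_by_chloramphenicol and not blocked_by_cycloheximide:
        return "mitochondrion or chloroplast (70S-type ribosomes)"
    # Sensitive to both or to neither: the simple test does not decide.
    return "inconclusive"


print(translation_compartment(False, True))
print(translation_compartment(True, False))
print(translation_compartment(True, True))
```

A protein whose labelling continues in cycloheximide but stops in chloramphenicol is, on this reasoning, made on organellar ribosomes.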

Erythromycin: It binds to the bacterial 50S ribosomal subunit and blocks the translocation

step, thereby “freezing” the peptidyl-tRNA in the A site.

Fusidic acid: It is a steroid and affects the translocation step in eukaryotic ribosomes after

formation of the peptide bond, possibly by preventing cleavage of GTP in the

eEF2-mediated translocation reaction.

POST TRANSLATIONAL MODIFICATION OF PROTEINS – GLYCOSYLATION.

Introduction:

In the final step of protein synthesis, the newly-formed peptide chain is folded

and processed into its biologically-active form. At some point of time, during or

after protein synthesis, the polypeptide chain spontaneously assumes its native

conformation by forming sufficient number of hydrogen bonds and van der

Waals, ionic, and hydrophobic interactions. In this way, the linear (or

one-dimensional) genetic message encoded in mRNA is converted into the

3-dimensional structure of the protein. However, there are some other nascent

proteins which undergo one or more processing reactions called

posttranslational modifications, for their conversion to the active forms.

Such modifications occur in both eukaryotes and prokaryotes and include the

following:

N-terminal and C-terminal Modifications



All polypeptides begin with a residue of N-formylmethionine (in bacteria) or

methionine (in eukaryotes). However, the formyl group, the terminal

methionine residue, and often additional N-terminal or C-terminal residues

must be removed enzymatically before the polypeptides become the final functional

proteins. The formyl group at the N-terminus of bacterial proteins is hydrolyzed

by a deformylase. One or more N-terminal residues may be removed by

aminopeptidases. In about half of the eukaryotic proteins, the amino group of

the N-terminal residue is acetylated after translation. The C-terminal residues

are also sometimes modified.

Loss of Signal Sequences

In certain proteins, some (15 to 30) amino acid residues at the N-terminus play

a role in directing the protein to its ultimate destination in the cell. Such signal sequences, as they are called, are ultimately removed by specific peptidases.

Modification of Individual Amino Acids

Certain amino acid side chains may be specifically modified. For instance, the

hydroxyl groups of certain serine, threonine, and tyrosine residues of some

proteins undergo enzymatic phosphorylation by ATP; the phosphate groups add

negative charge to these polypeptides. The functional significance of this

modification varies from one protein to the other. For example, the milk protein

casein has many phosphoserine groups, which function to bind Ca2+. Given

that Ca2+ and phosphate, as well as amino acids, are required by suckling

young, casein provides three essential nutrients. The phosphorylation and

dephosphorylation of the OH group of certain serine residues regulate the

activity of some enzymes, such as glycogen phosphorylase.

Sometimes, additional carboxyl groups are added to Asp and Glu residues of

some proteins. For instance, the blood clotting protein prothrombin contains

many γ-carboxyglutamate residues in its N-terminal region. These groups bind

Ca2+ which is required to initiate the clotting mechanism.

In some proteins, certain lysine residues are methylated enzymatically.



Monomethyl- and dimethyllysine residues are present in some muscle proteins

and in cytochrome c.

Calmodulin of most organisms contains one trimethyllysine residue at a specific

position. In other proteins, the carboxyl groups of some Glu residues undergo

methylation, which removes their negative charge.

Some proline and lysine residues in collagen are hydroxylated.

Formation of Disulfide Cross-links

After acquiring their native conformations, some proteins are covalently cross-linked

by the formation of disulfide bridges between cysteine residues. These

cross-links help to protect the native conformation of the protein molecule from

denaturation in an extracellular environment that is quite different from that

inside the cell.

Attachment of Carbohydrate Side Chains

In glycoproteins, the carbohydrate side chains are attached covalently during

or after the synthesis of polypeptide chain. In some glycoproteins, the

carbohydrate side chain is attached enzymatically to Asn residues (N-linked

oligosaccharides), in others to Ser or Thr residues (O-linked oligosaccharides).

Many proteins that function extracellularly contain oligosaccharide side chains.

Addition of Prosthetic Groups

Many prokaryotic and eukaryotic proteins require covalently bound prosthetic

groups for their activity. These groups become attached to the polypeptide

chain after it leaves the ribosome. The two significant examples are the

covalently-bound biotin molecule in acetyl-CoA carboxylase and the heme

group of cytochrome c.

Addition of Isoprenyl Groups

Many eukaryotic proteins are isoprenylated; a thioether bond is formed between

the isoprenyl group and a cysteine residue of the protein. The isoprenyl groups

are derived from pyrophosphate intermediates of the cholesterol biosynthetic

pathway, such as farnesyl pyrophosphate.



Proteins so modified include the products of the ras oncogenes and proto-

oncogenes, G proteins, and proteins called lamins, found in the nuclear matrix.

Proteolytic Trimming

Many proteins (insulin, collagen) and proteases (trypsin, chymotrypsin) are

initially synthesized as larger, inactive precursor proteins. These precursors are

proteolytically trimmed to produce their final, active forms. Some animal

viruses, notably poliovirus, synthesize long polycistronic proteins from one long

mRNA molecule. These protein molecules are subsequently cleaved at specific

sites to provide the several specific proteins required for viral function.

CONTROL OF TRANSLATION IN EUKARYOTES

Eukaryotic mRNAs are much longer-lived than prokaryotic ones, so there is

more opportunity for translational control. The rate-limiting factor in

translation is usually initiation, so we would expect to find most control

exerted at this level. In fact, the most common mechanism of such control is

phosphorylation of initiation factors, and we know of cases where such

phosphorylation can be inhibitory and others where it can be stimulatory. We

also know of at least one case in which an initiation factor is bound and

inhibited by another protein until that protein is phosphorylated. This

phosphorylation releases the initiation factor so initiation can occur. Finally,

there is an example of a protein binding directly to the 5′- untranslated region

of an mRNA and preventing its translation. Removal of this protein activates

translation.

Phosphorylation of Initiation Factor eIF2α: The best known example of inhibitory phosphorylation occurs in reticulocytes,

which make one protein, hemoglobin, to the exclusion of almost everything

else. But sometimes reticulocytes are starved for heme, the iron-containing

part of hemoglobin, so it would be wasteful to go on producing α- and β-

globins, the protein parts. Instead of stopping the production of the globin

mRNAs, reticulocytes block their translation as follows.

The absence of heme unmasks the activity of a protein kinase called the

heme-controlled repressor, or HCR. This enzyme phosphorylates one of the



subunits of eIF2, known as eIF2α. The phosphorylated form of eIF2 binds more

tightly than usual to eIF2B, which is an initiation factor whose job is to

exchange GTP for GDP on eIF2. When eIF2B is stuck fast to phosphorylated

eIF2, it cannot get free to exchange GTP for GDP on other molecules of eIF2, so

eIF2 remains in the inactive GDP-bound form and cannot attach Met-tRNAiMet

to 40S ribosomes. Thus, translation initiation grinds to a halt.

Diagram: Repression of translation by phosphorylation of eIF2α.

Heme abundance, no repression:

Step 1, Met-tRNAiMet binds to the eIF2-GTP complex, forming the ternary

Met-tRNAiMet–GTP–eIF2 complex. The eIF2 factor is a trimer of nonidentical

subunits (α [green], β [yellow], and γ [orange]).

Step 2, the ternary complex binds to the 40S ribosomal particle (blue).

Step 3, GTP is hydrolyzed to GDP and phosphate, allowing the GDP–eIF2

complex to dissociate from the 40S ribosome, leaving Met-tRNAiMet attached.

Step 4, eIF2B (red) binds to the eIF2–GDP complex.

Step 5, eIF2B exchanges GTP for GDP on the complex.

Step 6, eIF2B dissociates from the complex. Now eIF2–GTP and Met-tRNAiMet

can get together to form a new complex to start a new round of initiation.

Heme starvation leads to translational repression.



Step A, HCR (activated by heme starvation) attaches a phosphate group

(purple) to the α-subunit of eIF2.

Then, steps 1–5 are identical to those under heme abundance, but

Step 6 is blocked because the high affinity of eIF2B for the phosphorylated

eIF2α prevents its dissociation. Now eIF2B will be tied up in such complexes,

and translation initiation will be repressed.
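The sequestration argument in the steps above can be illustrated with a simple stoichiometric sketch. It assumes that each phosphorylated eIF2 traps one eIF2B molecule and that eIF2B is scarcer than eIF2; the pool sizes are hypothetical, arbitrary units chosen only to show the effect, and the function name is illustrative.

```python
def free_eif2b(total_eif2b, phosphorylated_eif2):
    """eIF2B molecules still free to recycle eIF2.

    Assumes each phosphorylated eIF2 holds on to exactly one eIF2B.
    """
    return max(0, total_eif2b - phosphorylated_eif2)


# Hypothetical pool sizes (arbitrary units); eIF2B is assumed to be the
# scarcer factor.
TOTAL_EIF2 = 100
TOTAL_EIF2B = 20

for fraction in (0.0, 0.1, 0.2, 0.5):
    phosphorylated = round(TOTAL_EIF2 * fraction)
    free = free_eif2b(TOTAL_EIF2B, phosphorylated)
    status = "GDP/GTP exchange continues" if free > 0 else "exchange blocked"
    print(f"{fraction:.0%} of eIF2 phosphorylated: {free} eIF2B free, {status}")
```

Under these assumptions, phosphorylating only a fraction of the eIF2 pool is enough to tie up all of the eIF2B, which is why the block on initiation can be so complete.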

DIFFERENCES BETWEEN PROKARYOTES AND EUKARYOTES

Outline all the stages in prokaryotic and eukaryotic translation and mention

the below tabular column



UNIT – 5 REGULATION OF GENE EXPRESSION: Transcriptional control. Operon concept, catabolite repression. Inducible and repressible systems. Negative gene regulation – E.Coli lac operon; Positive regulation - E.Coli ara operon; Regulation by attenuation – his and trp operons, anti – termination - N protein and nut sites, DNA binding sites, DNA binding protein, enhancer sequences, identification of protein binding site on DNA. Maturation and processing of RNA – Methylation, cutting and modification of t RNA degradation system.

TRANSCRIPTIONAL CONTROL



At the level of transcription, it can be determined whether a gene is transcribed

at a given point in time. The chromatin structure plays an important role in

this decision. Chromatin structures exist that can effectively inhibit

transcription and shut down a gene. This “silencing” of genes can be transient

or permanent and is generally observed in development and differentiation

processes. The regulated transcription of genes requires as an essential step

reorganization and modification of the chromatin, which is a prerequisite for

the initiation of transcription and is influenced by epigenetic changes in the

DNA in the form of methylation of cytidine residues. Following chromatin

reorganization and modification, transcription initiation requires the selection

of the target gene and formation of a transcription initiation complex at the

starting point of transcription. A large number of proteins are involved in this

step. The main components are the multisubunit RNA polymerase, general and

specific transcription factors, and cofactors that help to coordinate the

chromatin structural changes and the process of RNA synthesis. The formation

of a functional initiation complex is often the rate-limiting step in transcription

and is subject to a variety of regulation mechanisms.

Conversion of the pre-mRNA into the mature mRNA

Transcription of genes in mammals often initially produces a pre-mRNA, whose

information content can be modulated by subsequent polyadenylation or

splicing. Various final mRNAs coding for proteins with varying function and

localization can be produced in this manner starting from a single primary

transcript.

Regulation at the Translation Level

The use of a particular mature mRNA for protein biosynthesis is also highly

regulated. The regulation can occur via the accessibility of the mRNA for the

ribosome or via the initiation of protein biosynthesis on the ribosome. In this

manner, a given level of mature mRNA can specifically determine when and

how much of a protein is synthesized on the ribosome.

Nature of the Regulatory Signals



Regulation always implies that signals are received, processed and translated

into a resulting action. The nature of the signals which are employed in the

course of the regulation of gene expression and are finally translated into a

change in protein concentration can vary dramatically. Regulatory molecules

can be small molecular metabolites, hormones, proteins or ions. The signals

can be of external origin or can be produced within the cell. External signals

originating from other tissues or cells of the organism are transferred across

the cell membrane into the interior of the cell, where they are transduced by

sequential reactions to the level of transcription or translation. Complex signal

chains are often involved in the transduction.

OPERON CONCEPT

Bacteria have a simple general mechanism for coordinating the regulation of

genes encoding products that participate in a set of related processes: these

genes are clustered on the chromosome and are transcribed together. Many

prokaryotic mRNAs are polycistronic (multiple genes on a single transcript),

and the single promoter that initiates transcription of the cluster is the site of

regulation for expression of all the genes in the cluster. The gene cluster and

promoter, plus additional sequences that function together in regulation, are

called an operon. Operons that include two to six genes transcribed as a unit are common; some

operons contain 20 or more genes.

Many of the principles of prokaryotic gene expression were first defined by

studies of lactose metabolism in E. coli, which can use lactose as its sole

carbon source.

In 1960, François Jacob and Jacques Monod published a short paper in the

Proceedings of the French Academy of Sciences that described how two

adjacent genes involved in lactose metabolism were coordinately regulated by a

genetic element located at one end of the gene cluster. The genes were those for

β-galactosidase, which cleaves lactose to galactose and glucose, and

galactoside permease, which transports lactose into the cell.



Prokaryotic genes are clustered in operons; an operon is controlled by a single

promoter and contains several genes all devoted to the same metabolic

pathway, such as lactose catabolism in the case of the lac operon.

Regulation of transcription is performed essentially at two stages: (i) At the

initiation level, variation in the pattern of transcribed operons can result from

the use of different σ factors, each specific to a type of promoter. In this

respect, events such as sporulation or response to heat shock are driven by a

cascade of σ factors. The host RNA pol can be substituted by a newly

synthesized viral RNA pol upon bacteriophage infection. An alternative pathway

requires trans-acting factors that bind to the operator element, a specific

sequence generally located near the promoter, to enhance (activators) or

repress (repressors) the initiation of transcription.

(ii) When the elongating RNA pol encounters a first termination sequence, it

may either stop or continue to transcribe adjacent genes. This will depend on

factors that fasten on the elongating polymerase after having bound to a

specific site on the DNA or on the transcript. In the case of antitermination factors (or antiterminators), RNA pol will read through the stop signal, but

some other factors exist that will increase its propensity to stop, probably by

inducing frequent pausing. At some operons, the phenomenon of attenuation can also occur, when transcription is intimately coupled with translation:

progression of the ribosome along the nascent RNA modulates the formation of

secondary structures, enabling RNA pol to read through intrinsic terminators

(or attenuators).

CATABOLITE REPRESSION INDUCIBLE AND REPRESSIBLE SYSTEMS

Expression of many genes is controlled by availability of CRP-cAMP; lack of

expression due to low CRP-cAMP is referred to as catabolite repression.

cAMP–CRP Complex

The best characterized mechanism of catabolite repression involves the

regulation of the intracellular concentration of the cAMP–CRP complex. It is

well known that the presence of glucose in the growth medium lowers the



intracellular cAMP level under certain conditions. Cyclic AMP is synthesized

from ATP by the enzyme adenylate cyclase. Although the mechanism of

regulation of the cAMP level remains elusive, glucose is thought to decrease

cAMP by decreasing the level of the phosphorylated form of enzyme IIAGlc,

which is involved in the activation of adenylate cyclase.

IIAGlc is one of the enzymes of the phosphoenolpyruvate-dependent carbohydrate

phosphotransferase system (PTS) and is directly responsible for the active transport and

phosphorylation of glucose. Recently, it was discovered that the concentration

of CRP is also lowered by the presence of glucose and that this is an additional

factor contributing to catabolite repression. The decreased CRP is a

consequence of the complex autoregulation of expression of the crp gene. It

should be noted that the reduction in the cAMP-CRP level by glucose is usually

rather moderate (in the range of several-fold).

Inducer Exclusion: The second mechanism of catabolite repression is inducer exclusion, by which

glucose lowers the intracellular concentration of inducers necessary for the

induction of catabolic operons. The target of glucose signaling in inducer

exclusion is operon-specific regulators, such as the Lac repressor. The

dephosphorylated enzyme IIAGlc, which accumulates in the presence of

glucose, binds to and inactivates (for example) the Lac permease, resulting in

an increase of the active unliganded Lac repressor (see Lac Operon). Inducer

exclusion is a mechanism by which glucose inhibits more strictly the

expression of target operons.

Catabolite Repressor/Activator Protein: The third mechanism of catabolite repression is mediated by the catabolite

repressor/activator (Cra) protein, which acts as a global regulator of genes

encoding enzymes of central carbohydrate metabolism. The unliganded form of

Cra binds to the operator regions of target operons, causing either activation or

inhibition of transcription. The presence of glucose or other PTS sugars



produces glycolytic catabolites, such as fructose-1-phosphate, which bind to

the Cra protein and cause it to dissociate from the target DNA, resulting in

either catabolite repression or catabolite activation.

Relationships between the Various Mechanisms: While multiple mechanisms of catabolite repression have been identified in E.

coli, their signaling pathways appear to be interrelated to each other. For

example, the PTS plays a pivotal role in the regulation of the intracellular

concentrations of cAMP, inducer, and glycolytic catabolites. In addition, it is

particularly important to realize that the contribution of each mechanism

varies, depending on growth conditions and the target genes. For example, the

Cra-mediated mechanism may play no role in catabolite repression of the lac

operon, because this operon is not under the control of Cra. An unexpected

finding is that the presence of glucose in the lactose medium does not affect

the intracellular cAMP level. This means that catabolite repression mediated by

the reduction in cAMP never happens in glucose–lactose diauxie. The presence

of unliganded Lac repressor through inducer exclusion is the principal

mechanism for this historical phenomenon.

Mechanism of catabolite repression in the glucose-lactose system.

When both lactose and glucose are present, glucose is transported and

phosphorylated by the glucose PTS (IIAGlc + IICBGlc), increasing the



concentration of the nonphosphorylated form of IIAGlc, which prevents the

uptake of lactose by inhibiting the Lac permease activity. Thus, the

concentration of lac inducer is very low in the presence of glucose, so the Lac

repressor is active and represses transcription of the lac operon. It should be

noted that glucose does not affect the binding of cAMP–CRP to the promoter,

because the levels of cAMP and CRP are not reduced by the presence of

glucose.

NEGATIVE GENE REGULATION – E.COLI LAC OPERON

Lactose metabolism in E. coli.

Uptake and metabolism of lactose require the activities of galactoside permease

and β-galactosidase. Conversion of lactose to allolactose by transglycosylation

is a minor reaction also catalyzed by β-galactosidase.

The positive regulation of lac operon: A regulatory mechanism known as catabolite repression restricts expression

of the genes required for catabolism of lactose, arabinose, and other sugars in

the presence of glucose, even when these secondary sugars are also present.

The effect of glucose is mediated by cAMP, as a coactivator, and an activator



protein known as cAMP receptor protein, or CRP (the protein is sometimes

called CAP, for catabolite gene activator protein). CRP is a homodimer (subunit

Mr 22,000) with binding sites for DNA and cAMP. Binding is mediated by a

helix-turn-helix motif within the protein’s DNA-binding domain.

When glucose is absent, CRP-cAMP binds to a site near the lac promoter and

stimulates RNA transcription 50-fold. CRP-cAMP is therefore a positive

regulatory element responsive to glucose levels, whereas the Lac repressor is a

negative regulatory element responsive to lactose. The two act in concert: CRP-cAMP

has little effect on the lac operon when the Lac repressor is blocking

transcription, and dissociation of the repressor from the lac operator has little

effect on transcription of the lac operon unless CRP-cAMP is present to facilitate

transcription; when CRP is not bound, the wild-type lac promoter is a relatively

weak promoter.

(a) The binding site for CRP-cAMP is near the promoter. As in the case of the

lac operator, the CRP site has twofold symmetry (bases shaded beige) about the

axis indicated by the dashed line.

(b) Sequence of the lac promoter compared with the promoter consensus

sequence. The differences mean that RNA polymerase binds relatively weakly to

the lac promoter until the polymerase is activated by CRP-cAMP.

The open complex of RNA polymerase and the promoter does not form readily

unless CRP-cAMP is present. CRP interacts directly with RNA polymerase

through the polymerase’s α subunit.



The effect of glucose on CRP is mediated by the cAMP interaction (Fig. 28–18).

CRP binds to DNA most avidly when cAMP concentrations are high. In the

presence of glucose, the synthesis of cAMP is inhibited and efflux of cAMP from

the cell is stimulated. As [cAMP] declines, CRP binding to DNA declines,

thereby decreasing the expression of the lac operon. Strong induction of the lac

operon therefore requires both lactose (to inactivate the lac repressor) and a

lowered concentration of glucose (to trigger an increase in [cAMP] and

increased binding of cAMP to CRP). CRP and cAMP are involved in the

coordinated regulation of many operons, primarily those that encode enzymes

for the metabolism of secondary sugars such as lactose and arabinose. A

network of operons with a common regulator is called a regulon. This

arrangement, which allows for coordinated shifts in cellular functions that can

require the action of hundreds of genes, is a major theme in the regulated

expression of dispersed networks of genes in eukaryotes.
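The combined action of the two regulators described in this subsection can be written as a small truth table. The sketch below follows the CRP-cAMP model as stated here (glucose lowers cAMP, lactose releases the repressor) and reduces each signal to present or absent; the function name and the three output labels are illustrative, and real expression levels are graded rather than three discrete states.

```python
def lac_operon_expression(glucose_present, lactose_present):
    """Qualitative lac operon output under the CRP-cAMP model."""
    # Without lactose there is no allolactose, so the Lac repressor stays bound.
    repressor_bound = not lactose_present
    # cAMP is high, and CRP-cAMP bound, only when glucose is absent.
    crp_camp_bound = not glucose_present

    if repressor_bound:
        return "off"
    return "high" if crp_camp_bound else "low"


for glucose in (False, True):
    for lactose in (False, True):
        level = lac_operon_expression(glucose, lactose)
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives strong transcription, matching the requirement for both signals stated above.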

Combined effects of glucose and lactose on expression of the lac operon.

(a) High levels of transcription take place only when glucose concentrations

are low (so cAMP levels are high and CRP-cAMP is bound) and lactose

concentrations are high (so the Lac repressor is not bound).

(b) Without bound activator (CRP-cAMP), the lac promoter is poorly

transcribed even when lactose concentrations are high and the Lac repressor is

not bound.

The negative regulation of lac operon:



The lactose (lac) operon includes the genes for β-galactosidase (Z), galactoside

permease (Y), and thiogalactoside transacetylase (A). The last of these enzymes

appears to modify toxic galactosides to facilitate their removal from the cell.

Each of the three genes is preceded by a ribosome binding site that

independently directs the translation of that gene. Regulation of the lac operon

by the lac repressor protein (Lac) follows the pattern outlined in Figure.

The study of lac operon mutants has revealed some details of the workings of

the operon’s regulatory system.

In the absence of lactose, the lac operon genes are repressed. Mutations in the

operator or in another gene, the I gene, result in constitutive synthesis of the

gene products. When the I gene is defective, repression can be restored by

introducing a functional I gene into the cell on another DNA molecule,

demonstrating that the I gene encodes a diffusible molecule that causes gene

repression. This molecule proved to be a protein, now called the Lac repressor,

a tetramer of identical monomers. The operator to which it binds most tightly

(O1) abuts the transcription start site.

The I gene is transcribed from its own promoter (PI) independent of the lac

operon genes. The lac operon has two secondary binding sites for the Lac

repressor. One (O2) is centered near position +410, within the gene encoding

β-galactosidase (Z); the other (O3) is near position −90, within the I gene. To

repress the operon, the Lac repressor appears to bind to both the main

operator and one of the two secondary sites, with the intervening DNA looped

out. Either binding arrangement blocks transcription initiation.



When cells are provided with lactose, the lac operon is induced. An inducer

(signal) molecule binds to a specific site on the Lac repressor, causing a

conformational change that results in dissociation of the repressor from the

operator. The inducer in the lac operon system is not lactose itself but

allolactose, an isomer of lactose.

After entry into the E. coli cell (via the few existing molecules of permease),

lactose is converted to allolactose by one of the few existing β-galactosidase

molecules. Release of the operator by Lac repressor, triggered as the repressor

binds to allolactose, allows expression of the lac operon genes and leads to a

10^3-fold increase in the concentration of β-galactosidase.

POSITIVE REGULATION - E.COLI ARA OPERON

The classical ara operon of the bacterium Escherichia coli comprises three

genes, araBAD (Fig).

Figure: The positions of the araBAD operon and the araC regulatory gene on the E. coli chromosome, with the region between araB and araC and the nucleotide locations on the chromosome shown beneath the genes. The figure also marks three inverted repeat REP sequence pairs that are assumed to produce three self-paired hairpin structures in the mRNA. The entire chromosome contains approximately 4,639,221 nucleotide pairs.

This and four other operons, araC, araE, araFGH, and araJ, are uniquely

associated with metabolism of L-arabinose (Table).

ara Genes and Gene Products (location is based on a 100-minute circular chromosome)

Gene    Location on E. coli    Size of gene            Activity of product
        chromosome (min)       (amino acid codons)
araA    1.4                    500                     L-arabinose isomerase
araB    1.5                    566                     L-ribulokinase
araC    1.5                    292                     AraC regulatory protein
araD    1.4                    231                     L-ribulose-5-phosphate-4-epimerase
araE    64.2                   472                     L-arabinose transport, low affinity
araF    42.8                   329                     L-arabinose transport, high affinity
araG    42.7                   504                     L-arabinose transport, high affinity
araH    42.6                   329                     L-arabinose transport, high affinity
araJ    8.5                    394                     Transport or processing of polymer? Efflux of toxic arabinosides?

Each set of genes is under control of the activator protein AraC, the product of

the araC gene. The function of each gene is known, except for araJ, which may

encode a protein that processes or transports an arabinose-containing polymer,

or pumps potentially toxic arabinosides from the cell.



Uptake and Utilization of Arabinose:

Two independent systems deliver arabinose from the environment across the cell membrane into the cell. araE encodes a membrane protein that mediates arabinose uptake via proton symport and is the lower-affinity transporter (KM = 50 μM). The araFGH operon encodes a periplasmic arabinose-binding protein (araF), a probable ATPase subunit (araG), and a membrane protein (araH), which together mediate ATP-driven arabinose transport. This transporter shows higher affinity for arabinose (KM = 1 μM) than AraE, but lower capacity.

Internal arabinose is converted in three steps to D-xylulose-5-phosphate, a metabolite in the pentose phosphate shunt pathway and one that is not unique to arabinose metabolism. The enzyme mediating the first step, L-arabinose isomerase, has a low affinity for arabinose, with a KM (Michaelis constant) of 60 mM; this suggests that cells growing on arabinose have a very high internal arabinose concentration. The glucose-specific phosphotransferase enzyme IIA(Glc), when unphosphorylated, inhibits the isomerase. This inhibition may be one of the causes of the preferential use of glucose when both arabinose and glucose are in the environment.

The product of isomerase activity, L-ribulose, is converted to L-ribulose-5-phosphate by L-ribulokinase, and the phosphorylated compound is converted in turn to xylulose phosphate by the epimerase encoded by araD. Arabinose inhibits growth of araD mutants on other nutrients, presumably because accumulation of a high concentration of ribulose phosphate is toxic. Thus secondary mutants lacking isomerase or kinase activity as the result of an araA, araB, or araC mutation, and therefore not forming ribulose phosphate, can be selected by plating an araD population on broth plates containing arabinose.
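The KM values quoted above can be compared with a short calculation. This is a minimal sketch that assumes simple Michaelis-Menten kinetics, v/Vmax = [S] / (KM + [S]); the function name and the test concentrations are illustrative choices, and because the text gives no Vmax values only the fraction of maximal rate is computed.

```python
# Fraction of maximal rate under simple Michaelis-Menten kinetics (an assumption).
def fractional_rate(s_uM, km_uM):
    """Return v/Vmax = [S] / (KM + [S]); both arguments in micromolar."""
    return s_uM / (km_uM + s_uM)


# KM values from the text, converted to micromolar.
KM_UM = {
    "AraE transporter (low affinity)": 50.0,
    "AraFGH transporter (high affinity)": 1.0,
    "L-arabinose isomerase": 60_000.0,  # 60 mM
}

# Illustrative arabinose concentrations (micromolar), not taken from the source.
for s in (1.0, 50.0, 60_000.0):
    for name, km in KM_UM.items():
        print(f"[S] = {s:>8.0f} uM  {name:<36} v/Vmax = {fractional_rate(s, km):.4f}")
```

Each system runs at half its maximal rate when the arabinose concentration equals its KM. The isomerase therefore needs a concentration in the tens of millimolar to work efficiently, which is the basis for the inference that the internal arabinose concentration is very high.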

Regulation of ara Operon Expression:

In the absence of arabinose, the ara genes are essentially not expressed, except

for the regulatory gene, araC. On exposure to L-arabinose, all of the ara genes

are activated, transcription of araBAD begins within five seconds, and the Ara

proteins appear within several minutes, allowing growth on the sugar.



Cyclic AMP (3′,5′-cyclic AMP, cAMP) bound to cyclic AMP receptor protein

(CRP) is also a positive regulator for all the ara operons. Expression of many

genes is controlled by availability of CRP-cAMP; lack of expression due to low

CRP-cAMP is referred to as catabolite repression. In vitro transcription of

araBAD mimics that in vivo in that it requires both the AraC and CRP

regulatory proteins with their bound ligands. Analysis of transcription in vitro,

and further in vivo studies, have given a broad understanding at the molecular

level of control of araBAD transcription although details remain to be

determined.

The AraC protein structure has two domains connected by a flexible

polypeptide linker. The N-terminal domain binds arabinose and is responsible

for formation of the active dimeric form of the protein. The C-terminal domain

binds DNA at specific sites, with similar sequences, upstream of each ara

operon. In the regulatory region between the divergently transcribed araBAD

and araC operons (Fig), there are five sites at which AraC can bind (Fig).



In the absence of arabinose, the two DNA-binding domains of the AraC dimer

are oriented so that they cannot readily bind both of the adjacent I sites at the

same time. Instead, the AraC dimer contacts I1 and O2, thereby forming a DNA

loop within the region between araB and araC. On addition of arabinose, the

dimer undergoes a conformational shift such that the two DNA-binding domains

preferentially bind the adjacent I2 and I1 sites. The presence of AraC at I2 stimulates addition of RNA polymerase and open complex formation, and transcription of araBAD

commences, if CRP-cAMP is present. Although this model was proposed and

refined before detailed structural information was available, X-ray

crystallographic studies of the AraC N-terminal domain and linker support the

model.

Addition of arabinose also affects expression of araC. When the I1-O2

loop is opened, transcription of araC is accelerated. After a few minutes, AraC-

arabinose dimers are thought to reform DNA loops, this time by bridging O1

and O2. O1-O2 looping does not regulate araBAD transcription, but interferes

with RNA polymerase binding and initiation at the araC promoter; this reduces

the rate of araC transcription to that characteristic of cells in the absence of

arabinose. araC is controlled by CRP-cAMP as well. CRP-cAMP binding

increases, but is not essential for, araC transcription; it is necessary for

substantial expression of the other ara operons.

The mechanism by which AraC-arabinose, CRP-cAMP, and RNA polymerase

interact to trigger transcription is not clear. Bound CRP-cAMP helps to open

the I1-O2 repression loop on addition of arabinose, but it probably activates RNA

polymerase binding or initiation as well, either by direct contact or indirectly

through contact with AraC. (CRP-cAMP does not aid AraC binding.)
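The control of araBAD described above can be summarized as a truth table. The sketch below is a deliberately simplified Boolean model: real regulation is graded rather than on/off, and the function names are illustrative, not from the source.

```python
# Simplified Boolean model of araBAD control (an illustrative assumption;
# actual expression levels are continuous, not on/off).
def arac_binding(arabinose_present):
    """Return which DNA sites the AraC dimer occupies, as described in the text."""
    if arabinose_present:
        return "I1 and I2 (adjacent sites, activating)"
    return "I1 and O2 (DNA loop, repressing)"


def arabad_transcribed(arabinose_present, crp_camp_present):
    """araBAD is transcribed only with both AraC-arabinose and CRP-cAMP."""
    return arabinose_present and crp_camp_present


for arabinose in (False, True):
    for crp_camp in (False, True):
        print(
            f"arabinose={arabinose!s:<5} CRP-cAMP={crp_camp!s:<5} "
            f"AraC at {arac_binding(arabinose):<40} "
            f"araBAD on: {arabad_transcribed(arabinose, crp_camp)}"
        )
```

The table makes the two-signal requirement explicit: arabinose alone is not sufficient when CRP-cAMP is low, which is the situation described as catabolite repression.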

REGULATION BY ATTENUATION – HIS OPERON

Transcription of the his operon is about four-fold more efficient in bacteria

growing in minimal glucose medium than when growing in rich medium. This

form of control, called metabolic regulation, adjusts the expression of the

operon to the amino acid supply in the cell. It is mediated by the “alarmone”

guanosine 5′-diphosphate 3′-diphosphate (ppGpp), which is the effector of the



stringent response. The alarmone regulates the his operon positively by

stimulating the primary promoter hisp1 under conditions of moderate amino

acid starvation.

Histidine rich condition:

In addition to this general metabolic control, his operon transcription is

specifically regulated by attenuation of transcription, a mechanism in which a

regulatory element, located upstream of the first structural gene of the cluster,

modulates the level of expression of the histidine biosynthetic enzymes in

response to the intracellular levels of charged histidyl-transfer RNA, His-tRNAHis.

The his-specific regulatory element is transcribed in a 180-nucleotide RNA

leader, which exhibits two prominent features: (i) a 16-residue coding sequence

including seven consecutive codons specifying histidine, and (ii) overlapping

regions of dyad symmetry capable of folding into mutually exclusive, alternative

secondary structures that signal either transcription termination or

antitermination. Six RNA segments are involved in base pairing (A to F in Fig.),

and the stem-loop structure formed by the E and F RNA regions, plus the

adjacent run of uridylate residues, constitutes the attenuator, a strong Rho-

independent transcription terminator.

Translational control of his operon transcription is determined by ribosome

occupancy of the leader RNA, which in turn depends, given the peculiar

composition of the his leader peptide, on the availability of His-tRNAHis. High

levels of His-tRNAHis allow rapid movement of ribosomes up to the B segment;

in this case, formation of the C:D and E:F stem-loop structures will result in

premature transcription termination (Fig. Attenuation).



Histidine low condition:

In the presence of low levels of charged tRNAHis, ribosomes stall at the

consecutive histidine codons of the leader peptide and prevent the A:B pairing

by masking the A segment. Base pairing between the B and C and between the

D and E RNA regions prevents formation of the attenuator and determines the

antitermination conformation. In the case of severe limitation of the

intracellular pool of all charged tRNAs, translation of the leader peptide fails to

initiate: under these conditions, the A:B, C:D and E:F stem-loop structures

form sequentially, producing strong transcription termination.

RNA polymerase pauses after synthesis of the first RNA hairpin (A:B). This pausing

is believed to synchronize transcription and translation of the leader region by

halting the elongating RNA polymerase until a ribosome starts translation of

the leader peptide. The pause hairpin is the only portion of the structure

thought to form when RNA polymerase resides at the pause site.
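The three outcomes of his leader folding described above can be collected in one small function. This is a sketch under a strong simplification: the supply of charged tRNA is treated as two yes/no inputs, whereas in the cell the response is graded. The function and parameter names are illustrative.

```python
# Simplified decision model for the his leader RNA (illustrative assumption:
# inputs are Boolean, although the real response is graded).
def his_leader_outcome(leader_translation_initiates, his_trna_charged_high):
    """Return (stem-loops formed, transcriptional outcome) for the his leader."""
    if not leader_translation_initiates:
        # Severe shortage of all charged tRNAs: no ribosome on the leader.
        return ("A:B, C:D, E:F", "termination")
    if his_trna_charged_high:
        # Ribosome moves quickly up to the B segment.
        return ("C:D, E:F", "termination")
    # Ribosome stalls at the histidine codons and masks segment A.
    return ("B:C, D:E", "antitermination")


for initiates, charged in ((True, True), (True, False), (False, False)):
    pairs, outcome = his_leader_outcome(initiates, charged)
    print(f"translation={initiates!s:<5} His-tRNAHis high={charged!s:<5} "
          f"pairs: {pairs:<14} -> {outcome}")
```

In every case that terminates, the E:F hairpin (the attenuator) is present; only the stalled ribosome, by forcing B:C and D:E pairing, prevents it from forming.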

Because the absolute amount of charged tRNAHis controls the level of his

attenuation, mutants exhibiting high his operon expression contain defects in

tRNAHis biosynthesis, aminoacylation with histidine, or tRNAHis modification

and processing. The hisR gene encodes the single cellular tRNAHis; and

mutations in the hisR promoter reduce the total cellular content of tRNAHis

molecules by about 50% and thereby cause increased readthrough

transcription of the his attenuator. The hisS gene encodes histidyl-tRNA

synthetase, which aminoacylates tRNAHis molecules with histidine.

Mutations that lower the activity of the histidyl-tRNA synthetase or decrease



the enzyme's affinity for histidine, tRNAHis, or ATP, affect the level of his

attenuation by reducing the percentage of tRNAHis molecules charged with

histidine. The hisT gene encodes pseudouridine synthase I, which catalyzes the

formation of pseudouridine residues in the anticodon region of several tRNA

species, including tRNAHis. Although the undermodified tRNAHis molecules

are charged with histidine to the same extent as in wild-type strains,

transcription termination at the his attenuator is greatly decreased, because the

slow rate of translation of the consecutive histidine codons causes stalling of

ribosomes.

The overall contribution of the internal promoter hisp2 to the expression of the

distal genes of the operon is negligible when transcription proceeds from hisp1,

because hisp2 is inhibited by transcription readthrough, a phenomenon known

as promoter occlusion. hisp2 is also subjected to metabolic regulation, although

to a lesser extent than hisp1.

REGULATION BY ATTENUATION – TRP OPERON

The E. coli tryptophan (trp) operon includes five genes for the enzymes required

to convert chorismate to tryptophan.

Two of the enzymes catalyze more than one step in the pathway. The mRNA

from the trp operon has a half-life of only about 3 min, allowing the cell to

respond rapidly to changing needs for this amino acid. The Trp repressor is a

homodimer, each subunit containing 107 amino acid residues. When

tryptophan is abundant it binds to the Trp repressor, causing a conformational

change that permits the repressor to bind to the trp operator and inhibit

expression of the trp operon. The trp operator site overlaps the promoter, so

binding of the repressor blocks binding of RNA polymerase.

Once again, this simple on/off circuit mediated by a repressor is not the entire

regulatory story. Different cellular concentrations of tryptophan can vary the

rate of synthesis of the biosynthetic enzymes over a 700-fold range. Once

repression is lifted and transcription begins, the rate of transcription is fine-

tuned by a second regulatory process, called transcription attenuation, in



which transcription is initiated normally but is abruptly halted before the

operon genes are transcribed. The frequency with which transcription is

attenuated is regulated by the availability of tryptophan and relies on the very

close coupling of transcription and translation in bacteria.

The trp operon attenuation mechanism uses signals encoded in four sequences

within a 162-nucleotide leader region at the 5′ end of the mRNA, preceding the

initiation codon of the first gene.



Within the leader lies a region known as the attenuator, made up of sequences

3 and 4. These sequences base-pair to form a G≡C-rich stem-and-loop

structure closely followed by a series of U residues. The attenuator structure

acts as a transcription terminator.



Sequence 2 is an alternative complement for sequence 3. If sequences 2 and 3

base-pair, the attenuator structure cannot form and transcription continues

into the trp biosynthetic genes; the loop formed by the pairing of sequences 2

and 3 does not obstruct transcription.

Regulatory sequence 1 is crucial for a tryptophan-sensitive mechanism that

determines whether sequence 3 pairs with sequence 2 (allowing transcription

to continue) or with sequence 4 (attenuating transcription). Formation of the

attenuator stem-and-loop structure depends on events that occur during

translation of regulatory sequence 1, which encodes a leader peptide (so called

because it is encoded by the leader region of the mRNA) of 14 amino acids, two

of which are Trp residues.

The leader peptide has no other known cellular function; its synthesis is simply

an operon regulatory device.

This peptide is translated immediately after it is transcribed, by a ribosome

that follows closely behind RNA polymerase as transcription proceeds. When

tryptophan concentrations are high, concentrations of charged tryptophan

tRNA (Trp-tRNATrp) are also high. This allows translation to proceed rapidly

past the two Trp codons of sequence 1 and into sequence 2, before sequence 3

is synthesized by RNA polymerase. In this situation, sequence 2 is covered by



the ribosome and unavailable for pairing to sequence 3 when sequence 3 is

synthesized; the attenuator structure (sequences 3 and 4) forms and

transcription halts. When tryptophan concentrations are low, however, the

ribosome stalls at the two Trp codons in sequence 1, because charged tRNATrp

is less available. Sequence 2 remains free while sequence 3 is synthesized,

allowing these two sequences to base-pair and permitting transcription to

proceed. In this way, the proportion of transcripts that are attenuated declines

as tryptophan concentration declines.
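The two layers of trp operon control described above, repression followed by attenuation, can be sketched as a two-step check. This is an illustrative model with Boolean inputs; in the cell attenuation changes the proportion of transcripts that terminate rather than acting as a switch, and the function and parameter names are not from the source.

```python
# Two-layer sketch of trp operon control (illustrative assumption: Boolean inputs).
def trp_operon_state(repressor_bound_to_operator, charged_trp_trna_high):
    """Return a short description of what happens to a trp operon transcript."""
    if repressor_bound_to_operator:
        # Layer 1: the Trp repressor on the operator blocks RNA polymerase.
        return "repressed: no transcription initiated"
    if charged_trp_trna_high:
        # Layer 2: the ribosome covers sequence 2, so sequences 3 and 4 pair.
        return "attenuated: 3:4 terminator forms in the leader"
    # Ribosome stalls at the Trp codons in sequence 1, so sequences 2 and 3 pair.
    return "transcribed: 2:3 pairing lets RNA polymerase continue"


for repressor, charged in ((True, True), (False, True), (False, False)):
    print(f"repressor bound={repressor!s:<5} Trp-tRNATrp high={charged!s:<5} "
          f"-> {trp_operon_state(repressor, charged)}")
```

The ordering of the checks mirrors the text: attenuation only matters for transcripts that were initiated, that is, once repression has been lifted.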

ANTI – TERMINATION - N PROTEIN AND NUT SITES

After N protein recognizes and binds to the B box in the nut site, it interacts

with the NusA-polymerase complex. Subsequent rapid binding of NusB, NusG,

and S10 produces an antitermination complex stabilized by multiple protein-

protein contacts. As the polymerase moves along the template DNA away from

the nut site, the antitermination complex remains associated with the enzyme

and the nut RNA sequence, so an RNA loop of increasing size forms. The complex

prevents termination, and transcription proceeds. Inhibition of the terminating

action of hexameric Rho factor is diagrammed; antitermination also occurs at

Rho-independent sites.

DNA BINDING SITES



Regulatory DNA binding proteins can occur in active and inactive forms. The

transition between the two forms is primarily controlled by the mechanisms

indicated. Activation or inactivation of transcription factors is determined by

signals that become effective either in the cytoplasm or in the nucleus.

Signal-directed translocation of transcription factors into the nucleus is a major

mechanism for transcriptional regulation. The amount of available

transcription factor can also be regulated via its degradation rate or rate of

expression. Furthermore, the interaction between DNA-bound activators and

the transcription complex can be regulated by various signals.

DNA BINDING PROTEIN

A recurring motif on the pathway of information transfer from gene to protein is

the binding of proteins to DNA or RNA. At the DNA level, specific DNA-binding

proteins aid in the identification of genes for regulation via transcriptional

activation or inhibition. At the RNA level, specific RNAs are recognized in a



sequence-specific manner to attain a controlled transfer of genetic information

further on to the mature protein. The basis of all specific regulation processes

at the nucleic acid level is the recognition of nucleotide sequences by binding

proteins. For the regulation of gene activity the specific binding of proteins to

double-stranded DNA is of central importance. A specific DNA-binding protein

usually recognizes a certain DNA sequence, termed the recognition sequence or

DNA-binding element. Because of the enormous complexity of the genome, the

specificity of this recognition plays a significant role. The binding protein must

be capable of specifically picking out the recognition sequence in a background

of a multitude of other sequences and binding to it. The binding protein must

be able to discriminate against related sequences which differ from the actual

recognition element at only one or more positions.

ENHANCER SEQUENCES

Insulators:

Gary Felsenfeld has defined an insulator as a “neutral barrier to the influence

of neighboring elements.” Thus, insulators can protect a gene from both

activation and repression by nearby enhancers and silencers. Insulators

define boundaries between DNA domains. Thus, an insulator placed between

an enhancer and the promoter it usually activates abolishes that activation.

Similarly, an insulator placed between a silencer and a gene it usually

represses abolishes that repression. It appears that the insulator creates a

boundary between the domain of the gene and that of the enhancer (or silencer)

so the gene can no longer feel the activating (or repressing) effects.

Insulating against enhancer activity:

The insulator between a promoter and an enhancer prevents the promoter from

feeling the activating effect of the enhancer.

Insulating against silencer activity:



The insulator between a promoter and condensed, repressive chromatin

(induced by a silencer) prevents the promoter from feeling the repressive effect

of the condensed chromatin (indeed, prevents the condensed chromatin from

engulfing the promoter).
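The boundary behaviour of insulators described above can be illustrated with a toy positional model. Elements are placed at coordinates along a line of DNA, and an enhancer or silencer affects the promoter only when no insulator lies between them. Everything here is an assumption made for illustration: the coordinates, the function name, and in particular the rule that an unblocked silencer overrides an unblocked enhancer.

```python
# Toy positional model of insulator action (all names and rules are illustrative).
def promoter_activity(promoter, enhancers=(), silencers=(), insulators=()):
    """Return 'activated', 'repressed' or 'basal' for a promoter at a coordinate."""

    def blocked(element):
        # An element is blocked if any insulator lies strictly between it
        # and the promoter.
        lo, hi = sorted((element, promoter))
        return any(lo < ins < hi for ins in insulators)

    activated = any(not blocked(e) for e in enhancers)
    repressed = any(not blocked(s) for s in silencers)
    if repressed:  # assumption: repression dominates activation
        return "repressed"
    return "activated" if activated else "basal"


print(promoter_activity(100, enhancers=[0]))                   # no insulator
print(promoter_activity(100, enhancers=[0], insulators=[50]))  # insulator between
print(promoter_activity(100, silencers=[200], insulators=[150]))
```

An insulator placed outside the interval between the element and the promoter has no effect in this model, which matches the idea that insulators define boundaries rather than acting at a distance.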

LOCUS CONTROL REGIONS:

Every gene is controlled by its promoter, and some genes also respond to

enhancers (containing similar control elements but located farther away).

However, these local controls are not sufficient for all genes. In some cases, a

gene lies within a domain of several genes all of which are influenced by

regulatory elements that act on the whole domain. The existence of these

elements was identified by the inability of a region of DNA including a gene and

all its known regulatory elements to be properly expressed when introduced

into an animal as a transgene. The best characterized example of a regulated

gene cluster is provided by the mouse β-globin genes. The α-globin and β-globin

genes in mammals each exist as clusters of related genes, expressed at

different times during embryonic and adult development. These genes are

provided with a large number of regulatory elements, which have been analyzed

in detail. In the case of the adult human β-globin gene, regulatory sequences

are located both 5' and 3' to the gene and include both positive and negative

elements in the promoter region, and additional positive elements within and

downstream of the gene.



But a human β-globin gene containing all of these control regions is never

expressed in a transgenic mouse within an order of magnitude of wild-type

levels. Some further regulatory sequence is required.

Regions that provide the additional regulatory function are identified by

DNAase I hypersensitive sites that are found at the ends of the cluster.

The map of Figure shows that the 20 kb upstream of the ε-gene contains a

group of 5 sites; and there is a single site 30 kb downstream of the β-gene.

Transfecting various constructs into mouse erythroleukemia cells shows that

sequences between the individual hypersensitive sites in the 5' region can be

removed without much effect, but that removal of any of the sites reduces the

overall level of expression.

The 5' regulatory sites are the primary regulators, and the cluster of

hypersensitive sites is called the LCR (locus control region). It is not known

whether the 3' site has any function. The LCR is absolutely required for

expression of each of the globin genes in the cluster. Each gene is then further

regulated by its own specific controls. Some of these controls are autonomous:

expression of the ε- and γ-genes appears intrinsic to those loci in conjunction

with the LCR. Other controls appear to rely upon position in the cluster, which

provides a suggestion that gene order in a cluster is important for regulation.

The entire region containing the globin genes, and extending well beyond them,

constitutes a chromosomal domain. It shows increased sensitivity to digestion

by DNAase I. Deletion of the 5' LCR restores normal resistance to DNAase over

the whole region.

Two models for how an LCR works propose that its action is required in order

to activate the promoter, or alternatively, to increase the rate of transcription

from the promoter. The exact nature of the interactions between the LCR and

the individual promoters has not yet been fully defined.

The α-globin locus has a similar organization of genes that are expressed at

different times, with a group of hypersensitive sites at one end of the cluster,

and increased sensitivity to DNAase I throughout the region. Only a small

number of other cases are known in which an LCR controls a group of genes.



CHROMATIN MODIFICATION AND GENE EXPRESSION

Insights into the molecular mechanisms for controlling the structure of chromatin start with

mutants that affect position effect variegation. Some 30 genes have been

identified in Drosophila. They are named systematically as Su(var) for genes

whose products act to suppress variegation and E(var) for genes whose

products enhance variegation. The genes were named for the behavior of the

mutant loci. Mutations that suppress variegation lie in genes whose products

are needed for the formation of heterochromatin. They include enzymes that

act on chromatin, such as histone deacetylases, and proteins that are localized

to heterochromatin. Mutations that enhance variegation lie in genes whose

products are needed to activate gene expression. They include members of the

SWI/SNF complex.

From these properties we see that modification of chromatin structure is important for

controlling the formation of heterochromatin. The universality of these

mechanisms is indicated by the fact that many of these loci have homologues

in yeast that display analogous properties. Some of the homologues in S.

pombe are clr (cryptic loci regulator) genes, in which mutations affect silencing.

Many of the Su(var) and E(var) proteins have a common protein motif of 60

amino acids called the chromo domain. The fact that this domain is found in

proteins of both groups suggests that it represents a motif that participates in

protein-protein interactions with targets in chromatin.

Among the Su(var) proteins is HP1 (heterochromatin protein 1). This was

originally identified as a protein that is localized to heterochromatin by staining

polytene chromosomes with an antibody directed against the protein. It was

later shown to be the product of the gene Su(var)2-5. Its homologue in the yeast



S. pombe is coded by swi6. HP1 contains a chromo domain near the N-

terminus, and another domain that is related to it, called the chromo-shadow

domain, at the C-terminus.

The importance of the chromo domain is indicated by the fact that it is the

location of many of the mutations in HP1. The chromo domain(s) are

responsible for targeting the protein to heterochromatin. They play a similar

role in other proteins, although the individual chromo domains in particular

proteins may have different detailed specificities for targeting, and can direct

proteins to either heterochromatin or euchromatin. The original protein

identified as HP1 is now called HP1α, since two related proteins, HP1β and

HP1γ, have since been found.

Su(var)3-9 has a chromo domain and also a SET domain, a motif that is found

in several Su(var) proteins. Its mammalian homologues localize to centromeric

heterochromatin. It is the histone methyltransferase that acts on Lys9 of

histone H3.

The SET domain is part of the active site, and in fact is a marker for the

methylase activity.

The bromo domain is found in a variety of proteins that interact with

chromatin, including histone acetylases. The crystal structure shows that it

has a binding site for acetylated lysine. The bromo domain itself recognizes

only a very short sequence of 4 amino acids including the acetylated lysine, so

specificity for target recognition must depend on interactions involving other

regions. Besides the acetylases, the bromo domain is found in a range of

proteins that interact with chromatin, including components of the

transcription apparatus. This implies that it is used to recognize acetylated

histones, which means that it is likely to be found in proteins that are involved

with gene activation. Although there is a general correlation in which active

chromatin is acetylated while inactive chromatin is methylated on histones,

there are some exceptions to the rule. The best characterized is that acetylation

of Lys12 of H4 is associated with heterochromatin.



IDENTIFICATION OF PROTEIN BINDING SITE ON DNA

In the following, the basic features of specific recognition of DNA sequences by

DNA-binding proteins will be presented.

If the two proteins that are being tested can interact with one another, the two

hybrid proteins will interact. This is reflected in the name of the technique: the

two hybrid assay. The protein with the DNA-binding domain binds to a reporter

gene that has a simple promoter containing its target site. But it cannot

activate the gene by itself. Activation occurs only if the second hybrid binds to

the first hybrid to bring the activation domain to the promoter. Any reporter

gene can be used where the product is readily assayed, and this technique has

given rise to several automated procedures for rapidly testing protein-protein

interactions.

The effectiveness of the technique dramatically illustrates the modular nature

of proteins. Even when fused to another protein, the DNA-binding domain can

bind to DNA and the transcription-activating domain can activate

transcription. Correspondingly, the interaction ability of the two proteins being

tested is not inhibited by the attachment of the DNA-binding or transcription-

activating domains.
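The logic of a two hybrid screen can be sketched in a few lines. The model below assumes that every bait is already fused to the DNA-binding domain and every prey to the activation domain, so the reporter is switched on exactly when bait and prey interact. The protein names X, Y and Z and the set of interacting pairs are hypothetical placeholders.

```python
# Minimal logic sketch of a two hybrid screen (protein names are hypothetical).
def reporter_active(bait, prey, interacting_pairs):
    """The reporter is expressed only if the bait and prey proteins interact,
    which brings the activation domain to the promoter."""
    return frozenset((bait, prey)) in interacting_pairs


def screen(baits, preys, interacting_pairs):
    """Return every (bait, prey) combination that switches the reporter on."""
    return [
        (bait, prey)
        for bait in baits
        for prey in preys
        if reporter_active(bait, prey, interacting_pairs)
    ]


# Hypothetical ground truth: X binds Y, and nothing else interacts.
KNOWN_INTERACTIONS = {frozenset(("X", "Y"))}

print(screen(["X"], ["Y", "Z"], KNOWN_INTERACTIONS))
```

Because each combination is scored independently by the same reporter, the test is easy to repeat over large sets of baits and preys, which is why the technique lends itself to automation.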



MATURATION AND PROCESSING OF RNA

In most organisms non-coding genes (ncRNA) are transcribed as precursors

which undergo further processing. In the case of ribosomal RNAs (rRNA), they

are often transcribed as a pre-rRNA which contains one or more rRNAs. The

pre-rRNA is cleaved and modified (2′-O-methylation

and pseudouridine formation) at specific sites by approximately 150 different

small nucleolus-restricted RNA species, called snoRNAs. SnoRNAs associate

with proteins, forming snoRNPs. While the snoRNA part base-pairs with the target RNA and thus positions the modification at a precise site, the protein part performs the catalytic reaction. In eukaryotes, in particular, a snoRNP called RNase MRP cleaves the 45S pre-rRNA into the 28S, 5.8S, and 18S rRNAs. The rRNA and RNA processing factors form large aggregates called the nucleolus.

In the case of transfer RNA (tRNA), for example, the 5' sequence is removed

by RNase P whereas the 3' end is removed by the tRNase Z enzyme and the



non-templated 3' CCA tail is added by a nucleotidyl transferase. In the case

of micro RNA (miRNA), miRNAs are first transcribed as primary transcripts or

pri-miRNA with a cap and poly-A tail and processed to short, 70-nucleotide

stem-loop structures known as pre-miRNA in the cell nucleus by the

enzymes Drosha and Pasha. After being exported, it is then processed to

mature miRNAs in the cytoplasm by interaction with the endonuclease Dicer,

which also initiates the formation of the RNA-induced silencing complex (RISC),

composed of the Argonaute protein.

METHYLATION, CUTTING AND MODIFICATION OF tRNA DEGRADATION SYSTEM.

tRNAs are commonly synthesized as precursor chains with additional material

at one or both ends. The extra sequences are removed by combinations of

endonucleolytic and exonucleolytic activities.

One feature that is common to most tRNAs is that the three nucleotides at the

3' terminus, always the triplet sequence CCA, are not coded in the genome, but

are added as part of tRNA processing. The 5' end of tRNA is generated by a

cleavage action catalyzed by the enzyme ribonuclease P. The enzymes that

process the 3' end are best characterized in E. coli, where an endonuclease

triggers the reaction by cleaving the precursor downstream, and several

exonucleases then trim the end by degradation in the 3'-5' direction. The

reaction also involves several enzymes in eukaryotes. It generates a tRNA that

needs the CCA trinucleotide sequence to be added to the 3' end.

The addition of CCA is the result solely of an enzymatic process, that is, the

enzymatic activity carries the specificity for the sequence of the trinucleotide,

which is not determined by a template. There are several models for the

process, which may be different in different organisms. In some organisms, the

process is catalyzed by a single enzyme.

One model for its action proposes that a single enzyme binds to the 3' end, and

sequentially adds C, C, and A, the specificity at each stage being determined by

the structure of the 3' end. Other models propose that the enzyme has different

active sites for CTP and ATP.



In other organisms, different enzymes are responsible for adding the C and A

residues, and they function sequentially. When a tRNA is not properly

processed, it attracts the attention of a quality control system that degrades it.

This ensures that the protein synthesis apparatus does not become blocked by

nonfunctional tRNAs.
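The order of events in tRNA end processing can be sketched as string operations on an RNA sequence. This is a toy model: the precursor sequence, the leader and trailer lengths, and the function names are made up for illustration, and the quality-control check is reduced to testing for the CCA end, which is far cruder than the cellular surveillance system.

```python
# Toy model of tRNA end processing (sequence and lengths are hypothetical).
def process_pre_trna(precursor, leader_len, trailer_len):
    """Remove the 5' leader and 3' trailer, then add the non-templated CCA."""
    # 5' leader removal (the role of RNase P) and 3' trimming (endonuclease
    # cleavage followed by exonuclease trimming).
    body = precursor[leader_len:len(precursor) - trailer_len]
    # Sequential addition of C, C and A to the 3' end, with no template.
    for nucleotide in "CCA":
        body += nucleotide
    return body


def passes_quality_control(trna):
    """Crude stand-in for quality control: require the 3' CCA terminus."""
    return trna.endswith("CCA")


precursor = "AA" + "GGGUUU" + "CC"   # leader + body + trailer (hypothetical)
mature = process_pre_trna(precursor, leader_len=2, trailer_len=2)
print(mature, passes_quality_control(mature))
```

Adding the three nucleotides one at a time follows the single-enzyme model in which C, C and A are added sequentially; the other models differ in the enzymes and active sites involved, not in the final CCA end.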



UNIT – 6 CONTROL OF GENE EXPRESSION AT TRANSCRIPTION AND TRANSLATION LEVEL

Regulation of phages, viruses, prokaryotic and eukaryotic gene expression, role of chromatin in regulating gene expression. Gene silencing: Transcriptional and post transcriptional gene silencing – RNAi pathway (siRNA and miRNA).

REGULATION OF PHAGES, VIRUSES, PROKARYOTIC AND EUKARYOTIC GENE EXPRESSION

Lambda phage is the paradigm of a temperate phage. Not only is the process of

establishment of lysogeny better understood for this bacteriophage than for any

other, but it is also thought to be representative of the way most other

bacteriophages accomplish this feat. Bacteriophage λ DNA relies entirely on the

bacterial host's RNA and protein biosynthesis machinery for its expression. It

contains all the appropriate signals to enable Escherichia coli RNA



polymerase to use its DNA for the synthesis of messenger RNA, and E. coli

ribosomes to use these mRNA to program the synthesis of bacteriophage

proteins. Soon after the λ genomic DNA has been injected into the bacterial cell,

an irreversible decision is made, whether the infection will proceed along the

lytic or the lysogenic pathways, which are mutually exclusive.

The control region of bacteriophage l. This map is only intended to show the

relative order of important regions and is not to scale. The DNA regions labeled

cI, cII, cIII, N, and cro (gray “boxes”) are the genes encoding proteins cI, cII, cIII,

N, and cro; OL and O R (white boxes) are binding sites for regulatory proteins cI

and cro; PL, PR, PRM, and PRE are promoters from which transcription of RNA

is initiated in the direction indicated by the horizontal arrows, and tL and t R

are terminators where RNA synthesis stops and the transcription complex falls

apart. Upward pointing arrows symbolize the synthesis of cI mRNA and

protein. The downward arrows point to the binding sites for cI in a lysogen, and

they also show the effect of the bound protein on transcription at the nearest

promoter (shown as + for activation, – for inhibition). To establish lysogeny, cII

binds near the –35 region of PRE to activate transcription from this promoter.

Other important interactions and their effects: cro protein binds at OR to

inhibit transcription from PRM; N binds to the transcription complex that

initiated at PL to prevent termination at t L and to the one that initiated at PR

to prevent termination at tR. Note that names of genes are in “italic” letters and

those of proteins in “roman” letters.
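
The interactions listed in the caption above can be collected into a small lookup table. This is only an illustrative sketch: the names INTERACTIONS and effects_of are hypothetical, and the three cI entries follow the standard λ model (cI represses PL and PR and activates PRM), which the caption indicates with + and – signs but does not spell out.

```python
# Regulatory interactions from the control-region caption, one tuple per
# interaction: (regulator protein, DNA element acted on, effect).
INTERACTIONS = [
    ("cII", "PRE", "activates transcription"),
    ("cro", "PRM", "inhibits transcription"),
    ("N", "tL", "prevents termination"),
    ("N", "tR", "prevents termination"),
    # Assumption (standard lambda model, not spelled out in the caption):
    ("cI", "PL", "inhibits transcription"),
    ("cI", "PR", "inhibits transcription"),
    ("cI", "PRM", "activates transcription"),
]


def effects_of(regulator):
    """Return (element, effect) pairs for one regulator, in table order."""
    return [(element, effect) for reg, element, effect in INTERACTIONS
            if reg == regulator]


print(effects_of("N"))
```

Looking up "N" returns its two antitermination targets, tL and tR. Looking up a protein that is not in the table returns an empty list.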

ROLE OF CHROMATIN IN REGULATING GENE EXPRESSION



Histone acetyltransferase (HAT) is an enzyme that transfers acetyl groups from

a donor (acetyl-CoA) to core histones.

Acetylation is reversible. Each direction of the reaction is catalyzed by a specific

type of enzyme. Enzymes that can acetylate histones are called histone

acetyltransferases or HATs; the acetyl groups are removed by histone

deacetylases or HDACs.

There are two groups of HAT enzymes: group A describes those that are

involved with transcription; group B describes those involved with nucleosome

assembly. Two inhibitors have been useful in analyzing acetylation.

Role of inhibitors in gene expression: Trichostatin and butyric acid inhibit histone deacetylases, and cause

acetylated nucleosomes to accumulate. The use of these inhibitors has

supported the general view that acetylation is associated with gene expression;

in fact, the ability of butyric acid to cause changes in chromatin resembling

those found upon gene activation was one of the first indications of the

connection between acetylation and gene activity.
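
Why HDAC inhibitors make acetylated nucleosomes accumulate can be illustrated with a toy calculation. The two-state model below is an assumption made for illustration only, not something taken from the text. It treats each site as either acetylated or not, with a HAT rate k_hat and an HDAC rate k_hdac (both hypothetical parameters), so that the steady-state acetylated fraction is k_hat / (k_hat + k_hdac).

```python
def steady_state_acetylation(k_hat, k_hdac):
    """Acetylated fraction at steady state in an assumed two-state model.

    Assumed rate equation: d(ac)/dt = k_hat * (1 - ac) - k_hdac * ac,
    which is zero when ac = k_hat / (k_hat + k_hdac).
    """
    if k_hat < 0 or k_hdac < 0 or k_hat + k_hdac == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_hat / (k_hat + k_hdac)


# Arbitrary illustrative rates; an inhibitor is modelled as a lower k_hdac.
untreated = steady_state_acetylation(k_hat=1.0, k_hdac=1.0)
inhibited = steady_state_acetylation(k_hat=1.0, k_hdac=0.1)
print(untreated, inhibited)
```

In this sketch, lowering the deacetylation rate while the HAT rate stays fixed moves the acetylated fraction toward 1, which is the qualitative behaviour described for trichostatin and butyric acid.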

The breakthrough in analyzing the role of histone acetylation was provided by

the characterization of the acetylating and deacetylating enzymes, and their

association with other proteins that are involved in specific events of activation

and repression.

Role of HATs in gene expression: A basic change in our view of histone acetylation was caused by the discovery

that HATs are not necessarily dedicated enzymes associated with chromatin:

rather it turns out that known activators of transcription have HAT activity.

The connection was established when the catalytic subunit of a group A HAT

was identified as a homologue of the yeast regulator protein GCN5. Then it was

shown that GCN5 itself has HAT activity (with histones H3 and H4 as

substrates). GCN5 is part of an adaptor complex that is necessary for the

interaction between certain enhancers and their target promoters. Its HAT

activity is required for activation of the target gene.



In this view of coactivator action, RNA polymerase is bound at a hypersensitive

site and coactivators acetylate histones on the nucleosomes in the

vicinity. Many examples are now known of interactions of this type. GCN5

leads us into one of the most important acetylase complexes. In yeast, GCN5 is

part of the 1.8 MDa SAGA complex, which contains several proteins that are

involved in transcription. Among these proteins are several TAFIIs. Also, the

TAFII145 subunit of TFIID is an acetylase.

There are some functional overlaps between TFIID and SAGA, most notably

that yeast can manage with either TAFII145 or GCN5, but is damaged by the

deletion of both. This suggests that an acetylase activity is essential for gene

expression, but can be provided by either TFIID or SAGA.

One of the first general activators to be characterized as an HAT was

p300/CBP. (Actually, p300 and CBP are different proteins, but they are so

closely related that they are often referred to as a single type of activity.)

p300/CBP is a coactivator that links an activator to the basal apparatus.

p300/CBP interacts with various activators, including hormone receptors, AP-1

(c-Jun and c-Fos), and MyoD.

The interaction is inhibited by the viral regulator proteins adenovirus E1A and

SV40 T antigen, which bind to p300/CBP to prevent the interaction with

transcription factors; this explains how these viral proteins inhibit cellular

transcription.



p300/CBP acetylates the N-terminal tails of H4 in nucleosomes. Another

coactivator, called PCAF, preferentially acetylates H3 in nucleosomes.

p300/CBP and PCAF form a complex that functions in transcriptional

activation. In some cases yet another HAT is involved: the coactivator ACTR,

which functions with hormone receptors, is itself an HAT that acts on H3 and

H4, and also recruits both p300/CBP and PCAF to form a coactivating

complex.

One explanation for the presence of multiple HAT activities in a coactivating

complex is that each HAT has a different specificity, and that multiple different

acetylation events are required for activation.

A general feature of acetylation is that an HAT is part of a large complex.

Typically the complex will contain a targeting subunit(s) that determines the

binding sites on DNA. This determines the target for the HAT. The complex also

contains effector subunits that affect chromatin structure or act directly on

transcription. Probably at least some of the effectors require the acetylation

event in order to act. Deacetylation, catalyzed by an HDAC, may work in a

similar way.

Acetylation occurs both at replication (when it is transient) and at transcription

(when it is maintained while the gene is active).

Significance: Acetylation may be necessary to "loosen" the nucleosome core. At replication,

acetylation of histones could be necessary to allow them to be incorporated into

new cores more easily. At transcription, a similar effect could be necessary to

allow a related change in structure, possibly even to allow the histone core to

be displaced from DNA. Alternatively, acetylation could generate binding sites

for other proteins that are required for transcription.

GENE SILENCING: The transcription map in the figure reveals an intriguing feature.

Transcription of either MATa or MATα initiates within the Y region. Only the

MAT locus is expressed; yet the same Y region is present in the corresponding

nontranscribed cassette (HML or HMR). This implies that regulation of



expression is not accomplished by direct recognition of some site overlapping

with the promoter. A site outside the cassettes must distinguish HML and HMR

from MAT.

Deletion analysis shows that sites on either side of both HML and HMR are

needed to repress their expression. They are called silencers. The sites on the

left of each cassette are called the E silencers, and the sites on the right side

are called the I silencers.

Significance: Silencer control sites can function at a distance (up to 2.5 kb away from a

promoter) and in either orientation. They behave like negative enhancers.

TRANSCRIPTIONAL AND POST-TRANSCRIPTIONAL GENE SILENCING

The RNA-induced silencing complex, or RISC, is a multiprotein complex that

incorporates one strand of a small interfering RNA (siRNA) or microRNA

(miRNA). RISC uses the siRNA or miRNA as a template for recognizing

complementary mRNA. When it finds a complementary strand, it activates



RNase and cleaves the RNA. This process is important both in gene regulation

by microRNAs and in defense against viral infections, which often use

double-stranded RNA as an infectious vector.
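
The recognition step described above, in which the strand held by RISC finds a complementary stretch of mRNA, can be sketched as a reverse-complement search. This is a simplified illustration that assumes perfect Watson-Crick pairing over the whole guide. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, U, G, C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(guide, mrna):
    """Return 0-based start positions in mrna that pair perfectly with guide."""
    site = reverse_complement(guide)  # the mRNA stretch the guide can pair with
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Made-up 30-nt mRNA; the guide is built to pair with positions 3 to 23.
mrna = "GGAUGGCUACGUUAGCCGAUAAGCUUGGAC"
guide = reverse_complement(mrna[3:24])
print(find_target_sites(guide, mrna))  # prints [3]
```

Real miRNA targeting usually tolerates mismatches, so an exact search like this one is closer to the siRNA case than to the miRNA case.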

Biochemical basis of RNA interference: Posttranscriptional gene silencing, or RNA interference, occurs when a cell

encounters dsRNA or an added transgene. The added dsRNA, or one derived

from the transgene, is degraded into 21–23-nt fragments by a nuclease. The

nuclease then presumably associates with the dsRNA, perhaps denaturing it

with a helicase activity. The 21–23-nt fragment can then dictate the sites to

attack on the corresponding mRNA.

RNAi PATHWAY (siRNA AND miRNA)

The process begins with dsRNA, which a cellular nuclease cleaves to fragments

21–23 nt long. The nuclease remains bound to the fragments and uses an

ATP-dependent RNA helicase to denature the dsRNAs. The nucleases that are

bound to the antisense 21–23-nt RNAs can hybridize to sites in the mRNA and

dictate cleavage of the mRNA at or near their ends, usually at a uracil residue.
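
The first step above, cutting the dsRNA into 21–23-nt pieces, can be sketched as slicing one strand into consecutive fragments. This is a rough illustration under stated assumptions: the function name dice and its parameters are hypothetical, every fragment is given the same length, and any shorter leftover at the end is simply dropped.

```python
def dice(strand, fragment_length=22):
    """Cut one strand of a dsRNA into consecutive fragments of equal length.

    fragment_length must lie in the 21-23 nt range given in the text.
    A trailing piece shorter than fragment_length is discarded (a
    simplification made for this sketch).
    """
    if not 21 <= fragment_length <= 23:
        raise ValueError("fragment_length must be 21, 22 or 23")
    last_start = len(strand) - fragment_length
    return [strand[i:i + fragment_length]
            for i in range(0, last_start + 1, fragment_length)]


# Made-up 100-nt strand cut into 21-nt fragments.
fragments = dice("AUGC" * 25, fragment_length=21)
print(len(fragments), [len(f) for f in fragments])
```

Each fragment produced this way stands for one short RNA that could then direct cleavage of the corresponding mRNA.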

The process begins with dsRNA, which initiates RNAi. The dsRNA may be

introduced into cells experimentally or by transcription of both strands of a

transgene. The next step is degradation of the dsRNA into short pieces, about

21–23 nt long. The RNase that clips dsRNA may be a member of the RNase III

family discussed earlier in this chapter. RNase III is the only well-studied

nuclease that cuts dsRNA specifically. The RNase that created the short

double-stranded pieces of RNA presumably remains bound to an RNA piece

and uses it as a template to find and degrade the corresponding mRNA. One

way it could do this is by employing an ATP-dependent RNA helicase to unwind

the dsRNA (which would explain the ATP-dependence of the process). Then it

could remain bound to the antisense strand, which could hybridize to mRNA,

bringing the RNase to its target. What is the physiological significance of RNAi?

Double-stranded RNA does not normally occur in eukaryotic cells, but it does

occur during infection by certain RNA viruses that replicate through dsRNA

intermediates. So one important function of RNAi may be to inhibit the



replication of viruses by degrading their mRNAs. But Fire and other

investigators have also found that some of the genes required for RNAi are also

required to prevent certain transposons from transposing within the genome.

Thus, RNAi may have utility even in cells that are not infected by a virus.

Conclusion: Posttranscriptional gene silencing, or RNA interference, occurs when a cell

encounters dsRNA or an added transgene. The added dsRNA, or one derived

from the transgene, is degraded into 21–23-nt fragments by a nuclease. The

nuclease then presumably associates with the dsRNA, perhaps denaturing it

with a helicase activity. The 21–23-nt fragment can then dictate the sites to

attack on the corresponding mRNA.

All the best - THE IMPRINT TEAM
