Molecular definition of a gene
A gene is the entire nucleic acid sequence that is necessary for the controlled production of its final product (RNA or Protein)
In eukaryotes, genes lie amidst a large expanse of noncoding DNA with unknown function and genes may also span regions of DNA unrelated to the gene
A gene that is incapable of producing a final gene product is a pseudogene
Exons (contain the coding region, ORF)
Introns
Basal Promoter
Regulatory region (enhancer/repressor)
PolyA site
Splice site
TSS
Bacterial operons produce polycistronic mRNA, while most eukaryotic mRNAs are monocistronic and contain introns
Translated into 5 separate proteins
Simple and complex transcription units are found in eukaryotic genomes
The final mRNA contains a continuous ORF flanked by a 5’UTR (incl. cap and Kozak sequence) and a 3’UTR (incl. polyA site)
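The layout of a mature mRNA (5’UTR, continuous ORF, 3’UTR) can be illustrated with a minimal sketch. The function name `split_mrna` and the example sequence are hypothetical, and the sketch makes a simplifying assumption: it takes the first AUG that is followed by an in-frame stop codon and ignores the cap and the Kozak context that a real ribosome uses to select the start codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Split an mRNA into (5'UTR, ORF, 3'UTR).

    Simplification: the ORF starts at the first AUG that has an
    in-frame stop codon downstream; the stop codon is kept in the ORF.
    Returns None if no such ORF exists.
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this AUG.
        for i in range(start, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                return mrna[:start], mrna[start:end], mrna[end:]
        # No in-frame stop: try the next AUG.
        start = mrna.find("AUG", start + 1)
    return None

# Hypothetical toy transcript, not a real gene.
five_utr, orf, three_utr = split_mrna("GGCACCAUGGCUUGGUAAUUUAAUAAA")
print(five_utr, orf, three_utr)
```

In this toy transcript the three parts come out as `GGCACC`, `AUGGCUUGGUAA`, and `UUUAAUAAA`.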
Many genes encode several different variations leading to different protein isoforms
Alternative splicing of one pre-mRNA
Alternative termination giving 2 different pre-mRNAs
Alternative TSS giving 2 different pre-mRNAs
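How one pre-mRNA can give rise to several mRNAs through alternative splicing can be sketched as follows. This is only an illustration under stated assumptions: it models a single mechanism (skipping of optional "cassette" exons), and the names `splice_isoforms`, `E1` to `E3`, and the exon sequences are hypothetical.

```python
from itertools import product

def splice_isoforms(exons, optional):
    """Enumerate mRNA isoforms produced by exon skipping.

    exons:    ordered list of (name, sequence) pairs from one pre-mRNA
    optional: set of exon names that may be skipped (cassette exons)
    Returns a dict mapping an isoform label to its spliced sequence.
    """
    # Constitutive exons are always kept; optional exons are kept or skipped.
    choices = [(True, False) if name in optional else (True,)
               for name, _ in exons]
    isoforms = {}
    for keep in product(*choices):
        kept = [(n, s) for (n, s), k in zip(exons, keep) if k]
        label = "-".join(n for n, _ in kept)
        isoforms[label] = "".join(s for _, s in kept)
    return isoforms

# Hypothetical three-exon gene in which E2 is a cassette exon.
exons = [("E1", "AUGGCU"), ("E2", "GGAUCC"), ("E3", "UUUUAA")]
isoforms = splice_isoforms(exons, optional={"E2"})
for label, seq in isoforms.items():
    print(label, seq)
```

With one optional exon the sketch yields two isoforms (`E1-E2-E3` and `E1-E3`); each additional optional exon doubles the number of possible isoforms.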
Protein-coding genes can be solitary or duplicated
Solitary genes (~50%): occur only once in the genome
Duplicated genes:
* Gene families: very similar (not identical) genes (example: genes whose proteins contain very similar globular modules)
* Multicopy: identical (or nearly identical) copies of genes encoding products needed in high quantity (example: histones)
Genomes of higher eukaryotes contain a lot of “nonfunctional” DNA
[Figure: composition of the genome. Labels: protein-coding DNA; satellites and single repeats; mobile (transposable) DNA elements]
Repetitious DNA - Satellite DNA (~6-7% of the human genome)
Microsatellites: tandem arrays of up to 150 repeats of a 1-13 bp unit
Presumably generated by backward slippage of the daughter strand during DNA replication
The number of repeats within a microsatellite is very variable between individuals
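To make the definition concrete, a microsatellite can be located in a sequence as a short unit (1-13 bp) repeated in tandem. The sketch below uses a regular expression with a backreference; the function name, the `min_repeats` threshold, and the example sequence are assumptions for illustration, not part of any standard tool.

```python
import re

def find_microsatellites(dna, min_unit=1, max_unit=13, min_repeats=3):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    The unit is matched lazily, so the shortest repeating unit is
    reported (e.g. "AC" rather than "ACAC").
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(dna.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical sequence containing an (AC)5 microsatellite.
print(find_microsatellites("TTGACACACACACAGG"))
```

Two individuals would typically differ in the repeat count reported for the same locus, which is the variability that STR analysis exploits.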
Microsatellites for PCR-based STR analysis
DNA is isolated and purified
PCR is performed on the DNA with primers flanking known microsatellite areas.
The amplified DNA is size-separated with gel electrophoresis
The resulting pattern of size-fractionated DNA-bands is compared to other patterns to determine similarity.
Usually each area is chosen on a separate chromosome.
13 primer pairs (microsatellite areas) are enough to ensure that the probability that 2 random persons have the same STR pattern is only about 1 in 3 trillion
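The reasoning behind such a small number is the product rule: if the loci are inherited independently (which is why each is usually chosen on a separate chromosome), the chance of a random match at all loci is the product of the per-locus genotype frequencies. The sketch below is illustrative only; the function names are hypothetical and the frequency of 0.1 per locus is an assumed round number, not real population data.

```python
from math import prod

def random_match_probability(genotype_frequencies):
    """Probability that a random person matches at every locus,
    assuming the loci are statistically independent."""
    return prod(genotype_frequencies)

def per_locus_frequency_for(target, n_loci):
    """Per-locus genotype frequency that would give the target match
    probability if all n_loci loci had the same frequency."""
    return target ** (1.0 / n_loci)

# Assumed example: every one of 13 loci has a genotype frequency of 0.1.
p = random_match_probability([0.1] * 13)
print(f"match probability: {p:.1e} (about 1 in {1 / p:.1e})")

# Frequency per locus that corresponds to 1 in 3 trillion over 13 loci.
f = per_locus_frequency_for(1 / 3e12, 13)
print(f"required per-locus genotype frequency: {f:.3f}")
```

Under these assumptions, a genotype shared by roughly one person in nine at each locus (a per-locus frequency of about 0.11) is already enough to reach 1 in 3 trillion across 13 loci.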
Mobile (transposable) DNA (up to 50% of mammalian genomes)
Moderately repeated, mobile DNA sequences are interspersed throughout the genomes of prokaryotes, higher plants and animals
These sequences range in size from hundreds to a few thousand base pairs
The sequences are inserted into a new site in the genome by the process of transposition
Mobile DNA was once termed “selfish” DNA, but it may actually have contributed to our genetic diversity through “exon shuffling”
Classes of Mobile DNA
* DNA transposons
* Retrotransposons (via RNA intermediates)
  * LTR retrotransposons (similar to retroviral provirus)
  * Non-LTR retrotransposons (LINEs and SINEs)
Reverse transcriptase + special primer (copies RNA into cDNA)
Integrase (integrates the cDNA into a new genomic site)
LTR: contains a promoter region to initiate transcription of the retrotransposon genes
LTR retrotransposons (~8% of human genome)
cDNA copy of the RNA
Reverse transcriptase + special primer
Integrase mediates integration of the copy into new genome location
LTR retrotransposons resemble retroviruses
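The copy-and-paste logic of retrotransposition (RNA intermediate, cDNA copy, integration at a new site) can be sketched at the sequence level. This is a deliberately simplified model: the function names and sequences are hypothetical, the primer is ignored, and a real integrase also duplicates a few base pairs of the target site, which is not modeled here.

```python
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """First-strand cDNA complementary to the RNA, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))

def sense_strand(rna):
    """DNA strand with the same sequence as the RNA (U replaced by T)."""
    return rna.upper().replace("U", "T")

def integrate(genome, element, position):
    """Insert the DNA copy into the genome at the given position.
    The original element stays where it was, so the copy number grows."""
    return genome[:position] + element + genome[position:]

rna = "AUGGCU"                      # hypothetical retrotransposon RNA
print(reverse_transcribe(rna))      # first-strand cDNA
print(integrate("CCCGGG", sense_strand(rna), 3))
```

Because the donor element is transcribed rather than excised, each round of this cycle adds a copy, in contrast to the cut-and-paste movement of DNA transposons.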
Non-LTR retrotransposons (~40% of human genome)
SINEs (100-400 bp) are similar to LINEs but have lost the ORFs and can only transpose by using LINE-encoded enzymes
There are ~900,000 LINEs and ~1.6 million SINEs in the human genome
~1.1 million SINEs are Alu elements (they contain a site cleaved by the AluI restriction enzyme)
Mobile DNA elements probably had a significant influence on evolution
Spontaneous mutations may result from the insertion of a mobile DNA element into or near a transcription unit
Mobile DNA elements may contribute to gene duplication and other rearrangements, including
* duplication of exons (generating gene families)
* recombination of exons to create new genes (“exon shuffling”)
* altered control of gene expression (copying gene-regulatory elements between different promoters)
mRNA sequences can be reinserted into the genome to form processed pseudogenes
Mobile DNA elements are a possible tool for inserting therapeutic genes into patients (gene-therapy)
Extra-nuclear DNAs (=organelle DNAs)
Mitochondrial DNA (mtDNA) stems from ancestral endosymbiotic bacteria
mtDNA is inherited cytoplasmically
Human mtDNA encodes 13 proteins as well as its own rRNAs and tRNAs
Products of mitochondria are not exported
Mutations in mtDNA can cause diseases
Structural genome organization
Prokaryotic: Most bacterial genomes are carried in one circular chromosome
Stable replication requires one replication origin (ORI)
The genome is packed with polyamines and stabilizing proteins
Eukaryotic: The genome is distributed over several linear chromosomes
Stable replication occurs from several replication origins within each chromosome, and additionally requires:
* Centromeres (for equal distribution between daughter cells during mitosis)
* Telomeres (to protect the chromosome ends against shortening during replication)
Eukaryotic DNA associates with many different proteins (including histones and scaffold proteins); these DNA/protein complexes are termed chromatin
Overview: structure of genes and chromosomes
10nm fiber
Transcriptionally active form
Nucleosome:
147 bp of DNA helix wound about 1.65 times around 8 histone proteins (2 each of H2A, H2B, H3, and H4)
Chromatin exists in extended (=euchromatin) and condensed (=heterochromatin) forms
Nucleosome
10 nm fiber (beads-on-a-string)
30 nm fiber (condensed bead structure)
Non-condensed (interphase) chromosomes are organized into chromosome territories
nucleus
FISH (fluorescence in situ hybridization) of fixed human fibroblasts in interphase
Stained chromosomes have characteristic banding patterns
Giemsa staining of metaphase chromosomes reveals the DNA-dense areas as dark bands
Reading the histone code
Protruding tails of histones can be modified (like many other proteins)
The histone tails are basic (positively charged) and link the nucleosomes together in a tighter structure
Acetylation neutralizes the positive charge of the tails -> looser structure -> gene activation
Methylation (at certain lysines) -> prohibits acetylation -> gene silencing
Additionally, histones can be phosphorylated or ubiquitinated
The study of the histone code and how it determines which genes are actively transcribed in a cell is called Epigenetics