Chapter 6 Genomic Architecture: Molecular Structure of Genes and Chromosomes

Source: Bioinformatics Graz, genome.tugraz.at/MolecularBiology/WS11_Chapter06_3.pdf

Molecular definition of a gene

A gene is the entire nucleic acid sequence that is necessary for the controlled production of its final product (RNA or Protein)

In eukaryotes, genes lie amidst a large expanse of noncoding DNA with unknown function and genes may also span regions of DNA unrelated to the gene

A gene that is incapable of producing a final gene product is a pseudogene

[Figure: structure of a gene. Labels: exons (coding region, ORF), introns, basal promoter, regulatory region (enhancer/repressor), polyA site, splice site, TSS]

Bacterial operons produce polycistronic mRNAs, while most eukaryotic mRNAs are monocistronic and contain introns

Translated into 5 separate proteins

Simple and complex transcription units are found in eukaryotic genomes

The final mRNA contains a continuous ORF flanked by a 5' UTR (incl. cap and Kozak sequence) and a 3' UTR (incl. polyA site)
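The layout of a mature mRNA can be illustrated with a minimal sketch. It assumes the ORF begins at the first AUG and ends at the first in-frame stop codon, which ignores Kozak context and other subtleties; the function name and the example sequence are hypothetical.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Return (5' UTR, ORF, 3' UTR), or None if no complete ORF is found.

    Assumption: the ORF runs from the first AUG to the first in-frame stop.
    """
    start = mrna.find(START)
    if start == -1:
        return None
    # Walk codon by codon from the start codon until a stop codon appears.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOPS:
            return mrna[:start], mrna[start:i + 3], mrna[i + 3:]
    return None

# Made-up example: short 5' UTR, three-codon ORF, 3' UTR with an AAUAAA signal.
utr5, orf, utr3 = split_mrna("GGCACCAUGGCUUAACCAAUAAAGG")
```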

Many genes give rise to several transcript variants, leading to different protein isoforms

Alternative splicing of one pre-mRNA

Alternative termination giving 2 different pre-mRNAs

Alternative TSS giving 2 different pre-mRNAs
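As a toy illustration of alternative splicing, the sketch below joins different subsets of exons from one pre-mRNA into two mRNA isoforms. The exon names and sequences are invented for the example, and real splicing is of course not a simple set lookup.

```python
def splice(exons, keep):
    """Join the exons whose names are in `keep`, preserving genomic order."""
    return "".join(seq for name, seq in exons if name in keep)

# Hypothetical pre-mRNA with three exons (introns already left out).
exons = [("E1", "AUGGCU"), ("E2", "GAAGAA"), ("E3", "UGCUAA")]

isoform_a = splice(exons, {"E1", "E2", "E3"})  # all exons retained
isoform_b = splice(exons, {"E1", "E3"})        # exon E2 skipped
```

Both isoforms start with the same exon and end with the same stop codon, but they differ in length and therefore encode different protein isoforms.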

Protein-coding genes can be solitary or duplicated

Solitary genes (~50%): occur only once in the genome

Duplicated genes:

* Gene families: very similar (not identical) genes (example: containing very similar globular modules in the protein structure)

* Multicopy: identical (or nearly identical) copies of genes encoding products needed in high quantity (example: histones)

The birth of new genes

(=exon-shuffling)

Genomes of higher eukaryotes contain a lot of “nonfunctional” DNA

[Figure: composition of the genome. Labels: protein-coding, satellites and single repeats, mobile (transposable) DNA elements]

Known nonprotein-coding RNAs and their functions

Repetitious DNA - Satellite DNA (~6-7% of the human genome)

Microsatellites: tandem arrays of up to 150 repeats of a 1-13 bp unit

Presumably generated by backward slippage of the daughter strand on its template during DNA replication

The number of repeats within a microsatellite is very variable between individuals
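A minimal sketch of how such tandem repeats could be located in a sequence, using a regular expression with a back-reference. The function name, the `min_repeats` threshold, and the example sequence are assumptions made for illustration; dedicated repeat finders handle imperfect repeats, which this does not.

```python
import re

def find_microsatellites(seq, max_unit=13, min_repeats=3):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Made-up sequence containing a (CA)5 microsatellite.
hits = find_microsatellites("GGTCACACACACATTG")
```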

Microsatellites for PCR-based STR analysis

DNA is isolated and purified

PCR is performed on the DNA with primers flanking known microsatellite areas.

The amplified DNA is size-separated by gel electrophoresis

The resulting pattern of size-fractionated DNA-bands is compared to other patterns to determine similarity.

Usually each area is chosen on a separate chromosome.
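The comparison step can be sketched as follows, assuming each band size has already been measured in base pairs. The locus names, flank lengths, and band sizes are hypothetical; the only idea taken from the procedure above is that a PCR product consists of fixed flanking sequence plus a variable number of repeat units.

```python
def repeat_count(amplicon_bp, flank_bp, unit_bp):
    """Number of repeat units in a PCR product of known size."""
    return (amplicon_bp - flank_bp) // unit_bp

def profiles_match(a, b):
    """True if both profiles cover the same loci with the same allele pairs."""
    if a.keys() != b.keys():
        return False
    return all(sorted(a[locus]) == sorted(b[locus]) for locus in a)

# Hypothetical profiles: locus name -> repeat counts of the two alleles.
sample = {"locus_1": (12, 15), "locus_2": (8, 8)}
suspect = {"locus_1": (15, 12), "locus_2": (8, 8)}
other = {"locus_1": (12, 14), "locus_2": (8, 8)}

same = profiles_match(sample, suspect)   # allele order does not matter
different = profiles_match(sample, other)
```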

13 primer pairs (microsatellite areas) are enough to bring the probability that 2 random persons share the same STR pattern down to about 1 in 3 trillion
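The figure of 1 in 3 trillion follows from the product rule: if the loci are inherited independently (which is why they are usually chosen on separate chromosomes), the per-locus match probabilities multiply. The sketch below assumes, purely for illustration, that every locus has the same match probability; real calculations use measured allele frequencies per locus.

```python
import math

def random_match_probability(per_locus):
    """Product rule: multiply independent per-locus match probabilities."""
    return math.prod(per_locus)

def per_locus_for_target(target, n_loci):
    """Uniform per-locus probability that yields `target` over n_loci loci."""
    return target ** (1 / n_loci)

# Which uniform per-locus probability gives 1 in 3 trillion over 13 loci?
p = per_locus_for_target(1 / 3e12, 13)
overall = random_match_probability([p] * 13)
```

Under this simplifying assumption, a per-locus match probability of roughly 0.11 is already sufficient, so each individual locus only needs to be moderately variable.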

Mobile (transposable) DNA (up to 50% of mammalian genomes)

Moderately repeated, mobile DNA sequences are interspersed throughout the genomes of prokaryotes, higher plants and animals

These sequences range in size from hundreds to a few thousand base pairs

The sequences are inserted into a new site in the genome by the process of transposition

Mobile DNA was once termed "selfish" DNA, but it may actually have contributed to our genetic diversity through "exon shuffling"

Classes of Mobile DNA

* DNA transposons

* Retrotransposons (via RNA intermediates)

  * LTR retrotransposons (similar to retroviral provirus)

  * Non-LTR retrotransposons (LINEs and SINEs)

Transposons

Cut-and-paste vs. copy-and-paste

Copy number increase of DNA transposons
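The difference between the two transposition modes can be shown with a toy model in which the genome is just an ordered list of labels. The labels and positions are invented; the sketch only demonstrates that cut-and-paste keeps the copy number constant while copy-and-paste increases it.

```python
def cut_and_paste(genome, element, new_pos):
    """Excise the element from its old site and insert it at new_pos."""
    g = list(genome)
    g.remove(element)          # the donor site loses the element
    g.insert(new_pos, element)
    return g

def copy_and_paste(genome, element, new_pos):
    """Leave the original element in place and insert a new copy."""
    g = list(genome)
    g.insert(new_pos, element)
    return g

genome = ["geneA", "TE", "geneB", "geneC"]
moved = cut_and_paste(genome, "TE", 3)    # still one copy of TE
copied = copy_and_paste(genome, "TE", 4)  # now two copies of TE
```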

Reverse transcriptase + special primer (copies RNA into cDNA)

Integrase (integrates the cDNA into a new genomic site)

LTR: contains the promoter region that initiates transcription of the retrotransposon genes

LTR retrotransposons (~8% of human genome)

cDNA copy of the RNA

Reverse transcriptase + special primer

Integrase mediates integration of the copy into new genome location
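The copying step performed by reverse transcriptase amounts to writing the complementary DNA strand of the RNA template. A minimal sketch, with a hypothetical function name and a made-up template; priming and second-strand synthesis are left out:

```python
# Base pairing between an RNA template and the DNA strand copied from it.
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """Return the first-strand cDNA of an RNA template, written 5'->3'."""
    # The cDNA is antiparallel to the template, so read the RNA backwards.
    return "".join(PAIR[base] for base in reversed(rna))

cdna = reverse_transcribe("AUGGCU")
```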

LTR retrotransposons resemble retroviruses

Non-LTR retrotransposons (~40% of human genome)

SINEs (100-400 bp) are similar to LINEs but have lost the ORFs and can only transpose by using the enzymes encoded by LINEs

There are ~900,000 LINEs and ~1.6 million SINEs in the human genome

~1.1 million SINEs are Alu elements (named for the AluI restriction site they contain)
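AluI recognizes the sequence AGCT and cuts between the G and the C, leaving blunt ends. The following sketch digests a sequence in silico; the function name and the example sequence are made up, and only one strand is modeled.

```python
def alui_fragments(seq):
    """Cut a DNA sequence at every AluI site (AG^CT) and return the pieces."""
    site = "AGCT"
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + 2)          # blunt cut between G and C
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

fragments = alui_fragments("GGAGCTTTAGCTCC")
```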

Mobile DNA elements probably had a significant influence on evolution

Spontaneous mutations may result from the insertion of a mobile DNA element into or near a transcription unit

Mobile DNA elements may contribute to gene duplication and other rearrangements, including

* duplication of exons (generating gene families)

* recombination of exons to create new genes ("exon shuffling")

* altered control of gene expression (copying gene-regulatory elements between different promoters)

mRNA sequences can be reinserted into the genome to form processed pseudogenes

Mobile DNA elements are a possible tool for inserting therapeutic genes into patients (gene-therapy)

Extra-nuclear DNAs (=organelle DNAs)

Mitochondrial DNA (mtDNA) stems from ancestral endosymbiotic bacteria

mtDNA is inherited cytoplasmically

Human mtDNA encodes 13 proteins as well as its own rRNAs and tRNAs

Products of mitochondria are not exported

Mutations in mtDNA can cause diseases

Structural genome organization

Prokaryotic:

Most bacterial genomes are carried in one circular chromosome

Stable replication requires one replication origin (ORI)

The genome is packed with polyamines and stabilizing proteins

Eukaryotic:

The genome is distributed over several linear chromosomes

Stable replication occurs from several replication origins within each chromosome, and additionally requires:

* Centromeres (for equal distribution between daughter cells during mitosis)

* Telomeres (to protect the chromosome ends against shortening during replication)

Eukaryotic DNA associates with many different proteins (including histones and scaffold proteins); these DNA/protein complexes are termed chromatin

Overview: structure of genes and chromosomes

10 nm fiber

Transcriptionally active form

Nucleosome:

147 bp of DNA helix wound about 1.65 times around 8 histone proteins (2 each of H2A, H2B, H3, and H4)
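A rough back-of-the-envelope estimate of how many nucleosomes a stretch of DNA can carry. The 147 bp core comes from the text; the 50 bp linker between nucleosomes is an assumed value for illustration only, since linker length varies between species and cell types.

```python
CORE_BP = 147  # DNA wrapped around one histone octamer (from the text)

def nucleosome_count(dna_bp, linker_bp=50):
    """Estimate nucleosomes on dna_bp of DNA; linker_bp is an assumption."""
    return dna_bp // (CORE_BP + linker_bp)

# Estimate for one megabase of DNA with the assumed 50 bp linker.
per_megabase = nucleosome_count(1_000_000)
```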

Chromatin exists in extended (=euchromatin) and condensed (=heterochromatin) forms

Nucleosome

10 nm fiber (beads-on-a-string)

30 nm fiber (condensed bead structure)

Nucleosomes are complexes of histones

Gold: H2A, Red: H2B, Blue: H3, Green: H4

The structure of the 30 nm fiber

Heterochromatin consists of chromosome regions that do not uncoil

Non-condensed (interphase) chromosomes are organized into chromosome territories

nucleus

FISH (fluorescence in situ hybridization) of fixed human fibroblasts in interphase

Condensation of chromosomes in metaphase

A model for chromatin packing in metaphase chromosomes

Metaphase chromosome (electron micrograph)

Stained chromosomes have characteristic banding patterns

Giemsa staining of metaphase chromosomes reveals the DNA-dense areas as dark bands

Figure 9-38

Chromosome painting distinguishes each homologous pair by color

Reading the histone code

Protruding tails of histones can be modified (like many other proteins)

The histone tails are basic (positively charged) and link the nucleosomes together in a tighter structure

Acetylation neutralizes the positive charge of the tails -> looser structure -> gene activation

Methylation -> prohibits acetylation -> gene silencing

Additionally, histones can be phosphorylated or ubiquitinated

The study of the histone code and how it determines which genes are actively transcribed in a cell is called epigenetics

Transgenerational maintenance of chromatin marks