Human Genetics

Weibin Shi, Michele Sale



Contact Information

Shi: [email protected]; 243-9420

Sale: [email protected]; 982-0368

Recommended textbooks

Medical Genetics - Jorde, Carey, Bamshad & White
• Mosby, ISBN-13: 978-0-323-04035-8

Human Molecular Genetics - Strachan T, Read A
• Garland Science, ISBN-10: 0815341822

Overview of course content

1: Organization of the human genome
2: Genetic variation
3: Patterns of inheritance
4: Population genetics
5: Linkage disequilibrium
6: Genetic epidemiology
7: Applied research in human genetics

Organization of the human genome

Human genome sequence published - February 2001

Genes are found in the nucleus and mitochondria

Nuclear genome packaged with proteins to form chromatin

Human chromosomes

23 pairs
46 chromosomes
22 pairs – autosomes
1 pair sex chromosomes

46,XY: normal male
46,XX: normal female


A little more basic terminology

Human genome = nuclear genome + mitochondrial genome

Nuclear genome: 24 distinct chromosomes (22 autosomal + X + Y), 3,200 Mbp, 25,000 genes

Mitochondrial genome: 16,569 bp, 37 genes

Human Mitochondrial Genome

Small (16.5 kb) circular DNA
rRNA, tRNA and protein-encoding genes (37)
1 gene/0.45 kb
Very few repeats
No introns
93% coding
Genes are transcribed as multigenic (polycistronic) transcripts
Maternal inheritance

What are the mitochondrial genes?

24 of 37 genes are RNA coding
• 22 tRNA
• 2 ribosomal RNA (12S, 16S)

13 of 37 genes are protein coding
• some subunits of respiratory complexes and oxidative phosphorylation enzymes

Limited autonomy of mitochondrial genome

Complex                    mt encoded    nuclear
NADH dehydrogenase         7 subunits    35 subunits
Cytochrome b-c1 complex    1 subunit     10 subunits
Cytochrome c oxidase       3 subunits    10 subunits
ATP synthase complex       2 subunits    14 subunits
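The subunit counts above can be cross-checked against the 13 protein-coding mitochondrial genes with a few lines of Python. This is only a sketch: the dictionary name and the helper `mt_share` are made up for illustration, and the numbers are copied from the slide as given.

```python
# Subunit counts copied from the slide: (mt encoded, nuclear encoded).
# OXPHOS_SUBUNITS and mt_share are illustrative names, not from the lecture.
OXPHOS_SUBUNITS = {
    "NADH dehydrogenase": (7, 35),
    "Cytochrome b-c1 complex": (1, 10),
    "Cytochrome c oxidase": (3, 10),
    "ATP synthase complex": (2, 14),
}


def mt_share(subunits):
    """Return {complex: fraction of its subunits encoded by mtDNA}."""
    return {name: mt / (mt + nuc) for name, (mt, nuc) in subunits.items()}


mt_total = sum(mt for mt, _ in OXPHOS_SUBUNITS.values())
nuclear_total = sum(nuc for _, nuc in OXPHOS_SUBUNITS.values())

# The mt-encoded total should match the 13 protein-coding mt genes.
print(mt_total, nuclear_total)
for name, share in mt_share(OXPHOS_SUBUNITS).items():
    print(f"{name}: {share:.0%} of subunits are mt encoded")
```

The mt-encoded column sums to 13, while most subunits of every complex come from the nuclear genome, which is the sense in which mitochondrial autonomy is limited.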

Two overlapping genes encoded by the same strand of mtDNA (unique example)

Two independent ATG start codons lie in reading frames shifted relative to each other; the second stop codon (TAA) is derived from TA + A (from the poly-A tail)

Mitochondrial codon table
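The slide names the table without listing its contents. As a rough sketch, the differences from the standard code can be listed in Python using the amino-acid strings of NCBI translation tables 1 (standard) and 2 (vertebrate mitochondrial). Those strings, and the function and variable names, are not from the lecture; they are included here as assumptions for illustration.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (NCBI translation tables 1 and 2).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD_TABLE = dict(zip(CODONS, STANDARD))
MITO_TABLE = dict(zip(CODONS, VERT_MITO))


def translate(dna, table):
    """Translate a DNA string codon by codon; trailing 1-2 bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


# Codons whose meaning differs between the two codes: {codon: (standard, mito)}.
differences = {
    codon: (STANDARD_TABLE[codon], MITO_TABLE[codon])
    for codon in CODONS
    if STANDARD_TABLE[codon] != MITO_TABLE[codon]
}
print(differences)
print(translate("ATGTGAATAAGA", STANDARD_TABLE))
print(translate("ATGTGAATAAGA", MITO_TABLE))
```

Four codons differ: TGA (stop in the nuclear code, Trp in mitochondria), ATA (Ile versus Met), and AGA and AGG (Arg versus stop).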

Human Nuclear Genome

3,200 Mb
23 (XX) or 24 (XY) distinct linear chromosomes
25,000 genes
1 gene/120 kb
Introns in most genes
1.5% of DNA is coding
Genes are transcribed individually
Repetitive DNA sequences (45%)
Inherited from both parents
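The gene-density figures quoted for the two genomes (1 gene/0.45 kb and 1 gene/120 kb) follow from dividing genome size by gene count. A minimal sketch of that arithmetic, with `kb_per_gene` as an illustrative helper name:

```python
def kb_per_gene(genome_size_bp, gene_count):
    """Average spacing between genes in kilobases."""
    return genome_size_bp / gene_count / 1_000


# Sizes and gene counts as given on the slides.
mito_density = kb_per_gene(16_569, 37)
nuclear_density = kb_per_gene(3_200_000_000, 25_000)

print(f"mitochondrial: 1 gene per {mito_density:.2f} kb")
print(f"nuclear:       1 gene per {nuclear_density:.0f} kb")
```

With these inputs the nuclear value comes out at 128 kb, so the 120 kb on the slide is best read as a rounded figure; either way the nuclear genome is roughly 280 times less gene-dense than the mitochondrial one.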

Human Nuclear Genome

In the human nuclear genome, gene-rich regions are separated by gene deserts

Chr. 19 has the highest gene density

Chr. 13 & Y show the lowest gene density

Human genome base content

41% GC on average
38% GC for Chr. 4 and Chr. 13
49% for Chr. 19

Regions with wide swings in GC content (e.g. from 33.1% to 59.3%)

Gene density correlates with higher GC content
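GC (G+C) content is simply the fraction of G and C bases in a sequence, and the "wide swings" show up when it is computed over successive windows. The sketch below assumes plain DNA strings; the function names and the toy sequence are illustrative and not from the lecture.

```python
def gc_content(seq):
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def windowed_gc(seq, window, step):
    """GC content of each full window of length `window`, moving by `step`."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


toy = "ATATATATGCGCGCGC"  # made-up sequence: AT-rich half, then GC-rich half
print(gc_content(toy))
print(windowed_gc(toy, window=8, step=8))
```

On real chromosomes the window would be tens or hundreds of kilobases rather than 8 bases, but the calculation is the same.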

CpG dinucleotide depletion

Expected frequency is 4.2%
Observed frequency is five times lower
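The 4.2% expectation can be reproduced from the 41% average GC content, assuming C and G are equally frequent (20.5% each) and that neighbouring bases are independent: 0.205 × 0.205 ≈ 0.042. The sketch below shows that calculation and a common observed/expected ratio for a single sequence. The function names are illustrative, and the independence assumption is the simplification stated here, not a claim from the slides.

```python
def expected_cpg_frequency(gc_fraction):
    """Expected CpG dinucleotide frequency if C and G are equally common
    and adjacent bases are independent."""
    c = g = gc_fraction / 2
    return c * g


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


expected = expected_cpg_frequency(0.41)
print(f"expected CpG frequency: {expected:.1%}")
print(f"five times lower:       {expected / 5:.2%}")
```

A ratio near 1 from `cpg_obs_exp` means no depletion; bulk genomic DNA at the observed level would give a ratio near 0.2.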

Location of CpG islands in the gene

CpG islands in the regulatory areas of human genes

Human nuclear genome

Gene density varies widely
On average, 9 exons per gene
363 exons in the titin gene
Certain genes are intronless
Largest intron is 800 kb (WWOX gene)
Smallest introns: 10 bp
Average 5’ UTR: 0.2-0.3 kb
Average 3’ UTR: 0.77 kb
Largest protein: titin, 38,138 aa

Gene density varies substantially between chromosomal regions

Genes vary in size and exon content

INTRONLESS GENES

Interferon genes
Histone genes
Many ribonuclease genes
Heat shock protein genes
Many G-protein coupled receptors
Various neurotransmitter receptors and hormone receptors

Genes within genes

Classical gene families: members exhibit a high degree of sequence similarity

alpha-albumin, serum albumin, vitamin D-binding protein

four placenta-specific genes, primates only

CS = chorionic somatomammotropin

Gene families: gene products bearing short conserved amino acid motifs

DEAD box proteins are involved in mRNA splicing and translation initiation; DEAD box (Asp-Glu-Ala-Asp)

WD proteins take part in a variety of regulatory functions; GH (Gly-His) should be at 23-41 aa distance from WD (Trp-Asp)
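Short motifs like these can be searched for in a protein sequence with regular expressions. The sketch below is deliberately simplified: it reads "23-41 aa distance" as 23 to 41 residues between GH and WD, which is an assumption, and the pattern names, function name and test sequences are made up. Real motif databases use more permissive profiles than an exact four-letter match.

```python
import re

# One-letter amino acid codes: D = Asp, E = Glu, A = Ala, G = Gly, H = His, W = Trp.
DEAD_BOX = re.compile(r"DEAD")
# Assumption: GH and WD separated by 23 to 41 residues of any kind.
WD_REPEAT = re.compile(r"GH[A-Z]{23,41}WD")


def find_motif(pattern, protein):
    """Return (start, end) positions of non-overlapping matches, 0-based."""
    return [m.span() for m in pattern.finditer(protein.upper())]


# Made-up sequences for illustration only.
dead_like = "MKTLVDEADRMLK"
wd_like = "MK" + "GH" + "A" * 30 + "WD" + "K"
too_close = "MK" + "GH" + "A" * 10 + "WD" + "K"

print(find_motif(DEAD_BOX, dead_like))
print(find_motif(WD_REPEAT, wd_like))
print(find_motif(WD_REPEAT, too_close))
```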

Gene superfamily: Proteins that are functionally related in a general sense, but show only weak homology

Functionally similar genes are occasionally clustered, but usually dispersed throughout the genome

Non-coding RNA genes

Code for functional RNA
ncRNA represents 98% of all transcripts in a mammalian cell
ncRNA can be:
• Structural
• Catalytic
• Regulatory

How many genes in the nuclear genome?

~3000 RNA genes in the nuclear genome (~10% of human gene count) have not been taken into account in gene counts

Non-coding RNA

tRNA – transfer RNA: involved in translation
rRNA – ribosomal RNA: structural component of ribosome, where translation takes place
snoRNA – small nucleolar RNA: functional/catalytic in rRNA maturation
Antisense RNA: gene regulation/silencing

microRNA

A new class of non-coding RNA gene
Products are 19-25 nt RNAs
Precursors are 70-100 nt
Block translation or result in degradation of target mRNA

Tandem repeats and interspersed repeats

Satellite DNA is repetitive DNA that could be separated by centrifugation

Equilibrium density gradient centrifugation: sheared DNA in cesium chloride gradient

Satellite DNA

Alpha-satellite (centromere DNA)
Microsatellite
Minisatellite

Microsatellite

di-, tri-, and tetra-nucleotide repeats

~10% of the nuclear genome

TGCTCATCATCATCAGC
TGCTCATCA------GC

TGCCACACACACACACACAGC
TGCCACACACACA------GC

TGCTCAGTCAGTCAGTCAGGC
TGCTCAGTCAG--------GC
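Each pair of sequences above shows two alleles that differ only in the number of copies of the repeat unit (TCA, CA, TCAG). One way to count the copies, sketched in Python with an illustrative helper name; the alignment gaps ("-") are stripped before counting.

```python
import re


def longest_repeat_run(seq, unit):
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    seq = seq.upper().replace("-", "")  # drop alignment gaps
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Allele pairs copied from the slide, with their repeat unit.
alleles = [
    ("TCA", "TGCTCATCATCATCAGC", "TGCTCATCA------GC"),
    ("CA", "TGCCACACACACACACACAGC", "TGCCACACACACA------GC"),
    ("TCAG", "TGCTCAGTCAGTCAGTCAGGC", "TGCTCAGTCAG--------GC"),
]
for unit, long_allele, short_allele in alleles:
    print(unit, longest_repeat_run(long_allele, unit),
          longest_repeat_run(short_allele, unit))
```

The copy numbers come out as 4 vs 2 (TCA), 8 vs 5 (CA) and 4 vs 2 (TCAG).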

Minisatellites

  1 tgattggtct ctctgccacc gggagatttc cttatttgga ggtgatggag gatttcagga
 61 attttttagg aattttttta atggattacg ggattttagg gttctaggat tttaggatta
121 tggtatttta ggatttactt gattttggga ttttaggatt gagggatttt agggtttcag
181 gatttcggga tttcaggatt ttaagttttc ttgattttat gattttaaga ttttaggatt
241 tacttgattt tgggatttta ggattacggg attttagggt ttcaggattt cgggatttca
301 ggattttaag ttttcttgat tttatgattt taagatttta ggatttactt gattttggga
361 ttttaggatt acgggatttt agggtgctca ctatttatag aactttcatg gtttaacata
421 ctgaatataa atgctctgct gctctcgctg atgtcattgt tctcataata cgttcctttg

Repeat: AGGAATTTTT

• 6-64 bp repeating pattern

α-Satellite repeat

• 171 bp sequence repeat

Interspersed repetitive DNA

SINE (short interspersed nuclear elements):
• Alu: ~0.3 kb, ~10.7% of human DNA (1,200,000 copies)
• MIR: ~0.13 kb, 3% of human DNA (500,000 copies)

LINE (long interspersed nuclear elements):
• ~0.8 kb, ~21% of human DNA (~1,000,000 copies)
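The quoted genome percentages can be roughly checked by multiplying element length by copy number and dividing by the 3,200 Mb genome size. In the sketch below the LINE copy number is taken as about 1,000,000, which is an assumption about the garbled figure on the slide; the dictionary and function names are illustrative.

```python
GENOME_KB = 3_200_000  # 3,200 Mb nuclear genome, as given in the slides

# (average length in kb, copy number) per repeat family.
REPEATS = {
    "Alu": (0.3, 1_200_000),
    "MIR": (0.13, 500_000),
    "LINE": (0.8, 1_000_000),  # copy number is an assumed reading of the slide
}


def genome_fraction(unit_kb, copies, genome_kb=GENOME_KB):
    """Fraction of the genome occupied by `copies` elements of `unit_kb` each."""
    return unit_kb * copies / genome_kb


for name, (unit_kb, copies) in REPEATS.items():
    print(f"{name}: ~{genome_fraction(unit_kb, copies):.1%} of the genome")
```

The estimates (about 11%, 2% and 25%) land near the quoted 10.7%, 3% and 21%, which is as close as rounded average lengths allow.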

Chromosomal location of repeats

Pseudogenes

Non-functional copy of a gene

Non-processed pseudogene
• Nonfunctional copies of the genomic DNA sequence of a gene
• Contain exons, introns, and flanking sequences

Processed pseudogene
• Nonfunctional copies of the exonic sequences of a gene
• Reverse-transcribed from an RNA transcript
• No 5’ promoter
• No introns
• Often includes polyA tail

Both include events that make the gene non-functional
• Frameshift
• Stop codons

As many as 20-30% of all genomic sequence predictions could be pseudogenes

We assume pseudogenes have no function, but we really don’t know!
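The two disabling events named above, frameshifts and stop codons, can be screened for in a candidate coding sequence. This is a toy sketch under simplifying assumptions: the input is a putative coding sequence starting at its first codon, the nuclear stop codons TAA, TAG and TGA apply, and a length that is not a multiple of three is treated as a hint of a frameshift. The function name and test sequences are made up; real pseudogene detection compares the sequence against its functional parent gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear stop codons


def disabling_features(cds):
    """List simple signs that a putative coding sequence is disabled."""
    cds = cds.upper()
    features = []
    if len(cds) % 3 != 0:
        features.append("length not a multiple of 3 (possible frameshift)")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The last full codon is skipped: a stop there is the normal terminator.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            features.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    return features


print(disabling_features("ATGAAATAA"))     # intact toy ORF
print(disabling_features("ATGTAAAAATAA"))  # in-frame stop at codon 2
print(disabling_features("ATGAAAATAA"))    # one extra base
```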

Human Genome Organization

HUMAN GENOME
  Nuclear genome: 3,200 Mb, 25,000 genes
    Genes and gene-related sequences (unique or moderately repetitive)
      Coding DNA
      Noncoding DNA
        Pseudogenes
        Gene fragments
        Introns, untranslated sequences, etc.
    Extragenic DNA
      Unique or low copy number
      Moderate to highly repetitive
        Tandemly repeated
        Interspersed repeats
  Mitochondrial genome: 16.5 kb, 37 genes
    Two rRNA genes
    22 tRNA genes
    13 polypeptide-encoding genes