The genomes of living organisms vary enormously in size

Page 1: The genomes of living organisms vary enormously in size

The genomes of living organisms vary enormously in size

Page 2: The genomes of living organisms vary enormously in size

Four classes of DNA polymorphisms

Page 3: The genomes of living organisms vary enormously in size

Single nucleotide polymorphism (SNP)

  • Single base-pair substitutions
  • Arise by mutagenic chemicals or mistakes in replication
  • Biallelic – only two alleles
  • 2001 – over 5 million human SNPs identified
  • Most occur at anonymous loci
  • Useful as DNA markers

Page 4: The genomes of living organisms vary enormously in size

Fig. 11.2

Page 5: The genomes of living organisms vary enormously in size

Microsatellites

  • 1 every 30,000 bp
  • Repeated units 2 – 5 bp in length
  • Mutate by replication error
  • Useful as highly polymorphic DNA markers

Fig. 11.3

Page 6: The genomes of living organisms vary enormously in size

Minisatellites

  • Repeating units 20-100 bp long
  • Total length of 0.5 – 20 kb
  • 1 per 100,000 bp, or about 30,000 in whole genome

Fig. 11.4

Page 7: The genomes of living organisms vary enormously in size

Deletions, duplications, and insertions

  • Expand or contract the length of nonrepetitive DNA
  • Small deletions and duplications arise by unequal crossing over
  • Small insertions can also be caused by transposable elements
  • Much less common than other polymorphisms

Page 8: The genomes of living organisms vary enormously in size

Figure 11.5

Page 9: The genomes of living organisms vary enormously in size

Formation of haplotypes over time

Page 10: The genomes of living organisms vary enormously in size

SNP detection using Southern blots

  • Restriction fragment length polymorphisms (RFLPs) are size changes in fragments due to the loss or gain of a restriction site

Fig. 11.6

Page 11: The genomes of living organisms vary enormously in size

SNP detection by PCR

  • Must know sequence on either side of polymorphism
  • Amplify fragment
  • Expose to restriction enzyme
  • Gel electrophoresis
  • e.g., sickle-cell genotyping with a PCR-based protocol
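The PCR-then-digest idea can be illustrated with a minimal Python sketch. It assumes a 40 bp amplicon modeled on the start of the β-globin coding sequence, a recognition site CCTNAGG (the pattern usually cited for MstII), and a cut two bases into the site. The sequences, the cut offset, and the function names are illustrative assumptions, not part of the slides.

```python
import re

SITE = r"CCT[ACGT]AGG"  # assumed recognition pattern (CCTNAGG)
CUT_OFFSET = 2          # assumed cut position inside the site

def digest(amplicon, site_regex=SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths after cutting at every recognition site."""
    cuts = [m.start() + cut_offset for m in re.finditer(site_regex, amplicon)]
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

def gel_bands(allele_a, allele_b):
    """Band sizes expected on a gel for a diploid genotype."""
    return sorted(set(digest(allele_a)) | set(digest(allele_b)))

# Hypothetical amplicons: the sickle allele changes GAG to GTG and
# thereby destroys the restriction site.
LEFT, RIGHT = "ATGGTGCACCTGACT", "AAGTCTGCCGTTACTG"
normal = LEFT + "CCTGAGGAG" + RIGHT
sickle = LEFT + "CCTGTGGAG" + RIGHT

print(gel_bands(normal, normal))  # two short bands
print(gel_bands(normal, sickle))  # cut and uncut bands together
print(gel_bands(sickle, sickle))  # one uncut band
```

In this toy model a heterozygote shows all three band sizes, which is why a single gel lane can distinguish the three genotypes.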

Fig. 11.7

Page 12: The genomes of living organisms vary enormously in size

SNP detection by ASO

  • Very short probes (<21 bp) that hybridize to one allele or other
  • Such probes are allele-specific oligonucleotides (ASOs)

Fig. 11.8

Page 13: The genomes of living organisms vary enormously in size

ASOs can determine genotype at any SNP locus

Fig. 11.9 a-c

Page 14: The genomes of living organisms vary enormously in size

Hybridized and labeled with ASO for allele 1

Hybridized and labeled with ASO for allele 2

Fig. 11.9 d, e

Page 15: The genomes of living organisms vary enormously in size

Preimplantation embryo diagnosis of CF using ASO analysis

Fig. 11.1

Page 16: The genomes of living organisms vary enormously in size

Fig. 11.1

Page 17: The genomes of living organisms vary enormously in size

Fig. 11.1

Page 18: The genomes of living organisms vary enormously in size

High-throughput instruments

  • e.g., microarrays

Fig. 10.24

Page 19: The genomes of living organisms vary enormously in size

Large-scale multiplex ASO analysis with microarrays can detect BRCA1 mutations

  • Each column contains an ASO differing only at the nucleotide position under analysis
  • BRCA1 DNA from any one allele can hybridize to only one of the four ASOs in a column
  • Heterozygotes are easily detected

Fig. 11.10
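The column logic of the array can be sketched in a few lines of Python. This is a toy model under stated assumptions: hybridization is reduced to exact string identity between probe and allele (real probes are complementary and conditions matter), and the 7-base window, the position, and the function names are hypothetical.

```python
BASES = "ACGT"

def aso_column(reference, pos):
    """Four probes identical to the reference except at position pos."""
    return {b: reference[:pos] + b + reference[pos + 1:] for b in BASES}

def lit_spots(alleles, reference, pos):
    """Bases whose probe perfectly matches at least one of the alleles."""
    column = aso_column(reference, pos)
    return sorted(b for b, probe in column.items() if probe in alleles)

reference = "GATTACA"  # hypothetical window around the tested position
homozygote = {"GATTACA"}
heterozygote = {"GATTACA", "GATGACA"}

print(lit_spots(homozygote, reference, 3))    # one spot lights up
print(lit_spots(heterozygote, reference, 3))  # two spots light up
```

Each allele matches exactly one probe in the column, so one lit spot indicates a homozygote and two lit spots indicate a heterozygote.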

Page 20: The genomes of living organisms vary enormously in size

Primer extension to detect SNPs

Page 21: The genomes of living organisms vary enormously in size

Mass spectrometer

Fig. 10.27

Page 22: The genomes of living organisms vary enormously in size

Microsatellite allele detection

  • Analysis of size differences
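Sizing works because each extra repeat unit lengthens the PCR product by a fixed amount. The sketch below assumes primers that leave a known total of flanking sequence around the repeat tract; the flank length, the CA unit, and the function names are illustrative assumptions.

```python
import re

def amplicon_size(n_repeats, unit, flank_bp):
    """PCR product length for an allele with n_repeats copies of unit."""
    return flank_bp + n_repeats * len(unit)

def repeats_from_size(size_bp, unit_len, flank_bp):
    """Infer the repeat count of an allele from its measured band size."""
    extra = size_bp - flank_bp
    if extra < 0 or extra % unit_len:
        raise ValueError("size is not flank plus a whole number of units")
    return extra // unit_len

def longest_run(seq, unit):
    """Number of units in the longest uninterrupted tandem run."""
    runs = re.finditer(f"(?:{re.escape(unit)})+", seq)
    return max((len(m.group()) // len(unit) for m in runs), default=0)

print(amplicon_size(20, "CA", 100))     # band size for a 20-repeat allele
print(repeats_from_size(140, 2, 100))   # repeat count from a 140 bp band
print(longest_run("GGCACACACATT", "CA"))
```

Two alleles differing by a single dinucleotide repeat differ by only 2 bp, so the gel or capillary system has to resolve small size differences.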

Fig. 11.12

Page 23: The genomes of living organisms vary enormously in size

Huntington’s disease is an example of a microsatellite triplet repeat in a coding region

Fig. 11.13

Page 24: The genomes of living organisms vary enormously in size

Minisatellite detection and DNA fingerprinting

  • 1985 – Alec Jeffreys made two key findings
      • Each minisatellite locus is highly polymorphic
      • Most minisatellites occur at multiple sites around the genome
  • DNA fingerprint – pattern of simultaneous genotypes at a group of unlinked loci
  • Use restriction enzymes and Southern blots to detect length differences at minisatellite loci
  • Most useful minisatellites have 10 – 20 sites around genome and can be analyzed on one gel

Page 25: The genomes of living organisms vary enormously in size

Minisatellite analysis

Fig. 11.14

Page 26: The genomes of living organisms vary enormously in size

DNA fingerprints can identify individuals and determine parentage

  • E.g., DNA fingerprints confirmed Dolly the sheep was cloned from an adult udder cell
  • Donor udder (U), cell culture from udder (C), Dolly’s blood cell DNA (D), and control sheep 1-12

Fig. 11.15

Page 27: The genomes of living organisms vary enormously in size
Page 28: The genomes of living organisms vary enormously in size

Human Karyotype

  (a) Complete set of human chromosomes stained with Giemsa dye shows bands
  (b) Ideograms show idealized banding pattern

Fig. 10.5 a

Page 29: The genomes of living organisms vary enormously in size

Chromosome 7 at three levels of resolution

Fig. 10.5 b

Page 30: The genomes of living organisms vary enormously in size

FISH protocol for top-down approach

Page 31: The genomes of living organisms vary enormously in size

DNA hybridization and restriction mapping – a bottom-up approach

Fig. 10.7

Page 32: The genomes of living organisms vary enormously in size

Identifying and isolating a set of overlapping fragments from a library

Two approaches

  • Linkage maps used to derive a physical map
      • Set of markers less than 1 cM apart
      • Use markers to retrieve fragments from library by hybridization
      • Construct contigs – two or more partially overlapping cloned fragments
      • Chromosome walk by using ends of unconnected contigs to probe library for fragments in unmapped regions
  • Physical mapping techniques
      • Direct analysis of DNA
      • Overlapping clones aligned by restriction mapping
      • Sequence-tagged sites (STSs)

Page 33: The genomes of living organisms vary enormously in size
Page 34: The genomes of living organisms vary enormously in size

High density linkage mapping to build overlapping set of genomic clones

Fig. 10.8

Page 35: The genomes of living organisms vary enormously in size
Page 36: The genomes of living organisms vary enormously in size
Page 37: The genomes of living organisms vary enormously in size

Physical mapping of overlapping genomic clones without linkage information

Fig. 10.10

Page 38: The genomes of living organisms vary enormously in size

Physical mapping by analysis of STSs

Each STS represents a unique segment of the genome amplified by PCR.

Fig. 10.11

Page 39: The genomes of living organisms vary enormously in size

Sequence maps show the order of nucleotides in a cloned piece of DNA

  • Two strategies for sequencing the human genome
      • Hierarchical shotgun approach
      • Whole-genome shotgun approach
  • Shotgun – randomly generated overlapping insert fragments
      • Fragments from BACs
      • Fragments from shearing whole genome
  • Shearing DNA with sonication
  • Partial digestion with restriction enzymes
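How randomly generated overlapping fragments can be stitched back together is easiest to see in a toy assembler. The sketch below uses a simple greedy suffix-prefix merge, which is an assumption chosen for illustration and not the software used by either genome project. It ignores sequencing errors, the reverse strand, and repeats. The reads and function names are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps: several contigs are left
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

reads = ["GATTACAGG", "ACAGGCTTC", "GCTTCAAG"]  # hypothetical fragments
print(greedy_assemble(reads))
```

When the reads do not all overlap, the function returns more than one string, which corresponds to separate contigs separated by gaps.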

Page 40: The genomes of living organisms vary enormously in size

Hierarchical shotgun strategy
Used in publicly funded effort to sequence human genome

  • Shear 200 kb BAC clone into ~2 kb fragments
  • Sequence insert ends to 10-fold coverage
  • Need about 1700 plasmid inserts per BAC and about 20,000 BACs to cover genome
  • Data from linkage and physical maps used to assemble sequence maps of chromosomes
  • Significant work to create libraries of each BAC and physically map BAC clones
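The figures on this slide can be checked with a short calculation. The sketch assumes a read length of about 600 bp and two end reads per plasmid insert; neither value is stated on the slide, and the 3 Gb genome size is likewise an assumption. Under those assumptions the insert count comes out close to the quoted 1700.

```python
import math

BAC_BP = 200_000      # from the slide
COVERAGE = 10         # from the slide
N_BACS = 20_000       # from the slide
READ_BP = 600         # assumption: typical read length
READS_PER_INSERT = 2  # assumption: both ends of each insert are read
GENOME_BP = 3_000_000_000  # assumption: ~3 Gb haploid human genome

def inserts_needed(bac_bp, coverage, read_bp, reads_per_insert):
    """Plasmid inserts required to reach the target coverage of one BAC."""
    bases_required = bac_bp * coverage
    return math.ceil(bases_required / (read_bp * reads_per_insert))

inserts = inserts_needed(BAC_BP, COVERAGE, READ_BP, READS_PER_INSERT)
bac_total_bp = N_BACS * BAC_BP

print(inserts)                   # close to the ~1700 quoted on the slide
print(bac_total_bp / GENOME_BP)  # BAC redundancy over the genome
```

The 20,000 BACs add up to more than one genome length, which leaves room for the overlaps between neighboring BACs that physical mapping relies on.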

Fig. 10.12

Page 41: The genomes of living organisms vary enormously in size

Whole-genome shotgun sequencing
Used by private company Celera to sequence whole human genome

  • Whole genome randomly sheared three times
      • Plasmid library constructed with ~2 kb inserts
      • Plasmid library with ~10 kb inserts
      • BAC library with ~200 kb inserts
  • Computer program assembles sequences into chromosomes
  • No physical map construction
  • Only one BAC library
  • Overcomes problems of repeat sequences

Fig. 10.13

Page 42: The genomes of living organisms vary enormously in size

Sequencing of the human genomeSequencing of the human genome

Most of draft took place during last year of Most of draft took place during last year of projectproject Intruments improvements – 345,600 bp/dayIntruments improvements – 345,600 bp/day Automated factory-like production line Automated factory-like production line

generated sufficient DNA to supply sequencers generated sufficient DNA to supply sequencers on a daily basison a daily basis

Large sequencing centers with 100-300 Large sequencing centers with 100-300 instruments – 103,680,000 bp/day (10-fold instruments – 103,680,000 bp/day (10-fold coverage in 30 days)coverage in 30 days)
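The throughput figures can be reproduced with simple arithmetic. The sketch assumes a genome of about 3 Gb, which is not stated on the slide. The center-wide rate matches 300 instruments exactly. Under the 3 Gb assumption, 30 days at that rate gives roughly one genome-equivalent of raw sequence, so the quoted 10-fold figure may rest on different assumptions.

```python
PER_INSTRUMENT_BP_PER_DAY = 345_600  # from the slide
GENOME_BP = 3_000_000_000            # assumption: ~3 Gb haploid human genome

def center_rate(instruments):
    """Raw bases per day produced by a center with this many instruments."""
    return instruments * PER_INSTRUMENT_BP_PER_DAY

def days_for_coverage(instruments, fold, genome_bp=GENOME_BP):
    """Days of sequencing needed to reach the given fold coverage."""
    return fold * genome_bp / center_rate(instruments)

print(center_rate(300))                      # matches the slide's daily total
print(round(days_for_coverage(300, 1), 1))   # days for 1-fold coverage
print(round(days_for_coverage(300, 10), 1))  # days for 10-fold coverage
```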

Page 43: The genomes of living organisms vary enormously in size

High-throughput DNA sequencing

Fig. 10.23

Page 44: The genomes of living organisms vary enormously in size

Integration of linkage, physical, and sequence maps

  • Provides check on the correct order of each map against other two
  • SSR and SNP DNA linkage markers readily integrated into physical map by PCR analysis across insert clones in physical map
  • SSR, SNP (linkage maps), and STS markers (physical maps) have unique sequences 20 bp or more allowing placement on sequence map

Page 45: The genomes of living organisms vary enormously in size

Cloning human genes

A pedigree of the royal family descended from Queen Victoria, in which hemophilia A is segregating

Fig. 11.16 a

Page 46: The genomes of living organisms vary enormously in size

Blood-clotting cascade in which vessel damage causes a cascade of inactive factors to be converted to active factors

Fig. 11.16 b

Page 47: The genomes of living organisms vary enormously in size

Blood tests determine if active form of each factor in the cascade is present

Fig. 11.16 c

Page 48: The genomes of living organisms vary enormously in size

Techniques used to purify Factor VIII and clone the gene

Fig. 11.16 d

Page 49: The genomes of living organisms vary enormously in size

Positional Cloning – Step 1

  • Find extended families in which disease is segregating
  • Use panel of polymorphic markers spaced at 10 cM intervals across all chromosomes
      • About 300 markers total
  • Determine genotype for all individuals in families for each DNA marker
  • Look for linkage between a marker and disease phenotype
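One common way to quantify linkage between a marker and a disease phenotype is a LOD score. The slide does not name a statistic, so the sketch below uses the textbook two-point, phase-known form as an assumption. The meiosis counts are hypothetical, and the function names are illustrative.

```python
import math

MARKERS = 300     # from the slide
SPACING_CM = 10   # from the slide

def map_length_covered(markers=MARKERS, spacing_cm=SPACING_CM):
    """Approximate genetic map length spanned by the marker panel, in cM."""
    return markers * spacing_cm

def lod(n_meioses, n_recombinant, theta):
    """log10 odds of linkage at recombination fraction theta vs. theta = 0.5."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    linked = theta ** n_recombinant * (1 - theta) ** (n_meioses - n_recombinant)
    unlinked = 0.5 ** n_meioses
    return math.log10(linked / unlinked)

def best_theta(n_meioses, n_recombinant):
    """Grid search for the theta that maximizes the LOD score."""
    grid = [t / 100 for t in range(1, 51)]
    return max(grid, key=lambda t: lod(n_meioses, n_recombinant, t))

print(map_length_covered())          # cM spanned by 300 markers at 10 cM
print(round(lod(10, 1, 0.1), 2))     # 1 recombinant in 10 informative meioses
print(best_theta(10, 1))
```

A LOD score of 3 or more is the conventional threshold for declaring linkage, which is one reason extended families with many informative meioses are needed.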

Page 50: The genomes of living organisms vary enormously in size

Once region of chromosome is identified, high-resolution mapping is performed with additional markers to narrow down region where gene may lie

Fig. 11.17

Page 51: The genomes of living organisms vary enormously in size
Page 52: The genomes of living organisms vary enormously in size

Positional cloning – Step 2: identifying candidate genes

  • Once region of chromosome has been narrowed down by linkage analysis to 1000 kb or less, all genes within are identified
  • Candidate genes
      • Usually about 17 genes per 1000 kb fragment
  • Identify coding regions
      • Computational analysis to identify conserved sequences between species
      • Computational analysis to identify exon-like sequences by looking for codon usage, ORFs, and splice sites
      • Appearance in one or more EST databases
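Of the computational signals listed, the ORF search is the simplest to sketch. The code below scans the three forward reading frames for ATG-to-stop stretches. It is a minimal illustration under stated assumptions: it ignores the reverse strand, introns, splice sites, and codon usage, and the example sequence and minimum length are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    end is exclusive and includes the stop codon; min_codons counts the
    codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

seq = "CCATGAAACCCGGGTAGTT"  # hypothetical genomic fragment
for start, end, frame in find_orfs(seq):
    print(frame, seq[start:end])
```

Real gene finders combine this kind of scan with the other evidence listed above, since human exons are short and open reading frames alone give many false positives.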

Page 53: The genomes of living organisms vary enormously in size

Computational analysis of genomic sequences to identify candidate genes

Fig. 11.19

Page 54: The genomes of living organisms vary enormously in size

Gene expression patterns can pinpoint candidate genes

  • Look in public database of EST sequences representing certain tissues
  • Northern blot
  • RT-PCR

Page 55: The genomes of living organisms vary enormously in size

Northern blot example showing SRY, the candidate for testis-determining factor, is expressed in testes, but not lung, ovary, or kidney

Fig. 11.20

Page 56: The genomes of living organisms vary enormously in size

Positional cloning – Step 3

  • Find the gene responsible for the phenotype
  • Expression patterns in affected individuals
      • RNA expression assayed by Northern blot or RT-PCR with primers specific to candidate transcript
      • Look for misexpression (no expression, underexpression, overexpression)
  • Sequence differences
      • Missense mutations identified by sequencing coding region of candidate gene from normal and abnormal individuals
  • Transgenic modification of phenotype
      • Insert the mutant gene into a model organism

Page 57: The genomes of living organisms vary enormously in size

Transgenic analysis can prove candidate gene is disease locus

Fig. 11.21

Page 58: The genomes of living organisms vary enormously in size

Example: Positional Cloning of Cystic Fibrosis Gene

Linkage analysis places CF on chromosome 7

Fig. 11.22 a

Page 59: The genomes of living organisms vary enormously in size

Northern blot analysis reveals only one of the candidate genes is expressed in lungs and pancreas

Fig. 11.22 b

Page 60: The genomes of living organisms vary enormously in size

Every CF patient has a mutated allele of the CFTR gene on both chromosome 7 homologs

Location and number of mutations indicated under diagram of chromosome

Fig. 11.22 c

Page 61: The genomes of living organisms vary enormously in size

CFTR is a membrane protein. TMD-1 and TMD-2 are transmembrane domains.

Fig. 11.22 d

Page 62: The genomes of living organisms vary enormously in size

Proving CFTR is the right gene

  • Disease phenotype results from loss of gene function
  • Cannot use transgenic technology
  • Instead perform CFTR gene “knockout” in mouse to examine phenotype without CFTR gene
      • Targeted mutagenesis

Page 63: The genomes of living organisms vary enormously in size

Genetic dissection of complex traits

Page 64: The genomes of living organisms vary enormously in size

  • Incomplete penetrance – when a mutant genotype does not always cause a mutant phenotype
  • No environmental factor associated with likelihood of breast cancer
  • Positional cloning identified BRCA1 as one gene causing breast cancer
      • Only 66% of women who carry BRCA1 mutation develop breast cancer by age 55
  • Incomplete penetrance hampers linkage mapping and positional cloning
      • Solution – exclude all nondisease individuals from analysis
      • Requires many more families for study

Page 65: The genomes of living organisms vary enormously in size

  • Phenocopy
      • Disease phenotype is not caused by any inherited predisposing mutation
      • Decreases power to detect correlation between inheritance of disease locus and expression of the disease
  • Genetic heterogeneity
      • Mutations at more than one locus cause same phenotype
      • Multiple families used in most studies
      • If different families have different gene mutations, power of statistics to detect linkage will drop significantly

Page 66: The genomes of living organisms vary enormously in size

  • Polygenic inheritance
      • Two or more genes interact in the expression of phenotype
      • QTLs, or quantitative trait loci
      • Unlimited number of transmission patterns for QTLs
      • Discrete traits – penetrance may increase with number of mutant loci
      • Expressivity may vary with number of loci
  • Many other factors complicate analysis
      • Some mutant genes may have large effect
      • Mutations at some loci may be recessive while others are dominant or codominant