Eukaryotic Chromosome Structure 1

  • Upload
    oonemoo


  • 8/8/2019 Eukaryotic Chromosome Structure 1


    Eukaryotic Chromosome Fine Structure I

    Our concept of chromosomes comes from three different avenues:

    1) The genetic chromosome, derived from studies of the inheritance of traits.

    2) The morphological chromosome, derived from the cytological examination of chromosomes.

    In the early days of genetics, large amounts of cytogenetic work on the effects of chromosome structure on phenotype (and vice versa) were done, as this was the only subcellular level accessible to investigation. Cytogenetics is still an important field: visible alterations in chromosome structure are again very important in genetic mapping and in understanding diseases. Also, chromosome manipulation is used, in particular, in plant breeding. Finally, by looking at changes in chromosome structure, it is often possible to infer genetic changes that took place in evolution.

    3) The molecular chromosome, derived from analysis of the DNA of chromosomes.

    Now, besides microscopy techniques, there is a whole panorama of molecular biology techniques that can be used to study chromosome structure and function.

    A lot of what I am going to say about molecular chromosome structure is still not thoroughly understood. Furthermore, there are exceptions to almost every generalization.

    What is a chromosome?

    This term has historically been applied to any structure known to contain genes:

    Prokaryotic structures (genophores): bacteria, mitochondria, chloroplasts, viruses. DNA molecules in these organisms or organelles are usually circular, double stranded, and supercoiled, and are much simpler in structure and regulation than in eukaryotes.

    Eukaryotes:

    1) Have true chromosomal structures with centromeres and telomeres.

    2) Their chromosomes undergo mitosis and meiosis.

    3) There are complex interactions between proteins and nucleic acids in the chromosomes that regulate gene and chromosomal function.

    The total genetic information stored on the chromosomes of an organism is called the genome.

    E. coli - 4.7 x 10^6 bp

    Humans - 3 x 10^9 bp present in 24 chromosomes (22 autosomes + X & Y)

    A chromosome consists of a single molecule of DNA and its associated proteins.

    The total complement of DNA in the nucleus of a eukaryote is separated into individual molecules of DNA that can be tens or hundreds of millions of nucleotide pairs in length. Each DNA molecule forms one chromosome and can encode the genetic information for many proteins.

    What is the evidence that only a single molecule of DNA is involved?

    1) Chromosomal replication in the presence of bromodeoxyuridine (BUdR) shows discrete partitioning of the BUdR entirely into one or the other of the sister chromatids, except where crossing over has occurred, producing a harlequin pattern.

    Illustration provided by Dr. Jonathan Wolfe, The Galton Laboratory, London

    2) When pulsed-field gel electrophoresis (PFGE), which separates high molecular weight fragments of DNA, is performed on certain organisms with relatively small genomes and chromosomes (e.g. yeast), the number of fragments observed is equal to the number of chromosomes, and the fragments correspond in length to the estimated lengths of the chromosomes.

    3) The complete nucleotide sequences of the yeast genome and of other eukaryotic genomes have been determined.

    Note that these are fairly modern proofs.

    What are the implications of a chromosome being composed of a single molecule of DNA?

    Individual human chromosomes, stretched out, may be three inches long, containing 250 million base pairs of DNA. The length of DNA contained in a cell is far greater than the diameter of the cell itself. For example, a typical nucleus is only 6 micrometers in diameter, while the total length of DNA in the human genome is 1.8 meters.
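
    The lengths quoted above can be checked with a short calculation. The sketch below assumes the standard rise of about 0.34 nm per base pair for B-form DNA, a value the notes do not state; the function name is illustrative. For a 250 million bp chromosome it gives about 8.5 cm, consistent with "three inches", and for a diploid nucleus it gives about 2 m, the same order as the 1.8 m figure quoted.

```python
# Rise per base pair of B-form DNA in nanometres (assumed, not from the notes).
NM_PER_BP = 0.34


def dna_length_m(base_pairs, nm_per_bp=NM_PER_BP):
    """Contour length in metres of a linear duplex of the given size."""
    return base_pairs * nm_per_bp * 1e-9


chromosome_m = dna_length_m(250e6)   # one large human chromosome
haploid_m = dna_length_m(3e9)        # haploid genome size given in the notes
diploid_m = 2 * haploid_m            # a somatic nucleus holds two copies
nucleus_m = 6e-6                     # nuclear diameter given in the notes

print(f"chromosome: {chromosome_m * 100:.1f} cm ({chromosome_m / 0.0254:.1f} in)")
print(f"haploid genome: {haploid_m:.2f} m, diploid: {diploid_m:.2f} m")
print(f"diploid DNA length / nuclear diameter: {diploid_m / nucleus_m:,.0f}")
```

    The last ratio, several hundred thousand to one, is the scale of the packing problem raised in the questions that follow.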

    This leads to many questions - How is DNA packed into the nucleus? How can such an enormously long molecule be maintained with integrity? How can it be coiled tightly enough to function properly in cell division? How can the coiled DNA be made available for transcription and DNA replication?

    DNA in its simplest form in the chromosome consists of the DNA molecule with which you are already familiar.


    Chromatin proteins involved can be divided into two types:

    A. Histones

    Histone proteins are the primary structural elements that function in folding and coiling chromosomes in the cell nucleus. Histones have many basic amino acids, which shield the negatively charged DNA backbone. There are 5 primary types: H4, H3, H2A, H2B, and H1. The histones have been highly conserved through evolution; H3 and H4 are the most conserved (there are only 2 differences in the amino acid sequence of H4 between peas and cows), while H1 is the least conserved. However, all of the histones have subtypes that probably function in gene regulation. Thus, the histones function in the production of basic chromosome structure and also in the regulation of chromosome function. Together the histones constitute about 45% of the total mass of a chromosome, with 60 million molecules of each type per cell.

    B. Non-histone proteins (NHPs, acidic proteins, nonhistone chromosomal proteins, NHC proteins).

    Non-histone proteins, although evidently involved in a number of other biological processes, may primarily help regulate DNA transcription and replication. There are at least 30 types of these proteins and they are a very heterogeneous group. They include the HMGs (high mobility group proteins), scaffold and other structural proteins, e.g. topoisomerase II, and regulatory proteins such as helix-turn-helix, zinc finger, and leucine zipper proteins.

    Eukaryotic genomes are complex and DNA amounts and organization vary widely between species.

    DNA is usually thought of as coding for specific proteins. However, it also contains sequences for controlling elements and regulators that affect gene expression and higher order chromatin organization.

    Where are genes (or other sequences) located on the chromosomes? Do they occur equally spaced across the chromosomes, in small groups, all on one chromosome, or what?

    In prokaryotes we know that the genetic map is densely packed, and that genes frequently occur in clusters (operons).

    Eukaryotic genomes are different; the amount and composition of DNA does not necessarily reflect the complexity of an organism.


    A. Chromosome number: There is only a very loose correlation (if any) between DNA content and chromosome number. Also, there is no relationship between the number of chromosomes and the presumed evolutionary complexity of an organism.

    Common Name   Genus and Species          Diploid Chromosome Number
    Buffalo       Bison bison                60
    Cat           Felis catus                38
    Cattle        Bos taurus, B. indicus     60
    Dog           Canis familiaris           78
    Donkey        Equus asinus               62
    Goat          Capra hircus               60
    Horse         Equus caballus             64
    Human         Homo sapiens               46
    Pig           Sus scrofa                 38
    Sheep         Ovis aries                 54

    B. Genome size: An organism can also be described by the amount of DNA in a haploid cell. This is called the C value and is usually expressed either in picograms or as the number of kilobases per haploid cell. Organisms with the highest DNA content are not necessarily the most complex: some plants have more DNA than humans (humans have 700x more than E. coli, some plants have 30x more than humans), as do some lower vertebrates. Further, two apparently closely related organisms can have very different DNA contents (different amphibian species vary 100x in DNA content). This is stated as the C value paradox: the amount of DNA in the haploid cell of an organism is not related to its evolutionary complexity.

    The number of genes encoded by the human genome has been estimated from the observed frequency of mutations (assuming mutation rates are similar to those of other organisms), and the estimates center around 40,000 to 50,000 genes. This would allow for 60 kbp per gene, which is higher than expected, and higher than has been seen in the majority of cloned genes. There is thus a relative 'excess' of DNA in the human genome and even more in plants.
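
    The "60 kbp per gene" figure follows directly from dividing the genome size by the gene estimate. A minimal sketch of that arithmetic, using only the numbers given in the notes (the function name is illustrative):

```python
def kbp_per_gene(genome_bp, gene_count):
    """Average amount of DNA available per gene, in kilobase pairs."""
    return genome_bp / gene_count / 1000


HUMAN_GENOME_BP = 3e9  # haploid genome size from the notes

# The two ends of the gene-number estimate quoted in the notes.
for genes in (40_000, 50_000):
    print(f"{genes:,} genes -> {kbp_per_gene(HUMAN_GENOME_BP, genes):.0f} kbp per gene")
```

    The upper gene estimate gives the 60 kbp quoted above; the lower estimate gives an even larger allowance per gene.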


    A dramatic example of the range of C values can be seen in the plant kingdom, where Arabidopsis represents the low end and lily (1.0 x 10^8 kb/haploid genome) the high end of complexity. In weight this is 0.07 picograms per haploid Arabidopsis genome and 100 picograms per haploid lily genome.
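
    The two ways of expressing a C value, mass and length, can be converted into each other. The sketch below assumes the commonly used factor of about 0.978 x 10^9 bp per picogram of double-stranded DNA, which the notes do not state; the names are illustrative.

```python
# Approximate number of base pairs in 1 pg of double-stranded DNA
# (commonly used conversion factor; an assumption, not from the notes).
BP_PER_PG = 0.978e9


def pg_to_kb(picograms):
    """Convert a C value in picograms to kilobase pairs."""
    return picograms * BP_PER_PG / 1000


lily_kb = pg_to_kb(100)          # C value quoted for lily
arabidopsis_kb = pg_to_kb(0.07)  # C value quoted for Arabidopsis

print(f"lily:        {lily_kb:.2e} kb")
print(f"Arabidopsis: {arabidopsis_kb:.2e} kb")
print(f"fold range:  {lily_kb / arabidopsis_kb:,.0f}x")
```

    With this factor, 100 pg comes out close to the 1.0 x 10^8 kb quoted for lily, and the two plants differ by more than a thousandfold.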

    C Values of Organisms Used in Genetic Studies

    There are different classes of eukaryotic DNA based on sequence complexity.

    Sequence complexity has been analyzed in eukaryotes by reassociation kinetics (C0t values), by restriction mapping, and by sequencing of DNA, e.g. the Human Genome Project.

    Reassociation kinetics: DNA from a particular species (or of a particular type) is sheared into fragments of about 300 - 500 bp and then denatured by heat. The mixture is then incubated for varying times at a temperature allowing reannealing. The fraction of the DNA not reassociated in each sample is measured and plotted. Simple sequences that are repeated many times will anneal (associate with their complementary sequences) more quickly than will complex sequences or ones that are present in one or a few copies in the genome.

    C0 is the concentration of single-stranded DNA at the beginning of the reassociation reaction. It is measured in moles of nucleotide units per liter.

    t is the time of the reassociation reaction, in seconds.

    The amount of single-stranded DNA left (C/C0) is plotted against the product of the above two terms, C0t.

    The C0t1/2 value, as shown below, relates the time (t) needed for 50% reassociation to the starting concentration (C0) of that fraction; from it, the repeat length and number of copies can be calculated.
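
    For a single class of sequences, reassociation is usually modeled as an ideal second-order reaction, which gives C/C0 = 1 / (1 + k * C0t) and therefore C0t1/2 = 1/k. The notes do not state this rate law, so the sketch below is an assumption about the underlying model, and the function names are illustrative.

```python
def fraction_remaining(c0t, c0t_half):
    """C/C0 for one sequence class under ideal second-order kinetics.

    c0t      -- product of starting concentration (mol/L) and time (s)
    c0t_half -- the C0t value at which half the DNA has reassociated
    """
    return 1.0 / (1.0 + c0t / c0t_half)


def time_to_half(c0, c0t_half):
    """Seconds needed for 50% reassociation at starting concentration c0."""
    return c0t_half / c0


# By definition, half the DNA is still single-stranded when C0t equals C0t1/2.
print(fraction_remaining(c0t=1.0, c0t_half=1.0))

# A more dilute sample needs proportionally longer to reach the same point.
print(time_to_half(c0=1e-3, c0t_half=1.0))
```

    Plotting fraction_remaining against log(C0t) gives the simple sigmoidal curve expected for a genome with a single sequence class.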

    Illustration provided by Dr. Jonathan Wolfe, The Galton Laboratory, London

    If the same experiment is carried out using DNA purified from a complex eukaryote, such as a human, then a simple sigmoidal curve is not produced. Instead a curve is produced that is the sum of reannealings of many different components but has 3 main plateaus. These represent three main classes of DNA molecules. The first component to reanneal (at a C0t1/2 of 10^-2) is composed of highly repeated DNA sequences, with an average repetition of about 50,000 times per haploid genome, but it includes sequences that are repeated at least 500,000 times. The second component, moderately repeated DNA sequences (C0t1/2 = 1), is made up of DNA sequences that are represented from 50 to 5,000 times in the genome. The final component, "single copy DNA" or "non-repetitive DNA", includes all of the DNA that is present in just one copy per genome but also includes many sequences that are present in low numbers.
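
    The stepped curve described above can be sketched as a weighted sum of three ideal second-order components. The C0t1/2 values of 10^-2 and 1 come from the notes; the genome fractions and the single-copy C0t1/2 of 10^3 are illustrative assumptions only, chosen to make the three steps visible.

```python
# (class name, fraction of genome, C0t1/2).
# The fractions and the single-copy C0t1/2 are illustrative assumptions.
COMPONENTS = [
    ("highly repeated", 0.10, 1e-2),
    ("moderately repeated", 0.30, 1.0),
    ("single copy", 0.60, 1e3),
]


def fraction_single_stranded(c0t, components=COMPONENTS):
    """C/C0 for a mixture: the sum of one second-order curve per class."""
    return sum(frac / (1.0 + c0t / half) for _, frac, half in components)


# Sample the curve at one point per decade of C0t.
for exponent in range(-4, 6):
    c0t = 10.0 ** exponent
    print(f"C0t = 1e{exponent:+d}   C/C0 = {fraction_single_stranded(c0t):.3f}")
```

    Each class contributes one step to the curve, centered on its own C0t1/2, which is why the plot shows plateaus rather than a single sigmoid.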


    Illustration provided by Dr. Jonathan Wolfe, The Galton Laboratory, London

    What are the relationships between these different classes of DNA, how are they organized on the chromosome, and what are their functions? More about this later.

    What structures are necessary to a functional eukaryotic chromosome?

    A. Primary constriction (centromere)

    The primary constriction of chromosomes is the centromere. This is the region where the spindle fibers attach and is therefore involved in the movement of chromosomes during cell division. It usually appears as a constriction in the chromatin at metaphase and anaphase of cell division. It is the last point of separation of sister chromatids during cell division.

    Chromatin at the centromere is permanently contracted, i.e. it is heterochromatin. However, the centromere is still an active entity, being the site of the kinetochore, the point of attachment of the spindle fibers, which are important for chromosome separation during nuclear division.

    The kinetochore is a proteinaceous entity that can be detected by antikinetochore antibodies found in the serum of patients with the CREST variant of the autoimmune disease scleroderma. The kinetochore is the anchor point for the attachment of microtubules, which are composed of tubulin, at the chromosome end of the spindle during cell division. Two kinetochores are present on each chromosome, one facing each pole. Centromeric proteins (CENPs) are critical for proper function. Centromere sequences have been identified and studied, with much difficulty.


    Centromeres are characterized by particular repeat sequences, often designated Cen1, Cen2, etc. In yeast, there are 3 regions of the CEN sequences that are important for their function:

    CDE-I - 8 or 9 bp consensus sequence

    CDE-II - conserved length of 80 - 90 bp, but no sequence conservation except that they are AT rich

    CDE-III - 11 bp highly conserved region

    The CBF3 complex is thought to include a microtubule-dependent motor, used to move the CEN along the mitotic spindle.

    Thus the full centromere is specified by a 125 bp DNA segment. There is no repetitive DNA. The entire kinetochore region incorporates a DNA segment of ...


    ... even differ between individuals in a population, yielding polymorphisms that can be detected cytogenetically. This interchromosome variation suggests that a higher order structure, rather than a primary sequence, may be the feature that defines a centromere. It also implies that centromere sequences from one species may not function in another species.

    Most species have a single discrete centromere per chromosome (monocentric). A few lack a primary constriction and have kinetochores that fall along the entire length of the chromosome (polycentric) or do not form discrete kinetochores at all (diffuse centromeres).

    Abnormal chromosomes such as isochromosomes may have 2 centromeres, but if such a chromosome is a stable entity, only one of the centromeres will function (i.e. have a kinetochore).

    B. Telomeres

    End caps of chromosomes: If a chromosome is not formed as a ring, it must have 2 ends. These caps must act to prevent chromosome shortening at each round of cell division.

    End replication problem: RNA polymerase can start a new RNA strand de novo, but DNA polymerase can only extend an existing strand, and it can extend nucleic acids only in the 5' to 3' direction. As a result, when DNA replication takes place, one strand is synthesized continuously (leading strand), while the other strand must be synthesized discontinuously (lagging strand). On the lagging strand, a short RNA primer is synthesized by a primase; the distance between primers is about 100 nucleotides. DNA polymerase then elongates the new primer in the 5' to 3' direction until it reaches the 5' end of a neighboring fragment. The newly synthesized DNA is called an Okazaki fragment. DNA ligase then joins adjacent Okazaki fragments. However, this strand will have an unsynthesized section at the end: once the RNA primer that started the last fragment is removed, the bases necessary for complete replication cannot be added. Thus, a short single-stranded region would be left at the end of the chromosome (in humans between 50 and 100 nucleotides). This region would be susceptible to enzymes that degrade single-stranded DNA. The result would be that the chromosome would become shorter after each cell division.


    This doesn't always happen. Why? Telomeres are formed. The telomeres of most organisms' chromosomes consist of short, sequence-asymmetric repeated sequences that are GC-rich. Lengths are typically less than 350 repeats in Arabidopsis and 300 to 500 bp in Saccharomyces. In many species, telomeres consist of multiple tandem repeats of short sequences. The consensus sequence (TTAGGG) is highly conserved across phylogenies, but there are many exceptions. Drosophila is one exception: it has a transposable element at the end of one of its chromosomes.

    Organism                    Telomere repeat (C-rich strand)
    Tetrahymena, Paramecium     CCCCAA
    Oxytricha, Euplotes         CCCCAAAA
    Trypanosoma, Leishmania     CCCTAA
    Physarum                    CCCTAA
    Saccharomyces               C1-3A
    Arabidopsis                 CCCTAAA
    Homo                        CCCTAA
    Caenorhabditis              GCCTAA
    Drosophila                  transposable element

    The DNA of telomeres is complexed with protein. The protein involved may be very conserved.

    The action of the telomere terminal transferase (telomerase) enzyme is necessary for telomere formation. Telomere terminal transferase, or telomerase, is a ribonucleoprotein enzyme (composed of both RNA and proteins) that uses its internal RNA component (complementary to the telomeric single-stranded overhang) as a template in order to synthesize telomeric DNA, (TTAGGG)n, directly onto the ends of chromosomes, thus compensating for the continued erosion of telomeres that occurs in its absence. After adding six bases, the enzyme is thought to pause while it repositions (translocates) the template RNA for the synthesis of the next six base pair repeat. This extension of the 3' DNA template end in turn permits additional replication of the 5' end of the lagging strand, thus compensating for the end replication problem.
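
    The copy-and-translocate cycle described above can be sketched as repeated reverse transcription of a short RNA template onto the 3' end of the overhang. The six-base template "CCCUAA" used here is a simplified, hypothetical stand-in for the template region of the telomerase RNA, and the function names are illustrative.

```python
# Base pairing used when copying an RNA template into DNA.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna_template):
    """DNA strand copied from an RNA template, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_template))


def extend_overhang(overhang, rna_template, rounds):
    """Add one templated repeat to the 3' end per round.

    Each loop iteration stands for one cycle of synthesis followed by
    translocation of the template to the new 3' end.
    """
    repeat = reverse_transcribe(rna_template)
    for _ in range(rounds):
        overhang += repeat
    return overhang


TEMPLATE = "CCCUAA"  # hypothetical minimal template, written 5' to 3'
print(reverse_transcribe(TEMPLATE))
print(extend_overhang("TTAGGG", TEMPLATE, rounds=3))
```

    The sketch leaves out the later fill-in synthesis of the complementary strand, which the notes describe as the step that actually offsets the end replication problem.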


    Telomerase is expressed in embryonic cells and in adult male germline cells, but is undetectable in normal somatic cells except for proliferative cells of renewal tissues (e.g. hematopoietic stem cells and activated lymphocytes, basal cells of the epidermis and intestinal crypt cells). In normal somatic cells, progressive telomere shortening is observed, eventually leading to greatly shortened telomeres and to a limited replicative capacity.

    Telomere shortening has been suggested to be a "clock" that regulates how many times an individual cell can divide (the Hayflick limit). At birth, as determined by terminal restriction fragment (TRF) analysis, telomeres in humans consist of about 15,000 base pairs of repeated TTAGGG DNA sequences, which become shorter with each cell division owing to the end replication problem. Every time a cell divides it loses 25-200 DNA base pairs off the telomere ends. Once this pruning has occurred about 100 times, a cell senesces (or ages) and does not continue dividing.
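
    The numbers above can be put together in a simple linear model. The sketch assumes a constant loss per division and no telomerase activity, both simplifications that the notes do not make explicit; the function names are illustrative.

```python
def telomere_length(start_bp, loss_per_division, divisions):
    """Remaining telomere length in bp after a number of divisions."""
    return max(0, start_bp - loss_per_division * divisions)


def divisions_to_exhaust(start_bp, loss_per_division):
    """Divisions needed to erode the telomere completely (rounded up)."""
    return -(-start_bp // loss_per_division)


START_BP = 15_000  # telomere length at birth, from the notes

# Lowest, intermediate, and highest loss rates within the quoted 25-200 bp range.
for loss in (25, 100, 200):
    left = telomere_length(START_BP, loss, divisions=100)
    limit = divisions_to_exhaust(START_BP, loss)
    print(f"{loss:>3} bp/division: {left:>6,} bp left after 100 divisions, "
          f"gone after {limit} divisions")
```

    Under this model the number of possible divisions depends strongly on the loss rate, which varies eightfold across the quoted range.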

    The telomere-telomerase hypothesis also bears on cancer, based on the finding that most human tumors have telomerase activity while normal human somatic cells do not. As if aging, cancer, and AIDS weren't enough, connections between telomerase activity (and/or telomere length) and a variety of other diseases and developmental processes have been made: arteriosclerosis, progeria, Down's syndrome, and failed bone marrow transplants, to name a few.

    "The excitement over telomerase continues to mount as evidence accumulates that makes the connection between telomere length and cell lifespan likely to be more than a coincidence. The most recent findings show that the age span of cultured cells, normally limited to around 50 cell doublings--the so-called Hayflick limit, named for the scientist who first observed that the lifespan of cultured cells was finite--can be more than doubled by transfecting them with telomerase genes (A.G. Bodnar et al., Science, 279:349-52, 1998). These findings come on the heels of a series of observations correlating the loss of telomerase activity and/or the shortening of the ends of chromosomes (telomeres) with the loss of proliferative capacity, an observation that holds true in a number of situations: somatic (limited proliferative capacity) as compared to germ cells (larger proliferative capacity); normal tissue (limited) versus malignant tumors (unlimited); and normal T cells versus HIV-infected T cells, whose telomeres resemble those of aged individuals. And the list goes on. What exactly is telomerase and what has it to do with aging? Telomerase is a novel ribonucleoprotein, or reverse transcriptase (RT), that adds nucleotides to the ends of chromosomes during DNA replication. Unlike other known reverse transcriptases, all of which are associated with viruses, the telomeric RT is the only one to date associated with a normal genome."

    C. Origins of replication

    The third necessity for a functional chromosome is an origin of replication. These are the sites where replication begins, and there are usually a large number of them on each eukaryotic chromosome.

    A circular DNA bearing the LEU2 selectable gene transforms Saccharomyces cerevisiae leu2 cells to leucine independence with very low efficiency. Random fragments of S. cerevisiae DNA were ligated into the circular DNA, and the products were used to transform leu2 yeast cells. Many colonies were obtained. Circular DNA isolated from the colonies transformed leu2 cells to leucine independence with high efficiency. Deletion analysis of the inserts in the high efficiency transforming DNAs suggested that the region of the inserts responsible for the high efficiency is limited to about 50 bp. These sequences are called ARSs (autonomously replicating sequences). It is estimated that there are approximately 400 ARSs spread over the 16 chromosomes of yeast.

    Mutation analysis has identified a region of about 50 bp that is required for proper ARS function. Comparison of the essential sequences from several such DNAs revealed that they all contain an 11 bp sequence (ARS consensus) or a sequence closely related to it. Mutation in this 11 bp sequence abolishes ARS function, but the sequence cannot act on its own as an ARS. The additional sequences show no similarity between ARSs.
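
    Searching a sequence for candidate ARS consensus sites is a simple pattern-matching task. The notes do not spell out the 11 bp consensus; the sketch below assumes the commonly cited form 5'-(A/T)TTTA(T/C)(A/G)TTT(A/T)-3' and scans both strands. The test sequence and function names are hypothetical. As noted above, a match is necessary but not sufficient for ARS function.

```python
import re

# Commonly cited 11 bp ARS consensus (an assumption, not given in the notes).
# The lookahead lets overlapping matches be reported.
ACS = re.compile(r"(?=([AT]TTTA[TC][AG]TTT[AT]))")
ACS_LENGTH = 11


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_acs(seq):
    """Return (start, strand, match) for every consensus match on either strand.

    Start positions are 0-based and always refer to the given (top) strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in ACS.finditer(seq)]
    for m in ACS.finditer(reverse_complement(seq)):
        hits.append((len(seq) - m.start() - ACS_LENGTH, "-", m.group(1)))
    return sorted(hits)


# Hypothetical test sequence with one consensus match embedded in GC-rich flanks.
print(find_acs("GGCCATTTATATTTAGGCC"))
```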

    Random fragments of DNA from other organisms, when cloned into the yeast circular DNA, can also function as ARS sequences.

    A complex of 6 proteins, called the origin recognition complex, or ORC, binds to the ARS. Its function is not known, but it is thought to act in unwinding the DNA.

    Replication origins that have been characterized contain internal repeats, and are rich in A-T base pairs. Since A-T base-pairing is weaker than G-C base-pairing, A-T rich helices make it easier for helicases to open the helix, allowing primases and other enzymes access to each strand.

    The particularly good characterization of yeast chromosomes has allowed the construction of yeast artificial chromosomes. We will talk about these later in conjunction with mapping.