Genomes and Their Evolution

5/29/13 Genomes and their evolution

session.masteringbiology.com/myct/assignmentPrintView?assignmentID=1937208 1/23


Due: 3:00pm on Monday, May 20, 2013

Note: You will receive no credit for late submissions. To learn more, read your instructor's Grading Policy

Shotgun Approach to Whole-Genome Sequencing

In the shotgun approach to whole-genome sequencing (shotgun sequencing), random DNA fragments of a chromosome are sequenced. The fragment sequences are then assembled into a continuous sequence that represents the DNA of the entire chromosome.

Part A - Steps in shotgun sequencing

What are the steps in the shotgun approach to whole-genome sequencing?

Drag the labels to their appropriate locations on the flowchart. Only some labels will be used.

Hint 1. Understanding why the chromosome is broken into fragments

Sequencing machines cannot analyze sequences that are more than about 800 to 1,000 bases long. Therefore, the chromosome must be broken into fragments before any sequencing can take place.

Hint 2. DNA cloning using plasmids

A typical DNA sequencing reaction requires about 1 microgram of DNA, so the amplification of DNA through cloning is a crucial step in shotgun sequencing.

One type of DNA cloning involves plasmids. A plasmid is a small, circular DNA molecule found in bacteria in addition to the bacterial chromosome. Each time a bacterium reproduces, it replicates each of its plasmids.

To clone DNA using plasmids, molecular biologists insert DNA fragments into plasmids and then introduce the plasmids into bacteria. Because bacteria reproduce so rapidly, they can make more than a million copies of a DNA fragment in less than 24 hours.

Hint 3. Why is overlap between the fragment sequences important?

Why must the fragment sequences overlap?

ANSWER:

ANSWER:

Biol 1002 - Spring 2013


Overlap enables the computer to match up the fragments and determine how they fit together.

Overlap enables the computer to sequence each fragment.

Overlap enables the computer to determine the length of each fragment.




Correct

In shotgun sequencing, the DNA from many copies of an entire chromosome is cut into fragments. The fragments are inserted into plasmids and cloned in bacteria. Plasmid DNA is isolated from the bacteria, purified, and sequenced. Finally, a computer assembles the fragment sequences into the continuous sequence of the whole chromosome, based on overlap between the fragments.

Part B - Assembling a complete sequence from fragment sequences

In the last step of shotgun sequencing, a computer analyzes a large number of fragment sequences to determine the DNA sequence of a whole chromosome. Given the following fragment sequences, what is the overall DNA sequence?



Sequences of DNA fragments

GATGAC

CGATGCG

GGCGTCAG

GACATGGC

TCAGTCGA

Enter the complete DNA sequence, which should contain 24 bases.

Hint 1. How to approach the problem

Examine the ends of each fragment. Find pairs of fragments in which the sequence on the right end of one fragment is the same as the sequence on the left end of another fragment. These regions of identical sequence are where neighboring fragments overlap.

A fragment whose left end does not overlap with another fragment represents the left end of the complete sequence. A fragment whose right end does not overlap with another fragment represents the right end of the complete sequence.

In the following example, five fragment sequences overlap to form a complete sequence.

Fragment 1        GGACTTA
Fragment 2            TTACAATT
Fragment 3                 ATTGCAA
Fragment 4                     CAAATGCC
Fragment 5                          GCCTAA
Complete sequence GGACTTACAATTGCAAATGCCTAA
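The overlap rule described in this hint can be sketched in a few lines of Python. This is only an illustration, not the software used in real sequencing projects. The function names `overlap` and `chain` are made up for this sketch, and the fragments are assumed to be listed already in left-to-right order with at least a 3-base overlap between neighbors.

```python
def overlap(left, right, min_len=3):
    """Return the length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def chain(fragments, min_len=3):
    """Merge fragments that are assumed to be in left-to-right order already."""
    sequence = fragments[0]
    for frag in fragments[1:]:
        n = overlap(sequence, frag, min_len)
        if n == 0:
            raise ValueError(f"no overlap of at least {min_len} bases before {frag}")
        sequence += frag[n:]  # append only the part that does not overlap
    return sequence


# The five fragments from the example in this hint.
example = ["GGACTTA", "TTACAATT", "ATTGCAA", "CAAATGCC", "GCCTAA"]
print(chain(example))
```

Each fragment contributes only the bases to the right of its overlap, which is why five fragments totaling 36 bases collapse into a 24-base sequence.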

Hint 2. How do fragments overlap?

Two of the following DNA sequences have a 3-base overlap with the DNA fragment CAGTACTCA.

Select the two DNA sequences that have a 3-base overlap with the fragment CAGTACTCA.

ANSWER:

Hint 3. Can you determine a complete sequence based on simpler fragment sequences?

Given the following three fragment sequences, what is the complete sequence?

Sequences of DNA fragments

CCTACT

AATGAT

ACTAAT

Enter the complete DNA sequence.

ANSWER:

Hint 4. Which fragment sequences overlap with at least one other fragment sequence?

Which fragment sequences have a 3-base or 4-base overlap with another fragment sequence?

Select all that apply.

ANSWER:

GATGAC

GGCGTCAG

TCAGTCGA

CGATGCG

CCTACTAATGAT



ANSWER:

Correct

The five fragment sequences can be arranged to form the complete sequence:

Fragment          GATGAC
Fragment             GACATGGC
Fragment                  GGCGTCAG
Fragment                      TCAGTCGA
Fragment                           CGATGCG
Complete sequence GATGACATGGCGTCAGTCGATGCG

In shotgun sequencing, a computer program takes millions of bases into consideration when determining the sequence of an entire chromosome. The program arranges the fragment sequences so there is a maximum amount of overlap.
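When the order of the fragments is not known in advance, one simple way to approximate "maximum overlap" is a greedy merge: repeatedly join the two sequences that share the longest overlap. The sketch below applies that heuristic to the five fragments from Part B. It is an assumption for illustration, not a description of the program used in actual genome projects, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
from itertools import permutations


def overlap(left, right, min_len=3):
    """Return the length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the ordered pair of sequences with the largest overlap."""
    seqs = list(fragments)
    while len(seqs) > 1:
        best_n, best_a, best_b = 0, None, None
        for a, b in permutations(range(len(seqs)), 2):
            n = overlap(seqs[a], seqs[b], min_len)
            if n > best_n:
                best_n, best_a, best_b = n, a, b
        if best_n == 0:
            raise ValueError("remaining sequences do not overlap")
        merged = seqs[best_a] + seqs[best_b][best_n:]
        # Replace the two merged sequences with their union.
        seqs = [s for i, s in enumerate(seqs) if i not in (best_a, best_b)]
        seqs.append(merged)
    return seqs[0]


# Fragments in the scrambled order given in Part B.
part_b = ["GATGAC", "CGATGCG", "GGCGTCAG", "GACATGGC", "TCAGTCGA"]
assembled = greedy_assemble(part_b)
print(assembled, len(assembled))
```

A greedy merge works for this small exercise because each fragment has exactly one partner on each side. With repetitive DNA, several fragments can overlap equally well, so a greedy merge may pick the wrong partner.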

Activity: The Human Genome Project: Genes on Human Chromosome 17

Click here to complete this activity.

Then answer the questions.

Part A

DNA fragment A consists of _____ base pairs.

ANSWER:

CGATGCG

GGCGTCAG

GACATGGC

TCAGTCGA

CATACTAG

GATGACATGGCGTCAGTCGATGCG



Correct

Reading the calibration curve should give you a value of 1,268 base pairs for this DNA fragment.

Part B

Which of these genes are located on the q arm of chromosome 17?

ANSWER:

Correct

The q arm is the long arm of a chromosome.

Part C

The RP13 gene of chromosome 17 codes for a protein _____.

ANSWER:

Correct

The RP13 gene codes for a protein that plays a role in eye development.

Part D

The gene that codes for gastrin is located on the _____ of chromosome 17.

564

1,268

1,405

2,027

2,322

gastrin and GH1

MPO and GLUT4

BLMH and RP13

RP13 and GLUT4

TP53 and KRTHA1

involved in glucose transport

that is a component of hair and nails

in the regulation of blood pressure

involved in eye development

involved in the determination of personality



ANSWER:

Correct

The gene that codes for gastrin is located on the long arm of chromosome 17.

Part E

The TP53 gene of chromosome 17 codes for a protein _____.

ANSWER:

Correct

This is the function of the TP53 protein.

Part F

Which of these genes codes for a protein that plays a role in growth?

ANSWER:

Correct

"GH" stands for growth hormone.

Part G

Which of these genes codes for a protein that plays a role in white blood cell function?

ANSWER:

q2

q arm

p2

p arm

centromere

that plays a role in the digestive process

that, in a particular variant, may play a role in Alzheimer's disease

involved in glucose transport

involved in the regulation of the cell cycle

that is like a white blood cell protein

gastrin

DCP1

SCLC6A4

KRTHA1

GH1



Correct

This gene codes for a protein that plays a role in white blood cell function.

Chapter 21 Question 1

Part A

For mapping studies of genomes, most of which were far along before 2000, the three-stage method was often used. Which of the following is the usual order in which the stages were performed, assuming some overlap of the three?

ANSWER:

Correct


Part A

How is a physical map of the genome of an organism achieved?

ANSWER:

Correct


Part A

Which of the following most correctly describes the whole-genome shotgun technique for sequencing a genome?

ANSWER:

DCP1

KRTHA1

MPO

GLUT4

RP13

linkage map, physical map, sequencing of fragments

cytogenetic linkage, sequencing, physical map

physical map, linkage map, sequencing

sequencing of entire genome, physical map, genetic map

genetic map, sequencing of fragments, physical map

using sequencing of nucleotides

using recombination frequency

using very high-powered microscopy

using restriction enzyme cutting sites

using DNA fingerprinting via electrophoresis



Correct


Part A

What is metagenomics?

ANSWER:

Correct

Using BLAST: What Can a Protein Sequence Reveal about Cancer?

Many genes whose normal function is to control cell division can become cancer-causing genes, or oncogenes, through mutation. The normal (nonmutated) versions of these genes are referred to as proto-oncogenes.

In this tutorial, you will investigate a mutation that converts a proto-oncogene into an oncogene. The resulting protein is involved in chronic myelogenous leukemia (CML), a form of cancer that causes the unregulated growth of myeloid cells, bone marrow cells that give rise to white blood cells.

To examine this CML-associated protein, you will use BLAST (Basic Local Alignment Search Tool), a publicly available program for searching known nucleotide and amino acid sequences from several bioinformatics databases organized by the National Center for Biotechnology Information (NCBI). The BLAST algorithm statistically ranks the similarity between an input sequence (called the query) and all other sequences present in a database. The DNA and protein sequences are linked to information about

genes and gene families

the organism the sequence is derived from

the function of the protein (known or predicted)

protein structural information

Thus, BLAST provides a powerful tool for linking gene or protein sequence information with function and biological origin.

Part A - Search for the CML-associated protein sequence in BLAST

The amino acid sequence of the CML-associated protein is shown here:

1 psmafrvhsr ngksytflis sdyeraewre nireqqkkcf rsfsltsvel qmltnscvkl

61 qtvhsiplti nkegeklrvl gynhngewce aqtkngqgwv psnyitpans lekhswyhgp

121 vsrnaaeyll ssgingsflv resesspgqr sislryegrv yhyrintasd gklyvssesr

181 fntlaelvhh hstvadglit tlhypapkrn kptvygvspn ydkwemertd itmkh

This is one of several standard methods used to show the amino acid sequence of a protein; most use the same single-letter abbreviation for each amino acid (e.g., p = proline, s = serine, m = methionine, etc.). Note that in this example, the amino acid sequence is arranged in groups of 10 and that the numbers at the beginning of each line denote the position of the first amino acid in that line in the overall sequence.

cloning fragments from many copies of an entire chromosome, sequencing the fragments, and then ordering the sequences

cloning the whole genome directly, from one end to the other

genetic mapping followed immediately by sequencing

cloning large genome fragments into very large vectors such as YACs, followed by sequencing

physical mapping followed immediately by sequencing

the sequencing of only the most highly conserved genes in a lineage

the sequence of one or two representative genes from several species

genomics as applied to an entire phylum

sequencing DNA from a group of species from the same ecosystem

genomics as applied to a species that most typifies the average phenotype of its genus
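The numbered, space-grouped listing of the CML-associated protein can be turned into one continuous string of single-letter codes before further analysis. The sketch below is a small illustration of that cleanup. The names `RAW_LISTING` and `clean_sequence` are made up here, and the approach (dropping every character that is not a letter) assumes the listing contains only position numbers, spaces, and amino acid letters.

```python
import re

# The listing exactly as printed in Part A, with position numbers and spaces.
RAW_LISTING = """
  1 psmafrvhsr ngksytflis sdyeraewre nireqqkkcf rsfsltsvel qmltnscvkl
 61 qtvhsiplti nkegeklrvl gynhngewce aqtkngqgwv psnyitpans lekhswyhgp
121 vsrnaaeyll ssgingsflv resesspgqr sislryegrv yhyrintasd gklyvssesr
181 fntlaelvhh hstvadglit tlhypapkrn kptvygvspn ydkwemertd itmkh
"""


def clean_sequence(raw):
    """Remove position numbers and whitespace; return uppercase one-letter codes."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


sequence = clean_sequence(RAW_LISTING)
print(len(sequence))   # number of amino acids in the query
print(sequence[:10])   # first group of ten residues
```

The cleaned string has 235 residues, which matches the C-terminal position shown for the query in the Graphic Summary.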

This CML-associated protein is known to be the product of a mutation event. Now you will enter this sequence into BLAST and look for similar amino acid sequences in the NCBI databases in order to identify the type of mutation and the genes that are affected.

Go to the BLAST web site by clicking the Launch BLAST button. Then follow the instructions below.

BLAST Search Instructions

1. In the middle of the page under the Basic BLAST heading, click protein blast. A new page will appear.

2. Copy the amino acid sequence from above (BLAST ignores numbers and spaces) and paste it in the Enter Query Sequence box at the top of the new page.

3. In the Choose Search Set box, in the pull-down menu to the right of Database, choose the database Non-redundant protein sequence (nr).

4. In the text box to the right of Organism, type "Homo sapiens," and then click on Homo sapiens (taxid: 9606).

5. In the Program Selection box, choose blastp (protein-protein BLAST).

6. In the lower left of the screen (you may need to scroll down), click the BLAST button. Wait for BLAST to complete the search. Initial information (Conserved Domains) may appear quickly, but display of the full results may take 30 seconds. (If your search returns a screen that displays No significant similarity found, open Hint 1 to see what might have gone wrong.)

7. A new screen displays your search results. Briefly scroll down to look at the three main sections, and notice the type of information that is provided in each section. Note that in each of the three sections, similar sequences, or hits, are listed beginning with the best statistical match to your query sequence.

   The Graphic Summary gives a color-based summary of the sequence alignment between your query sequence and the most similar sequences (hits). (For this exercise, you can ignore the conserved domains information at the top.)

   The Descriptions section provides the unique accession number of each hit, a brief description of the sequence, and several scores that quantify the similarity between each hit and your query sequence.

   The Alignments section shows the alignment between each hit and your query sequence, amino acid by amino acid.

8. Scroll to the top of the Graphic Summary section, but ignore the conserved domains information. Place your cursor over the top red bar that represents the first hit.

9. Look at the information in the small text box immediately above the Graphic Summary figure. If your cursor is over the top red bar, the text in this box should read

   CAM33009 bcr-abl1 e13a3 chimeric protein [Homo sapiens] S=491 E=7.5e-139

   For an explanation of what appears in the text box, see Hint 2.

10. Scroll down to the first entry in the Descriptions section. Notice that the information from the text box above the Graphic Summary matches the information in the Accession, Description, Max score (the same as the S value in the text box), and E value columns.

11. Scroll back up to the Graphic Summary. Again, place your cursor over the top red bar and click on it. This links you to the information in the Alignments section for this hit. Your hit will appear at the top of the screen beginning with its accession number and description.

Based on the information given in the three different sections, which of the following statement(s) correctly describe(s) the hit that is most similar to the query sequence? Select all that apply.

Hint 1. What to do if you end up with no similar sequences

If you receive the message "No significant similarity found," there are two likely reasons:

You searched the incorrect database. In the results screen, click on the Edit and Resubmit link in the upper left. In the middle of the search page in the box labeled Choose Search Set, make sure that you have chosen the Non-redundant protein sequence (nr) database. Then, make sure that you have specified Homo sapiens (taxid: 9606) as the organism. In the Program Selection box, make sure that you have selected blastp (protein-protein BLAST). Scroll to the bottom of the page and click BLAST.

You incorrectly entered the amino acid sequence into the Enter Query Sequence box at the top of the search page. After you have entered your sequence into the text box, it should look something like this:

If this is not what the Enter Query Sequence box looks like on your screen, recopy the entire sequence and paste it into the box again. Then scroll down to the bottom of the page and click BLAST.

Hint 2. How to interpret the descriptive information in the Graphic Summary

When you place your cursor over the top red bar representing the best hit (most similar sequence to your query), the information in the text box above the figure should read

CAM33009 bcr-abl1 e13a3 chimeric protein [Homo sapiens] S=491 E=7.5e-139



CAM33009 is the unique accession number assigned when this sequence was submitted to the NCBI database.

bcr-abl1 chimeric protein [Homo sapiens] is the description information for this hit.

S=491 is the alignment score for this hit (also called the Max score in the Descriptions section).

E=7.53e-139 is the expect value for this hit.

The S and E values are statistical indicators of how closely each hit matches your query sequence. Notice how those numbers change as you go down the list of hits.
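The four parts of the text-box line can also be separated programmatically. The sketch below splits the line quoted in this hint into accession, description, organism, score, and expect value. The regular expression and the name `parse_hit` are assumptions made for this illustration. They fit the one line shown here and are not an official BLAST output format.

```python
import re

HIT_LINE = "CAM33009 bcr-abl1 e13a3 chimeric protein [Homo sapiens] S=491 E=7.5e-139"

# accession, free-text description, [organism], S=<score>, E=<expect value>
HIT_PATTERN = re.compile(
    r"^(?P<accession>\S+)\s+(?P<description>.+?)\s+\[(?P<organism>[^\]]+)\]"
    r"\s+S=(?P<score>\d+)\s+E=(?P<evalue>\S+)$"
)


def parse_hit(line):
    """Split one Graphic Summary text-box line into its labeled parts."""
    match = HIT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized hit line: {line!r}")
    return {
        "accession": match["accession"],
        "description": match["description"],
        "organism": match["organism"],
        "score": int(match["score"]),
        "evalue": float(match["evalue"]),
    }


hit = parse_hit(HIT_LINE)
print(hit)
```

A higher S and a smaller E both indicate a closer match, so sorting parsed hits by `evalue` in ascending order would reproduce the top-to-bottom order of the hit list.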

Hint 3. How to interpret the information in the Alignments section

The Alignments section of the results provides a comparison, amino acid by amino acid, between your query sequence and the hit sequence you have selected.

The amino acid sequence of the Query is shown in the top line, and the hit (Sbjct) sequence is shown in the bottom line. The middle line shows the comparison between the query and hit sequences as follows:

If the amino acids at the aligned positions of the query and hit sequences are identical, the letter for that amino acid appears in the center line.

If there is a gap (no corresponding amino acid in either the query or hit sequence), a dash appears in the query or hit sequence that contains the gap, and the comparison line is blank.

If amino acids in the query and hit sequences are similar but not identical (e.g., if both amino acids are small and hydrophobic), a + appears.
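The three rules for reading the middle line can be expressed as a short tally. In the sketch below, the three aligned strings are invented for illustration (they are not copied from the actual BLAST output), and `summarize_alignment` is a hypothetical helper name. The code counts identical positions, similar positions marked with +, and gap positions marked with a dash.

```python
def summarize_alignment(query, midline, sbjct):
    """Count identical, similar (+), and gap positions in one aligned block."""
    if not (len(query) == len(midline) == len(sbjct)):
        raise ValueError("the three alignment lines must have the same length")
    counts = {"identities": 0, "similar": 0, "gaps": 0}
    for q, m, s in zip(query, midline, sbjct):
        if q == "-" or s == "-":
            counts["gaps"] += 1        # dash in either sequence, blank midline
        elif m == "+":
            counts["similar"] += 1     # similar but not identical
        elif m != " ":
            counts["identities"] += 1  # the letter is repeated in the midline
    return counts


# Made-up 13-column block: one + position and one gap in the query.
query_line = "KLYVSSES-RFNT"
mid_line   = "KLYVS+ES RFNT"
sbjct_line = "KLYVSAESARFNT"
summary = summarize_alignment(query_line, mid_line, sbjct_line)
print(summary)
```

An alignment in which every column is an identity, with no + signs or dashes, corresponds to the "100% Identities" reported for an exact match.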

ANSWER:

It contains a total of 491 amino acids.

It is identical to the query sequence in length and amino acid sequence.

Its accession number is CAM33009.1.

It is a chimeric protein.



Correct

The query sequence you entered in BLAST is the exact sequence of one of several forms of the protein that is associated with CML. You know that the top hit is an exact match to your query sequence because in the Alignments section, you see that there is an exact match at each amino acid position between the query sequence and the hit (Sbjct) sequence. This is also expressed in the statistics provided in the header, with 100% Identities.

The second hit in the list (accession number CAM33010.1) is another variant of the CML-associated protein. Both the first and the second hits are described as chimeric proteins, a term that refers to a protein that is made up of parts from two or more other proteins.

Part B - Identify categories of hit sequences

Scroll back to the Graphic Summary. The thick red bar directly underneath the color key for alignment scores represents the query sequence. The numbers below this line represent the positions of amino acids in the query sequence beginning at the N-terminal end (1) and ending at the C-terminal end (in this case, 235).

The colored bars below the red query sequence bar show the hit sequences with their regions of similarity to the query sequence. In this search, only a few of the hits show similarity to the query sequence over its entire length. Most are similar to only parts of the query sequence.

Notice that most of the hits in your search can be organized into three general categories based on the region of the query sequence that statistically aligns with the hit sequences.

Identify the three main categories of hits in terms of similarity to the query sequence. (For help approaching this question, see Hint 1.)

Hint 1. How to approach this question

Keep in mind that the thick red bar directly underneath the color key for alignment scores represents the query sequence. Each bar below that represents a hit. The vertical position of each bar represents the relative similarity between the hit and the query sequences (most similar at the top and least similar at the bottom).

The length of each bar represents the portion of the query sequence that aligns with each hit sequence. For example, you can see that many of the purplish-pink bars near the bottom of the Graphic Summary span approximately the first 75 amino acids of the query sequence. Therefore, you can conclude that those hit sequences are similar to the first 75 amino acids of the query sequence.

ANSWER:

sequences that are similar to the last 170 amino acids of the query sequence

sequences that are similar to the first 170 amino acids of the query sequence

sequences that are similar to the entire length of the query sequence

sequences that are similar to the first 75 amino acids of the query sequence

sequences that are similar to the last 75 amino acids of the query sequence



Correct

Recall that the query sequence is a chimeric protein, a protein made up of parts of two or more other proteins. The Graphic Summary allows you to visualize that the chimeric protein associated with CML is made up of parts from two different proteins.

The N-terminal 75 amino acids of the query sequence align closely with one set of hit sequences, and the C-terminal 170 amino acids align with a different set of hit sequences. Only a few of the hits align with all or most of the query sequence, and these are variants in a group of closely related proteins associated with CML.

Part C - Parts of which two proteins make up the chimeric CML-associated protein?

Recall that each hit sequence has an accession number and short description associated with it. The text that describes each accession is submitted by the investigator who determines the gene or protein sequence and is usually a permanent part of the accession record; it is not updated as new information about the gene or protein is discovered. As a result, there is often a great deal of variability associated with descriptions of the same gene or protein. It is not uncommon to see descriptions that include unnamed or unknown protein.

For example, scroll to the Graphic Summary and mouse over the hits that are similar only to the C-terminal end (amino acids 70-235) of the query sequence. Notice the range of descriptions in the text box above the Summary figure. These include (among several others)

tyrosine-protein kinase ABL1 isoform a

unnamed protein product

v-abl Abelson murine leukemia viral oncogene homolog 1

2FO0_A Chain A, Organization Of The Sh3-Sh2 Unit

Surprisingly, these all describe a small group of proteins with nearly identical sequences and functions. The common name for these proteins is Abelson murine leukemia protein, or ABL. They are all tyrosine-protein kinases, enzymes that regulate the activity of other proteins and are common in signal transduction pathways.

Examine the descriptions for the group of hits that align with the N-terminal end of the query sequence. Which of the following terms are found among these descriptions? Select all that apply.

ANSWER:

Correct

The shorter, N-terminal segment of the CML-associated protein aligns with the BCR protein, short for "breakpoint cluster region." The function of this protein is unknown. The C-terminal segment of the CML-associated protein is the Abelson murine leukemia, or ABL, protein. Its normal function is the regulation of cell division. Mutations in the ABL protein are associated with CML. Therefore, the ABL gene is a proto-oncogene, and the gene encoding the BCR-ABL chimeric protein (the query sequence) is an oncogene.

Part D - On which chromosomes are the ABL and BCR proteins encoded?

Information on the chromosomal location of genes is found in gene and protein reports linked to each accession number in the NCBI database. A direct link to these reports is available from the accession number in either the Descriptions or Alignments section.

To find the chromosome where the normal ABL gene is located, follow these instructions:

12. Scroll to the top of the Descriptions section.

gene bcr

unnamed protein product

breakpoint cluster region

BCR variant



13. Click on accession number NP_005148.2. (It should be among the first 5-6 accessions listed.)

14. Now you should see the protein record for accession number NP_005148.2. Scroll down through the headings on the left-hand side of the page until you find FEATURES.

15. Under the subheading source, you will find the chromosome where the ABL gene is located. It should look like this:

16. Repeat these steps to find the chromosome where the normal BCR gene is located. (Use your browser's back button to return to the Descriptions section.) Use accession number EAW59564.1 to locate the appropriate protein report. (See Hint 1 for more help.)

On which chromosome is the BCR gene normally located?

Hint 1. How to use your browser's Find function to locate a specific accession number

When searching for a specific accession number, or for specific text in a protein report, one of the most efficient techniques is to use your browser's Find function. It is usually under the Edit menu in your browser. On PCs, you can use the keyboard shortcut Ctrl+F. On Macs, use Cmd+F.

ANSWER:

Correct

Now you know that the normal ABL gene is encoded on chromosome 9 and the normal BCR gene is encoded on chromosome 22.

Part E - What type of mutation produces the BCR-ABL chimeric protein?

Only mutations that involve chromosomal rearrangements can result in the fusion of two different genes, which could code for a chimeric protein such as the BCR-ABL protein. Simple point mutations (insertion, deletion, or replacement of a single nucleotide) do not result in the fusion of two different genes.

Which type of chromosomal rearrangement accounts for the creation of the gene that encodes the BCR-ABL chimeric protein? (For a review of the four different types of chromosomal rearrangements, see Hint 1.)

Hint 1. How to distinguish the four types of chromosomal rearrangements

Mutations that result in rearrangement of segments of DNA require the DNA molecule to break in at least two places and for the segment(s) that are formed by these breaks to reattach in a different configuration. This not only changes the positions of genes on the chromosomes but can also disrupt any gene in which a break occurs. The four kinds of chromosomal rearrangements are shown below.

1

9

20

22



Notice that deletions, duplications, and inversions all involve rearrangement of DNA segments on a single chromosome, whereas translocations involve rearrangements of segments from two or more chromosomes.

ANSWER:

Correct

The normal BCR and ABL genes are located on chromosomes 22 and 9, respectively. The only type of mutation that can lead to the fusion of genes from two different (nonhomologous) chromosomes is a translocation.

Part F - Why is the BCR-ABL chimeric protein associated with cancer?

Chromosomal translocations rarely result from breaks in the DNA at an exact gene boundary. Rather, the breaks often occur in the middle of genes, resulting in only part of each gene contributing to the chimeric protein. This is the case for the BCR-ABL chimeric protein.

The function of the normal BCR protein is not known, but the fragment of the BCR protein in the chimeric protein is not thought to contribute directly to its oncogenic function. On the other hand, the ABL protein functions in the regulation of cell division, and mutations in such proteins often are related to cancer. Loss of part of the ABL gene in the translocation mutation converts the normal ABL proto-oncogene into the chimeric BCR-ABL oncogene.

To find the segment of the normal ABL protein that is part of the BCR-ABL chimeric protein, follow these instructions:

17. Scroll to the Alignments section.

18. Find accession number NP_005148.2.

19. From the alignment, identify which portion of the normal ABL protein is found in the BCR-ABL chimeric protein. Open Hint 1 if you're not sure where to find that information.

inversion

translocation

duplication

deletion

Which segment of the normal ABL protein aligns with the query sequence? Provide your answer in this format: number of first amino acid, number of last amino acid. (For example, if the first amino acid is 23 and the last amino acid is 413, enter 23, 413.)

Hint 1. Where to find the information you need to answer this question

Recall that the Alignments section shows the amino acid-by-amino acid alignment of each hit sequence with the query sequence. A typical alignment is shown below:

Each group of three lines is the comparison between the query sequence and the hit (Sbjct) sequence. The numbers at the beginning of each line indicate the first amino acid in the line in the query or hit (Sbjct) sequence. The numbers at the end of each line represent the last amino acid in the line.

In this example, amino acids 419-585 of the hit (Sbjct) sequence are aligned with amino acids 68-235 of the query sequence.

ANSWER:

Correct

Amino acids 80-246 of the normal ABL protein align with amino acids 68-235 of the BCR-ABL chimeric protein. Another way to think about this is that the DNA encoding the first 79 amino acids of the normal ABL protein is cut off during the translocation. As a result, those first 79 amino acids are missing from the chimeric protein.

The normal ABL protein (represented by the long gray bar in the figure below) contains three important domains that contribute to the protein's role in regulating cell division: SH3, SH2, and PTKc_Abl. The SH3 domain is important because it normally prevents the ABL protein from functioning unless a specific signaling molecule binds to the SH3 region.

The translocation mutation that leads to the formation of the BCR-ABL chimeric protein truncates the ABL protein, resulting in the loss of the portion of the SH3 domain between amino acids 66 and 80. Without a complete SH3 domain, the BCR-ABL chimeric protein continually promotes cell division in myeloid (bone marrow) cells, resulting in chronic myelogenous leukemia.
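The coordinate arithmetic behind this explanation can be checked with a few lines of Python. The sketch below uses only the positions quoted in this answer (80-246 in normal ABL, 68-235 in the chimeric query). The helper `span_length` and the variable names are made up for the illustration.

```python
def span_length(first, last):
    """Number of residues in an inclusive range of positions."""
    return last - first + 1


# Positions quoted in the feedback for Part F.
abl_first, abl_last = 80, 246        # aligned region of the normal ABL protein
query_first, query_last = 68, 235    # aligned region of the BCR-ABL query

# Residues of normal ABL that come before the aligned region.
missing_n_terminal = abl_first - 1

abl_span = span_length(abl_first, abl_last)
query_span = span_length(query_first, query_last)

print(missing_n_terminal)    # ABL residues absent from the chimeric protein
print(abl_span, query_span)  # residues covered on each side of the alignment
```

The two spans differ by one residue (167 versus 168), which means the alignment must contain at least one gap position on the ABL side.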


Part A

What is proteomics?

ANSWER:

80,246



Correct


Part A

What is gene annotation in bioinformatics?

ANSWER:

Correct

Chapter 21 Pre-Test Question 3

Part A

Which of the following statements about genome sizes is true?

Hint 1.

Refer to Concept 21.3 in your textbook.

ANSWER:

Correct

The only exceptions may be intracellular parasites whose genomes have undergone reduction because they are dependent on their hosts for many functions.


Part A

The number of genes correlates with _____.

the totality of the functional possibilities of a single protein

the study of how amino acids are ordered in a protein

the linkage of each gene to a particular protein

the study of the full protein set encoded by a genome

the study of how a single gene activates many proteins

assigning names to newly discovered genes

comparing the protein sequences within a single phylum

matching the corresponding phenotypes of different species

finding transcriptional start and stop sites, RNA splice sites, and ESTs in DNA sequences

describing the functions of noncoding regions of the genome

Species within a phylogenetic group such as flowering plants or insects have similar genome sizes.

Most eukaryotes have larger genomes than most prokaryotes.

The human genome is the largest and most complex.

Large animals have larger genomes than plants.

All of the above statements are true.



Hint 1.

Why do genome sizes vary?

ANSWER:

Correct

Prokaryotic genomes are compact, so the size of the genome accurately reflects the number of genes.


Part A

Which of the following is a representation of gene density?

ANSWER:

Correct
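Gene density is the number of genes per unit length of genome, usually expressed as genes per million base pairs (Mb). As a rough worked example, the sketch below uses the approximate human figures that appear among this assignment's answer choices (about 20,000 genes in 2,900 Mb). The function name `gene_density` is hypothetical.

```python
def gene_density(gene_count, genome_size_mb):
    """Return genes per megabase (Mb) of genome."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return gene_count / genome_size_mb


# Approximate human values quoted in the answer choices.
human_density = gene_density(20_000, 2_900)
print(f"{human_density:.1f} genes per Mb")
```

A statement that gives only a gene count, or only a genome size, cannot express density. Both numbers are needed to form the ratio.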


Part A

What type of noncoding DNA comprises the largest portion of multicellular eukaryotic genomes?

Hint 1.

What causes variation in genome size among related species?

ANSWER:

Correct

Transposons comprise the majority of noncoding DNA.

the size of the genome in prokaryotes

being higher in Archaea than in eukaryotes

being much higher among Eubacteria than in Archaea

and is approximately equal to the number of different proteins that a species can make

the size of the genome in eukaryotes

Humans have ~20,000 genes in 2,900 Mb.

C. elegans has ~20,000 genes.

Fritillaria has a genome 40 times the size of a human.

Humans have 2,900 Mb per genome.

Humans have 27,000 bp in introns.

centromeric sequences

pseudogenes

transposons

introns

gene regulatory sequences




Part A

How has gene duplication played a critical role in evolution?

Hint 1.

Evolution constantly requires new sources of genetic variation.

ANSWER:

Correct

The prevalence of multigene families attests to the importance of gene duplication.


Part A

Which of the following can be duplicated in a genome?

ANSWER:

Correct


Part A

A multigene family is composed of

ANSWER:

It almost always introduces immediate benefits for the organism.

It increases the likelihood of viral infection in cells.

It produces redundant copies of existing genes, which are then free to mutate and adopt new functions.

It increases the number of pseudogenes in the genome.

It increases the amount of DNA in the genome.

entire chromosomes only

sequences, chromosomes, or sets of chromosomes

entire sets of chromosomes only

DNA sequences above a minimum size only

DNA sequences below a minimum size only

genes whose sequences are very similar and that probably arose by duplication.

a gene whose exons can be spliced in a number of different ways.

a highly conserved gene found in a number of different species.

multiple genes whose products must be coordinately expressed.

the many tandem repeats such as those found in centromeres and telomeres.



Correct


Part A

Use the following figure to answer the next question.

Figure shows a diagram of blocks of genes on human chromosome 16 and the locations of blocks of similar genes on four chromosomes of the mouse.

The movement of these blocks suggests that

ANSWER:

Correct


Part A

When does exon shuffling occur?

ANSWER:

Correct


Part A

What is the goal of comparative genomic studies?

chromosomal translocations have moved blocks of sequences to other chromosomes.

during evolutionary time, these sequences have separated and have returned to their original positions.

higher mammals have more convergence of gene sequences related in function.

DNA sequences within these blocks have become increasingly divergent.

sequences represented have duplicated at least three times.

during meiotic recombination

during post-translational modification of proteins

during DNA replication

during splicing of DNA

during faulty DNA repair



Hint 1.

The power to sequence and analyze whole genomes will accelerate discovery.

ANSWER:

Correct

This is the best answer.


Part A

In order to determine the probable function of a particular sequence of DNA in humans, what might be the most reasonable approach?

ANSWER:

Correct


Part A

Bioinformatics includes all of the following except

ANSWER:

Correct

Misconception Question 104

Part A

to study how genomes evolve

to identify genes that are important for evolution of a particular species

to study genetic variation within a species or a population

to identify homologues in model organisms for genes involved in human disease

All of the above are goals of comparative genomic studies.

Prepare a genetically engineered bacterial culture with the sequence inserted and assess which new protein is synthesized.

Look for a reasonably identical sequence in another species, prepare a knockout of this sequence in that species, and look for the consequences.

Genetically engineer a mouse with a copy of this sequence and examine its phenotype.

Mate two individuals heterozygous for the normal and mutated sequences.

developing computer-based tools for genome analysis.

using computer programs to align DNA sequences.

using molecular biology to combine DNA from two different sources in a test tube.

using mathematical tools to make sense of biological systems.

analyzing protein interactions in a species.



Identify the correct statement(s) about transposable elements.

Select all that apply.

ANSWER:

Correct

Read about transposable elements and other repetitive DNA.


Part A

Use the following information to help you answer the next question.

Multigene families include two or more nearly identical genes or genes sharing nearly identical sequences. A classical example is the set of genes for globin molecules, including genes on human chromosomes 11 and 16.

How might identical and obviously duplicated gene sequences have gotten from one chromosome to another?

ANSWER:

Correct


Part A

By what mechanism might transposons contribute to gene duplication?

Hint 1.

Gene duplication is an important driver of evolution.

ANSWER:

Correct

Illegitimate recombination between transposon sequences at different loci leads to chromosome rearrangements and gene deletions and duplications.

Transposable elements are called jumping genes because they detach from their location in DNA before reattaching in a different location.

Telomeric simple sequence DNA is made up of transposable elements.

Transposable elements and related sequences make up 44% of the human genome.

by chromosomal translocation

by deletion followed by insertion

by transcription followed by recombination

by normal meiotic recombination

by normal mitotic recombination between sister chromatids

Transposon insertion may disrupt an exon.

Transposons may lead to slippage of DNA polymerase during DNA replication.

Transposons cause failure of chromosomes to segregate during mitosis.

Transposons may promote accidents in meiosis, leading to polyploidy.

Transposons may promote unequal crossing over during meiosis.




Part A

Unequal crossing over during prophase I can result in one sister chromosome with a deletion and another with a duplication. A mutated form of hemoglobin, so-called hemoglobin Lepore, exists in the human population. Hemoglobin Lepore has a deleted series of amino acids. If this mutated form was caused by unequal crossing over, what would be an expected consequence?

ANSWER:

Correct


Part A

Fragments of DNA have been extracted from the remnants of extinct woolly mammoths, amplified, and sequenced. These can now be used to

ANSWER:

Correct


Part A

A recent study compared the H. sapiens genome with that of Neanderthals. The results of the study indicated that there was a mixing of the two genomes at some period in evolutionary history. The data that suggested this were

ANSWER:

Correct


Each of the genes in the hemoglobin gene family must show the same deletion.

There should also be persons whose hemoglobin contains two copies of the series of amino acids that is deleted in hemoglobin Lepore.

If it is still maintained in the human population, hemoglobin Lepore must be selected for in evolution.

The deleted region must be located in a different area of the individual's genome.

The deleted gene must have undergone exon shuffling.
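The reciprocal outcome of unequal crossing over can be illustrated with a toy model. This is a minimal sketch, assuming each chromosome is a simple list of labelled segments; the function unequal_crossover and the segment labels A to D are hypothetical and do not represent real globin sequence data.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the tails of two chromosomes at (possibly misaligned) breakpoints.

    When break_a != break_b the exchange is unequal: one product loses
    segments and the other gains an extra copy of the same segments.
    """
    product_1 = chrom_a[:break_a] + chrom_b[break_b:]
    product_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return product_1, product_2


# Two homologous chromosomes carrying the same four segments.
segments = ["A", "B", "C", "D"]

# Misaligned breakpoints: after segment 1 on one homolog, after segment 2 on the other.
deleted, duplicated = unequal_crossover(segments, segments, 1, 2)
print(deleted)     # ['A', 'C', 'D'], segment B is deleted
print(duplicated)  # ['A', 'B', 'B', 'C', 'D'], segment B is duplicated
```

Because the two products arise from the same event, a deletion form implies that the matching duplication form was produced as well.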

clone live woolly mammoths.

understand the evolutionary relationships among members of related taxa.

study the relationships among woolly mammoths and other wool-producers.

introduce into relatives, such as elephants, certain mammoth traits.

appreciate the reasons why mammoths went extinct.

a number of modern H. sapiens with Neanderthal sequences.

mitochondrial sequences common to both groups.

Neanderthal Y chromosomes preserved in the modern population of males.

some Neanderthal sequences not found in humans.



Part A

One of the characteristics of retrotransposons is that

ANSWER:

Correct


Part A

What is the difference between a linkage map and a physical map?

ANSWER:

Correct


Part A

Which procedure is not required when the shotgun approach to sequencing is modified as sequencing by synthesis, in which many small fragments are sequenced simultaneously?

ANSWER:

Correct

Score Summary:

Your score on this assignment is 85.0%. You received 25.51 out of a possible total of 30 points.

they generally move by a cut-and-paste mechanism.

they are found only in animal cells.

they contribute a significant portion of the genetic variability seen within a population of gametes.

they code for an enzyme that synthesizes DNA using an RNA template.

their amplification is dependent on a retrovirus.

The ATCG order and sequence must be determined for a linkage map but not for a physical map.

Markers are spaced by recombination frequency on a linkage map and by number of base pairs on a physical map.

There is no difference between the two except in the type of pictorial representation.

A linkage map shows how each gene is linked to every other gene, but a physical map does not.

Distances must be calculable in units such as nanometers on a physical map but not on a linkage map.
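The two kinds of map measure distance in different units, which a short calculation can make concrete. This is a minimal sketch with made-up numbers: the offspring counts, the base-pair positions, and both function names are assumptions for illustration only. It uses the standard convention that 1 map unit (centimorgan) corresponds to a 1% recombination frequency.

```python
def linkage_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Linkage-map distance in centimorgans: recombination frequency as a percentage."""
    if total_offspring <= 0:
        raise ValueError("total offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def physical_distance_bp(pos_a_bp: int, pos_b_bp: int) -> int:
    """Physical-map distance: number of base pairs between two markers."""
    return abs(pos_b_bp - pos_a_bp)


# Hypothetical cross: 120 recombinant offspring out of 1,000.
print(linkage_distance_cm(120, 1000))              # 12.0 (map units)

# Hypothetical marker coordinates on the same chromosome.
print(physical_distance_bp(1_500_000, 4_200_000))  # 2700000 (base pairs)
```

The same pair of markers therefore has two distances, one derived from breeding data and one from the DNA itself, and the two need not be proportional along a chromosome.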

cloning each fragment into a plasmid

PCR amplification

use of restriction enzymes

ordering the sequences

sequencing each fragment
