Basic Biology for Bioinformatics: genes as information The central dogma of molecular genetics DNA to RNA to protein to phenotype Protein functions, synthesis and structure RNA synthesis and processing DNA replication Basics of transmission genetics Note: many of the figures used in this presentation are copyrighted. Most are taken from "Genetics: From Genes to Genomes" by Hartwell and colleagues (McGraw Hill)

Basic Biology for Bioinformatics: genes as information

The central dogma of molecular genetics
DNA to RNA to protein to phenotype

Protein functions, synthesis and structure
RNA synthesis and processing
DNA replication
Basics of transmission genetics

Note: many of the figures used in this presentation are copyrighted. Most are taken from "Genetics: From Genes to Genomes" by Hartwell and colleagues (McGraw Hill)

Biology for bioinformatics:

Alignment of pairs of sequences
Multiple sequence alignment
Prediction of RNA secondary structure
Phylogenetic prediction
Database searching for sequences
Gene prediction
Analysis of microarray expression data
Protein classification
Protein folding / structure prediction
Genome analysis / databases
Genetic variation (haplotypes and allelic association)

What is it about DNA that allows it to carry information?

DNA polymerase (Alberts et al., Fig. 6-36)

Molecular genetics: genes as information

DNA -> RNA -> protein.

DNA is digital information. Each nucleotide carries 2 bits of information.

Implications:
Low-error propagation.
Complete representation in digital databases.
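The "2 bits per nucleotide" claim can be illustrated with a minimal sketch. The particular bit assignment below (A=00, C=01, G=10, T=11) and the helper names `pack` and `unpack` are assumptions made for illustration; any one-to-one assignment of the four bases to two-bit values would serve equally well.

```python
# Hypothetical 2-bit code: four bases fit exactly into two bits each.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, 2 bits per nucleotide."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of known length from its packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


seq = "GATTACA"  # toy sequence
packed = pack(seq)
print(packed.bit_length() <= 2 * len(seq))  # the 7 bases need at most 14 bits
print(unpack(packed, len(seq)) == seq)      # the round trip loses nothing
```

The length has to be stored alongside the packed value because leading A bases are encoded as zero bits.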

Acquisition of genetic information is the raw fuel behind the explosion of bioinformatics.

Clelland et al. Nature 399:533. Hiding messages in DNA microdots.

"For it is not cell nuclei, not even individual chromosomes, but certain parts of certain chromosomes from certain cells that must be isolated and collected in enormous quantities for analysis; that would be the precondition for placing the chemist in such a position as would allow him to analyze [the hereditary material] more minutely than the morphologists."

- Theodor Boveri 1904

If the information in DNA is contained in single molecules, how can we know about it?

We reduce the complexity of the DNA by amplification and use the power of complementarity to detect specific sequences by hybridization.
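Detection by complementarity can be sketched in a few lines. This is a simplified model that assumes perfect Watson-Crick pairing only (no mismatches, no thermodynamics); the function names and the toy target and probe sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with `seq`, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hybridization_sites(target: str, probe: str) -> list[int]:
    """0-based positions on `target` where `probe` would pair perfectly."""
    site = reverse_complement(probe)
    target = target.upper()
    width = len(site)
    return [i for i in range(len(target) - width + 1)
            if target[i:i + width] == site]


target = "TTGACCGTAAGGCTA"  # toy single-stranded target
probe = "CTTACGG"           # toy probe
print(reverse_complement(probe))           # CCGTAAG
print(hybridization_sites(target, probe))  # [4]
```

A probe finds its site because the site is the probe's reverse complement, which is why a labeled probe can pick one sequence out of a complex mixture.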

Determination of the chromosomal location of TGx in the human genome by fluorescent in situ hybridization. (From Daniel Aeschlimann's web site, Univ. of Wales: http://www.uwcm.ac.uk/study/dentistry/bds/staff/aeschlimann.htm)

from Konstantin V. Krutovskii and David B. Neale 2001, "Forest Genomics for Conserving Adaptive Genetic Diversity"

Microarrays

Array -> Scan -> Visualize -> Analyze

Photolithographic arrays (Affymetrix)

from www.affymetrix.com

Each spot has an oligo with a distinct sequence

Homologous proteins conserve elements of genetic information (sequence).

New gene functions can arise from pre-existing gene functions

Related genes retain sequence similarity.
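One simple way to quantify retained similarity is percent identity over an existing alignment. The sketch below assumes the two sequences have already been aligned to equal length with "-" marking gaps; the function name and the toy sequences are hypothetical, and real comparisons would normally use a scoring matrix rather than exact matches.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free aligned columns in which the residues match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned protein fragments, not real database entries.
seq1 = "MKT-AYIAKQR"
seq2 = "MKSLAYLAKQR"
print(percent_identity(seq1, seq2))  # 80.0
```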

DNA to RNA to protein to phenotype

Proteins: enzymes

alkaptonuria

phenylketonuria: phenylalanine buildup in the brain can cause intellectual disability

DNA to RNA to protein to phenotype

Proteins: regulators

DNA to RNA to protein to phenotype

Structural proteins

Ehlers-Danlos syndrome (joint hypermobility) is one of the phenotypes associated with mutations in genes encoding collagen.

DNA to RNA to protein to phenotype

Proteins

What do they do?

see http://www.ncbi.nlm.nih.gov/cgi-bin/COG/palox?fun=all

DNA to RNA to protein to phenotype
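The flow from DNA to RNA to protein can be sketched as two string transformations. This is a deliberately simplified illustration: it assumes the input is the coding strand, uses the standard genetic code, and ignores promoters, splicing and start-codon selection. The function names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCAAATGA"  # toy open reading frame
mrna = transcribe(dna)
print(mrna)             # AUGGCCAAAUGA
print(translate(mrna))  # MAK
```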

protein

Hydrogen bonds within the protein and the rigidity of the peptide bond are critical determinants of protein structure.

DNA to RNA to protein to phenotype

Molecular Biology of the Cell. 1994. Figure 3-30

α-helix

DNA to RNA to protein to phenotype

Molecular Biology of the Cell. 1994. Figure 3-29

β-sheet

NCBI provides information about proteins

GenBank flat file format for HA oxidase

GenBank FASTA file format for HA oxidase
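A FASTA record is a header line starting with ">" followed by one or more lines of sequence. The minimal parser below is a sketch of that convention only; the function name is hypothetical, and the record shown is a made-up placeholder, not the actual HA oxidase entry.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each header (without the leading '>') to its joined sequence."""
    chunks: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Placeholder record for illustration; not real database content.
example = """>example_record hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""
records = parse_fasta(example)
for name, sequence in records.items():
    print(name, len(sequence))
```

Sequence lines are joined because FASTA files wrap long sequences across lines of fixed width.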

Links to other information about HA oxidase

The HA oxidase gene and its flanking region on chromosome 3q21

OMIM: Alkaptonuria is caused by mutations in HA oxidase

Conserved Domains

Three-dimensional structure of the protein, if known, can be viewed.

Lectures 8 and 35 will cover types of mutation in detail

Gene density in selected genomes

Species                    Genome size (Mb)   Gene #      Ave. size
Escherichia coli           4.7                4,300       1.1 kb
Saccharomyces cerevisiae   12.1               6,000       2.0 kb
C. elegans                 97                 16,000      6.0 kb
Arabidopsis                115                25,500      4.5 kb
Drosophila melanogaster    120                13,600      8.8 kb
Homo sapiens               3,200              75,000 ?    40.0 kb
                                              30,000      100.0 kb

CDS (coding sequence) sizes do not vary much at all, between 1.3 and 1.5 kb.
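The "Ave. size" column appears to be genome size divided by gene number, that is, the average stretch of genome per gene rather than the length of the coding sequence. That reading is an assumption, but the sketch below shows that it roughly reproduces the listed values (the human rows come out near 43 kb and 107 kb against the listed 40 and 100). The function name is hypothetical.

```python
# (genome size in Mb, gene count, listed average size in kb) from the table.
GENOMES = {
    "Escherichia coli": (4.7, 4300, 1.1),
    "Saccharomyces cerevisiae": (12.1, 6000, 2.0),
    "C. elegans": (97, 16000, 6.0),
    "Arabidopsis": (115, 25500, 4.5),
    "Drosophila melanogaster": (120, 13600, 8.8),
    "Homo sapiens (75,000 genes)": (3200, 75000, 40.0),
    "Homo sapiens (30,000 genes)": (3200, 30000, 100.0),
}


def kb_per_gene(genome_mb: float, genes: int) -> float:
    """Average genomic span per gene in kb (assumed meaning of 'Ave. size')."""
    return genome_mb * 1000 / genes


for species, (size_mb, genes, listed_kb) in GENOMES.items():
    computed = kb_per_gene(size_mb, genes)
    print(f"{species:30s} computed {computed:6.1f} kb   listed {listed_kb:6.1f} kb")
```

Set against a nearly constant CDS size of 1.3 to 1.5 kb, the growing span per gene indicates that larger genomes carry proportionally more non-coding DNA per gene.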

What's in the genome besides genes:

introns

What's in the genome besides genes:

remote regulatory DNA

DNA to RNA to protein to phenotype

Lecture 14 will cover transcription in detail

DNA to RNA to protein

DNA must be maintained. Natural processes can degrade the information in the DNA.

Molecular Biology of the Cell, third edition, panel 1-1

Cells and organelles

4C: 46 chromosomes, each with 2 duplexes
4C: 92 chromosomes
2C: 46 chromosomes per cell

Mitosis: heterozygosity is maintained

Meiosis results in new combinations of alleles

Mendel's laws of segregation and independent assortment come from meiosis
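Segregation and independent assortment can be illustrated by enumerating the gametes a parent can produce. The sketch assumes unlinked loci, each written as a two-letter genotype such as "Aa"; the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def gametes(genotype: list[str]) -> Counter:
    """Count gamete types for unlinked loci, e.g. ["Aa", "Bb"].

    Segregation: each gamete receives one of the two alleles at a locus.
    Independent assortment: the choice at one locus does not affect another.
    """
    return Counter("".join(alleles) for alleles in product(*genotype))


counts = gametes(["Aa", "Bb"])
total = sum(counts.values())
for gamete, n in sorted(counts.items()):
    print(gamete, n / total)  # each of AB, Ab, aB, ab at 0.25
```

A doubly heterozygous parent therefore yields four gamete types in equal proportions, which is the expectation that linkage violates.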

[Figure: alleles A/a and B/b distributed among the products of meiosis]

Recombination

[Figure: chromosomes carrying alleles A/a and B/b before and after recombination]

Measuring rates of recombination.
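One common way to measure a recombination rate, offered here as an assumption since the slide gives only a title, is to count recombinant and parental offspring from a testcross and take the recombinant fraction. The counts below are invented for illustration and the function name is hypothetical.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of offspring that carry a recombinant haplotype."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring counted")
    return recombinant / total


# Invented testcross counts (AB/ab x ab/ab); AB and ab are the parental types.
counts = {"AB": 420, "ab": 410, "Ab": 90, "aB": 80}
rf = recombination_frequency(
    parental=counts["AB"] + counts["ab"],
    recombinant=counts["Ab"] + counts["aB"],
)
print(f"recombination frequency = {rf:.2f}")  # 0.17
```

A value near 0.5 is what unlinked loci give, while values well below 0.5 indicate linkage.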

Formal definition of linkage disequilibrium

If two loci have alleles $A_1$, $A_2$ with frequencies $p_1$, $p_2$ and $B_1$, $B_2$ with frequencies $q_1$, $q_2$, there are four possible haplotypes ($A_1B_1$, $A_1B_2$, $A_2B_1$, and $A_2B_2$). Let their frequencies be $f_{1,1}$, $f_{1,2}$, $f_{2,1}$, $f_{2,2}$.

If there is no linkage disequilibrium, then $f_{1,1} = p_1 q_1$, $f_{1,2} = p_1 q_2$, and so on.

There are a number of measures of linkage disequilibrium. One of them is $D = f_{1,1} f_{2,2} - f_{1,2} f_{2,1}$.
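The measure D can be computed directly from the four haplotype frequencies. The sketch below follows the definition on this slide; the function name and the example frequencies are hypothetical. It also checks that D is zero when every haplotype frequency is the product of its allele frequencies.

```python
def linkage_disequilibrium(f11: float, f12: float, f21: float, f22: float) -> float:
    """D = f11*f22 - f12*f21 for haplotypes A1B1, A1B2, A2B1, A2B2."""
    if abs(f11 + f12 + f21 + f22 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return f11 * f22 - f12 * f21


# No linkage disequilibrium: frequencies are products of allele frequencies.
p1, q1 = 0.6, 0.3
p2, q2 = 1 - p1, 1 - q1
d_equilibrium = linkage_disequilibrium(p1 * q1, p1 * q2, p2 * q1, p2 * q2)
print(abs(d_equilibrium) < 1e-12)  # True

# Invented frequencies with an excess of A1B1 and A2B2 haplotypes.
d = linkage_disequilibrium(0.4, 0.2, 0.1, 0.3)
print(round(d, 4))  # 0.1
```

In the second example the allele frequencies are 0.6 for A1 (0.4 + 0.2) and 0.5 for B1 (0.4 + 0.1), so the same D also equals 0.4 - 0.6 * 0.5 = 0.1.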

Interpreting allelic association

The general case is described by an isolated population that has high frequencies (p and r, respectively) of both a disease-causing allele D1 and an unlinked marker M1. The descendants of people who move from that population to a second population with different frequencies will show association between D1 and M1 even though they are not linked.

Isolated population: p = 0.02, r = 0.5
Second population: p = 0.0001, r = 0.1

The disease-causing allele is at a high frequency in a small village.

Affected people in a nearby city are more likely to have other alleles, such as M1, that are found in elevated frequencies in that village merely because they have ancestors from that village.
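The village and city argument can be checked numerically. The sketch assumes that D1 and M1 are independent within each ancestral group and that a fraction m of the city's chromosomes trace back to the village. The value m = 0.05 is an invented parameter, and the function names are hypothetical; the allele frequencies are those given on the slide.

```python
def marker_freq_among_carriers(m, p_village, r_village, p_city, r_city):
    """P(M1 | D1) in a city where a fraction m of chromosomes is village-derived."""
    carriers_with_marker = m * p_village * r_village + (1 - m) * p_city * r_city
    carriers = m * p_village + (1 - m) * p_city
    return carriers_with_marker / carriers


def marker_freq_overall(m, r_village, r_city):
    """P(M1) across all chromosomes in the mixed city population."""
    return m * r_village + (1 - m) * r_city


m = 0.05  # assumed village-derived fraction, for illustration only
among_carriers = marker_freq_among_carriers(m, 0.02, 0.5, 0.0001, 0.1)
overall = marker_freq_overall(m, 0.5, 0.1)
print(f"M1 frequency among D1 chromosomes: {among_carriers:.3f}")
print(f"M1 frequency overall:              {overall:.3f}")
```

M1 is enriched among D1 carriers even though the two loci are unlinked, because most D1 chromosomes in the city come from the village, where M1 is common.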

Biology for bioinformatics:

Alignment of pairs of sequences
Multiple sequence alignment
Prediction of RNA secondary structure
Phylogenetic prediction
Database searching for sequences
Gene prediction
Analysis of microarray expression data
Protein classification
Protein folding / structure prediction
Genome analysis / databases
Genetic variation (haplotypes and allelic association)

Next time: more about the status of those problems and current state-of-the-art methods.

Tutorial II: Monday, May 10, 2118 CSIC, 2:00 - 3:45