Bioinformatics: Why Can’t It Tell Us Everything?



Bioinformatics: What Are Our Data Sets?

• Interested in information flow within cells

• Currently, the key information is mostly a matter of biological macromolecules

• Eventually, information of interest will also include the flow of nutrients and energy, and the impact of small molecules on macromolecular function

Bioinformatics: What Are Our Questions?

• What is in there?
• What does it do?
• How similar is it to something else?
• How does it fold?
• Where does it go in a cell?
• What does it interact with?
• How is it regulated?
• Level of confidence?

Bioinformatics: Logical Reasoning Behind Data Sets

• Function of organism is determined by function of its cells
• Function of cells determined by chemical reactions that take place within them
• Chemical reactions occur or not according to presence and activity of enzymes
• Enzymes are proteins
• Proteins are determined by genes
• Therefore, genes determine organismal function

Genomics

Proteomics

Central Dogma: Flow of Information

Central Dogma: DNA as the Blueprint for Life?


Central Dogma

DNA → RNA → Protein

Genes & proteins are different molecular languages, but they are colinear

DNA

Basic unit (alphabet): nucleotide (base). Only 4: A, T, G, and C

Double-stranded: A<>T and G<>C

5’..AGCTGCATGCTAGCTGACGTCA….3’
3’..TCGACGTACGATCGACTGCAGT….5’
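The pairing rule A<>T and G<>C is enough to derive one strand from the other. A minimal Python sketch, using the 22-base fragment shown on the slide; the names `PAIR`, `complement`, and `reverse_complement` are illustrative and not part of the original deck:

```python
# Watson-Crick pairing rules from the slide: A<>T and G<>C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand rewritten in the conventional 5'->3' direction."""
    return complement(strand)[::-1]


top = "AGCTGCATGCTAGCTGACGTCA"   # 5'->3' strand from the slide
bottom = complement(top)          # partner strand, read 3'->5' as drawn beneath it
print(bottom)                     # TCGACGTACGATCGACTGCAGT
```

Applying `reverse_complement` twice returns the original strand, which is one way to see that either strand carries the full information.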

“Words” (genes) to encode proteins, RNA

Double helical

DNA Tower in Perth, AUS

DNA: Structure Connected to Information

DNA: Replication & Transcription as Algorithms

• With rare exceptions, all DNA is replicated

• Crucial tool is ability to go from one strand to another

• Transcription uses same base-pairing rules with U instead of T, but occurs in packets
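Treated as an algorithm, transcription is the same pairing rule with U written in place of T. A rough sketch, assuming the template strand is supplied in the 3'→5' orientation so the RNA comes out 5'→3'; the function name `transcribe` and the example fragment (the lower strand from the DNA slide) are illustrative choices:

```python
# Pairing rules for reading a DNA template into RNA: A in the template pairs with U.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Build the RNA that pairs with a DNA template strand written 3'->5'.

    The returned RNA is written 5'->3'.
    """
    return "".join(RNA_PAIR[base] for base in template_3to5)


template = "TCGACGTACGATCGACTGCAGT"   # lower (3'->5') strand from the DNA slide
rna = transcribe(template)
print(rna)                             # AGCUGCAUGCUAGCUGACGUCA
```

The result matches the upper DNA strand with every T replaced by U. The sketch transcribes the whole fragment; it does not model the "packets" (where transcription starts and stops).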

Transcription = DNA to RNA: Where to Start Is a Big Question

Protein

Alphabet: amino acids

There are 20 amino acids

Met Cys Ser Leu Ala Ala Val

Proteins: Number of Possible 100-mer Peptides?

20 possible residues at each position.

For 2-mers, 20 possible at position 1 and 20 possible at position 2, so $20 \times 20 = 20^{2} = 400$.

Same logic for 100-mers:

$$20^{100} = 2^{100} \times 10^{100} = (2^{10})^{10} \times 10^{100} \approx (10^{3})^{10} \times 10^{100} = 10^{130}$$
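The order-of-magnitude estimate can be checked exactly, since Python integers have arbitrary precision. A small sketch; `count_peptides` is an illustrative helper name:

```python
import math


def count_peptides(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of the given length over the amino-acid alphabet."""
    return alphabet_size ** length


two_mers = count_peptides(2)       # 20 x 20
hundred_mers = count_peptides(100)

print(two_mers)                    # 400
print(len(str(hundred_mers)))      # 131 decimal digits
print(math.log10(hundred_mers))    # a little above 130
```

The exact value has 131 digits, about 1.27 × 10^130, so rounding 2^10 = 1024 down to 10^3 costs only a small factor.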

Proteins: Folding Starts Local

alpha-helix

beta-pleated sheet

Proteins: Folding Goes Global

Proteins: Predictive Protein Folding as Holy Grail

Protein

Alphabet: amino acids

There are 20 amino acids

Encoded by codons (triplets of nucleotides)

Met Cys Ser Leu Ala Ala Val

ATG TGC AGC CTA GCT GCC GTC
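The codon-to-residue mapping on this slide can be reproduced by reading the DNA three bases at a time. A minimal sketch, assuming the standard genetic code; the table is built from the usual 64-letter summary string in T, C, A, G order, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative:

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Three-letter names for the residues that appear in the slide's example only.
THREE_LETTER = {"M": "Met", "C": "Cys", "S": "Ser", "L": "Leu", "A": "Ala", "V": "Val"}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence codon by codon, starting at position 0."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


peptide = translate("ATGTGCAGCCTAGCTGCCGTC")
print(peptide)                                   # MCSLAAV
print(" ".join(THREE_LETTER[aa] for aa in peptide))  # Met Cys Ser Leu Ala Ala Val
```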

Genetic Code Found on Earth: How Does It Work?

5’-UCGACCAUGGUUGACCAUUGAUUACCACG-3’

Genetic Code

• Triplet• Nonoverlapping• Comma-less• Redundant
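These four properties can be illustrated on the mRNA shown above. The sketch below assumes the standard genetic code and assumes that reading begins at the first AUG and ends at the first in-frame stop codon; neither assumption is stated on the slide, and the helper name `translate_from_first_aug` is hypothetical:

```python
from itertools import product

# Standard genetic code for RNA, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_aug(mrna: str) -> str:
    """Read nonoverlapping triplets from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                     # no start codon found
    residues = []
    # Triplet + nonoverlapping + comma-less: step by exactly 3, with no separators.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


mrna = "UCGACCAUGGUUGACCAUUGAUUACCACG"   # sequence from the slide
print(translate_from_first_aug(mrna))     # MVDH (Met Val Asp His)

# Redundant: 64 codons map to only 20 amino acids plus stop; leucine has 6 codons.
leucine_codons = [codon for codon, aa in CODON_TABLE.items() if aa == "L"]
print(len(leucine_codons))                # 6
```

Under these assumptions the reading frame is set entirely by the position of the AUG; shifting the start by one base would produce a different peptide.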

Bioinformatics: Mining a Mountain of Data

Where are the putative genes?
