82
Computational Biology: Molecular Biology Primer es and slides taken from various sources on the int ding, www.bioalgorithms.info (sources mostly acknow

Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including, (sources

Embed Size (px)

Citation preview

Page 1: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Computational Biology:Molecular Biology Primer

Figures and slides taken from various sources on the internetincluding, www.bioalgorithms.info (sources mostly acknowledged)

Page 2: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Cells

• Fundamental working units of every living system. • Every organism is composed of one of two radically different types of cells: prokaryotic cells or eukaryotic cells.• Prokaryotes and Eukaryotes are descended from the same primitive

cell. – All extant prokaryotic and eukaryotic cells are the result of a total of

3.5 billion years of evolution.

Page 3: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Evolutionary Tree of Life

Njsas.org

Page 4: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Bacterial Cell (Prokaryote)

Uccs.edu

Page 5: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Animal Cell (Eukaryotic)

Faculty.southwest.tn.edu

Page 6: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Prokaryotes and Eukaryotes

Prokaryotes Eukaryotes

Single cell Single or multi cell

No nucleus Nucleus

No organelles Organelles

One piece of circular DNA Chromosomes

No mRNA post transcriptional modification

Exons/Introns splicing

Page 7: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Prokaryotic and Eukaryotic CellsChromosomal differences

Prokaryotes The genome of E.coli contains

amount of 4X106 base pairs > 90% of DNA encode protein

Lacks a membrane-bound nucleus. Circular DNA and supercoiled domain

Histones are unknown

Eukaryotes The genome of yeast cells contains 1.35x107 base pairs A small fraction of the total DNA

encodes protein. Many repeats of non-coding

sequences All chromosomes are contained in

a membrane bound nucleus DNA is divided between two or

more chromosomes A set of five histones

DNA packaging and gene expression regulation

Page 8: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Cells: Information and Machinery

• Cells store all information to replicate itself– Human genome is around 3 billions base pair long– Almost every cell in human body contains same

set of genes– But not all genes are used or expressed by those

cells

• Machinery:– Collect and manufacture components– Carry out replication– Kick-start its new offspring(A cell is like a car factory)

Page 9: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Some Terminology

• Genome: an organism’s genetic material

• Gene: a discrete units of hereditary information located on the chromosomes and consisting of DNA.

• Genotype: The genetic makeup of an organism

• Phenotype: the physical expressed traits of an organism

• Nucleic acid: Biological molecules(RNA and DNA) that allow organisms to reproduce;

Page 10: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

All Life depends on 3 critical molecules• DNAs

– Hold information on how cell works

• RNAs– Act to transfer short pieces of information to different parts of

cell

– Provide templates to synthesize into protein

• Proteins– Form enzymes that send signals to other cells and regulate gene

activity

– Form body’s major components (e.g. hair, skin, etc.)

Page 11: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Cells & DNA

Ogm-info.com

Page 12: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

DNA: The Code of Life

• The structure and the four genomic letters code for all living organisms • Adenine, Guanine, Thymine, and Cytosine which pair A-T and C-G on

complimentary strands.

Page 13: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Superstructure: DNA and Chromosome

Lodish et al. Molecular Biology of the Cell (5th ed.). W.H. Freeman & Co., 2003.

(length of 1 bp)(number of bp per cell)(number of cells in the body)(0.34 × 10-9 m)(3 × 109)(1013) = 1.0 × 1013 meters (35 trips to Sun!)

Page 14: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Chromosomes

Organism Number of base pair number of Chromosomes

---------------------------------------------------------------------------------------------------------

Prokayotic

Escherichia coli (bacterium) 4x106 1

Eukaryotic

Saccharomyces cerevisiae (yeast) 1.35x107 17

Drosophila melanogaster(insect) 1.65x108 4

Homo sapiens(human) 2.9x109 23

Zea mays(corn) 5.0x109 10

Page 15: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Building Blocks of Biological Systems: nucleotides and amino acids

DNA (nucleotides, 4 types): information carrier/encoder.RNA: bridge from DNA to protein.Protein (amino acids, 20 types): action molecules.

Page 16: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 17: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Four types of nucleic acids of DNADNA

Note that A pairs with T; and G pairs with C.

Page 18: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 19: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Primary Structure of DNA

atgaatcgta ggggtttgaa cgctggcaat acgatgactt ctcaagcgaa cattgacgac ggcagctgga aggcggtctc cgagggcgga ……

• Unbranched polymer

• Sequence of nucleotide bases

• Double stranded

Page 20: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 21: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

DNA, continued• DNA has a double helix

structure which composed of – sugar molecule– phosphate group– and a base (A,C,G,T)

• DNA always reads from 5’ end to 3’ end for transcription replication 5’ ATTTAGGCC 3’3’ TAAATCCGG 5’

Page 22: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Processes

• Replication of DNA

• Transcription of gene (DNA) to messenger RNA (mRNA)

• Translation of mRNA into proteins

• Folding of proteins into 3D from

• Biochemical or structural functions of proteins

Page 23: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

DNA, RNA, and the Flow of Information

TranslationTranscription

Replication

Page 24: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Overview of DNA to RNA to Protein

• A gene is expressed in two steps1) Transcription: RNA synthesis2) Translation: Protein synthesis

Page 25: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

© 1999 The International Herpes Management Forum, all rights reserved.

Page 26: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

RNA• RNA is similar to DNA chemically. It is usually only a

single strand. T(hyamine) is replaced by U(racil)

• Some forms of RNA can form 3D structures by “pairing up” with itself.

DNA and RNA

can pair with

each other.

http://www.cgl.ucsf.edu/home/glasfeld/tutorial/trna/trna.gif

tRNA linear and 3D view:

Page 27: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 28: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

RNA, continued• Several types exist, classified by function

• mRNA – this is what is usually being referred to when a Bioinformatician says “RNA”. This is used to carry a gene’s message out of the nucleus.

• tRNA – transfers genetic information from mRNA to an amino acid sequence

• rRNA – ribosomal RNA. Part of the ribosome which is involved in translation.

Page 29: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 30: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Terminology for Splicing• Exon: A portion of the gene that appears in

both the primary and the mature mRNA transcripts.

• Intron: A portion of the gene that is transcribed but excised prior to translation.

• Lariat structure: The structure that an intron in mRNA takes during excision/splicing.

• Spliceosome: A organelle that carries out the splicing reactions whereby the pre-mRNA is converted to a mature mRNA.

Page 31: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Definition of a Gene

• Regulatory regions: up to 50 kb upstream of +1 site

• Exons: protein coding and untranslated regions (UTR)1 to 178 exons per gene (mean 8.8)8 bp to 17 kb per exon (mean 145 bp)

• Introns: splice acceptor and donor sites, junk DNAaverage 1 kb – 50 kb per intron

• Gene size: Largest – 2.4 Mb (Dystrophin). Mean – 27 kb.

Page 32: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Splicing

Page 33: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Splicing and other RNA processing

• In Eukaryotic cells, RNA is processed between transcription and translation.

• This complicates the relationship between a DNA gene and the protein it codes for.

• Sometimes alternate RNA processing can lead to an alternate protein as a result. This is true in the immune system.

Page 34: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Splicing (Eukaryotes)• Unprocessed RNA is

composed of Introns and Extrons. Introns are removed before the rest is expressed and converted to protein.

• Sometimes alternate splicings can create different valid proteins.

• A typical Eukaryotic gene has 4-20 introns. Locating them by analytical means is not easy.

Page 35: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Translation: Proteins• Codon: The sequence of 3 nucleotides in DNA/RNA that

encodes for a specific amino acid. • mRNA (messenger RNA): A ribonucleic acid whose sequence is

complementary to that of a protein-coding gene in DNA.• Ribosome: The organelle that synthesizes polypeptides under the

direction of mRNA• rRNA (ribosomal RNA):The RNA molecules that constitute the

bulk of the ribosome and provides structural scaffolding for the ribosome and catalyzes peptide bond formation.

• tRNA (transfer RNA): The small L-shaped RNAs that deliver specific amino acids to ribosomes according to the sequence of a bound mRNA.

Page 36: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

© 1999 The International Herpes Management Forum, all rights reserved.

Page 37: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Purpose of tRNA

• The proper tRNA is chosen by having the corresponding anticodon for the mRNA’s codon.

• The tRNA then transfers its aminoacyl group to the growing peptide chain.

• For example, the tRNA with the anticodon UAC corresponds with the codon AUG and attaches methionine amino acid onto the peptide chain.

Page 38: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Translation: Universal Genetic Code

• Translation form nucleotide code to amino acid code.

atgaatcgta ggggtttgaa cgctggcaat acgatgactt ctcaagcgaa cattgacgac ggcagctgga aggcggtctc cgagggcgga ……

MNRRGLNAGNTMTSQANIDDGSWKAVSEGG …

Page 39: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Uncovering the code

• Scientists conjectured that proteins came from DNA; but how did DNA code for proteins?

• If one nucleotide codes for one amino acid, then there would be 41 amino acids

• However, there are 20 amino acids, so at least 3 bases codes for one amino acid, since 42 = 16 and 43 = 64– This triplet of bases is called a “codon”

– 64 different codons and only 20 amino acids means that the coding is degenerate: more than one codon sequence code for the same amino acid

Page 40: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Genetic Code

uccs.edu

Page 41: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Translation

• The process of going from RNA to polypeptide.

• Three base pairs of RNA (called a codon) correspond to one amino acid based on a fixed table.

• Always starts with Methionine and ends with a stop codon

Page 42: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 43: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 44: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Proteins: Workhorses of the Cell• 20 different amino acids

– different chemical properties cause the protein chains to fold up into specific three-dimensional structures that define their particular functions in the cell.

• Proteins do all essential work for the cell– build cellular structures– digest nutrients – execute metabolic functions– Mediate information flow within a cell and among

cellular communities. • Proteins work together with other proteins or nucleic acids as

"molecular machines" – structures that fit together and function in highly

specific, lock-and-key ways.

Page 45: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Sequence of Amino Acids: Protein

• Unbranched polymer

• Peptide backbone

• Twenty side chain types

• 3D structure the key

Page 46: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Amino Acid

Page 47: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 48: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Polypeptide Chain

Page 49: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Torsion Angles

Page 50: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Primary Structure

• Unbranched polymer

• 20 side chains

MNRRGLNAGNTMTSQANIDDGSWKAVSEGG …

Page 51: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

3D: Higher Order Protein Structures

• Higher order structures– Secondary: local in sequence– Tertiary: 3D fold of one polypeptide chain– Quaternary: Chains packing together

Page 52: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Structure• Proteins are composed of twenty

amino acids

• Amino acids are composed of a

central carbon atom attached to four

groups of atoms.

• Amino acids (AA) combine in

sequence to form a polymer

(polypeptide chain).

• A chain folds into a specific 3D

shape by the laws of nature.

• A protein is made up of one or more

chains.

Page 53: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Structure Hierarchy

Primary structure: The linear sequence of amino acids comprising a protein:

AGVGTVPMTAYGNDIQYYGQVT…AGVGTVPMTAYGNDIQYYGQVT…Secondary structure: local ordered sub-structures two common types: α-Helix and β-Sheet

Beta sheet

Page 54: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Structure HierarchyTertiary structure:

"global" folding of a single chain in 3D form.

Page 55: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

1BGS-E (BARNASE) 1BGS-A (BARSTAR)

1BGS-AE

Page 56: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Folding

• Proteins the action molecules of life

• 3D Structure critical to function

• Many protein sequences fold spontaneously to native 3D fold

Page 57: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Folding• The structure that a

protein adopts is vital to it’s chemistry

• Its structure determines which of its amino acids are exposed carry out the protein’s function

• Its structure also determines what substrates it can react with

Page 58: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Folding

•A folding pathway is defined as a time sequence of folding events.

•If protein folding were a random process, folding would be very slow.

Page 59: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Folding Pathways

•Early events in the folding pathway speed folding by eliminating other possible pathways.

Page 60: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein FunctionsBinding Catalysis

Switching Structural Proteins

(petsko & ringe; protein structure & function)

initiate transcription bind oxygen

GDP-bound: Off GTP-bound: OnMuscle Fibers

Page 61: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Functions• Proteins as molecular machines

– Change shape, open/close, twist/turn

• Proteins as catalysts– enzymes — proteins that act as catalysts — make life possible

• Proteins work together in pathways — sequences of chemical reactions — to perform a wide variety of functions

• Metabolism — mediate chemical reactions (oxygen to burn food). • Signaling — Signaling pathways within a cell & between cells. • Regulation — control reactions; gateways in cellular membranes. • Cellular structure — define cell shape and form. • Transportation — move other proteins, oxygen, sugar, nutrients and wastes

into and around cells. • Movement - contract muscles and move cells. • DNA Transcription — turn genes on and off. • Immune System Functions - identify germs and other foreign substances

and mark them for destruction.

Page 62: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Systems Biology

• Study of Biological Networks– Gene Networks– Protein Networks– Metabolic Networks– Signaling Networks

Page 63: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Genomics to Proteomics

Gene mRNA Protein

transcription translation

Genome Transcriptome Proteome

static dynamic

One gene Many transcripts Many proteins

(alternative splicing) (post-translational modifications)

traditional

new

Page 64: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Proteomics

• The Proteome is the complete set of proteins in the cell under a set of conditions. It is dynamic and complex, and characterized in terms of:– Structure – shape, electrostatics– Abundance – protein expression– Localization - subcellular location– Modifications – post translational modifications – Interactions – protein-protein interactions (interactome)

Page 65: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Interaction Network(Raul et al, Nature, Oct 2005)

Page 66: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources
Page 67: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Metabolic Network

Scb.su.se

Page 68: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Signaling Pathways: Control Gene Activity

• Instead of having brains, cells make decision through complex networks of chemical reactions, called pathways– Synthesize new materials– Break other materials down for spare parts– Signal to eat or die

Page 69: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Example of cell signaling

Page 70: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Signaling Networks

Gene-regulation.com

Page 71: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Why Computational Biology or Bioinformatics?

• Computational Biology is the combination of biology and computing.

• DNA sequencing technologies have created massive amounts of information that can only be efficiently analyzed with computers.

• So far 70 species sequenced– Human, rat chimpanzee, chicken, and many others.

• As the information becomes ever so larger and more complex, more computational tools are needed to sort through the data. – Computational Biology to the rescue!!!

Page 72: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

What is Computational Biology?

• Computational Biology is generally defined as the analysis, prediction, and modeling of biological data with the help of computers

Page 73: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Bio-Information

• Since discovering how DNA acts as the instructional blueprints behind life, biology has become an information science

• Now that many different organisms have been sequenced, we are able to find meaning in DNA through comparative genomics, not unlike comparative linguistics.

• Slowly, we are learning the syntax of DNA

Page 74: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Sequence Analysis

• Some algorithms analyze biological sequences for patterns– RNA splice sites– ORFs– Amino acid propensities in a protein– Conserved regions in

• AA sequences [possible active site]• DNA/RNA [possible protein binding site]

• Others make predictions based on sequence– Protein/RNA secondary structure folding

Page 75: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Human Genome Composition

Page 76: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

It is Sequenced, What’s Next?• Tracing Phylogeny

– Finding family relationships between species by tracking similarities between species.

• Gene Annotation (cooperative genomics)

– Comparison of similar species.• Determining Regulatory Networks

– The variables that determine how the body reacts to certain stimuli.

• Proteomics

– From DNA sequence to a folded protein.

Page 77: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Modeling

• Modeling biological processes tells us if we understand a given process

• Because of the large number of variables that exist in biological problems, powerful computers are needed to analyze certain biological questions

Page 78: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Protein Modeling

• Quantum chemistry imaging algorithms of active sites allow us to view possible bonding and reaction mechanisms

• Homologous protein modeling is a comparative proteomic approach to determining an unknown protein’s tertiary structure

• Predictive tertiary folding algorithms are a long way off, but we can predict secondary structure with ~80% accuracy.

Page 79: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Regulatory Network Modeling

• Micro array experiments allow us to compare differences in expression for two different states

• Algorithms for clustering groups of gene expression help point out possible regulatory networks

• Other algorithms perform statistical analysis to improve signal to noise contrast

Page 80: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Systems Biology Modeling

• Predictions of whole cell interactions.– Organelle processes, expression modeling

• Currently feasible for specific processes (eg. Metabolism in E. coli, simple cells)

Flux Balance Analysis

Page 81: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

Example: Mining the Omics Graph

Page 82: Computational Biology: Molecular Biology Primer Figures and slides taken from various sources on the internet including,  (sources

The future…

• Computational Biology and Bioinformatics is still in it’s infancy

• Much is still to be learned about how proteins can manipulate a sequence of base pairs in such a peculiar way that results in a fully functional organism.

• How can we then use this information to benefit humanity without abusing it?