Introduction to Bioinformatics

Introduction to Bioinformatics. What is Bioinformatics Easy Answer Using computers to solve molecular biology problems; Intersection of molecular biology

What is Bioinformatics?

Easy Answer
– Using computers to solve molecular biology problems
– Intersection of molecular biology and computer science

Hard Answer
– Computational techniques (e.g. algorithms, artificial intelligence, databases) for management and analysis of biological data and knowledge

Bioinformatics

Bioinformatics = Biology + Information

Biology is becoming an information science

Computational methods are necessary to analyze the massive amount of information coming out of the genome projects

Bioinformatics is Another Revolution in Biology

Three concepts, which remain central to Bioinformatics

Data representation

A complex, dynamic, three-dimensional molecule is represented as a simple string of characters
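As a minimal sketch of this idea, a DNA molecule can be held as an ordinary Python string and examined with plain string operations. The example sequence and the helper names `base_composition` and `gc_content` are illustrative assumptions, not something taken from the slides.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Count each of the four DNA bases in a sequence string (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C; returns 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical 12-base fragment, used only for illustration.
dna = "ATGCGCTATAGC"
print(base_composition(dna))  # {'A': 3, 'C': 3, 'G': 3, 'T': 3}
print(gc_content(dna))        # 0.5
```

Everything about the molecule's shape and dynamics is dropped here; only the order of the bases survives, which is exactly the simplification the slide describes.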

Three concepts, which remain central to Bioinformatics

The concept of similarity
– Evolution has operated on every sequence
– In biomolecular sequences (DNA, RNA or amino acid sequences), high sequence similarity usually implies significant functional or structural similarity
– The opposite is not true: molecules with similar function or structure need not have similar sequences
– Algorithms for comparing sequences and finding similar regions are at the heart of bioinformatics
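One common way to find similar regions is local alignment. The sketch below computes a Smith-Waterman style local alignment score with a linear gap penalty. The slides do not name a specific algorithm, so both the choice of method and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions made for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the dynamic-programming table in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two hypothetical sequences that share the region "ACGT".
print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8
print(local_alignment_score("AAAA", "TTTT"))          # 0
```

A higher score means a longer or cleaner shared region. Whether a given score is biologically meaningful depends on the scoring scheme and sequence lengths, which this sketch does not address.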

Three concepts, which remain central to Bioinformatics

Bioinformatics is not a theoretical science; it is driven by the data, which in turn is driven by the needs of biology.

Sequences

Microarray technologies

GenBank Growth

Moore’s Law

What do you need to know?

It all depends on your background

Are you a …?
– Biologist with some computer knowledge, or
– Computer scientist with some biology background

Few do both well

Background

Biology for Computer Scientists

Computer Science for Biologists

Biological Information Flow

Genome

Introns/Exons

Gene Sequence

Protein Sequence

Protein Functions

Protein Structure

Cellular Pathways

Bioinformatics attempts to model this pathway

Living Things

Entropy (the tendency toward disorder) always increases in an isolated system

Living organisms have low entropy compared with things like soil

They are relatively orderly…

The most critical task is to maintain the distinction between inside and outside

Living Things

In order to maintain low entropy, living organisms must expend energy to keep things orderly.

They figured out how to do this roughly 4 billion years ago

The functions of life, therefore, are meant to facilitate the acquisition and orderly expenditure of energy

Living Things

The compartments with low entropy are separated from “the world.”

Cells are the smallest unit of such compartments.

Bacteria are single-cell organisms

Humans are multi-cell organisms

The “living things” have the following tasks:

Gather energy from environment

Use energy to maintain inside/outside distinction

Use extra energy to reproduce

Develop strategies for being successful and efficient at the above tasks
– Develop ways to move around
– Develop signal transduction capabilities (e.g. vision)
– Develop methods for efficient energy capture (e.g. digestion)
– Develop ways to reproduce effectively

How to accomplish…?

Living compartments on earth have developed three basic technologies
– Ability to separate inside from outside (lipids)
– Ability to build three-dimensional molecules that assist in the critical functions of life (Protein, RNA)
– Ability to compress the information about how (and when) to build these molecules in linear code (DNA)

Bioinformatics Schematic of a Cell

Lipids

Made of a hydrophilic (water-loving) molecular fragment connected to hydrophobic (water-repelling) fragments

Spontaneously form sheets (lipid membranes) in which all the hydrophilic ends align on the outside, and hydrophobic ends align on the inside

Creates a very stable separation, not easy to pass through except for water and a few other small atoms/molecules

What is a Nucleotide?

Pentose, base, phosphate group

Pentose: RNA and DNA

Base

Adenine (A), Cytosine (C), Guanine (G), Thymine (T, in DNA), Uracil (U, in place of thymine in RNA).

Nucleic Acid Chain

Nucleotides are joined by condensation reactions

Orientation: from 5’ to 3’

In DNA or RNA, a nucleic acid chain is called a “strand”
– DNA: double-stranded
– RNA: single-stranded

Length is given as the number of bases
– Base pairs (bp) in DNA
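Because DNA is double-stranded and each strand is read 5’ to 3’, the partner of a given strand is its reverse complement. The sketch below assumes standard Watson-Crick pairing (A with T, G with C), which is textbook biology but is not spelled out in the slide text. The function name `reverse_complement` is chosen here for illustration.

```python
# Translation table for Watson-Crick pairing: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Given one DNA strand written 5'->3', return the partner strand, also 5'->3'."""
    # Complement each base, then reverse so the result reads 5'->3'.
    return strand.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))    # GCAT
print(reverse_complement("AAGCTT"))  # AAGCTT (reads the same on both strands)
```

Characters other than A, C, G and T pass through unchanged in this sketch; a stricter version could reject them.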

DNA Structure

RNA Structure and Function

• The major role of RNA is to participate in protein synthesis

• Messenger RNA (mRNA)

• Transfer RNA (tRNA)

• Ribosomal RNA (rRNA)

mRNA

The Genetic Code

What is a gene?

A gene includes the entire nucleic acid sequence necessary for the expression of its product.

Such a sequence may be divided into
– Regulatory region
– Transcribed region: exons and introns

Exons encode a peptide or functional RNA

Introns are removed after transcription
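Intron removal can be sketched as joining the exon segments of a transcript in order. The transcript, the exon coordinates and the function name `splice` below are all hypothetical. Real splicing relies on sequence signals and cellular machinery that this sketch ignores.

```python
from typing import List, Tuple


def splice(pre_mrna: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon segments in positional order, dropping everything between them.

    Each exon is a (start, end) pair using 0-based, end-exclusive coordinates.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical transcript: exons in upper case, the intron in lower case
# (the case difference is only a visual aid).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
exons = [(0, 6), (18, 24)]
print(splice(pre_mrna, exons))  # AUGGCCUUUUAA
```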

Gene

Genome

The total genetic information of an organism.

For most organisms, it is the complete DNA sequence

For RNA viruses, the genome is the complete RNA sequence

Genes and Control

The human genome has about 3,000,000,000 bp divided into 23 linear segments (chromosomes)

A gene has an average of 1340 DNA bp, thus specifying a protein of about ? (how many) amino acids

Humans have about 35,000 genes ≈ 40,000,000 DNA bp ≈ 1.3% of total DNA in genome

Humans have another 2,960,000,000 bp, part of which carries control information (e.g. when, where, how long, etc.)
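The arithmetic behind these figures can be checked in a few lines, using only the numbers quoted above plus the standard fact that three bases specify one amino acid. The variable names are illustrative. The exact product of gene count and gene length comes out a little above the rounded 40,000,000 bp quoted on the slide.

```python
GENOME_BP = 3_000_000_000  # human genome size quoted above
GENE_COUNT = 35_000        # gene count quoted above
AVG_GENE_BP = 1_340        # average gene length quoted above
BASES_PER_CODON = 3        # three bases specify one amino acid

# Whole codons that fit in an average gene (ignores the stop codon).
amino_acids_per_protein = AVG_GENE_BP // BASES_PER_CODON

# Total DNA taken up by genes, and its share of the genome.
coding_bp = GENE_COUNT * AVG_GENE_BP
coding_fraction = coding_bp / GENOME_BP

print(amino_acids_per_protein)   # 446
print(coding_bp)                 # 46900000
print(f"{coding_fraction:.2%}")  # 1.56%
```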

Gene Expression

An organism may contain many types of cells, each with distinct shape and function

However, they all have the same genome

The genes in a genome do not have any effect on cellular functions until they are “expressed”

Different types of cells express different sets of genes, thereby exhibiting various shapes and functions

Gene Expression

The production of a protein or a functional RNA from its gene

Several steps are required
– Transcription
– RNA processing
– Nuclear transport
– Protein synthesis

Central Dogma

DNA → RNA → Protein
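The two steps of the central dogma can be sketched as string operations: transcription swaps T for U, and translation reads the mRNA three bases at a time using the standard genetic code. The function names and the example sequence are assumptions for illustration. The codon table is built from the standard code laid out in T, C, A, G base order, with `*` marking stop codons.

```python
BASES = "TCAG"
# Standard genetic code; codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read codons from the first base until a stop codon.

    Trailing bases that do not fill a codon are ignored; characters other
    than A, C, G, U (or T) raise KeyError in this sketch.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical coding sequence: Met, Ala, Trp, then a stop codon.
mrna = transcribe("ATGGCCTGGTAA")
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```

This sketch skips RNA processing and nuclear transport, and it starts reading at the first base rather than searching for a start codon.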

Next …

Protein Structure and Function

An Amino Acid

An amino acid is defined as a molecule containing an amino group (NH2), a carboxyl group (COOH) and an R group.

R-CH(NH2)-COOH

The R group differs among various amino acids. In a protein, the R group is also called a side chain.

The Twenty Amino Acids of Proteins

Protein

Peptide ― a chain of amino acids linked together by peptide bonds.

Polypeptides ― long peptides

Oligopeptides ― short peptides (< 10 amino acids)

Proteins are made up of one or more polypeptides with more than 50 amino acids

Protein Structure

Primary Structure

– Refers to its amino acid sequence

Secondary Structure

Regular, repeated patterns of folding of the protein backbone.

Two most common folding patterns
– Alpha helix
– Beta sheet

Tertiary Structure

The overall folding of the entire polypeptide chain into a specific 3D shape

Quaternary Structure

Many proteins are formed from more than one polypeptide chain

Describes the way in which the different subunits are packed together to form the overall structure of the protein

Hemoglobin molecule

Evolution

Mutation ― rare events, sometimes single base changes, sometimes larger events

Recombination ― how your genome was constructed as a mixture of your two parents

Through Natural Selection

Homology (similarity due to shared ancestry): different species are assumed to have common ancestors

The genetic variation between different people is … (surprisingly ..)

References

http://www.biology.arizona.edu/biochemistry/problem_sets/large_molecules/

http://helix-web.stanford.edu/bmi214/index2004.html

http://www.web-books.com/MoBio/

http://www.cs.sunysb.edu/~skiena/549/