Bioinformatics page 12, 529-530 + 659-660 +part of ch. 21 Cell and Mol Biol Lab



Tremendous amounts of sequence data; the gene is made up of a sequence of A, T, G and Cs

A small change in one of these “nucleotide bases” can make a major change in the gene

3.2 billion bases in the human genome

New field emerged: Bioinformatics, which combines biology, math and computer science

Our campus has a program in this field…

Study the genome and the proteome (the ~35,000 proteins that result from genes; 3-D structure as we studied in the earlier lab)

For the sequences… the Genome

Where are the genes (only 1-2% of DNA is for genes, a bit is involved in regulation, the majority is “junk” DNA)?

How do the genes differ?

When is the gene on?

In what tissues is the gene on?

What kind of protein does the gene code for?

How do the proteins function? The PROTEOME

VOCABULARY:

1. THE CELL
2. CENTRAL DOGMA (THE CODE…)
3. DNA STRUCTURE
4. mRNA: TRANSCRIPTION, TRANSCRIPTION FACTORS
5. GENE ACTIVITY: NORTHERN BLOT AND HIGH THROUGHPUT ARRAY ANALYSIS
6. PROTEIN: TRANSLATION, STRUCTURE, 2-D GELS AND REGULATION BY PHOSPHORYLATION
7. BIOCHEMICAL PATHWAYS

NUCLEUS (DNA HERE)

CYTOPLASM (PROTEINS MADE HERE)

PROTEINS CARRY OUT FUNCTIONS OF CELL

Fig. 4-5

CENTRAL DOGMA

FLOW OF INFORMATION FROM DNA TO mRNA TO PROTEIN. PROTEIN THEN MAKES RED HAIR.

INFORMATION: CODE FOR RED HAIR, BODY SHAPE, DISEASE, ETC.

Fig. 21-1; Know vocab list

STORE INFO IN NUCLEUS IN DNA

TRANSFER INFO TO CYTOPLASM

MAKE PROTEIN IN CYTOPLASM

TRANSCRIPTION AND TRANSLATION

DNA STRUCTURE

CODE OR INFO IS IN SEQUENCE OF G, C, T, OR A

CODE IS IN SEQUENCE OF NUCLEOTIDE BASES (ATGC) IN THE DNA (OR DOUBLE HELIX)

HERE IS PART OF 1 GENE:

ATTGCTAGGAAATTCGCCAT ATTGCTAGGAAATTCGCCAT ATTGCTAGGAAATTCGCCAT ATTGCTAGGAAATTCGCCAT ...

Our genome is unique…

We are all unique: 0.3% of the base sequence in you is different from others

This amounts to 0.3% = 0.003 x 3.2 billion = 9.6 million, or roughly 10 million, changes in the nucleotide base sequence

Each change is known as a “single nucleotide polymorphism” (poly is many, morphism is form), or SNP, pronounced “snip”

In the future, physicians will find your SNPs and base their treatment (dose, type of medicine) on your SNPs

SNPs might lead to certain diseases
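The SNP estimate on this slide is a one-line calculation. A minimal Python sketch, using only the slide's own figures (3.2 billion bases, 0.3% difference); the constant and function names are illustrative:

```python
# Figures taken from the slide; the names below are illustrative only.
GENOME_SIZE_BASES = 3_200_000_000  # 3.2 billion bases in the human genome
FRACTION_DIFFERENT = 0.003         # 0.3% of bases differ between people (slide's figure)


def expected_snp_count(genome_size: int, fraction: float) -> float:
    """Return the expected number of bases that differ between two people."""
    return genome_size * fraction


snps = expected_snp_count(GENOME_SIZE_BASES, FRACTION_DIFFERENT)
print(f"{snps / 1e6:.1f} million differing bases")
```

The result is about 9.6 million, which the slide rounds to 10 million.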

3 BASES ON DNA/mRNA MAKE UP ONE UNIT

AND CORRESPOND TO ONE AMINO ACID IN THE PROTEIN

Fig. 21-8

ONE WRONG AMINO ACID
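How a single base change produces one wrong amino acid can be sketched in Python. The slide does not say which example Fig. 21-8 shows, so the sequence here is an assumption: it is modeled on the well-known sickle-cell change in beta-globin, where codon GAG (Glu) becomes GUG (Val). The codon table is only an excerpt of the standard genetic code, and the function names are illustrative.

```python
# Excerpt of the standard genetic code (mRNA codons); not the full 64-codon table.
CODON_TABLE = {
    "GUG": "Val", "CAU": "His", "CUG": "Leu",
    "ACU": "Thr", "CCU": "Pro", "GAG": "Glu",
}


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases (one codon) at a time and look up each amino acid."""
    usable = len(mrna) - len(mrna) % 3
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3)]


def differences(a: list[str], b: list[str]) -> list[tuple[int, str, str]]:
    """List (position, original, changed) for every amino acid that differs (1-based)."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


normal_mrna = "GUGCAUCUGACUCCUGAGGAG"                  # assumed example sequence
mutant_mrna = normal_mrna[:16] + "U" + normal_mrna[17:]  # one base changed: A -> U

normal_protein = translate(normal_mrna)
mutant_protein = translate(mutant_mrna)
print(differences(normal_protein, mutant_protein))
```

Only the sixth amino acid changes, even though the rest of the message is identical.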

Transcription: making mRNA (video & vocab)

Gene runs from promoter to the terminator (think of AHHNOLD)

RNA polymerase makes mRNA off of one strand of DNA called the template strand

Note matching up of code on DNA as mRNA is made; this carries the protein info

D:\cell mol lab\bioinform lab protein struc\17-06-Transcription.mov
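The matching up of bases during transcription can be sketched in a few lines of Python. This is an illustration only: it assumes the template strand is written in the 3' to 5' direction, so the mRNA comes out 5' to 3', and the example sequence and function name are made up.

```python
# Base pairing used when RNA polymerase reads the DNA template strand.
# RNA uses U where DNA would use T.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') that pairs with a template strand given 3'->5'."""
    return "".join(DNA_TO_MRNA[base] for base in template_3_to_5)


template = "TACCACGTA"  # hypothetical stretch of template strand, 3'->5'
mrna = transcribe(template)
print(mrna)
```

For this template the sketch prints AUGGUGCAU: each mRNA base is the partner of the DNA base it was copied from.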

Translation: making the protein from mRNA

Note how 3 nucleotides (codon) pair up with the transfer RNA that brings in a certain amino acid

So correct amino acids are added

Protein has correct amino acid sequence

D:\cell biol 3611\protein synth sorting\TRANSLATION.MOV

Fig. 21-2

Problem…

So, the various exons in the DNA are used for making a protein

The introns are not; they can have other regulatory functions (e.g., site of transcription factor binding)

The introns are spliced out of the pre-mRNA (in a process called processing)

Problem for scientists: exons can become introns (and vice versa); pre-mRNA processing cuts out differing sections

So, one gene, many proteins possible
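The "one gene, many proteins" idea can be illustrated with a toy splicing sketch in Python. All sequences and segment names below are made up for illustration (uppercase for exons, lowercase for introns); real splicing depends on splice-site signals that this sketch ignores.

```python
# Toy pre-mRNA split into named segments; sequences are invented for illustration.
SEGMENTS = [
    ("exon1", "AUGGCU"),
    ("intron1", "guaaguccuuag"),
    ("exon2", "GAAUUC"),
    ("intron2", "guaugaacacag"),
    ("exon3", "UGGUAA"),
]


def splice(segments: list[tuple[str, str]], keep: set[str]) -> str:
    """Join only the named segments, in their original order, into a mature mRNA."""
    return "".join(seq for name, seq in segments if name in keep)


# Same pre-mRNA, processed two different ways.
full_length = splice(SEGMENTS, {"exon1", "exon2", "exon3"})
exon2_skipped = splice(SEGMENTS, {"exon1", "exon3"})
print(full_length)
print(exon2_skipped)
```

The two mature mRNAs differ in length and sequence, so they would be translated into different proteins from the same gene.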

Fig. 21-26: Note that what is an exon can change from one time to the next. Also, processing of the pre-mRNA can change, both producing different proteins. Note the relationship between exons and domains.

GENE ACTIVITY: IS THE GENE “ON” OR “OFF”?

If GENE is “ON”, it is MAKING mRNA

This is transcription (transcribing the code from DNA to mRNA).

Regulation of transcription OR Gene Activity is by

“TRANSCRIPTION FACTORS”

OLD METHOD: NORTHERN BLOT FOR ONE GENE

IF GENE X IS ON, mRNA FROM THIS GENE WILL BE PRODUCED.

ADD INSULIN TO CELL, GENE X IS TURNED ON

NO INSULIN, GENE X OFF

DETECT mRNA FROM GENE X

Newer Method: RT-PCR

Isolate RNA from a cell

Only the genes that are on will be making mRNA

Add Reverse Transcriptase (RT) to make cDNA from mRNA

Amplify (make many copies of) one particular cDNA with use of primers and PCR
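The two steps of RT-PCR can be sketched in Python: making cDNA from mRNA, then counting copies after PCR. This is a simplified illustration, assuming perfect doubling in every PCR cycle (real reactions are less efficient); the function names and example sequence are hypothetical.

```python
# Pairing used by reverse transcriptase: RNA base -> complementary DNA base.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna_5_to_3: str) -> str:
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna_5_to_3))


def ideal_pcr_copies(starting_copies: int, cycles: int) -> int:
    """Copies after PCR, assuming every cycle exactly doubles the DNA."""
    return starting_copies * 2 ** cycles


cdna = reverse_transcribe("AUGGUGCAU")  # hypothetical mRNA fragment
copies = ideal_pcr_copies(1, 30)
print(cdna, copies)
```

Under the ideal-doubling assumption, a single cDNA molecule becomes about a billion copies after 30 cycles, which is why PCR can detect mRNA from genes that are only weakly on.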

NEW METHOD: HIGH THROUGHPUT “ARRAY ANALYSIS”

ANALYZE 10,000 OR MORE GENES ALL AT ONCE.

WHAT GENES ACT IN CONCERT WHEN YOU ADD INSULIN TO A CELL?

WHAT GENES TURN ON IN A CANCER CELL?
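One common way to ask which genes turn on is to compare each gene's array signal with and without treatment. The Python sketch below is an assumption about how such a comparison might look, not the method used in the lab: the gene names and intensity values are invented, and the two-fold cutoff (log2 ratio of 1) is an arbitrary illustrative threshold.

```python
import math

# Invented array intensities for four hypothetical genes.
control = {"geneA": 100.0, "geneB": 250.0, "geneC": 80.0, "geneD": 500.0}
insulin = {"geneA": 420.0, "geneB": 240.0, "geneC": 20.0, "geneD": 2100.0}


def log2_fold_changes(treated: dict[str, float], untreated: dict[str, float]) -> dict[str, float]:
    """log2(treated / untreated) for each gene; positive means more mRNA after treatment."""
    return {gene: math.log2(treated[gene] / untreated[gene]) for gene in untreated}


def genes_turned_on(fold_changes: dict[str, float], threshold: float = 1.0) -> list[str]:
    """Genes whose signal rose at least 2**threshold fold (default: two-fold)."""
    return sorted(gene for gene, fc in fold_changes.items() if fc >= threshold)


changes = log2_fold_changes(insulin, control)
print(genes_turned_on(changes))
```

With thousands of genes on one array, the same comparison runs over every spot at once, which is what makes the method high throughput.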


One Problem: if there are about 25,000 genes, why are there about 200,000 to 1 million different proteins?

Answer 1: different sections of one gene can be used to produce different proteins (e.g., exons can become introns, and vice versa)

Answer 2: one pre-mRNA is cut up differently (or processed differently, called “alternative splicing of the RNA”), producing different proteins from one original pre-mRNA.

USING COMPUTATIONAL TECHNIQUES to handle the large amount of data, study the Proteome:

Mass Spec

3-D PROTEIN STRUCTURE

GEL ELECTROPHORESIS TO IDENTIFY WHAT PROTEINS ARE PRESENT

HIGH-THROUGHPUT: 2-D GEL ELECTROPHORESIS

PROTEIN ARRAYS (place protein on glass slide, not nucleic acid, see what binds to the protein)

Study the Proteome: Mass Spec

Use electrophoresis to separate the various size proteins (separate based on size)

Purified protein is cut up into different size fragments by a protease

The exact size of each peptide is determined by Mass Spectrometry

From the DNA sequence, predict the pattern of peptide fragments; find that your protein comes from a new gene
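Predicting the pattern of peptide fragments can be sketched in Python. The slide does not name the protease, so this sketch assumes trypsin, which cuts after lysine (K) or arginine (R) unless the next residue is proline (P). The protein sequence is invented, the function names are illustrative, and the masses are standard monoisotopic residue masses in daltons.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(protein: str) -> list[str]:
    """Cut after K or R, except when the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Sum the residue masses and add one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


peptides = tryptic_digest("MAGKVLRPSEKGK")  # invented protein sequence
for p in peptides:
    print(p, round(peptide_mass(p), 4))
```

Matching the measured masses against the list predicted from each gene's sequence is how a protein is assigned to its gene.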

Study the Proteome: 3-D PROTEIN STRUCTURE

What Proteins are Made? (I.E., ~What genes are active)

SEPARATE AND IDENTIFY PROTEINS USING GEL ELECTROPHORESIS:

OBTAIN A MIXTURE OF PROTEINS FROM A LIVER CELL

USE 1-D GEL ELECTROPHORESIS TO CRUDELY FIND OUT WHAT PROTEINS ARE PRESENT

IS INSULIN MADE IN THIS CELL?

MIXTURE OF PROTEINS FROM ONE CELL

IS INSULIN MADE IN THIS CELL?

1-D ELECTROPHORESIS

(SEPARATES BY SIZE)

(WESTERN BLOTTING USED HERE)

2-D GEL ELECTROPHORESIS

ANALYZE DISTANCE BETWEEN SPOTS (PATTERN ANALYSIS) TO IDENTIFY SPOTS

PROBLEM: THERE ARE THOUSANDS OF SPOTS; EACH 2-D GEL RUNS A LITTLE DIFFERENTLY, SO IT CAN BE DIFFICULT TO ID EACH SPOT

HIGH THROUGHPUT; ANALYZE THOUSANDS OF PROTEINS

POST-TRANSLATIONAL MODIFICATION

ONCE MADE (POST-TRANSLATION), THE PROTEIN CAN BE MODIFIED.

ONE MODIFICATION IS THE ADDITION OF PHOSPHATE TO A PROTEIN

ADDITION OF PHOSPHATE MAY TURN ON (OR OFF) A PROTEIN

DETECT ADDITION OF PHOSPHATE BY “MASS SPEC”
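Mass spec can detect an added phosphate because each phosphate group makes a peptide about 79.97 Da heavier (the monoisotopic mass of HPO3). A minimal Python sketch of that comparison follows; the peptide masses, the tolerance, and the function name are assumptions chosen for illustration.

```python
PHOSPHATE_SHIFT = 79.96633  # mass added by one phosphate group (HPO3), in daltons


def phosphate_count(observed_mass: float, predicted_mass: float,
                    tolerance: float = 0.02, max_sites: int = 5):
    """Return how many phosphates explain the extra mass, or None if none fits."""
    delta = observed_mass - predicted_mass
    for n in range(max_sites + 1):
        if abs(delta - n * PHOSPHATE_SHIFT) <= tolerance:
            return n
    return None


# Hypothetical peptide: predicted mass 1000.5000 Da, observed 1080.4663 Da.
print(phosphate_count(1080.4663, 1000.5000))
```

A result of 1 means the observed peptide is heavier than predicted by exactly one phosphate, suggesting the protein was phosphorylated after translation.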

Web sites for Bioinformatics

NCBI http://www.ncbi.nlm.nih.gov/
PubMed (National Library of Medicine, 2004) http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
LocusLink (Pruitt and Maglott, 2001) http://www.ncbi.nlm.nih.gov/LocusLink/
OMIM (NCBI, 2000) http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
Psi-Phi BLAST (Altschul et al., 1997) http://www.ncbi.nlm.nih.gov/BLAST/
ClustalW (Thompson et al., 1994) http://www.ebi.ac.uk/clustalw/index.html
KEGG (Kanehisa, 1997; Kanehisa and Goto, 2000) http://www.genome.ad.jp/kegg/
ExPASy http://us.expasy.org/
DeepView (Guex and Peitsch, 1997) http://us.expasy.org/spdbv/
SwissProt (Boeckmann et al., 2003) http://us.expasy.org/sprot/
Protein Data Bank (Berman et al., 2000) http://www.rcsb.org/pdb/
Sequence Manipulation Suite (Stothard, 2000) http://bioinformatics.org/sms/
PSIPRED (McGuffin et al., 2000), MEMSAT (Jones, 1999) http://bioinf.cs.ucl.ac.uk/psipred/

VOCABULARY:

1. THE CELL
2. CENTRAL DOGMA
3. DNA STRUCTURE
4. mRNA: TRANSCRIPTION, TRANSCRIPTION FACTORS
5. GENE ACTIVITY: NORTHERN BLOT AND HIGH THROUGHPUT ARRAY ANALYSIS
6. PROTEIN: TRANSLATION, STRUCTURE, 2-D GELS AND REGULATION BY PHOSPHORYLATION
7. BIOCHEMICAL PATHWAYS