Protein & Peptide Analysis
Linda Breci
Chemistry Mass Spectrometry Facility
University of Arizona
MS Summer Workshop
Overview
Using mass spectrometry for the measurement and/or identification of proteins
• Measuring whole proteins
  – Information about proteins is available on the internet
  – Limits due to instrument resolution, protein mass, matrix
  – Method: MALDI/TOF
  – Method: ESI + various analyzers
• Measuring peptides from proteins by MS
  – Peptide mass mapping
• Gel separation steps to prepare for protein identification by MS/MS
• Identifying proteins from peptides by MS/MS
Proteins versus peptides
[Diagram: an enzyme cuts a protein into peptides]
Analysis of whole proteins: Good news & bad news
• MALDI-TOF = measure with 1 or 2 protons
• ESI-Ion Trap = measure with many protons (high charge state)
• Result = mass accuracy not good enough to identify protein (but still useful!)
  – Mass accuracy decreases as size increases
Same protein, 2 ionization methods
MALDI/TOF – whole protein detected
[Figure: MALDI/TOF mass spectrum, intensity vs. m/z. Labeled peaks: [M+H]+ at m/z 14318.68, [M+2H]2+ at m/z 7157.18, and a [2M+H]+ dimer peak.]
ESI: Protein MW can be calculated from a protein’s charge distribution
[Figure: ESI mass spectrum, relative intensity vs. m/z (1000 to 2000), shown with the calculated mass spectrum (5000 to 15000), which gives a single peak at 14306.0]
Charge state    m/z
+13             1101.40
+12             1193.20
+11             1301.53
+10             1431.47
+9              1590.33
+8              1789.00
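How a molecular weight falls out of a charge distribution can be sketched in a few lines. The example below is a minimal illustration, not the instrument software's deconvolution: it assumes that neighbouring peaks differ by exactly one charge and that every charge is a proton (1.00728 Da). The function names are made up for this sketch; the peak list is the one labeled on the ESI spectrum.

```python
PROTON = 1.00728  # mass of a proton in Da


def assign_charges(peaks):
    """Assign charge states to the peaks of one ESI charge envelope.

    Assumes neighbouring peaks differ by exactly one charge.
    Returns (charges, sorted m/z values), aligned index by index.
    """
    mz = sorted(peaks)
    charges = []
    for low, high in zip(mz, mz[1:]):
        # `high` carries z charges and `low` carries z + 1, so
        #   z * (high - PROTON) = (z + 1) * (low - PROTON)
        # which rearranges to z = (low - PROTON) / (high - low).
        charges.append(round((low - PROTON) / (high - low)))
    # The lowest-m/z peak carries one more charge than its neighbour.
    return [charges[0] + 1] + charges, mz


def neutral_mass(peaks):
    """Average the neutral mass implied by every peak in the envelope."""
    charges, mz = assign_charges(peaks)
    masses = [z * (m - PROTON) for z, m in zip(charges, mz)]
    return sum(masses) / len(masses)


esi_peaks = [1789.00, 1590.33, 1431.47, 1301.53, 1193.20, 1101.40]
print(assign_charges(esi_peaks)[0])
print(round(neutral_mass(esi_peaks), 1))
```

Averaging over all charge states is the simplest way to combine the peaks; the result should land within a Da or two of the 14306.0 shown in the calculated spectrum.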
We measure ISOTOPES (not averages)
Example: Carbon is 12.000 (not 12.0107)
For every 12C there is 1.1% 13C
[Figure: bar charts of relative isotope peak height (%) for isotope peaks 1 to 7]
10 carbons:  100, 10.8, 0.5
100 carbons: 92.5, 100, 53.5, 18.8, 5, 1, 0.2
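The bar heights for 10 and 100 carbons follow from a binomial distribution over the number of 13C atoms. The sketch below is an illustration under stated assumptions: it considers carbon only (no H, N, O or S isotopes) and uses a 13C abundance of 1.07%, which the slide rounds to 1.1%. The function name is hypothetical.

```python
from math import comb


def carbon_isotope_pattern(n_carbons, p13=0.0107, n_peaks=7):
    """Relative heights (%) of the first isotope peaks for a carbon-only molecule.

    Peak k is the binomial probability of finding exactly k atoms of 13C
    among n_carbons, scaled so the tallest peak is 100.
    """
    n_peaks = min(n_peaks, n_carbons + 1)
    probs = [
        comb(n_carbons, k) * p13**k * (1 - p13) ** (n_carbons - k)
        for k in range(n_peaks)
    ]
    tallest = max(probs)
    return [100.0 * (p / tallest) for p in probs]


for n in (10, 100):
    print(n, [round(h, 1) for h in carbon_isotope_pattern(n)])
```

With 100 carbons the all-12C peak is no longer the tallest, which is why the most intense peak of a protein sits several mass units above its monoisotopic mass.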
Peak broadening in high mass measurement
Theoretical isotope distribution for a small protein
[Figure: theoretical isotope distribution; annotated peaks: 1st isotope at 14304.885, 9th isotope at 14313.906]
Isotope #    m/z          % Maximum
0            14304.885    0.2
1            14305.888    1.2
2            14306.891    4.6
3            14307.893    12.8
4            14308.896    26.9
5            14309.898    46.3
6            14310.900    67.6
7            14311.902    86.3
8            14312.904    97.7
9            14313.906    100.0
10           14314.908    93.5
11           14315.910    80.4
12           14316.912    64.2
13           14317.914    47.8
14           14318.916    33.4
15           14319.918    21.8
16           14320.920    13.2
17           14321.922    7.5
18           14322.924    3.9
19           14323.925    1.8
20           14324.927    0.7
21           14325.929    0.2
Peak broadening in high mass measurement
Resolution = Mass Accuracy
INSTRUMENT        MASS RANGE (m/z)    Resolution (at m/z 1,000)                Accuracy (Error) (at m/z 1,000)
FTICR             to 4,000            500,000                                  0.0001% (1 ppm)
MALDI/TOF         to 400,000          15,000 (reflectron)                      0.006% (60 ppm) Ext. Cal.; 0.003% (30 ppm) Int. Cal.
LCQ (Ion Trap)    to 2,000            2,000 (full scan); 10,000 (zoom scan)    0.03% (300 ppm)

ppm = (Theoretical MW - Measured MW) / Theoretical MW × 10^6
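The ppm error on this slide is simple arithmetic, and it helps to see what each instrument's accuracy means in mass units. The sketch below assumes the sign convention theoretical minus measured; the function names are illustrative, and the ppm figures are the ones quoted in the table.

```python
def ppm_error(theoretical_mw, measured_mw):
    """Mass error in parts per million (theoretical minus measured)."""
    return (theoretical_mw - measured_mw) / theoretical_mw * 1e6


def mass_window(mz, ppm):
    """Absolute mass uncertainty (same units as mz) for a given ppm accuracy."""
    return mz * ppm / 1e6


instruments = [
    ("FTICR", 1),
    ("MALDI/TOF, ext. cal.", 60),
    ("MALDI/TOF, int. cal.", 30),
    ("LCQ ion trap", 300),
]
for name, ppm in instruments:
    print(f"{name}: +/- {mass_window(1000.0, ppm):.3f} at m/z 1,000")
```

The same ppm figure grows in absolute terms with mass: 300 ppm is 0.3 at m/z 1,000 but more than 4 Da for a 14 kDa protein.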
Peak broadening in high mass measurement
No reflectron for high masses = reduced resolution
Only multiply charged proteins observed (more peaks/mass unit)
Examples of post translational modifications
Modification                                  abbreviation    monoisotopic    average
Acetylation                                   ACET            42.0106         42.0373
Amidation                                     AMID            -0.9840         -0.9847
Beta-methylthiolation                         BMTH            45.9877         46.0869
Biotin                                        BIOT            226.0776        226.2934
Carbamylation                                 CAM             43.0058         43.0250
Citrullination                                CITR            0.9840          0.9848
C-Mannosylation                               CMAN            162.0528        162.1424
Deamidation                                   DEAM            0.9840          0.9847
N-acyl diglyceride cysteine (tripalmitate)    DIAC            788.7258        789.3202
Dimethylation                                 DIMETH          28.0314         28.0538
FAD                                           FAD             783.1415        783.5420
Farnesylation                                 FARN            204.1878        204.3556
Formylation                                   FORM            27.9949         28.0104
Geranyl-geranyl                               GERA            272.2504        272.4741
Gamma-carboxyglutamic acid                    GGLU            43.9898         44.0098
O-GlcNAc                                      GLCN            203.0794        203.1950
Glucosylation (Glycation)                     GLUC            162.0528        162.1424
Hydroxylation                                 HYDR            15.9949         15.9994
Lipoyl                                        LIPY            188.0330        188.3027
Methylation                                   METH            14.0157         14.0269
Myristoylation                                MYRI            210.1984        210.3598
Palmitoylation                                PALM            238.2297        238.4136
Phosphorylation                               PHOS            79.9663         79.9799
Pyridoxal phosphate                           PLP             229.0140        229.1290
Phosphopantetheine                            PPAN            339.0780        339.3234
Pyrrolidone carboxylic acid                   PYRR            -17.0266        -17.0306
Sulfation                                     SULF            79.9568         80.0642
Trimethylation                                TRIMETH         42.0471         42.0807
Mass changes are difficult to identify in high mass measurements
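Why a mass change is hard to pin down on a whole protein can be shown by looking up a mass shift with two different tolerances. This is a hedged sketch: it uses a handful of monoisotopic values from the table above, and the function names and the 14306 Da / 300 ppm scenario are assumptions chosen for illustration.

```python
# A few monoisotopic mass shifts taken from the modification table.
PTM_MONOISOTOPIC = {
    "Methylation": 14.0157,
    "Hydroxylation": 15.9949,
    "Formylation": 27.9949,
    "Dimethylation": 28.0314,
    "Acetylation": 42.0106,
    "Trimethylation": 42.0471,
    "Phosphorylation": 79.9663,
    "Sulfation": 79.9568,
}


def tolerance_da(mass, ppm):
    """Convert a ppm accuracy into an absolute tolerance in Da at a given mass."""
    return mass * ppm / 1e6


def candidate_ptms(mass_shift, tolerance):
    """Names of modifications whose mass lies within `tolerance` of the shift."""
    return sorted(
        name
        for name, delta in PTM_MONOISOTOPIC.items()
        if abs(delta - mass_shift) <= tolerance
    )


# Peptide-scale measurement: a tight window separates near-isobaric modifications.
print(candidate_ptms(79.9663, 0.005))
# Whole-protein measurement at 300 ppm on a ~14.3 kDa protein: a window of about 4 Da.
print(candidate_ptms(79.9663, tolerance_da(14306.0, 300)))
```

At the wider window, phosphorylation and sulfation can no longer be told apart, and the same holds for acetylation versus trimethylation.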
Computer Exercises -- Goals
• Exercise #2, Whole protein analysis
  – Explore Expasy information available for a protein
  – Find the theoretical MW of a protein
  – Find the amino acid sequence of a protein in FASTA format for use in another exercise
  – Explore the X-ray crystal structure of a protein
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Proteins versus peptides
[Diagram: an enzyme cuts a protein into peptides]
Identification of proteins from peptide analysis
Separate by 2-D (or 1-D) Gel
Remove protein from gel after cutting into peptides with an enzyme (trypsin)
We can identify hundreds of proteins in one experiment
Extracting and Separating proteins
• Extracting proteins from biological organisms
  – Results in complex mixture of proteins
  – May require detergents, etc. that complicate mass spec analysis
  – Remove contaminants (filtration, dialysis, SPE, etc.)
• Separating proteins
  – 1D SDS-PAGE
    • Cross linking controls MW separated
    • Low resolution technique, spot can contain 10's to 100's of proteins
  – 2D SDS-PAGE
    • Best method for complex protein mixtures (IEF + SDS-PAGE)
  – Preparative isoelectric focusing (IEF)
  – Reverse phase HPLC
  – Size exclusion chromatography
  – Ion exchange chromatography
  – Affinity chromatography
2-D Electrophoresis
• 1st Dimension: Isoelectric Focusing (IEF)
  – Requires maximal resolution of a target group of proteins
  – Uses Immobiline DryStrip gels (various lengths and pH gradients)
  – IPGphor programmed to hydrate and separate proteins by pI (e.g., overnight)
• 2nd Dimension: Gel Separation
  – Apply the Immobiline DryStrip to the top of a gel
  – Separation by molecular weight is rapid (6-10 hours)
[Diagram: separation #1 by pI, separation #2 by MW]
2-D Electrophoresis
• Standard Method:
  – Separate proteins on 2-dimensional gels
  – Spots (and changes) can be observed (manually or with computer aid)
  – Method is reproducible (multiple runs required)
  – Cut out and identify spots of interest
• Difference Gel Electrophoresis (DIGE)
  – Two or more samples for comparative analysis are labelled with different fluorescent dyes, mixed together, run on the same 2D gel, and interrogated with a multi-wavelength fluorescent scanner
  – Allows quantitation of subtle changes in protein expression levels between samples, without inter-gel variability
  – Following example: analysis of a Bordetella bronchiseptica enzyme knockout cell line, compared to wild type
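10/29/03 Gel 2: Multiplexed gel image
[Figure: multiplexed 2-D gel image; horizontal axis pH 3 to pH 7, vertical axis molecular weight (small to large)]
10/29/03 Gel 2: Side-by-side Cropped Grayscale Images of WT (Cy3) and ∆dnt (Cy5)
[Figure: two cropped grayscale gel panels, WT (Cy3) and ∆dnt (Cy5), each spanning pH 3 to pH 7]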
nanoLC-MS/MS identification of two differentially expressed protein spots
BB3856 – AZURIN
L.AAECSVDIAGTDQM#QFDK.K
A.AEC*SVDIAGTDQM#QFDK.K
K.QFTVNLK.H
K.DGIAAGLDNQYLK.A
∆dnt (Cy5)
BB3856 – AZURIN
K.TADMQAVEK.D
K.VLGGGESDSVTFDVAK.L
K.DGIAAGLDNQYLK.A
WT (Cy3)
In-gel enzymatic digestion (trypsin most common)
Proteases and Cleavage Specificities
Protease              Cleaves
Trypsin               after K, R
Chymotrypsin          after W, Y, F, before P
Glu C (V8 protease)   after E, D, before P
Lys C                 after K, before P
Asp N                 after D
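The trypsin rule in the table (cut after K or R) is easy to apply in silico. The sketch below is a minimal illustration: it implements only the rule as stated on the slide, with no missed cleavages and no special handling of proline. The function name and the test sequence are made up.

```python
import re


def trypsin_digest(sequence):
    """Split a protein sequence after every K or R (simple trypsin rule)."""
    # Each peptide is a run of non-K/R residues ending in K or R;
    # the final alternative picks up a C-terminal peptide that lacks K/R.
    return re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)


toy_sequence = "MKTAYRAGKLLPEW"  # hypothetical sequence, for illustration only
print(trypsin_digest(toy_sequence))
```

Because every peptide except the last ends in K or R, tryptic peptides usually carry a basic residue at the C-terminus.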
Computer Exercises -- Goals
• Exercise #3, 2-D Gel Electrophoresis
  – Find a gel image online containing a protein spot of interest
  – Explore the gel images of various organisms
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Analysis of peptides from proteins
• MALDI-TOF = measure mass of peptides
  – Peptide mass mapping
• ESI-Ion Trap = measure mass/charge of peptide
  – PLUS can select and fragment (MS/MS) for more information
• Result = possible to identify a protein, or identify SNPs or modifications made to a protein
Peptide Mass Mapping using MALDI-TOF
MS of a peptide mixture by MALDI/TOF
[Figure: MALDI/TOF spectrum of a peptide mixture, m/z 500 to 2500, with two peaks marked "Ref"]
Data Analysis for Peptide Mass Mapping
• Important data
  – multiple peaks
  – mass accuracy
  – confirming information (pI, approx. mass, organism, etc.)
[Diagram: unknown protein (?) → peptides → MS → peptide MWs → found in selected databases (NDALYFPT..., SWDLTAL..., PTDLDVSY...) → identify and rank]
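The identify-and-rank step of peptide mass mapping can be sketched as counting, for each database protein, how many observed peptide masses match its theoretical digest within the instrument's mass accuracy. Everything in this example is an assumption for illustration: the function names, the toy database, and the mass values are invented, and real search engines use more elaborate scoring.

```python
def count_matches(observed, theoretical, ppm=60.0):
    """Number of observed masses within `ppm` of any theoretical peptide mass."""
    hits = 0
    for obs in observed:
        if any(abs(obs - theo) / theo * 1e6 <= ppm for theo in theoretical):
            hits += 1
    return hits


def rank_proteins(observed, database, ppm=60.0):
    """Rank database entries by how many observed peptide masses they explain."""
    scores = {
        name: count_matches(observed, masses, ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical theoretical digests (peptide masses), not real proteins.
toy_database = {
    "protein_A": [842.51, 1045.56, 1479.80, 2045.03],
    "protein_B": [927.49, 1045.60, 1639.94],
    "protein_C": [500.30, 1200.65],
}
observed_masses = [842.50, 1479.79, 1639.95, 2045.02]

for name, score in rank_proteins(observed_masses, toy_database):
    print(name, score)
```

The default of 60 ppm mirrors the externally calibrated MALDI/TOF figure; tightening it reduces chance matches, which is why mass accuracy is listed among the important data.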
Computer Exercises -- Goals
• Exercise #4, Peptide Mass Mapping
  – Identify a protein from a peptide mass list
  – Confirm this identity by producing a theoretical mass list
  – Optional (for the speedy ones): identify more unknowns from mass lists
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Unknown proteins
• 66 = Bovine Serum Albumin
• 116 = beta-galactosidase from E. coli
• 55 = glutamic dehydrogenase from bovine liver
• 36 = glyceraldehyde-3-phosphate dehydrogenase from rabbit muscle
LC/LC-MS/MS for Complex Mixtures
SCX = Strong cation exchange
RP = Reverse Phase (C-18)
Alternate an increasing salt gradient (move some peptides onto RP)
Follow by RP gradient (separate peptides, send to mass spec)
[Diagram: peptides from many proteins → SCX → RP → MS/MS]
Results in thousands of mass spectra
A computational challenge!
MS/MS Method Using ESI
[Figure: ion current over 60 min, with an MS scan and an MS/MS scan]
Peptide precursor ions observed by MS
• MH+: m/z = 1141.3
• [M+2H]2+: m/z = 571.2
Calculation of MH+
  571.2    m/z measured
    × 2
1,142.4    [M+2H]
  - 1.0
1,141.4    [M+H]
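The same arithmetic works for any charge state. The short sketch below follows the slide's convention of a nominal proton mass of 1.0; the function name is an assumption.

```python
def singly_charged_mz(mz, z, proton=1.0):
    """Convert the m/z of an [M+zH]z+ ion to the m/z of the [M+H]+ ion.

    Multiply by the charge to get the mass of M plus z protons,
    then remove all but one of the protons.
    """
    return mz * z - (z - 1) * proton


print(round(singly_charged_mz(571.2, 2), 1))
```

The small gap between this value and the directly observed MH+ at m/z 1141.3 is within the ion trap's accuracy.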
MS-MS of 571.2
[Figure: MS/MS spectrum of the m/z 571.2 precursor; labeled fragment at m/z 895.25]
Data Analysis for MS/MS Sequencing Method
[Diagram: unknown protein (?) → peptides → MS → MS/MS → peptide MW found in selected databases (NDALYFPT..., SWDLTAL..., PTDLDVSY...) → theoretical spectra → compare with the measured spectrum → identify and rank]
[Figure: measured and theoretical MS/MS spectra, relative intensity vs. m/z (200 to 1600)]
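The compare step can be illustrated with the simplest possible score: count how many theoretical fragment masses have an observed peak nearby. This shared-peak count is a stand-in chosen for clarity, not the scoring a search program such as Sequest actually uses. The observed peaks are the ones labeled in these slides; candidate_1 holds y3 to y8 values computed from average residue masses for VFGTDMDNSR, and candidate_2 is entirely made up.

```python
def shared_peak_count(observed_mz, theoretical_mz, tol=0.7):
    """Number of theoretical fragments matched by an observed peak within `tol`."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in theoretical_mz
    )


def best_candidate(observed_mz, candidates, tol=0.7):
    """Name of the candidate whose theoretical spectrum shares the most peaks."""
    return max(
        candidates,
        key=lambda name: shared_peak_count(observed_mz, candidates[name], tol),
    )


observed = [376.2, 491.1, 622.2, 895.25]
candidates = {
    "candidate_1": [376.4, 491.5, 622.7, 737.8, 838.9, 895.9],
    "candidate_2": [347.2, 476.3, 622.4, 751.4],  # hypothetical values
}
for name, fragments in candidates.items():
    print(name, shared_peak_count(observed, fragments))
print(best_candidate(observed, candidates))
```

The tolerance of 0.7 is an arbitrary choice that suits ion trap data and average masses; a real search would also weigh peak intensities.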
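y series ions
Peptide: V F G T D M D N S R
[Figure: MS/MS spectrum annotated with y ions y3 through y8. Residues read from the spacing between peaks: F (Phe), G (Gly), T (Thr), D (Asp), M (Met), D (Asp), N (Asn). Labeled peaks: y3 at 376.2, y4 at 491.1, y5 at 622.2, y8 at 895.25.]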
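The y ion masses for this peptide can be estimated by summing residue masses from the C-terminus and adding water plus one proton. The sketch below is an approximation under stated assumptions: it uses average residue masses for only the residues in this peptide, rounded constants for water and the proton, and a hypothetical function name.

```python
# Average residue masses for the residues that occur in VFGTDMDNSR.
AVERAGE_RESIDUE_MASS = {
    "V": 99.14, "F": 147.18, "G": 57.05, "T": 101.11, "D": 115.09,
    "M": 131.19, "N": 114.11, "S": 87.08, "R": 156.19,
}
WATER = 18.015  # H2O retained by every y ion
PROTON = 1.007  # charge carrier


def y_ions(peptide):
    """Singly charged y ion masses; index 0 is y1 (the C-terminal residue)."""
    ions = []
    running = WATER + PROTON
    for residue in reversed(peptide):
        running += AVERAGE_RESIDUE_MASS[residue]
        ions.append(round(running, 2))
    return ions


for n, mass in enumerate(y_ions("VFGTDMDNSR"), start=1):
    print(f"y{n}: {mass}")
```

The computed y3, y4, y5 and y8 fall within about 0.7 of the labeled peaks, and the full-length value is close to the MH+ of the precursor.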
Peptide bond fragment ions
[Figure: peptide backbone showing the cleavage sites that give the N-terminal fragment ions a2, b2, c2 and the C-terminal fragment ions x2, y2, z2]
Peptide fragment ions
[Figure: structures of an internal immonium ion and an amino acid immonium ion]
Peptide Sequencing
amino acid       3-letter    1-letter    average residue mass (u)
Alanine          ALA         A           71.09
Arginine         ARG         R           156.19
Aspartic Acid    ASP         D           115.09
Asparagine       ASN         N           114.11
Cysteine         CYS         C           103.15
Glutamic Acid    GLU         E           129.12
Glutamine        GLN         Q           128.14
Glycine          GLY         G           57.05
Histidine        HIS         H           137.14
Isoleucine       ILE         I           113.16
Leucine          LEU         L           113.16
Lysine           LYS         K           128.17
Methionine       MET         M           131.19
Phenylalanine    PHE         F           147.18
Proline          PRO         P           97.12
Serine           SER         S           87.08
Threonine        THR         T           101.11
Tryptophan       TRP         W           186.21
Tyrosine         TYR         Y           163.18
Valine           VAL         V           99.14
[Figure: peptide backbone segment showing consecutive residues Ala (71 u) and Asp (115 u)]
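Sequencing from a fragment ladder comes down to looking up the mass difference between neighbouring peaks in the residue mass table. The sketch below is a minimal illustration: it uses a subset of average residue masses, a tolerance of 0.5 chosen for ion trap data, and hypothetical function names. The ladder values are the y ion peaks labeled on the y series slide.

```python
# Subset of average residue masses; I and L share one entry because
# they have the same mass and cannot be told apart this way.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.09, "S": 87.08, "P": 97.12, "V": 99.14,
    "T": 101.11, "I/L": 113.16, "N": 114.11, "D": 115.09, "Q": 128.14,
    "K": 128.17, "E": 129.12, "M": 131.19, "H": 137.14, "F": 147.18,
    "R": 156.19, "Y": 163.18,
}


def residues_for(delta, tol=0.5):
    """Residues whose mass is within `tol` of a peak-to-peak difference."""
    return [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]


def read_ladder(peaks, tol=0.5):
    """Candidate residues for each step between consecutive ladder peaks."""
    ordered = sorted(peaks)
    return [residues_for(high - low, tol) for low, high in zip(ordered, ordered[1:])]


y_ladder = [376.2, 491.1, 622.2]  # y3, y4, y5 from the y series slide
print(read_ladder(y_ladder))
print(residues_for(128.15, tol=0.05))
```

Moving up a y ion ladder adds residues toward the N-terminus, so the steps here read D and then M, matching ...M D N S R. The second lookup shows that Q and K differ by only 0.03 u and need much better mass accuracy to separate.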
Computer Exercises -- Goals
• Exercise #5, Peptide Sequencing & Protein ID
  – Identify a peptide (and its protein) from an MS-MS mass list
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Homology search to find protein function
BLAST: Computer Exercise #6
• Peptide sequences found for an “unknown” protein by Sequest database searching
• Find a possible function of this protein

Locus                       Spectrum Count    Sequence Coverage    Length    Descriptive Name
CL001145.84_fgenesh_1_aa    1                 3.20%                1013      Unknown

FileName                   XCorr     DeltCN    M+H+       Sequence
lb100404_01.2750.2750.2    2.7258    0.181     3290.86    RLVVVNAKPTAASAVGLAGPGAADVLPFVEADLKKS
lb100404_01.1675.1675.2    3.6422    0.181     1658.85    RHFFAAAAGQPPPQY.L
Computer Exercises -- Goals
• Exercise #6, BLAST Search
  – Perform a BLAST search for a peptide sequence that was found in the previous exercise
  – Observe the other proteins with similar sequence
  – Not all organisms have full genomic information; homology sequencing is useful for protein identification
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Computer Exercises -- Goals
• Exercise #7, Find an unknown protein
  – Use the same method as in #4 to find an unknown peptide
  – Information provided:
    • MS spectrum
    • MS/MS spectrum
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html
Open source software for high-throughput proteomics: X Tandem
• Current trends to free software
• The Global Proteome Machine http://www.thegpm.org/
  – X Tandem
  – Sequenced peptide libraries
  – Software available to programmers
Computer Exercises -- Goals
• X Tandem identification of the same spectra
• Exercise #8, Find an unknown protein
Open Webpage: http://www.chem.arizona.edu/facilities/msf/BiochemLab/exercises.html