Peptide Identification by Tandem Mass Spectrometry Behshad Behzadi April 2005

Peptide Identification by Tandem Mass Spectrometry

Behshad Behzadi

April 2005

Outline

• Proteomics

• Tandem Mass Spectrometry

• Peptide Identification Problem

• Identification Via Database

• De novo peptide identification

Proteomics

• The systematic analysis of the proteins expressed by a cell or tissue.

• Identification, quantification, interactions, …

• Tandem mass spectrometry is an essential tool for identifying (and quantifying) the proteins in a mixture.

Proteins

• The primary structure of a protein is a sequence over an alphabet of 20 amino acids.

Amino Acids

Tandem Mass Spectrum: An Example

Secondary Fragmentation

Ionized parent peptide

What is the goal ?

• Spectrum → Peptide sequence

Protein Backbone

H...-HN-CH(R(i-1))-CO-NH-CH(R(i))-CO-NH-CH(R(i+1))-CO-…OH

• R(i-1), R(i), R(i+1): side chains of amino acid residues i-1, i and i+1.

• N-terminus on the left, C-terminus on the right.

Breaking of Protein Backbone

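H...-HN-CH(R(i-1))-CO | NH-CH(R(i))-CO-NH-CH(R(i+1))-CO-…OH

• R(i-1), R(i), R(i+1): side chains of amino acid residues i-1, i and i+1.

• N-terminus on the left, C-terminus on the right.

• The backbone breaks at the bond between CO and NH (marked "|"); H+ is the proton carried by the fragment.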

How Does a Peptide Fragment?

m(y1) = 19 + m(A4)
m(y2) = 19 + m(A4) + m(A3)
m(y3) = 19 + m(A4) + m(A3) + m(A2)

m(b1) = 1 + m(A1)
m(b2) = 1 + m(A1) + m(A2)
m(b3) = 1 + m(A1) + m(A2) + m(A3)
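
The b- and y-ion mass formulas for a peptide A1 A2 A3 A4 can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names `b_ions` and `y_ions` are hypothetical, the residue masses are standard monoisotopic values, and the offsets 1 and 19 are the nominal values used on the slide.

```python
# Standard monoisotopic residue masses in Da (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

B_OFFSET = 1.0   # nominal offset for b ions, as on the slide
Y_OFFSET = 19.0  # nominal offset for y ions, as on the slide


def b_ions(peptide):
    """Masses of b1..b(n-1): offset plus the sum of the first k residues."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(B_OFFSET + total)
    return masses


def y_ions(peptide):
    """Masses of y1..y(n-1): offset plus the sum of the last k residues."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(Y_OFFSET + total)
    return masses


if __name__ == "__main__":
    print("b ions:", b_ions("GASP"))
    print("y ions:", y_ions("GASP"))
```

With these offsets, every complementary pair b(k) and y(n-k) sums to the total residue mass plus 20, which is a useful sanity check on a ladder.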

The Identification Algorithms

• Database Search Algorithms (Sequest, Mascot, …)

• De novo Algorithms (Lutefisk, Peaks,…)

Database Search Algorithms

• Interpreting the tandem mass spectral data by searching a protein database.

• SEQUEST (Eng et al. 1994)

• Mascot (Perkins et al. 1999)

• ProteinProspector (Clauser et al. 1999)

SEQUEST (Eng et al. 94)

• The protein database is searched for amino acid sequences whose mass matches the parent peptide mass within a mass tolerance of 1.

• Produce the theoretical spectra for the candidates.

• Match the theoretical and experimental spectra using a score function (Xcorr).

• Rank the candidates using this score.
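
The four steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not SEQUEST itself: the database is a plain list of peptide strings (a hypothetical simplification), the function names are invented for the example, and the score is a simple shared-peak count standing in for Xcorr, which is a cross-correlation score and is not reproduced here.

```python
# Standard monoisotopic residue masses in Da (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # monoisotopic mass of H2O
PROTON = 1.00728   # mass of a proton


def peptide_mass(peptide):
    """Neutral peptide mass: residues plus one water."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER


def theoretical_spectrum(peptide):
    """Singly charged b and y ion masses, sorted."""
    masses = [RESIDUE_MASS[a] for a in peptide]
    peaks = []
    for k in range(1, len(peptide)):
        peaks.append(sum(masses[:k]) + PROTON)          # b(k)
        peaks.append(sum(masses[k:]) + WATER + PROTON)  # y(n-k)
    return sorted(peaks)


def shared_peak_count(theoretical, observed, tol=0.5):
    """Stand-in score: theoretical peaks with an observed peak within tol."""
    return sum(1 for t in theoretical
               if any(abs(t - o) <= tol for o in observed))


def search(database, parent_mass, observed, mass_tol=1.0, frag_tol=0.5):
    """Filter candidates by parent mass, score them, rank by score."""
    candidates = [p for p in database
                  if abs(peptide_mass(p) - parent_mass) <= mass_tol]
    scored = [(p, shared_peak_count(theoretical_spectrum(p), observed, frag_tol))
              for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    db = ["GASP", "PSAG", "GAVL", "KKKK"]
    spectrum = theoretical_spectrum("GASP")  # idealized experimental spectrum
    for peptide, score in search(db, peptide_mass("GASP"), spectrum):
        print(peptide, score)
```

The mass filter alone cannot separate GASP from its reversal PSAG, since both have the same composition; only the fragment-level score distinguishes them.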

Other probabilistic models for scores

• Qin et al. (1997)

• Dancik et al. (2000)

• Bafna and Edwards (2001)

Why do we need de novo?

• Unknown genomes of certain organisms.

• The sequences in the protein database are not always accurate.

• Modifications in Amino Acids: RNA editing, Post-Translational Modifications

Methods

• Tree Based Search (Taylor et al. 97)

• Spectrum Graph Based Search (Dancik et al. 99)

• Dynamic Programming Algorithm (Chen et al. 2001)

• AuDeNS (Baginsky et al. 02)

• Sub-Optimal Algorithm (Lu and Chen 03)

• …

De Novo Identification

• Given a spectrum S and a defined scoring function f(), find a peptide sequence q that maximizes f(S|q).
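
As a toy instance of this optimization problem, the sketch below builds a simplified spectrum graph and finds the best path by dynamic programming. It is not the algorithm of any of the cited papers. The assumptions are: every peak is treated as a singly charged b ion with the nominal offset 1, the total residue mass of the peptide is known, and f is simply the number of ladder steps explained. The name `de_novo` is hypothetical, and I is omitted from the mass table because it cannot be distinguished from L by mass.

```python
# Standard monoisotopic residue masses in Da; I is left out (same mass as L).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
B_OFFSET = 1.0  # nominal b-ion offset (assumption, as in m(b1) = 1 + m(A1))


def de_novo(peaks, total_residue_mass, tol=0.02):
    """Return the best-scoring sequence, or None if no path reaches the end.

    Nodes are prefix residue masses: 0, each peak minus the b offset, and
    the total residue mass. An edge joins two nodes whose mass difference
    equals one residue mass within tol. The score of a path is its number
    of edges, so the DP prefers paths that use more peaks.
    """
    inner = [p - B_OFFSET for p in peaks
             if 0.0 < p - B_OFFSET < total_residue_mass]
    nodes = [0.0] + sorted(inner) + [total_residue_mass]
    best = {0: (0, "")}  # node index -> (score, sequence so far)
    for j in range(1, len(nodes)):
        for i in range(j):
            if i not in best:
                continue  # node i is not reachable from the start
            diff = nodes[j] - nodes[i]
            for aa, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    score = best[i][0] + 1
                    if j not in best or score > best[j][0]:
                        best[j] = (score, best[i][1] + aa)
    last = len(nodes) - 1
    return best[last][1] if last in best else None


if __name__ == "__main__":
    # b1..b3 of GASP plus one noise peak at 150.0
    peaks = [58.02146, 129.05857, 150.0, 216.0906]
    print(de_novo(peaks, 312.14336))
```

In this example the mass 128.05857 could also be read as a single Q, but the two-step reading G then A uses one more peak and therefore scores higher.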

AuDeNS

• Uses grass mowers to preprocess the spectrum, then applies the dynamic programming approach.

• Computes a relevance for each peak by using different mowers.

• Applies a weighted version of the Chen et al. algorithm (DP).

Mowers

• Threshold Mower

• Window Mower

• Isotope Mower

• Intersection Mower

• Complement Mower
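
The slides name the mowers without defining them, so the sketch below is a guess at three of them based on their names only; it is not the AuDeNS implementation. The assumptions are: a threshold mower keeps peaks above an intensity cutoff, a window mower keeps the strongest peaks in each m/z window, a complement mower keeps peaks that have a partner summing to a given mass, and the relevance of a peak is the number of mowers that keep it. All function and parameter names are hypothetical.

```python
from functools import partial


def threshold_mower(peaks, min_intensity):
    """Indices of peaks whose intensity reaches min_intensity (assumed rule)."""
    return {i for i, (_, inten) in enumerate(peaks) if inten >= min_intensity}


def window_mower(peaks, width, keep):
    """Indices of the `keep` most intense peaks per m/z window (assumed rule)."""
    buckets = {}
    for i, (mz, inten) in enumerate(peaks):
        buckets.setdefault(int(mz // width), []).append((inten, i))
    kept = set()
    for members in buckets.values():
        kept.update(i for _, i in sorted(members, reverse=True)[:keep])
    return kept


def complement_mower(peaks, pair_sum, tol=0.5):
    """Indices of peaks with a partner so that both m/z sum to pair_sum."""
    kept = set()
    for i, (mz_i, _) in enumerate(peaks):
        for j, (mz_j, _) in enumerate(peaks):
            if i != j and abs(mz_i + mz_j - pair_sum) <= tol:
                kept.add(i)
    return kept


def relevance(peaks, mowers):
    """Relevance of each peak: how many mowers keep it (assumed combination)."""
    votes = [0] * len(peaks)
    for mower in mowers:
        for i in mower(peaks):
            votes[i] += 1
    return votes


if __name__ == "__main__":
    # (m/z, intensity) pairs; values are made up for illustration
    peaks = [(100.0, 5.0), (120.0, 50.0), (130.0, 40.0),
             (300.0, 3.0), (380.0, 60.0)]
    mowers = [
        partial(threshold_mower, min_intensity=10.0),
        partial(window_mower, width=100.0, keep=1),
        partial(complement_mower, pair_sum=500.0),
    ]
    print(relevance(peaks, mowers))
```

The resulting relevance values could then serve as peak weights in a weighted dynamic programming step.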

Summary: De novo Sequencing

[Figure: example MS/MS spectrum (S#: 1708, RT: 54.47, full ms2 of 638.00, m/z range 165.00 - 1925.00). x-axis: m/z; y-axis: relative abundance. Labeled peaks: 226.9, 326.0, 397.1, 425.0, 489.1, 524.9, 588.1, 589.2, 629.0, 687.3, 850.3, 851.4, 949.4, 1048.6, 1049.6. Annotation: Sequence.]

Intensities

• Intensities are the second dimension of information in a spectrum (the first being m/z).

• Several factors determine the intensities.

Intensities (2)

• Amino Acid dependent factors,

• Ion type factors,

• Position-based factors (peaks in the middle of the spectrum are higher)

Conclusion

• Tandem mass spectrometry is now the most important tool for identifying proteins.

• Many approaches have been developed, but there is still a long way to go in extracting all the information that mass spectra contain.

Research Themes

• Combining de novo and database methods (e.g., extracting tags).

• Using the intensities

• Dealing better with PTMs (200 types).

• Clustering of high-throughput experiments.

• Multi-Dimensional Interpretation.