SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE SEQUENCING FOR HCD AND ETD SPECTRA PAIRS
Yan Yan
Department of Computer Science
University of Western Ontario, Canada
OUTLINE
• Background
  - Tandem mass spectrometry
  - Peptide sequencing methods
• Proposed method
  - Use of spectra libraries
  - Spectra merging
  - Peptide tags
  - De novo sequencing model
• Experiments and results
  - Data
  - Experiments and comparison
• Conclusions
BACKGROUND
• Mass spectrometry (MS)
  - An analytical technique that measures the mass-to-charge ratio (m/z) of individual compounds
• Tandem mass spectrometry (MS/MS)
  - Uses two or more mass analyzers
  - Breaks the compounds into smaller fragment ions
  - Fragmentation techniques
    • Collision-induced dissociation (CID)
    • Higher-energy collisional dissociation (HCD)
    • Electron transfer dissociation (ETD)
• MS/MS experiments
  - Input: protein samples
  - Output: tandem mass (MS/MS) spectra
MS/MS PROCESS
[Figure: MS/MS workflow. Sample preparation: protein sample, protein digestion (enzyme), peptide sample, peptide separation. MS: ion source, mass analyzer 1, precursor ions of interest. Fragmentation: CID, HCD, ETD, ... MS/MS: mass analyzer 2, fragment ions, detector.]
BACKGROUND
• Different ion types of MS/MS
  - There are commonly 6 different ion types, and they form 3 complementary ion pairs
  - b-ions and y-ions are the most common ions in CID/HCD spectra
  - c-ions and z-ions are common in ETD spectra
  - Ions may also lose small molecules such as ammonia (NH3) and water (H2O)
[Figure: Peptide fragmentation notation, http://en.wikipedia.org/wiki/Tandem_mass_spectrometry]
TWO WAYS OF PEPTIDE SEQUENCING
[Figure: two ways of peptide sequencing (Frank et al., JPR 2006)]
BACKGROUND
• Spectra library
  - Set of annotated experimental MS/MS spectra
  - Publicly available libraries
    • chemdata.nist.gov
  - Additional information helps with de novo sequencing
PROPOSED METHOD
• Spectra merging
  - Peak selection (within each spectrum)
    • Find all length-2 paths and select the middle peak
    • Find complementary ion pairs and select them
  - Output: all selected peaks form one spectrum S
[Figure: length-2 path through peaks i, j, k, where the i-j and j-k mass differences are m(aa1) and m(aa2); the middle peak j is output]
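The peak selection on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: peaks are treated as singly charged, a fixed mass tolerance `tol` is used, and `pair_sum` is a hypothetical argument standing for the expected sum of a complementary ion pair. The names `select_peaks` and `is_residue_gap` are made up for this sketch.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
# Leucine and isoleucine share one mass, so 19 distinct values remain.
RESIDUE_MASSES = [
    57.02146, 71.03711, 87.03203, 97.05276, 99.06841, 101.04768,
    103.00919, 113.08406, 114.04293, 115.02694, 128.05858, 128.09496,
    129.04259, 131.04049, 137.05891, 147.06841, 156.10111, 163.06333,
    186.07931,
]


def is_residue_gap(delta, tol):
    """True if a mass difference matches one amino acid residue mass."""
    return any(abs(delta - m) <= tol for m in RESIDUE_MASSES)


def select_peaks(mzs, pair_sum, tol=0.02):
    """Select middle peaks of length-2 paths and complementary pairs.

    Assumes singly charged peaks; pair_sum is the expected sum of a
    complementary ion pair (an assumption of this sketch).
    """
    mzs = sorted(mzs)
    selected = set()
    for j, mj in enumerate(mzs):
        # Peak j is the middle of a path i -> j -> k when the gaps on
        # both sides each equal one amino acid residue mass.
        left = any(is_residue_gap(mj - mi, tol) for mi in mzs[:j])
        right = any(is_residue_gap(mk - mj, tol) for mk in mzs[j + 1:])
        if left and right:
            selected.add(mj)
    for a_idx, a in enumerate(mzs):
        # Two peaks form a complementary pair when they sum to pair_sum.
        for b in mzs[a_idx + 1:]:
            if abs(a + b - pair_sum) <= tol:
                selected.update((a, b))
    return sorted(selected)
```

The selected peaks play the role of the merged spectrum S. Handling of multiple charge states and neutral losses is left out here to keep the sketch short.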
PROPOSED METHOD
• Improved spectra merging: use spectra libraries as training sets
  - Calculate significant scores
    • Middle ion of the 2-tags
    • Complementary ion pairs
  - Assign significant scores to peaks during selection
    • Peak I: (m/z, charge, score)
PROPOSED METHOD
• De novo sequencing
  - Generate peptide tags from both spectra
  - Graph with multiple edge types (GMET)
[Figure: Tag1 and Tag2 placed between the N-terminal and the C-terminal]
Y. Yan, A. Kusalik, and F.-X. Wu, "NovoHCD: De novo peptide sequencing from HCD spectra," IEEE Transactions on NanoBioscience, vol. 13, no. 2, pp. 65-72, June 2014.
PROPOSED METHOD
• Peptide tag improvement
  - Generate length-3 tags
  - Merge tags that overlap in two successive amino acids (and whose ions overlap)
    • E.g., t_i = TAG and t_j = AGT give t_ij = TAGT
  - Assign significant scores to tags
    • Rank candidate peptides with significant scores
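The tag merge in the example above (TAG and AGT giving TAGT) can be illustrated with a small string operation. This is a simplified sketch: it checks only the amino acid overlap and ignores the requirement that the underlying ions overlap as well. The function name `merge_tags` and its `overlap` parameter are hypothetical.

```python
def merge_tags(ti, tj, overlap=2):
    """Merge tj onto ti when the last `overlap` residues of ti equal
    the first `overlap` residues of tj; return None otherwise."""
    if len(ti) < overlap or len(tj) < overlap:
        return None
    if ti[-overlap:] != tj[:overlap]:
        return None
    # Keep ti whole and append only the non-overlapping part of tj.
    return ti + tj[overlap:]


print(merge_tags("TAG", "AGT"))  # TAGT
```

Applying the merge repeatedly grows longer tags from length-3 tags, for example `merge_tags("TAGT", "GTK")` yields `"TAGTK"`.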
EXPERIMENTS AND RESULTS
• Experiment
  - MS/MS spectra dataset
  - Spectra libraries
    • Human peptide spectral library (from chemdata.nist.gov) of 183,140 HCD spectra
    • 100,000 ETD spectra of synthetic unmodified peptides

Dataset             # of spectra   Spectrum charge   Selected pairs
SCX_HCD_decon       1952           +2 to +6          161
SCX_ETD_decon       612            +2 to +6          161
SCX_HCD_no_decon    2557           +2 to +5          249
SCX_ETD_no_decon    1298           +2 to +5          249
(Spectrum charge and selected pairs are given per HCD/ETD dataset pair.)
EXPERIMENTS AND RESULTS
• Results
  - Significant score calculated using spectra libraries
EXPERIMENTS AND RESULTS
• Results
  - Full-length accuracy (top three candidates output)
EXPERIMENTS AND RESULTS
• Results
  - Accuracy comparison with different output
    • SCX_HCD_decon and SCX_ETD_decon dataset pair
    • SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair
EXPERIMENTS AND RESULTS
• Results
  - Computational time
CONCLUSIONS
• Spectra libraries add information that improves peptide sequencing performance
• Full-length accuracy increases by up to 11% when only the highest-ranked candidates are output
• Computation time is reduced by up to 40% when merged long peptide tags are used
THANK YOU
FUTURE WORK PLAN
• Experiments
  - Compare with more methods
  - Use more datasets
• Algorithm improvement
  - Other ways of selecting signal peaks to merge spectra
    • Spectrum-specific features
• Method development
  - Multiple spectra sequencing
BACKGROUND
• Peptides
  - Peptides are organic compounds consisting of 2 or more amino acids
  - There are 20 standard amino acids in nature
  - Two amino acids connect to each other by forming a peptide bond and losing a molecule of water
[Figure: Amino acid structure, Dec. 20, 2013, http://en.wikipedia.org/wiki/Amino_acid]
[Figure: Formation of a peptide from two amino acids, Dec. 20, 2013, http://www.scientificpsychic.com/fitness/aminoacids.html]
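Because each peptide bond releases one water molecule, the neutral mass of a peptide is the sum of its residue masses (amino acid minus water) plus one water for the free termini. The sketch below shows that calculation and the resulting m/z for a given charge. It uses standard monoisotopic masses; the function names `peptide_mass` and `mz` are illustrative and do not come from the slides.

```python
WATER = 18.010565   # monoisotopic mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)

# Monoisotopic residue masses (amino acid minus one water), in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mass(seq):
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def mz(mass, charge):
    """Mass-to-charge ratio of a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


m = peptide_mass("GA")   # dipeptide Gly-Ala
print(round(m, 4), round(mz(m, 2), 4))
```

The dipeptide mass equals the masses of free glycine and alanine minus one water, which matches the peptide bond formation described on the slide.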
BACKGROUND
• ECD/ETD spectra
  - New data that has become available in recent years
  - c-ions and z-ions are dominant in ETD/ECD spectra
  - Ions lose small molecules such as ammonia (NH3) and water (H2O)
BACKGROUND
• CID/HCD spectra
  - Common ion types and mass calculation
BACKGROUND
• ECD/ETD spectra
  - Common ion types and mass calculation
BACKGROUND
• Complementary ion pairs
  - In CID/HCD: $b_i + y_{n-i} = MH^{+} + H = M_{parent} + 2H$
  - In ECD/ETD: $c_i + z_{n-i} = M_{parent} + 3H$ and $(c_i - 1) + (z_{n-i} + 1) = M_{parent} + 3H$
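The complementary-pair relations can be checked numerically. The sketch below assumes singly charged fragments and the common mass conventions b = prefix residues + proton, y = suffix residues + H2O + proton, c = b + NH3, and z = y - NH3 + H (the radical z-ion, which is what brings the c/z sum to about M + 3H). All function names are illustrative, and only a few residue masses are listed to keep the example short.

```python
import math

PROTON = 1.007276     # proton mass (Da)
HYDROGEN = 1.007825   # hydrogen atom mass (Da)
WATER = 18.010565     # H2O
AMMONIA = 17.026549   # NH3
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "T": 101.04768,
                "K": 128.09496}


def parent_mass(seq):
    """Neutral peptide mass M_parent."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def b_ion(seq, i):
    """Singly charged b_i: first i residues plus a proton."""
    return sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON


def y_ion(seq, k):
    """Singly charged y_k: last k residues plus water and a proton."""
    return sum(RESIDUE_MASS[aa] for aa in seq[len(seq) - k:]) + WATER + PROTON


def c_ion(seq, i):
    """c_i under the assumed convention c = b + NH3."""
    return b_ion(seq, i) + AMMONIA


def z_ion(seq, k):
    """Radical z_k under the assumed convention z = y - NH3 + H."""
    return y_ion(seq, k) - AMMONIA + HYDROGEN


seq, i = "TAGK", 2
n, M = len(seq), parent_mass(seq)
# b_i + y_(n-i) = M_parent + 2H
print(math.isclose(b_ion(seq, i) + y_ion(seq, n - i), M + 2 * PROTON))
# c_i + z_(n-i) is M_parent + 3H up to the electron mass (about 0.0005 Da)
print(abs(c_ion(seq, i) + z_ion(seq, n - i) - (M + 3 * PROTON)) < 1e-3)
```

Both checks print `True`. The small residual in the c/z case comes from using a hydrogen atom rather than a proton for the third H.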
PROPOSED METHOD
• Spectra merging
  - Peak selection (within each spectrum)
    1. Find all length-2 paths and output the middle peak
       • Relationship: amino acid mass difference
       • Ions: regular or with loss of small molecules
    2. Find complementary ion pairs and output them
       • Ions: consider different charges
  - Output peaks form the spectrum S
[Figure: length-2 path through peaks i, j, k, where the i-j and j-k mass differences are m(aa1) and m(aa2); the middle peak j is output]
PROPOSED METHOD (HAVE PROBLEMS)
• De novo sequencing
  - Tags
    • Find 3-tags from each of the original spectra (denoted as S and S)
  - Find peaks associated with a tag t in S
  - Calculate AACs separated by t (use P&)
    • Threshold control if more tags are needed
  - Extend paths to form graphs with multiple edge types (GMETs) with AAC restriction
    • Ion types include b/y- and c/z-ions
    • Ion type is determined by the previous step
  - Assemble tags and partial peptides from GMETs into candidate peptides
EXPERIMENTS AND RESULTS
• Experiment
  - Data
    • Select spectra pairs having the same peptide sequences
Results inferred by two previously published methods for HCD and ExD data, respectively
6
TWO WAYS OF PEPTIDE SEQUENCING
Frank et al JPR 2006
BACKGROUND
cent Spectra library Set of annotated experimental MSMS spectra Publicly available libraries
cent chemdatanistgov Additional information help with de novo sequencing
7
PROPOSED METHOD
cent Spectra merging Peak selection (within each spectrum)
cent Find all length 2 paths select middle peak
cent Find complimentary ion pairs and select them Output all selected peaks form one spectrum S
8
k
maa2
j
maa1
i
output
PROPOSED METHOD
cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores
cent Middle ion of the 2-tagscent Complementary ion pairs
9
PROPOSED METHOD
cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores
cent Middle ion of the 2-tagscent Complementary ion pairs
Assign significant scores on peaks during selectioncent Peak I (mz charge score)
10
N-terminal
Tag1 Tag2
C-terminal
PROPOSED METHOD
11
cent De novo sequencing Generate peptide tags from both spectra Graph with multiple edge types (GMET)
Y Yan A Kusalik and F-X Wu ldquoNovoHCD De novo peptide sequencing from HCD spectrardquo NanoBioscience IEEE Transactions on vol 13 no 2 pp 65ndash72June 2014
PROPOSED METHOD
12
cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap
(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT
Assign significant scores to tagscent Rank candidate peptides with significant scores
EXPERIMENTS AND RESULTS
cent Experiment MSMS spectra dataset
Spectra librariescent human peptide spectral library (from chemdatanistgov) of
183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides
13
Dataset of
spectraSpectrum
chargeSelected
pairs
SCX_HCD_decon 1952+2 to +6 161
SCX_ETD_decon 612SCX_HCD_no_decon 2557
+2 to +5 249SCX_ETD_no_decon 1298
• Results: significance scores calculated using spectra libraries
• Results: full-length accuracy when the top three candidates are output
• Results: accuracy comparison with different output
  - SCX_HCD_decon and SCX_ETD_decon dataset pair
  - SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair
• Results: computational time
CONCLUSIONS
• Spectra libraries add information that improves peptide sequencing performance
• Full-length accuracy increases by up to 11% when only the highest-ranked candidates are output
• Computation time is reduced by up to 40% when merged long peptide tags are used
THANK YOU
FUTURE WORK PLAN
• Experiments
  - Compare with more methods
  - Use more datasets
• Algorithm improvement
  - Other ways of selecting signal peaks to merge spectra
    - Spectrum-specific features
• Method development
  - Multiple-spectra sequencing
BACKGROUND
• Peptides
  - Peptides are organic compounds consisting of two or more amino acids
  - There are 20 standard amino acids in nature
  - Two amino acids connect to each other by forming a peptide bond and losing a molecule of water
[Figure: amino acid structure. Source: http://en.wikipedia.org/wiki/Amino_acid, Dec. 20, 2013.]
[Figure: formation of a peptide from two amino acids. Source: http://www.scientificpsychic.com/fitness/aminoacids.html, Dec. 20, 2013.]
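The loss of one water molecule per peptide bond is what makes a peptide's mass equal to the sum of its residue masses plus a single water. A small Python sketch of that arithmetic follows, using standard monoisotopic masses for three residues; the function name is hypothetical.

```python
WATER = 18.010565  # monoisotopic mass of H2O (Da)

# Monoisotopic residue masses (Da): free amino acid mass minus one water.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def peptide_mass(sequence):
    """Neutral monoisotopic mass of a peptide: residue masses plus one water for the two termini."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


glycine = peptide_mass("G")   # a free amino acid is a one-residue "peptide"
alanine = peptide_mass("A")
dipeptide = peptide_mass("GA")

# Forming the peptide bond releases exactly one water molecule.
print(round(glycine + alanine - dipeptide, 6))  # 18.010565
```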
• ECD/ETD spectra
  - New data that have become available in recent years
  - c-ions and z-ions are dominant in ETD/ECD spectra
  - Ions lose small molecules such as ammonia (NH3) and water (H2O)
• CID/HCD spectra
  - Common ion types and mass calculation
• ECD/ETD spectra
  - Common ion types and mass calculation
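The mass calculations on these two slides did not survive extraction. As a stand-in, the following Python sketch computes singly charged m/z values for the four common ion types under widely used conventions. It assumes the radical z ion (y minus NH3 plus one hydrogen atom); other conventions for z ions exist, and the helper names are hypothetical.

```python
PROTON = 1.007276    # mass of a proton (Da)
HYDROGEN = 1.007825  # monoisotopic mass of a hydrogen atom (Da)
WATER = 18.010565    # monoisotopic mass of H2O (Da)
AMMONIA = 17.026549  # monoisotopic mass of NH3 (Da)

# Monoisotopic residue masses (Da) for the residues used in the example.
RESIDUE_MASS = {
    "P": 97.05276,
    "E": 129.04259,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
}


def peptide_mass(sequence):
    """Neutral monoisotopic mass of the whole peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def fragment_mz(sequence, i):
    """Singly charged m/z of b_i and c_i (first i residues) and of y_(n-i) and z_(n-i) (remaining residues)."""
    prefix = sum(RESIDUE_MASS[aa] for aa in sequence[:i])
    suffix = sum(RESIDUE_MASS[aa] for aa in sequence[i:])
    b = prefix + PROTON
    y = suffix + WATER + PROTON
    c = b + AMMONIA                # c ion: b ion plus NH3
    z = y - AMMONIA + HYDROGEN     # radical z ion (assumed convention)
    return {"b": b, "y": y, "c": c, "z": z}


ions = fragment_mz("PEPTIDE", 3)
for name, mz in ions.items():
    print(name, round(mz, 4))
```

Under these conventions b + y equals the parent mass plus two protons, and c + z equals the parent mass plus two protons and one hydrogen atom, which is roughly the parent mass plus 3 Da.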
• Complementary ion pairs
  - In CID/HCD: $b_i + y_{n-i} = MH^{+} + H = M_{parent} + 2H$
  - In ECD/ETD: $c_i + z_{n-i} = M_{parent} + 3H$ and $(c_i - 1) + (z_{n-i} + 1) = M_{parent} + 3H$
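The complementary relations suggest a simple search: look for two peaks whose m/z values sum to the expected total. The Python sketch below does this for singly charged peaks. The tolerance `tol`, the function name, and the two-pointer scan (which reports at most one partner per peak) are illustrative assumptions, and the ETD target treats "3H" as two protons plus one hydrogen atom.

```python
PROTON = 1.007276    # mass of a proton (Da)
HYDROGEN = 1.007825  # monoisotopic mass of a hydrogen atom (Da)


def complementary_pairs(peaks, parent_mass, mode="HCD", tol=0.02):
    """Return (low, high) pairs of singly charged peaks whose m/z values sum to the complementary target."""
    target = parent_mass + 2 * PROTON      # b + y in CID/HCD
    if mode == "ETD":
        target += HYDROGEN                 # c + z in ECD/ETD (assumed radical z ion)
    peaks = sorted(peaks)
    pairs = []
    lo, hi = 0, len(peaks) - 1
    while lo < hi:
        total = peaks[lo] + peaks[hi]
        if abs(total - target) <= tol:
            pairs.append((peaks[lo], peaks[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


# b3 and y4 of PEPTIDE (parent mass 799.359945 Da) plus two noise peaks.
spectrum = [150.0, 324.155386, 477.219111, 600.0]
print(complementary_pairs(spectrum, 799.359945))  # [(324.155386, 477.219111)]
```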
PROPOSED METHOD
• Spectra merging
  - Peak selection (within each spectrum)
    1. Find all length-2 paths and output the middle peak
       - Relationship: amino acid mass difference
       - Ions: regular or with loss of small molecules
    2. Find complementary ion pairs and output them
       - Ions: consider different charges
  - Output peaks to form spectrum S
[Figure: a length-2 path through peaks i, j, and k, with mass differences m(aa1) and m(aa2); the middle peak is output.]
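Considering different charges usually means mapping each peak to its singly charged equivalent before comparing masses. A minimal Python sketch of that conversion follows; the function names and the idea of enumerating every charge up to the precursor charge are assumptions made for illustration.

```python
PROTON = 1.007276  # mass of a proton (Da)


def to_singly_charged(mz, charge):
    """Convert an m/z observed at the given charge to the equivalent singly charged m/z."""
    return mz * charge - (charge - 1) * PROTON


def candidate_singly_charged(mz, max_charge):
    """Enumerate the singly charged m/z implied by each possible charge from 1 to max_charge."""
    return {z: to_singly_charged(mz, z) for z in range(1, max_charge + 1)}


# A doubly charged fragment observed at m/z 162.581331 corresponds to 324.155386 at charge 1.
print(round(to_singly_charged(162.581331, 2), 6))
print(candidate_singly_charged(162.581331, 3))
```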
PROPOSED METHOD (HAVE PROBLEMS)
• De novo sequencing
  - Tags
    - Find 3-tags from each of the original spectra (denoted as 𝑆 and 𝑆)
  - Find peaks associated with a tag t in S
  - Calculate AACs separated by t (use 𝑃)
    - Threshold control if more tags are needed
  - Extend paths to form graphs with multiple edge types (GMETs) with AAC restriction
    - Ion types include b/y- and c/z-ions
    - Ion type is determined by the previous step
  - Assemble tags and partial peptides from GMETs into candidate peptides
EXPERIMENTS AND RESULTS
• Experiment
  - Data
    - Select spectra pairs having the same peptide sequences
    - Results inferred by two previously published methods, for HCD and ExD data respectively
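Selecting spectra pairs with the same peptide sequence amounts to a join on the inferred sequence. The Python sketch below shows one way to do it. The input format (lists of `(spectrum_id, sequence)` tuples) and the identifiers are hypothetical, since the slides do not describe how the identifications are stored.

```python
from collections import defaultdict


def select_pairs(hcd_ids, etd_ids):
    """Pair HCD and ETD spectra whose inferred peptide sequences are identical.

    Each argument is a list of (spectrum_id, peptide_sequence) tuples.
    """
    etd_by_seq = defaultdict(list)
    for spec_id, seq in etd_ids:
        etd_by_seq[seq].append(spec_id)

    pairs = []
    for spec_id, seq in hcd_ids:
        for etd_id in etd_by_seq.get(seq, []):
            pairs.append((spec_id, etd_id, seq))
    return pairs


hcd = [("hcd_1", "PEPTIDE"), ("hcd_2", "TAGTK")]
etd = [("etd_7", "TAGTK"), ("etd_9", "SAMPLER")]
print(select_pairs(hcd, etd))  # [('hcd_2', 'etd_7', 'TAGTK')]
```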
EXPERIMENTS AND RESULTS
cent Experiment MSMS spectra dataset
Spectra librariescent human peptide spectral library (from chemdatanistgov) of
183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides
13
Dataset of
spectraSpectrum
chargeSelected
pairs
SCX_HCD_decon 1952+2 to +6 161
SCX_ETD_decon 612SCX_HCD_no_decon 2557
+2 to +5 249SCX_ETD_no_decon 1298
EXPERIMENTS AND RESULTS
cent Results Significant score calculated using spectra libraries
14
EXPERIMENTS AND RESULTS
cent Results Full length accuracy ndash top three candidates output
15
EXPERIMENTS AND RESULTS
cent Results Accuracy comparison with different output
cent SCX_HCD_decon and SCX_ETD_decon dataset pair
cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair
16
EXPERIMENTS AND RESULTS
cent Results Computational time
17
CONCLUSIONS
cent Spectra libraries adds additional information forbetter peptide sequencing performance
cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output
cent Computation time saves up to 40 when mergedlong peptide tags were used
18
THANK YOU19
FUTURE WORK PLAN
cent Experiments Compare with more methods Use more datasets
cent Algorithm improvement Other ways of selecting signal peaks to merge spectra
cent Spectrum specific features
cent Method development Multiple spectra sequencing
20
cent Peptides Peptides are organic compounds consisting of 2 or more amino
acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide
bond and losing a molecule of water
21
BACKGROUND
Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid
Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml
cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water
(H2O)
BACKGROUND
22
cent CIDHCD spectra Common ion types and mass calculation
BACKGROUND
23
cent ECDETD spectra Common ion types and mass calculation
BACKGROUND
24
cent Complimentary ion pairs In CIDHCD
bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD
ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H
BACKGROUND
25
PROPOSED METHOD
cent Spectra merging Peak selection (within each spectrum)
1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules
2 Find complimentary ion pairs and outputcent Ions consider different charges
Output peaks to form spectra S26
k
maa2
j
maa1
i
output
PROPOSED METHOD (HAVE PROBLEMS)
• De novo sequencing
  - Tags
      • Find 3-tags from each of the original spectra (denoted as 𝑆 and 𝑆)
      • Find peaks associated with a tag t in S
      • Calculate AACs separated by t (use 𝑃)
      • Threshold control if more tags are needed
  - Extend paths to form graphs with multiple edge types (GMETs) with AAC restriction
      • Ion types including b/y- and c/z-ions
      • Ion type determined by the previous step
  - Assemble tags and partial peptides from GMETs into candidate peptides
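
Tag finding can be pictured as enumerating short paths in a spectrum graph. The sketch below is a generic illustration under simplifying assumptions (singly charged peaks, one ion series, no neutral losses, isoleucine reported as leucine because the two are isobaric); it is not the GMET model from the slides, and `edge_label` and `find_tags` are hypothetical names.

```python
# Monoisotopic residue masses in daltons; I is omitted because it is isobaric with L.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def edge_label(delta, tol):
    """Return the amino acid whose residue mass matches delta, or None."""
    for aa, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return aa
    return None


def find_tags(peaks, length=3, tol=0.01):
    """Enumerate (start m/z, tag) for every path of `length` amino acid edges."""
    peaks = sorted(peaks)
    # Spectrum graph: edge a -> b when the gap between two peaks is one residue.
    edges = {a: [] for a in range(len(peaks))}
    for a in range(len(peaks)):
        for b in range(a + 1, len(peaks)):
            aa = edge_label(peaks[b] - peaks[a], tol)
            if aa is not None:
                edges[a].append((b, aa))

    tags = []

    def extend(node, start, letters):
        # Depth-first extension until the tag has the requested length.
        if len(letters) == length:
            tags.append((peaks[start], "".join(letters)))
            return
        for nxt, aa in edges[node]:
            extend(nxt, start, letters + [aa])

    for a in range(len(peaks)):
        extend(a, a, [])
    return tags


# Ladder 100.0 +A +L +S +K, plus one noise peak at 450.0.
ladder = [100.0, 171.03711, 284.12117, 371.1532, 499.24816, 450.0]
print(find_tags(ladder))
```

Tags are read in order of increasing m/z, so a ladder of C-terminal ions would spell the sequence in reverse; resolving the ion type is a separate step in the outline above.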
EXPERIMENTS AND RESULTS
• Experiment data
  - Select spectra pairs having the same peptide sequence
  - Results inferred by two previously published methods, for HCD and ExD data respectively
• Results
  - Significance score calculated using spectra libraries
  - Full-length accuracy (top three candidates output)
  - Accuracy comparison with different outputs
      • SCX_HCD_decon and SCX_ETD_decon dataset pair
      • SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair
  - Computational time