A Multi-PCA Approach to Glycan Biomarker Discovery using Mass Spectrometry Profile Data
Anoop Mayampurath, Chuan-Yih Yu
Info-690 (Glycoinformatics) Final Project Presentation
Background
[1] Kyselova et al. "Alterations in the Serum Glycome Due to Metastatic Prostate Cancer." Journal of Proteome Research, 2007, 6:1822-1832
[2] Tang et al. "Identification of N-Glycan Serum Markers Associated with Hepatocellular Carcinoma from Mass Spectrometry Data." Journal of Proteome Research, 2009, Article ASAP
[3] Ressom et al. "Analysis of MALDI-TOF Mass Spectrometry Data for Discovery of Peptide and Glycan Biomarkers of Hepatocellular Carcinoma." Journal of Proteome Research, 2008, 7:603
Objective
• Given a set of N mass spectra (disease and healthy), develop an algorithm that identifies "significant" spectra and glycan peaks
▫ From the significant glycan peaks
  - Nature of regulation between disease and healthy
  - Study of effects such as fucosylation and linkage
▫ From the significant spectra
  - A smaller set of m << N spectra that help in analysis
  - Glycan annotation
  - Check for overlapping glycans
• What is meant by "significant"?
▫ Elements that exhibit coherent patterns and large variation between disease and healthy
• Datasets
▫ 151 MALDI-TOF mass spectra: 73 cancer, 78 normal
• Details
▫ Background subtraction
▫ Peak picking
▫ Identification of common glycans across all 151 spectra
▫ Filtering using a fit coefficient cutoff of 0.5: if 30% of spectra have a glycan fit coefficient greater than 0.5, the glycan is retained
• An N × p matrix X is obtained (N: number of glycans, p: number of spectra)
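The filtering step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_glycans`, the array layout (glycans in rows, spectra in columns), and reading "30% of spectra" as "at least 30%" are all assumptions.

```python
import numpy as np

def filter_glycans(fit, cutoff=0.5, min_fraction=0.3):
    """Return a boolean mask of glycans to retain.

    fit: (N glycans x p spectra) array of fit coefficients (assumed layout).
    A glycan is kept when the fraction of spectra whose fit coefficient
    exceeds `cutoff` is at least `min_fraction` (assumed reading of the rule).
    """
    fit = np.asarray(fit, dtype=float)
    # Fraction of spectra, per glycan, with a fit coefficient above the cutoff
    frac_above = (fit > cutoff).mean(axis=1)
    return frac_above >= min_fraction

# Hypothetical toy values: 3 glycans measured in 4 spectra
fit = [[0.9, 0.8, 0.1, 0.2],   # 50% of spectra above 0.5
       [0.1, 0.2, 0.3, 0.9],   # 25% of spectra above 0.5
       [0.6, 0.6, 0.6, 0.6]]   # 100% of spectra above 0.5
mask = filter_glycans(fit)
print(mask.tolist())  # [True, False, True]
```

Rows of the intensity matrix where the mask is true would then form the matrix X used in the next step.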
Multi-PCA algorithm
• Perform PCA
• Compute the inner product of each glycan with the leading principal component, as in gene shaving [4]
• Sort glycans by inner product (which measures correlation)
• Shave off the 10% of glycans with the lowest inner product score
• Repeat
[4] Hastie et al. "'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns." Genome Biology, 2000, 1(2):1-21
Multi-PCA Algorithm
[Flow diagram: X → PCA → leading principal component → inner product with X → sort by inner product, shave off 10% of glycans → reduced matrix X' → repeat with X']
- The algorithm was iterated until 10 glycans remained. These glycans are expected to change coherently in intensity while showing high variance between cancer and normal spectra.
- We also switched dimensions to shave off spectra. The algorithm was iterated until 6 spectra remained.
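The shaving loop described above might look like the sketch below. Several details are assumptions rather than the presenters' implementation: the function name `shave`, centering each row before PCA, computing the leading principal component via SVD, and scoring by the absolute inner product (as in gene shaving [4]). The toy data are hypothetical.

```python
import numpy as np

def shave(X, target_rows, fraction=0.10):
    """Iteratively drop the rows least aligned with the leading principal component.

    X: (rows x columns) matrix; rows are the items being shaved.
    Returns the sorted indices of the rows that survive.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(X.shape[0])
    while keep.size > target_rows:
        sub = X[keep]
        # Center each row across columns (assumed preprocessing)
        centered = sub - sub.mean(axis=1, keepdims=True)
        # Leading principal component = first right singular vector
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc1 = vt[0]
        # Score each row by its absolute inner product with the component
        scores = np.abs(centered @ pc1)
        # Drop about `fraction` of the rows, at least one, never past the target
        n_drop = max(1, int(np.floor(fraction * keep.size)))
        n_drop = min(n_drop, keep.size - target_rows)
        order = np.argsort(scores)  # ascending: lowest scores first
        keep = np.sort(keep[order[n_drop:]])
    return keep

# Hypothetical toy data: 20 glycans x 8 spectra; the first 5 glycans
# differ strongly between two groups of 4 spectra, the rest are noise.
rng = np.random.default_rng(0)
pattern = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
X = rng.normal(scale=0.1, size=(20, 8))
X[:5] += 10 * pattern

glycans_kept = shave(X, target_rows=5)    # shave glycans (rows of X)
spectra_kept = shave(X.T, target_rows=6)  # switch dimensions to shave spectra
```

On the real data the same function would be called with `target_rows=10` for glycans and, on the transposed matrix, `target_rows=6` for spectra.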
Future Directions
• Fragmentation of glycans to study the effect of linkage among glycans
• Glycan microarray
• More detail on overlapping glycans (substitute single score by combined score)
• Orthogonalize the data to see other patterns