DESCRIPTION
Support for Systems Biology Data in IRD/ViPR - Proteomics. Richard H. Scheuermann, Ph.D. November 5, 2012. Projects with Host Factor Data. Four systems biology groups funded by NIAID, including: Systems Virology (Michael Katze group, Univ. Washington)
Richard H. Scheuermann, Ph.D.
November 5, 2012
Support for Systems Biology Data in IRD/ViPR - Proteomics
Projects with Host Factor Data
• Four systems biology groups funded by NIAID, including:
  – Systems Virology (Michael Katze group, Univ. Washington)
    • Influenza H1N1 and H5N1 and SARS Coronavirus
    • statistical models, algorithms and software, raw and processed gene expression data, and proteomics data
  – Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed)
    • Various influenza viruses
    • microarray, mass spectrometry, and lipidomics data
• ViPR Driving Biological Projects
  – Abraham Brass, Mass. General Hospital
    • Dengue virus host factor database from RNAi screen
  – Lynn Enquist / Moriah Szpara, Princeton University
    • Deep sequencing and neuronal microarrays for functional genomic analysis of Herpes Simplex Virus
  – Richard Kuhn, Purdue University
    • Metabolomics data of Dengue virus infection of human cells and mosquitos
  – Mike Diamond, Washington University
    • Identification of inhibitory interferon-stimulated genes against flaviviruses and noroviruses using shRNA knockdown
    • Determine the mechanism of action of individual inhibitory ISGs
Strategy for Handling “Omics” Data
• “Omics” data management (MIBBI vs MIBBI-DB)
  – Project metadata (1 template)
    • Title, PI, abstract, publications
  – Experiment metadata (~6 templates)
    • Biosamples, treatments, reagents, protocols, subjects
  – Primary results data
    • Raw expression values
  – Data processing metadata (1 template)
    • Normalization and summarization methods
  – Processed data
    • Data matrix of fold changes and p-values
  – Data interpretation metadata (1 template)
    • Fold change and p-value cutoffs used
  – Interpreted results (Host factor biosets)
    • Interesting gene, protein and metabolite lists
• Visualize biosets in context of biological pathways and networks
• Statistical analysis of pathway/sub-network overrepresentation
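The slides do not say which statistic is used for pathway/sub-network overrepresentation. As an illustration only, a common choice is a one-sided hypergeometric test, sketched below. The function names, the gene identifiers, and the choice of test are assumptions and are not taken from IRD/ViPR.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def overrepresentation(bioset, pathway, background):
    """Return (overlap size, one-sided p-value) for a bioset against one pathway.

    Hypothetical helper: genes outside the background are ignored so that
    all counts refer to the same universe.
    """
    background = set(background)
    bioset = set(bioset) & background
    pathway = set(pathway) & background
    overlap = bioset & pathway
    p_value = hypergeom_sf(len(overlap), len(background), len(pathway), len(bioset))
    return len(overlap), p_value


if __name__ == "__main__":
    # Toy universe of 10 genes; the bioset is exactly the 3-gene pathway.
    universe = [f"g{i}" for i in range(10)]
    print(overrepresentation(["g0", "g1", "g2"], ["g0", "g1", "g2"], universe))
```

In practice the p-values from many pathways would need a multiple-testing correction, which this sketch leaves out.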
Data Submission Workflows
[Workflow diagram. Labels: Systems Biology sites; Study metadata; Experiment metadata; Primary results; Analysis metadata; Processed data matrix; Free text metadata; Host factor bioset; GEO/PRIDE/PNNL/SRA/MetaboLights; ViPR/IRD/PATRIC. Arrows labeled "submission" and "pointer".]
Metadata Submission Template Examples
Host Factor Data
8 Studies To Date
Host Factor Bioset
Transcriptomics => Proteomics
• Metadata fields are largely re-usable, with some exceptions
  – Exp_sample_template (protein).xls
• Results data differences
  – Peptide-level and protein-level
    • IM005_Peptide_normalization_matrix.V2.xlsx
    • IM005_Protein Normalization matrix.xlsx
  – Statistical measures
    • Results_matrix_ IM005_sig Protein_RM.xlsx
Metadata Field Changes
• GEO GSM ID => Primary Data Archive + Primary Data Archive ID
• Semi-structured Experiment Variable to Structured Experiment Variable
  – Free text (1 day) => value unit pairs in separate fields (1/day; 10^4/plaque forming units)
• Multiple processed data matrix files
  – Concatenated IDs separated by (; |)
• Reagents and protocols are different but should not require submission template changes
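As a rough sketch of the two field changes above, the snippet below splits a free-text experiment variable into a value and a unit, and splits a concatenated ID field into individual IDs. The function names and the regular expression are hypothetical. The sketch also assumes that either ";" or "|" can act as the separator, which the slide does not state explicitly.

```python
import re

# Leading number (optionally decimal, optionally with a ^exponent), then the unit text.
_VALUE_UNIT = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?(?:\^[0-9]+)?)\s*(.+?)\s*$")


def split_value_unit(text):
    """Split free text such as '1 day' into a ('1', 'day') value/unit pair."""
    match = _VALUE_UNIT.match(text)
    if match is None:
        raise ValueError(f"cannot split into value and unit: {text!r}")
    return match.group(1), match.group(2)


def split_ids(field):
    """Split a concatenated ID field on ';' or '|' and drop empty pieces."""
    return [part.strip() for part in re.split(r"[;|]", field) if part.strip()]


if __name__ == "__main__":
    print(split_value_unit("1 day"))
    print(split_value_unit("10^4 plaque forming units"))
    print(split_ids("matrix_A.xlsx; matrix_B.xlsx"))
```

The value is kept as a string so that notations like 10^4 are stored exactly as submitted.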
Normalized Data
• Archive at BRC (standard format?)
  – Peptide normalized data
  – Protein normalized data
  – Results matrix of significant proteins
• BRCs derive bioset lists from results matrix
  – Handling different significance measures
    • t-test flag, t-test p-value, g-test flag, g-test p-value, log10 ratio
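One way a bioset list could be derived from a results matrix is sketched below. This is not the BRC implementation. The column names (protein_id, ttest_pvalue, gtest_pvalue, log10_ratio) and the default cutoffs are assumptions made for the example; in a real submission the cutoffs would come from the data interpretation metadata.

```python
def derive_bioset(rows, p_cutoff=0.05, min_abs_log10_ratio=0.0,
                  p_column="ttest_pvalue"):
    """Return protein IDs that pass a p-value cutoff and a log10 ratio cutoff.

    rows is a list of dicts, one per protein (for example from csv.DictReader).
    Rows with a missing value in either column are skipped.
    """
    bioset = []
    for row in rows:
        p_value = row.get(p_column)
        ratio = row.get("log10_ratio")
        if p_value in (None, "") or ratio in (None, ""):
            continue
        if float(p_value) <= p_cutoff and abs(float(ratio)) >= min_abs_log10_ratio:
            bioset.append(row["protein_id"])
    return bioset


if __name__ == "__main__":
    # Hypothetical rows; values are invented for illustration only.
    example = [
        {"protein_id": "P1", "ttest_pvalue": "0.01", "log10_ratio": "0.5"},
        {"protein_id": "P2", "ttest_pvalue": "0.20", "log10_ratio": "0.9"},
        {"protein_id": "P3", "ttest_pvalue": "0.03", "log10_ratio": "-0.4"},
    ]
    print(derive_bioset(example, p_cutoff=0.05, min_abs_log10_ratio=0.3))
```

Passing p_column="gtest_pvalue" applies the same rule to the g-test measure, so one function can cover matrices that report different significance measures.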
Host Factor Bioset
On Deck
• Metabolomics and lipidomics data
• Integration of RNA expression, protein abundance and metabolite abundance
• Pathway/network visualization and analysis
Acknowledgement
• Lynn Law, U. Washington
• Richard Green, U. Washington
• Peter Askovich, Seattle Biomed
• Brett Pickett, U.T. Southwestern/JCVI
• Jyothi Noronha, U.T. Southwestern
• Eva Sadat, U.T. Southwestern
• Entire Systems Biology Data Dissemination Task Force, especially Jeremy Zucker
• NIAID (Alison Yao and Valentina DiFrancesco)
Future Development Plans
[Diagram labels: GO enrichment; Network visualization]