
Vertex labels swapping

Edges swapping

Pathway activity levels with ratio

Abstract

Metabolic pathway activity estimation from RNA-Seq data

Yvette Temate-Tiagueu, Qiong Cheng, Meril Mathew, Igor Mandric, Olga Glebova, Nicole Beth Lopanik, Ion Mandoiu and Alex Zelikovsky

Department of Computer Science, Department of Biology, Georgia State University

Computer Science and Engineering, University of Connecticut

Our Contribution

Using KEGG, a database resource for understanding high-level functions and utilities of the biological system from molecular-level information [Kanehisa M. and Goto S., 2000]:

(1) A novel graph-based approach to analyze pathway significance

(2) Representing a pathway as a set and inferring activity from the information extracted from those sets

(3) Validating the two approaches through differential expression analysis at the transcript and gene levels, and through a qPCR experiment

Objectives

Methods

Results


Our experimental studies on Bugula neritina RNA-seq data (with versus without the mutualistic symbiont) show that, by analyzing metabolic pathways with our tool XPathway, we can effectively locate pathways whose activity levels differ significantly. This result is being validated through qPCR.

This project is supported in part by the Molecular Basis of Disease fellowship of GSU.

Conclusions and Future Work

The application of RNA-Seq has enabled a variety of differential analysis studies, including differential expression of pathways. A standard approach to studying the metabolic differences between species is metabolic pathway analysis. In this study, we introduce a novel approach to characterize the pathway activity levels of two samples. We present XPathway, a set of pathway activity analysis tools based on KEGG-KAAS mapping of proteins to pathways. We applied the proposed methods to Bugula neritina RNA-Seq metagenomics data. We successfully identified several pathways with differential activity levels using our novel computational approaches implemented in XPathway. Further validation of the initial results is being conducted through qPCR.

- Develop efficient algorithms for reliable estimation of pathway activity levels
- Identify pathways whose activities differ significantly between two conditions

Validation

Experimental studies: Bugula neritina

In the United States, three sibling species:

1. Deep-water (West coast of United States)

2. Shallow-water (West and Southern East coasts)

3. Northern Atlantic (Northern East coast)

Illumina paired-end sequence reads:

Sample 1: Bugula with symbiont

Sample 2: Bugula without symbiont

- 50 bp paired-end reads
- 200 bp mean fragment length
- Assembly into contigs by Trinity
- BLAST against the Swiss-Prot database


Topology-based estimation of pathway significance

EM-based estimation of pathway activity

Selected pathways for qPCR validation

qPCR

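Model 1: permutation of labels (figure: a graph on vertices a to e, shown before and after its vertex labels are swapped)

Model 2: permutation of edges (figure: a graph on vertices a to d, shown before and after its edges are swapped)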

Workflow (figure): RNA-seq reads (2 samples) → Trinity → contigs → proteins (e.g. MAFSAEDVLK, EYDRRMEAL) → BLAST → KEGG, SEED ortholog groups (K00161, K00162, K00163) → graph-based pathway significance and binary EM pathway activity → differentially expressed pathways → experimental validation. Side branch: contigs, IsoDE, validation.

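For a pathway $w$ with threshold $T_w$:

Activity level of pathway $w$, where $g_w$ is the term contributed by gene $g$ of $w$:

\[ f_w = \sum_{g \in w} g_w \]

Binary activity status of $w$:

\[ \delta(w) = \begin{cases} 1, & \text{if } f_w \ge T_w \\ 0, & \text{if } f_w < T_w \end{cases} \]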
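The thresholding rule above can be illustrated with a minimal sketch. The function names, the dictionary layout, and the numeric values are hypothetical and chosen only for illustration; the poster does not specify how the per-gene terms or the threshold are obtained.

```python
from typing import Mapping


def pathway_activity(gene_terms: Mapping[str, float]) -> float:
    """f_w: sum of the per-gene terms g_w over all genes g in pathway w."""
    return sum(gene_terms.values())


def binary_status(gene_terms: Mapping[str, float], threshold: float) -> int:
    """delta(w): 1 if f_w >= T_w, otherwise 0."""
    return 1 if pathway_activity(gene_terms) >= threshold else 0


# Hypothetical per-gene terms for one pathway, keyed by ortholog group.
example = {"K00161": 0.5, "K00162": 0.25, "K00163": 0.0}

print(pathway_activity(example))     # activity level f_w
print(binary_status(example, 0.5))   # status for an assumed threshold T_w = 0.5
```

Keeping the activity level and the binary status as separate functions mirrors the two quantities defined above, so the threshold can be varied without recomputing the per-gene terms.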

Bootstrapping:
- Repeat 1000 times:
  1. Randomly switch edges
  2. Compute density of the largest component
- Sort with respect to density
- Find the rank of the observed induced subgraph
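The bootstrapping steps above can be sketched as follows. This is an illustration under stated assumptions, not the XPathway implementation: edges are taken to be unique directed pairs, "switching" is read as a degree-preserving swap of edge endpoints, components are weakly connected, the density is M/(N-1), and the rank is reported as an empirical p-value. All function names are hypothetical.

```python
import random
from collections import defaultdict


def largest_component_density(edges, active):
    """Density M/(N-1) of the largest weakly connected component of the
    subgraph induced by the `active` nodes (0.0 for a single node)."""
    induced = [(u, v) for u, v in edges if u in active and v in active]
    neighbours = defaultdict(set)
    for u, v in induced:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, best = set(), (0, 0.0)  # (component size, density)
    for start in active:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for nxt in neighbours[stack.pop()]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        n = len(component)
        m = sum(1 for u, _ in induced if u in component)
        best = max(best, (n, m / (n - 1) if n > 1 else 0.0))
    return best[1]


def switch_edges(edges, rng, attempts=None):
    """Assumed randomisation: pick two edges (a, b), (c, d) and rewire them
    to (a, d), (c, b), skipping swaps that create self-loops or duplicates."""
    edge_list = list(edges)
    if len(edge_list) < 2:
        return edge_list
    edge_set = set(edge_list)
    for _ in range(attempts if attempts is not None else 10 * len(edge_list)):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def empirical_p_value(edges, active, n_perm=1000, seed=0):
    """Rank the observed density among densities of edge-switched graphs."""
    rng = random.Random(seed)
    observed = largest_component_density(edges, active)
    null = [largest_component_density(switch_edges(edges, rng), active)
            for _ in range(n_perm)]
    at_least = sum(1 for d in null if d >= observed)
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical toy pathway graph and set of mapped ("active") nodes.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e"), ("e", "a")]
print(empirical_p_value(toy_edges, {"a", "b", "c", "d"}, n_perm=1000))
```

The swap keeps every node's in-degree and out-degree unchanged, so the null graphs differ from the observed pathway only in how the edges are wired.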

Pathway   L1    L2    Prob_Diff_Significance
ko04146   99%   5%    0.94
ko03008   99%   5%    0.94
ko03013   99%   5%    0.94
ko00983   99%   5%    0.94
ko04530   99%   5%    0.94
ko00062   1%    75%   0.74
ko00400   1%    99%   0.98
ko00071   99%   1%    0.98
ko00100   99%   1%    0.98
ko00910   4%    99%   0.95
ko04122   99%   3%    0.97
ko04713   99%   1%    0.99

Model 1: p-value

Pathway   L1    L2    Prob_Diff_Significance
ko04146   99%   5%    0.94
ko03008   99%   5%    0.94
ko03013   99%   5%    0.94
ko00983   99%   5%    0.94
ko04530   99%   5%    0.94
ko00130   99%   2%    0.97
ko00120   4%    58%   0.55
ko00072   1%    99%   0.98
ko00120   4%    58%   0.55
ko00400   1%    99%   0.98
ko00230   99%   5%    0.94
ko00627   1%    99%   0.99
ko00770   3%    99%   0.97
ko00980   99%   1%    0.99
ko04122   99%   1%    0.98
ko04630   99%   4%    0.96
ko04713   99%   4%    0.96

Model 2: p-value

Highest_Diff_Activity_Level   Expression1   Expression2   Diff_Express
ko04068                       23.83         19.77         1.21
ko04145                       17.35         25.78         0.67
ko04610                       9.83          6.83          1.44
ko00051                       13.06         9.34          1.40
ko00740                       7.83          5.83          1.34
ko01230                       30.38         23.81         1.28
ko04020                       17.75         23.72         0.75
ko05012                       25.71         20.07         1.28
ko00983                       8.63          12.20         0.71
ko05034                       17.83         14.30         1.25
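The Diff_Express values in the table are consistent with the ratio Expression1 / Expression2 rounded to two decimals. That reading is an assumption drawn from the numbers, not a stated definition. A minimal sketch with a hypothetical function name, checked against the first three rows:

```python
def activity_ratio(expression1: float, expression2: float) -> float:
    """Ratio of the pathway activity level in sample 1 to that in sample 2."""
    if expression2 == 0:
        raise ValueError("expression2 must be non-zero")
    return expression1 / expression2


# First three rows of the table: pathway -> (Expression1, Expression2).
rows = {
    "ko04068": (23.83, 19.77),
    "ko04145": (17.35, 25.78),
    "ko04610": (9.83, 6.83),
}

for pathway, (e1, e2) in rows.items():
    print(pathway, round(activity_ratio(e1, e2), 2))
```

A ratio above 1 indicates higher activity in sample 1, and a ratio below 1 indicates higher activity in sample 2.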

For gene expression analyses:

- Select pathways with significantly different activity

- Select DE transcripts from these pathways

- Select the genes from these transcripts

- Primers are created to test genes per condition

Preliminary results

More primers ordered

Pathway   #Mapped contigs   DE contigs   Ratio of DE   Pathway name
ko00062   14                3            21.43%        Fatty acid elongation
ko00100   8                 1            12.50%        Steroid biosynthesis
ko00250   39                4            10.26%        Alanine, aspartate and glutamate metabolism
ko04146   98                15           15.31%        Peroxisome
ko03008   67                10           14.93%        Ribosome biogenesis in eukaryotes
ko03013   148               22           14.86%        RNA transport
ko00983   28                4            14.29%        Drug metabolism - other enzymes
ko04530   237               15           6.33%         Tight junction


In induced graph:
- # nodes N
- # edges M
- # green connected components
- # 0 in- & out-degrees
- Density of the induced graph: M/(N-1)
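The quantities listed above can be computed from an edge list as in the sketch below. It assumes a directed pathway graph, treats "green" as the set of mapped nodes, counts weakly connected components, and reports zero in-degree and zero out-degree nodes separately because the poster does not say which reading is intended. The function name and the toy graph are hypothetical.

```python
def induced_graph_stats(edges, green):
    """Summary statistics of the subgraph induced by the `green` nodes."""
    induced = [(u, v) for u, v in edges if u in green and v in green]
    in_deg = {n: 0 for n in green}
    out_deg = {n: 0 for n in green}
    neighbours = {n: set() for n in green}
    for u, v in induced:
        out_deg[u] += 1
        in_deg[v] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Count weakly connected components with a depth-first search.
    seen, components = set(), 0
    for start in green:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            for nxt in neighbours[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)

    n, m = len(green), len(induced)
    return {
        "nodes": n,
        "edges": m,
        "components": components,
        "zero_in_degree": sum(1 for d in in_deg.values() if d == 0),
        "zero_out_degree": sum(1 for d in out_deg.values() if d == 0),
        "density": m / (n - 1) if n > 1 else 0.0,  # M/(N-1)
    }


# Hypothetical toy pathway: node "e" is not mapped, so edge d -> e is dropped.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(induced_graph_stats(toy_edges, {"a", "b", "c", "d"}))
```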
