

Alex Zelikovsky

Department of Computer Science

Georgia State University

Joint work with Adrian Caciula (GSU), Serghei Mangul (UCLA), James Lindsay, Ion Mandoiu (UCONN)

Monte-Carlo Regression Algorithm for Isoform Frequency Estimation from RNA-Seq Data

IEEE ICCABS 2013, New Orleans, LA


Outline

• RNA-Seq: Introduction

• MCReg: Monte Carlo Regression based Algorithm

• Experimental Results

• Conclusions and Future Work


Genome-Guided RNA-Seq Protocol

RNA-Seq enables transcript-level resolution of gene expression

From RNA – through the process of hybridization-

Make cDNA & shatter into fragments

Sequence fragment ends

Map reads to genome

[Figure: exon segments A B C D E on the genome; panels labeled Gene Expression (GE), Isoform Expression (IE), and Isoform Discovery (ID), with exon groupings A B C, A C, and D E]

[Nicolae et al., 11]


Outline

• RNA-Seq: Introduction

• MCReg: Monte Carlo Regression based Algorithm

    - Observed Read Distribution

    - MC-Based Estimation of Expected Read Distribution

    - Regression-Based Estimation of Isoform Frequencies

• Experimental Results

• Conclusions and Future Work


MCReg: Monte-Carlo Regression

MCReg Motivation:

• Reducing the error rate is critical for detecting similar transcripts, especially when one is a subset of another.

[Figure: screenshot from a genome browser]


General Method Overview

• Map paired-end reads onto the library of known isoforms using an ungapped aligner (e.g., Bowtie)

    - B. Langmead, C. Trapnell, et al., “Ultrafast and memory-efficient alignment of short DNA sequences to the human genome,” Genome Biology, vol. 10, no. 3, p. R25, 2009.

• Group reads that have been mapped to the same set of transcripts into classes

• Estimate the expected read distribution by Monte Carlo simulation, e.g., using the Grinder simulator

    - F. E. Angly et al., “Grinder: a versatile amplicon and shotgun sequence simulator,” Nucleic Acids Research, 2012.

• Solve the regression: the least-squares formulation can be solved with a constrained quadratic programming solver

    - M. S. Andersen et al., “CVXOPT: A Python package for convex optimization,” available at cvxopt.org, 2013.


Observed Read Distribution
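
A minimal sketch of how an observed read distribution could be tabulated, assuming each mapped read is summarized by the set of transcripts it aligns to. The function name `observed_read_distribution` and the toy read and transcript ids are hypothetical; this is an illustration under those assumptions, not the MCReg implementation.

```python
from collections import Counter


def observed_read_distribution(read_alignments):
    """Return {read class: fraction of mapped reads}.

    read_alignments maps a read id to the transcript ids it aligns to
    (hypothetical input format). A read class is the set of transcripts
    that a read is compatible with.
    """
    counts = Counter(
        frozenset(transcripts)
        for transcripts in read_alignments.values()
        if transcripts  # skip reads that mapped to no transcript
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Toy data: T2 is a sub-transcript of T1, so many reads are ambiguous.
alignments = {
    "r1": ["T1", "T2"],
    "r2": ["T2", "T1"],
    "r3": ["T1"],
    "r4": ["T2"],
    "r5": [],
}
observed = observed_read_distribution(alignments)
```

Reads r1 and r2 land in the same class because the class is keyed by the set of transcripts, not by the order in which the aligner reported them.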


Monte-Carlo-Based Estimation of Expected Read Distribution


MC-Based Estimation of Expected Read Distribution
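
A toy sketch of the Monte Carlo idea: simulate reads from each isoform separately and record how often they fall into each read class. It does not use Grinder; a simplified stand-in simulator draws one exon per read with probability proportional to exon length. The isoform names, exon lengths, and the function `simulate_class_probabilities` are all hypothetical.

```python
import random
from collections import Counter


def simulate_class_probabilities(isoforms, exon_lengths, n_reads=10000, seed=0):
    """Estimate, per isoform, the probability of each read class.

    isoforms: {isoform name: list of exon labels} (hypothetical format).
    exon_lengths: {exon label: length}.
    Simplifying assumption: each simulated read lies inside one exon.
    """
    rng = random.Random(seed)
    probabilities = {}
    for name, exons in isoforms.items():
        weights = [exon_lengths[e] for e in exons]
        counts = Counter()
        for _ in range(n_reads):
            # Pick the exon the read starts in, proportional to its length.
            exon = rng.choices(exons, weights=weights, k=1)[0]
            # The read class is every isoform that contains this exon.
            cls = frozenset(t for t, ex in isoforms.items() if exon in ex)
            counts[cls] += 1
        probabilities[name] = {cls: n / n_reads for cls, n in counts.items()}
    return probabilities


isoforms = {"T1": ["A", "B", "C"], "T2": ["A", "C"]}
exon_lengths = {"A": 100, "B": 200, "C": 100}
expected = simulate_class_probabilities(isoforms, exon_lengths)
```

In this toy setup every read from T2 is also compatible with T1, while roughly half of the reads from T1 (those in exon B) are unique to T1.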


Regression-Based Estimation of Isoform Frequencies


Regression-Based Estimation of Isoform Frequencies
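
A minimal sketch of the regression step as constrained least squares: find frequencies f with f >= 0 and sum(f) = 1 that minimize ||P f - o||^2, where P holds the expected class probabilities and o the observed class fractions. To stay dependency-free, this sketch swaps in projected gradient descent in place of a quadratic programming solver such as CVXOPT. The names `fit_frequencies` and `project_to_simplex` and the toy numbers are hypothetical, and f here is the fraction of reads per isoform (no length normalization).

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {f : f >= 0, sum(f) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the largest valid j
    return [max(x - theta, 0.0) for x in v]


def fit_frequencies(P, o, iterations=5000):
    """Projected gradient descent for min ||P f - o||^2 on the simplex.

    P[c][i]: probability that a read from isoform i falls in class c.
    o[c]: observed fraction of reads in class c.
    """
    n_classes, n_iso = len(P), len(P[0])
    # 1 / ||P||_F^2 is a safe step size (upper bound on the curvature).
    step = 1.0 / sum(x * x for row in P for x in row)
    f = [1.0 / n_iso] * n_iso
    for _ in range(iterations):
        residual = [
            sum(P[c][i] * f[i] for i in range(n_iso)) - o[c]
            for c in range(n_classes)
        ]
        grad = [
            sum(P[c][i] * residual[c] for c in range(n_classes))
            for i in range(n_iso)
        ]
        f = project_to_simplex([f[i] - step * grad[i] for i in range(n_iso)])
    return f


# Rows: class {T1}, class {T1, T2}. Columns: isoforms T1, T2.
P = [[0.5, 0.0],
     [0.5, 1.0]]
o = [0.2, 0.8]
f = fit_frequencies(P, o)  # approximately [0.4, 0.6]
```

The unique-to-T1 class pins down the T1 share, so the ambiguous class no longer pulls the estimate toward the super-transcript.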


Outline

• RNA-Seq: Introduction

• MCReg: Monte Carlo Regression based Algorithm

• Experimental Results

• Conclusions and Future Work


Simulation Setup


Experimental Results

• Frequency estimation accuracy was assessed using the coefficient of determination r².

• For IsoEM r² = 0.92, while for MCReg r² = 0.97.

• MCReg correlates better with the true frequencies than IsoEM, mainly because of sub-transcript cases, where IsoEM skewed the estimated frequency toward super-transcripts.
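
A short sketch of one common way to compute r², assuming it is taken as the squared Pearson correlation between true and estimated frequencies (the slides do not state which definition was used). The function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(true_freqs, estimated_freqs):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(true_freqs)
    mean_t = sum(true_freqs) / n
    mean_e = sum(estimated_freqs) / n
    cov = sum(
        (t - mean_t) * (e - mean_e)
        for t, e in zip(true_freqs, estimated_freqs)
    )
    var_t = sum((t - mean_t) ** 2 for t in true_freqs)
    var_e = sum((e - mean_e) ** 2 for e in estimated_freqs)
    return cov * cov / (var_t * var_e)


# Estimates that are a perfect linear function of the truth give r² of 1.
perfect = r_squared([0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8])
# Swapping two of three estimates lowers the score.
shuffled = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```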


Thanks!
