Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
Performed by the Data Management and Resource Repository
(DMRR)
ERCC Data Analysis
1Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260.
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 2
The goal of this use case is to show that the Genboree Workbench and the exceRpt small RNA-seq analysis pipeline can replicate the results of Williams et al. We have developed these pipelines because as the Extracellular RNA Communication Consortium (ERCC) begins to generate more datasets, it is vital that they be analyzed in a reproducible and comparable way.
The dataset from Williams et al is freely available in the Short Read Archive (SRA), so it makes a good practice dataset for becoming familiar with the Workbench and exceRpt.
Biological Question of Interest
3
This work from the Tuschl lab at Rockefeller University is mainly a technical study. They examined extracellular RNA from the bloodstream of mothers and fathers, their newborn babies, and the placenta. They were interested in quantitating how much exRNA could be found, and whether it would be possible to detect biomarker RNAs at that level. The placenta acted as a model of a tumor; their thinking was that if detecting exRNAs from placenta is feasible, the same should be true of tumors.
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260.
Biological Samples to Be Analyzed
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
4
UniqueID
SampleName SRAID SampleDescription
31 C01 SRR822433 Non-pregnantwomanplasmaSample#132 C02 SRR822461 Non-pregnantwomanplasmaSample#233 C03 SRR822434 Non-pregnantwomanserumSample#334 C04 SRR822435 Non-pregnantwomanserumSample#435 C05 SRR822436 Non-pregnantwomanserumSample#518 F01 SRR822466 Father'sbloodfromSet#119 F06 SRR822441 Father'sbloodfromSet#620 F08 SRR822442 Father'sbloodfromSet#821 F10 SRR822443 Father'sbloodfromSet#1022 F12 SRR822440 Father'sbloodfromSet#121 M01 SRR822467 Mother'sbloodfromSet#12 M02 SRR822468 Mother'sbloodfromSet#23 M03 SRR822469 Mother'sbloodfromSet#34 M06 SRR822445 Mother'sbloodfromSet#65 M08 SRR822446 Mother'sbloodfromSet#86 M10 SRR822447 Mother'sbloodfromSet#107 M12 SRR822448 Mother'sbloodfromSet#1223 P05 SRR822452 PlacentafromSet#524 P06 SRR822453 PlacentafromSet#625 P07 SRR822454 PlacentafromSet#726 P08 SRR822455 PlacentafromSet#827 P09 SRR822456 PlacentafromSet#928 P10 SRR822457 PlacentafromSet#1029 P11 SRR822458 PlacentafromSet#1130 P12 SRR822459 PlacentafromSet#128 B01 SRR822462 UmbilicalcordbloodfromSet#19 B02 SRR822464 UmbilicalcordbloodfromSet#210 B03 SRR822464 UmbilicalcordbloodfromSet#311 B04 SRR822465 UmbilicalcordbloodfromSet#412 B06 SRR822437 UmbilicalcordbloodfromSet#613 B07 SRR822450 UmbilicalcordbloodfromSet#714 B08 SRR822438 UmbilicalcordbloodfromSet#815 B10 SRR822439 UmbilicalcordbloodfromSet#1016 B11 SRR822451 UmbilicalcordbloodfromSet#1117 B12 SRR822444 UmbilicalcordbloodfromSet#12
UniqueID
SampleName SRAID SampleDescription
31 C01 SRR822433 Non-pregnantwomanplasmaSample#132 C02 SRR822461 Non-pregnantwomanplasmaSample#233 C03 SRR822434 Non-pregnantwomanserumSample#334 C04 SRR822435 Non-pregnantwomanserumSample#435 C05 SRR822436 Non-pregnantwomanserumSample#518 F01 SRR822466 Father'sbloodfromSet#119 F06 SRR822441 Father'sbloodfromSet#620 F08 SRR822442 Father'sbloodfromSet#821 F10 SRR822443 Father'sbloodfromSet#1022 F12 SRR822440 Father'sbloodfromSet#121 M01 SRR822467 Mother'sbloodfromSet#12 M02 SRR822468 Mother'sbloodfromSet#23 M03 SRR822469 Mother'sbloodfromSet#34 M06 SRR822445 Mother'sbloodfromSet#65 M08 SRR822446 Mother'sbloodfromSet#86 M10 SRR822447 Mother'sbloodfromSet#107 M12 SRR822448 Mother'sbloodfromSet#1223 P05 SRR822452 PlacentafromSet#524 P06 SRR822453 PlacentafromSet#625 P07 SRR822454 PlacentafromSet#726 P08 SRR822455 PlacentafromSet#827 P09 SRR822456 PlacentafromSet#928 P10 SRR822457 PlacentafromSet#1029 P11 SRR822458 PlacentafromSet#1130 P12 SRR822459 PlacentafromSet#128 B01 SRR822462 UmbilicalcordbloodfromSet#19 B02 SRR822464 UmbilicalcordbloodfromSet#210 B03 SRR822464 UmbilicalcordbloodfromSet#311 B04 SRR822465 UmbilicalcordbloodfromSet#412 B06 SRR822437 UmbilicalcordbloodfromSet#613 B07 SRR822450 UmbilicalcordbloodfromSet#714 B08 SRR822438 UmbilicalcordbloodfromSet#815 B10 SRR822439 UmbilicalcordbloodfromSet#1016 B11 SRR822451 UmbilicalcordbloodfromSet#1117 B12 SRR822444 UmbilicalcordbloodfromSet#12
UniqueID
SampleName SRAID SampleDescription
31 C01 SRR822433 Non-pregnantwomanplasmaSample#132 C02 SRR822461 Non-pregnantwomanplasmaSample#233 C03 SRR822434 Non-pregnantwomanserumSample#334 C04 SRR822435 Non-pregnantwomanserumSample#435 C05 SRR822436 Non-pregnantwomanserumSample#518 F01 SRR822466 Father'sbloodfromSet#119 F06 SRR822441 Father'sbloodfromSet#620 F08 SRR822442 Father'sbloodfromSet#821 F10 SRR822443 Father'sbloodfromSet#1022 F12 SRR822440 Father'sbloodfromSet#121 M01 SRR822467 Mother'sbloodfromSet#12 M02 SRR822468 Mother'sbloodfromSet#23 M03 SRR822469 Mother'sbloodfromSet#34 M06 SRR822445 Mother'sbloodfromSet#65 M08 SRR822446 Mother'sbloodfromSet#86 M10 SRR822447 Mother'sbloodfromSet#107 M12 SRR822448 Mother'sbloodfromSet#1223 P05 SRR822452 PlacentafromSet#524 P06 SRR822453 PlacentafromSet#625 P07 SRR822454 PlacentafromSet#726 P08 SRR822455 PlacentafromSet#827 P09 SRR822456 PlacentafromSet#928 P10 SRR822457 PlacentafromSet#1029 P11 SRR822458 PlacentafromSet#1130 P12 SRR822459 PlacentafromSet#128 B01 SRR822462 UmbilicalcordbloodfromSet#19 B02 SRR822464 UmbilicalcordbloodfromSet#210 B03 SRR822464 UmbilicalcordbloodfromSet#311 B04 SRR822465 UmbilicalcordbloodfromSet#412 B06 SRR822437 UmbilicalcordbloodfromSet#613 B07 SRR822450 UmbilicalcordbloodfromSet#714 B08 SRR822438 UmbilicalcordbloodfromSet#815 B10 SRR822439 UmbilicalcordbloodfromSet#1016 B11 SRR822451 UmbilicalcordbloodfromSet#1117 B12 SRR822444 UmbilicalcordbloodfromSet#12
The dataset analyzed in this use case is available from the Short Read Archive (SRA) under Bioproject PRJNA187509.
http://www.ncbi.nlm.nih.gov/bioproject/PRJNA187509
To match the sample names from the article with SRA accessions from the exceRpt output, we visit the Bioproject page at the SRA:http://www.ncbi.nlm.nih.gov/bioproject/PRJNA187509
Use Case 3: Data Scrubbing
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 5
To match the sample names from the article with SRA accessions from the exceRpt output, we visit the Bioproject page at the SRA:http://www.ncbi.nlm.nih.gov/bioproject/PRJNA187509
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 6
Use Case 3: Data Scrubbing
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 7
Use Case 3: Data Scrubbing
After matching sample IDs between the two pipelines, we can import both tables into Excel and start generating comparison plots.
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 8
Use Case 3: Data Comparison
Results from the Article’s analysis pipeline
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 9
We can compare the number of reads mapped to various categories of RNA in Table S1 from the article with the same data from the exceRpt read mapping summary. Categories in common are total input reads, microRNA, ribosomal RNA, and transfer RNA.
Use Case 3: Biomarker Potential and Limitations of Circulating miRNA
Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260. 10
0
2,000,000
4,000,000
6,000,000
8,000,000
10,000,000
12,000,000
14,000,000
C01
C02
C03
C04
C05 F01
F06
F08
F10
F12
M01
M02
M03
M06
M08
M10
M12 P05
P06
P07
P08
P09
P10
P11
P12
B01
B02
B03
B04
B06
B07
B08
B10
B11
B12
Num
ber o
f Rea
ds
Sample
Article exceRpt 11
Use Case 3: exceRpt Pipeline Results -Number of Input Reads
0
2,000,000
4,000,000
6,000,000
8,000,000
10,000,000
12,000,000
C01
C02
C03
C04
C05 F01
F06
F08
F10
F12
M01
M02
M03
M06
M08
M10
M12 P05
P06
P07
P08
P09
P10
P11
P12
B01
B02
B03
B04
B06
B07
B08
B10
B11
B12
Num
ber o
f Rea
ds
Sample
Article exceRpt 12
Use Case 3: exceRpt Pipeline Results – MicroRNA
-
200,000
400,000
600,000
800,000
1,000,000
1,200,000
1,400,000
1,600,000
1,800,000
2,000,000
C01
C02
C03
C04
C05 F01
F06
F08
F10
F12
M01
M02
M03
M06
M08
M10
M12 P0
5P0
6P0
7P0
8P0
9P1
0P1
1P1
2B0
1B0
2B0
3B0
4B0
6B0
7B0
8B1
0B1
1B1
2
Num
ber o
f Rea
ds
Sample
Article exceRpt 13
Use Case 3: exceRpt Pipeline Results -Ribosomal RNA
14
Use Case 3: exceRpt Pipeline Results -Transfer RNA
-
50,000
100,000
150,000
200,000
250,000
300,000
350,000
C01
C02
C03
C04
C05 F01
F06
F08
F10
F12
M01
M02
M03
M06
M08
M10
M12 P0
5P0
6P0
7P0
8P0
9P1
0P1
1P1
2B0
1B0
2B0
3B0
4B0
6B0
7B0
8B1
0B1
1B1
2
Num
ber o
f Rea
ds
Sample
Article exceRpt
The exceRpt analysis pipeline recapitulates the analysis from Williams et al. except for significant differences in the number of reads mapped to transfer RNA.
Use Case 3: Summary
15http://genboree.org/theCommons/projects/exrna-tools-may2014/wiki/Version_Updates
We used exceRpt version 2.8.8; the latest version is 3.1.9. At the Genboree Workbench there is a version history.
Use Case 3: Summary
16http://genboree.org/theCommons/projects/exrna-tools-may2014/wiki/Version_Updates
v2.2.8
• We moved the alignment against endogenous repetitive elements (RE) to occur after the main smallRNA alignments performed by sRNABench. This is because we noticed that the RE library was able to ‘compete’ for reads that would be better annotated/interpreted as coming from tRNAs, piRNAs, or other transcripts. This competitive alignment did not ever affect miRNAs as these are always aligned to before other annotated RNAs, but we expect that this update will faithfully capture reads aligning to repetitive small-RNAs, especially tRNAs, piRNAs, and snoRNAs.
• exceRpt still aligns to REs as a final step before aligning to exogenous sequences as this is critical to remove highly repetitive endogenous sequences that might otherwise be confused as exogenous sequences.
Acknowledgements
17
BaylorCollegeofMedicine,Houston,TXAleksandarMilosavljevicMa;hewRothGenboreeTeamatBaylorAndrewJacksonSaiLakshmiSubramanianWilliamThistlethwaiteSameerPaithankarNeethuShahAaronBakerYaleUniversity,NewHaven,CTMarkGersteinJoelRozowskyRobKitchenFabioNavarroGladstoneInsAtuteAlexanderPicoAndersRiu;a
PacificNorthwestDiabetesResearchInsAtute,SeaHle,WADavidGalasRogerAlexanderRockefellerUniversity,NewYork,NYTomTuschlNIHJohnSa;erleeLillianKuoDenaProcaccini
Use Case 3: References
18
1. Williams Z., et al. (2013) Comprehensive profiling of circulating microRNA via small RNA sequencing of cDNA libraries reveals biomarker potential and limitations. PNAS 110: 4255–4260.
2. Farazi T.A., et al. (2012) Bioinformatic analysis of barcoded cDNA libraries for small RNA profiling by next-generation sequencing. Methods 58: 171-187.
3. ExRNA Portal Software Resources (http://exrna.org/resources/software)
19
Useful Links
• exRNA Portal Software Resources – http://exrna.org/resources/software
• exRNA Atlas – http://genboree.org/exRNA-atlas/
• Genboree Workbench – http://genboree.org/java-bin/workbench.jsp
• Data Coordination Center Wiki – http://genboree.org/theCommons/projects/exrna-mads/wiki
• exRNA Data Analysis Tools Wiki – http://genboree.org/theCommons/projects/exrna-tools-may2014/wiki
• Use Case Tutorials – exRNA Portal Data Resource http://exrna.org/resources/data/