ASaiM: An intuitive and adjustable pipeline to process metatranscriptomic data from intestinal microbiota

Bérénice Batut, Clémence Defois, Céline Ribière, Cyrielle Gasc, Jean-François Brugère, Eric Peyretaillade, CPER consortium Environnement Digestif, Pierre Peyret

ASaiM: an environment to analyze intestinal microbiota - Demo with analysis of gut metatranscriptomic sequences




Why ASaiM?


Gut metagenomic projects

  Database    Projects
  NCBI        2318
  ENA         1103
  DDBJ        28
  MG-RAST     46
  CAMERA      3
  Total       3508


But

  • Difficult to query those databases
  • Information is not standardized


Available tools for metagenomic and metatranscriptomic sequence processing


But

  • Almost nothing for metatranscriptomic sequences
  • Difficult to use
  • Not adjustable
  • Only one step in sequence processing and analysis

A complete solution


ASaiM framework: a solution to process sequences


Philosophy

  • Easy to use
  • Adjustable

Demo on gut metatranscriptomic sequences


Proposed sequence processing pipeline

Check out the code!

Download the source code and move to the demo directory:

$ git clone https://github.com/ASaiM/ASaiM.git
$ cd ASaiM/demo
$ ls
config_file.json R2_sequences.fastq R1_sequences.fastq


Your pipeline = Your config_file.json


Web interface

http://g2im.u-clermont1.fr/asaim/

(Really) easy pipeline execution

Install requirements:

  • Docker
  • Docker-compose
  • make

Execute the pipeline:

$ cd demo/
$ make -f ../Makefile run_pipeline

What is behind the magic?


Generated outputs

2015-07-02_19-31/
  report.txt
  quality_estimation/
    FastQC/
  quality_treatments/
    Prinseq/
  paired_end_assembly/
    FastQ_Join/
  rna_sorting/
    SortMeRNA/
  non_rRNA_taxonomic_assignation/
    MetaPhlAn/
  protein_ncrna_db_search/
    search_against_cog/
      Blast/


k__Bacteria  100.0
k__Bacteria|p__Bacteroidetes  95.68413
k__Bacteria|p__Fusobacteria  4.31587
k__Bacteria|p__Bacteroidetes|c__Bacteroidia  92.62004
k__Bacteria|p__Fusobacteria|c__Fusobacteria  4.31587
k__Bacteria|p__Bacteroidetes|c__Flavobacteria  3.06409
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales  92.62004
k__Bacteria|p__Fusobacteria|c__Fusobacteria|o__Leptotrichales  4.31587
k__Bacteria|p__Bacteroidetes|c__Flavobacteria|o__Flavobacteriales  3.06409
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae  88.87376
k__Bacteria|p__Fusobacteria|c__Fusobacteria|o__Leptotrichales|f__Leptotrichales_unclassified  4.31587
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae  3.74628
k__Bacteria|p__Bacteroidetes|c__Flavobacteria|o__Flavobacteriales|f__Flavobacteriaceae  3.06409
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae|g__Bacteroides  88.87376
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Parabacteroides  3.74628
k__Bacteria|p__Bacteroidetes|c__Flavobacteria|o__Flavobacteriales|f__Flavobacteriaceae|g__Cellulophaga  3.06409
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_unclassified  88.87376
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Parabacteroides|s__Parabacteroides_unclassified  3.74628
k__Bacteria|p__Bacteroidetes|c__Flavobacteria|o__Flavobacteriales|f__Flavobacteriaceae|g__Cellulophaga|s__Cellulophaga_unclassified  3.06409
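The MetaPhlAn profile above has one clade per row: the lineage, with ranks joined by `|`, followed by a relative abundance in percent. A minimal Python sketch, assuming that two-column layout, of how the genus-level rows could be extracted. The function name `genus_abundances` is hypothetical and is not part of ASaiM or MetaPhlAn; the sample rows are copied from the demo output.

```python
def genus_abundances(profile_text):
    """Return {genus: abundance} for rows whose deepest rank is a genus (g__)."""
    result = {}
    for line in profile_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        lineage, abundance = fields
        deepest = lineage.split("|")[-1]
        if deepest.startswith("g__"):
            result[deepest[3:]] = float(abundance)
    return result


# Rows copied from the demo profile (a subset, for brevity).
sample = """\
k__Bacteria|p__Fusobacteria  4.31587
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae|g__Bacteroides  88.87376
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Parabacteroides  3.74628
k__Bacteria|p__Bacteroidetes|c__Flavobacteria|o__Flavobacteriales|f__Flavobacteriaceae|g__Cellulophaga  3.06409
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_unclassified  88.87376
"""

genera = genus_abundances(sample)
# List genera from most to least abundant.
for name, value in sorted(genera.items(), key=lambda kv: -kv[1]):
    print(f"{name}  {value}")
```

Applied to the demo profile, this keeps only the Bacteroides, Parabacteroides and Cellulophaga rows. Fusobacteria is resolved no deeper than the family level there, so it contributes no genus.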

report.txt: executed treatments and some results


Pretreatments...
  Quality control...
    Quality estimation...
      Run FastQC...
    Quality treatment...
      Run PRINSEQ...
        60 bad sequences for R1
        979 bad sequences for R2
        6136 conserved sequences for R1
        5217 conserved sequences for R2
  Paired-end assembly...
    Run FastQ_Join...
      3777 joined sequences
      2359 single sequences for R1
      1440 single sequences for R2
  RNA sorting...
    Run SortMeRNA...
      1465 rRNA sequences
      2312 non rRNA sequences

Taxonomic assignation...
  Non rRNA sequence taxonomic assignation...
    Run MetaPhlAn...

Functional assignation...
  Search against protein and ncRNA databases...
    Search against COG database...
      Run Blast...
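The counts in report.txt can be cross-checked against one another, because each step consumes the output of the previous one. A small Python sketch of that arithmetic, with the numbers copied from the report above. The variable names are illustrative only, and the assumption that SortMeRNA ran on the joined sequences alone is inferred from the totals; the report does not state it.

```python
# Counts copied from the demo report.txt.
prinseq = {
    "R1": {"bad": 60, "conserved": 6136},
    "R2": {"bad": 979, "conserved": 5217},
}
joined = 3777
single = {"R1": 2359, "R2": 1440}
rrna, non_rrna = 1465, 2312

# Reads per file before quality treatment (bad + conserved).
raw = {mate: c["bad"] + c["conserved"] for mate, c in prinseq.items()}

checks = {
    # Paired-end files should hold the same number of reads.
    "same raw read count in R1 and R2": raw["R1"] == raw["R2"],
    # Every conserved read is either joined with its mate or left single.
    "R1 conserved = joined + single": prinseq["R1"]["conserved"] == joined + single["R1"],
    "R2 conserved = joined + single": prinseq["R2"]["conserved"] == joined + single["R2"],
    # Assumption: only the joined sequences go through SortMeRNA.
    "joined = rRNA + non rRNA": joined == rrna + non_rrna,
}

for label, ok in checks.items():
    print(f"{'OK  ' if ok else 'FAIL'} {label}")
```

A failed check would suggest that sequences were lost or double-counted between two steps, which is worth verifying before interpreting the taxonomic or functional results.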

Current status


  • Core structure
  • Available
      7 tools
      1 sequence database (COG)
  • Open-source
  • Documentation
      https://asaim.github.io/

What’s next?

Short term

  • More detailed reports
  • Addition of visualization tools
  • Tests

Long term

  • More tools and treatments
  • Better web interface
  • Expert database

Who is involved?


CPER consortium Environnement digestif

Want to try? We need feedback!!

Tools and treatments to add? Ideas?


http://asaim.github.io/