ASaiM: a Galaxy framework to analyze gut microbiota data
Bérénice Batut, EA CIDAM, Clermont-Ferrand
November 19th, 2015
European Nucleotide Archive "human gut metagenome" search:
236 studies
1,545 runs, ...
• Dispersed and not comparable information
Must
Collect datasets
Analyze them with a standard workflow
Standard workflow:
Raw sequences
Quality control
Sequence sorting
Taxonomic analysis, Functional analysis
Formatted taxonomic assignations, Formatted functional assignations
Existing tools
QIIME, MG-RAST, EBI metagenomics, MetAMOS, ...
But none of them meets all the requirements:
Analyze datasets with the standard workflow
Use gut microbiota specific databases
Combine a user-friendly interface and the command line
ASaiM
Auvergne Sequence analysis of intestinal Microbiota
An environment to analyze metagenomic and metatranscriptomic sequences from gut microbiota
Workflow steps (tools): outputs
Input: R1 sequences, R2 sequences
Paired-end assembly (FastQ Joiner): paired-end assembled sequences
Quality control (PRINSEQ): quality controlled sequences
Sequence sorting (Reago, SortMeRNA): rRNA sequences, non rRNA sequences, long rRNA sequences
Taxonomic assignation (MetaPhlAn): taxonomic assignation report of non rRNA sequences
De novo OTU picking (QIIME): OTU of long rRNA sequences
Taxonomic assignation of OTU (QIIME): OTU table of long rRNA sequences
Community summary by taxonomic composition (QIIME): taxonomy table of long rRNA sequences
Alpha diversity and alpha rarefaction computation (QIIME): alpha diversity of long rRNA sequences, alpha rarefaction of long rRNA sequences
Functional assignation (Diamond, HUMAnN; COG database): similarity search report, COG family abundance, COG family coverage, KEGG module abundance, KEGG module coverage, KEGG pathway abundance, KEGG pathway coverage
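To make the alpha diversity step of the workflow concrete, here is a minimal sketch of two common alpha diversity measures computed from the OTU counts of one sample. The slides do not say which metrics are computed, so the choice of observed OTUs and the Shannon index, the use of the natural logarithm, and the function names are all assumptions made for illustration; this is not the QIIME implementation.

```python
import math


def observed_otus(counts):
    """Number of OTUs with at least one sequence in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over OTUs with non-zero counts.

    Assumption: natural logarithm; other tools may use a different base.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU counts for a single sample (illustrative values only).
sample = [10, 0, 5, 5]
richness = observed_otus(sample)
diversity = shannon_index(sample)
```

A sample dominated by one OTU gives a Shannon index close to 0, while evenly distributed counts give the maximum value for that number of OTUs.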
Main requirements
• Generation of workflows with numerous tools
• Ease of use
• Flexibility and modularity
• Incorporation of wanted/needed tools and databases
Things I tried
Simple Python scripts
Workflow managers such as Luigi, Airflow, ...
Homemade approach
Configuration file
Workflow description
Web interface for generation
Python scripts to execute workflow
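The homemade approach (a workflow description in a configuration file, executed by Python scripts) could look roughly like the sketch below. The slides give no details of the actual format or scripts, so the JSON layout, the step names, and the functions `execution_order` and `run` are hypothetical and serve only to illustrate the idea of executing steps in dependency order.

```python
import json

# Hypothetical workflow description, as it might appear in a configuration file.
CONFIG = """
{
  "steps": [
    {"name": "taxonomic_analysis", "needs": ["sequence_sorting"]},
    {"name": "quality_control", "needs": []},
    {"name": "functional_analysis", "needs": ["sequence_sorting"]},
    {"name": "sequence_sorting", "needs": ["quality_control"]}
  ]
}
"""


def execution_order(config_text):
    """Return step names ordered so that every step follows the steps it needs."""
    steps = {s["name"]: s["needs"] for s in json.loads(config_text)["steps"]}
    ordered, done = [], set()

    def visit(name, stack=()):
        if name in done:
            return
        if name in stack:
            raise ValueError("cyclic dependency at " + name)
        for dep in steps[name]:
            visit(dep, stack + (name,))
        done.add(name)
        ordered.append(name)

    # Sorted iteration keeps the result deterministic.
    for name in sorted(steps):
        visit(name)
    return ordered


def run(config_text, actions):
    """Call the action registered for each step, in dependency order."""
    return [actions[name]() for name in execution_order(config_text)]
```

In a real setup each action would launch a tool; here an action is any Python callable, which keeps the sketch runnable without the tools installed. Keeping every tool, database, and interface in sync with such scripts quickly becomes the costly part, which is where an existing platform helps.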
Galaxy
Fits the main requirements
• Generation of workflows with numerous tools
• Ease of use
• Flexibility and modularity
• Incorporation of wanted/needed tools and databases
To launch the instance
Get the source code from GitHub
$ git clone git@github.com:ASaiM/framework.git
Install the required dependencies
Launch the instance
$ cd framework
$ ./src/launch_galaxy.sh
Browse it on http://127.0.0.1:8080/
Behind the magic
Shell scripts to configure the instance
1. Get latest revision of Galaxy from GitHub
2. Prepare databases and local tools
3. Configure with
   Custom configuration files
   Wanted tools
   Wanted databases
4. Launch Galaxy
Tools
From standard Galaxy instance
From ToolShed
Developed wrappers
Planemo
Integration in test ToolShed
https://github.com/ASaiM/galaxytools
Databases
SortMeRNA ribosomal databases
COG
RefSeq
Catalog of reference genes in the human gut microbiome from Li et al. (2014)
Greengenes
To do
• Automate the configuration and deployment of the instance with Ansible
• Add tools in development, databases, workflows
• Validate workflows on datasets (local, mock, ...)
• Integrate tools and workflows into the ToolShed
• Automate tool integration from the ToolShed with Ansible
• Complete the documentation
Thank you. Questions?
http://asaim.github.io
bebatut.fr
github.com/bebatut
twitter.com/bebatut