Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
ToolDependenciesandContainers
WhataretheadvantagesofrunningmyGalaxytoolinsideofacontainer?
HowdoesGalaxyfindacontainertorunmytoolin?
WhatareBioContainersandhowaretheyrelatedtoGalaxy?
Questions
ExplorethedifferencesbetweencontainerizingGalaxyandtoolexecution.
Discusstheadvantagesofcontainerizingtools.
Learntobuildbestpracticetoolsreadytobecontainerized.
Objectives
PlanemoTheseslidesmirrorthesectionon"DependenciesandDocker"inthePlanemo
Documentation.
GenericContainersareGoodSlide
IsolationandSecurity*ReproducibilityFlexibility*
*theindustryisgettingthere
GalaxyinContainers?
ContainerizingGalaxyvsToolsWearegoingtodiscusscontainerizingtoolexecutioninstead-executingjust
jobsincontainers.
Containersfortheparticularjob'stool.
HoweveryoudeployGalaxy,includinginacontainer,toolexecutioncanstillbecontainerized.
Isolatedtoolexecution.Isolatefilesystemaccess.Addedlayerofsecurity.Increasedre-computability.Newdeploymentoptions-Kubernetes,MesosChronos,AWSBatch,etc.
ContainerizingToolsisStillImportant
Decomposestobasicproblems:
InstructGalaxywheretofindacontainerforthetool.InstructGalaxytorunthetoolinacontainer.
ContainerizingToolExecution
Justconfigurethedestination.Forinstance,transformtheclusterdestination:
<destinationid="short_fast"runner="slurm"><paramid="nativeSpecification">--time=00:05:00--nodes=1</destination>
asfollows:
<destinationid="short_fast"runner="slurm"><paramid="nativeSpecification">--time=00:05:00--nodes=1<paramid="docker_enabled">true<paramid="docker_sudo">false</destination>
Thatisit!
Fordevelopment,thePlanemoflag--dockerdoesthisfortest,serve,andrelatedcommands.
ConfiguringGalaxytoUseContainers
Decomposestobasicproblems
InstructGalaxywheretofindacontainerforthetool.InstructGalaxytorunthetoolinacontainer.
ContainerizingToolExecution
ExplicitContainerDependenciesReturningtotheseqtkexample,letschangetherequirementsfrom:
<requirements><requirementtype="package"version="1.2">seqtk</requirement></requirements>
into
<requirements><containertype="docker">quay.io/biocontainers/seqtk:1.2--1</container><requirementtype="package"version="1.2">seqtk</requirement></requirements>
NowrunPlanemotestandservewiththe--dockerflagandasatooldeveloperyouaredone!
Decomposestobasicproblems
InstructGalaxywheretofindacontainerforthetool.InstructGalaxytorunthetoolinacontainer.
We'redone...right?
ContainerizingToolExecution
TheProblemswithMakingDockerExplicitSettingupaDockerfileandpublishingaDockerimagemoreprocessforthetooleventhoughthedependencieshavealreadybeencompletelydefined.AnarbitraryDockerimageisablackboxandthereisnoguaranteeGalaxywillexecutethesamebinariesastheCondarequirements.
ToPutitAnotherWayThis:
<requirements><requirementtype="package"version="1.2">seqtk</requirement></requirements>
shouldhavebeensufficient!
Andthegoodnewsisthatnowitis(mostly)!
Galaxycannowautomaticallyfindorbuildcontainersforbestpracticetools.
Planemowillcheckifsuchacontainerhasbeenpublishedwiththe--biocontainersflagtoplanemolint.
$planemolint--biocontainersseqtk_seq.xml...Applyinglinterbiocontainer_registered...CHECK..INFO:BioContainerbest-practicecontainerfound[quay.io/biocontainers/seqtk:1.2--1].
BioContainers-TheMagic
TheMysteryquay.io/biocontainers/seqtkContainer
RunPlanemotestorservewith--biocontainerstotrymysterycontainer.
$planemotest--biocontainersseqtk_seq.xml...[galaxy.tools.actions]SetupforjobJob[unflushed,tool_id=seqtk_seq]complete,readytoflush(20.380ms)[galaxy.tools.actions]FlushedtransactionforjobJob[id=2,tool_id=seqtk_seq](15.191ms)...[galaxy.tools.deps.containers]Checkingwithcontainerresolver[ExplicitDockerContainerResolver[]]founddescription[None][galaxy.tools.deps.containers]Checkingwithcontainerresolver[CachedMulledDockerContainerResolver[namespace=None]]founddescription[None][galaxy.tools.deps.containers]Checkingwithcontainerresolver[MulledDockerContainerResolver[namespace=biocontainers]]founddescription[ContainerDescription[identifier=quay.io/biocontainers/seqtk:1.2--1,type=docker]][galaxy.jobs.command_factory]Builtscript[/tmp/tmpw8_UQm/job_working_directory/000/2/tool_script.sh]fortoolcommand[seqtkseq-a'/tmp/tmpw8_UQm/files/000/dataset_1.dat'>'/tmp/tmpw8_UQm/files/000/dataset_2.dat']...ok
----------------------------------------------------------------------XML:/private/tmp/tmpw8_UQm/xunit.xml----------------------------------------------------------------------Ran1testin11.926sOK
Theimportantlinehereis-Checkingwithcontainerresolver[MulledDockerContainerResolver[namespace=biocontainers]]founddescription[ContainerDescription[identifier=quay.io/biocontainers/seqtk:1.2--1,type=docker]].
BioContainers-UsingtheContainer
BioContainers-TheCommunity
AllBiocondapackagesarebuiltintominimalcontainers.
Thissetupallowsthesamebinariestobeusedwithinthecontainerorontraditional/HPCresources.Withoutanyextraworkbytoolauthors,Galaxy
canautomaticallyfindorbuild“thecorrect”containerforabest-practicetool.
Over4,000containersalreadypublished.
BioContainers-BiocondaforContainers
BioContainers-LotsofContainers
Galaxycanbeconfiguredtoattempttobuildcontainersondemandforcontainersthathaven'tbeenpublishedtotheBioContainersnamespaceon
quay.io.
Thesecontainerswillhavesamenamesaswouldbepublishedtoquay.io(e.g.quay.io/biocontainers/seqtk:1.2--0).
BioContainers-OnDemandCreation
Decomposestobasicproblems
InstructGalaxywheretofindacontainerforthetool.InstructGalaxytorunthetoolinacontainer.
We'redonethistimenowright?
ContainerizingToolExecution
RevisitingGalaxyRequirements
Manytoolshavemultiplerequirements,theseneedcontainersalso!
AnExampleTheBWAtoolisanexampleofonesuchtoolthathasmultiplerequirements-
becausesamtoolsisusedtosortBAMsaftermapping.ThePlanemoconda_testingprojectisdistributedwithasimpletooltosimulatethis
containingbothbwaandsamtoolsrequirements.
$planemoproject_init--template=conda_testingconda_testing$cdconda_testing/$grep-rrequirebwa_and_samtools.xmlbwa_and_samtools.xml:<requirements>bwa_and_samtools.xml:<requirementtype="package"version="0.7.15">bwa</requirement>bwa_and_samtools.xml:<requirementtype="package"version="1.3.1">samtools</requirement>bwa_and_samtools.xml:</requirements>
ContainerHashingGalaxyfindscontainersbasedonthenamesandversionsofrequirements,so
farwehaveseenthehashforsinglerequirementsisjustthenameandtheversion.
Ifmultiplerequirementsarepresent,[email protected]@1.3.1,Galaxywilllookforacontainercalled:
quay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820-0
GalaxyTerminology-MulledTomullistocreateanenvironment(eitherintheCondasenseorgloballyinsideacontainer)foroneormoreCondapackages.Theresultofthisisa
mulledenvironment.
Fixednamingschemesfortheresultingenvironmentsensurethatdifferenttoolswiththesamesetofrequirementscanreuseapreviouslycreated
environmentorcontainer.
Ifmorethanonepackageisincludedintheresultingenvironment,acomplicatedhashisusedasthisname.
MulledHashingBacktotheexampleofbwaatversion0.7.15andsamtoolsatversion1.3.1,we
saidtheresultingcontainerwillbe
quay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820-0
Thiscanbebrokenintotheseparts:
quay.io/<namespace>/mulled-v2-<package_hash>:<version_hash>-<build>
ExploringMulledHashes-byEvgenyAnatskiy
ThePlanemomullcommand(1/2)WhileGalaxycanbeconfiguredtoauto-buildBioContainersastheyareneeded,thePlanemomullcommandcanbeusedtomanuallybuildthemforyourlocalDockerhost.
Themullcommand(2/2)$planemomullbwa_and_samtools.xml/Users/john/.planemo/involucro-v=3-f/Users/john/workspace/planemo/.venv/lib/python2.7/site-packages/galaxy_lib-17.9.0-py2.7.egg/galaxy/tools/deps/mulled/invfile.lua-setCHANNELS='iuc,bioconda,r,defaults,conda-forge'-setTEST='true'-setTARGETS='samtools=1.3.1,bwa=0.7.15'-setREPO='quay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820'-setBINDS='build/dist:/usr/local/'-setPREINSTALL='condainstall--quiet--yesconda=4.3'build/Users/john/.planemo/involucro-v=3-f/Users/john/workspace/planemo/.venv/lib/python2.7/site-packages/galaxy_lib-17.9.0-py2.7.egg/galaxy/tools/deps/mulled/invfile.lua-setCHANNELS='iuc,bioconda,r,defaults,conda-forge'-setTEST='true'-setTARGETS='samtools=1.3.1,bwa=0.7.15'-setREPO='quay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820'-setBINDS='build/dist:/usr/local/'-setPREINSTALL='condainstall--quiet--yesconda=4.3'buildDEBURunfile[/Users/john/workspace/planemo/.venv/lib/python2.7/site-packages/galaxy_lib-17.9.0-py2.7.egg/galaxy/tools/deps/mulled/invfile.lua]STEPRunimage[continuumio/miniconda:latest]withcommand[[rm-rf/data/dist]]DEBUCreatingcontainer[step-730a02d79e]DEBUCreatedcontainer[5e4b5f83c455step-730a02d79e],startingitDEBUContainer[5e4b5f83c455step-730a02d79e]started,waitingforcompletionDEBUContainer[5e4b5f83c455step-730a02d79e]completedwithexitcode[0]asexpectedDEBUContainer[5e4b5f83c455step-730a02d79e]removedSTEPRunimage[continuumio/miniconda:latest]withcommand[[/bin/sh-ccondainstall--quiet--yesconda=4.3&&condainstall-ciuc-cbioconda-cr-cdefaults-cconda-forgesamtools=1.3.1bwa=0.7.15-p/usr/local--copy--yes--quiet]]DEBUCreatingcontainer[step-e95bf001c8]DEBUCreatedcontainer[72b9ca0e56f8step-e95bf001c8],startingitDEBUContainer[72b9ca0e56f8step-e95bf001c8]started,waitingforcompletionSOUTFetchingpackagemetadata.........SOUTSolvingpackagespecifications:.SOUTSOUTPackageplanforinstallationinenvironment/opt/conda:SOUTSOUTThefollowingpackageswillbeUPDATED:SOUTSOUTconda:4.3.11-py27_0-->4.3.22-py27_0SOUTSOUTFetchingpackagemetadata.................SOUTSolvingpackagespecifications:....SOUTDEBUContainer[72b9ca0e56f8step-e95bf001c8]completedwithexitcode[0]asexpectedDEBUContainer[72b9ca0e56f8step-e95bf001c8]removedSTEPWrap[build/dist]as[quay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820-0]DEBUCreatingcontainer[step-6f1c176372]DEBUPackingsucceeded
Thisbuiltquay.io/biocontainers/mulled-v2-fe8faa35dbf6dc65a0f7f5d4ea12e31a79f73e40:03dc1d2818d9de56938078b8b78b82d967c1f820-0!
Testinglocallymulledcontainers$planemotest--galaxy_branchdev--biocontainersbwa_and_samtools.xml...[galaxy.tools.actions]Handledoutputnamedoutput_2fortoolbwa_and_samtools(17.443ms)[galaxy.tools.actions]Addedoutputdatasetstohistory(12.935ms)[galaxy.tools.actions]VerifiedaccesstodatasetsforJob[unflushed,tool_id=bwa_and_samtools](0.021ms)[galaxy.tools.actions]SetupforjobJob[unflushed,tool_id=bwa_and_samtools]complete,readytoflush(5.755ms)[galaxy.tools.actions]FlushedtransactionforjobJob[id=1,tool_id=bwa_and_samtools](19.582ms)[galaxy.jobs.handler](1)Jobdispatched[galaxy.tools.deps]Usingdependencybwaversion0.7.15oftypeconda[galaxy.tools.deps]Usingdependencysamtoolsversion1.3.1oftypeconda[galaxy.tools.deps]Usingdependencybwaversion0.7.15oftypeconda[galaxy.tools.deps]Usingdependencysamtoolsversion1.3.1oftypeconda[galaxy.tools.deps.containers]Checkingwithcontainerresolver[ExplicitContainerResolver[]]founddescription[None][galaxy.tools.deps.containers]Checkingwithcontainerresolver[CachedMulledContainerResolver[namespace=None]]founddescription[ContainerDescription[identifier=quay.io/biocontainers/mulled-v1-01afc412d1f216348d85970ce5f88c984aa443f3:latest,type=docker]][galaxy.jobs.command_factory]Builtscript[/tmp/tmpQs0gyp/job_working_directory/000/1/tool_script.sh]fortoolcommand[bwa>/tmp/tmpQs0gyp/files/000/dataset_1.dat2>&1;samtools>/tmp/tmpQs0gyp/files/000/dataset_2.dat2>&1][galaxy.tools.deps]UsingdependencysamtoolsversionNoneoftypeconda[galaxy.tools.deps]UsingdependencysamtoolsversionNoneoftypecondaok
----------------------------------------------------------------------XML:/private/tmp/tmpQs0gyp/xunit.xml----------------------------------------------------------------------Ran1testin7.553s
OK[test_driver]Shuttingdown[test_driver]Shuttingdownembeddedgalaxywebserver[test_driver]Embeddedwebservergalaxystopped[test_driver]Stoppingapplicationgalaxy...[galaxy.jobs.handler]jobhandlerstopqueuestoppedTestingcomplete.HTMLreportisin"/home/planemo/workspace/planemo/tool_test_output.html".All1test(s)executedpassed.bwa_and_samtools[0]:passed
TheGoal
Use--biocontainerstobuildacontaineron-demandforatesttool.
Hands-on
Steps
Runthroughthebwa_and_samtools.xmltesttoolandverifycontainercreation.
$planemoproject_init--project_template=conda_testingconda_testing$cdconda_testing/$planemomullbwa_and_samtools.xml$dockerimages#verifythecontainerwasbuilt$#usedockerruntoverifythecontainerhassamtoolsandbwa.$planemotest--biocontainersbwa_and_samtools.xml
Hands-on
PublishingMulti-PackageContainers
PublishingYourMulti-PackageContainers
Itisbecomingeasier,moreadvantageous,andmorecommonforGalaxyadminstorunalltoolswithintheirowncontainer.
Youcanexplicitlydefineacontainerforyourtool-butitiseasierandmorereproducibletoletGalaxyfindorbuildoneusingyourtool'sbestpracticerequirements.
TheGalaxycommunitywillinfrastructuretoautomaticallybuildand/orpublishcontainersforyourtoolaslongasitdefinesbestpracticeCondadependencies.
Planemomakesiteasytotestyourtoolinsideofcontainers.
Keypoints
Thankyou!Thismaterialistheresultofacollaborativework.ThankstheGalaxyTraining
Networkandallthecontributors(JohnChilton,BjörnGrüning)!
Foundatypo?Somethingiswronginthistutorial?EdititonGitHub