IMPROVING REPRODUCIBILITY WITH CLOUDS AND NOTEBOOKS

Kate Keahey
Mathematics and CS Division, Argonne National Laboratory
CASE, University of Chicago

November 8, 2019
GEFI Workshop, Coimbra, Portugal

www.chameleoncloud.org




REPRODUCIBILITY DILEMMA

• Challenges
  • Actionable digital artifacts: configurations, scientific practices and processes
  • Context in which they can be shared: instruments, resources, etc.
  • Publication, discovery, indexing, etc.
• Towards intentional upfront shareable research

Should I invest in more new research instead?

Should I invest in making my experiments repeatable?


CHAMELEON AS A SCIENTIFIC INSTRUMENT

• We like to change: testbed that adapts itself to your experimental needs
  • Deep reconfigurability (bare metal) and isolation (CHI), but also ease of use (KVM)
  • CHI: power on/off, reboot, custom kernel, serial console access, etc.
• We want to be all things to all people: balancing large-scale and diverse
  • Large-scale: large homogeneous partition (~15,000 cores), 5 PB of storage distributed over 2 sites (now +1!) connected with 100G network…
  • …and diverse: ARMs, Atoms, FPGAs, GPUs, Corsa switches, etc.
• Cloud++: leveraging mainstream cloud technologies
  • Powered by OpenStack with bare metal reconfiguration (Ironic) + “special sauce”
  • Chameleon team contribution recognized as official OpenStack component
• We live to serve: open, production testbed for Computer Science Research
  • Started in 10/2014, testbed available since 07/2015, renewed in 10/2017
  • Currently 3,500+ users, 500+ projects, 100+ institutions


BEYOND THE INSTRUMENT: AN ECOSYSTEM FOR REPEATABILITY AND SHARING

• Clouds/testbeds generate a wealth of shareable artifacts
  • Images, orchestration templates, tools, etc.
• Clouds/testbeds as “players” for common artifacts
• Repeatability/replicability features
  • Testbed versioning (>50 versions of the testbed)
  • Appliance/digital artifact versioning
  • Experiment Precis: an analogue of the history command that uses testbed logging data to reconstruct a set of actions
• Documenting a scientific process
  • Imperative, non-transactional, version controlled, etc.
  • Orchestration versus notebooks
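The idea behind Experiment Precis, rebuilding a history-like list of actions from testbed logs, can be illustrated with a minimal sketch. The log record format, the field names, and the `reconstruct_history` function below are all assumptions made for illustration; they do not describe the actual Chameleon implementation or its log schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class LogEvent:
    """One hypothetical testbed log record (format assumed for illustration)."""
    timestamp: datetime
    user: str
    service: str  # e.g. "compute" or "reservation" (assumed names)
    action: str   # e.g. "launch_instance" (assumed names)
    target: str   # the resource the action applied to


def reconstruct_history(events: Iterable[LogEvent], user: str) -> List[str]:
    """Return one user's actions in chronological order, numbered like `history`."""
    mine = sorted((e for e in events if e.user == user), key=lambda e: e.timestamp)
    return [
        f"{i}  {e.service}.{e.action} {e.target}"
        for i, e in enumerate(mine, start=1)
    ]


# Events arrive unordered and interleaved with other users' actions.
events = [
    LogEvent(datetime(2019, 11, 8, 9, 5), "alice", "compute", "launch_instance", "node-1"),
    LogEvent(datetime(2019, 11, 8, 9, 0), "alice", "reservation", "create_lease", "lease-a"),
    LogEvent(datetime(2019, 11, 8, 9, 2), "bob", "compute", "reboot", "node-7"),
]

history = reconstruct_history(events, "alice")
print("\n".join(history))
```

The sketch only filters and orders events; a real tool would also need to map low-level log entries back to the user-facing commands that caused them.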


CHAMELEON JUPYTER INTEGRATION

• Combining the ease of notebooks and the power of a shared platform
  • Storytelling with Jupyter: ideas/text, process/code, results
  • Chameleon: sophisticated experimental containers in need of “storytelling”
• JupyterLab server for our users
  • Go to jupyter.chameleoncloud.org and use with your Chameleon credentials
• Chameleon/Jupyter integration
  • Python/bash interfaces to the testbed, storing and sharing, Chameleon credentials
  • Named containers
  • Templates of existing experiments

Screencast of a complex experiment: https://vimeo.com/297210055

“A Case for Integrating Experimental Containers with Notebooks”, CloudCom 2019
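The “storytelling” structure (ideas as text, process as code, results as discussion) maps directly onto notebook cells. The following is a minimal sketch that assembles such a notebook as a plain dictionary in the Jupyter notebook format, version 4, using only the standard library. The helper names and the cell contents are placeholders invented for illustration, not part of any Chameleon template.

```python
import json


def markdown_cell(text: str) -> dict:
    """A notebook cell holding narrative text."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source: str) -> dict:
    """A notebook cell holding code that has not been executed yet."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_notebook(cells: list) -> dict:
    """Wrap cells in the top-level structure of a version 4 notebook."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Placeholder content: idea (text), process (code), results (text).
notebook = build_notebook([
    markdown_cell("# Idea\nState the question the experiment answers."),
    code_cell("# Process: the commands that run the experiment go here\nresult = 6 * 7"),
    markdown_cell("# Results\nDiscuss what the run produced."),
])

# Serialized form; writing it to a file ending in .ipynb lets JupyterLab open it.
notebook_json = json.dumps(notebook, indent=1)
```

Keeping the three kinds of cells in one file is what lets a reader rerun the process and compare results against the stated idea.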


SHARING, PUBLISHING, LEVERAGING

• We now have everything we need to share experiments
  • Ways to establish an experimental environment + player
  • Ways to document an experimental process
• But wait… how do I actually share them?
  • Send mail, Chameleon object store, GitHub…
• Publishing via Zenodo: store your experiments and make them citable via DOIs
• Creating bridges, integration
  • Import/Export from/to Zenodo
• Making research findable: the sharing platform

SC19 Poster: Sharing and Replicability of Notebook-Based Research on Open Testbeds
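To make the Zenodo step concrete, here is a minimal sketch of the descriptive metadata that would accompany a published experiment, and of how a DOI turns into a citable link. It makes no network calls. The field names (`title`, `upload_type`, `description`, `creators`, `keywords`) reflect an assumed reading of Zenodo's deposit metadata and should be checked against the current Zenodo API documentation; the function names, the example values, and the DOI shown are placeholders, not a real record.

```python
import json


def zenodo_metadata(title, description, creators, keywords=()):
    """Build a deposit metadata payload (field names are assumptions, see lead-in)."""
    if not creators:
        raise ValueError("at least one creator is required")
    return {
        "metadata": {
            "title": title,
            "upload_type": "software",
            "description": description,
            "creators": [{"name": n, "affiliation": a} for n, a in creators],
            "keywords": list(keywords),
        }
    }


def doi_url(doi):
    """Turn a DOI into a resolvable link via the doi.org resolver."""
    return "https://doi.org/" + doi


# Placeholder values for illustration only.
payload = zenodo_metadata(
    title="Example notebook-based experiment",
    description="Notebook and artifacts needed to repeat the experiment.",
    creators=[("Doe, Jane", "Example University")],
    keywords=["reproducibility", "testbed"],
)
body = json.dumps(payload)  # what a client would send when creating a deposit

citation_link = doi_url("10.5281/zenodo.0000000")  # placeholder DOI, not a real record
```

Separating the metadata from the upload itself keeps the description of the experiment under version control alongside the notebook.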


[Figure, shown in successive builds. Earlier builds list the requirements "well-documented process", "executable code", "accessible, consistent code environment", and "easy to find experiment" alongside the components "notebooks", "open testbeds", and "sharing services". The final build lists "well-documented process", "experiment actions", "accessible, consistent experimental environment", and "publicly shared experiment" alongside "notebooks", "open testbeds", "sharing platform", "publishing platform", and "integration".]


PARTING THOUGHTS

• Logistical barriers stunt creativity and ambition
• Towards a Digital Research Ecosystem: a meeting place of users and providers sharing resources and research
  • Clouds/testbeds are more than just experimental platforms; they create a “common denominator” that can eliminate much complexity that goes into systematic experimentation, sharing, and reproducibility
  • Notebooks + testbeds provide both the sharing underpinnings (common artifacts) and the ability to document the process
  • Digital era publishing tools facilitate sharing
• Leveraging new digital artifact and sharing patterns towards upfront shareable research