eResearch Australasia Conference | Melbourne – Australia | 10 - 14 October - 2016

Chiminey: Connecting Researchers to Cloud, HPC and Big Data Resources

Ian Thomas 1, Iman Yusuf 2, Heinz Schmidt 3, Grischa Meyer 4, Steve Androulakis 5, Daniel Drumm 6

1. eResearch Office, RMIT University, Melbourne, Australia, [email protected]
2. eResearch Office, RMIT University, Melbourne, Australia, [email protected]
3. eResearch Office, RMIT University, Melbourne, Australia, [email protected]
4. Monash eResearch Centre, Monash University, Melbourne, Australia, [email protected]
5. Monash Bioinformatics Platform, Monash University, Melbourne, Australia, Steve.Androulakis@monash.edu
6. ARC Centre of Excellence for Nanoscale BioPhotonics, RMIT University, Melbourne, [email protected]

ABSTRACT
Many scientific experiments pose a twofold challenge: they are complicated domain-specific research tasks (e.g., complicated analyses of quantum physics approaches), and at the same time the corresponding computations and datasets are too large to be handled on a single desktop machine. More sophisticated approaches are therefore required, such as cloud-based and high-performance computing (HPC) solutions. Enabling scientific experiments increasingly involves data, software, computational and simulation elements, which are often embarrassingly parallel, long-running and data-intensive. Increasingly, such experiments are run in a cloud environment or on high-end clusters and supercomputers. Researchers in many science and engineering disciplines outside computer science find the requisite computational skills a distraction from their focus on their science domain. In this presentation we describe Chiminey, our containerized cloud-based platform that provides a reliable computing and data management service.
BACKGROUND
Any new technology usually brings both opportunities and challenges, as it often requires its users to acquire some initial knowledge. Cloud computing [1] enables the acquisition of very large computing and storage resources that can be integrated with big data technologies for massive-scale computation. Nevertheless, failure while setting up a cloud-based execution environment, or during the execution itself, is arguably inevitable: some or all of the requested virtual machines (VMs) may not be successfully created or instantiated, or communication with an existing VM may fail because of long-distance network failure, given that cloud data centers are typically remote and communication crosses many network boundaries. Moreover, all tasks of such parallel computations are required to complete, so the failure of any one of them may corrupt the result in some way. Statistically, this means that the reliability of the overall task completion is the product of the reliabilities of the individual tasks, and with many thousands or millions of compute tasks this product may quickly become vanishingly small.

When using cloud computing platforms and big data technologies like Hadoop, scientists require operational skills and, to some extent, knowledge of aspects of fault tolerance. Scientists need to learn, for example, how to create and set up VMs, collect the results of experiments, and finally destroy the VMs. Such challenges distract users from their core goals. The Chiminey platform therefore encapsulates these problems and isolates them from the user. This allows the user to focus on domain-specific problems and to delegate to the tool the details that come with accessing high-performance and cloud computing infrastructure, along with the data management challenges posed.
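The product rule for reliability can be made concrete with a few lines of Python. This is an illustrative sketch only: the per-task success probability `p`, the task counts, the retry limit, and the function names are assumptions chosen for the example, not measurements or code from Chiminey, and tasks are assumed to fail independently.

```python
def overall_reliability(p: float, n_tasks: int) -> float:
    """Probability that all n_tasks independent tasks succeed,
    when each one succeeds with probability p."""
    return p ** n_tasks


def with_retries(p: float, max_retries: int) -> float:
    """Per-task success probability when a failed task may be
    retried up to max_retries times (attempts assumed independent)."""
    return 1.0 - (1.0 - p) ** (max_retries + 1)


if __name__ == "__main__":
    p = 0.999  # assumed per-task success probability (hypothetical)
    for n in (1_000, 10_000, 100_000):
        plain = overall_reliability(p, n)
        retried = overall_reliability(with_retries(p, max_retries=2), n)
        print(f"{n:>7} tasks: no retries {plain:.3e}, with 2 retries {retried:.6f}")
```

Under these assumed numbers, a job of 1,000 tasks completes without any failure only about 37% of the time, and the figure collapses toward zero as the task count grows, whereas allowing a small number of retries per task keeps the overall reliability close to one. This is the motivation for handling failures inside the platform rather than leaving them to the user.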
CHIMINEY: CONNECTING RESEARCHERS TO CLOUD, HPC AND BIG DATA RESOURCES
In this presentation we describe Chiminey, our cloud-based platform that provides a reliable computing and data management service. Chiminey enables domain scientists to execute new and legacy applications on cloud-based, big data (i.e. Hadoop), and high-performance computing (HPC) facilities without needing a technical understanding of cloud computing, HPC, fault tolerance, or data management. Chiminey handles failure during the execution of applications. It also curates and visualises execution outputs using MyTardis, through which data can be shared with collaborators or with the public. We developed Chiminey under the direction of quantum physicists and molecular biologists, to reduce the learning curve in data management and software platforms required for the complex computational target systems. This open-source platform has been applied across two research disciplines, physics (material characterisation) and structural biology (understanding materials at the atomic scale), to assess its usability and practicality. The domain experts noted time savings for computing and data management, a user-friendly interface for the computation setup, and the ability to visualize calculation results as 2D or 3D graphs.

This containerised platform provides an abstraction of research tasks as smart connectors. The infrastructure provided for these connectors allows domain scientists to choose the target platform, which is then managed automatically; it accepts all the necessary parameters to run many instances of their program, regardless of whether the program runs on a peak supercomputer, a commercial cloud like Amazon EC2 or (in Australia) the national federated cloud system NeCTAR. Chiminey negotiates with target system schedulers, dashboards and databases, and provides an easy-to-use dashboard interface to monitor the running jobs, regardless of the specific target platform. The smart connector encapsulates and virtualises a number of further aspects that the domain scientists directing our effort found necessary or desirable.

Previous work: The first prototype of Chiminey is discussed in [2]. A formal model of a platform for scalable and fault-tolerant cloud computations, together with its implementation as the Chiminey platform, was introduced in [3]. The formal model allows us to have a precise and concise specification of the platform on the logical level. The refined formal model of a cloud-based platform and the latest version of its open-source implementation, with the emphasis on usability and reliability aspects, are presented in [4]. The feasibility of the Chiminey platform is shown using case studies from the Theoretical Chemical and Quantum Physics group at RMIT University.
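One way to picture the smart connector idea described in this section is as a small interface that hides the target platform behind a common set of operations. The sketch below is purely illustrative: every class, method, and parameter name is hypothetical and is not Chiminey's actual API. It only shows how a sweep over input parameters could be submitted in the same way regardless of which platform runs the jobs.

```python
from abc import ABC, abstractmethod
from itertools import product
from typing import Dict, List, Sequence


class TargetPlatform(ABC):
    """Hypothetical interface for a compute target (cloud, HPC cluster, Hadoop)."""

    @abstractmethod
    def submit(self, command: str, params: Dict[str, float]) -> str:
        """Start one run of `command` with `params` and return a job identifier."""


class LocalDryRun(TargetPlatform):
    """Stand-in platform that only records the command lines it would run."""

    def __init__(self) -> None:
        self.jobs: List[str] = []

    def submit(self, command: str, params: Dict[str, float]) -> str:
        args = " ".join(f"--{k}={v}" for k, v in sorted(params.items()))
        self.jobs.append(f"{command} {args}")
        return f"job-{len(self.jobs)}"


def sweep(platform: TargetPlatform, command: str,
          grid: Dict[str, Sequence[float]]) -> List[str]:
    """Submit one job per combination of parameter values in `grid`."""
    names = sorted(grid)
    job_ids = []
    for values in product(*(grid[name] for name in names)):
        job_ids.append(platform.submit(command, dict(zip(names, values))))
    return job_ids


if __name__ == "__main__":
    platform = LocalDryRun()
    # "./simulate" and the parameter names are made up for the example.
    ids = sweep(platform, "./simulate", {"temperature": [300, 350], "pressure": [1.0]})
    print(ids)
    print(platform.jobs)
```

Because `sweep` depends only on the abstract `TargetPlatform`, replacing `LocalDryRun` with a class that talks to a real scheduler or cloud API would leave the calling code unchanged, which is the separation of concerns the smart connector description aims for.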

REFERENCES
1. M. Armbrust, A. Fox, R. Griffith, et al., A view of cloud computing, Commun. ACM 53 (2010) 50–58.
2. I. Yusuf, I. Thomas, M. Spichkova, S. Androulakis, G. Meyer, D. Drumm, G. Opletal, S. Russo, A. Buckle, H. Schmidt, Chiminey: Reliable computing and data management platform in the cloud, in: Proceedings of the 37th International Conference on Software Engineering, pp. 677–680.
3. M. Spichkova, I. Thomas, H. Schmidt, I. Yusuf, D. Drumm, S. Androulakis, G. Opletal, S. Russo, Scalable and fault-tolerant cloud computations: Modelling and implementation, in: Proceedings ICPADS 2015, pp. 396–404.
4. M. Spichkova, H. Schmidt, I. Thomas, I. Yusuf, S. Androulakis, G. Meyer, Managing usability and reliability aspects in cloud computing, in: Proceedings of the 11th International Conference on Evaluation of Novel Software Approaches to Software Engineering

ABOUT THE AUTHORS
Ian Thomas is a software developer and system administrator at the eResearch Office of RMIT University. He has worked in data curation for output of high-performance computing systems, microscopy data for materials, and screen media objects (film and television). His current work is in institutional metadata stores, containerized research workflows and cloud-based platforms in support of eResearch applications.

Iman Yusuf is a software engineer and research academic at the eResearch Office of RMIT University. She has worked on projects that focus on distributed computing, in particular cloud computing and big data. She is one of the developers of Chiminey. Her research interests are grid and cloud computing, fault tolerance, reliability, component-based software architecture, and big data.

Heinz is Professor of Software Engineering at RMIT University, where he is the Director of eResearch and heads the Distributed Software Engineering and Architecture lab in Computer Science. Heinz is also an adjunct professor at Mälardalen University in Sweden. Heinz received his PhD from Bremen University, Germany. He has over 30 years' experience with component-based and object-oriented architecture, especially in parallel and distributed systems and languages in practice, research and education. Heinz is an eminent researcher who has published over 120 refereed articles, supervised over 25 higher-degree research students, and lectures in software engineering, distributed systems and enterprise architecture. Prior to RMIT Heinz held positions at Monash University, the CSIRO and ANU in Canberra, at the German National Research Centre for Information Technology and the International Computer Science Institute at the University of California, Berkeley. Prof Schmidt has led large university-industry research collaborations, in the European ESPRIT program and the Australian Collaborative Research Center program, among others with SIEMENS, ABB, DEC and Olivetti, IBM and others.


Grischa Meyer is the technical lead on the Instrument Integration project at Monash University. He has been working on the MyTardis project for over 3 years and is the current MyTardis core maintainer.

Steve Androulakis is the manager of the Monash Bioinformatics Platform. The MBP provides bioinformatics support and training to the research community. He has also created and led the MyTardis project for five years.

Daniel Drumm is a postdoctoral researcher in the RMIT University node of the Australian Research Council Centre of Excellence for Nanoscale BioPhotonics (CNBP), studying optimal search algorithms, quantum imaging techniques, two-dimensional materials, and switchable fluorescent marker molecules. Within RMIT’s Theoretical Chemical and Quantum Physics group, he has also worked on ab initio high-performance computing simulations of defects and arrays of defects in semiconductors, organo-metallic crystals, ternary chalcogenide glasses, dye-sensitised solar cells, and transition-metal oxides, as well as data curation and workflow automation for this type of research.