Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Programming Patterns and Tools for Cloud
Computing SeSe ACC
fdsf
AmazonWebServicesHPHelionBackspaceSNICCloud
GoogleAppEngineOpenShiftCloudFoundry……
GoogleDocsOffice365DropboxStochSS
2SamJohnson,http://creativecommons.org/licenses/by-sa/3.0/
Basicservicemodels:IaaS,PaaS,SaaS
Virtual Appliances
3
Installationofcompletesoftwarestackpackagedasvirtualmachineimages.
• Distributedasfiles(imageformats)ordirectlyinacloudenvironment
• TypicallyusedtopackageupasingleapplicationasaVMI,simpledistribution
• CommonwaytomakeSaaSoutoflegacyapplication• Caneasymovebetweendifferenthostingenvironments(Public/PrivateCloud,VirtualBox,VMWareetc)
• Userstypicallydeployandusethemasvirtualprivateresources
https://en.wikipedia.org/wiki/Virtual_appliance
Platform as a Service (PaaS)
4
Thecapabilityprovidedtotheconsumeristodeployontothecloudinfrastructureconsumer-createdoracquiredapplica\onscreatedusingprogramminglanguages,libraries,services,andtoolssupportedbytheprovider.Theconsumerdoesnotmanageorcontroltheunderlyingcloudinfrastructureincludingnetwork,servers,opera\ngsystems,orstorage,buthascontroloverthedeployedapplica\onsandpossiblyconfigura\onse^ngsfortheapplica\on-hos\ngenvironment.
AmajoraimofPaaSistoaidindevelopmentof“cloudy”applicationsby:
• ReducingtheburdenofexplicitlymanagingIaaS• Cloudinteroperability(agoalforsomeofthem)• Enforcingprogrammingmodels/frameworksthatleadstodesignsthatmakegooduseofthecloudmodel.
5
Platform as a Service (PaaS)DirectlyconsumingIaaSgivesgreatflexibilityandcontrolbutmaybetoocomplicatedfortheaverageuser.PlatformasaServiceaimsatsimplifyingapplicationdevelopmentbyabstractingawayIaaS.Also,insomecases,byenforcingcertainenvironmentsandprogrammingmodels,theplatformcanprovideaddedvalue,particularlyautoscaling(seeJohanTordsson’slectures).
Example: Google App Engine
6
• Letsyoueasilybuildscalablewebapplications• DevelopinalocalSDKsandbox,thendeploytoGoogleCloudwithnocodemodification
• Theplatformwillautoscaleyouwebservicetomeetincreasedloadandtraffic• Supportsdifferentlanguages,e.g.Python,Java,GoandyourfavoriteframeworkslikeDjango,Jinja2etc.
• ButnotallpossiblePythonpackages(sandbox)• Worksreallywellformany(traditional)webapplications• Sandboxcanbelimitingfornon-typicalapplications,likethosefromCSE
Example: CloudFoundry
http://12factor.net/
https://www.cloudfoundry.org/platform/
Example: OpenShift
8
https://www.openshift.org/
FromRedHat.SimilargoalsasCloudFoundy,basedonDockerandKubernetes
Software as a Service
9
Thecapabilityprovidedtotheconsumeristousetheprovider’sapplica\onsrunningonacloudinfrastructure.Theapplica\onsareaccessiblefromvariousclientdevicesthrougheitherathinclientinterface,suchasawebbrowser(e.g.,web-basedemail),oraprograminterface.Theconsumerdoesnotmanageorcontroltheunderlyingcloudinfrastructureincludingnetwork,servers,opera\ngsystems,storage,orevenindividualapplica\oncapabili\es,withthepossibleexcep\onoflimiteduser-specificapplica\onconfigura\onse^ngs.
10
www.molns.org,github.com/MOLNs/molnsCloudvirtualappliance/platformforlargescalesimulationswithPyURDME
Case study: MOLNs and StochSS
The tasks: Spatial Stochastic Simulations in Systems Biology
11
PyURDME(www.pyurdme.org)isamodelingandsimulationlibraryforstochasticreaction-diffusionsimulationsofbiochemicalnetworks.
• Stochasticprocesssimulatedonacomputationalmesh.
• Applicationsincludestudyinggeneregulation,yeastpolarization.
• Outputisatimeseriesofspatialdata,X.
What problems did we want to solve?
12
1. PyURDMEsimulations,inparticularparametersweeps,requirelargecomputationalresources.
13
Xisjustonepossibleoutcome,weneedtogeneratelargeensemblesofrealizationstoconductastatisticalanalysis.
1k-1Mtrajectories—> 10*1000~10Gbdata
Fallsinthecategory“High-TroughputComputing(HTC)”and“Many-TaskComputing(MTC)”
Potential solution, use HPC -clusters
• We did that for a while, and used them to make large sweeps and write papers ourselves. Then we started to think about the problem of scientific software. There were problems: • The modeling process really benefits from interactivity,
but HPC resources are not for that. • It was daunting to try to write software to work with any
type of university cluster. • The software stack needed is complex and fragile and
would likely be very time-consuming to install in the HPC environment (leading to long wait times for users)
• Computational experiments become hard to reproduce since they rely on specific resources and organizations.
14
Components in the implementation
15
Amazon EC2
HP Helion
OpenStack
PyURDME: !Spatial stochastic modeling
and simulation engine
molnsclient: automatic setup and provisioning of a distributed !! ! ! computational environment - the “virtual lab”
molnsutil: Library for scalable parallel execution of simulation engines. !! ! Unified data management in the cloud environments.
IPython project: Web based notebook front-end to Python. !! ! ! ! Sharable, computable documents. !
! ! ! ! ! ! Parallel computing in via Ipython.parallel.
MOLNs
Simulation engine
Data analysis frameworks
Simulation engine
Summary of Experiences so far
• Virtual appliance for building VM images with the software stack for one OS only reduces time to release, critical for a small science team.
• Some parts were easy, essentially just deploying existing software in a virtual environment (e.g. IPython)
• Special care was needed for data transport + Special care was needed for storage abstractions
• As the stack grows, the large monolithic virtual appliance/image is starting to give us problems. We are now looking into alternative technologies as a remedy, e.g. containers and more extensive use of orchestration tools.
16
What we have not cover in this course that is important for developing SaaS
• Scalable web sites/services • frameworks • sessions • authentication • databases • security
17
Inpractice,thisisanintegralpartofalmostallcloudcomputingsoftwareprojects,butheremanysourcesofinformation,toolsandframeworks(Webapp2,Djago,Flask..),PaaS(GAE…),existthatwillmakeyourlifefairlyeasy.
“I need my application to scale”• What do we mean with a scalable CSE application?
18
Tightly coupled applications?• Common in CSE, typical examples are Partial Differential
Equation (PDE) solvers
19
Communicationoverblock-boundaries
HPC programming models • Message Passing Interface (Distributed Memory) • OpenMP, Pthreads etc. (Shared Memory) • CUDA, OpenCL (GPGPU) • FGPA
20
• Low-latencyinterprocesscommunicationiscriticaltoapplicationperformance• Specializedhardware/instructionsets.• “Everycyclecounts”,usersofHPCbecomeconcernedaboutVMoverhead• Failedtaskscausesmajorproblems/performancelosses• Computeinfrastructureorganizedaccordingly• ’High-PerformanceComputing,HPCclusterresources.Waitinaqueuetoobtaindedicateddirectaccesstohigh-bandwidthhardwareresources.
“Cloud computing doesn’t scale”
21
Whenyouhearthosetypesofstatements,theytypicallyrefertoconceptionsabouttraditionalcommunicationintensiveHPCapplications,typicallyimplementedwithMPI.
• Referstolow-performingnetworkbetweeninstancesinPublicClouds• DoesnottypicallyaccountfortailoredHPCclusterofferingsandplacementgroups(availableine.g.EC2)
Representsaverynarrowviewonwhatapplicationsandprogrammingmodelsareadmissible.
http://star.mit.edu/cluster/StarClusterdeploysMPI-clustersoverAWS:
22
Vertical vs. Horizontal Scaling
Verticalscaling(“scale-up”): Increasethecapacityofyourserverstomeetapplicationdemand.
• AddmorevCPUs• AddmoreRAM/storage • FundamentallylimitedbyHWcapacityoftheunderlyinghost• Costincreasessharplyforhigh-endmachines
Horizontalscaling(“scale-out”): Meetincreasedapplicationdemandbyaddingmoreservers/storage/etc.
• Alsohaslimitations,butalotofworkgoingontoimprovescale-outproperties.• Costofaddinganothermachineofthesame(commodity)typescaleslinearlyTypicallywhatwearemostinterestedininCloudComputing.
Strong and Weak Scaling
23
Strongscaling:Howdoestheexecutiontimedecreaseasafunctionofincreasedresources,forafixedproblemsize?
HowquicklycanIsolveaproblemofthissizebyaddingmoreHW?
24
Weakscaling:Howdoestheexecutiontimebehaveasafunctionofincreasedresources,foranincreasingproblemsize?
Strong and Weak Scaling
HowmuchcanIgrowmyproblemsizeandstillmaintainthesameexecutiontime?
Scalability and Elasticity
25
Capabili\escanbeelas\callyprovisionedandreleased,insomecasesautoma\cally,toscalerapidlyoutwardandinwardcommensuratewithdemand.Totheconsumer,thecapabili\esavailableforprovisioningomenappeartobeunlimitedandcanbeappropriatedinanyquan\tyatany\me.
Designprinciplestoallowforhorizontalscalingandelasticity?
(Discuss/Brainstorminsmallgroupsfor5min)
Strive for statelessness of task/component
26
Controller
Worker
“Growingiseasierthanshrinking”Data,taskinput/output
Butofcourse,allcomputations/applicationsneeddataandstate.Itisnotsomuchaboutifyoustoreitasitisaboutwhereyoustoreit.
Ideallyyouwantyourtasktobeabletoexecuteanywhere,atanytime.
• Robustness• Canincrease,decreasenumberofworkers
27
Controller
Worker
Isthecontrollermachinedesignedforscalablestorage?
Highpressureoncriticalcomponent(robustness)
Networkapotentialperformancebottleneck
Use the object store when you can
28
Controller
Worker
S3,Swift
key-valuestorage,designedforhorizontalscaling
Inthiskindofmaster-slavesetupthecontrollerwillstillmaintainalotofinformation,routetasksetc.
Whyevenstoretaskinformationoncontroller?Performance(latency)isonegoodreason
Use the object store when you can
29
Controller
Worker
S3,Swift
Decouplethelife-timeofcomputeserversanddata!
• Neededforelasticity• Serversarecattle
http://www.slideshare.net/randybias/pets-vs-cattle-the-elastic-cloud-story
• Datacanbeprecious
30
Controller
Worker
Anotheroptionforrobustness,storereplicasofdataovermultiplehostsinadistributedfilesystem.
• HDFS(Hadoop,BigData)• Doesnotdecouplecompute/storagelifetimesaseasily• DesignedforextremeI/Othroughput
AsynchronousTask
3131
Controller
S3,Swift
Aimtoexecutetasksinnon-blockingmode,andhaveasfewglobalsynchronizationpointsaspossible.
32http://abhishek-tiwari.com/post/amqp-rabbitmq-and-celery-a-visual-guide-for-dummies#fn:1
key,valuestoredatabaseRedis
Example: Celery/RabbitMQ (Lab3)
Celery-DistributedTaskQueue
Our favorite example: MOLNs
33
PyURDME(www.pyurdme.org)isamodelingandsimulationlibraryforstochasticreaction-diffusionsimulationsofbiochemicalnetworks.
Wehaveseenhowweworkedwithautomation,orchestrationetctomakethingsintocloudservice,let’slookabitatthecomputationalproblem.
What problems did we want to solve?
34
1. PyURDMEsimulations,inparticularparametersweeps,requirelargecomputationalresources.
• MonteCarlo• Tasksarenumerous• Butindependent
mostlyMTCHPC,HTCorMTC?
Butalotofthetimewespendbuildinganddebuggingmodels…
ordevelopingsolvers
Forthis,wedon’tneedbigresources,butuserinteractivityisinvaluable.
Interactive Computation
35
Personalopinion:OneofthemoreexcitingthingsaboutcloudcomputingtechnologiesfromtheCSEperspectiveistheopportunitytoenableinteractiveparallelcomputing.
Input Output
Output
Input
Interactwithpartialoutputascomputationproceeds
Non-interactive,obtaincompleteoutputwhenjobisdone.
MOLNs revisited
36
Amazon EC2
HP Helion
OpenStack
PyURDME: !Spatial stochastic modeling
and simulation engine
molnsclient: automatic setup and provisioning of a distributed !! ! ! computational environment - the “virtual lab”
molnsutil: Library for scalable parallel execution of simulation engines. !! ! Unified data management in the cloud environments.
IPython project: Web based notebook front-end to Python. !! ! ! ! Sharable, computable documents. !
! ! ! ! ! ! Parallel computing in via Ipython.parallel.
MOLNs
Simulation engine
Data analysis frameworks
Simulation engine
Architecture
37
Workers
Controller
Workers
WebBrowser
Molns Client
SharedStorage
PersistentStorage
(S3 / Swift)
InfrastructureDeployment
IPythonNotebookServer
IPythonController
IPythonEngine(s)
IPythonEngine(s)
38
IPython ParallelTwotypesofviewstoyourcluster:
• direct_view• Supportsexplicitmappingofdata/taskstoengines(workers)• Runtasksonspecificmachines• MPI
• load_balanced_view• Asynchronoustaskqueue(usesZeroMQ)
Enginesalwaysblockonexecution
AsetofschedulersprovideasynchronousinterfacetoClients
IPython Parallel
39
• Let’syouworkwithmanymodelssimultaneously• InteractivelyfromIPython(Notebook)• Idealforrapidprototyping/simpleparallelflows
IPparallelinClouds:Doesnotattempttosolvetheproblemofdatamanagement
Oneofthegoalsof‘molnsutil’
Inmolnsutilwehaveabstractionsforstoragesystems: SharedStorage() PersistentStorage() LocalStorage()
Programming Patterns and Tools for Cloud
ComputingPart 2: “Big Data”
What is “Big Data”
41
Noteasytodefinejustbasedonsize:• Movingtarget,today’s“big”mightbetomorrows“small”• Rightnow,maybemultipleTBstoPBs(maybe)• Dependsonthetypeofdata(structured,unstructured)• Dependsonthe“velocity”ofthedata(streamingdata)
Relativetostate-of-theartcommoditysystems: Theproblemis“BigData”ifitcan’tbestored,handled,processedusingcommoditysystemsortraditionalRDMSsolutions.Typically,singleworkstations/disksarenotsufficient.MoreaboutthisinLectures2and3.
ThethreeV’s:Volume,Velocity,Variety
The three Vs, Volume, Velocity and Variety
42http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data
43
VolumeBigDataproblemsinvolvelargedatasets,andcomputationstypicallybecomeI/Obound:
• HardDrivescapacityto(cheaply)storelargeamountsofdatahasincreasedrapidlybutread/writespeedshavenot.
—>Distributedatasetsovermultipledisks/machinesandprocessitinparallel.
Posesfundamentalrequirementsonthesystemstosupportsuchanalysis(moreaboutthisinLectures2,3)
1000 Genomes project
44
Today,DNAsequencesfrom~2600individuals—>>460TBorrawdata(fullgenomicsequences),andgrowing.
Aim:Createanatlasofthehumangeneticvariation
FindbiomedicalrelevantDNAvariations
Understandsimpleandcomplextraits,andthein-betweens.
Capturegeneticvariationsthatisfoundinatleast1%ofthehumanpopulation
45
Fig. 3. Cloud computing usage in big data.
Ibrahim Abaker Targio Hashem, Ibrar Yaqoob, Nor Badrul Anuar, Salimah Mokhtar, Abdullah Gani, Samee Ullah Khan
The rise of “big data” on cloud computing: Review and open research issues
Information Systems, Volume 47, 2015, 98–115
http://dx.doi.org/10.1016/j.is.2014.07.006
FrameWorks for Big Data
46
Problemswithverylargedatasetsrequirespecialframeworks:• Hadoop• Hive• Spark• …• …
Focusof“LargeDatasetsforScientificApplications” CanbefollowedasaPhDcourse.March28-endofsemester.
47Theriseof“bigdata”oncloudcomputing:Reviewandopenresearchissues
What is Hadoop?
48
OriginalImplementationofMapReducebyGoogle.(Read“googlemapreduce.pdf”intheportal)
HadoopisanOpenSourceimplementationinspiredbyGoogle’soriginalframework.
Firstdevelopers:DougCuttingandMikeCafarellaatthattimeatYahoo!,namedafterCutting’sson’stoyelephant.
Hadoop
49
Today,theHadoopecosystemisquitelarge,andcontainsmanyanalyticstools.Thecorecomponentsare:
• HDFS(DistributedFilesystem)• Hadoopcommons(libraries,utilitiesetc.)• YARN(Resourcemanagementplatform,notin1.2.1)• MapReduce
What is MapReduce ?
50
• Aprogrammingmodelforwriting(simple)(data)parallelprograms• Aframeworktoexecutesuchprograms.
Advantages:IfyoumanagetoexpressyourproblemasMapReduce,theframeworkhandles:
• Datadistribution(HDFS)• Fault-tolerance(onmanylevels)• Bringscomputationsclosetodata*• Runsoncommodityhardware• Scalesto1000’sofnodes(failureisinevitable) • Simpleprogrammingmodel(whenyouknowit…)
Thedata(andthecluster)shouldbeverylargefortheadvantagestobecomeapparent.
ItwasdesignedforthecaseofcommodityHWwithouthighperformancenetworkconnects.YoushouldnotthinkofHDFSasa“HPCparallelfilesystem”.Youdon’tcommunicatedatatocompute,youcommunicatecomputetodata.
Basic design
51
http://en.wikipedia.org/wiki/Apache_Hadoop
HandlesJobs,Tasks,logicofcomputingclosetodata
DistributedFilesystem,HDFS
Resources
52
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
ThereareacoupleofexcellentbasictutorialsforgettingstartedtoplayingwithHadooponsinglenodeclustersorsmalldistributedclusters(yes,thisisfun,IrecommenditifyouhavethetimeandasomeLinux(admin)experience).
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
Also,itisrathereasytosetupa“toy-hadoop”onyourownlaptop:
https://hadoop.apache.org/docs/r1.2.1/single_node_setup.html
Setting up Hadoop in Production
53
Thisisacomplexprocess.Inpractice,therearemanyvendorsthatprovidehostedservices,orprojectstomakeitsimplere.g.
• Cloudera• Hortonworks• ElasticMapReduce(EMR)inAWS• OpenStackSahara • ApacheBigtop
—>MoreimportanttogetfamiliarwithMapReduce.
54
https://developers.google.com/appengine/docs/python/dataprocessing/
So how do you use it?
55Frameworkprovidesoptimizedsort,shuffle.
Youprovide.
Key stages of MapReduce
56
1.Splitinput2.Map3.Shuffle4.Reduce
Hadoop Streaming
57
WhatifIdon’twanttouseJava?
TheHadoopstreamingAPIlet’syouwritethemapperandreducerasastandaloneprogram,readingfromstdinandwritingtostdout.YoumapperandreducercannowbewrittenusingPython,Ruby,Bashshell,R,younameit.
catinput.txt|./map|sort|./reduce>output.txt
Logically,itcorrespondstothefollowingUnixpipe:
Performance guidelines
•ThereisasignificantlatencyinstartingMap-tasksbytheframework•Mapsshoulddoconsiderablework(ruleofthumb,workatleast1min)•Defaultblocksizeis64-128MB,inputfilesshouldbelarge(GBsized)•Worksbestfor“one-shot”computations,notiterativeflows,duetothelargelatency.
•UseCombiner(localreduce)toreducethenetworkoverheadforshuffle/reduce.
58
Criticism against MapReduce• Restricted programming model. • Performance
59
http://homes.cs.washington.edu/~billhowe/mapreduce_a_major_step_backwards.html
Anoldbutstillinterestingdiscussion:
http://blog.tonybain.com/tony_bain/2010/09/was-stonebraker-right.html
• Not novel
Apache Spark
60