Migration of Multi-tier Applications to Infrastructure-as-a-Service Clouds:

Cloud Services Innovation Platform (CSIP)

Migration of Multi-tier Applications to Infrastructure-as-a-Service Clouds: An Investigation Using Kernel-based Virtual Machines

Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas

September 23, 2011

Colorado State University, Fort Collins, Colorado USA
Grid 2011: 12th IEEE/ACM International Conference on Grid Computing

Outline
- Cloud Computing Challenges
- Research Questions
- RUSLE2 Model
- Experimental Setup
- Experimental Results
- Conclusions
- Future Work

Traditional Application Deployment
[Diagram labels: Object Store, Single Server, Application Servers]

Cloud Application Deployment
[Diagram labels: Load Balancer, Load Balancer, Service Requests, noSQL datastores, logging, rDBMS]

Provisioning Variation
[Diagram: request(s) to launch VMs are mapped ambiguously onto physical hosts. CPU / memory: reserved. Disk / network: shared. Label: PERFORMANCE]

Virtualization Overhead
[Diagram: application profiles of Application A and Application B across network, disk, CPU, and memory. Label: PERFORMANCE]

Research Questions
- How should multi-tier client/server applications be deployed to IaaS clouds?
- How can we deliver optimal throughput?
- How does provisioning variation impact application performance?
- Does VM co-location matter?
- What overhead is incurred from using Kernel-based Virtual Machines (KVM)?

RUSLE2 Model
- Revised Universal Soil Loss Equation
- Combines empirical and process-based science
- Prediction of rill and interrill soil erosion resulting from rainfall and runoff
- USDA-NRCS agency standard model
- Used by 3,000+ field offices
- Helps inventory erosion rates
- Sediment delivery estimation
- Conservation planning tool

RUSLE2 Web Service
- Multi-tier client/server application
- RESTful, JAX-RS/Java using JSON objects
- Surrogate for common architectures
[Diagram: OMS3, RUSLE2, PostgreSQL, PostGIS; 1.7+ million shapes; 57k XML files, 305 MB]

Eucalyptus 2.0 Private Cloud
- (9) Sun X6270 blade servers
- Dual Intel Xeon 4-core 2.8 GHz CPUs
- 24 GB RAM, 146 GB 15k rpm HDDs
- Ubuntu 10.10 x86_64 (host)
- Ubuntu 9.10 x86_64 & i386 (guests)
- Eucalyptus 2.0
- Amazon EC2 API support
- 8 Nodes (NC), 1 Cloud Controller (CLC, CC, SC)
- Managed mode networking with private VLANs
- Kernel-based Virtual Machines, full virtualization

Experimental Setup
RUSLE2 modeling engine
- Configurable number of worker threads, 1 engine per VM
- HAProxy round-robin load balancing
Model requests
- JSON object representation
- Model inputs: soil, climate, management data
Randomized ensemble tests
- Package of 25/100/1000 model requests (JSON object)
- Decomposed and resent to modeling engine (map)
- Results combined (reduce)

RUSLE2 Component Provisioning
Components: M = Model, D = Database, F = Fileserver, L = Logger
[Diagram: provisioning schemes P1/V1, P2/V2, P3/V3, and P4/V4 place M, D, F, and L on one or more physical hosts]

RUSLE2 Test Models
[Diagram: P1/V1 (M D F L), P2/V2 (M D F L), P3/V3 (M D F L), P4/V4 (M L F D)]
Database bound (d-bound)
- Join on nested query
- Much greater complexity
- CPU bound
Model bound (m-bound)
- Standard RUSLE2 model
- Primarily I/O bound

Timing Data
All times are wall clock time.

Time                 Description
fileIO               XML parameterization file reads; subset of model time
model                RUSLE2 model execution
climate/soil query   Spatial query execution
logging              Logging requests to queue
overhead             Calculated unaccounted-for time
total                Complete model run

Experimental Results

RUSLE2 Application Profile
D-bound: Database 77%, Model 21%, Overhead 1%, File I/O 0.75%, Logging 0.1%
M-bound: Model 73%, File I/O 18%, Overhead 8%, Logging 1%, Database 1%
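As a rough illustration of how the Timing Data categories relate, the sketch below assumes that "overhead" is the total wall clock time minus the separately timed components, and that fileIO is left out of that sum because the slides describe it as a subset of model time. This is a reading of the slides, not the authors' code. The function names are hypothetical; the inputs are the V1 d-bound averages from the KVM Virtualization Overhead table in the extra slides.

```python
def unaccounted_overhead(total, model, climate_query, soil_query, logging):
    """Time not attributed to any separately measured component.

    fileIO is intentionally not a parameter: it is a subset of model
    time, so subtracting it as well would count it twice (assumption).
    """
    return total - (model + climate_query + soil_query + logging)


def share_of_total(total, *component_times):
    """Percentage of total wall clock time spent in the given components."""
    return 100.0 * sum(component_times) / total


# V1 d-bound averages taken from the KVM Virtualization Overhead table.
v1_d_bound = {
    "total": 6.698,
    "model": 1.496,
    "climate_query": 0.613,
    "soil_query": 4.513,
    "logging": 0.0047,
}

overhead = unaccounted_overhead(**v1_d_bound)
database_share = share_of_total(
    v1_d_bound["total"], v1_d_bound["climate_query"], v1_d_bound["soil_query"]
)
print(f"overhead: {overhead:.4f}")
print(f"database share: {database_share:.1f}%")
```

With these inputs the computed overhead falls within rounding of the value listed in that table, and the two spatial queries account for roughly three quarters of the run, which is in line with the d-bound profile.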

Scaling RUSLE2: Single Component Provisioning
V1 stack, 100 model run ensemble

[Figure 2: Impact of varying shared DB connections on average model execution time]

[Figure 3: Impact of varying D VM virtual cores on average model execution time (d-bound)]

[Figure 4: Impact of varying M VM virtual cores on average model execution time]

[Figure 5: Impact of varying worker threads on ensemble execution time]

RUSLE2 V1 Stack
d-bound: 100 model runs in 120 sec (3.75x the m-bound time); 6 workers; 5 dbconn / M; 6 cores: D; 5 cores: M, F, L
m-bound: 100 model runs in 32 sec; 8 workers; 8 dbconn / M; 8 cores: M; 6 cores: D; 5 cores: F, L

Scaling RUSLE2: Multiple Component Provisioning
100 model run ensemble

[Figure 6: Impact of increasing D VMs and db connections on ensemble execution time (d-bound)]

[Figure 7: Impact of varying worker threads on ensemble execution time]

[Figure 8: Impact of varying M VMs on ensemble execution time]

[Figure 9: Impact of varying M VMs and worker threads on ensemble execution time (m-bound)]

RUSLE2 Scaled Up
d-bound: 100 model runs in 21.8 sec (5.5x); 24 workers; 40 dbconn / M; 6 cores: 8 D VMs; 5 cores: 6 M VMs, F, L
m-bound: 100 model runs in 6.7 sec (4.8x); 48 workers; 8 dbconn / M; 8 cores: 16 M VMs; 6 cores: D; 5 cores: F, L

RUSLE2 - Provisioning Variation
[Diagram: V1 (M D F L), V2 (M D F L), V3 (M D F L), V4 (M L F D)]
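The speedup factors on the RUSLE2 Scaled Up slide appear to be the single-stack ensemble time divided by the scaled ensemble time. A minimal check of that arithmetic, using the times reported on the V1 Stack and Scaled Up slides (the function name is hypothetical):

```python
def speedup(baseline_seconds, scaled_seconds):
    """How many times faster the scaled deployment completes the ensemble."""
    return baseline_seconds / scaled_seconds


# 100-model-run ensemble times: single V1 stack vs. scaled deployment.
d_bound = speedup(120.0, 21.8)
m_bound = speedup(32.0, 6.7)
print(f"d-bound: {d_bound:.1f}x, m-bound: {m_bound:.1f}x")
```

Rounded to one decimal place, these reproduce the 5.5x and 4.8x figures on the slide.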

KVM Virtualization Overhead
[Diagram: application profiles across network, disk, CPU, and memory]
D-bound: 10.78%
M-bound: 112.22%

Conclusions
Application scaling
- Applications with different profiles (CPU, I/O, network) present different scaling bottlenecks
- Custom tuning was required to surmount each bottleneck
- NOT as simple as increasing number of VMs
Provisioning variation
- Isolating I/O intensive components yields best performance
Virtualization overhead
- I/O bound applications are more sensitive
- CPU bound applications are less impacted
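The overhead percentages are consistent with the relative increase of the virtualized (V1) average over the physical (P1) average. The sketch below assumes that definition, which the slides do not state explicitly, and applies it to P1/V1 averages reported in the extra-slide overhead table. The function name is hypothetical.

```python
def virtualization_overhead_pct(physical_avg, virtual_avg):
    """Relative slowdown of the virtualized run versus the physical run, in percent.

    A negative value means the virtualized run was faster.
    """
    return 100.0 * (virtual_avg - physical_avg) / physical_avg


# P1 (physical) and V1 (virtual) d-bound averages from the overhead table.
d_bound_total = virtualization_overhead_pct(6.046, 6.698)
d_bound_climate_query = virtualization_overhead_pct(0.692, 0.613)
print(f"d-bound total: {d_bound_total:.2f}%")
print(f"d-bound climate query: {d_bound_climate_query:.2f}%")
```

Because the table lists rounded averages, recomputed percentages for some rows can differ from the reported ones in the last digits.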

Future Work
- Virtualization benchmarking: KVM paravirtualized drivers, XEN hypervisor(s), other hypervisors
- Develop application profiling methods
- Performance modeling based on hypervisor virtualization characteristics and application profiles
- Profiling-based approach to resource scaling

Questions

Extra Slides

Related Work
Provisioning Variation
- Amazon EC2 VM performance variability [Schad et al.]
- Provisioning Variation [Rehman et al.]
Scalability
- SLA-driven automatic bottleneck detection and resolution [Iqbal et al.]
- Dynamic 4-part switching architecture [Liu and Wee]
Virtualization Benchmarking
- KVM/XEN Hypervisor comparison [Camargos et al.]
- Cloud middleware and I/O paravirtualization [Armstrong and Djemame]

IaaS Cloud Computing
Benefits:
- Multiplexing resources w/ VMs
- Hybrid Clouds (private, public)
- Elasticity, Scalability
- Service Isolation
Challenges:
- Virtual Resource Tuning
- Virtualization Overhead
- VM image composition
- Resource Contention
- Application Tuning

IaaS Cloud Benefits (1/2)
Hardware Virtualization
- Enables sharing CPU, memory, disk, and network resources of multi-core servers
- Paravirtualization: XEN
- Full Virtualization: KVM
Service Isolation
- Infrastructure components run in isolation
- Virtual machines (VMs) provide explicit sandboxes
- Easy to add/remove/change infrastructure components

IaaS Cloud Benefits (2/2)
Resource Elasticity
- Enabled by service isolation
- Dynamic scaling of multi-tier application resources
- Scale number, location, and size of VMs
- Dynamic Load Balancing
Hybrid Clouds
- Enables scaling beyond local private cloud capacity
- Augment private cloud resources using a public cloud, e.g. Amazon EC2

IaaS Cloud Challenges
Application deployment
- Application tuning for optimal performance
Provisioning Variation
- Ambiguity of where virtual machines are provisioned across physical cloud machines
Hardware Virtualization Overhead
- Performance degradation from using virtual machines

RUSLE2: Multi-tier Client/Server application
Application stack surrogate for:
- Web Application Server: Apache Tomcat hosts RUSLE2 model
- Relational Database: Postgresql supports geospatial queries for determining climate, soil, and management characteristics
- File Server: Nginx provides climate, soil, and management XML files used for model parameterization
- Logging Server: Codebeamer model logging/tracking

Experimental Setup (1/2)
RESTful webservice
- Java implementation using JAX-RS
- JSON objects
Object Modeling System 3.0
- Java Framework supporting component oriented modeling
- Interfaces with RUSLE2 Legacy Visual C++ implementation using RomeShell and WINE
- Hosted by Apache Tomcat

RUSLE2 Components
Virtual Machine: Description
- M (Model): 64-bit Ubuntu 9.10 server w/ Apache Tomcat 6.0.20, Wine 1.0.1, RUSLE2, Object Modeling System (OMS 3.0)
- D (Database): 64-bit Ubuntu 9.10 server w/ Postgresql-8.4 and PostGIS 1.4.0-2. Soil data: 1.7 million shapes, 167 million points. Management data: 98 shapes, 489k points. Climate data: 31k shapes, 3 million points. 4.6 GB for the states of TN and CO.
- F (File Server): 64-bit Ubuntu 9.10 server w/ nginx 0.7.62. Serves XML files to parameterize RUSLE2: 57,185 XML files consisting of 305 MB.
- L (Logger): 32-bit Ubuntu 9.10 server with Codebeamer 5.5 running on Tomcat. Custom RESTful JSON-based logging web service provides a wrapper.

Provisioning Variation
Physical location of VM placement is nondeterministic, which may result in varying VM performance characteristics.

Scheme   Node 1    Node 2   Node 3   Node 4
P1/V1    M D F L
P2/V2    M         D F L
P3/V3    M         D        F        L
P4/V4    M L F     D

RUSLE2 Deployment
Two versions tested
Database bound (d-bound)
- Model throughput bounded by performance of spatial queries
- Spatial queries were more complex than required
- Primarily processor bound
Model bound (m-bound)
- Model throughput bounded by throughput of RUSLE2 modeling engine
- Processor and File I/O bound

RUSLE2 - Single Stack
D-bound
- 100-model run ensemble: ~120 seconds
- 6 worker threads, 5 database connections
- D: 6 CPU cores
- M, F, L: 5 CPU cores
M-bound
- 100-model run ensemble: ~32 seconds
- 8 worker threads, 8 database connections
- M: 8 CPU cores
- D: 6 CPU cores
- F, L: 5 CPU cores

RUSLE2 - scaled using IaaS cloud
D-bound
- 100-model run ensemble: ~21.8 seconds (5.5x)
- 24 worker threads, 40 database connections per M
- D: 8 VMs, 6 CPU cores
- M: 6 VMs, 5 CPU cores
- F, L: 5 CPU cores
M-bound
- 100-model run ensemble: ~6.7 seconds (4.8x)
- 48 worker threads, 8 database connections per M
- M: 16 VMs, 8 CPU cores
- D: 6 CPU cores
- F, L: 5 CPU cores

KVM Virtualization Overhead

                D-bound    P1 D-bound  V1 D-bound  M-bound    P1 M-bound  V1 M-bound
                Virt O/H   average     average     Virt O/H   average     average
TOTAL           10.78%     6.046       6.698       112.22%    .845        1.792
model           54.50%     .968        1.496       100.16%    .816        1.632
fileIO          319.70%    .056        .234        463.54%    .0566       .319
climate query   -11.41%    .692        .613        404.54%    .00128      .00645
soil query      3.25%      4.371       4.513       12.04%     .0118       .0133
logging         1360.69%   .0003       .0047       2680.58%   .00035      .0959
overhead        395.14%    .0143       .0708       740.02%    .0155       .130

[Figure: Impact of varying worker threads with 16 M VMs on ensemble execution time (m-bound; 8 cores: 16 M VMs)]

RUSLE2 - Provisioning Variation
[Diagram: P1/V1 (M D F L), P2/V2 (M D F L), P3/V3 (M D F L), P4/V4 (M L F D)]

Virtualization
Virtual Machines (guests)
- Software programs hosted by a physical computer
- Appear as a single process on the host machine
- No direct access to physical devices
- Devices are emulated
Incurs varying degrees of overhead
- Processor
- Device I/O

Types of Virtualization
Paravirtualization (XEN - Amazon)
- Device emulation provided using special Linux kernels
- Almost direct access to some physical resources leads to faster I/O performance
Full virtualization (KVM - Eucalyptus, others)
- Device emulation provided natively with on-CPU support
- Special kernels not required
- CPU mode switching for device I/O leads to slower I/O performance
Container based virtualization (OpenVZ, Linux-VServer)
- Not true virtualization, but operating system containers, where all use same kernel
- No commercial vendor support

Testing Infrastructure
Ensemble runs
- Groups of RUSLE2 model runs packaged together as a single JSON object
- 25, 100, and 1000 model runs
Randomized model parameterization
- Slope length, steepness, management practice, latitude, longitude
Defeating Caching
- All services restarted prior to each test
- Eliminates training effect from repeat execution of model test sets
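The ensemble workflow described under Experimental Setup and Testing Infrastructure (package randomized runs as one JSON object, decompose it, send each run to the modeling engine, combine the results) can be sketched roughly as follows. Everything here is an assumption for illustration: the field names, the value ranges, the management labels, and `run_model`, which is a local placeholder standing in for a request to the RUSLE2 web service and does not compute soil loss.

```python
import json
import random
from concurrent.futures import ThreadPoolExecutor


def make_ensemble(n_runs, seed=0):
    """Package n_runs randomized model requests as a single JSON document."""
    rng = random.Random(seed)
    runs = []
    for run_id in range(n_runs):
        runs.append({
            "id": run_id,
            # Value ranges below are arbitrary placeholders.
            "slope_length": rng.uniform(10.0, 300.0),
            "steepness": rng.uniform(0.5, 30.0),
            "management": rng.choice(["mgmt_a", "mgmt_b", "mgmt_c"]),
            "latitude": rng.uniform(35.0, 41.0),
            "longitude": rng.uniform(-109.0, -82.0),
        })
    return json.dumps(runs)


def run_model(request):
    """Placeholder for one model run; a real client would call the web service."""
    return {"id": request["id"], "status": "done"}


def run_ensemble(ensemble_json, worker_threads=8):
    """Decompose the ensemble, run each request (map), and combine results (reduce)."""
    requests = json.loads(ensemble_json)
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        # pool.map preserves request order in the returned results.
        results = list(pool.map(run_model, requests))
    return json.dumps(results)


combined = json.loads(run_ensemble(make_ensemble(25)))
print(len(combined))
```

The `worker_threads` parameter mirrors the configurable worker thread count mentioned in the slides; in the actual deployment the requests were spread across model VMs by HAProxy rather than by a local thread pool.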
