OSU – CSE 2014 Supporting Data-Intensive Scientific Computing on Bandwidth and Space Constrained Environments Tekin Bicer Department of Computer Science




Supporting Data-Intensive Scientific Computing on Bandwidth and Space Constrained Environments
Tekin Bicer
Department of Computer Science and Engineering
The Ohio State University
Advisor: Prof. Gagan Agrawal

Introduction
- Scientific simulations and instruments
  - X-ray Photon Correlation Spectroscopy
    - CCD Detector: 120MB/s now; 10GB/s by 2015
  - Global Cloud Resolving Model
    - 1PB for 4km grid-cell
- Performed on local clusters
  - Not sufficient
- Problems
  - Data analysis, storage, I/O performance
- Cloud technologies

Hybrid Cloud Motivation
- Properties of cloud technologies
  - Elasticity
  - Pay-as-you-go model
- Types of resources
  - Computational resources
  - Storage resources
- Hybrid Cloud
  - Local resources: base
  - Cloud resources: additional

Usage of Hybrid Cloud
[Figure labels: Cloud Storage, Local Storage, Data Source, Local Nodes, Cloud Compute Nodes]

Challenges
- Data-Intensive Processing
- Transparent Data Access and Analysis
- Meeting User Constraints
- Minimizing Storage and I/O Cost
- Domain Specific Compression
- Parallel I/O with Compression
Proposed systems
- MATE-HC: Map-reduce with AlternaTE API
- Dynamic Resource Allocation Framework with Cloud Bursting
- Compression Methodology and System for Large-Scale App.

MATE-HC: Map-reduce with AlternaTE API over Hybrid Cloud
- Transparent data access and analysis
  - Metadata generation
- Programmability of large-scale applications
  - Variant of MapReduce
- MATE-HC
  - Selective job assignment
    - Consideration of data locality
    - Different data objects
  - Multithreaded remote data retrieval
  - Asynchronous informed prefetching and caching

MATE vs. Map-Reduce Processing Structure
- The reduction object represents the intermediate state of the execution
- The reduce function is commutative and associative
- Sorting, grouping, and similar overheads are eliminated with the reduction function/object
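
The reduction-object idea on the slide above can be illustrated with a minimal sketch. This is not the MATE or MATE-HC API. Every name here (`new_reduction_object`, `local_reduce`, `global_reduce`, `kmeans_step`) is hypothetical, and a one-dimensional KMeans step is assumed purely for brevity. Each worker folds its data chunk directly into a small accumulator. The partial accumulators are then merged with a commutative, associative function, so no intermediate key/value pairs need to be sorted or grouped.

```python
from functools import reduce


def new_reduction_object(k):
    # Hypothetical reduction object: one [sum, count] accumulator per cluster.
    return [[0.0, 0] for _ in range(k)]


def local_reduce(robj, chunk, centroids):
    # Fold each point straight into the reduction object (no emitted pairs).
    for x in chunk:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        robj[nearest][0] += x
        robj[nearest][1] += 1
    return robj


def global_reduce(a, b):
    # Element-wise addition: commutative and associative, so partial
    # results from local and cloud workers can be merged in any order.
    return [[sa + sb, ca + cb] for (sa, ca), (sb, cb) in zip(a, b)]


def kmeans_step(chunks, centroids):
    # One reduction object per chunk stands in for one per worker.
    partials = [
        local_reduce(new_reduction_object(len(centroids)), chunk, centroids)
        for chunk in chunks
    ]
    merged = reduce(global_reduce, partials)
    # Keep the old centroid if a cluster received no points.
    return [s / c if c else centroids[i] for i, (s, c) in enumerate(merged)]
```

Because `global_reduce` does not depend on merge order, the same function could in principle serve both the per-cluster merge and a cross-cluster global reduction. How MATE-HC actually schedules these merges is not specified in the slides.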

Middleware for Hybrid Cloud
[Figure labels: Remote Data Analysis, Job Assignment, Global Reduction]

Dynamic Resource Allocation for Cloud Bursting
- Competition for local resources complicates applications with deadlines
  - Cloud resources can be utilized
  - Utilization of cloud resources incurs cost
- Dynamic Resource Allocation Framework
  - A model for capturing time and cost constraints with cloud bursting
- Cloud Bursting
  - In-house resources: base workload
  - Cloud resources: adapt to performance requirements

Resource Allocation Framework
- Estimate required time for local cluster processing
- Estimate required time for cloud cluster processing
- All variables can be profiled during execution, except the estimated # of stolen jobs
- Estimate remaining # of jobs after local jobs are consumed
- Ratio of local computational throughput in system

Execution of Resource Allocation Framework
- Slave Nodes
  - Request and consume jobs
- Master Node
  - Each cluster has one
  - Collects profile info. during job request time
  - (De)allocates resources
- Head Node
  - Evaluates profiled info.
  - Estimates # of cloud instances before each job assignment
  - Informs Master nodes

Experimental Setup
- Two applications
  - KMeans (520GB): Local=104GB; Cloud=416GB
  - PageRank (520GB): Local=104GB; Cloud=416GB
- Local cluster: Max. 16 nodes x 8 cores = 128 cores
- Cloud resources: Max. 16 nodes x 8 cores = 128 cores
- Evaluation of model
  - Local nodes are dropped during execution
  - Observed how the system adapted

KMeans Time Constraint
- # Local Inst.: 16 (fixed)
- # Cloud Inst.: Max 16 (varies)
- Local: 104GB, Cloud: 416GB
- System is not able to meet the time constraint because max. # of cloud instances is reached
- All other configurations meet the time constraint with