
8/2/2018


Cloud Datacentre Reliability

Cloud Resource Provisioning

Talk Outline

• Cloud computing applications

• Hybrid Cloud Resource Provisioning Problem Overview

– Informal and formal definition of the problem

• Security and Reliability Challenges

• Addressing Reliability Problem

– Service Selection Algorithms

– Cloud Selection Algorithms

– Scheduling (Mapping VMs to PM) Algorithms

• Addressing Security Problem

– Trust Management

• Performance Evaluation

• Result and Discussion

• Conclusion

Smart Living


Social Network Data

• Social network data is rich in content and relationships that are quite valuable to many third-party consumers:

– sociologists (e.g., for studying social structure),

– epidemiologists (e.g., to understand infectious disease dynamics),

– businesses (e.g., to drive marketing campaigns and to enable better social targeting of advertisements) and

– criminologists (e.g., identifying insurgent networks and determining leaders and active cells, fraud detection).

• SN operators routinely publish sanitized versions of the social network data collected.

Smart Agriculture

Source: M. Imran, R. Zurita-Milla, R. de By, ITC – Univ. Twente, AGILE Conference 2011

• Networked physical objects (devices, vehicles, buildings, etc.) capable of collecting and exchanging data.

• Cloud serves as a backend infrastructure

– Imagine your network with 1,000,000 more devices

– Any compromised device is a foothold on the network

Biplob R. Ray, Jemal H. Abawajy, Morshed U. Chowdhury, Abdulhameed Alelaiwi: Universal and secure object ownership transfer protocol for the Internet of Things. Future Generation Comp. Syst. 78: 838-849 (2018)


Application of Cloud in Agriculture

• Exploiting cloud computing with technologies such as wireless sensor networking and mobile computing

– Cloud computing based livestock monitoring and disease forecasting system

– Cloud-based autonomic information system for delivering agriculture-related information as a service to improve sustainability, efficiency and quality.

– Taking the market to smallholder farmers: reaching formal markets faces challenges with quality, quantity, and high transaction costs.

Application of Cloud in Health

• Cloud computing is used for a wide variety of health applications

Tahsien Al-Quraishi, Jemal H. Abawajy, Morshed U. Chowdhury, Sutharshan Rajasegarar, Ahmad Shaker Abdalrada: Breast Cancer Recurrence Prediction Using Random Forest Model. SCDM 2018: 318-329

Application of Cloud in Health

Sara Ghanavati, Jemal H. Abawajy, Davood Izadi, Abdulhameed Alelaiwi: Cloud-assisted IoT-based health status monitoring framework. Cluster Computing 20(2): 1843-1853 (2017)

Federated Internet of Things and Cloud Computing Pervasive Patient Health Monitoring System. IEEE Communications Magazine 55(1): 48-53 (2017)

• Taking hospitals and clinics to the people, not the people to the hospitals


• Businesses - Discovering valuable new insights (e.g., consumer purchasing trends to better target marketing).

• Security - Support decision making (e.g., detect fraud, disaster management, etc.).

• Medical - Reveal new trends and patterns that were previously hidden (e.g., likelihood of being predisposed to an incurable disease).

• Agriculture - (e.g., spot unusual changes in land use pattern).

• E-governance - Deliver personalised and streamlined services that accurately and specifically meet individuals’ needs in a timely manner.

• Future applications - where users and machines will need to collaborate in intelligent ways (e.g., smart city).

• Intelligence and scientific discovery

• …

Cloud Computing

• Data centres incur high energy costs and leave a large carbon footprint.

– Financial issues: in excess of $11 billion in 2010, and the cost doubles every five years.

– Reliability issues: for every 10° increase in temperature, the failure rate of a system doubles.

– Environmental issues: close to 2% of CO2 emissions when data centre systems are factored into the equation.

• Problem Statement

– How can we improve energy efficiency without reducing QoS?

Introduction


Problem Overview

• A shared pool of rentable (public) clouds

• A shared pool of personal (private) clouds

• A set of cloud computing users

• A hybrid cloud is a partially sharable pool that systematically integrates the public and private clouds (the union of the two pools)

Hybrid Cloud Computing….

• The most widely used cloud computing model.

• The broker is the core component for selecting suitable resource providers.

• We developed a broker.

[Figure: InterGrid Gateway architecture. Components: Persistence (DB, Java Derby); Communication Module (Message-Passing); Management & Monitoring (JMX); Scheduler (Provisioning Policies & Peering); Virtual Machine Manager (Emulator, Local Resources, IaaS Provider, Grid Middleware).]

Workload

• A set of tightly coupled jobs submitted by private cloud computing users

• Each job is described by a five-element tuple:

– type of required virtual machines

– estimated service demand

– deadline of the request

– arrival time

– number of VMs needed (all VMs must be available for the whole required duration)
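The job tuple can be sketched as a small record type. This is only an illustration: the field names, the time units, and the validation rules are assumptions, since the slide's own symbols did not survive extraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical record for one job request (field names are assumed)."""
    vm_type: str           # type of required virtual machines
    service_demand: float  # estimated service demand (unit assumed: hours)
    deadline: float        # deadline of the request (assumed absolute time)
    arrival_time: float    # arrival time of the request
    num_vms: int           # number of VMs needed for the whole duration

    def __post_init__(self):
        # Assumed sanity checks; the slides do not state them explicitly.
        if self.num_vms < 1:
            raise ValueError("a job needs at least one VM")
        if self.deadline < self.arrival_time:
            raise ValueError("deadline cannot precede arrival time")


job = Job(vm_type="small", service_demand=2.5, deadline=10.0,
          arrival_time=1.0, num_vms=4)
```

A tightly coupled job such as a matrix multiplication would carry a single `num_vms` value because all of its VMs must be held for the whole run.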

Matrix multiplication

Gene Sequencing


Problem Overview…

• Informally, the problem is how to execute user applications on a hybrid cloud such that both users and cloud owners are happy.

– Where to execute the application (either public or private cloud)?

– How can we select the best public cloud to execute the application?

– How should jobs be scheduled locally (mapping VMs to Physical Machines)?

NP-Hard Problem

• The parties' objectives and requirements differ: (a) cloud owners want to maximize their return on investment; (b) cloud customers want to minimize their cost. For example, cloud providers are interested in the following:

• Selecting the best service from many functionally similar services is a cumbersome task for the user.

• Services may differ in the quality of service offered, in the semantics of data access (both reading and writing), and in the interfaces or security mechanisms implemented.

Hybrid Cloud Reliability Challenges

Temporal correlation: the failure rate is time-dependent and some periodic failure patterns can be observed in different time-scales

Spatial correlation : multiple failures occur on different nodes within a short time interval

• Resource failure is inevitable

• Redundant components in public Clouds (much more reliable service than private cloud)

• Leads to service failure in private Clouds


Hybrid Cloud Security Challenges

• Problem 1: How can we improve the usability and security of end-entity credential management on the Cloud?

– The security of a system often depends on how securely user credentials are managed.

• Problem 2: Hybrid Cloud opens up the possibility of misusing information to a degree never seen before.

– Insider threats are a very serious problem.

• Problem 3: Detecting and mitigating HX-DoS attacks against Cloud Web Services

Reliability-aware Hybrid Cloud Resource Provisioning

Cloud Resource

• A shared pool of rentable clouds

• Each cloud consists of a pool of shared resources

• Each resource is described by a four-element tuple:

– available service capacity, between 0 and 100%

– service unit price

– service type {CPU, Storage, Network}

– service status {failed, working}


Resource allocation problem formulation

• Given

– A shared pool of rentable cloud computing resources

– A set of jobs

• Objective

– [Equations not recoverable: two user-centric minimization objectives and one system-centric objective, each subject to (s.t.) constraints]

Cloud Service Cost

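• The cost of provisioning public cloud resources for a VM is summed over every resource type:

cost = Σ over resource types (amount of the resource type needed by the VM × unit price of that service type)

• Where

– amount: the amount of the resource type (i.e., CPU, Storage, Network) needed by the VM

– unit price: the unit price of the service type charged by the cloud service provider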
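The cost definition above amounts to a sum of amount times unit price over the resource types. A minimal sketch follows; the function name, dictionary layout, and the sample prices are all made up for illustration.

```python
def provisioning_cost(demand, unit_price):
    """Sum amount * unit price over every resource type a VM needs.

    demand:     {resource type: amount needed by the VM}
    unit_price: {resource type: price per unit at one provider}
    Both layouts are assumptions made for this sketch.
    """
    return sum(amount * unit_price[rtype] for rtype, amount in demand.items())


# Hypothetical VM demand and provider prices (illustrative values only).
vm_demand = {"cpu": 2, "storage": 50, "network": 10}
prices = {"cpu": 0.5, "storage": 0.01, "network": 0.02}
cost = provisioning_cost(vm_demand, prices)
```

A missing price raises `KeyError`, which seems preferable to silently treating an unpriced resource as free.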

Job Deadline

• Each job has a deadline.

• The deadline requires that the results of the job be ready no later than the deadline:

arrival time + actual time taken to complete execution ≤ deadline

• Where

– actual time taken: the time actually taken to complete execution of the job

– arrival time: the arrival time of the job


Where to execute the application (either public or private cloud)?

Size-based Brokering Strategies

• Based on the fact that the number of VMs requested by a job follows a two-stage uniform distribution with parameters (l, m, h, q)

• Schedule wider requests to the public cloud and narrower requests to the private cloud

– Uses the mean number of VMs per request to distinguish between wide and narrow requests

• Ideal for spatial correlation, where multiple failures occur on different nodes within a short time interval

Size-based Brokering Strategies

Algorithm: Size-based
BEGIN
  Compute the mean value of the two-stage uniform distribution with parameters (l, m, h, q):
    mean = (l·q + m + (1 − q)·h) / 2
  FOR each job DO
    Compute the mean number of VMs required
    IF the number of VMs requested by the job exceeds the mean THEN
      Send it to public cloud
    ELSE
      Send it to private cloud
  ENDFOR
END Algorithm
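The size-based rule can be sketched in a few lines. The sketch assumes the two-stage uniform distribution draws from [l, m] with probability q and from [m, h] otherwise, and it takes the mean number of VMs as an input rather than deriving it, because that formula is not legible in the slide. Function names are hypothetical.

```python
def two_stage_uniform_mean(l, m, h, q):
    """Mean of a two-stage uniform distribution.

    Assumed form: uniform on [l, m] with probability q,
    uniform on [m, h] with probability 1 - q.
    """
    return (q * l + m + (1 - q) * h) / 2


def size_based_broker(vm_counts, mean_vms):
    """Send requests wider than the mean to the public cloud."""
    return ["public" if n > mean_vms else "private" for n in vm_counts]


# Illustrative request widths; the threshold here is their sample mean.
widths = [1, 2, 8, 16]
placement = size_based_broker(widths, sum(widths) / len(widths))
```

Requests exactly at the mean stay private in this sketch; the slide does not say how ties are handled.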


Time-based Brokering Strategies

• Based on the observation that request durations (job runtimes) in real distributed systems are long-tailed

– This means that a very small fraction of all requests are responsible for the main part of the load.

• Ideal for temporal correlation, where the failure rate is time-dependent and some periodic failure patterns can be observed in different time-scales

The shortest 80% of requests contribute only 20% of the total load

The longest 20% of requests contribute 80% of the total load

Time-based Brokering Strategies

Algorithm: Time-based
BEGIN
  Request duration follows a lognormal distribution with parameters μ and σ
  FOR each job DO
    // Compute the mean duration of a job
    IF the job's duration exceeds the mean duration THEN
      Send it to public cloud
    ELSE
      Send it to private cloud
  ENDFOR
END Algorithm
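A minimal sketch of the time-based rule, using the standard closed form for the mean of a lognormal distribution, exp(μ + σ²/2). The comparison direction (longer than the mean goes public) is an assumption taken from the statement that long requests are sent to the public Cloud; function names are hypothetical.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution with parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2)


def time_based_broker(durations, mu, sigma):
    """Send requests longer than the mean duration to the public cloud."""
    threshold = lognormal_mean(mu, sigma)
    return ["public" if d > threshold else "private" for d in durations]


# Illustrative durations; with mu = 0 and sigma = 0 the mean is exactly 1.
placement = time_based_broker([0.5, 2.0], mu=0.0, sigma=0.0)
```

Because the distribution is long-tailed, only a small share of requests exceeds the mean, yet those requests carry most of the load.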

Area-based Brokering Strategies

• Uses the area of a request, i.e., the area of the rectangle whose length is the request duration and whose width is the number of VMs, as the decision point.

• Makes a compromise between the size-based and time-based strategies

– This strategy sends long and wide requests to the public Cloud.

– It would be more conservative than a size-based strategy and less conservative than a time-based strategy.



• The mean area of the requests


Area-based Brokering Strategies

Algorithm: Area-based
BEGIN
  Calculate the mean request area of a job:
    mean area = mean duration · mean number of VMs
  FOR each job DO
    area = duration · number of VMs
    IF area exceeds the mean area THEN
      Send it to public cloud
    ELSE
      Send it to private cloud
  ENDFOR
END Algorithm
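The area-based rule can be sketched as follows, assuming a request's area is its number of VMs multiplied by its duration and the threshold is the product of the two means. The function name and sample values are hypothetical.

```python
def area_based_broker(requests, mean_vms, mean_duration):
    """Send requests whose area exceeds the mean area to the public cloud.

    requests: iterable of (number of VMs, duration) pairs (assumed layout).
    """
    mean_area = mean_vms * mean_duration
    return ["public" if n * d > mean_area else "private" for n, d in requests]


# Illustrative requests: (VMs, duration in hours); mean area is 4 * 2 = 8.
placement = area_based_broker([(2, 1.0), (8, 3.0), (1, 10.0)],
                              mean_vms=4, mean_duration=2.0)
```

Note that the third request is narrow but very long, so its area still sends it to the public cloud, which is the compromise between the size-based and time-based rules.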

How should jobs be scheduled locally (mapping VMs to PMs)?


Server Level Scheduling Algorithms

Algorithm: Slowdown-based
BEGIN
  threshold ← 0
  waiting time of job ← 0
  run time of job ← 0
  FOR each job DO
    Compute the slowdown of the job from its waiting time and run time
    IF the slowdown exceeds the threshold THEN
      Grant a reservation to the job
    ENDIF
  ENDFOR
END Algorithm

Server Level Scheduling Algorithms

Algorithm: Advance-based (AB) Cautiously
BEGIN
  // job at the head of the queue
  // slowdown of job k
  FOR each job DO
    IF [slowdown condition not recoverable] THEN
      Move the job ahead in the queue
    ENDIF
  ENDFOR
END Algorithm

Performance Evaluation


Experimental setup

• Used real failure traces and a workload model.

• Performance Metrics

– Deadline violation rate

– Slowdown

– Cloud Cost on EC2

• Failures from Failure Trace Archive (Grid’5000 traces)

– 18 months, 800 events/node

– Average availability: 22.26 hours

– Average unavailability: 10.22 hours

Slowdown Metrics

• Bounded slowdown is response time normalized by running time; for request i it can be written as

slowdown_i = (W_i + T_i) / T_i

• where

– W_i is the waiting time of request i

– T_i is the run time of request i
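A minimal sketch of the slowdown metric. The plain form is response time over run time; the bounded form shown here follows the commonly used definition in the job-scheduling literature, which replaces very short run times by a lower bound. The 10-second default bound and the function names are assumptions, not values from the slides.

```python
def slowdown(wait, run):
    """Response time (wait + run) normalized by run time."""
    if run <= 0:
        raise ValueError("run time must be positive")
    return (wait + run) / run


def bounded_slowdown(wait, run, bound=10.0):
    """Slowdown with run time bounded from below (bound is assumed).

    The bound keeps very short requests from dominating the average,
    and the result is never reported below 1.
    """
    if run <= 0:
        raise ValueError("run time must be positive")
    return max((wait + run) / max(run, bound), 1.0)


values = [bounded_slowdown(w, t) for w, t in [(30.0, 1.0), (0.0, 1.0)]]
```

Averaging `bounded_slowdown` over all requests gives one number per provisioning policy that can be compared across arrival rates.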

Usage Cost Metrics

• The cost of using EC2 for policy pl can be calculated as follows:

• Where

– Hpl: the public Cloud usage per hour

– Mpl: the fraction of requests redirected to the public Cloud

– Hu: startup time for initialization of the OS on a virtual machine (80s)

– Cn: the cost of one specific instance on EC2 (0.085 USD per virtual machine per hour for a small instance)

– Bn: the amount of data transferred to Amazon’s EC2 for each request (0.1 USD per GB)


Deadline Metrics

• The deadline for an application is set as follows:

deadline = submission time + f × turnaround time

– job submission time

– job completion time

– job turnaround time (completion time − submission time)

– f: stringency factor (f > 1 is a normal deadline, e.g., f = 1.3)
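A small sketch of how such a deadline and the deadline violation rate could be computed. It assumes the deadline equals the submission time plus f times the turnaround time of a reference run, with turnaround being completion minus submission; that reading is a reconstruction and the function names are hypothetical.

```python
def deadline(submit, complete, f=1.3):
    """Deadline from a reference run (assumed form: submit + f * turnaround)."""
    turnaround = complete - submit
    return submit + f * turnaround


def violation_rate(deadlines, finish_times):
    """Fraction of jobs that finished after their deadline."""
    if not deadlines:
        raise ValueError("no jobs to evaluate")
    missed = sum(1 for d, t in zip(deadlines, finish_times) if t > d)
    return missed / len(deadlines)


# Illustrative values: two jobs, the second misses its deadline.
d1 = deadline(submit=0.0, complete=10.0)
rate = violation_rate([d1, 5.0], [12.0, 6.0])
```

A larger f loosens every deadline, so the violation rate for the same schedule can only stay equal or drop.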

Result and Discussion

Deadline Violation Rate Analysis

• Violation rate as a function of the job arrival rate


Deadline Violation Rate Analysis

• Violation rate as a function of the request size

Deadline Violation Rate Analysis

• Violation rate as a function of the job duration

Slowdown Analysis

• Slowdown for all provisioning policies as a function of job arrival rate.


Slowdown Analysis

• Slowdown for all provisioning policies as a function of job size.

Slowdown Analysis

• Slowdown for all provisioning policies as a function of job service demand.

Public Cloud Usage Cost Analysis

• Cloud Cost on EC2 for all provisioning policies as a function of job arrival time.


Public Cloud Usage Cost Analysis

• Cloud Cost on EC2 for all provisioning policies as a function of job size.

Public Cloud Usage Cost Analysis

• Cloud Cost on EC2 for all provisioning policies as a function of job service demand.

Cloud Resource Management

• Scheduling is a problem that has many variants.

– Optimizing one objective has been widely studied for many combinatorial problems including scheduling.

– The most popular objective is the makespan, which is informally defined as the completion time of the last finishing task of an application represented by a precedence task graph.


Understanding Cloud Computing

[Figure: cloud service models that users rent on demand. Software as a Service (e.g., Salesforce.com; Web 2 applications, email, etc.); Platform as a Service (development & test, development & integration, etc.); Infrastructure as a Service (e.g., Amazon S3, EC2; storage).]

Advantages

• Infinite compute resources on demand (virtualization)

• Accessibility anytime and anywhere

• Elimination of the upfront commitment of users

• Reduced costs due to dynamic hardware provisioning

• Pay-per-use basis (and also other models)

• No need to plan for peak load in advance

• Easy management: software versioning and upgrading

Risks

• Performance: how to guarantee performance?

• Security

– How much do you trust your provider?

– What about recovery, tracing, and data integrity?

– Who accesses your data?

“A utility-oriented distributed computing system consisting of a collection of inter-connected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements (SLA) established through negotiation between the service provider and consumers.”

Cloud Computing ???

Microsoft Azure

How can cloud computing be made energy efficient?

Source: Raj

Hybrid Cloud Infrastructure

• Resource manager for the private Cloud

• Able to start, pause, resume, and stop VMs on the physical resources

• Able to migrate VMs for consolidation purposes

Size‐based Brokering Strategies

Time‐based Brokering Strategies

Area‐based Brokering Strategies

Greedy Algorithm

Greedy Particle Swarm Optimization (GPSO)


Think Yourself

Thank You...

Jemal H. Abawajy

Collaborative Work?