56
Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions Coriolis: Scalable VM Clustering in Clouds Daniel Campello 1 <dcamp020@fiu.edu> Carlos Crespo 1 Akshat Verma 2 Raju Rangaswami 1 Praveen Jayachandran 2 1 School of Computing and Information Sciences Florida International University 2 IBM Research - India 1/21

Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis: Scalable VM Clustering in Clouds

Daniel Campello 1<[email protected]> Carlos Crespo 1

Akshat Verma 2 Raju Rangaswami 1 Praveen Jayachandran 2

1School of Computing and Information SciencesFlorida International University

2IBM Research - India

1 / 21

Page 2: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

The Benefits of Virtualization

2 / 21

Page 3: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtualization in Data Centers

3 / 21

Page 4: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtualization in Data Centers

3 / 21

Page 5: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtualization in Data Centers

VirtualMachineSprawl

3 / 21

Page 6: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Cloud Computing

Age of the Cloud

◮ IT is no longer capital-intensive

◮ Commodity acquired on-demand

◮ Paid as per usage

4 / 21

Page 7: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Cloud Computing

Age of the Cloud

◮ IT is no longer capital-intensive

◮ Commodity acquired on-demand

◮ Paid as per usage

Emerging problem

◮ Virtual machine sprawl

4 / 21

Page 8: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Cloud Computing

Age of the Cloud

◮ IT is no longer capital-intensive

◮ Commodity acquired on-demand

◮ Paid as per usage

Emerging problem

◮ Virtual machine sprawl

Standardization is the key

◮ Allow services on-demand

◮ Reduce system management costs in the software stack

4 / 21

Page 9: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Motivating Virtual Machine Clustering

Classifying (possibly) diverse virtualized servers in a cloud into

clusters of similar virtual machines (VMs) can improve the

planning of many system management activities

5 / 21

Page 10: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Outline

1 Introduction

2 Virtual Machine Similarity

Content

Semantic

Use Cases

3 Clustering Techniques

k-means and k-medoids

Coriolis’ tree-based

4 Evaluation

5 Related Work

6 Summary

6 / 21

Page 11: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Content Similarity

◮ Refers to data similarity in the raw files

X Subset of bytes contained within images are identical

7 / 21

Page 12: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Content Similarity

◮ Refers to data similarity in the raw files

X Subset of bytes contained within images are identical

◮ Extensively studied in the context of data deduplication

7 / 21

Page 13: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Content Similarity

◮ Refers to data similarity in the raw files

X Subset of bytes contained within images are identical

◮ Extensively studied in the context of data deduplication

Recent Findings

◮ Large-scale study of VM in a production IaaS cloud:

7 / 21

Page 14: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Content Similarity

◮ Refers to data similarity in the raw files

X Subset of bytes contained within images are identical

◮ Extensively studied in the context of data deduplication

Recent Findings

◮ Large-scale study of VM in a production IaaS cloud:

X Images tend to be similar to a small subset of collection.

7 / 21

Page 15: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Content Similarity

◮ Refers to data similarity in the raw files

X Subset of bytes contained within images are identical

◮ Extensively studied in the context of data deduplication

Recent Findings

◮ Large-scale study of VM in a production IaaS cloud:

X Images tend to be similar to a small subset of collection.X Computing pair-wise similarity is very expensive

7 / 21

Page 16: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Semantic Similarity

◮ Characterizes the similarity of the software functionality.

8 / 21

Page 17: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Semantic Similarity

◮ Characterizes the similarity of the software functionality.

◮ Some examples:

X Instances of same application

X Different versions of the same application

X Different applications with same goal (i.e MySQL and DB2)

8 / 21

Page 18: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Harnessing Image Similarity

System Management Scenarios

◮ Allocation of servers to system administrators

X Administrators can manage up to 80% more servers

9 / 21

Page 19: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Harnessing Image Similarity

System Management Scenarios

◮ Allocation of servers to system administrators

X Administrators can manage up to 80% more servers

◮ Troubleshooting

X Identify servers with similar software stack that respond

differently to an update to find and fix possible issues

9 / 21

Page 20: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Harnessing Image Similarity

System Management Scenarios

◮ Allocation of servers to system administrators

X Administrators can manage up to 80% more servers

◮ Troubleshooting

X Identify servers with similar software stack that respond

differently to an update to find and fix possible issues

◮ Placement of virtual machines to hosts

X In-memory and storage deduplication

9 / 21

Page 21: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Harnessing Image Similarity

System Management Scenarios

◮ Allocation of servers to system administrators

X Administrators can manage up to 80% more servers

◮ Troubleshooting

X Identify servers with similar software stack that respond

differently to an update to find and fix possible issues

◮ Placement of virtual machines to hosts

X In-memory and storage deduplication

◮ Migration of enterprise applications across data centers

X Migration performed in batches or wavesX Minimize network transfer and re-configuration costs

9 / 21

Page 22: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

Clustering is NP-hard. Various heuristic exist.

10 / 21

Page 23: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

Clustering is NP-hard. Various heuristic exist.

k-means

◮ Popular technique employed in the real world

10 / 21

Page 24: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

Clustering is NP-hard. Various heuristic exist.

k-means

◮ Popular technique employed in the real world

◮ Each iteration:

X Assignment Step - Distance operation (kN)X Update Step - Merge operation (N − 1)

10 / 21

Page 25: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

Clustering is NP-hard. Various heuristic exist.

k-means

◮ Popular technique employed in the real world

◮ Each iteration:

X Assignment Step - Distance operation (kN)X Update Step - Merge operation (N − 1)

◮ In practice Distance and Merge operations are usually verysmall

10 / 21

Page 26: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

Clustering is NP-hard. Various heuristic exist.

k-means

◮ Popular technique employed in the real world

◮ Each iteration:

X Assignment Step - Distance operation (kN)X Update Step - Merge operation (N − 1)

◮ In practice Distance and Merge operations are usually verysmall

X Problems with 100 dimensions require only about 100addition and division operations

10 / 21

Page 27: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

◮ Distance can be calculated as 1 − SIM(Ii , Ij)

SIM(Ii , Ij) =wt(Ii ∩ Ij)

wt(Ii ∪ Ij)

11 / 21

Page 28: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

◮ Distance can be calculated as 1 − SIM(Ii , Ij)

SIM(Ii , Ij) =wt(Ii ∩ Ij)

wt(Ii ∪ Ij)

◮ Merge can be calculated as (Ii ∪ Ij)

11 / 21

Page 29: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

◮ Distance can be calculated as 1 − SIM(Ii , Ij)

SIM(Ii , Ij) =wt(Ii ∩ Ij)

wt(Ii ∪ Ij)

◮ Merge can be calculated as (Ii ∪ Ij)

Image Size Similarity (Content) Merge

8.8 GB 45.5 sec 14.7 sec12.3 GB 75.2 sec 24.1 sec13.6 GB 98.5 sec 31.2 sec16.3 GB 142.3 sec 44.2 sec19.7 GB 172.2 sec 53.5 sec22.1 GB 232.7 sec 64.9 sec

11 / 21

Page 30: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Virtual Machine Clustering

◮ Distance can be calculated as 1 − SIM(Ii , Ij)

SIM(Ii , Ij) =wt(Ii ∩ Ij)

wt(Ii ∪ Ij)

◮ Merge can be calculated as (Ii ∪ Ij)

Image Size Similarity (Content) Merge

8.8 GB 45.5 sec 14.7 sec12.3 GB 75.2 sec 24.1 sec13.6 GB 98.5 sec 31.2 sec16.3 GB 142.3 sec 44.2 sec19.7 GB 172.2 sec 53.5 sec22.1 GB 232.7 sec 64.9 sec

◮ A data center with 1000 images would have to perform 10003 similarity

operations, about 2000 years

◮ By using in-memory data structures, about 40 years

11 / 21

Page 31: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Approximate Clustering

k-medoids

◮ k-medoids is a variant of k-means

X Restricts the cluster center to be one of the existing points

(images)X Pair-wise similarity can be computed in advance

X Similarity computation required for all images (N2)

12 / 21

Page 32: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Approximate Clustering

k-medoids

◮ k-medoids is a variant of k-means

X Restricts the cluster center to be one of the existing points

(images)X Pair-wise similarity can be computed in advance

X Similarity computation required for all images (N2)

◮ A data center with 1000 images would have to perform 10002 similarity

operations, about 2 years

◮ By using in-memory data structures, about 15 days

12 / 21

Page 33: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Solution Idea: Asymmetric Clustering

Coriolis’ tree-based clustering

◮ Coriolis’ clustering approach involves constructing a tree

13 / 21

Page 34: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Solution Idea: Asymmetric Clustering

Coriolis’ tree-based clustering

◮ Coriolis’ clustering approach involves constructing a tree

X The tree is constructed by adding images to it one by one

13 / 21

Page 35: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Solution Idea: Asymmetric Clustering

Coriolis’ tree-based clustering

◮ Coriolis’ clustering approach involves constructing a tree

X The tree is constructed by adding images to it one by one

X Each node of the tree is either a cluster of images or a

single image

13 / 21

Page 36: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Solution Idea: Asymmetric Clustering

Coriolis’ tree-based clustering

◮ Coriolis’ clustering approach involves constructing a tree

X The tree is constructed by adding images to it one by one

X Each node of the tree is either a cluster of images or a

single imageX Each level in the tree represents a minimum extent of

similarity within a node

13 / 21

Page 37: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ Tree-based Clustering

A,B,C

D,E

A,B C,D,E

C,EA B D

C E

S > 0.5

S = 1.0

S > 0.9

S >= 0

14 / 21

Page 38: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

15 / 21

Page 39: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

X An asymmetric similarity function S: Coverage offered by alarger node B (typically a cluster) to a new node A

15 / 21

Page 40: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

X An asymmetric similarity function S: Coverage offered by alarger node B (typically a cluster) to a new node A

S =wt(A ∩ B)

min(wt(A),wt(B))

15 / 21

Page 41: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

X An asymmetric similarity function S: Coverage offered by alarger node B (typically a cluster) to a new node A

S =wt(A ∩ B)

min(wt(A),wt(B))

◮ Reuse similarity computations

15 / 21

Page 42: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

X An asymmetric similarity function S: Coverage offered by alarger node B (typically a cluster) to a new node A

S =wt(A ∩ B)

min(wt(A),wt(B))

◮ Reuse similarity computations

X We only compute the similarity of the new image to each

children of the nodes where the image has been merged

15 / 21

Page 43: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Coriolis’ tree-based clustering

Two key ideas

◮ Speed up similarity computation

X An asymmetric similarity function S: Coverage offered by alarger node B (typically a cluster) to a new node A

S =wt(A ∩ B)

min(wt(A),wt(B))

◮ Reuse similarity computations

X We only compute the similarity of the new image to each

children of the nodes where the image has been mergedX Similarity and Merge operations are proportional to the

depth of the tree

15 / 21

Page 44: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Clustering a new Image F

A,B,C

D,E,F

A,B,F C,D,E

C,EA B,F D

C E

S > 0.5

S = 1.0 B F

S > 0.9

S >= 0

16 / 21

Page 45: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Clustering a new Image F

A,B,C

D,E,F

A,B,F C,D,E

C,EA B,F D

C E

S > 0.5

S = 1.0 B F

S > 0.9

S >= 0

Which clusters are formed with similarity greater than 0.5?

16 / 21

Page 46: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Scalability Evaluation

Experimental Setup

◮ We used VM images from 2 production data centers

17 / 21

Page 47: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Scalability Evaluation

Experimental Setup

◮ We used VM images from 2 production data centers

X 9 images from a large-scale enterprise data center at IBM

17 / 21

Page 48: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Scalability Evaluation

Experimental Setup

◮ We used VM images from 2 production data centers

X 9 images from a large-scale enterprise data center at IBM

X 12 images from the Computer Science department’s smallscale data center at FIU

17 / 21

Page 49: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Scalability Evaluation

Experimental Setup

◮ We used VM images from 2 production data centers

X 9 images from a large-scale enterprise data center at IBM

X 12 images from the Computer Science department’s smallscale data center at FIU

X We randomly sampled files contained in 3 of the 21 images

and generated new synthetic images

17 / 21

Page 50: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Scalability of k-medoids and Tree-based Clustering

0

500

1000

1500

2000

2500

3000

3500

4000

4500

0 5 10 15 20 25 30 35

9978484236302418

Clu

ste

ring T

ime (

min

ute

s)

Total Files (in Millions)

Number of Images

k-medoidsTree-based

18 / 21

Page 51: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Related Work

Finding Similar Clusters

◮ VMFlock: Virtual Machine Co-migration for the Cloud[IEEE/ACM

HPDC’11]

X Applies standard de-duplication techniques for images

X Eliminate raw data duplicates across a given set of VM images

X It does not tackle identifying images with high redundancy or

leveraging semantic similarity

19 / 21

Page 52: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Conclusions and Future Work

Conclusions

◮ We described different types of similarity metrics for VMs andtheir use to aid administrators in their management activities

20 / 21

Page 53: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Conclusions and Future Work

Conclusions

◮ We described different types of similarity metrics for VMs andtheir use to aid administrators in their management activities

◮ We argued that state-of-the-art k-medoids clustering algorithmincurs quadratic complexity infeasible for cloud scale data

centers

20 / 21

Page 54: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Conclusions and Future Work

Conclusions

◮ We described different types of similarity metrics for VMs andtheir use to aid administrators in their management activities

◮ We argued that state-of-the-art k-medoids clustering algorithmincurs quadratic complexity infeasible for cloud scale data

centers

◮ We described the Coriolis framework and system specificallydesigned for scalable clustering of VM images while supporting

arbitrary similarity metrics

20 / 21

Page 55: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Conclusions and Future Work

Conclusions

◮ We described different types of similarity metrics for VMs andtheir use to aid administrators in their management activities

◮ We argued that state-of-the-art k-medoids clustering algorithmincurs quadratic complexity infeasible for cloud scale data

centers

◮ We described the Coriolis framework and system specificallydesigned for scalable clustering of VM images while supporting

arbitrary similarity metrics

Future Work

◮ Our future work will explore the utility of Coriolis for data center

administrator allocation, troubleshooting, and large-scale VMmigration

20 / 21

Page 56: Coriolis: Scalable VM Clustering in Clouds · Coriolis’ clustering approach involves constructing a tree X The tree is constructed by adding images to it one by one X Each node

Introduction Virtual Machine Similarity Clustering Techniques Evaluation Related Work Summary Questions

Thank you!

(I’ll be happy to take questions)

21 / 21