Real-time scheduling for virtual machines in SK Telecom. Eunkyu Byun, Cloud Computing Lab., SK Telecom

Realtime scheduling for virtual machines in SKT

DESCRIPTION

Demand for immediate responsiveness from VMs in virtualized environments has been rising. Several services at SKT also require soft real-time support so that virtual machines can replace physical machines and achieve high utilization and adaptability. However, consolidating multiple OSes on one host, together with irregular external events, can cause the hypervisor to compromise a VM's promptness. To address this, we are improving Xen's credit scheduler by introducing RT_PRIORITY, which guarantees that a VM can run at any given point in time as long as it has credits left to burn. This should increase quality of service and make a VM's behavior predictable in a consolidated environment. In addition, we extend the proposal to multi-core environments and to large numbers of physical machines by using live migration.

Page 1: Realtime scheduling for virtual machines in SKT

Real-time scheduling for virtual machines in SK Telecom

Eunkyu Byun

Cloud Computing Lab., SK Telecom

Page 2: Realtime scheduling for virtual machines in SKT

Cloud by Virtualization in SKT

• Provide virtualized ICT infra to customers, like Amazon EC2, from SKT’s cloud resource pool by exploiting server virtualization

• Resources: servers/PCs, network, storage, …

• Functionalities: load balancing, security solutions, backup, …

• Private cloud inside SKT – virtualized servers, virtualized I/O

• Migrate IT services on legacy servers to virtualized servers

• Provide employees with PaaS for software development

• Virtual desktop infrastructure for employees

Page 3: Realtime scheduling for virtual machines in SKT

Cloud as Telecommunication Operator

• SK Telecom is a telecommunication operator as well as a cloud service provider

[Diagram labels: One Common Cloud Computing Infrastructure; Virtual Telco; legacy network/telecom services on dedicated, reliable equipment; general-purpose server farms for cloud hosting based on virtualization techniques]

• Advantages: scale dynamically with demand; high utilization; easy start-up of new services

• Requirements: 1. Guaranteeing time readiness 2. Scalability of services 3. Cost-effective secure storage

Page 4: Realtime scheduling for virtual machines in SKT

Case Study: Virtual Telco

• IMS (IP Multimedia Subsystem) on Cloud

• Delivering IP multimedia services (VoIP, VOD, instant messaging, …), which require session initiation between participants on the Internet, to users connected to wireless telecom networks

• Easily launch Internet services on wireless network/telecom infra

• Migrate the component servers into the cloud, which requires high reliability

[Diagram labels: Application Service; Session Management; User Info. Database; Media Processing; IP network; SIP]

Page 5: Realtime scheduling for virtual machines in SKT

Challenges

1. Guaranteeing time readiness
2. Scalability of services
3. Cost-effective secure storage

• Which virtualization technique, i.e., hypervisor, is most suitable for supporting real-time VMs?

• We chose:
  • Xen hypervisor: best in the responsiveness benchmark, open source
  • Credit scheduler: default in Xen 4.1, known to be stable
  • Second option: Credit 2 scheduler

Page 6: Realtime scheduling for virtual machines in SKT

Limitation of credit scheduler

[Figure: CPU usage (sec) of the media player VM versus its weight (256, 512, 1024, 1536, 2048, 4096), compared with the no_contention case. Contention between 6 CPU-intensive VMs (weight = 256) and 1 media player VM. Y axis: 0 to 60 sec.]

• The real-time VM cannot obtain its proper share of CPU even with a very high weight

• CPU-intensive VMs use up the residual credits of the non-CPU-intensive (i.e., media player) VM

Need improvement!!

Page 7: Realtime scheduling for virtual machines in SKT

Research Goal

• Find improved soft real-time schedulers based on the stable credit scheduler

• Fair CPU sharing: each VM occupies CPU (almost) exactly proportional to its weight, plus work-conserving

• Real-time support: fast responsiveness of real-time VMs
  • Modify the credit scheduler to distinguish real-time VMs from non-real-time VMs
  • Real-time VMs are marked externally and treated specially to provide fast responsiveness

• Co-work with

Page 8: Realtime scheduling for virtual machines in SKT

Preempt based scheduling

[Diagram: run queue of CPU 0 holding VCPU 0 to VCPU 8, marked by priority (Under, Over, Boost), with a new job of RT priority to schedule]

• Priority order: BOOST > RT > UNDER > OVER

• Idea: a real-time VM’s VCPU is inserted into the runQ of a physical CPU right after the BOOST-priority entries

• Non-real-time VMs can run when RT VMs have consumed all their given credits or are blocked
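The insertion rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Xen code: the `Prio` enum, `insert_vcpu`, and `effective_prio` are hypothetical names, and demoting an RT VCPU to OVER once its credits are gone is one reading of "non-real-time VMs can run when RT VMs consume all given credits".

```python
from enum import IntEnum


class Prio(IntEnum):
    # Larger value = scheduled earlier; the order is taken from the slide.
    OVER = 0
    UNDER = 1
    RT = 2
    BOOST = 3


def insert_vcpu(runq, vcpu, prio):
    """Insert (vcpu, prio) behind every entry of equal or higher priority."""
    pos = 0
    while pos < len(runq) and runq[pos][1] >= prio:
        pos += 1
    runq.insert(pos, (vcpu, prio))


def effective_prio(is_rt, credits, boosted=False):
    """Hypothetical classification: RT priority holds only while credits remain."""
    if boosted:
        return Prio.BOOST
    if credits <= 0:
        return Prio.OVER  # assumption: an RT VCPU without credits is treated as OVER
    return Prio.RT if is_rt else Prio.UNDER


runq = []
insert_vcpu(runq, "vcpu2", Prio.BOOST)
insert_vcpu(runq, "vcpu1", Prio.UNDER)
insert_vcpu(runq, "vcpu5", Prio.OVER)
insert_vcpu(runq, "vcpu0", Prio.UNDER)
# The real-time VCPU lands right after the BOOST entry, ahead of UNDER and OVER.
insert_vcpu(runq, "rt0", effective_prio(is_rt=True, credits=100))
order = [name for name, _ in runq]
```

Here `order` places `rt0` directly behind `vcpu2`, the only BOOST entry, which matches the "right after BOOST" idea above.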

Page 9: Realtime scheduling for virtual machines in SKT

BOOST based scheduling (Min Lee, VEE’10)

• In the credit scheduler, a VM gets the highest priority (BOOST) when it receives an event while it is blocked

• However, VMs that are already in the runQ are not boosted

• Modification: always BOOST a real-time VM when it receives an external event, even if it is already in the runQ

[Diagram: run queue of CPU 0 holding VCPU 0, 1, 2, 4, 5 and 7, marked by priority (Under, Over, Boost, RT), with an external event arriving]
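The difference between the stock wake-up rule and the modified one can be sketched as follows. This is a simplified model, not the scheduler's real code: `Vcpu`, `on_event`, and the `rt_boost_enabled` flag are hypothetical, and the sketch ignores credit state entirely.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    name: str
    is_rt: bool = False
    blocked: bool = False
    prio: str = "UNDER"


def on_event(vcpu, rt_boost_enabled):
    """Return True if the incoming event promoted the VCPU to BOOST."""
    if vcpu.blocked:
        # Stock behaviour described on the slide: waking from blocked gives BOOST.
        vcpu.blocked = False
        vcpu.prio = "BOOST"
        return True
    if rt_boost_enabled and vcpu.is_rt:
        # Modified behaviour: a real-time VCPU is boosted even while queued.
        vcpu.prio = "BOOST"
        return True
    return False  # a queued non-RT VCPU keeps its current priority


queued_rt = Vcpu("rt0", is_rt=True)
stock_result = on_event(queued_rt, rt_boost_enabled=False)
modified_result = on_event(queued_rt, rt_boost_enabled=True)
```

With the flag off, the queued real-time VCPU is left alone (`stock_result` is `False`). With the flag on, the same event promotes it.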

Page 10: Realtime scheduling for virtual machines in SKT

Multi BOOST (by Korea Univ. at XenSummit, Aug. 2011)

• Multiple BOOSTs can occur at the same time

• The driver domain and the real-time VM cannot always get the highest priority

• Priority order: DRIVER_BOOST > RT_BOOST > BOOST > RT > UNDER > OVER

[Diagram: run queue of CPU 0 holding VCPU 0 to VCPU 8, marked by priority (Under, Over, Boost, RT), with an external event arriving]

Page 11: Realtime scheduling for virtual machines in SKT

Performance Evaluation

Physical server spec.

  CPU     AMD Phenom™ II X6 1055T (6 cores)
  Memory  16GB
  NET     Gigabit Ethernet
  Xen     4.1.1
  VM      VCPU: 4, MEM: 1GB

[Diagram: Xen hypervisor on six physical CPUs (PCPU1 to PCPU6) hosting VM1 to VM6 (micro bench), each with VCPUs V1 to V4, and the RT VM (media player) with V1]

Micro bench
• Not set as RT priority
• Repeats CPU-intensive computing for a random time, then sleeps for a random time
• Above 98% CPU usage

Page 12: Realtime scheduling for virtual machines in SKT

CPU, Network Usages

[Figure: two bar charts. Values below are the bar labels, read in legend order.]

  Configuration                     CPU usage (sec)
  no_contention                     51.96
  unmodified                        12.3
  weight(2048)                      31.58
  weight(2048)+preempt              39.7
  weight(2048)+preempt+boost        39.22
  weight(2048)+multiboost           38.72
  weight(2048)+preempt+multiboost   51.52

  Configuration                     NET usage (MB)
  no_contention                     465.17
  unmodified                        42.22
  weight(2048)                      193.05
  weight(2048)+preempt              259.04
  weight(2048)+preempt+boost        255.97
  weight(2048)+preempt+multiboost   421.7

Page 13: Realtime scheduling for virtual machines in SKT

Fairness: CPU sharing according to weight

[Figure: CPU usage (sec) per VM for each configuration, y axis 0 to 400. The RT VM runs a CPU-intensive process, fully utilizing the CPU. Every weight(N) configuration also applies preemption+boost+multiboost. Values below are the bar labels.]

  Configuration   RT VM    VM1     VM2     VM3     VM4     VM5     VM6
  unmodified      50.81    50.97   50.64   50.76   50.78   50.8    50.77
  weight(256)     51.33    50.58   50.64   50.85   50.42   50.84   50.64
  weight(512)     88.91    44.82   44.71   44.64   44.63   44.73   44.71
  weight(1024)    136.43   36.65   36.55   36.6    36.56   36.67   36.67
  weight(2048)    184.54   28.47   28.38   28.46   28.51   28.52   28.44

• CPU usages are proportional to weight value

• Fair between RT VM and non-RT VM with equal weight
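As a rough check of the proportionality claim, the expected share of each VM is its weight divided by the sum of all weights. The sketch below applies that to the weight(512) figures reported on this slide. It assumes that the weight in the legend is the RT VM's weight and that the six other VMs stay at 256; `expected_shares` is a hypothetical helper.

```python
def expected_shares(weights, total_cpu_sec):
    """Split total_cpu_sec between VMs in proportion to their weights."""
    total_weight = sum(weights.values())
    return {vm: total_cpu_sec * w / total_weight for vm, w in weights.items()}


# Assumption: RT VM at weight 512, VM1..VM6 at weight 256.
weights = {"RT VM": 512, **{f"VM{i}": 256 for i in range(1, 7)}}

# Measured CPU seconds for the weight(512) configuration, as reported on the slide.
measured = {"RT VM": 88.91, "VM1": 44.82, "VM2": 44.71, "VM3": 44.64,
            "VM4": 44.63, "VM5": 44.73, "VM6": 44.71}

shares = expected_shares(weights, sum(measured.values()))
largest_gap = max(abs(shares[vm] - measured[vm]) for vm in measured)
```

Under these assumptions the RT VM is expected to get 512/2048, a quarter of the total, and the measured values stay within one CPU second of that ideal split.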

Page 14: Realtime scheduling for virtual machines in SKT

Work-conserving

[Figure: stacked CPU usage (sec) of Media, Dom0 and VM1 to VM6 for each configuration, y axis 0 to 400, with the upper bound for the media player marked. Values below are the bar labels. Percentages are shares of the total.]

Weight for VM1~VM6 (non-realtime VMs): 256. Weight for media player VM (realtime VM): 2048.

CPU usage (sec)

  Configuration                        Media    VM1      VM2      VM3      VM4      VM5      VM6      Total
  no_contention                        51.958   -        -        -        -        -        -        65.951
  unmodified                           12.3     57.113   57.306   57.16    56.979   57.237   57.314   358.661
  weight                               31.582   53.712   53.699   53.741   53.588   53.576   53.689   358.098
  weight+preemption                    39.698   52.258   52.113   52.152   52.171   52.205   52.129   358.97
  weight+preemption+boost              39.219   52.377   52.35    52.381   52.338   52.361   52.294   358.791
  weight+preemption+boost+multiboost   51.524   49.599   49.572   49.586   49.57    49.589   49.548   358.645

Share of total

  Configuration                        Media     Each of VM1 to VM6
  no_contention                        78.79%    -
  unmodified                           3.43%     15.89% to 15.98%
  weight                               8.81%     14.94% to 14.99%
  weight+preemption                    11.07%    14.53% to 14.57%
  weight+preemption+boost              10.93%    14.57% to 14.6%
  weight+preemption+boost+multiboost   14.39%    13.84% to 13.86%

Page 15: Realtime scheduling for virtual machines in SKT

Responsiveness: Test environment

[Diagram: Xen hypervisor on six physical CPUs (PCPU1 to PCPU6) hosting VM1 to VM6 (micro bench), each with VCPUs V1 to V4, and the RT VM (ping target) with V1. An external server sends ping requests (every 0.01 sec, 10000 counts).]

Physical server spec.

  CPU   AMD Phenom™ II X6 1055T (6 cores)
  RAM   16GB
  NET   Gigabit Ethernet
  Xen   4.1.1
  VM    VCPU: 4, MEM: 1GB

Page 16: Realtime scheduling for virtual machines in SKT

Responsiveness (Ping RTT, Credit)

• The cumulative distribution of ping RTT as the number of simultaneous CPU-intensive VMs increases

[Figure: cumulative distribution (0 to 1) versus ping RTT (0 to 10 ms) for no_contention and contention_VM1 through contention_VM6]

• All pings take only 0.5 ms without contention

• 20% of pings take longer than 10 ms when 7 VMs run simultaneously
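For readers who want to reproduce this kind of plot from raw ping logs, an empirical cumulative distribution can be computed as sketched below. The RTT samples are made up for illustration and are not the measurements behind the figure; `empirical_cdf` is a hypothetical helper.

```python
from bisect import bisect_right


def empirical_cdf(samples_ms, x_ms):
    """Fraction of samples with RTT less than or equal to x_ms."""
    ordered = sorted(samples_ms)
    return bisect_right(ordered, x_ms) / len(ordered)


# Made-up RTT samples in milliseconds, for illustration only.
rtts_ms = [0.4, 0.5, 0.5, 0.6, 0.7, 1.2, 3.0, 8.0, 12.0, 25.0]

frac_within_10ms = empirical_cdf(rtts_ms, 10.0)
frac_slower_than_10ms = 1.0 - frac_within_10ms
```

Evaluating `empirical_cdf` over a grid of RTT values (0 to 10 ms in 0.5 ms steps, as on the x axis here) gives one curve of the figure. A statement such as "20% of pings take longer than 10 ms" corresponds to the curve reaching only 0.8 at 10 ms.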

Page 17: Realtime scheduling for virtual machines in SKT

Responsiveness (Ping RTT, modified)

[Figure: cumulative distribution (0 to 1) versus ping RTT (0 to 10 ms) for no_contention, contention_VM6, contention_VM6+preempt, contention_VM6+rtboost, contention_VM6+multiboost, and contention_VM6+preempt+multiboost]

• Weight of realtime VM = 256; weight of non-realtime VMs = 256

• After applying our modification, all pings take only 0.5 ms even with contention

Page 18: Realtime scheduling for virtual machines in SKT

What about Credit2 scheduler?

Credit2’s approach (from white paper):
• VMs burn credits based on their weight
• Higher weight means credits burn more slowly
• VCPUs are inserted into the runQ in credit order
• The VM with more credits runs first
• Credits are “reset” when the credit of the next VCPU in the runqueue is less than or equal to zero

[Figure: CPU usage (sec) of the media player VM versus weight (256, 512, 1024, 1536, 2048), compared with no_contention; y axis 0 to 60 sec]

Achieves both fairness and work-conserving
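The bullet points above can be turned into a toy simulation to see why this scheme shares CPU by weight. It is a deliberately simplified model and not Xen's Credit2 implementation: `CREDIT_INIT`, `REF_WEIGHT`, the fixed slice cost, and the assumption that every VCPU is always runnable are all made up for illustration.

```python
CREDIT_INIT = 1000.0  # hypothetical starting credit
REF_WEIGHT = 256      # hypothetical weight that burns credit at the base rate


def simulate(weights, slices, slice_cost=100.0):
    """Run always-runnable VCPUs for a number of time slices; return slices per VCPU."""
    credits = {vm: CREDIT_INIT for vm in weights}
    runtime = {vm: 0 for vm in weights}
    for _ in range(slices):
        # Credit order: the VCPU with the most credit is at the head of the queue.
        vm = max(credits, key=credits.get)
        if credits[vm] <= 0:
            # Reset rule: the next VCPU to run has no credit left.
            credits = {v: CREDIT_INIT for v in weights}
            vm = max(credits, key=credits.get)
        # Higher weight burns credit more slowly.
        credits[vm] -= slice_cost * REF_WEIGHT / weights[vm]
        runtime[vm] += 1
    return runtime


runtime = simulate({"rt": 512, "a": 256, "b": 256}, slices=400)
```

In this model a VCPU with twice the weight burns credit at half the rate, so it stays at the head of the queue twice as often and receives twice as many slices.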

Page 19: Realtime scheduling for virtual machines in SKT

Responsiveness of Credit 2 scheduler

[Figure: cumulative distribution (0 to 1) versus ping RTT (0 to 10 ms) for weight(256), weight(512), weight(768), weight(1024), weight(1536), and weight(2048)]

• For fast responsiveness, a VM needs a higher weight.

• But what if we want to divide CPU cycles equally between VMs?

• A special policy for realtime VMs is necessary.

Page 20: Realtime scheduling for virtual machines in SKT

Ongoing Research

• What if several realtime VMs compete for a limited physical server/core?

• Prediction-based scheduling between real-time VMs

• Load balancing between physical cores

• Efficient placement policy for RT VMs across physical servers

• Load balancing between physical servers using live migration of VMs

Page 21: Realtime scheduling for virtual machines in SKT

Summary

• SK Telecom is trying to operate telco services on cloud resources

• Realtime support in the hypervisor is essential

• Analyzed the performance of modifications to the credit scheduler of the Xen hypervisor
  • For one realtime VM per physical core: fair sharing and fast responsiveness

• Plan to improve for more complex and practical cases

Page 22: Realtime scheduling for virtual machines in SKT

Thank you

Contact: [email protected]

Page 23: Realtime scheduling for virtual machines in SKT

Comparison of hypervisors

• Evaluation environment
  • Physical server: Dell R410 (Xeon 8 cores, 16GB memory)
  • Virtual machine: 1 core, 1 GB memory, 20GB HDD

• Increase the number of VMs running benchmarks
  • Benchmarks: PCMARK 2005, kernel compile, SPEC-CPU 2006

• A real-time application measures the delay of timer interrupt handling in the OS of the VM
  • Measured every 5 sec. for ten minutes

Page 24: Realtime scheduling for virtual machines in SKT

KVM 0.12.5

Page 25: Realtime scheduling for virtual machines in SKT

VMware ESX 4.1

Page 26: Realtime scheduling for virtual machines in SKT

Xen 4.0

• Xen performs best, but is still not sufficient

• Contention from non-real-time VMs affects the responsiveness of the real-time VM

Page 27: Realtime scheduling for virtual machines in SKT

Approach

• VM scheduler in the hypervisor is important

• Credit scheduler
  • Stable (default scheduler in Xen hypervisor 4.0), SMP support
  • Needs improvement for latency-sensitive VMs

• Credit 2 scheduler
  • Proportional sharing according to the weight of each VM
  • Provides responsiveness to VMs with larger weights
  • Not so stable yet; needs more analysis