
Research on Embedded Hypervisor Scheduler Techniques
Midterm Report

2014/06/25


Background

Asymmetric multi-core is becoming increasingly popular over homogeneous multi-core systems.
◦ An asymmetric multi-core platform consists of cores with different capabilities, for example, the ARM big.LITTLE architecture.

ARM big.LITTLE Core

Developed by ARM in Oct. 2011.
Combines two kinds of architecturally compatible cores.
The aim is a multi-core processor that can adjust better to dynamic computing needs and use less power than clock scaling alone.
big cores are more powerful but power-hungry, while LITTLE cores are low-power but (relatively) slower.

Three Types of Models

Cluster migration
CPU migration (In-Kernel Switcher)
Heterogeneous multi-processing (global task scheduling)

Motivation

Traditional schedulers for homogeneous multi-core platforms focus on load balancing.
◦ Each core has the same computing ability, so workloads are distributed evenly in order to obtain maximum performance.

Motivation (Cont.)

New scheduling strategies are needed for asymmetric multi-core platforms.
◦ Cores have different power and computing characteristics.

Current Hypervisor Architecture and Problem

[Figure: GUEST1 and GUEST2, each with an OS kernel, a scheduler, and two vCPUs, run on a hypervisor whose vCPU scheduler maps vCPUs onto ARM Cortex-A15 (performance) and ARM Cortex-A7 (power-saving) cores. Tasks 1 to 4 are marked as having either a low or a high computing resource requirement.]

If the guest OS scheduler is not big.LITTLE-aware, it will assign tasks to vCPUs evenly in order to achieve load balancing.
The hypervisor vCPU scheduler will assign vCPUs evenly to physical ARM cores, since it is not big.LITTLE-aware.
Result: the system cannot take advantage of the big.LITTLE core architecture.

Current Hypervisor Architecture and Problem (Cont.)

[Figure: GUEST1 and GUEST2, each with an OS kernel, a scheduler, and two vCPUs, run on a hypervisor with a vCPU scheduler above ARM Cortex-A15 and ARM Cortex-A7 cores.]

Assume that the scheduler in the guest OS is big.LITTLE-aware, so each vCPU is either big or LITTLE.
The hypervisor vCPU scheduler will still assign vCPUs evenly to physical ARM cores in order to achieve load balancing.
Result: wasted energy and performance degradation when vCPUs land on the wrong type of core.

Project Goal

Research the current scheduling algorithms.
Design and tune the hypervisor scheduler for asymmetric multi-core platforms.
Assign virtual cores to physical cores for execution.
Minimize power consumption while guaranteeing performance.

Challenge

The hypervisor scheduler cannot take advantage of the big.LITTLE architecture if the scheduler inside the guest OS is not big.LITTLE-aware.

Current Hypervisor Architecture and Problem (Cont.)

[Figure: GUEST1 and GUEST2, each with an Android framework, an OS kernel, a scheduler, and two vCPUs, run on a hypervisor with a big.LITTLE-aware (b-L) vCPU scheduler above ARM Cortex-A15 and ARM Cortex-A7 cores. Tasks 1 to 4 are spread evenly across the vCPUs.]

If the guest OS scheduler is not big.LITTLE-aware, it will assign tasks to vCPUs evenly in order to achieve load balancing.
Even if the hypervisor vCPU scheduler is big.LITTLE-aware, it will schedule these vCPUs either both on big cores or both on LITTLE cores, since they carry the same load.

Possible Solution

Apply VM introspection (VMI) to retrieve the process list in a VM.
◦ VMI is a technique that allows the hypervisor to inspect the contents of the VM in real time.
Modify the CPU masks of tasks in the VM in order to create an illusion of “big vCPU” and “LITTLE vCPU”.
The hypervisor scheduler can then assign each vCPU to the corresponding big or LITTLE core.
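
The mask idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name map_tasks, the 50% load threshold, and the per-task load numbers are hypothetical and do not come from the slides; the mask layout [big vCPU | LITTLE vCPU] follows the [1|0] and [0|1] notation used in the figures. The handling of a guest whose tasks are all of one kind (mask [1|1], both vCPUs treated alike) is likewise an assumption.

```python
def map_tasks(task_loads, threshold=50):
    """Derive CPU masks for one two-vCPU guest (hypothetical sketch).

    task_loads: {task name: load in percent}, as a VMI pass might report it.
    Returns (masks, vcpu_types), where each mask is [vCPU0, vCPU1].
    """
    high = {task for task, load in task_loads.items() if load >= threshold}
    mixed = 0 < len(high) < len(task_loads)

    masks = {}
    for task in task_loads:
        if not mixed:
            masks[task] = [1, 1]   # all tasks alike: either vCPU may run them
        elif task in high:
            masks[task] = [1, 0]   # pin demanding tasks to the "big" vCPU
        else:
            masks[task] = [0, 1]   # pin light tasks to the "LITTLE" vCPU

    if mixed:
        vcpu_types = ["big", "LITTLE"]
    elif high:
        vcpu_types = ["big", "big"]
    else:
        vcpu_types = ["LITTLE", "LITTLE"]
    return masks, vcpu_types


# Hypothetical loads for a guest with two heavy and two light tasks.
masks, vcpu_types = map_tasks({"Task 1": 90, "Task 2": 10, "Task 3": 80, "Task 4": 5})
print(masks, vcpu_types)
```

The hypervisor scheduler would then only need the vcpu_types list to decide which vCPU goes to an A15 and which to an A7.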


Hypervisor Architecture with VMI

[Figure: GUEST1 and GUEST2, each with an Android framework, an OS kernel (labeled Linaro Linux Kernel for one guest), a scheduler, and two vCPUs, run on a hypervisor that contains a VM Introspector, a Task-to-vCPU Mapper, and a b-L vCPU scheduler, above ARM Cortex-A15 and ARM Cortex-A7 cores. Tasks 1 to 4 are marked as having a low or a high computing resource requirement and carry CPU masks [1|0] or [0|1].]

The VM Introspector gathers task information from the guest OS.
The Task-to-vCPU Mapper modifies the CPU mask of each task according to the task information from VMI.
A vCPU is treated as a LITTLE core when tasks with low computing requirements are scheduled on it.
The hypervisor vCPU scheduler will schedule big vCPUs to the A15 and LITTLE vCPUs to the A7.

Hypervisor Architecture with VMI (Cont.)

[Figure: GUEST1 and GUEST2, each with an Android framework, an OS kernel, a scheduler, and two vCPUs, run on a hypervisor that contains a VM Introspector, a Task-to-vCPU Mapper, and a b-L vCPU scheduler, above ARM Cortex-A15 and ARM Cortex-A7 cores. CPU masks [1|0], [0|1], and [1|1] are attached to the tasks.]

Guest OS 1 has two tasks with high computing requirements and two tasks with low computing requirements.
Guest OS 2 has two tasks with low computing requirements.
The VM Introspector gathers task information from the guest OS.
The CPU mask of each task is modified according to the task information from VMI.
A vCPU is treated as a LITTLE core when tasks with low computing requirements are scheduled on it.
The hypervisor vCPU scheduler will schedule big vCPUs to the A15 and LITTLE vCPUs to the A7.

Hypervisor Scheduler

Schedules the virtual cores to physical cores for execution.
◦ Decides the execution order and the amount of time assigned to each virtual core according to some scheduling policy.
◦ Xen - credit-based scheduler
◦ KVM - completely fair scheduler

Credit-Based Scheduler

Each domain (OS) is assigned a weight and a cap.
◦ The weight decides the amount of time a domain will get in a time interval.
◦ The cap optionally fixes the maximum amount of CPU a domain will be able to consume.

Credit-Based Scheduler (Cont.)

Each CPU manages a local run queue of runnable virtual cores.
The queue is sorted by virtual core priority.
◦ A virtual core's priority is either over or under, depending on whether it has exceeded its fair share of CPU resources in a time interval.
When a virtual core is inserted, it is put after all other virtual cores of equal priority.

Credit-Based Scheduler (Cont.)

As a virtual core runs, it consumes credits.
The next virtual core to run is picked from the head of the run queue.
A CPU will look on other CPUs for runnable virtual cores before going idle.
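
The run-queue behaviour described on the credit-based scheduler slides can be sketched as follows. This is a simplified illustration, not Xen's implementation: the class and function names (VCpu, RunQueue, pick_next, run_slice) are hypothetical, and credit accounting is reduced to a single counter where positive credits mean "under" and anything else means "over".

```python
UNDER, OVER = 0, 1  # lower value = higher priority


class VCpu:
    def __init__(self, name, credits):
        self.name = name
        self.credits = credits

    @property
    def priority(self):
        # "under": still within its fair share; "over": share exceeded.
        return UNDER if self.credits > 0 else OVER


class RunQueue:
    """Per-CPU queue kept sorted by priority."""

    def __init__(self):
        self.items = []

    def insert(self, vcpu):
        # Put the vCPU after all queued vCPUs of equal (or higher) priority.
        pos = len(self.items)
        for i, other in enumerate(self.items):
            if other.priority > vcpu.priority:
                pos = i
                break
        self.items.insert(pos, vcpu)

    def pop_head(self):
        return self.items.pop(0) if self.items else None


def pick_next(cpu, queues):
    """Take the head of the local queue; otherwise look on other CPUs."""
    vcpu = queues[cpu].pop_head()
    if vcpu is None:
        for other, queue in enumerate(queues):
            if other != cpu:
                vcpu = queue.pop_head()
                if vcpu is not None:
                    break
    return vcpu  # None means the CPU goes idle


def run_slice(vcpu, cost):
    """Running consumes credits."""
    vcpu.credits -= cost
```

A vCPU that has run long enough for its credits to drop to zero or below is re-inserted behind every "under" vCPU, which is how the fair share of time slices is enforced.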


Credit-Based Scheduler on Asymmetric Multi-core

The credit-based scheduler considers only a “fair share” of time slices.
Assigning the same number of time slices on big and LITTLE cores results in different performance and power consumption.

Virtual Core Scheduling Problem

For every time period, the hypervisor scheduler is given a set of virtual cores.

Given the operating frequency of each virtual core, the scheduler will generate a scheduling plan, such that the power consumption is minimized, and the performance is guaranteed.


Core Models

There are two types of cores: virtual cores and physical cores.

$vC_j = (v_j, t_j)$
$pC_i = (f_i, t_i)$

◦ $v_j$: frequency of the virtual core
◦ $f_i$: frequency of the physical core
◦ $t_i$, $t_j$: type of the core

Power Model

To decide the power model, we ran preliminary experiments to measure the power consumption of cores.
◦ On an ODROID-XU board

Result – bzip2

[Figure: power consumption (Watt, axis 0 to 2.5) versus loading (%, axis 0 to 120) for bzip2 at 250MHz, 600MHz, 8000MHz [sic], and 1600MHz, each series with a linear fit.]

Power Model (Cont.)

The power consumption of a physical core is a function of core type, core frequency, and the load of the core.
◦ The load of a core is the percentage of time the core is executing virtual cores.

$Power_i = p(f_i, t_i, load_i)$

Performance

The ratio of the computing resource assigned to the computing resource requested.
◦ Ex: a virtual core running at 800MHz runs on a physical core of 1200MHz for 60% of a time interval.
◦ The performance of this virtual core is 0.6*1200/800 = 0.9.
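
The ratio above can be written as a small Python helper. This is a sketch only; the function name performance and the representation of the assignment as (fraction of interval, physical MHz) pairs are assumptions made for illustration, chosen so the same helper also covers a virtual core that runs on several physical cores within one interval.

```python
def performance(shares, requested_mhz):
    """Ratio of assigned to requested computing resource.

    shares: list of (fraction_of_interval, physical_core_mhz) pairs,
            one pair per physical core the virtual core ran on.
    requested_mhz: frequency requested by the virtual core.
    """
    assigned = sum(fraction * mhz for fraction, mhz in shares)
    return assigned / requested_mhz


# The slide's example: 800MHz requested, 60% of an interval on a 1200MHz core.
print(performance([(0.6, 1200)], 800))
```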


Objective Function

Generate a scheduling plan such that the power consumption is minimized and the performance is guaranteed.
◦ Assume there are n physical cores.

$\min\left(\sum_{i=1}^{n} Power_i\right)$

Scheduling Plan

A set of $a_{i,j}$ values, each indicating the amount of time spent executing virtual core j on physical core i in a time interval.
A feasible scheduling plan must satisfy some constraints.

Constraints

Each virtual core should be assigned sufficient computing resources in order to meet the performance guarantee.

$\text{computing resource} = \sum_{i=1}^{n} a_{i,j} f_i$
$\frac{\sum_{i=1}^{n} a_{i,j} f_i}{v_j} \ge \text{performance guarantee}$
$\sum_{i=1}^{n} a_{i,j} \le 1$

Constraints (Cont.)

A physical core has a fixed amount of computing resources in a time interval.
◦ According to its frequency

$0 \le a_{i,j} \le 1$
$\sum_{j=1}^{m} a_{i,j} = load_i \le 1$

Current Solution

Given the objective function and the constraints, we can use integer programming to find a feasible scheduling plan.
◦ Divide each time interval into 100 time slices.
◦ Each $a_{i,j}$ in the scheduling plan can then be expressed as the number of time slices that virtual core j runs on physical core i.
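
The formulation can be illustrated on a tiny instance. The sketch below is not the solver used in this work: an exhaustive search over integer slice counts stands in for an integer-programming solver, the interval is divided into 10 slices instead of 100 to keep the search small, and the power model (power proportional to load, with 2000 mW and 400 mW at full load for the big and little core) uses hypothetical placeholder numbers. The function name best_plan is also an assumption.

```python
from itertools import product

SLICES = 10  # coarse grid for this sketch; the slides divide an interval into 100


def best_plan(phys, virt, guarantee=1.0):
    """Search for the cheapest feasible plan.

    phys: list of (frequency_mhz, full_load_milliwatts), one per physical core.
    virt: list of requested frequencies (MHz), one per virtual core.
    Returns (plan, milliwatts); plan[i][j] is the number of slices that
    virtual core j runs on physical core i. (None, None) if infeasible.
    """
    n, m = len(phys), len(virt)
    best, best_cost = None, None
    for flat in product(range(SLICES + 1), repeat=n * m):
        plan = [flat[i * m:(i + 1) * m] for i in range(n)]
        # A physical core offers at most one full interval of slices.
        if any(sum(row) > SLICES for row in plan):
            continue
        feasible = True
        for j in range(m):
            column = [plan[i][j] for i in range(n)]
            # A virtual core cannot be given more than the whole interval.
            if sum(column) > SLICES:
                feasible = False
                break
            # Performance guarantee: sum_i a_ij * f_i / v_j >= guarantee.
            resource = sum(a * phys[i][0] for i, a in enumerate(column))
            if resource < guarantee * virt[j] * SLICES:
                feasible = False
                break
        if not feasible:
            continue
        # Hypothetical linear power model: full-load power scaled by load.
        cost = sum(phys[i][1] * sum(plan[i]) for i in range(n))
        if best_cost is None or cost < best_cost:
            best, best_cost = plan, cost
    if best is None:
        return None, None
    return best, best_cost / SLICES


# One big (1600MHz) and one little (600MHz) core, two 250MHz virtual cores.
plan, milliwatts = best_plan([(1600, 2000), (600, 400)], [250, 250])
print(plan, milliwatts)
```

With these placeholder numbers the search places both light virtual cores on the little core and leaves the big core unused, which is the behaviour the asymmetry-aware scheduler aims for.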


Assign Virtual Cores to Physical Cores

The scheduling plan from integer programming alone is not enough.
We still need a way to assign the virtual cores to physical cores according to the scheduling plan.
◦ A virtual core cannot appear on two or more physical cores at the same time.

Example – 3 vCPUs, 2 Physical Cores

[Figure: timeline from t=0 to t=100 for the two physical cores]

vCPU0 (60, 20)
vCPU1 (0, 50)
vCPU2 (20, 30)
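
The "no virtual core on two physical cores at the same time" rule can be checked mechanically. The sketch below assumes that each tuple in the example lists the time slices a vCPU receives on physical core 0 and physical core 1; the timeline passed to the checker was built by hand for illustration and is not taken from the slide, and the name check_timeline is hypothetical.

```python
def check_timeline(plan, segments, interval=100):
    """Verify a timeline against a scheduling plan.

    plan: {vcpu: tuple of slices per physical core}.
    segments: list of (core, vcpu, start, end) with end exclusive.
    Returns True only if no core and no vCPU is double-booked in any
    slice and every vCPU receives exactly the planned time on each core.
    """
    used = {}
    busy_core = set()   # (core, t) pairs already occupied
    busy_vcpu = set()   # (vcpu, t) pairs already occupied
    for core, vcpu, start, end in segments:
        if not 0 <= start <= end <= interval:
            return False
        for t in range(start, end):
            if (core, t) in busy_core or (vcpu, t) in busy_vcpu:
                return False
            busy_core.add((core, t))
            busy_vcpu.add((vcpu, t))
        used[(vcpu, core)] = used.get((vcpu, core), 0) + (end - start)
    return all(
        used.get((vcpu, core), 0) == need
        for vcpu, row in plan.items()
        for core, need in enumerate(row)
    )


plan = {"vCPU0": (60, 20), "vCPU1": (0, 50), "vCPU2": (20, 30)}

# One hand-built conflict-free timeline for the plan above.
timeline = [
    (0, "vCPU0", 0, 60), (0, "vCPU2", 60, 80),
    (1, "vCPU1", 0, 50), (1, "vCPU2", 50, 60),
    (1, "vCPU0", 60, 80), (1, "vCPU2", 80, 100),
]
print(check_timeline(plan, timeline))
```

Packing each core's entries back to back in vCPU order would start vCPU0 on both cores at t=0, which the checker rejects; this is why the plan alone is not enough.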

Assign Virtual Cores to Physical Cores (Cont.)

Given a feasible scheduling plan, we can schedule the virtual cores to physical cores without violating the constraints.
◦ Consider n physical cores with m virtual cores, n < m.

Example

[Figure: timeline from t=0 to t=100 for four physical cores]

vCPU0 (50, 40, 0, 0)
vCPU1 (20, 20, 20, 20)
vCPU2 (10, 10, 20, 20)
vCPU3 (10, 10, 20, 20)
vCPU4 (10, 10, 10, 10)
vCPU5 (0, 0, 10, 10)

Flow of Each Interval

Tasks running in the guest OSes produce loading and/or QoS information.
The guest OSes schedule tasks and adjust the virtual core frequencies.
Given the virtual core frequencies, the hypervisor scheduler generates a scheduling plan.
Virtual cores are executed on physical cores according to the scheduling plan.
◦ This affects task performance.
◦ This triggers the DVFS mechanism on physical cores.

Simulation

Conduct simulations to compare the power consumption of our asymmetry-aware scheduler with that of a credit-based scheduler.

Simulation Environment

Two types of physical cores
◦ power-hungry “big” cores, frequency: 1600MHz
◦ power-efficient “little” cores, frequency: 600MHz
◦ The DVFS mechanism is disabled.

Scenario I – 2 Big and 2 Little

Each VM has two virtual cores.
Two sets of input:
◦ Case 1: Both VMs with light workloads. 250MHz for each virtual core.
◦ Case 2: One VM with heavy workloads, the other with modest workloads. Heavy: 1200MHz for each virtual core. Modest: 600MHz for each virtual core.

Scenario I - Results

◦ Case 1: the power consumption of the asymmetry-aware method is about 43.2% of that of the credit-based method.
◦ Case 2: the asymmetry-aware method uses 95.6% of the energy used by the credit-based method.

        Method           Power (Watt)
Case 1  Asymmetry-aware  0.295
        Credit-based     0.683
Case 2  Asymmetry-aware  2.382
        Credit-based     2.491
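
The percentages quoted for Scenario I follow directly from the measured power values. A short Python check, with the hypothetical helper name relative_power, makes the arithmetic explicit:

```python
def relative_power(asymmetry_aware_watts, credit_based_watts):
    """Asymmetry-aware power as a percentage of credit-based power."""
    return 100.0 * asymmetry_aware_watts / credit_based_watts


# Power values (Watt) from the Scenario I results.
scenario_1 = {"Case 1": (0.295, 0.683), "Case 2": (2.382, 2.491)}

for case, (aware, credit) in scenario_1.items():
    print(f"{case}: {relative_power(aware, credit):.1f}% of credit-based power")
```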

Scenario 2 – 4 Big and 4 Little

The hardware specification of an ARM 64-bit board.
Each case has three quad-core VMs:

        VM1                     VM2                      VM3
Case 1  All 250 MHz             All 250 MHz              All 250 MHz
Case 2  All 600 MHz             All 600 MHz              All 250 MHz
Case 3  All 1600 MHz            All 1600 MHz             All 1600 MHz
Case 4  800, 800, 400, 400 MHz  1000, 800, 600, 400 MHz  600, 600, 250, 250 MHz

Scenario 2 - Results

         Method           Power (Watt)  Savings
Case 1   Asymmetry-aware  1.205         41.2%
         Credit-based     2.049
Case 2   Asymmetry-aware  3.524         11.1%
         Credit-based     3.960
Case 3*  Asymmetry-aware  6.009         0%
         Credit-based     6.009
Case 4   Asymmetry-aware  4.435         6%
         Credit-based     4.711

* In case 3, the load of the physical cores is 100% under both methods. Power cannot be saved if the computing resources are not sufficient.

Summary

We develop an energy-efficient asymmetry-aware scheduler for asymmetric multi-core platforms.
The goal is to generate an energy-efficient scheduling plan with a performance guarantee.
Our simulation results show that the asymmetry-aware strategy saves up to 57.2% energy compared with the credit-based method, while still providing a performance guarantee.

Q&A
