Advanced Topics on Heterogeneous System Architectures

Politecnico di Milano

Seminar Room A. Alario

27 November, 2015

Antonio R. Miele

Marco D. Santambrogio

Politecnico di Milano

Runtime Resource Management

Modern architectures are heterogeneous

• A modern computer architecture includes

– One or more CPUs

– One or more GPUs

– Optionally other HW accelerators

Examples: NVIDIA Tegra X1, Intel Skylake, Samsung Exynos 5422

Modern architectures are heterogeneous

• Each unit presents a different profile in terms of

– Computational efficiency

– Power consumption

• Example: ARM big.LITTLE

– big: power-hungry, high-performance cores

– LITTLE: power-saving, low-performance cores

Modern architectures are heterogeneous

• Each unit presents a different profile in terms of

– Computational efficiency

– Power consumption

• Example: AMD Accelerated Processing Unit

Computational efficiency also depends on the “internal structure” of the application

Workloads are dynamic

• System workloads are highly evolving, variable, and heterogeneous in many application scenarios (from mobile phones to HPC servers)

– Different types of applications

– Different amounts of data and input parameters for each run of the application

• Applications with different use modes

– Different execution times

• Depending on the application and the amount of processed data

– Different possible Quality-of-Service (QoS) requirements

• In terms of throughput, turnaround time, deadlines, ...

– Unknown arrival times

• Depending on the user requests

Workloads are dynamic

• On a mobile phone:

– Phone calls

– Short message service

– Web browsing

– Audio/video playing

– Gaming

Some of these generally have short execution times, small amounts of processed data, and either no QoS requirements or undemanding ones

Others generally have a considerable amount of data to be processed, with specific throughput levels to be met and highly demanding computations

Workloads are dynamic

• On an HPC server:

– Financial modeling and analysis

– Fluid dynamic simulations

– Weather and climatic modeling

– ...

Energy/power consumption issues

• Energy and power consumption: are they a constraint for…

– Embedded & IoT

– Mobile devices

– Desktops

– Servers

– HPC clusters

• Battery capacities

• Energy costs

• + Thermal issues!

– Main cause is the power consumption

– Modern chips are characterized by a Thermal Design Power (TDP)

– Consequences of over-heating: higher cooling costs, accelerated aging, chip burn

[Figure: AMD Athlon II X2 240, dual core, running SPEC CPU 2006]

• Energy and power budgets + other system issues ...

• Any optimization needs a deep understanding of the phenomenon

Overall picture

Computing System

– Various types of applications

– Different arrival times

– Application-specific QoS requirements

– Power consumption requirements

– Energy consumption requirements

– Thermal constraints

• How can we accurately handle all these requirements?

A problem as a new opportunity

• Programming has become very difficult

– Impossible to balance all constraints manually

• More computational horsepower than ever before

– Cores are free

• Energy is a new constraint

– Software must become energy and space aware

We cannot handle all these aspects manually and at design time, because the runtime working scenario is unknown and unpredictable

• Modern computing systems need context-awareness

– Be aware of the surrounding environment conditions

– Know their internal state

• To optimize and meet their requirements, taking as much advantage as possible of the context to pursue concurrent goals

Need for runtime resource management

• In order to deal with a highly dynamic and evolving working scenario, we need a runtime strategy

New software component on top of (or within) the operating system, observing the system behavior and acting on it

Self-adaptive resource management

• Self-adaptive systems: systems that can observe their runtime behavior, learn, and take actions to meet desired goals

• Characteristics:

– Goal-oriented

• Tell the system what you want

• It is the system’s job to figure out how to get there

– Approximate

• Does not expend any more effort than necessary to meet goals

Observe-decide-act loop

• Self-adaptive systems generally implement the so-called Observe-Decide-Act loop

• Observe

– The controller collects raw observations from the available sensors/monitors and computes high-level metrics

– Examples of raw observations are the heartbeat, the current power consumption, and the current temperature

– Examples of metrics are the throughput, the average power consumption, and the average temperature

• Decide

– The controller takes decisions on resource management on the basis of an integrated policy

– Inputs of the decision policy are the computed metrics and the specified goals/constraints

– Examples of goals/constraints are an application QoS or a power/energy budget

– Many strategies can be used to implement the policy: heuristics, control theory, machine learning, ...

• Act

– Finally, decisions are actuated by controlling the available architecture/application knobs

– Examples of architecture knobs are DVFS or core power gating

– Examples of application knobs are the selection of the number of threads or the application mapping

The runtime resource management layer is placed on top of (or within) the operating system
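The three stages of the loop can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the type and function names (`Metrics`, `Goals`, `Knobs`, `observe`, `decide`, `act`) are hypothetical and do not come from the slides, and the policy is a plain heuristic that gives the power budget priority over the throughput goal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; none of these names come from the slides.
struct Metrics { double throughput; double avg_power_w; };      // high-level metrics
struct Goals   { double min_throughput; double power_budget_w; }; // goals/constraints
struct Knobs   { int threads; int freq_level; };                // application + architecture knobs

// Observe: turn raw samples (heart rates, power readings) into averaged metrics.
Metrics observe(const std::vector<double>& hb_rates, const std::vector<double>& power_w)
{
    assert(!hb_rates.empty() && hb_rates.size() == power_w.size());
    Metrics m{0.0, 0.0};
    for (std::size_t i = 0; i < hb_rates.size(); i++) {
        m.throughput += hb_rates[i];
        m.avg_power_w += power_w[i];
    }
    m.throughput /= static_cast<double>(hb_rates.size());
    m.avg_power_w /= static_cast<double>(power_w.size());
    return m;
}

// Decide: assumed heuristic policy. Over budget: lower the frequency level.
// Otherwise, if too slow: add a thread, or raise the frequency if threads are maxed out.
Knobs decide(const Metrics& m, const Goals& g, Knobs cur, int max_threads, int max_freq_level)
{
    Knobs next = cur;
    if (m.avg_power_w > g.power_budget_w) {
        if (next.freq_level > 0) next.freq_level--;
    } else if (m.throughput < g.min_throughput) {
        if (next.threads < max_threads) next.threads++;
        else if (next.freq_level < max_freq_level) next.freq_level++;
    }
    return next;
}

// Act: record the new setting and report how many knobs changed.
// A real controller would drive the OS or the application here.
int act(Knobs& applied, const Knobs& next)
{
    int changed = (applied.threads != next.threads) + (applied.freq_level != next.freq_level);
    applied = next;
    return changed;
}
```

A controller would call `observe`, `decide`, and `act` once per control period; swapping the body of `decide` for a control-theoretic or learned policy leaves the other two stages unchanged.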

Resource management policies in modern OS

• The definition of policies for runtime resource management is still a research topic

• Modern operating systems employ simple policies that are not aware of the application requirements

• E.g.:

– Completely Fair Scheduler

– DVFS governor

– Heterogeneous MultiProcessing (HMP) scheduler

Resource management policies in modern OS

• Completely Fair Scheduler is the Linux scheduler

– Dispatches running processes on available processors

– Targeted at symmetric multiprocessor systems

– Aims at maximizing overall CPU utilization

– Aims at maximizing interactive performance

Resource management policies in modern OS

• DVFS governor

– Controls the voltage/frequency of the cores

– Decides the voltage/frequency according to the observed CPU utilization level

– Aimed at

• Providing computational power when applications require it

• Saving power when the system is unloaded
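A utilization-driven governor of the kind described above can be sketched as follows. This is not the algorithm of any specific Linux governor: the function name `next_freq_index` and both thresholds are assumptions chosen for illustration. The sketch jumps to the highest level when the core is heavily loaded and steps down one level at a time when it is mostly idle.

```cpp
#include <cassert>
#include <cstddef>

// Pick the next index into a frequency table sorted from lowest to highest.
// utilization is the fraction of busy time observed in the last period (0.0 to 1.0).
std::size_t next_freq_index(double utilization, std::size_t cur, std::size_t n_levels)
{
    assert(n_levels > 0 && cur < n_levels);
    assert(utilization >= 0.0 && utilization <= 1.0);
    const double up_threshold = 0.80;    // assumed value
    const double down_threshold = 0.30;  // assumed value
    if (utilization > up_threshold)
        return n_levels - 1;             // loaded: provide computational power at once
    if (utilization < down_threshold && cur > 0)
        return cur - 1;                  // mostly idle: save power gradually
    return cur;                          // in between: keep the current level
}
```

Note that the only input is CPU utilization: the policy has no notion of whether the application is actually meeting its own performance goal.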

Resource management policies in modern OS

• HMP scheduler is the Linux scheduler for big.LITTLE architectures

– Dispatches running processes on processors

– Moves computationally intensive processes to the big cores

– Moves non-computationally intensive processes to the LITTLE cores

– Aimed at guaranteeing high performance only to the processes that require it, while saving power
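The placement rule above can be illustrated with a two-threshold sketch. The names (`Cluster`, `place_task`) and the threshold values are assumptions made for this example, not parameters taken from the slides or from a specific kernel version; the gap between the two thresholds keeps a task from bouncing between clusters.

```cpp
#include <cassert>

enum class Cluster { Little, Big };

// Decide where a task should run given its tracked load (0.0 to 1.0).
// Threshold defaults are assumed values for illustration only.
Cluster place_task(double load, Cluster current,
                   double up_threshold = 0.70, double down_threshold = 0.25)
{
    assert(load >= 0.0 && load <= 1.0);
    assert(down_threshold < up_threshold);
    if (load > up_threshold) return Cluster::Big;       // compute-intensive: big cores
    if (load < down_threshold) return Cluster::Little;  // light: LITTLE cores
    return current;                                      // hysteresis band: stay put
}
```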

Research on runtime resource management

• Academic/industrial research is intensively investigating adaptive policies for runtime resource management

• Studies on

– Mechanisms to monitor end-to-end QoS and define related requirements

– Mechanisms to actuate control decisions

– Policies for the optimization of the trade-off between application requirements and system ones

• Performance vs. power vs. energy vs. ...

Application monitoring and actuation

• Definition of mechanisms for monitoring and actuating on the application

The Heart Rate Monitor

• Set performance goal

– e.g.: min: 25 hb/s, max: 35 hb/s

• Run the app and update progress

– Issue a heartbeat

– Statistics automatically updated

The Heart Rate Monitor

• Heartbeats signal either progress or availability

– video encoder: 1 heartbeat = 1 frame

– web server: 1 heartbeat = 1 request

– database server: 1 heartbeat = 1 transaction

• Heart rate as a performance measure and goal

– High-level, application-specific performance measurements and goals (e.g., video encoder: 30 heartbeats/s = 30 frames/s)

• This performance monitor is implemented with a library used to instrument the application
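A heart rate monitor of this kind can be sketched as a small sliding-window counter. The code below is not the API of the actual library: `HeartRateMonitor`, `hb_set_goal`, `hb_issue`, `hb_rate`, `hb_goal_met`, and the window size are all hypothetical. Timestamps are passed in explicitly (in seconds) so that the sketch stays deterministic.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kWindow = 8;  // assumed window size, in heartbeats

struct HeartRateMonitor {
    double stamps[kWindow] = {};  // circular buffer of heartbeat timestamps (seconds)
    std::size_t count = 0;        // total heartbeats issued so far
    double min_rate = 0.0;        // performance goal, heartbeats per second
    double max_rate = 0.0;
};

void hb_set_goal(HeartRateMonitor& m, double min_rate, double max_rate)
{
    assert(min_rate <= max_rate);
    m.min_rate = min_rate;
    m.max_rate = max_rate;
}

// Called by the application once per unit of work (frame, request, transaction).
void hb_issue(HeartRateMonitor& m, double now_s)
{
    m.stamps[m.count % kWindow] = now_s;
    m.count++;
}

// Heart rate over the most recent window; 0 until at least two heartbeats exist.
double hb_rate(const HeartRateMonitor& m)
{
    if (m.count < 2) return 0.0;
    std::size_t n = m.count < kWindow ? m.count : kWindow;
    double newest = m.stamps[(m.count - 1) % kWindow];
    double oldest = m.stamps[(m.count - n) % kWindow];
    double span = newest - oldest;
    return span > 0.0 ? static_cast<double>(n - 1) / span : 0.0;
}

bool hb_goal_met(const HeartRateMonitor& m)
{
    double r = hb_rate(m);
    return r >= m.min_rate && r <= m.max_rate;
}
```

For a video encoder, `hb_issue` would be called once per encoded frame, so a goal of 25 to 35 heartbeats/s reads directly as 25 to 35 frames/s.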

Actuation on the application

• Setting of the execution-specific parameters

– Select the implementation to be used (CPU, GPU, ...)

– Set the number of threads

• Setting of algorithm-specific parameters

– Some applications have parameters to trade off result quality vs. execution latency

• All these parameters can be controlled only directly from the application source code

– Also in this case, the runtime resource manager has to be connected with the application to actuate

Adaptive application template

• Template of adaptive application:

int main() {
    ...
    // initialization block
    register_monitor();              // communicate required performance
    ...
    for (i = 0; i < NUM_OF_CHUNKS; i++) {
        mapping = get_mapping();     // get current mapping
        if (mapping == CPU_MAPPING) {
            ...
        } else if (mapping == GPU_MAPPING) {
            ...
        }
        ...
        heartbeat();                 // update performance measurements
    }
    // final block
    ...
    deregister_monitor();            // communicate termination
}

Self-adaptive policies

• Some examples of policies...

Run without any control

• Architecture: Intel Core i7 quad-core CPU

• Workload: 1 x264

Core allocations

• QoS requirement:

– fulfill a given throughput level

• Strategy:

– allocate a set of cores and

– set the proper number of threads
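One simple way to realize this strategy is a step-wise allocator that compares the measured heart rate with the goal window. The sketch below is an assumed heuristic, not the policy used in the experiments; `adjust_cores` is a hypothetical name, and the number of threads is assumed to follow the number of allocated cores.

```cpp
#include <cassert>

// Return the new number of allocated cores for one application.
// rate is the measured throughput; [min_rate, max_rate] is the goal window.
int adjust_cores(double rate, double min_rate, double max_rate, int cores, int max_cores)
{
    assert(max_cores >= 1 && cores >= 1 && cores <= max_cores);
    assert(min_rate <= max_rate);
    if (rate < min_rate && cores < max_cores) return cores + 1;  // too slow: grow
    if (rate > max_rate && cores > 1) return cores - 1;          // faster than needed: shrink
    return cores;                                                // within the goal window
}
```

Releasing cores when the application runs faster than required is what frees resources for other applications competing for the same machine.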


Performance-aware fair scheduler

• In a scenario where applications

– are competing for the same set of resources

– require predictable performance, expressed through high-level, application-specific metrics

• The scheduler has to become Performance-Aware to automatically allocate resources to match performance goals

The controlled run

• Architecture: Intel Core i7 quad-core CPU

• Workload: 1 x264 with throughput requirements

[Figure: x264, consensus object, core allocator]

Two controlled runs, different goals

• Architecture: Intel Core i7 quad-core CPU

• Workload: 2 x264 with different throughput requirements

[Figure: two x264 instances, consensus object, core allocator]

Two controlled runs, different goals

• Architecture: Intel Core i5 dual core CPU

• Workload: 2 Black&Scholes with different minimum throughput requirements

• We are able to adapt to the working scenario

[Figure panels: Linux, Orchestrator]

Three controlled runs, different goals

• Architecture: Intel Core i5 CPU + 8 Maxeler DFEs

• Workload: 3 Black&Scholes with different deadlines

Five controlled runs, different goals

• Architecture: Intel Core i5 dual core CPU + Nvidia GPU

• Workload: complex workload

Four controlled runs, different goals

• Architecture: ARM big.LITTLE + ARM MALI GPU

• Workload: complex workload

• As a side effect, we may save power, since we slow down application execution when required

DVFS tuning

• QoS requirement:

– fulfill a given throughput level and

– minimize power and/or energy consumption

• Strategy:

– allocate a set of cores

– set the proper number of threads and

– tune voltage and frequency levels

Scheduling and DVFS-control

• Resource allocation can be used in conjunction with DVFS control to

– Meet application requirements

– Save power

• Different possible strategies: race-to-idle, never-idle, ...
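The difference between the two named strategies comes down to simple energy arithmetic, sketched below under stated assumptions. Race-to-idle runs the work at the highest frequency and then idles until the deadline; never-idle stretches the work over the whole deadline at a lower frequency. The function names and the power model (one constant active power per frequency, one constant idle power) are simplifications introduced here, not taken from the slides.

```cpp
#include <cassert>

// Energy (joules) when running at f_max and idling for the rest of the period.
double energy_race_to_idle(double work_cycles, double deadline_s, double f_max_hz,
                           double p_active_w, double p_idle_w)
{
    double busy_s = work_cycles / f_max_hz;
    assert(busy_s <= deadline_s);  // the deadline must be reachable at f_max
    return p_active_w * busy_s + p_idle_w * (deadline_s - busy_s);
}

// Energy (joules) when running for the whole period at a slower operating point
// whose power draw is p_slow_w. The caller must pick a frequency that still
// completes the work by the deadline.
double energy_never_idle(double deadline_s, double p_slow_w)
{
    assert(deadline_s >= 0.0 && p_slow_w >= 0.0);
    return p_slow_w * deadline_s;
}
```

Which strategy wins depends entirely on the platform: a high idle power or a steep drop in power at lower frequencies favors never-idle, while a very low idle power favors race-to-idle.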

Scheduling and DVFS-control

• Architecture: ARM big.LITTLE

• Workload: 1 blackscholes

2 applications:

DVFS tuning

• QoS requirement:

– Control temperature

– Maximize performance

• Strategy:

– Tune voltage and frequency levels

– Inject idle cycles
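Idle cycle injection can be driven by a very small controller: the further the temperature exceeds the cap, the larger the share of time the core is forced idle. The sketch below is a plain proportional rule written for illustration; it is not the policy of any published system, and `idle_fraction`, the gain, and the clamp are assumptions.

```cpp
#include <cassert>

// Fraction of each control period (0.0 to max_idle) during which the core is forced idle.
// gain_per_c is the idle fraction added per degree Celsius above the cap.
double idle_fraction(double temp_c, double cap_c, double gain_per_c, double max_idle)
{
    assert(gain_per_c >= 0.0);
    assert(max_idle >= 0.0 && max_idle <= 1.0);
    double over = temp_c - cap_c;
    if (over <= 0.0) return 0.0;            // below the cap: run at full speed
    double f = gain_per_c * over;           // proportional response
    return f > max_idle ? max_idle : f;     // never idle more than max_idle
}
```

Keeping the injected idle time as small as the cap allows is what reconciles the two requirements above: the temperature is controlled while as much performance as possible is preserved.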

Temperature Control/Management

• Set a temperature cap

[Figure: temperature increase (°C) vs. time (s) for swaptions @ 2.80 GHz and ab @ 2.80 GHz]

Sironi et al., ThermOS: System Support for Dynamic Thermal Management of Chip Multi-Processors, to appear in Proc. of the 22nd Conference on Parallel Architectures and Compilation Techniques, 2013

Temperature Control/Management

DVFS is dangerous

The AcOS refreshment: ThermOS

SAVE orchestrator

• The framework for runtime resource management in the SAVE project...

Orchestrator framework

• The targeted HSA is composed of a set of heterogeneous processing resources

• Resources are organized in homogeneous clusters

• The workload is composed of loop-based and computationally intensive applications

• Arrival times and amount of data to be processed are unknown

• Applications are slightly instrumented to enable adaptiveness

• The orchestrator is the RTRM middleware

– It controls a set of processing element managers (PE managers)

– It implements an Observe-Decide-Act loop

• The orchestrator monitors the applications’ performance by means of a specific API

• Metrics: throughput, latency, performance/Watt

• The orchestrator monitors the resource status by means of the manager interface

• Metrics:

– Power consumption

– Energy consumption

– Utilization

– Temperature

– Voltage/frequency

• The orchestrator collects:

– System-level requirements (e.g. energy/power budget)

– Application requirements (e.g. minimum throughput)

• The decision module consists of two activities

– Dispatch applications to clusters

– Set resource constraints to clusters
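The dispatching activity can be illustrated with a greedy rule: among the clusters expected to meet the application's minimum throughput within the power budget, pick the one expected to draw the least power. This is only a sketch of one possible policy, not the orchestrator's actual decision module; `ClusterEstimate`, `pick_cluster`, and the existence of per-cluster estimates are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed per-cluster estimate for one application (e.g. from profiling).
struct ClusterEstimate {
    double throughput;  // expected heartbeats/s of the application on this cluster
    double power_w;     // expected power drawn while running it
};

// Return the index of the lowest-power cluster that meets the throughput goal
// within the power budget, or -1 if no cluster qualifies.
int pick_cluster(const std::vector<ClusterEstimate>& est,
                 double min_throughput, double power_budget_w)
{
    int best = -1;
    for (std::size_t i = 0; i < est.size(); i++) {
        if (est[i].throughput < min_throughput) continue;  // misses the QoS goal
        if (est[i].power_w > power_budget_w) continue;     // exceeds the budget
        if (best < 0 || est[i].power_w < est[static_cast<std::size_t>(best)].power_w)
            best = static_cast<int>(i);
    }
    return best;
}
```

A return value of -1 corresponds to the case where the requirements cannot be met under the current constraints, which a real decision module would have to handle by relaxing a goal or revising the budget.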

Orchestrator framework

• The orchestrator actuates decisions by commanding

– The PE managers

– The applications through the APIs

Adaptive applications

• Applications are slightly instrumented with a specific API to interact with the orchestrator

– Communicate performance goal

– Monitor current performance (Heartbeat library)

– Actuate on the application dispatching and other parameters (e.g. #threads)

PE managers

• PE managers are the actuators for the orchestrator

• A PE manager controls a cluster of homogeneous resources (CPUs, GPUs, DFEs)

• This hierarchical organization offers scalability and extensibility of the middleware

• The orchestrator assigns to the PE managers

– The applications to be executed

– The “resources’ budget”

• Each manager implements an ODA loop

– It tries to fulfill applications’ requirements within the given resources’ budget

– The manager communicates possible goal failures to the orchestrator

– If necessary, the orchestrator explores other mapping solutions

• Available actuation knobs:

– CPU manager:

• Task mapping

• Dynamic voltage and frequency scaling

• Idle cycle injection

– GPU manager:

• Task mapping

• Dynamic voltage and frequency scaling

– DFE manager:

• Group setting

Orchestrator implementation

• The orchestrator is implemented in C++ on Linux OS

– It consists of a single process running in userspace

– CPU/GPU managers are implemented in the same process

– DFE manager is an external process

• A custom application monitoring/control library has been implemented

– Heartbeat mechanism to enable performance measurement

– Communications by means of shared memory

– Configuration descriptors in JSON

– Applications should implement a specific template to enable monitoring and control

• PE Managers have been implemented relying on OS facilities

– Application mapping on CPUs performed by set_affinity

– HW monitoring and control by means of Linux virtual file systems

– DFE controller adapted and extended
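The two OS facilities mentioned for the CPU side can be exercised with a few lines of Linux-specific C++. The sketch assumes that "set_affinity" refers to the `sched_setaffinity` system call, and that hardware monitoring means reading integer values from sysfs files (for example `/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq` on machines that expose cpufreq). The helper names `pin_to_cpu`, `first_allowed_cpu`, and `read_long_from_file` are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <sched.h>  // sched_setaffinity, sched_getaffinity, CPU_* macros (Linux)

// Pin the calling thread to a single CPU. Returns 0 on success, -1 on error.
int pin_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);  // pid 0 means the caller
}

// First CPU the caller is currently allowed to run on, or -1 on error.
int first_allowed_cpu()
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &set)) return c;
    return -1;
}

// Read one integer from a sysfs-style text file. Returns -1 if it cannot be read.
long read_long_from_file(const char* path)
{
    std::FILE* f = std::fopen(path, "r");
    if (!f) return -1;
    long value = -1;
    if (std::fscanf(f, "%ld", &value) != 1) value = -1;
    std::fclose(f);
    return value;
}
```

A CPU manager built this way needs no kernel changes: mapping is a system call on the application's threads, and monitoring is ordinary file I/O.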


Case studies

• Considered architectures:

– Intel Core i5 dual core CPU + Nvidia GeForce GPU

– ARM big.LITTLE + ARM MALI GPU

– Intel Core i5 dual core CPU + 8 Maxeler DFEs

• Live demo...