Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Resource Monitoring for cgroup Hierarchies
Kanaka Juvva, Intel Corporation
Agenda
• Motivation • Multi-core Cache Hierarchies
• Application Domains
• Intel Xeon’s Shared Resource Monitoring Technology
• Implementation
• Use Cases
• cgroups
• Hierarchical Monitoring
Motivation (Why Monitoring ?)
Moore’s Law Drives Exponential growth in Compute and Data
Big Data ProliferationDynamic Data Centers CloudHPC
Multi-Core is the Horse Power behind the Information Highway
Linear Scalability for CPU boundContention for shared resources
• Impacts Scalability
Multi-core cache hierarchies
Cache Hierarchies help performance and mitigate contention
Also Data Orchestration improves Scalability to applications
core(s)core(s)core(s)core(s)core(s)core(s)core(s)
MEMORY
Motivation Contd….
Intel® Xeon® processor Shared Resource Monitoring
LLC Occupancy
Memory Bandwidth
Monitoring at multiple levelso Thread
o Process
o APP/Workload/cgroup/Container
o VMM
o System
• Quantify Performance
C4C3C2C1
L1/L2L1/L2L1/L2L1/L2
L3- LLC Occupancy
Memory - Memory Bandwidth
APP VM CONTAINER
“
”
Intel Xeon’s Shared Resource Monitoring
Incarnation of Intel Xeon CPUs
Platform/Shared Resource Monitoring Broadwell-EP and Broadwell-EX
Grantley and Brickland Codenames
Broadwell – EP Microarchitecture Features
o Cache Monitoring Technology
o Memory Bandwidth Monitoring
o QuickPath Interconnect:
o 2 QPI 1.1 channels
o Cores Per Socket: Up to 22
o Threads Per Socket: Up to 44
o Sockets: 2
o Last Level Cache (L3): ~55MB
BROADWELL-EP
Core Core
Core Core
Core Core
Shared Cache
QPI
QPI
2 QPI 1.1
4 Channels
DDR4
40 Lanes PCIe* 3.0
DMI2
DDR4
DDR4
DDR4
DDR4
Intel Xeon’s Shared Resource Monitoring Technology (Contd..)
Resource monitoring capability in each logical processor i.e. core. Two monitoring technologies are supported. Cache Monitoring Technology (CMT): Allows an OS, Hypervisor or VMM to monitor the usage of cache by applications Memory Bandwidth Monitoring Technology (MBM): Allows OS, Hypervisor or VMM to monitor bandwidth from one level
of cache hierarchy to the next – L3 cache which is backed directly by system memoryEither of them or both features can be present in a given CPU model. Have some common sub-features.
Linux (OS) can check the presence of the above features via a CPUID feature bit Details of each sub-feature via CPUID leaves and sub-leaves A mechanism for the Linux/OS or VMM to set a software-defined ID known as Resource Monitoring
ID (RMID) for each software thread A mechanism to monitor cache and bandwidth stats per RMID basis Linux/OS can Read back the collected stats such as L3 occupancy and Memory BW for an RMID Different event types are to distinguish the resource metrice.g. LLC_OCCUPANCY, TOTAL_BW, LOCAL_BWSoftware configures the event selection MSR (IA32_QM_EVTSEL) to specify the event or metric to be reported. RDMSR and WRMSR instructions are used to read and write the monitoring MSRs.
Implementation
Adoption into Linux Kernelhttps://lkml.org/lkml/2016/3/21/176https://github.com/kanakajuvva
Linux User Space APIs Perf_event APIo perf_event_open() struct perf_event
o Performance Monitoring Unit Fill and Register struct pmu Call back functions to start, read and stop stats
Perf events are exported to userland LOCAL_BWo perf stat –e intel_cqm/local_bw/ -p “pid of my_application” TOTAL_BWo perf stat –e intel_cqm/total_bw/ -p “pid of my_application”
LLC_OCCUPANCYo perf stat –e intel_cqm/llc_occupancy/ -p “pid of my_application”
Use CasesMonitoring Applicationcpumem benchmarkoConsumes constant memory bw of ~13.7GB/Sec
13000
13100
13200
13300
13400
13500
13600
13700
13800
13900
1 6
11
16
21
26
31
36
41
46
51
56
61
66
71
76
81
86
91
96
10
1
10
6
11
1
11
6
12
1
12
6
13
1
Total BW MB/Sec
13780
13785
13790
13795
13800
13805
13810
13815
13820
13825
13830
13835
1 6
11
16
21
26
31
36
41
46
51
56
61
66
71
76
81
86
91
96
10
1
10
6
11
1
11
6
12
1
12
6
13
1
13
6
Local BW MB/Sec
Use Cases Contd..Monitoring Multi-threaded Application cqm benchmarko 8 – 48 threads
0
5000
10000
15000
20000
25000
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82
Loca
l BW
-M
B/S
ec
Time (Secs)
CQM Benchmark - 8,12,24 and 48 Threads
8 Threads 12 Threads 24 Threads 48 Threads
0
5000
10000
15000
20000
25000
30000
35000
1 4 7 10131619222528313437404346495255586164677073767982858891
Tota
l BW
-M
B/S
ec
Time (Secs)
CQM Benchmark -- 8, 12, 24 and 48 Threads
8 Threads 12 Threads 24 Threads 48 Threads
Use Cases Contd..Monitoring VMLight VM running Linux – 3 scenariosoVM Boot
oVM Idle
oAn APP running in VM
0
100
200
300
400
500
600
1
18
35
52
69
86
10
3
12
0
13
7
15
4
17
1
18
8
20
5
22
2
23
9
25
6
27
3
29
0
30
7
32
4
34
1
35
8
37
5
39
2
40
9
42
6
44
3
Loca
l BW
-M
B/S
ec
Time (Secs)
VM Boot
0
2
4
6
8
10
12
14
16
1
19
37
55
73
91
10
9
12
7
14
5
16
3
18
1
19
9
21
7
23
5
25
3
27
1
28
9
30
7
32
5
34
3
36
1
37
9
39
7
41
5
43
3
45
1
46
9
Loca
l BW
-M
B/S
ec
Time (Secs)
VM Idle
Monitoring VM Contd..
0
5
10
15
20
25
30
35
40
45
1 3 5 7 9 11131517192123252729313335373941434547495153555759616365676971737577798183858789919395
Loca
l BW
-M
B/S
ec
Time (Secs)
FWTS APP
cgroups
• Linux mechanism(s) to manage and monitor resources for processes
• cgroups blend very well with containers
• MBM and CMT counters are read per RMID cgroup perf_event subsystem is used by Linux Perf tool for reading stats
Thread level stats are maintained in driver
Obtain cgroup membership for each thread
cgroup -> RMID(s) mapping
• LLC Occupancy /Memory BW consumed by a cgroup
= ∑ Resource consumed for each RMID in cgroup
Hierarchical Monitoring
cgroups organize processes hierarchically Form a tree structure cgroup v1 and cgroup v2
Process belongs to at least one cgroupAll threads in a process belong to same cgroup in v2Traverse from parent to child to list the threads
• Resource consumed by parent cgroup = ∑ Resource consumed for each cgroup
= ∑ ∑ Resource consumed for each RMID in cgroup
Current MBM/CMT design is transparent to the version of thehierarchy
Conclusions
• Intel Xeon’s Platform Resource Monitoring
• Linux Implementation
•Use Cases and Examples Scenarios
•Monitoring for Linux cgroups
Q & A
Thank you