Samuel Thibault: Stub Domains
A Step Towards Dom0 Disaggregation
Samuel Thibault, Citrix/XenSource
Stub Domains
The Big Domain 0
  Runs a lot of Xen components
    ◦ Domain manager
    ◦ Domain Builder
    ◦ Device Models
    ◦ PyGRUB
  These are currently running as root
    ◦ e.g. PyGRUB to access guest's disk
  Security issues
  Scalability issues
What Are Stub Domains?
  Helper domains which run Xen components
  Based on Mini-OS
  Domain Builder (Derek Murray)
  Device Model
  PV-GRUB
  ...
POSIX Environment on Top of Mini-OS
[Diagram: layered stack, top to bottom. Application; newlib, Unix and lwIP; Mini-OS (MM, Sched, Console frontend, Block frontend, Network frontend, FS frontend, FB frontend); Xen Hypervisor. Annotation: "getpid, sleep, read, select, .." (1200 lines).]
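As an illustration of the kind of code the POSIX layer on this slide is meant to support, here is a minimal sketch of an application-side helper built only on select and read. The helper names wait_readable and read_with_timeout are hypothetical, and the sketch is plain POSIX C checked against a regular host libc; it has not been verified against Mini-OS itself.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}

/* Hypothetical helper: read up to len bytes once data is available.
 * Returns the byte count (0 at end of file), -1 on error, -2 on timeout. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;
    return read(fd, buf, len);
}
```

A caller would pass any descriptor it has open, for example read_with_timeout(fd, buf, sizeof buf, 500) to wait at most half a second for input.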
New Mini-OS Features
  Disk frontend
  FrameBuffer frontend
  FileSystem frontend
    ◦ Imported from JavaGuest
    ◦ Remote access to some /export (e.g. of dom0)
  More advanced MM
    ◦ Read-Only memory
    ◦ CoW for zeroed pages
  But still keep it simple
    ◦ Single address space, mono-VCPU, no preemption
  Bugfixes!
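The "CoW for zeroed pages" idea can be modelled in a few lines of user-space C. This is only a sketch of the concept, not Mini-OS code: a real kernel would map one shared zero page read-only and allocate a private page from its fault handler, whereas here the check is done explicitly on each write. All names (cow_region, cow_read, cow_write, cow_private_pages, cow_free) and the sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u
#define NPAGES    8u

/* One shared, all-zero page that is only ever read. */
static const unsigned char zero_page[PAGE_SIZE] = { 0 };

/* Hypothetical model of a zero-initialised region: a NULL slot means the
 * page is still backed by the shared zero page. */
struct cow_region {
    unsigned char *pages[NPAGES];
};

/* Reads never allocate: untouched pages are served from zero_page. */
unsigned char cow_read(const struct cow_region *r, size_t addr)
{
    const unsigned char *p;

    assert(addr < (size_t)PAGE_SIZE * NPAGES);
    p = r->pages[addr / PAGE_SIZE];
    return (p ? p : zero_page)[addr % PAGE_SIZE];
}

/* The first write to a page gives it a private zeroed copy.
 * Returns 0 on success, -1 if no memory is available. */
int cow_write(struct cow_region *r, size_t addr, unsigned char value)
{
    size_t idx;

    assert(addr < (size_t)PAGE_SIZE * NPAGES);
    idx = addr / PAGE_SIZE;
    if (r->pages[idx] == NULL) {
        r->pages[idx] = calloc(1, PAGE_SIZE);
        if (r->pages[idx] == NULL)
            return -1;
    }
    r->pages[idx][addr % PAGE_SIZE] = value;
    return 0;
}

/* Number of pages that have actually been given private storage. */
size_t cow_private_pages(const struct cow_region *r)
{
    size_t i, count = 0;

    for (i = 0; i < NPAGES; i++)
        if (r->pages[i] != NULL)
            count++;
    return count;
}

/* Release all private pages; the region reads as zero again afterwards. */
void cow_free(struct cow_region *r)
{
    size_t i;

    for (i = 0; i < NPAGES; i++) {
        free(r->pages[i]);
        r->pages[i] = NULL;
    }
}
```

The point of the model is that a large zeroed region costs no memory until it is written, which fits the slide's goal of keeping Mini-OS lightweight.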
stubdom/
  Makefile
    ◦ Download and compile a cross-compilation environment: binutils, gcc, newlib, lwip
  c/
    ◦ 'Hello World!' C application
  caml/
    ◦ 'Hello World!' Caml application
  README
    ◦ Of course :)
Current HVM device model
[Diagram: qemu runs on Linux in dom0, above the Xen Hypervisor; IN/OUT requests from the HVM domain are handled by qemu in dom0.]
Current HVM dm
  Not always responsive
    ◦ Have to wait for dom0 Linux to schedule qemu
  Eats dom0 CPU time
  Uses dom0 resources from userland
    ◦ Disk, tap network
    ◦ Hence runs as root
HVM dm domain
[Diagram: qemu runs on Mini-OS in a stubdom, next to dom0 (Linux), above the Xen Hypervisor; IN/OUT requests from the HVM domain go to the stubdom. Link label: PV.]
HVM dm domain
[Bar chart: inb (Kcy) and boot time (s), axis 0 to 45; series: Dom0, Stubdom.]
HVM dm domain Disk Perfs
[Bar chart: Read (MB/s) and Write (MB/s), axis 0 to 80; series: Dom0, Stubdom, Native.]
HVM dm domain Disk CPU%
[Charts: CPU share of DomU, Dom0, Stubdom and Free for Read and for Write, each shown for the dom0 and stubdom setups.]
HVM dm domain Net Perfs (e1000)
[Bar chart: Recv (MB/s) and Send (MB/s), axis 0 to 80; series: Dom0, Stubdom.]
HVM dm domain Net CPU% (e1000)
[Charts: CPU share of DomU, Dom0, Stubdom and Free for Recv and for Send, each shown for the dom0 and stubdom setups.]
HVM dm domain Net Perfs (bicore)
[Bar chart: Recv (MB/s) and Send (MB/s), axis 0 to 120; series: Dom0, Stubdom.]
HVM dm domain Net CPU% (bicore)
[Charts: CPU share of DomU, Dom0, Stubdom and Free for Recv and for Send, each shown for the dom0 and stubdom setups.]
HVM dm domain
  Almost unmodified qemu
    ◦ Disable e.g. sound support, plug Mini-OS PV drivers
  Relieves dom0
  Provides better CPU usage accounting
    ◦ Can charge HVM domain with dm domain time
  A lot safer
    ◦ Only privilege is having the HVM dom as target
    ◦ Uses same resource access as PV guests
  More efficient
    ◦ Let the hypervisor schedule it directly
    ◦ More lightweight OS
PyGRUB
[Diagram: in dom0 (Linux), xend runs PyGRUB, which reads menu.lst, vmlinuz and initrd belonging to the PV domain; Xen Hypervisor underneath.]
PyGRUB
  Needs to be root to access guest disk
    ◦ Security issues
  Does not currently provide network boot
  Reimplements GRUB
PV-GRUB start
[Diagram: xend on Linux in dom0; a domain running GRUB and libxc on Mini-OS; Xen Hypervisor underneath; guest files menu.lst, vmlinuz, initrd.]
PV-GRUB loading
[Diagram: GRUB and libxc on Mini-OS use blkfront and netfront to read menu.lst, vmlinuz and initrd; the PV kernel and initrd are loaded into the domain; xend on Linux in dom0; Xen Hypervisor underneath.]
PV-GRUB loaded
[Diagram: the PV kernel and initrd sit in the domain alongside GRUB, libxc and Mini-OS; xend on Linux in dom0; Xen Hypervisor underneath. Next step: Kexec!]
PV-GRUB
[Diagram: after kexec, the PV domain holds the PV kernel and initrd; xend on Linux in dom0; Xen Hypervisor underneath.]
PV-kexec
[Diagram sequence (successive builds of one slide): the Mini-OS virtual memory holds Mini-OS, GRUB, libxc, kexec, boot, and the loaded PV kernel and initrd. The target PV guest virtual memory, at 0xc0000000, is populated step by step with the PV kernel, initrd, pgtable and stack. In the later steps a "boot" block appears in both address spaces.]
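The PV-kexec diagram shows the target guest's virtual memory being filled from 0xc0000000 with the PV kernel, initrd, page tables and stack. A minimal sketch of that kind of layout computation is given here, assuming regions are simply placed back to back and rounded up to 4 KiB pages. The function and type names are hypothetical, and the ordering and sizes used in the usage note are illustrative assumptions, not taken from the PV-GRUB source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ull

/* A page-aligned half-open range [start, end) in the target address space. */
struct region {
    uint64_t start;
    uint64_t end;
};

/* Round x up to the next page boundary. */
static uint64_t page_align_up(uint64_t x)
{
    return (x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Hypothetical helper: place n regions of the given byte sizes back to
 * back starting at base, each padded to whole pages.
 * Returns the first free address after the last region. */
uint64_t layout_regions(uint64_t base, const uint64_t *sizes, size_t n,
                        struct region *out)
{
    uint64_t cur = base;
    size_t i;

    assert(base % PAGE_SIZE == 0);
    for (i = 0; i < n; i++) {
        out[i].start = cur;
        cur = page_align_up(cur + sizes[i]);
        out[i].end = cur;
    }
    return cur;
}
```

For example, calling layout_regions(0xc0000000ull, sizes, 4, out) with sizes for a kernel, an initrd, the page tables and a stack yields one aligned range per item, which a loader could then map and copy into before jumping to the new kernel.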
PV-GRUB
  Executes upstream GRUB
    ◦ Replace native drivers with Mini-OS drivers
    ◦ Add PV kexec implementation
  Just uses the target PV guest resources
  Supports network
  Supports graphical menu
Conclusion
  Dm domain
    ◦ Improves security
    ◦ Improves accounting
    ◦ Improves scalability
    ◦ Improves performance
  PV-GRUB
    ◦ Improves security
    ◦ Provides network boot
  Mini-OS also being tested at Cisco for IOS
  Available in the unstable tree
Future Work
  Dm domain
    ◦ Live migration, PCI PT
    ◦ IA-64 support
    ◦ Group scheduling with HVM domain
  PV-GRUB
    ◦ Kexec 64bit guest from 32bit PV-GRUB
    ◦ PVFB shutdown/restart
  OCaml support
    ◦ 'Hello World!' works
    ◦ Needs runtime rebuild to properly hook into POSIX layer