PVHVM Linux guest

why doesn't kexec work?

Vitaly Kuznetsov

Red Hat

Xen Developer Summit, 2015

Why?

● We support Red Hat Enterprise Linux.

● Bare hardware, virtualized and cloud environments, ...

● Kernel issues happen.

● Analyse stack traces.

● In complicated cases use kdump!

Kexec/kdump

● “kexec … is a mechanism of the Linux kernel that allows "live" booting of a new kernel "over" the currently running kernel”

● Kdump uses kexec:

● Some memory is reserved at boot (crashkernel=)

● Crash kernel/initrd are loaded to the area.

● On crash we trigger crash kernel's boot.

● Crash initrd dumps all domain's memory and reboots.

● You have crash file to analyse! (profit!!!)
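
As a rough illustration of the first two kdump steps, the sketch below checks whether a kernel command line reserves memory with crashkernel= and whether a crash kernel is currently loaded. The helper names are hypothetical; the sysfs file /sys/kernel/kexec_crash_loaded is assumed to be present, as it is on Linux kernels built with kexec support.

```python
import re
from pathlib import Path


def crashkernel_arg(cmdline):
    """Return the value of the crashkernel= parameter, or None if absent."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


def crash_kernel_loaded(sysfs_path="/sys/kernel/kexec_crash_loaded"):
    """Return True/False from sysfs, or None if the file does not exist."""
    path = Path(sysfs_path)
    if not path.exists():
        return None  # not Linux, or kernel built without kexec support
    return path.read_text().strip() == "1"
```

On a running guest, passing the contents of /proc/cmdline to crashkernel_arg() shows the reservation, and crash_kernel_loaded() shows whether the crash kernel/initrd pair has been loaded into it.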

Doesn't work for Xen guests

Issues with Kexec on PVHVM

● Previously used structures cause problems, no good way to transfer knowledge to kexec kernel.

● and we need these interfaces working!

● Xen/guest interfaces we need to re-establish:

● shared_info frame (XENMAPSPACE_shared_info)

● VCPU_info (VCPUOP_register_vcpu_info)

● Event channels (EVTCHNOP_bind_*, ABI)

● + Emuirq/pirq mappings (PHYSDEVOP_map_pirq)

● Granted pages

shared_info page:

● 4k page, belongs to Xen hypervisor.

● Required for events, vcpu_info for first 32 VCPUs lives here.

● Upon boot guest chooses one of its pages to sacrifice.

● XENMEM_add_to_physmap(XENMAPSPACE_shared_info) frees guest's frame and mounts shared_info there.

● kexec kernel does the same for another frame → we get a hole as shared_info is being unmapped from its previous place.
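
The hole can be pictured with a toy model of the guest physmap. This is not Xen code: GuestPhysmap and its methods are hypothetical names, and the only behaviour modelled is the one described on the slide (the sacrificed frame is freed, and shared_info lives at a single guest frame at a time).

```python
class GuestPhysmap:
    """Toy model of which guest frame numbers (gfns) are backed by what."""

    def __init__(self, nr_frames):
        # Every gfn starts out backed by ordinary RAM.
        self.backing = {gfn: "ram" for gfn in range(nr_frames)}

    def map_shared_info(self, gfn):
        # shared_info is unmapped from its previous place, leaving a hole.
        for old_gfn, kind in list(self.backing.items()):
            if kind == "shared_info":
                self.backing[old_gfn] = None
        # The RAM frame previously at `gfn` is freed and replaced.
        self.backing[gfn] = "shared_info"

    def holes(self):
        """Return the gfns that are no longer backed by anything."""
        return sorted(g for g, kind in self.backing.items() if kind is None)


physmap = GuestPhysmap(nr_frames=8)
physmap.map_shared_info(3)   # first kernel sacrifices gfn 3
physmap.map_shared_info(5)   # kexec kernel sacrifices gfn 5
```

After the second call, gfn 3 is backed by nothing: the RAM that used to live there was freed by the first kernel, and shared_info has moved on.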

Event channels:

● Already bound event channels

● “(XEN) event_channel.c:370:d2v0 EVTCHNOP failure: error -17”

● 2 level → FIFO ABI switch at boot

● Mapped control block, event array pages.

● Some INTERDOMAIN channels are being set up by the toolstack:

● Xenstore, xenconsole,..

● EVTCHNOP_reset resets everything, there is no way back.
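
Error -17 is -EEXIST: the kexec kernel asks for a binding that the previous kernel already holds. The toy model below illustrates both halves of the problem. The class, its port numbers and its labels are made up for illustration and are not the hypervisor's data structures.

```python
import errno


class ToyEventChannels:
    """Toy per-domain event channel table (illustrative only)."""

    def __init__(self):
        # Assumption: ports set up by the toolstack before the guest boots.
        self.ports = {1: "interdomain:xenstore", 2: "interdomain:xenconsole"}
        self.virq_to_port = {}  # (vcpu, virq) -> port

    def bind_virq(self, vcpu, virq):
        """Bind a VIRQ; a second bind of the same VIRQ fails with -EEXIST."""
        if (vcpu, virq) in self.virq_to_port:
            return -errno.EEXIST
        port = max(self.ports, default=0) + 1
        self.ports[port] = f"virq:{virq}@vcpu{vcpu}"
        self.virq_to_port[(vcpu, virq)] = port
        return port

    def reset(self):
        """Drop every port, including the ones the guest did not create."""
        self.ports.clear()
        self.virq_to_port.clear()


channels = ToyEventChannels()
first_kernel = channels.bind_virq(vcpu=0, virq=0)   # succeeds
kexec_kernel = channels.bind_virq(vcpu=0, virq=0)   # fails: already bound
```

Calling reset() makes the second bind possible again, but it also removes the xenstore and xenconsole ports, which the guest cannot recreate on its own.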

Grant pages:

● Memory sharing mechanism in Xen.

● We can't do anything guest-side:

● Forcibly unmapping a page from backend domain will crash it.

● Requesting new pages requires additional memory.

● Some grants are “persistent”.

● Possibly not an issue for kdump, because its memory region is separate, but

● We still need functional backends for kexec kernel!

Possible solutions

“Obvious solution”

● Implement set of hypercalls to tear all interfaces down:

● reset_vcpu_info

● evtchn_switch_to_2l

● unmap_shared_info

● do_something_with_granted_pages

● …

● Good from “if there is a way to set something up there should be one to tear it down” PoV.

● Good for hypervisor testing :-)

“Obvious solution”

● Issues:

● Domain needs to follow a special protocol – what if it doesn't?

● Granted pages story is complicated.

● Not all bits are being set up by the domain.

● Too many possible issues (including security).

“New domain with the same memory”

● Destroy the original domain leaving its memory intact.

● Create new domain, reassign all memory pages, copy vcpu contexts.

● Benefits:

● No cumbersome teardown required!

● Migration path is being reused!

● Supportability: new interfaces/objects should “just work”.

“New domain with the same memory”

● Issues:

● Memory reassignment appears to be cumbersome :-(

● Superpages, PoD, mem_access issues.

● No m2p on ARM.

● Non-trivial toolstack part repeating migration code.

● Too complicated.

“Reset everything”

● No cumbersome memory reassignment.

● Explicit list of interfaces to reset with one hypercall:

● shared_info, vcpu_info, event channels, pirq_to_emuirq, ioreq servers.

● Toolstack involvement required:

● Restart device model.

● Reopen xenstore/xenconsole event channels.

● ..

● Hypervisor maintainers like it :-)
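
A minimal sketch of the idea, assuming a toy domain object: one operation clears the explicit list of interfaces and leaves guest memory alone, then the toolstack puts back the pieces the guest cannot create itself. ToyDomain, reset_everything and toolstack_after_reset are hypothetical names, and the port numbers are arbitrary. None of this is the interface proposed in the patch series.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyDomain:
    shared_info_gfn: Optional[int] = None
    vcpu_info: Dict[int, int] = field(default_factory=dict)        # vcpu -> gfn
    event_channels: Dict[int, str] = field(default_factory=dict)   # port -> user
    pirq_to_emuirq: Dict[int, int] = field(default_factory=dict)
    ioreq_servers: List[str] = field(default_factory=list)
    memory: Dict[int, bytes] = field(default_factory=dict)         # gfn -> data


def reset_everything(dom):
    """Clear the listed interfaces in one step; guest memory is untouched."""
    dom.shared_info_gfn = None
    dom.vcpu_info.clear()
    dom.event_channels.clear()
    dom.pirq_to_emuirq.clear()
    dom.ioreq_servers.clear()


def toolstack_after_reset(dom):
    """Toolstack follow-up: restart device model, reopen its channels."""
    dom.ioreq_servers.append("device-model")
    dom.event_channels[1] = "xenstore"
    dom.event_channels[2] = "xenconsole"


dom = ToyDomain(shared_info_gfn=3, vcpu_info={0: 10},
                event_channels={1: "xenstore", 2: "xenconsole", 3: "virq:0"},
                pirq_to_emuirq={16: 4}, ioreq_servers=["device-model"],
                memory={0: b"kexec kernel image"})
reset_everything(dom)
toolstack_after_reset(dom)
```

The kexec kernel then finds the domain in the same state a freshly booted guest would see, with its memory contents preserved.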

“Reset everything”

● Granted pages - let's do (almost) nothing!

● Remove the domain from xenstore and add it back – all backends are supposed to release all mappings.

● Xenconsoled doesn't release its mapping (but that's fine).

● Special debug print to find future issues.

● Hunt for misbehaving backends! (if there are such)
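
The "do (almost) nothing" approach relies on backends reacting to the domain disappearing from xenstore. A toy sketch of that contract, with a report of leftovers standing in for the debug print, might look like this. ToyBackend and recycle_domain_in_xenstore are hypothetical names, not toolstack code.

```python
class ToyBackend:
    """Toy backend that holds mappings of the guest's granted pages."""

    def __init__(self, name, releases_on_removal=True):
        self.name = name
        self.releases_on_removal = releases_on_removal
        self.mapped_grants = set()

    def on_domain_removed(self):
        # A well-behaved backend drops all mappings when the domain goes away.
        if self.releases_on_removal:
            self.mapped_grants.clear()


def recycle_domain_in_xenstore(backends):
    """Remove and re-add the domain; return grants that are still mapped."""
    leftovers = {}
    for backend in backends:
        backend.on_domain_removed()
        if backend.mapped_grants:
            leftovers[backend.name] = sorted(backend.mapped_grants)
    return leftovers


blkback = ToyBackend("blkback")
blkback.mapped_grants.update({8, 9})
console = ToyBackend("xenconsoled", releases_on_removal=False)
console.mapped_grants.add(0)
still_mapped = recycle_domain_in_xenstore([blkback, console])
```

Here still_mapped names only xenconsoled, the known and accepted exception. Any other entry would point at a backend worth hunting down.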

Current status and future work

● [PATCH v10 00/11] “toolstack-assisted approach to PVHVM guest kexec” is out waiting for reviewers!

● … and testers too!

● PVH (as "HVM without device model") should "just work".

● Not tested, minor issues are possible.

● ARM-specific part is -ENOSYS stub for now.

● shared_info page needs handling (same as x86).

● Some GIC cleanup?

Thank you!

Questions?

Vitaly Kuznetsov

