LinuxLists.cc - [PATCH v12 00/18] KVM: s390: pv: implement lazy destroy for reboot

2022-06-28 13:59:40

Subject: [PATCH v12 00/18] KVM: s390: pv: implement lazy destroy for reboot

Previously, when a protected VM was rebooted or when it was shut down,
its memory was made unprotected, and then the protected VM itself was
destroyed. Looping over the whole address space can take some time,
considering the overhead of the various Ultravisor Calls (UVCs). This
means that a reboot or a shutdown would take a potentially long amount
of time, depending on the amount of used memory.

This patch series implements a deferred destroy mechanism for protected
guests. When a protected guest is destroyed, its memory can be cleared
in the background, allowing the guest to restart or terminate
significantly faster than before.

There are two possibilities when a protected VM is torn down:
* it still has an address space associated with it (reboot case)
* it does not have an address space anymore (shutdown case)

For the reboot case, two new commands are available for the
KVM_S390_PV_COMMAND ioctl:

KVM_PV_ASYNC_CLEANUP_PREPARE: prepares the current protected VM for
asynchronous teardown. The current VM will then continue immediately
as non-protected. If a protected VM had already been set aside without
starting the teardown process, this call will fail. In this case the
userspace process should issue a normal KVM_PV_DISABLE.

KVM_PV_ASYNC_CLEANUP_PERFORM: tears down the protected VM previously
set aside for asynchronous teardown. This PV command should ideally be
issued by userspace from a separate thread. If a fatal signal is
received (or the process terminates naturally), the command will
terminate immediately without completing. The rest of the normal KVM
teardown process will take care of properly cleaning up all leftovers.

The idea is that userspace should first issue the
KVM_PV_ASYNC_CLEANUP_PREPARE command, and in case of success, create a
new thread and issue KVM_PV_ASYNC_CLEANUP_PERFORM from there. This also
allows for proper accounting of the CPU time needed for the
asynchronous teardown.
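
The flow above can be sketched roughly as follows. This is a minimal
illustration, not the actual VMM code: the enum values, pv_issue_fn,
struct teardown_job, struct fake_vm and fake_issue are hypothetical
names made up for this example, and fake_issue is only a toy model of
the semantics described in this cover letter. A real VMM would issue
the KVM_S390_PV_COMMAND ioctl on the VM file descriptor instead. What
to do when the thread cannot be created is also this sketch's own
choice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PV commands named in this cover letter. */
enum pv_cmd { PV_DISABLE, PV_ASYNC_CLEANUP_PREPARE, PV_ASYNC_CLEANUP_PERFORM };

/*
 * Hypothetical callback: issue one PV command, return 0 on success and a
 * negative value on failure. A real VMM would wrap the KVM_S390_PV_COMMAND
 * ioctl on the VM file descriptor here.
 */
typedef int (*pv_issue_fn)(void *vm, enum pv_cmd cmd);

struct teardown_job {
	pv_issue_fn issue;
	void *vm;
	int result;	/* written by the teardown thread, read after join */
};

/* Toy model of the semantics described above, for illustration only. */
struct fake_vm {
	bool is_protected;
	bool set_aside;
};

static int fake_issue(void *vm, enum pv_cmd cmd)
{
	struct fake_vm *v = vm;

	switch (cmd) {
	case PV_ASYNC_CLEANUP_PREPARE:
		if (v->set_aside)	/* a VM is already set aside: fail */
			return -1;
		v->set_aside = true;
		v->is_protected = false; /* continues as non-protected */
		return 0;
	case PV_ASYNC_CLEANUP_PERFORM:
		v->set_aside = false;
		return 0;
	case PV_DISABLE:
		v->is_protected = false;
		return 0;
	}
	return -1;
}

/* Thread body: the slow teardown runs here, on this thread's CPU time. */
static void *teardown_thread(void *arg)
{
	struct teardown_job *job = arg;

	job->result = job->issue(job->vm, PV_ASYNC_CLEANUP_PERFORM);
	return NULL;
}

/*
 * Returns 1 if a teardown thread was started (the caller must join *tid),
 * 0 if the teardown completed synchronously, negative on error.
 */
static int pv_reboot_teardown(struct teardown_job *job, pthread_t *tid)
{
	assert(job && job->issue && tid);

	if (job->issue(job->vm, PV_ASYNC_CLEANUP_PREPARE) == 0) {
		if (pthread_create(tid, NULL, teardown_thread, job) == 0)
			return 1;
		/* No thread available: perform the teardown inline. */
		return job->issue(job->vm, PV_ASYNC_CLEANUP_PERFORM) < 0 ? -1 : 0;
	}
	/* PREPARE failed: fall back to a normal, synchronous disable. */
	return job->issue(job->vm, PV_DISABLE) < 0 ? -1 : 0;
}
```

The job structure has to stay valid until the thread has been joined,
since the thread stores the result of the PERFORM command in it.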

This means that the same address space can contain memory belonging to
more than one protected guest, although only one of them will be
running; the others will not even have any CPUs.
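
Since several protected guests can now share one address space, a
single "is protected" flag is no longer enough to track whether the
address space still holds protected memory; the series switches to a
usage counter for this. A minimal userspace sketch of that idea,
assuming C11 atomics (struct mm_ctx and the three helpers are
hypothetical names for illustration, not the kernel's):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-address-space context holding the usage counter. */
struct mm_ctx {
	atomic_int protected_count;
};

/* A protected guest starts using this address space. */
static void pv_guest_attach(struct mm_ctx *mm)
{
	atomic_fetch_add(&mm->protected_count, 1);
}

/* A protected guest (running or set aside) has been fully torn down. */
static void pv_guest_detach(struct mm_ctx *mm)
{
	atomic_fetch_sub(&mm->protected_count, 1);
}

/* True while at least one protected guest still owns memory here. */
static bool mm_has_protected_memory(struct mm_ctx *mm)
{
	return atomic_load(&mm->protected_count) > 0;
}
```

With a plain flag, tearing down the guest that was set aside would
clear the flag even though the rebooted guest may have become
protected again in the meantime; a counter avoids that.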

The shutdown case should be dealt with in userspace (e.g. using
clone(CLONE_VM)).

A module parameter is also provided to disable the new functionality,
which is otherwise enabled by default. This should not be an issue
since the new functionality is opt-in anyway. The parameter is mainly
intended to aid debugging.

v11->v12
* rebase
* fix and improve comments and documentation
* introduce the module parameter at the end of the series
* merge old patch 14 and 15
* minor cosmetic changes to improve readability
* renamed some functions and constants (including in uapi)
* use the lock instead of xchg
* kvm_s390_pv_deinit_cleanup_all will now clean up all leftovers, while
kvm_s390_pv_deinit will destroy the current protected VM without
further action
* kvm_s390_pv_deinit_cleanup_all now called unconditionally when KVM
is being torn down
* fix small memory leak if kvm_s390_pv_deinit_vm_fast fails

v10->v11
* rebase
* improve comments and patch descriptions
* rename s390_remove_old_asce to s390_unlist_old_asce
* rename DESTROY_LOOP_THRESHOLD to GATHER_GET_PAGES
* rename module parameter lazy_destroy to async_destroy
* move the WRITE_ONCE to be right after the UVC in patch 13
* improve handling leftover secure VMs in patch 14
* lock only when needed in patch 15, instead of always locking and then
unlocking and locking again
* refactor should_export_before_import to make it more readable

v9->v10
* improved and expanded comments, fix typos
* add new patch: perform destroy configuration UVC before clearing
memory for unconditional deinit_vm (instead of afterwards)
* explicitly initialize kvm->arch.pv.async_deinit in kvm_arch_init_vm
* do not try to call the destroy fast UVC in the MMU notifier if it is
not available

v8->v9
* rebased
* added dependency on MMU_NOTIFIER for KVM in arch/s390/kvm/Kconfig
* add support for the Destroy Secure Configuration Fast UVC
* minor fixes

v7->v8
* switched patches 8 and 9
* improved comments, documentation and patch descriptions
* remove mm notifier when the struct kvm is torn down
* removed useless locks in the mm notifier
* use _ASCE_ORIGIN instead of PAGE_MASK for ASCEs
* cleanup of some compiler warnings
* remove some harmless but useless duplicate code
* the last parameter of __s390_uv_destroy_range is now bool
* rename the KVM capability to KVM_CAP_S390_PROTECTED_ASYNC_DISABLE

v6->v7
* moved INIT_LIST_HEAD inside spinlock in patch 1
* improved commit messages in patch 2
* added missing locks in patch 3
* added and expanded some comments in patch 11
* rebased

v5->v6
* completely reworked the series
* removed kernel thread for asynchronous teardown
* added new commands to KVM_S390_PV_COMMAND ioctl

v4->v5
* fixed and improved some patch descriptions
* added some comments to better explain what's going on
* use vma_lookup instead of find_vma
* rename is_protected to protected_count since now it's used as a counter

v3->v4
* added patch 2
* split patch 3
* removed the shutdown part -- will be a separate patchseries
* moved the patch introducing the module parameter

v2->v3
* added definitions for CC return codes for the UVC instruction
* improved make_secure_pte:
  - renamed rc to cc
  - added comments to explain why returning -EAGAIN is ok
* fixed kvm_s390_pv_replace_asce and kvm_s390_pv_remove_old_asce:
  - renamed
  - added locking
  - moved to gmap.c
* do proper error management in do_secure_storage_access instead of
trying again hoping to get a different exception
* fix outdated patch descriptions

v1->v2
* rebased on a more recent kernel
* improved/expanded some patch descriptions
* improved/expanded some comments
* added patch 1, which prevents stall notification when the system is
under heavy load.
* rename some members of struct deferred_priv to improve readability
* avoid a use-after-free bug of the struct mm in case of shutdown
* add missing return when lazy destroy is disabled
* add support for OOM notifier

Claudio Imbrenda (18):
  KVM: s390: pv: leak the topmost page table when destroy fails
  KVM: s390: pv: handle secure storage violations for protected guests
  KVM: s390: pv: handle secure storage exceptions for normal guests
  KVM: s390: pv: refactor s390_reset_acc
  KVM: s390: pv: usage counter instead of flag
  KVM: s390: pv: add export before import
  KVM: s390: pv: clear the state without memset
  KVM: s390: pv: Add kvm_s390_cpus_from_pv to kvm-s390.h and add
    documentation
  KVM: s390: pv: add mmu_notifier
  s390/mm: KVM: pv: when tearing down, try to destroy protected pages
  KVM: s390: pv: refactoring of kvm_s390_pv_deinit_vm
  KVM: s390: pv: destroy the configuration before its memory
  KVM: s390: pv: asynchronous destroy for reboot
  KVM: s390: pv: api documentation for asynchronous destroy
  KVM: s390: pv: add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE
  KVM: s390: pv: avoid export before import if possible
  KVM: s390: pv: support for Destroy fast UVC
  KVM: s390: pv: module parameter to fence asynchronous destroy

 Documentation/virt/kvm/api.rst      |  31 ++-
 arch/s390/include/asm/gmap.h        |  39 ++-
 arch/s390/include/asm/kvm_host.h    |   4 +
 arch/s390/include/asm/mmu.h         |   2 +-
 arch/s390/include/asm/mmu_context.h |   2 +-
 arch/s390/include/asm/pgtable.h     |  21 +-
 arch/s390/include/asm/uv.h          |  11 +
 arch/s390/kernel/uv.c               |  85 ++++++
 arch/s390/kvm/Kconfig               |   1 +
 arch/s390/kvm/kvm-s390.c            | 101 +++++++-
 arch/s390/kvm/kvm-s390.h            |   4 +
 arch/s390/kvm/pv.c                  | 384 +++++++++++++++++++++++++++-
 arch/s390/mm/fault.c                |  23 +-
 arch/s390/mm/gmap.c                 | 177 +++++++++++--
 include/uapi/linux/kvm.h            |   3 +
 15 files changed, 825 insertions(+), 63 deletions(-)

--
2.36.1
